My TaskLogAppender splits each log entry into its own file?

My TaskLogAppender splits each log entry into its own file?

Anthony D. Urso
I have started using the following log4j XML configuration to send logs
to both the mapreduce tasklog and the syslog daemon.  Unfortunately, it
creates a new log split in the tasklog for each log entry.

Is this a problem with the TaskLogAppender?  If not, does anyone know
of a way to tap into the mapreduce logging that won't cause zillions
of log splits to be created?

The XML configuration and an exception are below:

<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE log4j:configuration SYSTEM "log4j.dtd">
<log4j:configuration xmlns:log4j="http://jakarta.apache.org/log4j/">
    <!-- Log base level and above to syslog, for posterity's sake.
    <appender name="syslog" class="org.apache.log4j.net.SyslogAppender">
        <param name="SyslogHost" value="localhost" />
        <param name="Facility" value="LOCAL5" />
        <layout class="org.apache.log4j.PatternLayout">
            <param name="ConversionPattern" value="%p %c: %m%n"/>
        </layout>
    </appender>  -->
   
    <!-- Log everything to the task log. -->
    <appender name="TLA" class="org.apache.hadoop.mapred.TaskLogAppender">
        <param name="taskId" value="${hadoop.tasklog.taskid}"/>
        <!-- keep a lot, they are one line each -->
        <param name="noKeepSplits" value="999999"/>
        <param name="totalLogFileSize" value="100"/>
        <param name="purgeLogSplits" value="true"/>
        <param name="logsRetainHours" value="4"/>
        <layout class="org.apache.log4j.PatternLayout">
            <param name="ConversionPattern" value="%d %p %c: %m%n"/>
        </layout>
    </appender>
   
    <!-- A console logger for hadoop. -->
    <appender name="console" class="org.apache.log4j.ConsoleAppender">
        <layout class="org.apache.log4j.PatternLayout">
            <param name="ConversionPattern" value="%d %p %c: %m%n"/>
        </layout>
    </appender>

    <!-- Log task stuff to tasklog and syslog -->
    <logger name="com.example.package" additivity="false">
        <level value="ALL" />
        <!-- <appender-ref ref="syslog" /> -->
        <appender-ref ref="TLA" />
    </logger>

    <!-- Log hadoop stuff as per the original log4j.properties. -->
    <root>
        <level value="INFO" />
        <appender-ref ref="console" />
    </root>
</log4j:configuration>

An exception, perhaps related...

ERROR Failed to close the task's log with the exception: java.io.IOException: Bad file descriptor
        at java.io.FileOutputStream.writeBytes(Native Method)
        at java.io.FileOutputStream.write(FileOutputStream.java:260)
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
        at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
        at org.apache.hadoop.mapred.TaskLog$Writer.writeIndexRecord(TaskLog.java:252)
        at org.apache.hadoop.mapred.TaskLog$Writer.close(TaskLog.java:236)

--
 Au

 PGP Key ID: 0x385B44CB
 Fingerprint: 9E9E B116 DB2C D734 C090  E72F 43A0 95C4 385B 44CB
    "Maximus vero fugiens a quodam Urso, milite Romano, interemptus est"
                                               - Getica 235

Re: My TaskLogAppender splits each log entry into its own file?

Arun C Murthy-2
Hi Anthony,

On Wed, Jul 18, 2007 at 07:42:58PM -0700, Anthony D. Urso wrote:
>I have started using the following log4j XML configuration to send logs
>to both the mapreduce tasklog and the syslog daemon.  Unfortunately, it
>creates a new log split in the tasklog for each log entry.
>
>Is this a problem with the TaskLogAppender?  If not, does anyone know
>of a way to tap into the mapreduce logging that won't cause zillions
>of log splits to be created?
>

Set the following in your hadoop-site.xml (please see hadoop-default.xml for their descriptions):
mapred.userlog.purgesplits (set this to *false*)
mapred.userlog.num.splits (set this to 1)
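
For reference, a minimal sketch of how those two overrides might look in hadoop-site.xml, assuming the usual <configuration>/<property> layout of Hadoop site files. Only the two property names and values come from the advice above; everything else is illustrative, and any other properties already in your file should be left in place.

```xml
<?xml version="1.0"?>
<!-- hadoop-site.xml: site-specific overrides (sketch, not a complete file) -->
<configuration>
  <!-- Do not purge older log splits -->
  <property>
    <name>mapred.userlog.purgesplits</name>
    <value>false</value>
  </property>
  <!-- Keep the user log in a single split -->
  <property>
    <name>mapred.userlog.num.splits</name>
    <value>1</value>
  </property>
</configuration>
```

Whether the tasktrackers need a restart to pick these up depends on your setup, so treat that as something to verify.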

hth,
Arun
