Sqoop >> mail # dev >> Review Request 13035: SQOOP-744: log4j configuration for generated mapreduce job


Jarek Cecho 2013-08-14, 23:57
Re: Review Request 13035: SQOOP-744: log4j configuration for generated mapreduce job


> On Aug. 14, 2013, 4:57 p.m., Jarek Cecho wrote:
> > execution/mapreduce/src/main/resources/META-INF/log4j.properties, lines 20-23
> > <https://reviews.apache.org/r/13035/diff/2/?file=330782#file330782line20>
> >
> >     I've tried the patch on a real cluster and got the following output (please accept my apologies for the really long text):
> >    
> >     Task Logs: 'attempt_201308141631_0001_m_000001_0'
> >    
> >    
> >     stdout logs
> >    
> >    
> >     stderr logs
> >     2660 [OutputFormatLoader-consumer] INFO  org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor  - SqoopOutputFormatLoadExecutor consumer thread is starting
> >     2748 [OutputFormatLoader-consumer] INFO  org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor  - Running loader class org.apache.sqoop.job.etl.HdfsTextImportLoader
> >     2752 [main] INFO  org.apache.sqoop.job.mr.SqoopMapper  - Starting progress service
> >     2777 [main] INFO  org.apache.sqoop.job.mr.SqoopMapper  - Running extractor class org.apache.sqoop.connector.jdbc.GenericJdbcImportExtractor
> >     2782 [pool-2-thread-1] DEBUG org.apache.sqoop.job.mr.ProgressRunnable  - Auto-progress thread reporting progress
> >     3969 [main] INFO  org.apache.sqoop.connector.jdbc.GenericJdbcImportExtractor  - Using query: SELECT * FROM text WHERE 100001 <= id AND id < 200001
> >     32122 [main] INFO  org.apache.sqoop.job.mr.SqoopMapper  - Extractor has finished
> >     32129 [main] INFO  org.apache.sqoop.job.mr.SqoopMapper  - Stopping progress service
> >     32136 [main] INFO  org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor  - SqoopOutputFormatLoadExecutor::SqoopRecordWriter is about to be closed
> >     34002 [OutputFormatLoader-consumer] INFO  org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor  - Loader has finished
> >     34002 [main] INFO  org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor  - SqoopOutputFormatLoadExecutor::SqoopRecordWriter is closed
> >    
> >    
> >     syslog logs
> >     2013-08-14 16:47:17,950 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
> >     2013-08-14 16:47:19,451 WARN org.apache.hadoop.conf.Configuration: session.id is deprecated. Instead, use dfs.metrics.session-id
> >     2013-08-14 16:47:19,452 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=MAP, sessionId
> >     2013-08-14 16:47:20,160 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
> >     2013-08-14 16:47:20,172 INFO org.apache.hadoop.mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@549b6220
> >     2013-08-14 16:47:20,585 INFO org.apache.hadoop.mapred.MapTask: Processing split: org.apache.sqoop.job.mr.SqoopSplit@23de4dd8
> >     2013-08-14 16:47:20,610 INFO org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor: SqoopOutputFormatLoadExecutor consumer thread is starting
> >     2013-08-14 16:47:20,698 INFO org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor: Running loader class org.apache.sqoop.job.etl.HdfsTextImportLoader
> >     2013-08-14 16:47:20,702 INFO org.apache.sqoop.job.mr.SqoopMapper: Starting progress service
> >     2013-08-14 16:47:20,727 INFO org.apache.sqoop.job.mr.SqoopMapper: Running extractor class org.apache.sqoop.connector.jdbc.GenericJdbcImportExtractor
> >     2013-08-14 16:47:20,732 DEBUG org.apache.sqoop.job.mr.ProgressRunnable: Auto-progress thread reporting progress
> >     2013-08-14 16:47:21,919 INFO org.apache.sqoop.connector.jdbc.GenericJdbcImportExtractor: Using query: SELECT * FROM text WHERE 100001 <= id AND id < 200001
> >     2013-08-14 16:47:50,072 INFO org.apache.sqoop.job.mr.SqoopMapper: Extractor has finished
> >     2013-08-14 16:47:50,079 INFO org.apache.sqoop.job.mr.SqoopMapper: Stopping progress service
> >     2013-08-14 16:47:50,086 INFO org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor: SqoopOutputFormatLoadExecutor::SqoopRecordWriter is about to be closed

Syslog is where all the logging goes, and I think this is controlled by $HADOOP_CONF_DIR/log4j.properties. The problem is that syslog also contains logs that are completely unrelated to our Sqoop jobs. And since $HADOOP_CONF_DIR/log4j.properties is not under Sqoop's control, there is not much we can do there.

The patch allows a Sqoop job to have its own log4j.properties. This lets the job define its own appender and conversion pattern and print Sqoop's logs exclusively to stderr. This would be useful in situations where we need to pull these logs from Hadoop and show them to the user to help with debugging.
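
For illustration only, a job-level log4j.properties along these lines could look roughly like the sketch below. It is not the content of the patch under review: the appender name `stderr` and the DEBUG level are assumptions, and the conversion pattern is simply one that would produce lines shaped like the stderr output quoted above (elapsed milliseconds, thread, level, logger, message).

```properties
# Hypothetical sketch, not the file from the patch.
# Route Sqoop's own loggers to a dedicated appender (the name "stderr" is an assumption).
log4j.logger.org.apache.sqoop=DEBUG, stderr

# Console appender writing to System.err instead of the default System.out.
log4j.appender.stderr=org.apache.log4j.ConsoleAppender
log4j.appender.stderr.Target=System.err

# Conversion pattern: elapsed ms, thread, level, logger, NDC, message.
log4j.appender.stderr.layout=org.apache.log4j.PatternLayout
log4j.appender.stderr.layout.ConversionPattern=%r [%t] %-5p %c %x - %m%n
```

Because only the `org.apache.sqoop` logger is attached to this appender, the unrelated Hadoop messages seen in syslog would not show up in stderr under this sketch.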
- Raghav
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13035/#review25193
On July 30, 2013, 12:37 p.m., Raghav Gautam wrote: