Re: Tips on finding the file that is causing sqoop to choke
Jarek Jarcec Cecho 2013-03-17, 14:37
Hi Matthieu,
this looks to me like corrupted data at first sight. It seems that Sqoop is trying to access more columns than are present on the input line (a minimal sketch of that failure mode follows the links below). We've significantly improved error logging for this kind of exception, so it would be great if you could upgrade to the latest Sqoop 1.4.3 [1] and rerun your failing job.

Jarcec

Links:
1: http://www.apache.org/dist/sqoop/1.4.3/
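
To make the failure mode concrete, here is a minimal, hedged sketch (not Sqoop's actual generated code; the column count and sample line are made up): the generated record class walks an iterator over the delimited fields of one input line and calls next() once per expected column, so a line with fewer fields than columns makes next() throw the NoSuchElementException seen in the stack trace below.

    import java.util.Arrays;
    import java.util.Iterator;
    import java.util.List;

    public class LoadFromFieldsSketch {
        public static void main(String[] args) {
            int expectedColumns = 4;            // assumed number of columns in the target table
            String line = "2013-03-15,abc,42";  // corrupted line with only 3 fields
            List<String> fields = Arrays.asList(line.split(",", -1));

            Iterator<String> it = fields.iterator();
            for (int col = 0; col < expectedColumns; col++) {
                // The 4th call finds the iterator exhausted and throws
                // java.util.NoSuchElementException, as in the trace below.
                String value = it.next();
                System.out.println("column " + col + " = " + value);
            }
        }
    }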

On Fri, Mar 15, 2013 at 05:16:45PM -0400, Matthieu Labour wrote:
> Sometimes when processing lots of logs I get the following errors. Do you
> have any tips on where/how I can find out the file/line that is causing
> Sqoop to choke?
> Thank you
>
> 2013-03-15 21:01:53,619 INFO org.apache.hadoop.util.NativeCodeLoader
> (main): Loaded the native-hadoop library
> 2013-03-15 21:01:54,301 WARN
> org.apache.hadoop.metrics2.impl.MetricsSystemImpl (main): Source name ugi
> already exists!
> 2013-03-15 21:01:54,562 INFO org.apache.hadoop.mapred.MapTask (main): Host
> name: domU-12-31-39-02-61-87.compute-1.internal
> 2013-03-15 21:01:54,618 INFO org.apache.hadoop.util.ProcessTree (main):
> setsid exited with exit code 0
> 2013-03-15 21:01:54,651 INFO org.apache.hadoop.mapred.Task (main):  Using
> ResourceCalculatorPlugin :
> org.apache.hadoop.util.LinuxResourceCalculatorPlugin@1b7ae22
> 2013-03-15 21:01:55,123 INFO com.hadoop.compression.lzo.GPLNativeCodeLoader
> (main): Loaded native gpl library
> 2013-03-15 21:01:55,167 WARN com.hadoop.compression.lzo.LzoCodec (main):
> Could not find build properties file with revision hash
> 2013-03-15 21:01:55,167 INFO com.hadoop.compression.lzo.LzoCodec (main):
> Successfully loaded & initialized native-lzo library [hadoop-lzo rev
> UNKNOWN]
> 2013-03-15 21:01:55,178 WARN
> org.apache.hadoop.io.compress.snappy.LoadSnappy (main): Snappy native
> library is available
> 2013-03-15 21:01:55,178 INFO
> org.apache.hadoop.io.compress.snappy.LoadSnappy (main): Snappy native
> library loaded
> 2013-03-15 21:01:59,652 DEBUG
> org.apache.sqoop.mapreduce.AsyncSqlOutputFormat (Thread-13): Committing
> transaction of 100 statements
> 2013-03-15 21:01:59,874 DEBUG org.apache.sqoop.mapreduce.AutoProgressMapper
> (main): Instructing auto-progress thread to quit.
> 2013-03-15 21:01:59,875 INFO org.apache.sqoop.mapreduce.AutoProgressMapper
> (Thread-14): Auto-progress thread is finished. keepGoing=false
> 2013-03-15 21:01:59,875 DEBUG org.apache.sqoop.mapreduce.AutoProgressMapper
> (main): Waiting for progress thread shutdown...
> 2013-03-15 21:01:59,875 DEBUG org.apache.sqoop.mapreduce.AutoProgressMapper
> (main): Progress thread shutdown detected.
> 2013-03-15 21:01:59,878 INFO org.apache.hadoop.mapred.TaskLogsTruncater
> (main): Initializing logs' truncater with mapRetainSize=-1 and
> reduceRetainSize=-1
> 2013-03-15 21:01:59,958 INFO org.apache.hadoop.io.nativeio.NativeIO (main):
> Initialized cache for UID to User mapping with a cache timeout of 14400
> seconds.
> 2013-03-15 21:01:59,958 INFO org.apache.hadoop.io.nativeio.NativeIO (main):
> Got UserName hadoop for UID 106 from the native implementation
> 2013-03-15 21:01:59,961 WARN org.apache.hadoop.mapred.Child (main): Error
> running child
> java.util.NoSuchElementException
> at java.util.AbstractList$Itr.next(AbstractList.java:350)
> at
> ml_ys_log_gmt_daily_experiment_2.__loadFromFields(ml_ys_log_gmt_daily_experiment_2.java:1290)
> at
> ml_ys_log_gmt_daily_experiment_2.parse(ml_ys_log_gmt_daily_experiment_2.java:1144)
> at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:77)
> at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:36)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> at
> org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:182)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:771)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:375)
> at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
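
Since the task log above does not name the offending split or line, one possible approach, sketched below under assumptions (the export input is plain delimited text you can pull down locally, e.g. with hadoop fs -get; the directory, delimiter and column count must be adjusted to your job's --export-dir, --input-fields-terminated-by and target table), is to scan the files and report every line whose field count differs from the expected column count:

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.DirectoryStream;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.util.regex.Pattern;

    public class FindBadLines {
        public static void main(String[] args) throws IOException {
            // All three values are assumptions; adjust them to the failing job.
            Path exportDir = Paths.get(args.length > 0 ? args[0] : "export-input");
            String delimiter = ",";        // field terminator used by the export files
            int expectedColumns = 4;       // column count of the target table

            try (DirectoryStream<Path> files = Files.newDirectoryStream(exportDir)) {
                for (Path file : files) {
                    if (!Files.isRegularFile(file)) {
                        continue;
                    }
                    try (BufferedReader reader =
                             Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
                        String line;
                        long lineNo = 0;
                        while ((line = reader.readLine()) != null) {
                            lineNo++;
                            // limit -1 keeps trailing empty fields in the count
                            int fieldCount =
                                line.split(Pattern.quote(delimiter), -1).length;
                            if (fieldCount != expectedColumns) {
                                System.out.printf("%s:%d has %d fields (expected %d)%n",
                                        file, lineNo, fieldCount, expectedColumns);
                            }
                        }
                    }
                }
            }
        }
    }

Run against a local copy of the export directory, this prints the file and line number of every record the generated parser would trip over, which should narrow down the corrupted input.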