We have a job that processes several hundred files in a directory.
We generally glob the directory in a single load statement.
Sometimes the job chokes on a bad row in a single file.
I could have sworn that Pig printed the file names of the chunks it is processing in the task log, but I cannot see them.
Does anyone know under what conditions file names are printed, or how to find the file that is causing the issues?
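
For what it's worth, one way to narrow this down outside of Pig is to scan the input files directly. The following is only a rough sketch, and it rests on assumptions that are not stated above: that the files can be read from a local path (for example after copying them out of HDFS), that rows are delimiter-separated text, and that a "bad row" means a row with the wrong number of fields. The function name `find_bad_rows` and its parameters are hypothetical.

```python
import glob
from typing import Iterator, Tuple


def find_bad_rows(
    pattern: str,
    delimiter: str = "\t",
    expected_fields: int = 3,
) -> Iterator[Tuple[str, int, str]]:
    """Yield (file path, line number, row text) for each suspicious row.

    Assumption: a row is "bad" when its field count differs from
    expected_fields. Adjust this check to match the real failure.
    """
    # Expand the same kind of glob used in the load statement.
    for path in sorted(glob.glob(pattern)):
        # errors="replace" keeps the scan going past undecodable bytes.
        with open(path, "r", encoding="utf-8", errors="replace") as handle:
            for line_number, line in enumerate(handle, start=1):
                row = line.rstrip("\n")
                if len(row.split(delimiter)) != expected_fields:
                    yield path, line_number, row
```

Iterating over `find_bad_rows("/some/local/copy/*", "\t", 3)` (a placeholder path) and printing each tuple would list the offending file names and line numbers, which could then be excluded from the glob or fixed.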
Romain Rigaux 2010-10-25, 16:02
Guy Bayes 2010-10-25, 16:09