Re: How to get/operate the InputFileName in pig 0.8.1
Jameson Li 2011-06-16, 13:09
Great. With the setting
-Dpig.noSplitCombination=true, I can get the filename in the
But I have another problem.
I modified the UDF code, rebuilt it with ant, and generated a new jar file (I am sure
the jar file has been updated).
pig -x local
a = load 'aaa';
b = foreach a generate com.company.pig.myUDF();
I found that the result still comes from the old jar file and UDF class, so I
think the UDF classes have been cached somewhere.
Am I right?
And how can I use the newest jar file after recompiling?
Thanks very much.
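
One way to check which jar a class really comes from is to ask the JVM for its code source. The sketch below is only an illustration: `JarLocator` and `locationOf` are hypothetical names, not part of Pig, and it assumes the result can be printed or logged from inside the UDF. If the reported path is not the rebuilt jar, an older copy is probably being found earlier on the classpath.

```java
import java.security.CodeSource;

/** Hypothetical helper (not part of Pig): reports where a class was loaded from. */
public class JarLocator {

    /** Returns the jar or directory URL a class was loaded from, or a placeholder if unknown. */
    public static String locationOf(Class<?> cls) {
        // Classes loaded by the bootstrap class loader (e.g. java.lang.String)
        // have no code source, so guard against null.
        CodeSource source = cls.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "(unknown location)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) {
        // Inside a UDF, pass the UDF class itself instead of JarLocator.class.
        System.out.println(locationOf(JarLocator.class));
    }
}
```

Calling `locationOf` with the UDF class from its `exec` method and comparing the output with the path of the freshly built jar should show whether the old class is still being picked up.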
2011/6/15 Daniel Dai <[EMAIL PROTECTED]>
> Check http://wiki.apache.org/pig/PigStorageWithInputPath, also you will
> need to disable split combination: -Dpig.noSplitCombination=true
> On 06/13/2011 04:07 AM, Jameson Li wrote:
> I have some files in hdfs://path/load/ like this:
> These files are generated by other M/R jobs. Each file contains only one
> column, and the number in the file name between 'file_' and '_00001' is a
> I want to add the id to the loaded data like this (I think I need to
> write a LoadFunc to get the id):
> a = load '/path/load/' using com.company.pig.
> dump a;
> // here 'a' will have two columns: one is the original column and
> the other is the id.
> My questions are:
> 1. Is there an existing function that can get the id from the file
> 2. I think this method in Pig 0.6.0 could help me:
> PigStorage.bindTo(String fileName, BufferedPositionedInputStream is, long offset, long end)
> <http://pig.apache.org/docs/r0.6.0/api/org/apache/pig/builtin/PigStorage.html#bindTo(java.lang.String,org.apache.pig.impl.io.BufferedPositionedInputStream,long,long)>
> Specifies a portion of an InputStream to read tuples.
> But I can't find the same method in Pig 0.8.1.
> Which method can I use to work with the input file in the Pig 0.8.1 API?
> Thanks very much.
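
Once the loader knows the input path, pulling the id out of the file name is plain string handling. The following is a minimal sketch, assuming names of the form `file_<id>_<digits>` as described in the original question; `FileIdExtractor`, `extractId`, and the sample path `file_12345_00001` are hypothetical and only for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: extracts the id from a path ending in file_<id>_<digits>. */
public class FileIdExtractor {

    // Assumed naming scheme: "file_", the id digits, "_", a numeric suffix, end of path.
    private static final Pattern NAME = Pattern.compile("file_(\\d+)_\\d+$");

    /** Returns the id as a string, or null if the path does not match the assumed scheme. */
    public static String extractId(String path) {
        if (path == null) {
            return null;
        }
        Matcher m = NAME.matcher(path);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Made-up example path following the assumed naming scheme.
        System.out.println(extractId("hdfs://path/load/file_12345_00001"));
    }
}
```

In a loader along the lines of the wiki page linked above, the extracted id could be appended to each tuple instead of the full path.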