Re: LOAD multiple files with glob
Just tried this:
----------------------------------------------------
REGISTER 'hdfs:///lib/avro-1.7.2.jar';
REGISTER 'hdfs:///lib/json-simple-1.1.1.jar';
REGISTER 'hdfs:///lib/piggybank.jar';

DEFINE AvroStorage org.apache.pig.piggybank.storage.avro.AvroStorage();

avro = load '/data/2012/trace_ejb3/2012-01-0*.avro' USING
AvroStorage();

groups = group avro by tracetype;

dump groups;
----------------------------------------------------

gave me:

<file avro-test.pig, line 10, column 23> Invalid field projection.
Projected field [tracetype] does not exist.

Pig Stack Trace
---------------
ERROR 1025:
<file avro-test.pig, line 10, column 23> Invalid field projection.
Projected field [tracetype] does not exist.

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias groups
at org.apache.pig.PigServer.openIterator(PigServer.java:862)
at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:682)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:303)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:189)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
at org.apache.pig.Main.run(Main.java:555)
at org.apache.pig.Main.main(Main.java:111)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
Caused by: org.apache.pig.PigException: ERROR 1002: Unable to store alias groups
at org.apache.pig.PigServer.storeEx(PigServer.java:961)
at org.apache.pig.PigServer.store(PigServer.java:924)
at org.apache.pig.PigServer.openIterator(PigServer.java:837)
... 12 more
Caused by: org.apache.pig.impl.plan.PlanValidationException: ERROR 1025:
<file avro-test.pig, line 10, column 23> Invalid field projection.
Projected field [tracetype] does not exist.
at org.apache.pig.newplan.logical.expression.ProjectExpression.findColNum(ProjectExpression.java:183)
at org.apache.pig.newplan.logical.expression.ProjectExpression.setColumnNumberFromAlias(ProjectExpression.java:166)
at org.apache.pig.newplan.logical.visitor.ColumnAliasConversionVisitor$1.visit(ColumnAliasConversionVisitor.java:53)
at org.apache.pig.newplan.logical.expression.ProjectExpression.accept(ProjectExpression.java:207)
at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
at org.apache.pig.newplan.logical.optimizer.AllExpressionVisitor.visit(AllExpressionVisitor.java:101)
at org.apache.pig.newplan.logical.relational.LOCogroup.accept(LOCogroup.java:235)
at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
at org.apache.pig.PigServer$Graph.compile(PigServer.java:1621)
at org.apache.pig.PigServer$Graph.compile(PigServer.java:1616)
at org.apache.pig.PigServer$Graph.access$200(PigServer.java:1339)
at org.apache.pig.PigServer.storeEx(PigServer.java:956)
... 14 more
===============================================================================

Maybe globbing with [] doesn't work, but a wildcard does? No idea why I
get the error above, though.
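
One way to narrow this down (a sketch only, not a confirmed diagnosis) would be
to ask Pig which schema it resolved for the loaded relation before grouping.
This reuses the paths and jars from the script above; the only addition is a
DESCRIBE statement. It is an assumption that the schema is where the problem
lies; the trace only says that the field name could not be resolved.
----------------------------------------------------
```pig
REGISTER 'hdfs:///lib/avro-1.7.2.jar';
REGISTER 'hdfs:///lib/json-simple-1.1.1.jar';
REGISTER 'hdfs:///lib/piggybank.jar';

DEFINE AvroStorage org.apache.pig.piggybank.storage.avro.AvroStorage();

-- Same glob as in the failing script.
avro = LOAD '/data/2012/trace_ejb3/2012-01-0*.avro' USING AvroStorage();

-- Print the schema Pig has for the relation. If tracetype is not listed,
-- or the schema is reported as unknown, "group avro by tracetype" cannot
-- resolve the field and fails with ERROR 1025 as above.
DESCRIBE avro;
```
----------------------------------------------------
If tracetype does show up in the DESCRIBE output, the schema is probably not
the cause and the problem lies elsewhere in the script.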

Kind regards,

Bart

Cheolsoo Park wrote on 25.11.2012 15:33:
> Hi Bart,
>
> avro = load '/data/2012/trace_ejb3/2012-**01-*.avro' USING
> AvroStorage();
> gives me:
> Schema for avro unknown.
>
> This should work. The error that you're getting is not from
> AvroStorage but
> PigServer.
>
> grep -r "Schema for .* unknown" *
> src/org/apache/pig/PigServer.java:
>  System.out.println("Schema for " + alias + " unknown.");
> ...
>
> It looks like you have an error in your Pig script. Can you