Pig >> mail # user >> Pig AvroStorage : storing the data


Pig AvroStorage : storing the data
REGISTER /homes/immilind/HadoopLocal/Jars/avro-1.7.1.jar
REGISTER /homes/immilind/HadoopLocal/Jars/piggybank.jar

employee= load '/user/immilind/AvroData' USING
org.apache.pig.piggybank.storage.avro.AvroStorage('multiple_schemas');
DESCRIBE employee;
DUMP employee;
--works fine and dumps data up to this point.
STORE NewEmployee INTO '/user/immilind/AvroData/StoredAvro' USING
AvroStorage();
employee= load '/user/immilind/AvroData/StoredAvro' USING
org.apache.pig.piggybank.storage.avro.AvroStorage();
DESCRIBE employee;
DUMP employee;
Error :  ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1070: Could not
resolve AvroStorage using imports: [, org.apache.pig.builtin.,
org.apache.pig.impl.builtin.]

Am I required to register any additional JAR?
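
ERROR 1070 says that Pig could not find a class named AvroStorage in its default import packages, and the STORE statement above is the only place where the short name is used. A minimal sketch of one way around it, assuming the two JARs already registered are sufficient (the LOAD succeeding suggests so): spell out the fully qualified class name in STORE, or alias it once with DEFINE. It also stores `employee`, because `NewEmployee` is never defined in the script as posted. The output path is kept from the post.

```pig
REGISTER /homes/immilind/HadoopLocal/Jars/avro-1.7.1.jar
REGISTER /homes/immilind/HadoopLocal/Jars/piggybank.jar

-- Alias the fully qualified class so the short name resolves in later statements.
DEFINE AvroStorage org.apache.pig.piggybank.storage.avro.AvroStorage();

employee = LOAD '/user/immilind/AvroData' USING
    org.apache.pig.piggybank.storage.avro.AvroStorage('multiple_schemas');

-- Store the relation that was actually loaded; the alias above resolves AvroStorage().
STORE employee INTO '/user/immilind/AvroData/StoredAvro' USING AvroStorage();
```

Without the DEFINE, writing `USING org.apache.pig.piggybank.storage.avro.AvroStorage()` in the STORE statement should have the same effect.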
Moreover, I am trying to see whether PigStorage works with Avro data, as follows:

REGISTER /homes/immilind/HadoopLocal/Jars/avro-1.7.1.jar
REGISTER /homes/immilind/HadoopLocal/Jars/piggybank.jar

employee= load '/user/immilind/AvroData' USING
org.apache.pig.piggybank.storage.avro.AvroStorage('multiple_schemas');
DESCRIBE employee;
DUMP employee;

NewEmployee = foreach employee generate name as name, age as age,dept as
dept,office as office,salary as salary,lastname as lastname;
STORE NewEmployee INTO '/user/immilind/AvroData/StoredAvro' USING
PigStorage(',');
--works fine till here by creating a Data file

employee = LOAD '/user/immilind/AvroData/StoredAvro' USING PigStorage() as
(name:chararray, age:int, dept:chararray, office:chararray, salary:int,
lastname:chararray);
DESCRIBE employee;
DUMP employee;

OR

REGISTER /homes/immilind/HadoopLocal/Jars/avro-1.7.1.jar
REGISTER /homes/immilind/HadoopLocal/Jars/piggybank.jar

employee= load '/user/immilind/AvroData' USING
org.apache.pig.piggybank.storage.avro.AvroStorage('multiple_schemas');
DESCRIBE employee;
DUMP employee;

STORE employee INTO '/user/immilind/AvroData/StoredAvro' USING
PigStorage(',');
--works fine till here by creating a Data file

employee = LOAD '/user/immilind/AvroData/StoredAvro' USING PigStorage() as
(name:chararray, age:int, dept:chararray, office:chararray, salary:int,
lastname:chararray);
DESCRIBE employee;
DUMP employee;
2013-01-11 15:49:55,740 [main] ERROR org.apache.pig.tools.grunt.Grunt -
ERROR 2245: Cannot get schema from loadFunc
org.apache.pig.piggybank.storage.avro.AvroStorage

Is it possible to load Avro data, store it as plain text, and then load it
again?
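
Two things in the scripts above may be worth checking; neither is confirmed by the error message itself. First, the output directory StoredAvro sits inside /user/immilind/AvroData, so once it exists, any later LOAD of /user/immilind/AvroData with AvroStorage may also pick up the comma-separated part files, which could be why ERROR 2245 names AvroStorage. Second, the data is stored with PigStorage(',') but reloaded with PigStorage(), whose default delimiter is a tab, so each line would come back as a single field. A sketch that avoids both, assuming a hypothetical output path /user/immilind/StoredCsv and the field order used in the foreach above:

```pig
REGISTER /homes/immilind/HadoopLocal/Jars/avro-1.7.1.jar
REGISTER /homes/immilind/HadoopLocal/Jars/piggybank.jar

employee = LOAD '/user/immilind/AvroData' USING
    org.apache.pig.piggybank.storage.avro.AvroStorage('multiple_schemas');

-- Hypothetical output path, chosen only because it lies outside the Avro input directory.
STORE employee INTO '/user/immilind/StoredCsv' USING PigStorage(',');

-- Reload with the same delimiter that was used for STORE.
employee_csv = LOAD '/user/immilind/StoredCsv' USING PigStorage(',') AS
    (name:chararray, age:int, dept:chararray, office:chararray, salary:int,
     lastname:chararray);
DESCRIBE employee_csv;
DUMP employee_csv;
```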
Cheolsoo Park 2013-01-11, 19:33
Milind Vaidya 2013-01-11, 21:02
Russell Jurney 2013-01-11, 21:04