Pig >> mail # user >> Efficient load for data with large number of columns


Re: Efficient load for data with large number of columns
Suppose my data has 100 columns or fields, and I want to impose a schema.
Is there a way I can create a separate file describing the schema of these
fields, and let Pig read the schema from that file?
Yes. If you put a JSON file named ".pig_schema" in the same directory as your data, PigStorage will use it to determine the schema when you load that data:

http://pig.apache.org/docs/r0.10.0/func.html#pigstorage
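
With 100 columns you probably do not want to write that file by hand. Here is a rough Python sketch that generates it from a list of column names. The JSON layout and the numeric type codes are my assumption of what PigStorage writes when you store with the '-schema' option, so compare against a file produced by your own Pig version before relying on it. The helper names (build_pig_schema, write_pig_schema) and the column names are made up for illustration.

```python
import json
import os
import tempfile

# Assumed numeric type codes, taken from org.apache.pig.data.DataType.
# Verify them against your Pig version.
PIG_TYPES = {
    "int": 10,
    "long": 15,
    "float": 20,
    "double": 25,
    "bytearray": 50,
    "chararray": 55,
}


def build_pig_schema(columns):
    """Build the schema dict from a list of (name, pig_type_name) pairs.

    The layout (fields / version / sortKeys / sortKeyOrders) is an
    assumption based on files written by PigStorage with '-schema'.
    """
    fields = [
        {
            "name": name,
            "type": PIG_TYPES[type_name],
            "description": "",
            "schema": None,  # only nested types (tuple, bag) carry a sub-schema
        }
        for name, type_name in columns
    ]
    return {"fields": fields, "version": 0, "sortKeys": [], "sortKeyOrders": []}


def write_pig_schema(data_dir, columns):
    """Write the schema as '.pig_schema' inside data_dir and return its path."""
    path = os.path.join(data_dir, ".pig_schema")
    with open(path, "w") as handle:
        json.dump(build_pig_schema(columns), handle)
    return path


if __name__ == "__main__":
    # Hypothetical example: 100 chararray columns named col0 .. col99.
    columns = [("col%d" % i, "chararray") for i in range(100)]
    out_dir = tempfile.mkdtemp()
    print(write_pig_schema(out_dir, columns))
```

This writes to a local directory, so if your data lives on HDFS you would still need to copy the file next to it. Another way to get a known-good file is to store a small sample once with PigStorage and the '-schema' option and reuse what Pig writes.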

Regards,
Marcos