possible to infer schema from TSV header?
I have TSVs with a lot of columns, and I would like to address them by
name, as specified in the header line (first row), within Pig.

The best I can come up with at the moment is to write a script that strips the
header line from the file and converts it to the form (col1:chararray,
col2:chararray, ...), then plug that schema string into the AS clause of
my LOAD statement. Then I'll project the columns I want and manually
cast them to the right types.
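
For reference, here is a rough sketch of the header-to-schema step described above. The function name pig_schema_from_header and the "c<index>_" prefix for names that are not valid Pig aliases are my own placeholders, and it assumes every column should start out as chararray. It does not handle duplicate column names.

```python
import re


def pig_schema_from_header(header_line, sep="\t", pig_type="chararray"):
    """Build a Pig AS-clause schema string from a TSV header line.

    Hypothetical helper: every column gets the same type (chararray by
    default), to be cast to the real type later in the Pig script.
    """
    names = header_line.rstrip("\r\n").split(sep)
    fields = []
    for i, name in enumerate(names):
        # Replace anything that is not a letter, digit, or underscore.
        alias = re.sub(r"[^A-Za-z0-9_]", "_", name.strip())
        # Pig aliases must start with a letter; prefix the ones that do not.
        if not alias or not alias[0].isalpha():
            alias = "c%d_%s" % (i, alias)
        fields.append("%s:%s" % (alias, pig_type))
    return "(" + ", ".join(fields) + ")"


if __name__ == "__main__":
    print(pig_schema_from_header("user id\tage\t2nd_score\n"))
    # prints (user_id:chararray, age:chararray, c2_2nd_score:chararray)
```

The resulting string would be pasted after AS in the LOAD statement; the header row itself still has to be stripped from the file or filtered out separately.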

Is there a better, simpler way?