Pig, mail # user - How do I read headers from first line into schema?


Kimmel, Chad 2013-07-09, 19:32
Hi, what I am trying to do is read the headers from the first line of a file and use them as the field names in the schema. For instance, given the following tab-delimited file:

--samplefile.txt--
Name    Job         Age
Chad    Engineer    23
Mike    Stats       34
Chris   IT          25

Instead of deleting the first line and declaring the field names by hand with the AS clause:

rows = LOAD 'samplefile.txt' USING PigStorage('\t') AS (Name:chararray, Job:chararray, Age:int);

I would like to read the field names directly in the Pig script. This matters for my project because the field names are dynamic: they change from file to file, and I need to keep a record of the unique field names in each one.
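To make the goal concrete, here is a minimal sketch of the behaviour I am after, written as a Python pre-processing step rather than in Pig itself. The function names (`schema_from_header`, `load_statement`) are made up for illustration, and every field is typed as chararray because the header line carries no type information.

```python
import re


def schema_from_header(header_line, sep="\t"):
    """Build a Pig AS clause from a header line (hypothetical helper).

    Every field is declared as chararray, since the header only
    provides names, not types.
    """
    names = [n.strip() for n in header_line.rstrip("\r\n").split(sep)]
    cleaned = []
    for name in names:
        # Pig aliases start with a letter and contain only letters,
        # digits, and underscores, so replace anything else.
        alias = re.sub(r"[^A-Za-z0-9_]", "_", name)
        if not alias or not alias[0].isalpha():
            alias = "f_" + alias
        cleaned.append(alias)
    return "(" + ", ".join(f"{a}:chararray" for a in cleaned) + ")"


def load_statement(path, header_line, alias="rows"):
    """Build the full LOAD statement for a tab-delimited file."""
    schema = schema_from_header(header_line)
    return f"{alias} = LOAD '{path}' USING PigStorage('\\t') AS {schema};"


if __name__ == "__main__":
    # Read only the first line of the file to get the field names.
    with open("samplefile.txt") as fh:
        header = fh.readline()
    print(load_statement("samplefile.txt", header))
```

With this approach the header row is still part of the data, so the script would also need something like `FILTER rows BY Name != 'Name'` to drop it, and numeric fields such as Age would have to be cast afterwards. I would prefer a solution that does all of this inside Pig.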

Does anyone know how to solve this problem? I think the LoadMetadata interface might be useful, but I don't know how to use it. Thanks!

Chad

Chad Kimmel Sr. Statistical Analyst | comScore, Inc. (NASDAQ:SCOR)
