Pig >> mail # dev >> Need help to resolve one error


Need help to resolve one error
Hi, I am new to Pig scripting.

I am trying to parse XML values with a Pig script, but I get the following error:

ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1070: Could not resolve
org.apache.pig.piggybank.storage.XMLLoader using imports: [,
org.apache.pig.builtin., org.apache.pig.impl.builtin.]
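
ERROR 1070 here says that Pig could not resolve the class org.apache.pig.piggybank.storage.XMLLoader. One possible cause (an assumption, not a confirmed diagnosis) is that the registered piggybank.jar does not contain that class. A minimal Python sketch for checking this is below; the helper name jar_contains_class is hypothetical, and the jar path in the usage comment is the one from the script.

```python
import zipfile


def jar_contains_class(jar_path, class_name):
    """Return True if the jar (a zip archive) holds an entry for class_name.

    class_name is a fully qualified Java class name, e.g.
    'org.apache.pig.piggybank.storage.XMLLoader'.
    """
    # A class a.b.C is stored in a jar under the entry a/b/C.class.
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()


# Usage (path taken from the REGISTER statement in the script):
# jar_contains_class('/home/hdmaster/Downloads/piggybank.jar',
#                    'org.apache.pig.piggybank.storage.XMLLoader')
```

If the check returns False, the jar probably comes from a piggybank build that does not include XMLLoader; if it returns True, the cause lies elsewhere.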

My XML file is:

<CATALOG>
<CD>
<TITLE>hadoop developer</TITLE>
<ARTIST>Haider</ARTIST>
<COUNTRY>india</COUNTRY>
<COMPANY>Deloitte</COMPANY>
<PRICE>10.90</PRICE>
<YEAR>2013</YEAR>
</CD>
</CATALOG>

And my Pig script is:

REGISTER '/home/hdmaster/Downloads/piggybank.jar';
A = LOAD '/home/hdmaster/Downloads/sample.xml' USING
org.apache.pig.piggybank.storage.XMLLoader('CD')
    AS (x:chararray);

B = foreach A GENERATE FLATTEN(REGEX_EXTRACT_ALL(x,
    '<CD>\\n\\s*<TITLE>(.*)</TITLE>\\n\\s*<ARTIST>(.*)</ARTIST>\\n\\s*<COUNTRY>(.*)</COUNTRY>\\n\\s*<COMPANY>(.*)</COMPANY>\\n\\s*<PRICE>(.*)</PRICE>\\n\\s*<YEAR>(.*)</YEAR>\\n\\s*</CD>'))
    AS (title:chararray, artist:chararray, country:chararray,
        company:chararray, price:double, year:int);

store B into '/home/hdmaster/Downloads/results';
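
The regular expression can be checked separately from the loader problem. The sketch below is a rough Python equivalent, not Pig itself: it assumes each record reaches the regex as the text from <CD> to </CD> with plain \n line endings, and it uses re.fullmatch because the pattern in the script is written to cover the whole record. The names CD_PATTERN, SAMPLE_RECORD and extract_cd_fields are hypothetical.

```python
import re

# Same pattern as in the Pig script, written as a Python raw string
# (the script's '\\n' and '\\s' correspond to \n and \s here).
CD_PATTERN = re.compile(
    r"<CD>\n\s*<TITLE>(.*)</TITLE>"
    r"\n\s*<ARTIST>(.*)</ARTIST>"
    r"\n\s*<COUNTRY>(.*)</COUNTRY>"
    r"\n\s*<COMPANY>(.*)</COMPANY>"
    r"\n\s*<PRICE>(.*)</PRICE>"
    r"\n\s*<YEAR>(.*)</YEAR>"
    r"\n\s*</CD>"
)

# The single <CD> record from sample.xml (assumed: no indentation, \n endings).
SAMPLE_RECORD = """<CD>
<TITLE>hadoop developer</TITLE>
<ARTIST>Haider</ARTIST>
<COUNTRY>india</COUNTRY>
<COMPANY>Deloitte</COMPANY>
<PRICE>10.90</PRICE>
<YEAR>2013</YEAR>
</CD>"""


def extract_cd_fields(record):
    """Return the six captured fields as strings, or None if there is no match."""
    match = CD_PATTERN.fullmatch(record)
    return match.groups() if match else None


print(extract_cd_fields(SAMPLE_RECORD))
# prints ('hadoop developer', 'Haider', 'india', 'Deloitte', '10.90', '2013')
```

All fields come back as strings here; in the Pig script the AS clause is what declares price as double and year as int.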
Any help with this would be greatly appreciated.

Thanks
Haider