Pig user mailing list: XML -> Pig UDF


I want to extend the existing XMLLoader to go beyond capturing the text
inside a tag and to actually create a Pig mapping of the Document Object
Model the XML represents. This would be similar to elephant-bird's
JsonLoader.
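
To make the idea concrete, here is a minimal sketch of the kind of mapping I have in mind. It is written in plain Python with the standard library rather than against the Pig loader API, so nothing here is XMLLoader's or elephant-bird's actual code. The function names, the "@" prefix for attributes, and the "#text" key are assumptions made for illustration. Nested dicts stand in for Pig maps and lists stand in for bags.

```python
import xml.etree.ElementTree as ET


def element_to_map(elem):
    """Convert one XML element into nested dicts/lists (stand-ins for Pig maps/bags)."""
    node = {}
    # Attributes go under "@name" keys (a naming convention assumed for this sketch).
    for name, value in elem.attrib.items():
        node["@" + name] = value
    # A child tag seen once maps to a single value; a repeated tag becomes a list.
    for child in elem:
        value = element_to_map(child)
        if child.tag in node:
            existing = node[child.tag]
            if isinstance(existing, list):
                existing.append(value)
            else:
                node[child.tag] = [existing, value]
        else:
            node[child.tag] = value
    text = (elem.text or "").strip()
    if not node:
        # Leaf element without attributes: collapse it to its text content.
        return text
    if text:
        # Element with both structure and text: keep the text under "#text".
        node["#text"] = text
    return node


def xml_to_map(xml_string):
    """Parse an XML document and return {root_tag: mapped_content}."""
    root = ET.fromstring(xml_string)
    return {root.tag: element_to_map(root)}


doc = '<book id="42"><title>Pig</title><author>A</author><author>B</author></book>'
print(xml_to_map(doc))
# {'book': {'@id': '42', 'title': 'Pig', 'author': ['A', 'B']}}
```

Note that a book with one author would yield a plain string under 'author' while a book with two yields a list, so the inferred shape can differ from record to record.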

For instance, check this example: https://gist.github.com/4368194

Semi-structured data can vary from record to record, so this behavior can be
risky, but I want people to be able to load JSON and XML data easily in their
first session with Pig.

Russell Jurney http://datasyndrome.com

Vitalii Tymchyshyn 2012-12-24, 08:09
Russell Jurney 2012-12-24, 08:13
Vitalii Tymchyshyn 2012-12-29, 23:00