Hive >> mail # user >> Load data from xml using Mapper.py in hive


RE: Load data from xml using Mapper.py in hive
You could load the whole XML file into a table with a single row and a single column. The default record delimiter is \n, but you can create a table where the record delimiter is \001, so that the newlines inside the file no longer split it into separate rows. Once you do that, you can follow the approach that you described below. Will this solve your problem?
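
As a rough sketch of the script side of this approach: a mapper run through Hive's TRANSFORM ... USING clause reads its input on stdin and writes tab-separated columns to stdout. The version below treats everything on stdin as one XML document. The tag names (record, col1, col2) and the function name transform are assumptions made for illustration only; the thread does not show the actual XML layout or script.

```python
import io
import sys
import xml.etree.ElementTree as ET


def transform(stream_in, stream_out, record_tag="record", fields=("col1", "col2")):
    """Read one whole XML document from stream_in and write tab-separated rows.

    record_tag and fields are hypothetical; adjust them to the real file.
    Returns the number of rows written.
    """
    xml_text = stream_in.read().strip()
    if not xml_text:
        return 0  # empty input: nothing to emit
    root = ET.fromstring(xml_text)
    count = 0
    for record in root.iter(record_tag):
        # A missing child yields None from findtext; emit an empty field instead.
        values = [(record.findtext(name) or "").strip() for name in fields]
        stream_out.write("\t".join(values) + "\n")
        count += 1
    return count


# Demonstration with in-memory streams. A real mapper script would instead
# end with: transform(sys.stdin, sys.stdout)
demo_in = io.StringIO(
    "<records><record><col1>val1</col1><col2>val2</col2></record></records>"
)
demo_out = io.StringIO()
rows_written = transform(demo_in, demo_out)
sys.stdout.write(demo_out.getvalue())
```

Passing the streams in as parameters keeps the parsing logic testable without Hive; whether Hive hands the single-row XML to the script intact depends on the delimiter setup described above.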

Ashish

________________________________
From: Shuja Rehman [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, June 09, 2010 3:07 PM
To: [EMAIL PROTECTED]
Subject: Load data from xml using Mapper.py in hive

Hi
I have created a table in Hive (say table1, with two columns, col1 and col2).

Now I have an XML file, for which I have written a Python script that reads the file and transforms it into rows of tab-separated values.
For example, the output of the Python script can be:

row 1 = val1     val2
row 2 = val3     val4

So the script's output is a file of plain rows. Now I want to load this into the table I created. I have seen the example in which the data is first loaded into the u_data table and then transformed with a Python script into u_data_new, but it does not fit my scenario, as my source is an XML file.
Kindly let me know whether I can achieve this.
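
For reference, a script of the kind described above might look like the following sketch. The XML layout in SAMPLE_XML and the names xml_to_rows, record_tag and fields are assumptions for illustration; the actual file and script are not shown in this message.

```python
import xml.etree.ElementTree as ET

# Hypothetical input layout; the real file's tags are not given in the thread.
SAMPLE_XML = """<records>
  <record><col1>val1</col1><col2>val2</col2></record>
  <record><col1>val3</col1><col2>val4</col2></record>
</records>"""


def xml_to_rows(xml_text, record_tag="record", fields=("col1", "col2")):
    """Return one tab-separated line per record element found in xml_text."""
    root = ET.fromstring(xml_text)
    rows = []
    for record in root.iter(record_tag):
        # A missing child yields None from findtext; emit an empty field instead.
        values = [(record.findtext(name) or "").strip() for name in fields]
        rows.append("\t".join(values))
    return rows


# Print the rows, one per line, as the script in the question does.
for line in xml_to_rows(SAMPLE_XML):
    print(line)
```

If this output is saved to a file, one option is to load it with LOAD DATA into a table declared with tab as the field delimiter, since Hive's default field delimiter is \001 rather than tab.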
Thanks

--
Regards
Shuja-ur-Rehman Baig
_________________________________
MS CS - School of Science and Engineering
Lahore University of Management Sciences (LUMS)
Sector U, DHA, Lahore, 54792, Pakistan
Cell: +92 3214207445