RE: Load data from xml using Mapper.py in hive
You could load the whole XML file into a table with a single row and a single column. The default record delimiter is \n, but you can create a table whose record delimiter is \001. Once you have done that, you can follow the approach you described below. Would this solve your problem?
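
A rough sketch of what the script side of that approach might look like, assuming Hive streams the single-row table to the script on standard input and that the XML uses a hypothetical `<rows><row><col1>..</col1><col2>..</col2></row></rows>` layout (the thread does not show the actual file). The function name `transform` and the element names are illustrative only:

```python
import io
import xml.etree.ElementTree as ET


def transform(in_stream, out_stream, columns=("col1", "col2")):
    """Read one whole XML document from in_stream and write one
    tab-separated line per <row> element to out_stream.

    The <row>/<col1>/<col2> element names are assumptions; adjust them
    to match the real file.
    """
    text = in_stream.read()
    if not text.strip():
        return 0  # nothing was streamed in
    root = ET.fromstring(text)
    count = 0
    for row in root.iter("row"):
        # A missing child element becomes an empty string
        values = [(row.findtext(name, default="") or "").strip()
                  for name in columns]
        out_stream.write("\t".join(values) + "\n")
        count += 1
    return count


# Small in-memory demo standing in for the data Hive would stream
demo_in = io.StringIO("<rows>\n<row><col1>a</col1><col2>b</col2></row>\n</rows>")
demo_out = io.StringIO()
transform(demo_in, demo_out)
print(demo_out.getvalue(), end="")
```

In an actual mapper script the call would presumably be `transform(sys.stdin, sys.stdout)`, so that each line written becomes one row with two tab-separated columns.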


From: Shuja Rehman [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, June 09, 2010 3:07 PM
Subject: Load data from xml using Mapper.py in hive

I have created a table in Hive (say table1, with two columns, col1 and col2).

Now I have an XML file, for which I have written a Python script that reads the file and transforms each record into a single tab-separated row.
E.g. the output of the Python script can be:

row 1 = val1     val2
row 2 = val3     val4

So the output of the Python script consists of plain rows. Now I want to load this into the table I created. I have seen the example in which the data is first loaded into the u_data table and then transformed with a Python script into u_data_new, but that does not fit my scenario, as my source is an XML file.
Kindly let me know how I can achieve this.
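
For illustration, the kind of conversion described above might be sketched as follows. The XML layout in `SAMPLE_XML`, the function name `xml_to_rows`, and the element names are assumptions made for the example, since the actual file is not shown here:

```python
import xml.etree.ElementTree as ET

# Hypothetical input layout; the real XML file may look different.
SAMPLE_XML = """<rows>
  <row><col1>val1</col1><col2>val2</col2></row>
  <row><col1>val3</col1><col2>val4</col2></row>
</rows>"""


def xml_to_rows(xml_text, columns=("col1", "col2")):
    """Return one tab-separated string per <row> element."""
    root = ET.fromstring(xml_text)
    lines = []
    for row in root.iter("row"):
        # A missing child element becomes an empty string
        values = [(row.findtext(name, default="") or "").strip()
                  for name in columns]
        lines.append("\t".join(values))
    return lines


for line in xml_to_rows(SAMPLE_XML):
    print(line)
```

Writing these lines to a file gives tab-delimited text with one record per line, matching the two-column shape of table1.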

Shuja-ur-Rehman Baig
MS CS - School of Science and Engineering
Lahore University of Management Sciences (LUMS)
Sector U, DHA, Lahore, 54792, Pakistan
Cell: +92 3214207445