-
Load XML file into HIVE
Sadananda Hegde 2012-08-30, 19:57
Hi,
I would like to load an XML data file into HIVE. I created a table with just one column:
create table xmltable (xmldata String ) STORED AS TEXTFILE;
and then loaded the xml file into that table
LOAD DATA LOCAL INPATH '/test.xml' OVERWRITE INTO TABLE xmltable;
I thought I could use XPath to extract individual elements, but I am not sure how to:
1) Specify the root node as the record terminator in the CREATE TABLE statement (it uses '\n' by default)
2) Change the current context node for the XPath expressions
Can someone provide guidance and maybe point me to some good examples?
Thanks, Sadu
-
Re: Load XML file into HIVE
Matt Tucker 2012-08-30, 23:25
Hi,
I was working on this several months ago, and ended up having to flatten each XML document to one root node per line. I believe that the other option would be to write a custom InputFormat.
Matt
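For readers following along, the flattening step described above could be sketched roughly as follows. This is not Matt's actual code, only a minimal illustration in Python using the standard library; the function name `flatten_records`, the `record` tag, and the sample document are all hypothetical. It assumes the file is small enough to parse in memory and that whitespace inside element text can safely be collapsed.

```python
import xml.etree.ElementTree as ET


def flatten_records(xml_text, record_tag):
    """Return one single-line XML string per <record_tag> element."""
    root = ET.fromstring(xml_text)
    lines = []
    for elem in root.iter(record_tag):
        raw = ET.tostring(elem, encoding="unicode")
        # Collapse every whitespace run (newlines included) to one space,
        # so each record occupies exactly one line of the output file.
        # Note: this also collapses whitespace inside element text.
        lines.append(" ".join(raw.split()))
    return lines


# Hypothetical sample input; the real test.xml layout is not shown in the thread.
sample = """<records>
  <record>
    <id>1</id>
    <name>alpha</name>
  </record>
  <record>
    <id>2</id>
    <name>beta</name>
  </record>
</records>"""

if __name__ == "__main__":
    for line in flatten_records(sample, "record"):
        print(line)
```

Once each record sits on its own line, the default '\n' record terminator of a TEXTFILE table matches the record boundaries, and Hive's built-in XPath UDFs (for example xpath_string) can be applied to the xmldata column one row at a time.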
-
Re: Load XML file into HIVE
Sadananda Hegde 2012-09-02, 13:06
Thanks, Matt. We might go with the 'flattening' option. I was hoping I could do it using XPath without the need for any custom coding.
Regards, Sadu
-
Re: Load XML file into HIVE
Ramkumar 2012-09-03, 10:55
The simplest solution I can think of is to write an intermediate step in Pig that uses XMLLoader to convert the XML to CSV/TSV, and then read that into Hive.
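The conversion idea can be illustrated without Pig. The sketch below is a Python stand-in for that intermediate step, not the Pig XMLLoader approach itself; `xml_to_tsv`, the `record` tag, and the `id`/`name` fields are hypothetical names chosen for the example. It assumes each record's fields are direct child elements and that the file fits in memory.

```python
import csv
import io
import xml.etree.ElementTree as ET


def xml_to_tsv(xml_text, record_tag, fields):
    """Convert each <record_tag> element into one tab-separated row."""
    root = ET.fromstring(xml_text)
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for elem in root.iter(record_tag):
        # findtext returns None for a missing child; emit an empty cell instead.
        writer.writerow([(elem.findtext(f) or "").strip() for f in fields])
    return out.getvalue()


# Hypothetical sample input; the real test.xml layout is not shown in the thread.
sample = """<records>
  <record><id>1</id><name>alpha</name></record>
  <record><id>2</id><name>beta</name></record>
</records>"""

if __name__ == "__main__":
    print(xml_to_tsv(sample, "record", ["id", "name"]), end="")
```

The resulting file could then be loaded into a table declared with ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'. One caveat: the csv module quotes values that contain tabs or newlines, which Hive's delimited text format does not interpret, so such values would need cleaning first.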