Create Table with Line Terminated other than '\n'
Shuja Rehman 2010-06-11, 00:39
Hi, I want to create a table in Hive whose rows are terminated by something other than '\n', so that I can read an XML file as a single cell in one row and one column of the table. Kindly let me know how to do this. Thanks
-- Regards Shuja-ur-Rehman Baig _________________________________ MS CS - School of Science and Engineering Lahore University of Management Sciences (LUMS) Sector U, DHA, Lahore, 54792, Pakistan Cell: +92 3214207445
Re: Create Table with Line Terminated other than '\n'
Carl Steinbach 2010-06-11, 01:31
Hi Shuja,
The grammar for Hive's CREATE TABLE statement is discussed here: http://wiki.apache.org/hadoop/Hive/LanguageManual/DDL#Create_Table
You need to use the "LINES TERMINATED BY" clause in the CREATE TABLE statement in order to specify a line terminator other than "\n".
Carl
Re: Create Table with Line Terminated other than '\n'
Zheng Shao 2010-06-11, 05:22
Also, changing "LINES TERMINATED BY" probably won't work, because Hadoop's TextInputFormat does not allow line terminators other than "\n".
Zheng
-- Yours, Zheng http://www.linkedin.com/in/zshao
Re: Create Table with Line Terminated other than '\n'
Shuja Rehman 2010-06-11, 08:38
Hi,
Yes Zheng, Hadoop does not allow anything other than \n. I tried this:
    create table test (xmlFile String) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\001';
but it gives me this error:
    ERROR ql.Driver: FAILED: Error in semantic analysis: LINES TERMINATED BY only supports newline '\n' right now
Then what can the solution be? Any help?
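For reference, a minimal sketch of the form of this statement that the error message implies Hive accepts: the same table with the line terminator left as '\n'. The table and column names are taken from the message above. Note that with this definition each line of a multi-line XML file would still become its own row, so it does not by itself solve the problem being asked about.

```sql
-- Same table as in the message above, but using the only line terminator
-- that the error message reports as supported.
CREATE TABLE test (xmlFile STRING)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\t'
  LINES TERMINATED BY '\n';
```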
Re: Create Table with Line Terminated other than '\n'
Shuja Rehman 2010-06-11, 11:38
Zheng Shao, is there any other solution?
RE: Create Table with Line Terminated other than '\n'
Ashish Thusoo 2010-06-11, 23:03
The other option is to use the regular expression SerDe, something along these lines:
    create table xyz(doc STRING)
    ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
    WITH SERDEPROPERTIES (
    "input.regex" = "java regular expression",
    "output.format.string" = "%1$s"
    )
    STORED AS SEQUENCEFILE;
I think that may work for you.
The input.regex parameter holds a Java regular expression whose groups map to the columns in a row (in your case there will be only one column). The output.format.string says that the %1 group is the only column in this row, and that it is of type string.
Ashish
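A concrete version of this idea might look like the sketch below. The table name xml_docs, the regex "(.*)", and STORED AS TEXTFILE are assumptions made for illustration and are not part of the original suggestion. The sketch also assumes that the Hive contrib jar containing RegexSerDe is available to the session, and that each XML document has already been flattened onto a single line, because the input is still split on '\n' before the SerDe sees a record.

```sql
-- Hypothetical single-column table: one flattened XML document per row.
-- Assumes the hive-contrib jar that provides RegexSerDe is on the classpath.
CREATE TABLE xml_docs (doc STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
  -- One capturing group, so the whole input line maps to the single column.
  "input.regex" = "(.*)",
  "output.format.string" = "%1$s"
)
-- TEXTFILE is an assumption here, chosen because the source files are plain text.
STORED AS TEXTFILE;

-- Example query against the hypothetical table.
SELECT doc FROM xml_docs LIMIT 1;
```

If the XML files contain embedded newlines, this definition would still yield one row per physical line, so some preprocessing step outside Hive would be needed first.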
RE: Create Table with Line Terminated other than '\n'
Namit Jain 2010-06-12, 03:07
I don't think this will work.
If the data is already in text format, I think the SequenceFileRecordReader will be used to read the data, which will break.
Do you already have the XML file from somewhere, or is it also generated from Hive?
Re: Create Table with Line Terminated other than '\n'
Shuja Rehman 2010-06-12, 10:33
Hi Namit Jain! The XML file is not generated from Hive. I have XML files from somewhere else.
Re: Create Table with Line Terminated other than '\n'
Amr Awadallah 2010-06-12, 06:23
Zheng, I thought that was fixed per your work here, no?
https://issues.apache.org/jira/browse/HIVE-302
Then what did you fix?
-- amr
Re: Create Table with Line Terminated other than '\n'
Zheng Shao 2010-06-13, 03:24
That patch basically throws an error if the user specifies a non-newline line terminator. Without the patch, the statement would succeed but produce unexpected results.