Hive, mail # user - Problem: LINES TERMINATED BY only supports newline '\n' right now.


Re: Problem: LINES TERMINATED BY only supports newline '\n' right now.
Mark Grover 2012-06-04, 21:06
Hi Tabraiz,
The 10 in the source code is the ASCII code (in base 10) for '\n', which is why you see it. It still represents a linefeed.
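To make the point concrete, here is a small Python check (an illustration only, not taken from the Hive source) showing that '\n' and 10 name the same character, while '\001' is a different one: the Ctrl-A character that Hive uses as its default field delimiter.

```python
# '\n' is the line feed character; its ASCII code in base 10 is 10.
newline_code = ord("\n")

# '\001' is an octal escape for code point 1 (Ctrl-A), a different character.
ctrl_a_code = ord("\001")

print(newline_code)  # 10
print(ctrl_a_code)   # 1
```

So '10' is not a second line terminator; it is just another spelling of the newline.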

Mark

----- Original Message -----
From: "tabraiz anwer" <[EMAIL PROTECTED]>
To: "Mark Grover" <[EMAIL PROTECTED]>, "hive group" <[EMAIL PROTECTED]>
Sent: Monday, June 4, 2012 4:42:02 PM
Subject: Re: Problem: LINES TERMINATED BY only supports newline '\n' right now.

Hello Mark,
Instead of '\n', we can also terminate records with '10'. I have seen the example in the Hive wiki where tables are created with records terminated by '\001'.
I have checked the source of the Hive syntax analyzer. There are only two options for LINES termination: one is '\n' and the other is '10'. The question now is how I can add another line termination value, '\001'.
Regards.

From: Mark Grover <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]; tabraiz anwer <[EMAIL PROTECTED]>
Sent: Monday, 4 June 2012 9:26 PM
Subject: Re: Problem: LINES TERMINATED BY only supports newline '\n' right now.

Hi Tabraiz,
As far as I know, newlines are the only supported way to separate records right now. As a corollary, if a single logical record spans multiple lines, you will have to get rid of the extra newlines so that all of it ends up in the same record.

So, to get around it, you can do one of two things:
1) Pre-process your files to break records apart on newlines.
2) As Ed Capriolo suggested in a previous email thread, you could try to use streaming, parse your XML there, and emit multiple records.
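A minimal sketch of option 1 in Python, assuming the raw file ends each record with '\001' and that the embedded line breaks inside the XML can safely be replaced by a space. The function name `flatten_records` and its parameters are hypothetical, chosen only for this illustration.

```python
def flatten_records(raw, record_sep="\x01", replacement=" "):
    """Return the input rewritten so that each record sits on exactly one line.

    Assumptions (not from Hive itself): records in `raw` are separated by
    `record_sep`, and line breaks inside a record carry no meaning.
    """
    lines = []
    for record in raw.split(record_sep):
        # Drop line breaks that merely surround the separator.
        record = record.strip("\r\n")
        if not record:
            continue  # e.g. the empty piece after the final separator
        # Replace embedded line breaks so the record fits on one line.
        flat = (
            record.replace("\r\n", replacement)
            .replace("\n", replacement)
            .replace("\r", replacement)
        )
        lines.append(flat)
    # Terminate every record with the newline that Hive expects.
    return "".join(line + "\n" for line in lines)
```

The output could then be loaded into a table declared with the default LINES TERMINATED BY '\n'. If the line breaks inside the XML matter, a reversible placeholder would be a better replacement than a space.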

Mark

----- Original Message -----
From: "tabraiz anwer" < [EMAIL PROTECTED] >
To: "hive group" < [EMAIL PROTECTED] >
Sent: Monday, June 4, 2012 12:08:12 PM
Subject: Problem: LINES TERMINATED BY only supports newline '\n' right now.

Hi,
I tried to create a table with "LINES TERMINATED BY '\001'"
and it gives me this error:
Error in semantic analysis: 3:66 LINES TERMINATED BY only supports newline '\n' right now. Error encountered near token ''\001''
CREATE TABLE xmlgw4 ( transactionid string, typeid string,
sentxml string,receivedxml string )
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\001'
STORED AS TEXTFILE;

Instead of '\n' I am using '\001' because I have an XML value that I want to store in Hive, and it includes '\n' characters.
Hive version: hive-0.8.1

Any suggestions?
Regards.