Hive >> mail # user >> hive cli escaping TAB and NEW LINE Characters.


Re: hive cli escaping TAB and NEW LINE Characters.
Maybe I misread your original post. Didn't you say you were parsing the
Hive client output?

You don't have to change the way you're writing the data; you only have to
change the output Hive emits.
So, for example, when producing Hive output I presume you currently do
something like this:

    hive> select col1, col2 from table;  -- where col1 or col2 have embedded tabs and newlines?

So do this instead:

    hive> select transform(col1, col2) using 'script' as col1, col2 from table;

where you write 'script' and do your encoding as suggested in the previous
post.
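
A minimal sketch of what 'script' might look like, in Python. Everything here is an assumption rather than something from the thread: the function names are made up, and the sketch assumes Hive hands the script one row per line with fields separated by \x01 (for instance via a row format clause on the transform input; check the language manual for the exact syntax). It does not address newlines that are already embedded in the rows on their way into the script, which is the kind of problem the warning on the wiki page is about.

```python
#!/usr/bin/env python3
"""Hypothetical transform script: escapes tabs and newlines in each column."""
import sys


def encode_field(value: str) -> str:
    # Escape the backslash first so the encoding can be reversed later.
    return (value.replace("\\", "\\\\")
                 .replace("\t", "\\t")
                 .replace("\n", "\\n"))


def encode_row(line: str, in_sep: str = "\x01") -> str:
    # in_sep is an assumed input separator, not a Hive default.
    if line.endswith("\n"):
        line = line[:-1]
    fields = line.split(in_sep)
    # Emit tab-separated columns, each with tabs/newlines escaped.
    return "\t".join(encode_field(f) for f in fields)


def main() -> None:
    # Read rows from stdin and write encoded rows to stdout.
    for line in sys.stdin:
        sys.stdout.write(encode_row(line) + "\n")
```

To run it as the transform script, add a call to `main()` at the end of the file. The backslash is escaped as well, so that a value which already contains the two characters `\n` is not confused with an encoded newline.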

NB. pay heed to the first warning on this page:
https://cwiki.apache.org/Hive/languagemanual-transform.html
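
On the application side, the decoding step from the earlier post could be sketched as below. This assumes the columns were encoded with backslash escapes (`\t`, `\n`, and `\\` for a literal backslash); `decode_field` and `parse_cli_output` are hypothetical names, and the parser assumes the CLI output holds nothing but tab-separated result rows.

```python
from typing import List


def decode_field(value: str) -> str:
    # Walk the string once so "\\\\n" (escaped backslash + n) is not
    # mistaken for an encoded newline.
    mapping = {"t": "\t", "n": "\n", "\\": "\\"}
    out = []
    i = 0
    while i < len(value):
        ch = value[i]
        if ch == "\\" and i + 1 < len(value):
            nxt = value[i + 1]
            # Unknown escapes are passed through unchanged.
            out.append(mapping.get(nxt, "\\" + nxt))
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def parse_cli_output(text: str) -> List[List[str]]:
    # Drop a single trailing newline, then split rows and columns.
    if text.endswith("\n"):
        text = text[:-1]
    return [[decode_field(col) for col in line.split("\t")]
            for line in text.split("\n")]
```

Because the real tabs and newlines in the output now only ever mean "next column" and "next row", the split is unambiguous and the original values come back after decoding.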
On Mon, May 6, 2013 at 2:41 AM, Valluri, Sathish <[EMAIL PROTECTED]> wrote:

> This is the idea I had thought of, but in our scenario we have little
> control over how the Avro data is written (encoding tabs and newlines as
> other characters).
>
> Avro data can be pumped into the warehouse system from many sources, so
> to implement this kind of logic we would have to handle the TAB and
> NEWLINE encoding in every component that writes data.
>
> I'm interested in whether this can be handled without encoding the Avro
> data itself, e.g. by reading the Avro data, transforming it into another
> encoding, and sending it to the CLI output in that format.
>
> Our app would then decode the data and display it.
>
> Regards
> Sathish Valluri
>
> *From:* Sanjay Subramanian [mailto:[EMAIL PROTECTED]]
> *Sent:* Saturday, May 04, 2013 12:08 AM
> *To:* [EMAIL PROTECTED]
> *Subject:* Re: hive cli escaping TAB and NEW LINE Characters.
>
> +1 to Stephen's suggestion…
>
> *From:* Stephen Sprague <[EMAIL PROTECTED]>
> *Reply-To:* "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
> *Date:* Friday, May 3, 2013 11:29 AM
> *To:* "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
> *Subject:* Re: hive cli escaping TAB and NEW LINE Characters.
>
> Hate to sound like a broken record, but when all else fails think about the
> transform() function. The notion here is to encode your tabs and newlines
> as something like '\t' and '\n' (literally), for instance. If those aren't
> unique enough, use '<<tab>>' and '<<newline>>' (you get the idea), then
> have your app decode those strings to real tabs and real newlines when
> reading it.
>
> What do you think?
>
> On Fri, May 3, 2013 at 2:07 AM, Valluri, Sathish <[EMAIL PROTECTED]>
> wrote:
>
> Hi All,
>
> We have an application which parses Hive CLI output and displays results.
>
> I have an external table with data in Avro format, and the contents of the
> Avro file include TAB and NEWLINE characters in the data.
>
> Since Hive CLI output rows are delimited by newlines and columns are
> delimited by tabs, parsing the result set gives wrong results when the
> actual content contains TAB and NEWLINE characters.
>
> Can anyone suggest ideas for escaping the TAB and NEWLINE characters in
> the Hive CLI output when the column contents contain them?
>
> Regards
> Sathish Valluri
>