Hive >> mail # user >> Special characters in web log file causing issues


Re: Special characters in web log file causing issues
Yes Raj,

That's a Unix command.
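
For anyone following along, here is a minimal sketch of how the tr filter Sanjay suggested (quoted below) might be run end to end. The file names raw_weblog.txt and clean_weblog.txt and the sample record are made up for illustration. LC_ALL=C is an added assumption so that tr treats the input byte by byte; under it, non-ASCII bytes are dropped as well.

```shell
# Hypothetical file names; substitute your own log path.
raw=raw_weblog.txt
clean=clean_weblog.txt

# Build a tiny sample: the first record has two stray control bytes
# (octal 001 and 013) inside a field, the second record is normal.
printf 'Windows Live\001\013 Photo Gallery;Silverlight\nGoogle Update;Java\n' > "$raw"

# -c complements the set and -d deletes, so every byte that is NOT
# printable, CR, LF, or TAB is removed. tr reads stdin and writes stdout.
LC_ALL=C tr -cd '[:print:]\r\n\t' < "$raw" > "$clean"

# Show the cleaned intermediate file that would then be loaded into Hive.
cat "$clean"
```

The cleaned file keeps one record per line, so it can be loaded into Hive in place of the raw log. The strings utility is an alternative, but by default it prints each printable run on its own line, so it may split a record at the stray bytes where tr would keep it whole.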
On Tue, Jul 9, 2013 at 6:48 AM, Hadoop Raj <[EMAIL PROTECTED]> wrote:

> Hi Sanjay,
>
> Is that a Unix command (tr) or something else? Please let me know.
>
>
> On Jul 8, 2013, at 7:46 PM, Sanjay Subramanian <
> [EMAIL PROTECTED]> wrote:
>
> You may have to remove the non-printable characters first, save an
> intermediate file, and then load that file into Hive:
>
>  tr -cd '[:print:]\r\n\t'
>
>  Or, if you have the *strings* utility, it outputs only printable characters.
>
>
>   From: Raj Hadoop <[EMAIL PROTECTED]>
> Reply-To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>, Raj Hadoop <
> [EMAIL PROTECTED]>
> Date: Monday, July 8, 2013 1:52 PM
> To: Hive <[EMAIL PROTECTED]>
> Subject: Special characters in web log file causing issues
>
>
>   Hi ,
>
> The log file that I am trying to load through Hive has some special
> characters.
>
> The field is shown below, with the special characters *¿¿* marked.
>
>      Shockwave Flash;Chrome Remote Desktop Viewer;Native Client;Chrome PDF Viewer;Adobe Acrobat;Microsoft Office 2010;Motive Plug-in;Motive Management Plug-in;Google Update;Java(TM) Platform SE 7 U21;McAfee SiteAdvisor;McAfee Virtual Technician;Windows Live*¿¿ *Photo Gallery;McAfee SecurityCenter;Silverlig
>
>
> These characters cause the record to be terminated early, and the rest is
> loaded as another line. How can I avoid this kind of issue and load the
> data correctly? Any suggestions, please?
>
>  Thanks,
> Raj
>
>
>
--
Nitin Pawar