Re: Special characters in web log file causing issues
Yes Raj,

that's a Unix command.
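
In case it helps, here is a minimal sketch of how the tr step quoted below could be run before the load. The function name clean_log, the file names, and the table name are hypothetical, and forcing LC_ALL=C (so that only ASCII printable bytes survive) is an assumption, not part of Sanjay's suggestion.

```shell
# clean_log INPUT OUTPUT
# Hypothetical helper: copies INPUT to OUTPUT, dropping every byte that is
# not printable, carriage return, newline, or tab.
clean_log() {
  # -c complements the listed set and -d deletes, so everything outside
  # the set is removed. LC_ALL=C is an assumption: it restricts
  # [:print:] to ASCII, which also strips any multi-byte characters.
  LC_ALL=C tr -cd '[:print:]\r\n\t' < "$1" > "$2"
}

# Example usage (file and table names are placeholders):
#   clean_log weblog_raw.log weblog_clean.log
#   hive -e "LOAD DATA LOCAL INPATH 'weblog_clean.log' INTO TABLE weblogs;"
```

Note that newlines are kept on purpose, so if the stray bytes include a real line feed inside the field, the record will still split and would need separate handling.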
On Tue, Jul 9, 2013 at 6:48 AM, Hadoop Raj <[EMAIL PROTECTED]> wrote:

> Hi Sanjay,
>
> Is that a unix trap command or any other thing? Please let me know.
>
>
> On Jul 8, 2013, at 7:46 PM, Sanjay Subramanian <
> [EMAIL PROTECTED]> wrote:
>
> You may have to remove non-printable characters first, save an intermediate
> file, and then load it into Hive.
>
>  tr -cd '[:print:]\r\n\t'
>
>  Or, if you have the *strings* utility, that will output only printable characters.
>
>
>   From: Raj Hadoop <[EMAIL PROTECTED]>
> Reply-To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>, Raj Hadoop <
> [EMAIL PROTECTED]>
> Date: Monday, July 8, 2013 1:52 PM
> To: Hive <[EMAIL PROTECTED]>
> Subject: Special characters in web log file causing issues
>
>
>   Hi ,
>
> The log file that I am trying to load through Hive has some special
> characters.
>
> The field is shown below, and the special characters *¿¿* are also shown.
>
>      Shockwave Flash;Chrome Remote Desktop Viewer;Native Client;Chrome
> PDF Viewer;Adobe Acrobat;Microsoft Office 2010;Motive Plug-
>      in;Motive Management Plug-in;Google Update;Java(TM) Platform SE 7 U21
> ;McAfee SiteAdvisor;McAfee Virtual Technician;Windows     Live*¿¿ *Photo
> Gallery;McAfee SecurityCenter;Silverlig
>
>
> The above is causing the record to be terminated and the rest to be loaded
> as another line. How can I avoid this type of issue and load the data
> properly? Any suggestions, please?
>
>  Thanks,
> Raj
>
>
--
Nitin Pawar