Re: Dropping embedded newlines for csv
Hi Chalcy,
I'm glad that you're enjoying sqoop a lot :-)

I'm sorry for the confusion I mistakenly caused. The name of the parameter is --hive-drop-import-delims in all cases. What I meant is that this argument can be used independently of the --hive-import argument. So you can drop the Hive delimiters (\n, \r, \01) and still import data directly into HDFS without any other Hive interaction. I believe you do not even need a Hive installation to do so. I hope this helps clarify the confusion a bit.

Jarcec
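
As a rough illustration of the point above, the sketch below assembles a Sqoop import that writes comma-delimited files straight to HDFS and passes --hive-drop-import-delims without --hive-import. The JDBC URL, user name, table, and target directory are placeholders invented for this example, and the helper names build_import_args and drop_hive_delims are hypothetical. drop_hive_delims only imitates, in plain Python, the documented effect of the flag on a string field (removing \n, \r, and \01); it is not Sqoop's own code.

```python
import shlex

# Characters that --hive-drop-import-delims removes from string fields,
# per the Sqoop user guide: newline, carriage return, and \01 (Ctrl-A).
HIVE_DELIMS = "\n\r\x01"


def build_import_args(connect, username, table, target_dir):
    """Return the argv for a plain HDFS import (no --hive-import).

    All four parameters are placeholders supplied by the caller.
    """
    return [
        "sqoop", "import",
        "--connect", connect,
        "--username", username,
        "-P",                         # prompt for the password on the console
        "--table", table,
        "--target-dir", target_dir,   # plain HDFS directory, not a Hive table
        "--fields-terminated-by", ",",
        "--hive-drop-import-delims",  # used on its own, without --hive-import
    ]


def drop_hive_delims(field):
    """Imitate the flag's effect on one string field (illustration only)."""
    return field.translate({ord(ch): None for ch in HIVE_DELIMS})


if __name__ == "__main__":
    args = build_import_args(
        "jdbc:oracle:thin:@//dbhost.example.com:1521/ORCL",  # hypothetical URL
        "dave",                                              # hypothetical user
        "NOTES",                                             # hypothetical table
        "/user/dave/notes_csv",                              # hypothetical path
    )
    # Print the command for review rather than running it.
    print(" ".join(shlex.quote(a) for a in args))
    print(repr(drop_hive_delims("line one\r\nline two")))
```

Printing the command first makes it easy to check the flags before handing the list to something like subprocess.run on a machine where Sqoop is installed.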

On Thu, Sep 20, 2012 at 02:07:16PM -0400, Chalcy wrote:
> Hi Jarcec,
>
> I did not know that hive-drop-import-delims works without hive-import.  In
> that case, do we want to call this parameter just --drop-import-delims
> instead of --hive-drop-import-delims?
>
> Thanks,
> Chalcy
>
> On Thu, Sep 20, 2012 at 2:04 PM, Chalcy <[EMAIL PROTECTED]> wrote:
>
> > I use hive-drop-import-delims for Hive imports; embedded newlines were the
> > problem I had to solve a year ago.  Since you want the data in HDFS, one
> > workaround is to do a Hive import and then use the underlying HDFS
> > directory, e.g. /user/hive/warehouse/mynewlineremoveddata.
> >
> > Sqoop is a great tool.  I'm using Sqoop for all database imports.
> >
> > Thanks,
> > Chalcy
> >
> >
> > On Thu, Sep 20, 2012 at 1:55 PM, David Kincaid <[EMAIL PROTECTED]> wrote:
> >
> >> I'm brand new to Sqoop and am working on importing data from an Oracle database
> >> into HDFS. It is going to solve a number of problems I've been trying to
> >> solve, so I'm really excited about it. I have it working great right now
> >> except for one thing. One of the columns in one of the tables has
> >> newline characters in it. I'm importing to comma delimited files and
> >> need to strip off those embedded newline characters since the tool I'm
> >> reading the .csv files with doesn't handle them well.
> >>
> >> I saw the option --hive-drop-import-delims which is exactly what I want,
> >> but I assume that only works when importing to Hive. How have others
> >> solved this problem?
> >>
> >> Thanks,
> >> Dave
> >>
> >
> >