Re: Dropping embedded newlines for csv
I got that, Jarcec.  If the parameter does not need Hive, then why call
it --hive-drop-import-delims?  It could instead be called
--drop-import-delims, right?

Having "hive" in the name causes confusion :)  That was my point.

Sorry I did not spell your name right, Jarcec.

--Chalcy
On Thu, Sep 20, 2012 at 2:17 PM, Jarek Jarcec Cecho <[EMAIL PROTECTED]> wrote:

> Hi Chalcy,
> I'm glad that you're enjoying sqoop a lot :-)
>
> I'm sorry for the confusion I mistakenly caused. The name of the parameter
> is --hive-drop-import-delims in all cases. What I meant is that this
> argument can be used independently of the --hive-import argument, so you
> can drop the Hive delimiters (\n, \r, \01) and still import data directly
> into HDFS without any other Hive interaction. I believe you do not even
> need a Hive installation to do so. I hope this helps to clarify the
> confusion a bit.
>
> Jarcec
>
> On Thu, Sep 20, 2012 at 02:07:16PM -0400, Chalcy wrote:
> > Hi Jarec,
> >
> > I did not know that --hive-drop-import-delims works without --hive-import.
> > In that case, do we want to call this parameter just --drop-import-delims
> > instead of --hive-drop-import-delims?
> >
> > Thanks,
> > Chalcy
> >
> > On Thu, Sep 20, 2012 at 2:04 PM, Chalcy <[EMAIL PROTECTED]> wrote:
> >
> > > I use --hive-drop-import-delims for Hive imports; that was the problem
> > > I had to solve a year ago.  Since you want the data in HDFS, you can use
> > > a workaround: do a Hive import and then use the underlying HDFS path,
> > > such as /user/hive/warehouse/mynewlineremoveddata.
> > >
> > > Sqoop is a great tool.  Using sqoop for all database imports.
> > >
> > > Thanks,
> > > Chalcy
> > >
> > >
> > > On Thu, Sep 20, 2012 at 1:55 PM, David Kincaid <[EMAIL PROTECTED]> wrote:
> > >
> > >> I'm brand new to Sqoop and am working on importing data from an
> > >> Oracle database into HDFS. It is going to solve a number of problems
> > >> I've been trying to solve, so I'm really excited about it. I have it
> > >> working great right now except for one thing. One of the columns in
> > >> one of the tables has newline characters in it. I'm importing to
> > >> comma-delimited files and need to strip out those embedded newline
> > >> characters, since the tool I'm reading the .csv files with isn't
> > >> handling them well.
> > >>
> > >> I saw the option --hive-drop-import-delims, which is exactly what I
> > >> want, but I assume that only works when importing to Hive. How have
> > >> others solved this problem?
> > >>
> > >> Thanks,
> > >> Dave
> > >>
> > >
> > >
>
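
For readers who want to see the effect discussed in this thread, here is a minimal Python sketch of what dropping the Hive delimiters from string fields amounts to before rows are written as CSV. It is an illustration only, not Sqoop's implementation; the helper names drop_delims and rows_to_csv and the sample rows are hypothetical. It could also serve as a post-processing workaround if the data has already been pulled some other way.

```python
import csv
import io

# Characters the thread says the option drops: \n, \r and Ctrl-A (0x01).
HIVE_DELIMS = str.maketrans("", "", "\n\r\x01")


def drop_delims(value):
    """Remove the delimiter characters from string values; leave other types alone."""
    return value.translate(HIVE_DELIMS) if isinstance(value, str) else value


def rows_to_csv(rows):
    """Write rows as comma-delimited text with embedded delimiters stripped."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for row in rows:
        writer.writerow([drop_delims(v) for v in row])
    return buf.getvalue()


if __name__ == "__main__":
    # Hypothetical rows standing in for a table with an embedded newline.
    sample = [(1, "line one\nline two"), (2, "plain")]
    print(rows_to_csv(sample), end="")
```

Note that the characters are removed, not replaced, so "line one\nline two" becomes "line oneline two". If a separator should be kept between the joined pieces, substitute a space instead of deleting.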