Sqoop >> mail # user >> Dropping embedded newlines for csv


Re: Dropping embedded newlines for csv
I got that, Jarcec.  If the parameter does not need Hive, then why call
it --hive-drop-import-delims?  It could instead be called
--drop-import-delims, right?

Having "hive" in the name causes confusion :)  That was my point.

Sorry I did not spell your name right, Jarcec.

--Chalcy
On Thu, Sep 20, 2012 at 2:17 PM, Jarek Jarcec Cecho <[EMAIL PROTECTED]> wrote:

> Hi Chalcy,
> I'm glad that you're enjoying sqoop a lot :-)
>
> I'm sorry for the confusion I mistakenly caused. The name of the parameter
> is --hive-drop-import-delims in all cases. What I meant is that this
> argument can be used independently of the --hive-import argument, so you
> can drop the Hive delimiters (\n, \r, \01) and still import data directly
> into HDFS without any other Hive interaction. I believe you do not even
> need a Hive installation to do so. I hope this helps clear up the
> confusion a bit.
>
> Jarcec
>
> On Thu, Sep 20, 2012 at 02:07:16PM -0400, Chalcy wrote:
> > Hi Jarec,
> >
> > I did not know that --hive-drop-import-delims works without --hive-import.
> > In that case, do we want to call this parameter just --drop-import-delims
> > instead of --hive-drop-import-delims?
> >
> > Thanks,
> > Chalcy
> >
> > On Thu, Sep 20, 2012 at 2:04 PM, Chalcy <[EMAIL PROTECTED]> wrote:
> >
> > > I use --hive-drop-import-delims for Hive imports; that was the problem
> > > I had to solve a year ago.  Since you want the data in HDFS, you can use
> > > a workaround: do a Hive import and read the underlying HDFS files, e.g.
> > > /user/hive/warehouse/mynewlineremoveddata.
> > >
> > > Sqoop is a great tool.  I use it for all database imports.
> > >
> > > Thanks,
> > > Chalcy
> > >
> > >
> > > On Thu, Sep 20, 2012 at 1:55 PM, David Kincaid <[EMAIL PROTECTED]> wrote:
> > >
> > >> I'm brand new to Sqoop and am working on importing data from an Oracle
> > >> database into HDFS. It is going to solve a number of problems I've been
> > >> trying to solve, so I'm really excited about it. I have it working great
> > >> right now except for one thing. One of the columns in one of the tables
> > >> has newline characters in it. I'm importing to comma-delimited files and
> > >> need to strip out those embedded newline characters, since the tool I'm
> > >> reading the .csv files with isn't handling them well.
> > >>
> > >> I saw the option --hive-drop-import-delims, which is exactly what I want,
> > >> but I assume it only works when importing to Hive. How have others
> > >> solved this problem?
> > >>
> > >> Thanks,
> > >> Dave
> > >>
> > >
> > >
>
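
To make the effect discussed in this thread concrete, here is a minimal Python sketch of what dropping the delimiters does to a record before it is written as CSV. It is not Sqoop code. The character set (\n, \r, \01) follows the documented behavior of --hive-drop-import-delims; the names drop_delims and rows_to_csv and the sample rows are hypothetical, made up for illustration. The same function could also serve as a post-processing fallback if the data has already been exported.

```python
import csv
import io

# Characters that Sqoop's --hive-drop-import-delims is documented to strip
# from string fields: newline, carriage return, and Ctrl-A (\01).
HIVE_DELIMS = {ord("\n"): None, ord("\r"): None, ord("\x01"): None}


def drop_delims(value):
    """Remove the delimiter characters from strings; pass other types through."""
    if isinstance(value, str):
        return value.translate(HIVE_DELIMS)
    return value


def rows_to_csv(rows):
    """Serialize rows to comma-delimited text, one record per line."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for row in rows:
        writer.writerow([drop_delims(field) for field in row])
    return buf.getvalue()


# Hypothetical sample rows; the first row has an embedded line break.
sample = [(1, "first line\r\nsecond line"), (2, "no newline here")]
csv_text = rows_to_csv(sample)
print(csv_text, end="")
```

Note that the characters are removed outright, so the text on either side of a dropped line break runs together. Sqoop also has a --hive-delims-replacement option that substitutes a string of your choice instead.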