Sqoop >> mail # user >> Dropping embedded newlines for csv


David Kincaid 2012-09-20, 17:55
Jarek Jarcec Cecho 2012-09-20, 18:03
Chalcy 2012-09-20, 18:04
Chalcy 2012-09-20, 18:07
Re: Dropping embedded newlines for csv
Hi Chalcy,
I'm glad that you're enjoying sqoop a lot :-)

I'm sorry for the confusion I mistakenly caused. The name of the parameter is --hive-drop-import-delims in all cases. What I meant is that this argument can be used independently of the --hive-import argument, so you can drop the Hive delimiters (\n, \r, \01) while still importing data directly into HDFS, without any other Hive interaction. I believe you don't even need a Hive installation to do so. I hope this helps clarify things a bit.

Jarcec
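
For illustration, the direct-to-HDFS import described above might look like the sketch below. The host, service name, user, table, and target directory are hypothetical placeholders, not values from this thread. The `run` helper is only a dry-run wrapper added for this example: it prints the command unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Dry-run wrapper (an assumption for this sketch): prints the command by
# default; set DRY_RUN=0 to actually execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Hypothetical connection details and paths; replace with your own.
ORACLE_URL="jdbc:oracle:thin:@//dbhost.example.com:1521/ORCL"
SQOOP_USER="DAVE"
TABLE="MY_TABLE"
TARGET_DIR="/user/dave/my_table_csv"

# Plain HDFS import (note: no --hive-import) to comma-delimited text files,
# with \n, \r and \01 dropped from string columns.
run sqoop import \
  --connect "$ORACLE_URL" \
  --username "$SQOOP_USER" -P \
  --table "$TABLE" \
  --target-dir "$TARGET_DIR" \
  --fields-terminated-by ',' \
  --hive-drop-import-delims
```

With `DRY_RUN=0`, `-P` makes Sqoop prompt for the database password on the console instead of taking it on the command line.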

On Thu, Sep 20, 2012 at 02:07:16PM -0400, Chalcy wrote:
> Hi Jarcec,
>
> I did not know that --hive-drop-import-delims works without --hive-import.  In
> that case, should we call this parameter just --drop-import-delims
> instead of --hive-drop-import-delims?
>
> Thanks,
> Chalcy
>
> On Thu, Sep 20, 2012 at 2:04 PM, Chalcy <[EMAIL PROTECTED]> wrote:
>
> > I use --hive-drop-import-delims for Hive imports; that was the problem
> > I had to solve a year ago.  Since you want the data in HDFS, you can use a
> > workaround: do a Hive import and read the underlying HDFS directory, e.g.
> > /user/hive/warehouse/mynewlineremoveddata.
> >
> > Sqoop is a great tool.  Using sqoop for all database imports.
> >
> > Thanks,
> > Chalcy
> >
> >
> > On Thu, Sep 20, 2012 at 1:55 PM, David Kincaid <[EMAIL PROTECTED]> wrote:
> >
> >> I'm brand new to Sqoop and am working on importing data from an Oracle database
> >> into HDFS. It is going to solve a number of problems I've been trying to
> >> solve, so I'm really excited about it. I have it working great right now
> >> except for one thing. One of the columns in one of the tables has
> >> newline characters in it. I'm importing to comma delimited files and
> >> need to strip off those embedded newline characters since the tool I'm
> >> reading the .csv files with isn't handling those well.
> >>
> >> I saw the option --hive-drop-import-delims which is exactly what I want,
> >> but I assume that only works when importing to Hive. How have others
> >> solved this problem?
> >>
> >> Thanks,
> >> Dave
> >>
> >
> >
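
A rough sketch of the workaround Chalcy describes in the quoted message: run a Hive import with the delimiters dropped, then read the files from the table's directory under the Hive warehouse. The connection details and user name are hypothetical. The table name and warehouse path follow the example path in that message and assume Hive's default warehouse location. As before, `run` is a dry-run helper added only for this example.

```shell
#!/bin/sh
# Dry-run wrapper (an assumption for this sketch): prints the command by
# default; set DRY_RUN=0 to actually execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Hypothetical connection details; replace with your own.
ORACLE_URL="jdbc:oracle:thin:@//dbhost.example.com:1521/ORCL"
SQOOP_USER="DAVE"
TABLE="MY_TABLE"

# Table name taken from the example path in the message; the warehouse
# directory is assumed to be Hive's default.
HIVE_TABLE="mynewlineremoveddata"
WAREHOUSE_DIR="/user/hive/warehouse"

# Import into Hive, dropping \n, \r and \01 from string columns and
# keeping comma as the field delimiter.
run sqoop import \
  --connect "$ORACLE_URL" \
  --username "$SQOOP_USER" -P \
  --table "$TABLE" \
  --hive-import \
  --hive-table "$HIVE_TABLE" \
  --hive-drop-import-delims \
  --fields-terminated-by ','

# The imported files can then be read straight from HDFS.
run hadoop fs -ls "$WAREHOUSE_DIR/$HIVE_TABLE"
```

Unlike the plain HDFS import, this route needs a working Hive installation.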
Chalcy 2012-09-20, 18:23
Jarek Jarcec Cecho 2012-09-20, 18:51
David Kincaid 2012-09-20, 20:24