HBase >> mail # user >> Bulk loading a CSV file into HBase


Re: Bulk loading a CSV file into HBase
GenericOptionsParser stops parsing the arguments as soon as it reaches the
first non-option argument (see:
http://commons.apache.org/cli/api-1.2/org/apache/commons/cli/Parser.html#parse(org.apache.commons.cli.Options,
java.lang.String[], boolean))

So in this case, as soon as the parser sees the table name argument, it
ignores every property specified with -D after it. Note that it ignores not
only the separator: in your failed run it also ignored the
importtsv.skip.bad.lines option.
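
To make the ordering issue concrete, here is a minimal sketch in Python of
stop-at-first-non-option parsing. It is a simplified model written for this
thread, not the actual GenericOptionsParser or Commons CLI code; the function
name parse_generic_options, the table name "mytable" and the path "/input"
are all hypothetical.

```python
def parse_generic_options(args):
    """Simplified model of stop-at-first-non-option parsing.

    Returns (conf, remaining): the -D properties seen before the first
    non-option argument, and everything from that argument onward.
    """
    conf = {}
    i = 0
    while i < len(args):
        arg = args[i]
        if arg == "-D" and i + 1 < len(args):
            # Form: -D key=value
            pair = args[i + 1]
            i += 2
        elif arg.startswith("-D") and len(arg) > 2:
            # Form: -Dkey=value
            pair = arg[2:]
            i += 1
        else:
            # First non-option argument: stop parsing here.
            break
        key, _, value = pair.partition("=")
        conf[key] = value
    return conf, args[i:]


# Hypothetical argument lists: same arguments, different order.
good = ["-Dimporttsv.separator=,", "-Dimporttsv.skip.bad.lines=false",
        "mytable", "/input"]
bad = ["mytable", "-Dimporttsv.separator=,",
       "-Dimporttsv.skip.bad.lines=false", "/input"]

conf_good, rest_good = parse_generic_options(good)
conf_bad, rest_bad = parse_generic_options(bad)
```

In this model, conf_good holds both properties and rest_good is just the
table name and input path, while conf_bad is empty and the -D arguments are
left untouched in rest_bad, which mirrors the behaviour described above.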

On Thu, Mar 8, 2012 at 11:27 AM, Stack <[EMAIL PROTECTED]> wrote:

> On Thu, Mar 8, 2012 at 11:14 AM, anil gupta <[EMAIL PROTECTED]> wrote:
> > 1. Update the HBase bulk load documentation and specify that the
> > separator argument should come right after the program name.
>
> This would help.
>
> > 2. Fix the problem in the code itself by handling the separator argument
> > explicitly. (Still, I am wondering why only the separator value is not
> > being set in the jobconf automatically if it is not provided next to the
> > program name.)
> >
>
> This is probably too late IIRC.  I haven't looked at the code, but
> GenericOptionsParser has probably already been run by the time the
> application starts to process args.  Duplicating what GOP does in the
> application is probably not the way to go either?
>
> St.Ack
>