Re: Bulk loading a CSV file into HBase
GenericOptionsParser stops parsing the arguments as soon as it encounters the
first non-option argument (see:
http://commons.apache.org/cli/api-1.2/org/apache/commons/cli/Parser.html#parse(org.apache.commons.cli.Options,
java.lang.String[], boolean)).

So in this case, as soon as the parser sees the table name argument, it
ignores every property specified with -D after it. Note that it ignores not
only the separator; it also ignored the importtsv.skip.bad.lines option in
your run that failed.
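
To make the ordering issue concrete, here is a minimal sketch in plain Java.
It is not the GenericOptionsParser or Commons CLI code; it is a hypothetical
stand-in (the class StopAtNonOptionDemo, the table name "mytable" and the path
"/input" are all made up) that only mimics the "stop at the first non-option"
rule, and it only understands the fused -Dkey=value form.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; this is NOT Hadoop or Commons CLI code.
class StopAtNonOptionDemo {

    /** Holds the -D properties that were picked up and the arguments left untouched. */
    static final class Parsed {
        final Map<String, String> properties = new LinkedHashMap<>();
        final List<String> remaining = new ArrayList<>();
    }

    /** Consumes leading -Dkey=value arguments and stops at the first non-option. */
    static Parsed parse(String[] args) {
        Parsed parsed = new Parsed();
        int i = 0;
        while (i < args.length && args[i].startsWith("-D") && args[i].contains("=")) {
            String keyValue = args[i].substring(2);
            int eq = keyValue.indexOf('=');
            parsed.properties.put(keyValue.substring(0, eq), keyValue.substring(eq + 1));
            i++;
        }
        // Everything from the first non-option onwards is passed through as-is,
        // including any -D arguments that come after it.
        parsed.remaining.addAll(Arrays.asList(args).subList(i, args.length));
        return parsed;
    }

    public static void main(String[] argv) {
        // -D options before the table name: both are picked up as properties.
        Parsed early = parse(new String[] {
            "-Dimporttsv.separator=,", "-Dimporttsv.skip.bad.lines=false",
            "mytable", "/input"});
        System.out.println("options first: " + early.properties + " / " + early.remaining);

        // -D options after the table name: none are picked up.
        Parsed late = parse(new String[] {
            "mytable", "/input",
            "-Dimporttsv.separator=,", "-Dimporttsv.skip.bad.lines=false"});
        System.out.println("options last:  " + late.properties + " / " + late.remaining);
    }
}
```

In the second call the property map stays empty and the two -D arguments are
left among the remaining arguments, which matches what happened in your run.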

On Thu, Mar 8, 2012 at 11:27 AM, Stack <[EMAIL PROTECTED]> wrote:

> On Thu, Mar 8, 2012 at 11:14 AM, anil gupta <[EMAIL PROTECTED]> wrote:
> > 1. Update the HBase bulk load documentation and specify that separator
> > argument should be next to program name.
>
> This would help.
>
> > 2. Fix the problem in the code itself by handling the separator argument
> > explicitly. (Still, i am wondering why only separator value is not being
> > set in jobconf automatically if it is not provided next to program
> > name??)
> >
>
> This is probably too late IIRC.  I haven't looked at code but
> GenericOptionsParser has probably already been run by the time the
> application starts to process args.  Duplicating what GOP does in the
> application is probably not the way to go either?
>
> St.Ack
>