

Re: Bulk loading a CSV file into HBase
Yeah, after digging further into the code: line 374 in
GenericOptionsParser.java, "commandLine = parser.parse(opts, args, true);",
is the culprit. Nice find, Shrijeet. That answers my question. :)

Stack:
Could you please tell me the meaning of "IIRC"? Updating the documentation
is good, but given how parse() behaves, any other -D option will also be
ignored if it comes after the table name.
Duplicating the GOP functionality does not seem to be a good idea. Maybe
if we could somehow invoke "parser.parse(opts, args, false);" instead of
"parser.parse(opts, args, true);", all would be good. I haven't looked at
the API to see whether that is possible. This is just food for thought.
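
To make the true/false difference concrete, here is a rough sketch in plain
Java. It is only a simplified model of the stop-at-non-option behavior, not
the actual commons-cli or GenericOptionsParser code. The class name, the
parse() helper, the table name "mytable" and the path "/input" are all
made up for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/**
 * Simplified model (NOT the real commons-cli or Hadoop code) of how
 * -Dkey=value options are treated when parsing stops at the first
 * non-option argument.
 */
public class StopAtNonOptionSketch {

    /** Result of a parse: recognized -D properties plus leftover arguments. */
    static final class Parsed {
        final Map<String, String> props = new LinkedHashMap<>();
        final List<String> remaining = new ArrayList<>();
    }

    /** Hypothetical helper that mimics the stopAtNonOption flag. */
    static Parsed parse(String[] args, boolean stopAtNonOption) {
        Parsed out = new Parsed();
        boolean stopped = false;
        for (String arg : args) {
            int eq = arg.indexOf('=');
            boolean isProp = arg.startsWith("-D") && eq > 2;
            if (!stopped && isProp) {
                // Recognized property: split "-Dkey=value" at the first '='.
                out.props.put(arg.substring(2, eq), arg.substring(eq + 1));
            } else {
                // Non-option (or anything seen after parsing has stopped).
                out.remaining.add(arg);
                if (stopAtNonOption) {
                    stopped = true; // everything after this is left unparsed
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // The separator is given AFTER the (hypothetical) table name.
        String[] late = {
            "-Dimporttsv.columns=HBASE_ROW_KEY,cf:a",
            "mytable",
            "-Dimporttsv.separator=,",
            "/input"
        };
        Parsed stop = parse(late, true);
        Parsed noStop = parse(late, false);
        System.out.println("stopAtNonOption=true  props: " + stop.props.keySet());
        System.out.println("stopAtNonOption=false props: " + noStop.props.keySet());
    }
}
```

In this model, with the flag set to true the separator ends up among the
leftover arguments instead of being picked up as a property, which matches
what we saw in the failed run. Whether the real parser can safely be called
with false is a separate question that I have not checked.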

Thanks,
Anil

On Thu, Mar 8, 2012 at 12:06 PM, Shrijeet Paliwal <[EMAIL PROTECTED]> wrote:

> GenericOptionsParser stops parsing the arguments as soon as the first
> non-option argument is encountered (refer to:
>
> http://commons.apache.org/cli/api-1.2/org/apache/commons/cli/Parser.html#parse(org.apache.commons.cli.Options, java.lang.String[], boolean))
>
> So in this case, as soon as the parser sees the table name arg, it ignores
> all other properties specified with the -D option. Note that it ignores not
> only the separator; it is also ignoring the importtsv.skip.bad.lines option
> in your run which failed.
>
>
>
> On Thu, Mar 8, 2012 at 11:27 AM, Stack <[EMAIL PROTECTED]> wrote:
>
> > On Thu, Mar 8, 2012 at 11:14 AM, anil gupta <[EMAIL PROTECTED]> wrote:
> > > 1. Update the HBase bulk load documentation and specify that separator
> > > argument should be next to program name.
> >
> > This would help.
> >
> > > 2. Fix the problem in the code itself by handling the separator
> > > argument explicitly. (Still, I am wondering why only the separator
> > > value is not being set in the jobconf automatically if it is not
> > > provided next to the program name?)
> > >
> >
> > This is probably too late, IIRC. I haven't looked at the code, but
> > GenericOptionsParser has probably already been run by the time the
> > application starts to process args. Duplicating what GOP does in the
> > application is probably not the way to go either?
> >
> > St.Ack
> >
>

--
Thanks & Regards,
Anil Gupta