Re: Bulk loading a CSV file into HBase
Hi Lakshman,

Going by your last email, updating the doc seems to be the easy and right
approach.

Thanks,
Anil Gupta

On Fri, Mar 9, 2012 at 12:20 AM, Laxman <[EMAIL PROTECTED]> wrote:

> Hi Anil,
>
> > instead of invoking "parser.parse(opts, args, true);" if somehow we can
> > invoke "parser.parse(opts, args, false);" then all will be good. I haven't
> > looked at the api to know about the possibility of same.
>
> Changing to parser.parse(opts, args, false) solves this problem.
> I think we need to consider the following before making this change.
>
> It involves a behavior change in legacy Hadoop code.
> Directly changing from true to false may cause a behavioral compatibility
> issue.
>
> Also, setting it to false may not be correct in all cases.
>
> Case #1 java
> "java -Dprop1=val1 <Class> arg1 arg2" is different from "java <Class> arg1
> arg2 -Dprop1=val1"
>
> In this case it looks like parser.parse(opts, args, true) is correct
>
>
> Case #2 linux
> "ls -l /home" is same as "ls /home -l"
>
> In this case it looks like parser.parse(opts, args, false) is correct
>
> >> This is probably too late IIRC
> I hope Stack also meant the same point here.
>
> > Could you please tell me the meaning of "IIRC"?
> IIRC - If I Recall/Remember Correctly
>
> --
> Regards,
> Laxman
>
> > -----Original Message-----
> > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of
> > anil gupta
> > Sent: Friday, March 09, 2012 3:12 AM
> > To: [EMAIL PROTECTED]
> > Subject: Re: Bulk loading a CSV file into HBase
> >
> > Yeah, after digging further into the code: Line#374 in
> > GenericOptionsParser.java "commandLine = parser.parse(opts, args, true);"
> > is the culprit. Nice find, Shrijeet. That answers my question. :)
> >
> > Stack:
> > Could you please tell me the meaning of "IIRC"? Updating the document is
> > good, but as per the behavior of parse(), other -D options will also be
> > ignored if the tablename is followed by any -D option.
> > Duplicating the GOP functionality does not seem to be a good idea. Maybe
> > instead of invoking "parser.parse(opts, args, true);" if somehow we can
> > invoke "parser.parse(opts, args, false);" then all will be good. I haven't
> > looked at the api to know about the possibility of same. This is just food
> > for thought.
> >
> > Thanks,
> > Anil
> >
> >
> >
> > On Thu, Mar 8, 2012 at 12:06 PM, Shrijeet Paliwal <[EMAIL PROTECTED]> wrote:
> >
> > > GenericOptionsParser stops parsing the arguments as soon as the first
> > > non-option is specified (refer:
> > > http://commons.apache.org/cli/api-1.2/org/apache/commons/cli/Parser.html#parse(org.apache.commons.cli.Options, java.lang.String[], boolean))
> > >
> > > So in this case, as soon as the parser sees the table name arg, it ignores
> > > all other properties specified with the -D option. Note that it ignores
> > > not only the separator but also the importtsv.skip.bad.lines option in
> > > your run which failed.
> > >
> > >
> > >
> > > On Thu, Mar 8, 2012 at 11:27 AM, Stack <[EMAIL PROTECTED]> wrote:
> > >
> > > > On Thu, Mar 8, 2012 at 11:14 AM, anil gupta <[EMAIL PROTECTED]> wrote:
> > > > > 1. Update the HBase bulk load documentation and specify that the
> > > > > separator argument should be next to the program name.
> > > >
> > > > This would help.
> > > >
> > > > > 2. Fix the problem in the code itself by handling the separator
> > > > > argument explicitly. (Still, I am wondering why only the separator
> > > > > value is not being set in the jobconf automatically if it is not
> > > > > provided next to the program name?)
> > > > >
> > > >
> > > > This is probably too late IIRC.  I haven't looked at code but
> > > > GenericOptionsParser has probably already been run by the time the
> > > > application starts to process args.  Duplicating what GOP does in the
> > > > application is probably not the way to go either?
> > > >
> > > > St.Ack
> > > >
> > >
> >
> >
> >
Thanks & Regards,
Anil Gupta
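
For anyone reading this thread later, here is a rough Java sketch of the
behavior discussed above. It is only a simplified model of what the third
argument (stopAtNonOption) of parser.parse(opts, args, ...) does to -D
properties; it is not the commons-cli or GenericOptionsParser code. The
class name, the parseProps helper, and the sample arguments (mytable,
/input/dir) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified model (an assumption, not the real commons-cli parser):
// only "-Dkey=value" tokens count as options, everything else is a
// plain argument.
final class StopAtNonOptionSketch {

    // Collects -Dkey=value pairs into the returned map. Tokens that are not
    // consumed as properties are appended to leftover. When stopAtNonOption
    // is true, the first plain argument ends option parsing, so any later
    // -D tokens are treated as plain arguments too.
    static Map<String, String> parseProps(String[] args,
                                          boolean stopAtNonOption,
                                          List<String> leftover) {
        Map<String, String> props = new LinkedHashMap<>();
        boolean stopped = false;
        for (String arg : args) {
            boolean isProp = arg.startsWith("-D") && arg.indexOf('=') > 2;
            if (!stopped && isProp) {
                int eq = arg.indexOf('=');
                props.put(arg.substring(2, eq), arg.substring(eq + 1));
            } else {
                leftover.add(arg);
                if (stopAtNonOption) {
                    stopped = true;
                }
            }
        }
        return props;
    }

    public static void main(String[] unused) {
        // Hypothetical importtsv-style arguments: table name first,
        // then -D options, then the input directory.
        String[] argv = {
            "mytable",
            "-Dimporttsv.separator=,",
            "-Dimporttsv.skip.bad.lines=false",
            "/input/dir"
        };

        List<String> restTrue = new ArrayList<>();
        Map<String, String> propsTrue = parseProps(argv, true, restTrue);
        System.out.println("stopAtNonOption=true  props=" + propsTrue
                + " leftover=" + restTrue);

        List<String> restFalse = new ArrayList<>();
        Map<String, String> propsFalse = parseProps(argv, false, restFalse);
        System.out.println("stopAtNonOption=false props=" + propsFalse
                + " leftover=" + restFalse);
    }
}
```

In this model, with stopAtNonOption=true nothing after the table name is
picked up as a property, which matches what Shrijeet described. With false,
both -D values are picked up regardless of position, which is Laxman's
"ls /home -l" case. Placing the -D options before the table name works in
both modes, which is why the doc update is the low-risk fix.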