Re: Bulkload Problem
Thanks for that ... I've started the import again and hope it will work this time.

regards
2013/10/20 Ted Yu <[EMAIL PROTECTED]>

> bq. I'm not sure which region is causing the problem
>
> I am not sure either.
> So I logged HBASE-9809:
> RegionTooBusyException should provide region name which was too busy
>
>
> On Sun, Oct 20, 2013 at 9:44 AM, John <[EMAIL PROTECTED]> wrote:
>
> > thanks for the answers!
> >
> > I'm not sure if the table is pre-split, but I don't think so.
> > Here is the Java code: http://pastebin.com/6V5CzasL
> >
> > So I think the splitting could be the reason why the region is busy, but
> > how can I prevent this problem? Is there any configuration value in HBase
> > to make the client wait longer? Maybe increase the retry number from 10 to
> > 10000 or something like that? Which value is it?
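
A minimal sketch of the client-side settings in question, assuming the 0.94-era property names hbase.client.retries.number (the attempt count, default 10) and hbase.client.pause (the wait between attempts in milliseconds, default 1000); the table name and the values shown are only placeholders:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;

    public class RetryConfigSketch {
        public static void main(String[] args) throws Exception {
            // Start from the cluster settings found on the classpath (hbase-site.xml).
            Configuration conf = HBaseConfiguration.create();

            // Let the client try more often before it gives up on a busy region ...
            conf.setInt("hbase.client.retries.number", 50);

            // ... and wait longer between attempts (milliseconds).
            conf.setLong("hbase.client.pause", 2000);

            // Any HTable created from this Configuration uses the raised limits.
            HTable table = new HTable(conf, "mytable");
            try {
                // ... the import's puts would go here ...
            } finally {
                table.close();
            }
        }
    }

The same two keys can also be set in hbase-site.xml on the machine that runs the import; setting them in the client's configuration only affects that client, not the region servers.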
> >
> > @Ted: I'm not sure which region is causing the problem; there are 7 nodes
> > and 1 master, so I couldn't paste a specific log.
> >
> > kind regards
> >
> >
> >
> >
> > 2013/10/20 Ted Yu <[EMAIL PROTECTED]>
> >
> > > John:
> > > If you can pastebin the region server log around 'Sun Oct 20 04:17:52',
> > > that would help too.
> > >
> > > Cheers
> > >
> > >
> > > On Sun, Oct 20, 2013 at 4:02 AM, Jean-Marc Spaggiari <
> > > [EMAIL PROTECTED]> wrote:
> > >
> > > > Hi John,
> > > >
> > > > Is your table pre-splitted?
> > > >
> > > > To me, it sounds like your RS is too busy doing other jobs to reply
> > > > back to the client.
> > > >
> > > > There are multiple options:
> > > > 1) It's due to a long garbage collection. Can you monitor GC on your
> > > > servers?
> > > > 2) It's because the table is not pre-split, and the server is busy
> > > > splitting it while the data comes in, which takes time (see the sketch
> > > > below).
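
As an illustration of option 2, a minimal sketch of creating a pre-split table up front, assuming placeholder names for the table, the column family, and the expected row-key range (the real split points have to match how the import's keys are distributed):

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.client.HBaseAdmin;
    import org.apache.hadoop.hbase.util.Bytes;

    public class PreSplitSketch {
        public static void main(String[] args) throws Exception {
            HBaseAdmin admin = new HBaseAdmin(HBaseConfiguration.create());
            try {
                HTableDescriptor desc = new HTableDescriptor("mytable");
                desc.addFamily(new HColumnDescriptor("cf"));

                // Create the table with 20 regions spread evenly over the expected
                // key range, so the import is not funneled into one region that
                // then has to split (and block writers) under load.
                admin.createTable(desc,
                        Bytes.toBytes("row-0000000000"),   // smallest expected row key
                        Bytes.toBytes("row-9999999999"),   // largest expected row key
                        20);
            } finally {
                admin.close();
            }
        }
    }

With sequential or hashed keys the split points would be chosen differently, so this only shows the API, not a recommended key layout.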
> > > >
> > > > How many servers do you have for this test?
> > > >
> > > > JM
> > > >
> > > >
> > > > 2013/10/20 John <[EMAIL PROTECTED]>
> > > >
> > > > > Hi,
> > > > >
> > > > > I'm trying to load a large amount of data into an HBase cluster. I've
> > > > > successfully imported up to 3000 million datasets (KV pairs). But if I
> > > > > try to import 6000 million, I get this error after 60-95% of the
> > > > > import: http://pastebin.com/CCp6kS3m ...
> > > > >
> > > > > The system is not crashing or anything like that; all nodes are still
> > > > > up. It seems to me that one node is temporarily unavailable. Is it
> > > > > maybe possible to increase the retry number? (I think its default is
> > > > > 10.) Which value do I have to change for that?
> > > > >
> > > > >
> > > > > I'm using Cloudera 4.4.0-1 and HBase version 0.94.6-cdh4.4.0.
> > > > >
> > > > > regards,
> > > > >
> > > > > john
> > > > >
> > > >
> > >
> >
>