

Re: Bulkload Problem
bq. I'm not sure which region is causing the problem

I am not sure either.
So I logged HBASE-9809:
RegionTooBusyException should provide region name which was too busy
On Sun, Oct 20, 2013 at 9:44 AM, John <[EMAIL PROTECTED]> wrote:

> Thanks for the answers!
>
> I'm not sure if the table is pre-split, but I don't think so.
> Here is the java code: http://pastebin.com/6V5CzasL .
>
> So I think the splitting could be the reason why the region is busy, but
> how can I prevent this problem? Is there any configuration value in HBase
> to make the client wait longer? Maybe increase the retry number from 10 to
> 10000 or something like that? Which value is it? (Something like the
> sketch below is what I have in mind.)
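>
> A rough, untested sketch, assuming the client-side knob is
> hbase.client.retries.number with a default of 10 in 0.94:
>
>     import org.apache.hadoop.conf.Configuration;
>     import org.apache.hadoop.hbase.HBaseConfiguration;
>
>     public class RetryConf {
>       public static void main(String[] args) {
>         // Untested: raise the client retry settings, then pass this conf
>         // to the bulk load job.
>         Configuration conf = HBaseConfiguration.create();
>         conf.setInt("hbase.client.retries.number", 30); // default 10
>         conf.setLong("hbase.client.pause", 2000);       // ms between retries; default 1000
>       }
>     }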
>
> @Ted: I'm not sure which region is causing the problem; there are 7 nodes
> and 1 master, so I couldn't paste a specific log.
>
> Kind regards
>
>
>
>
> 2013/10/20 Ted Yu <[EMAIL PROTECTED]>
>
> > John:
> > If you can pastebin region server log around 'Sun Oct 20 04:17:52', that
> > would help too.
> >
> > Cheers
> >
> >
> > On Sun, Oct 20, 2013 at 4:02 AM, Jean-Marc Spaggiari <
> > [EMAIL PROTECTED]> wrote:
> >
> > > Hi John,
> > >
> > > Is your table pre-split?
> > >
> > > To me, it sounds like your RS is too busy doing other jobs to reply
> > > back to the client.
> > >
> > > Multiple options:
> > > 1) It's due to a long garbage collection. Can you monitor GC on your
> > > servers?
> > > 2) It's because the table is not pre-split and the server is busy
> > > splitting it, which takes time. (See the sketch after this list.)
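> > >
> > > For option 1, adding -verbose:gc -XX:+PrintGCDetails to the region
> > > server JVM options is one way to watch GC. For option 2, a minimal,
> > > untested sketch of creating a pre-split table with the 0.94 client API;
> > > the table name, column family, and hex key space are made-up
> > > assumptions, so adjust them to your actual row keys:
> > >
> > >     import org.apache.hadoop.conf.Configuration;
> > >     import org.apache.hadoop.hbase.HBaseConfiguration;
> > >     import org.apache.hadoop.hbase.HColumnDescriptor;
> > >     import org.apache.hadoop.hbase.HTableDescriptor;
> > >     import org.apache.hadoop.hbase.client.HBaseAdmin;
> > >     import org.apache.hadoop.hbase.util.Bytes;
> > >
> > >     public class PreSplit {
> > >       public static void main(String[] args) throws Exception {
> > >         Configuration conf = HBaseConfiguration.create();
> > >         HBaseAdmin admin = new HBaseAdmin(conf);
> > >         HTableDescriptor desc = new HTableDescriptor("mytable"); // hypothetical name
> > >         desc.addFamily(new HColumnDescriptor("cf"));             // hypothetical family
> > >         // 16 regions, split on evenly spaced two-char hex prefixes
> > >         byte[][] splits = new byte[15][];
> > >         for (int i = 1; i <= 15; i++) {
> > >           splits[i - 1] = Bytes.toBytes(String.format("%02x", i * 16));
> > >         }
> > >         admin.createTable(desc, splits);
> > >         admin.close();
> > >       }
> > >     }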
> > >
> > > How many servers do you have for this test?
> > >
> > > JM
> > >
> > >
> > > 2013/10/20 John <[EMAIL PROTECTED]>
> > >
> > > > Hi,
> > > >
> > > > I'm trying to load a large amount of data into an HBase cluster. I've
> > > > successfully imported up to 3000 million datasets (KV pairs). But if I
> > > > try to import 6000 million, I get this error after 60-95% of the
> > > > import: http://pastebin.com/CCp6kS3m ...
> > > >
> > > > The system is not crashing or anything like that; all nodes are still
> > > > up. It seems to me that one node is temporarily not available. Maybe
> > > > it is possible to increase the repeat number? (I think its default is
> > > > 10.) What value do I have to change for that?
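> > > >
> > > > If the retry count is configurable, I guess it would be something
> > > > like this in hbase-site.xml; the property names are my assumption
> > > > (some versions may also have hbase.bulkload.retries.number, but I'm
> > > > not sure):
> > > >
> > > >     <!-- untested sketch: client retry knobs -->
> > > >     <property>
> > > >       <name>hbase.client.retries.number</name>
> > > >       <value>30</value> <!-- default 10 -->
> > > >     </property>
> > > >     <property>
> > > >       <name>hbase.client.pause</name>
> > > >       <value>2000</value> <!-- ms between retries; default 1000 -->
> > > >     </property>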
> > > >
> > > >
> > > > I'm using Cloudera 4.4.0-1 and HBase version 0.94.6-cdh4.4.0.
> > > >
> > > > regards,
> > > >
> > > > john
> > > >
> > >
> >
>