Re: HBase bulk load through co-processors
Sever,

I presume you're loading your data via online Puts via the MR job (as
opposed to generating HFiles). What are you hoping to gain from a
coprocessor implementation vs the 6 MR jobs? Have you pre-split your
tables? Can the RegionServer(s) handle all the concurrent mappers?

-n
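
As an illustration of the pre-splitting question above, here is a minimal sketch of how split points might be computed for the 32-byte row keys described later in the thread (four 8-byte IDs). It assumes the leading ID is an unsigned, big-endian value spread roughly uniformly over the 64-bit range; neither assumption is confirmed in the thread. The name `split_keys` is hypothetical, and the resulting byte strings would still have to be passed to whatever table-creation call your HBase version provides.

```python
import struct

KEY_LEN = 32   # four 8-byte IDs, as described in the thread
ID_LEN = 8     # only the leading ID is used to choose split points


def split_keys(num_regions: int) -> list[bytes]:
    """Return num_regions - 1 row keys that evenly divide the leading 8-byte ID.

    Assumes the leading ID is unsigned, big-endian, and uniformly distributed,
    so that byte-wise lexicographic order matches numeric order.
    """
    if num_regions < 2:
        return []  # a single region needs no split points
    step = 2**64 // num_regions
    padding = b"\x00" * (KEY_LEN - ID_LEN)
    return [struct.pack(">Q", i * step) + padding for i in range(1, num_regions)]


if __name__ == "__main__":
    for key in split_keys(4):
        print(key.hex())
```

If the real IDs are skewed (for example, small sequential numbers), evenly spaced split points like these would leave most regions empty, and sampling the actual keys would be a better basis for the splits.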

On Mon, Jul 2, 2012 at 11:58 AM, Sever Fundatureanu <
[EMAIL PROTECTED]> wrote:

> I agree that increasing the timeout is not the best option. I will work
> both on better balancing the load and maybe on doing it in increments like
> you suggested. However, for now I want a quick fix to the problem.
>
> Just to see if I understand this right: a ZooKeeper node redirects my
> client to a region server node and then my client talks directly to this
> region server; now the timeout happens on the client while talking to the
> RS, right? It expects some kind of confirmation and it times out. If this
> is the case, how can I increase this timeout? I only found in the
> documentation "zookeeper.session.timeout", which is the timeout between
> ZooKeeper and HBase.
>
> Thanks,
> Sever
>
> On Mon, Jul 2, 2012 at 8:19 PM, Jean-Marc Spaggiari <[EMAIL PROTECTED]> wrote:
>
> > Hi Sever,
> >
> > It seems one of the nodes in your cluster is overwhelmed by the load
> > you are giving it.
> >
> > So IMO, you have two options here:
> > First, you can try to reduce the load. I mean, split the bulk load into
> > multiple smaller loads and run them one by one to give your cluster
> > time to distribute the data correctly.
> > Second, you can increase the timeout from 60s to 120s. But you might
> > face the same issue with 120s, so I really recommend the first option.
> >
> > JM
> >
> > 2012/7/2, Sever Fundatureanu <[EMAIL PROTECTED]>:
> > > Can someone please help me with this?
> > >
> > > Thanks,
> > > Sever
> > >
> > > On Tue, Jun 26, 2012 at 8:14 PM, Sever Fundatureanu <
> > > [EMAIL PROTECTED]> wrote:
> > >
> > >> My keys are built of four 8-byte IDs. I am currently doing the load
> > >> with MR, but I get a timeout during the LoadIncrementalHFiles call:
> > >>
> > >> 12/06/24 21:29:01 ERROR mapreduce.LoadIncrementalHFiles: Encountered unrecoverable error from region server
> > >> org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=10, exceptions:
> > >> Sun Jun 24 21:29:01 CEST 2012, org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$3@4699ecf9, java.net.SocketTimeoutException: Call to das3002.cm.cluster/10.141.0.79:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/10.141.0.254:43240 remote=das3002.cm.cluster/10.141.0.79:60020]
> > >>
> > >>         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1345)
> > >>         at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.tryAtomicRegionLoad(LoadIncrementalHFiles.java:487)
> > >>         at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$1.call(LoadIncrementalHFiles.java:275)
> > >>         at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$1.call(LoadIncrementalHFiles.java:273)
> > >>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> > >>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> > >>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> > >>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> > >>         at java.lang.Thread.run(Thread.java:662)
> > >> 12/06/24 21:30:52 ERROR mapreduce.LoadIncrementalHFiles: Encountered unrecoverable error from region server
> > >>
> > >> Is there a way in which I can increase the timeout period?
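
The 60000 ms in the trace above appears to be the client-side RPC timeout; in HBase releases of that era this was, as far as I know, controlled by the `hbase.rpc.timeout` property, but check the hbase-default.xml of your own version before relying on that. The alternative suggested in the thread, loading in smaller increments, can be sketched as follows. This is only an outline under stated assumptions: `load_batch` is a hypothetical callable standing in for whatever actually performs the bulk load for a group of HFile directories, and the function names are invented for illustration.

```python
import time
from typing import Callable, Iterator, Sequence


def batches(items: Sequence[str], size: int) -> Iterator[list[str]]:
    """Yield consecutive slices of `items`, each holding at most `size` entries."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def load_in_batches(
    hfile_dirs: Sequence[str],
    batch_size: int,
    load_batch: Callable[[list[str]], None],  # hypothetical loader, supplied by the caller
    pause_seconds: float = 0.0,
) -> int:
    """Load the directories batch by batch and return how many were handed over.

    An optional pause between batches gives the cluster time to settle
    before the next group arrives.
    """
    loaded = 0
    for batch in batches(hfile_dirs, batch_size):
        load_batch(batch)
        loaded += len(batch)
        if pause_seconds > 0:
            time.sleep(pause_seconds)
    return loaded


if __name__ == "__main__":
    # Stand-in loader that only prints; a real one would trigger the bulk load.
    load_in_batches(["part-0", "part-1", "part-2"], 2, print)
```

A suitable batch size and pause depend on how quickly the region servers absorb each group, so both would need tuning against the actual cluster.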