HBase user mailing list >> Re: HBase bulk load through co-processors


Andrew Purtell 2012-06-26, 17:05
Sever Fundatureanu 2012-06-26, 18:14
Sever Fundatureanu 2012-07-02, 17:30
Jean-Marc Spaggiari 2012-07-02, 18:19
Sever Fundatureanu 2012-07-02, 18:58
Re: HBase bulk load through co-processors
Sever,

I presume you're loading your data with online Puts from the MR job (as
opposed to generating HFiles). What are you hoping to gain from a
coprocessor implementation vs. the 6 MR jobs? Have you pre-split your
tables? Can the RegionServer(s) handle all the concurrent mappers?

-n
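On the pre-splitting point: since the rowkeys in this thread are fixed-width binary (four 8-byte ids), evenly spaced split keys can be computed over the leading bytes of the key. A minimal sketch of that computation (the choice of 16 regions and of splitting on the first 8-byte id are illustrative assumptions, not from the thread; the resulting array is what you would hand to HBaseAdmin.createTable(desc, splitKeys)):

```java
import java.math.BigInteger;

/** Compute evenly spaced split keys for fixed-width binary rowkeys. */
public class SplitPoints {

    /**
     * Returns (numRegions - 1) split keys, each {@code keyBytes} wide,
     * spaced evenly across the unsigned key space.
     */
    static byte[][] splitKeys(int keyBytes, int numRegions) {
        BigInteger max = BigInteger.ONE.shiftLeft(8 * keyBytes); // 2^(8*keyBytes)
        byte[][] splits = new byte[numRegions - 1][];
        for (int i = 1; i < numRegions; i++) {
            BigInteger point = max.multiply(BigInteger.valueOf(i))
                                  .divide(BigInteger.valueOf(numRegions));
            byte[] raw = point.toByteArray();
            // Left-pad (or drop a leading sign byte) to exactly keyBytes bytes.
            byte[] key = new byte[keyBytes];
            int copy = Math.min(raw.length, keyBytes);
            System.arraycopy(raw, raw.length - copy, key, keyBytes - copy, copy);
            splits[i - 1] = key;
        }
        return splits;
    }

    public static void main(String[] args) {
        // Split on the first 8-byte id into 16 regions: 15 boundaries,
        // the first being 0x1000000000000000.
        byte[][] splits = splitKeys(8, 16);
        System.out.println(splits.length);
        System.out.println(String.format("%02x", splits[0][0]));
    }
}
```

Pre-splitting this way lets the concurrent mappers (or the bulk-load step) spread their work across RegionServers from the start instead of hammering one region.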

On Mon, Jul 2, 2012 at 11:58 AM, Sever Fundatureanu <
[EMAIL PROTECTED]> wrote:

> I agree that increasing the timeout is not the best option, I will work
> both on better balancing the load and maybe doing it in increments like you
> suggested. However for now I want a quick fix to the problem.
>
> Just to see if I understand this right: a ZooKeeper node redirects my
> client to a region server node, and then my client talks directly to this
> region server; the timeout happens on the client while talking to the
> RS, right? It expects some kind of confirmation and it times out. If this
> is the case, how can I increase this timeout? I only found
> "zookeeper.session.timeout" in the documentation, which is the timeout
> between ZooKeeper and HBase.
>
> Thanks,
> Sever
>
> On Mon, Jul 2, 2012 at 8:19 PM, Jean-Marc Spaggiari <
> [EMAIL PROTECTED]
> > wrote:
>
> > Hi Sever,
> >
> > It seems one of the nodes in your cluster is overwhelmed with the load
> > you are giving it.
> >
> > So IMO, you have two options here:
> > First, you can try to reduce the load. I mean, split the bulk into
> > multiple smaller bulks and load them one by one, to give your cluster
> > time to dispatch them correctly.
> > Second, you can increase the timeout from 60s to 120s. But you might
> > face the same issue with 120s, so I really recommend the first option.
> >
> > JM
> >
> > 2012/7/2, Sever Fundatureanu <[EMAIL PROTECTED]>:
> > > Can someone please help me with this?
> > >
> > > Thanks,
> > > Sever
> > >
> > > On Tue, Jun 26, 2012 at 8:14 PM, Sever Fundatureanu <
> > > [EMAIL PROTECTED]> wrote:
> > >
> > >> My keys are built of 4 8-byte ids. I am currently doing the load with
> > >> MR, but I get a timeout when doing the loadIncrementalFiles call:
> > >>
> > >> 12/06/24 21:29:01 ERROR mapreduce.LoadIncrementalHFiles: Encountered
> > >> unrecoverable error from region server
> > >> org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after
> > >> attempts=10, exceptions:
> > >> Sun Jun 24 21:29:01 CEST 2012,
> > >> org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$3@4699ecf9,
> > >> java.net.SocketTimeoutException: Call to das3002.cm.cluster/10.141.0.79:60020
> > >> failed on socket timeout exception: java.net.SocketTimeoutException: 60000
> > >> millis timeout while waiting for channel to be ready for read. ch :
> > >> java.nio.channels.SocketChannel[connected local=/10.141.0.254:43240 remote=das3002.cm.cluster/10.141.0.79:60020]
> > >>         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1345)
> > >>         at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.tryAtomicRegionLoad(LoadIncrementalHFiles.java:487)
> > >>         at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$1.call(LoadIncrementalHFiles.java:275)
> > >>         at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$1.call(LoadIncrementalHFiles.java:273)
> > >>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> > >>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> > >>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> > >>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> > >>         at java.lang.Thread.run(Thread.java:662)
> > >> 12/06/24 21:30:52 ERROR mapreduce.LoadIncrementalHFiles: Encountered
> > >> unrecoverable error from region server
> > >>
> > >> Is there a way in which I can increase the timeout period?
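For reference on the timeout question itself: the 60000 ms in the trace is the client's RPC timeout to the RegionServer, which in HBase of this era is governed by hbase.rpc.timeout (not zookeeper.session.timeout). It can be raised in hbase-site.xml on the machine running the bulk load; the 120000 value below is only an example, and JM's caveat applies, since a larger timeout hides rather than fixes an overloaded region:

```xml
<!-- hbase-site.xml on the client running LoadIncrementalHFiles -->
<property>
  <name>hbase.rpc.timeout</name>
  <!-- default is 60000 ms; 120000 is an illustrative value -->
  <value>120000</value>
</property>
```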
Sever Fundatureanu 2012-07-07, 09:58
Nick Dimiduk 2012-07-09, 08:16
Sever Fundatureanu 2012-07-14, 12:24
Andrew Purtell 2012-07-14, 19:20
Sever Fundatureanu 2012-06-26, 16:56