pasaliczaharije 2010-01-18, 11:54
pasaliczaharije 2010-01-18, 11:55
Re: HBase - hiting only one node on insert ...
I'm not sure why there would be 0 requests for most region servers, but I
usually see a higher number of requests (even when the cluster is idle) on
the regionserver that serves .META. My guess is that, on your cluster,
hadoop-node02 serves .META.

Cosmin
On 1/18/10 1:55 PM, "pasaliczaharije" <[EMAIL PROTECTED]> wrote:

>
> Sorry for the messed-up text. Here is the proper format:
>
>
> Hi
>
> We have a small Hadoop cluster with 7 nodes (8 GB RAM, 8 cores each) plus
> 1 master, and we deployed HBase on the same 7 nodes.
>
> Currently we are importing ~50 million records from CSV files into HBase.
> Each CSV file can have about 100 columns, and the rowkey is a UUID generated
> with java.util.UUID.
>
> We have about 50 files on HDFS, which are imported into HBase by a
> MapReduce job.
>
> At the start everything works fine, but after a few minutes we see a heavy
> load on the second node. Here is the list from the HBase master.jsp:
>
> hadoop-node01:60030 1263591474251 requests=184, regions=148, usedHeap=1196, maxHeap=1991
> hadoop-node02:60030 1263591474109 requests=663, regions=148, usedHeap=1489, maxHeap=1991
> hadoop-node03:60030 1263591474082 requests=161, regions=147, usedHeap=1526, maxHeap=1991
> hadoop-node04:60030 1263632774794 requests=142, regions=147, usedHeap=1213, maxHeap=1991
> hadoop-node06:60030 1263596977608 requests=152, regions=147, usedHeap=749, maxHeap=1991
> hadoop-node07:60030 1263597118777 requests=156, regions=148, usedHeap=1749, maxHeap=1991
> hadoop-node08:60030 1263597239565 requests=179, regions=148, usedHeap=1681, maxHeap=1991
>
> (The second node has about 5 times more requests than the other nodes.) At
> some point we see requests=0 for all nodes except node02 (where we see
> about 600-1800).
>
> We chose UUIDs to get a roughly uniform load across all nodes. I'm not sure
> whether this is a UUID issue (keys not uniform) or something else.
>
> Also, we are using the default Hadoop configuration (7 nodes will result in
> 14 maps running in parallel). Is this optimal for this kind of job?
>
> Any comments?
>
> Thanks
> -Zaharije
>
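One way to look at the UUID question raised above is a small standalone check. The following is a minimal sketch, not code from this thread: the class name UuidKeySpread, the method countByFirstHexDigit, and the sample size are all made up for illustration. It assumes the rowkey is the string form of java.util.UUID.randomUUID() and simply counts how many generated keys start with each hex digit.

```java
import java.util.UUID;

// Hypothetical helper, not from the thread: counts how random UUID row keys
// spread across the 16 possible leading hex digits.
class UuidKeySpread {

    // Generates n random (version 4) UUIDs and tallies them by the first
    // character of their string form, which is the first thing a
    // lexicographically sorted key space would split on.
    static int[] countByFirstHexDigit(int n) {
        int[] counts = new int[16];
        for (int i = 0; i < n; i++) {
            String key = UUID.randomUUID().toString();
            counts[Character.digit(key.charAt(0), 16)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int n = 160_000; // sample size chosen for illustration only
        int[] counts = countByFirstHexDigit(n);
        for (int d = 0; d < counts.length; d++) {
            System.out.printf("prefix %x: %d keys%n", d, counts[d]);
        }
    }
}
```

If the per-prefix counts come out roughly equal, the key distribution itself is unlikely to explain the skew, which would be consistent with the guess that hadoop-node02 is busy for another reason, such as serving .META.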
Zaharije Pasalic 2010-01-18, 17:12
Jean-Daniel Cryans 2010-01-18, 17:52