HBase, mail # user - Re: Best Hbase Storage for PIG


Re: Best Hbase Storage for PIG
Michel Segel 2012-04-26, 11:48
32 cores with 32GB of RAM?

Pig isn't fast, but I have to question what you are using for hardware.
Who makes a 32-core box?
I assume you mean 16 physical cores.

7 drives? Not enough spindles for the number of cores.

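To put the figures in the quoted message below in perspective, here is a rough back-of-the-envelope calculation of the put rate. It is only a sketch: the record count, elapsed time and machine count are taken from the quoted message, while the class and method names are made up for illustration.

```java
public class PutThroughput {
    // Figures taken from the quoted message.
    static final long RECORDS = 175_000_000L;
    static final int MACHINES = 5;

    // Convert an elapsed time given as hours and minutes into seconds.
    static long elapsedSeconds(int hours, int minutes) {
        return hours * 3600L + minutes * 60L;
    }

    // Average number of records written per second over the whole run.
    static double recordsPerSecond(long records, long seconds) {
        return (double) records / seconds;
    }

    public static void main(String[] args) {
        long seconds = elapsedSeconds(4, 45);
        double clusterRate = recordsPerSecond(RECORDS, seconds);
        System.out.printf("elapsed: %d s%n", seconds);
        System.out.printf("cluster: %.0f records/s%n", clusterRate);
        System.out.printf("per machine: %.0f records/s%n", clusterRate / MACHINES);
    }
}
```

That comes to roughly 10,000 records per second for the whole cluster, or about 2,000 per machine. This is an average over the whole job, so it says nothing about where the time actually went.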
Sent from a remote device. Please excuse any typos...

Mike Segel

On Apr 26, 2012, at 6:38 AM, Rajgopal Vaithiyanathan <[EMAIL PROTECTED]> wrote:

> Hey all,
>
> The default, HBaseStorage(), takes a hell of a lot of time for puts.
>
> In a cluster of 5 machines, inserting 175 million records took 4 hours 45
> minutes.
> Question: is this good enough?
> Each machine has 32 cores and 32GB RAM with 7*600GB hard disks. HBase's heap
> has been configured to 8GB.
> If the put speed is low, how can I improve it?
>
> I tried tweaking the TableOutputFormat by increasing the WriteBufferSize to
> 24MB and adding a multi-put feature (collecting 10,000 puts in an ArrayList
> and putting them as a batch). After doing this, it started throwing
>
> java.util.concurrent.ExecutionException: java.net.SocketTimeoutException:
> Call to slave1/172.21.208.176:60020 failed on socket timeout exception:
> java.net.SocketTimeoutException: 60000 millis timeout while waiting for
> channel to be ready for read. ch :
> java.nio.channels.SocketChannel[connected
> local=/172.21.208.176:41135 remote=slave1/
> 172.21.208.176:60020]
>
> Which I assume is because the clients took too long to put.
>
> The detailed log from one of the reduce jobs is as follows.
>
> I've 'censored' some of the details, which I assume is okay! :P
> 2012-04-23 20:07:12,815 INFO org.apache.hadoop.util.NativeCodeLoader:
> Loaded the native-hadoop library
> 2012-04-23 20:07:13,097 WARN
> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already
> exists!
> 2012-04-23 20:07:13,787 INFO org.apache.zookeeper.ZooKeeper: Client
> environment:zookeeper.version=3.4.2-1221870, built on 12/21/2011 20:46 GMT
> 2012-04-23 20:07:13,787 INFO org.apache.zookeeper.ZooKeeper: Client
> environment:host.name=*****.*****
> 2012-04-23 20:07:13,787 INFO org.apache.zookeeper.ZooKeeper: Client
> environment:java.version=1.6.0_22
> 2012-04-23 20:07:13,787 INFO org.apache.zookeeper.ZooKeeper: Client
> environment:java.vendor=Sun Microsystems Inc.
> 2012-04-23 20:07:13,787 INFO org.apache.zookeeper.ZooKeeper: Client
> environment:java.home=/usr/lib/jvm/java-6-openjdk/jre
> 2012-04-23 20:07:13,787 INFO org.apache.zookeeper.ZooKeeper: Client
> environment:java.class.path=****************************
> 2012-04-23 20:07:13,788 INFO org.apache.zookeeper.ZooKeeper: Client
> environment:java.library.path=**********************
> 2012-04-23 20:07:13,788 INFO org.apache.zookeeper.ZooKeeper: Client
> environment:java.io.tmpdir=***************************
> 2012-04-23 20:07:13,788 INFO org.apache.zookeeper.ZooKeeper: Client
> environment:java.compiler=<NA>
> 2012-04-23 20:07:13,788 INFO org.apache.zookeeper.ZooKeeper: Client
> environment:os.name=Linux
> 2012-04-23 20:07:13,788 INFO org.apache.zookeeper.ZooKeeper: Client
> environment:os.arch=amd64
> 2012-04-23 20:07:13,788 INFO org.apache.zookeeper.ZooKeeper: Client
> environment:os.version=2.6.38-8-server
> 2012-04-23 20:07:13,788 INFO org.apache.zookeeper.ZooKeeper: Client
> environment:user.name=raj
>
> 2012-04-23 20:07:13,788 INFO org.apache.zookeeper.ZooKeeper: Client
> environment:user.home=*********
> 2012-04-23 20:07:13,788 INFO org.apache.zookeeper.ZooKeeper: Client
> environment:user.dir=**********************:
> 2012-04-23 20:07:13,790 INFO org.apache.zookeeper.ZooKeeper: Initiating
> client connection, connectString=master:2181 sessionTimeout=180000
> watcher=hconnection
> 2012-04-23 20:07:13,822 INFO org.apache.zookeeper.ClientCnxn: Opening
> socket connection to server /172.21.208.180:2181
> 2012-04-23 20:07:13,823 INFO
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: The identifier of
> this process is [EMAIL PROTECTED]e1
> 2012-04-23 20:07:13,825 INFO org.apache.zookeeper.ClientCnxn: Socket
> connection established to master/172.21.208.180:2181, initiating session
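
The quoted message guesses that the socket timeout comes from the client taking too long on each batch of 10,000 puts. If that guess is right, one thing to try is capping each batch by both record count and approximate size, so that no single call carries too much. The following is a minimal, library-independent sketch of that idea. `BoundedBatcher` and `Sink` are hypothetical names and are not part of the HBase or Pig APIs; the `Sink` stands in for whatever actually sends a batch.

```java
import java.util.ArrayList;
import java.util.List;

public class BoundedBatcher {
    /** Hypothetical stand-in for the code that actually sends one batch. */
    public interface Sink {
        void send(List<byte[]> batch);
    }

    private final int maxRecords;   // flush once this many records are pending
    private final long maxBytes;    // or once this many bytes are pending
    private final Sink sink;
    private final List<byte[]> pending = new ArrayList<byte[]>();
    private long pendingBytes = 0;

    public BoundedBatcher(int maxRecords, long maxBytes, Sink sink) {
        this.maxRecords = maxRecords;
        this.maxBytes = maxBytes;
        this.sink = sink;
    }

    /** Buffer one record and flush if either limit has been reached. */
    public void add(byte[] record) {
        pending.add(record);
        pendingBytes += record.length;
        if (pending.size() >= maxRecords || pendingBytes >= maxBytes) {
            flush();
        }
    }

    /** Send whatever is pending as one batch; does nothing if the buffer is empty. */
    public void flush() {
        if (pending.isEmpty()) {
            return;
        }
        sink.send(new ArrayList<byte[]>(pending));
        pending.clear();
        pendingBytes = 0;
    }

    public static void main(String[] args) {
        // Illustrative limits only: at most 3 records or 1000 bytes per batch.
        BoundedBatcher batcher = new BoundedBatcher(3, 1000,
                batch -> System.out.println("sent batch of " + batch.size()));
        for (int i = 0; i < 7; i++) {
            batcher.add(new byte[10]);
        }
        batcher.flush(); // send the remainder
    }
}
```

The limits in `main` are placeholders. Suitable values would have to come from measuring how long a single batch takes against the 60000 millis timeout shown in the exception.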