Re: M/R vs hbase problem in production
Yes, I'm sure; the map stage is used to aggregate data for the reduce stage.
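
For anyone following along, here is a minimal sketch of a job with that shape against the HBase 0.90-era mapreduce API. The tab-separated key/count input, the table name "metrics", and the column family "d" are illustrative assumptions, not details from this thread: the maps emit partial counts per key, and the reducer sums them and writes one Put per key through TableOutputFormat (wired up by TableMapReduceUtil.initTableReducerJob).

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableReducer;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

public class AggregateToHBase {

  // Map: parse a line of the form "key<TAB>count" and emit a partial count.
  static class ParseMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
    @Override
    protected void map(LongWritable offset, Text line, Context context)
        throws IOException, InterruptedException {
      String[] parts = line.toString().split("\t");
      context.write(new Text(parts[0]), new LongWritable(Long.parseLong(parts[1])));
    }
  }

  // Reduce: sum the partial counts and write one Put per key to HBase.
  static class HBaseWriteReducer
      extends TableReducer<Text, LongWritable, ImmutableBytesWritable> {
    @Override
    protected void reduce(Text key, Iterable<LongWritable> values, Context context)
        throws IOException, InterruptedException {
      long sum = 0;
      for (LongWritable v : values) {
        sum += v.get();
      }
      Put put = new Put(Bytes.toBytes(key.toString()));
      put.add(Bytes.toBytes("d"), Bytes.toBytes("count"), Bytes.toBytes(sum));
      context.write(new ImmutableBytesWritable(put.getRow()), put);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = new Job(conf, "aggregate-to-hbase");
    job.setJarByClass(AggregateToHBase.class);
    job.setInputFormatClass(TextInputFormat.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    job.setMapperClass(ParseMapper.class);
    job.setMapOutputKeyClass(Text.class);
    job.setMapOutputValueClass(LongWritable.class);
    // Wires up TableOutputFormat and the reducer for the target table.
    TableMapReduceUtil.initTableReducerJob("metrics", HBaseWriteReducer.class, job);
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

With this layout, all traffic to the region servers happens in the reduce phase, which is the setup described in the quoted messages below.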

On Mon, Aug 15, 2011 at 11:20 PM, Buttler, David <[EMAIL PROTECTED]> wrote:

> Are you sure you need to use a reducer to put rows into HBase?  You can
> save a lot of time if you can put the rows into HBase directly in the
> mappers.
>
> Dave
>
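
As an aside, here is a minimal sketch of the map-only variant Dave is suggesting, reusing the same hypothetical "metrics" table and "d" family as the sketch above: each mapper opens an HTable in setup(), buffers its Puts client-side, and flushes them in cleanup(), while the driver disables the reduce phase entirely.

import java.io.IOException;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Map-only writer: each map task talks to HBase directly, no reducer involved.
public class DirectPutMapper extends Mapper<LongWritable, Text, NullWritable, NullWritable> {

  private HTable table;

  @Override
  protected void setup(Context context) throws IOException {
    // One HTable per map task; buffer puts client-side instead of flushing each one.
    table = new HTable(HBaseConfiguration.create(context.getConfiguration()), "metrics");
    table.setAutoFlush(false);
    table.setWriteBufferSize(2 * 1024 * 1024);
  }

  @Override
  protected void map(LongWritable offset, Text line, Context context) throws IOException {
    String[] parts = line.toString().split("\t");
    Put put = new Put(Bytes.toBytes(parts[0]));
    put.add(Bytes.toBytes("d"), Bytes.toBytes("count"), Bytes.toBytes(parts[1]));
    table.put(put);
  }

  @Override
  protected void cleanup(Context context) throws IOException {
    table.flushCommits();   // push whatever is still buffered
    table.close();
  }
}

// In the driver, disable the reduce phase entirely:
//   job.setNumReduceTasks(0);
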
> -----Original Message-----
> From: Lior Schachter [mailto:[EMAIL PROTECTED]]
> Sent: Sunday, August 14, 2011 9:32 AM
> To: [EMAIL PROTECTED]; [EMAIL PROTECTED]
> Subject: M/R vs hbase problem in production
>
> Hi,
>
> Cluster details:
> HBase 0.90.2, 10 machines, 1 Gbit switch.
>
> Use case:
> An M/R job that inserts about 10 million rows into HBase in the reducer,
> followed by an M/R job that works with HDFS files.
> When the first job's maps finish, the second job's maps start and a region
> server crashes.
> Please note that when running the 2 jobs separately they both finish
> successfully.
>
> From our monitoring we see that when the 2 jobs run together the network
> load reaches our maximum bandwidth of 1 Gbit.
>
> In the region server log we see these exceptions:
> a.
> 2011-08-14 18:37:36,263 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server Responder, call multi(org.apache.hadoop.hbase.client.MultiAction@491fb2f4) from 10.11.87.73:33737: output error
> 2011-08-14 18:37:36,264 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 24 on 8041 caught: java.nio.channels.ClosedChannelException
>        at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:133)
>        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324)
>        at org.apache.hadoop.hbase.ipc.HBaseServer.channelIO(HBaseServer.java:1387)
>        at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1339)
>        at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:727)
>        at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:792)
>        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1083)
>
> b.
> 2011-08-14 18:41:56,225 WARN org.apache.hadoop.hdfs.DFSClient: DFSOutputStream ResponseProcessor exception for block blk_-8181634225601608891_579246 java.io.EOFException
>        at java.io.DataInputStream.readFully(DataInputStream.java:180)
>        at java.io.DataInputStream.readLong(DataInputStream.java:399)
>        at org.apache.hadoop.hdfs.protocol.DataTransferProtocol$PipelineAck.readFields(DataTransferProtocol.java:122)
>        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$ResponseProcessor.run(DFSClient.java:2548)
>
> c.
> 2011-08-14 18:42:02,960 WARN org.apache.hadoop.hdfs.DFSClient: Failed recovery attempt #0 from primary datanode 10.11.87.72:50010
> org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.ipc.RemoteException: java.io.IOException: blk_-8181634225601608891_579246 is already commited, storedBlock == null.
>        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.nextGenerationStampForBlock(FSNamesystem.java:4877)
>        at org.apache.hadoop.hdfs.server.namenode.NameNode.nextGenerationStamp(NameNode.java:501)
>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
>        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:961)
>        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:957)
>        at java.security.AccessController.doPrivileged(Native Method)
>        at javax.security.auth.Subject.doAs(Subject.java:396)
>        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:955)
>
>        at org.apache.hadoop.ipc.Client.call(Client.java:740)
>        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)