CopyTable utility fails on larger tables
David Koch 2012-12-05, 15:42
Hello,
I can copy relatively small tables (10 GB, 7 million rows) using the built-in CopyTable utility of HBase (0.92.1-cdh4.0.1), but copying larger tables, say 150 GB and 100 million rows, does not work.
The failed CopyTable job required 128 mappers according to the Job Tracker UI. All of them failed on the first attempt after 15 minutes, and the job then ran for another hour while remaining at 0%. However, according to the counters, many rows had apparently been mapped and emitted. From the HBase shell I could not perform any action on the destination table (scan, get, count), and the HBase Master Web UI showed only one region for the destination table. I checked the log file on the region server hosting that region and found the log records below (extract).
What precautions should I take when copying tables? Do certain settings need to be de-activated for the duration of the job?
Thank you,
/David
2012-12-05 15:50:40,406 INFO org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush of region _xxxxxxx_EH_xxx,{\xF0\xE4\xA2?!EQ\xB8\xC9tE\x19\x92 \x08,1354713876229.a75fba31d9883ed7be4ed4a7be0e592f. due to global heap pressure
2012-12-05 15:50:49,086 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://xxxxx-1.xxxxxx.net:8020/hbase/_xxxxxxx_EH_xxx/a75fba31d9883ed7be4ed4a7be0e592f/t/1788b9f6f9594e2e9efe4ea5230d134c, entries=418152, sequenceid=1440152048, memsize=217.0m, filesize=145.9m
2012-12-05 15:50:49,088 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~217.8m/228416264, currentsize=33.0m/34555320 for region _xxxxxxx_EH_xxx,{\xF0\xE4\xA2?!EQ\xB8\xC9tE\x19\x92 \x08,1354713876229.a75fba31d9883ed7be4ed4a7be0e592f. in 8682ms, sequenceid=1440152048, compaction requested=true
2012-12-05 15:50:49,088 WARN org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Region _xxxxxxx_EH_xxx,,1354713876229.37825c623850b16013ab0bf902d02746. has too many store files; delaying flush up to 90000ms
2012-12-05 15:50:49,760 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server Responder, call multi(org.apache.hadoop.hbase.client.MultiAction@44848967), rpc version=1, client version=29, methodsFingerPrint=54742778 from 5.39.67.13:56290: output error
2012-12-05 15:50:49,760 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 9 on 60020 caught: java.nio.channels.ClosedChannelException
    at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:133)
    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324)
    at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1663)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:934)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:1013)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Call.sendResponseIfReady(HBaseServer.java:419)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1356)
2012-12-05 15:50:49,763 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server listener on 60020: readAndProcess threw exception java.io.IOException: Connection reset by peer. Count of bytes read: 0
java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileDispatcher.read0(Native Method)
    at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21)
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:198)
    at sun.nio.ch.IOUtil.read(IOUtil.java:171)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:243)
    at org.apache.hadoop.hbase.ipc.HBaseServer.channelRead(HBaseServer.java:1686)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:1130)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:713)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.doRunLoop(HBaseServer.java:505)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:480)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
2012-12-05 15:50:49,792 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server listener on 60020: readAndProcess threw exception java.io.IOException: Connection reset by peer. Count of bytes read: 0
java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileDispatcher.read0(Native Method)
    at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21)
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:198)
    at sun.nio.ch.IOUtil.read(IOUtil.java:171)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:243)
    at org.apache.hadoop.hbase.ipc.HBaseServer.channelRead(HBaseServer.java:1686)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:1130)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:713)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.doRunLoop(HBaseServer.java:505)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:480)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
2012-12-05 15:50:49,851 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server Responder, call multi(org.apache.hadoop.hbase.client.MultiAction@304fa425), rpc version=1, client version=29, methodsFingerPrint=54742778 from 5.39.67.13:56289: output error
2012-12-05 15:50:49,851 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 4 on 60020 caught: java.nio.channels.ClosedChannelException
    at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:133)
    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324)
    at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1663)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:934)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:1013)
    at org.ap
Re: CopyTable utility fails on larger tables
Doug Meil 2012-12-05, 16:30
I agree it shouldn't fail (slow is one thing, fail is something else), but regarding "HBase Master Web UI showed only one region for the destination table.", you probably want to pre-split your destination table.
It's writing to one region, splitting, writing to those regions, splitting, etc.
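To make the pre-splitting advice concrete, here is a minimal sketch of one way to compute split points, assuming the row keys are roughly uniformly distributed binary values (the region names in the log suggest binary keys, but the actual distribution is not known from this thread). The class and method names (`PreSplitKeys`, `uniformSplits`) are hypothetical and the code does not depend on HBase. The resulting keys could then be passed to a table-creation call that accepts split keys, such as `HBaseAdmin.createTable(HTableDescriptor, byte[][])`. If the keys are not uniform, reusing the source table's region boundaries as split points is likely the better choice.

```java
import java.math.BigInteger;

public class PreSplitKeys {

    /**
     * Returns numRegions - 1 split keys that divide the space of
     * keyLength-byte unsigned keys into numRegions equal ranges.
     * Assumes row keys are spread evenly over that space.
     */
    public static byte[][] uniformSplits(int numRegions, int keyLength) {
        if (numRegions < 2 || keyLength < 1) {
            throw new IllegalArgumentException("need at least 2 regions and 1 key byte");
        }
        // Size of the key space: 256^keyLength.
        BigInteger space = BigInteger.ONE.shiftLeft(8 * keyLength);
        if (BigInteger.valueOf(numRegions).compareTo(space) > 0) {
            throw new IllegalArgumentException("more regions than distinct keys");
        }
        byte[][] splits = new byte[numRegions - 1][];
        for (int i = 1; i < numRegions; i++) {
            // i-th boundary = floor(space * i / numRegions), always below space.
            BigInteger boundary = space.multiply(BigInteger.valueOf(i))
                                       .divide(BigInteger.valueOf(numRegions));
            splits[i - 1] = toFixedLength(boundary, keyLength);
        }
        return splits;
    }

    /** Encodes a non-negative value as exactly length big-endian bytes. */
    private static byte[] toFixedLength(BigInteger value, int length) {
        // toByteArray() is big-endian and may be shorter than length
        // or carry one extra leading sign byte; handle both cases.
        byte[] raw = value.toByteArray();
        byte[] out = new byte[length];
        int copy = Math.min(raw.length, length);
        System.arraycopy(raw, raw.length - copy, out, length - copy, copy);
        return out;
    }

    public static void main(String[] args) {
        // Print the 3 split keys for 4 regions over 2-byte keys.
        for (byte[] key : uniformSplits(4, 2)) {
            StringBuilder sb = new StringBuilder();
            for (byte b : key) {
                sb.append(String.format("\\x%02X", b & 0xFF));
            }
            System.out.println(sb);
        }
    }
}
```

Running `main` prints `\x40\x00`, `\x80\x00` and `\xC0\x00`, one per line. With the destination table created on split points like these, the mappers write to many regions from the start instead of funnelling everything through one region that has to split repeatedly.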
Re: CopyTable utility fails on larger tables
David Koch 2012-12-05, 17:44
Hello Doug,
Thank you for your reply. I will try with pre-splitting.
/David
Re: CopyTable utility fails on larger tables
David Koch 2012-12-06, 13:32
Hi Doug,
This did the trick. I pre-split the empty destination table and CopyTable worked as expected afterwards.
Thank you again,
/David