HBase user mailing list - CopyTable utility fails on larger tables


Re: CopyTable utility fails on larger tables
David Koch 2012-12-06, 13:32
Hi Doug,

This did the trick. I pre-split the empty destination table and CopyTable
worked as expected afterwards.
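
In case anyone finds this thread later: I created the destination table
with split points taken from the source table's region boundaries, roughly
like this (table/family names and the splits file are placeholders, and the
shell's inline SPLITS option works too if your version lacks SPLITS_FILE):

  create 'dest_table', 'cf', {SPLITS_FILE => '/tmp/splits.txt'}

where splits.txt contains one split key per line.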

Thank you again,

/David
On Wed, Dec 5, 2012 at 6:44 PM, David Koch <[EMAIL PROTECTED]> wrote:

> Hello Doug,
>
> Thank you for your reply. I will try with pre-splitting.
>
> /David
>
>
> On Wed, Dec 5, 2012 at 5:30 PM, Doug Meil <[EMAIL PROTECTED]> wrote:
>
>>
>> I agree it shouldn't fail (slow is one thing, fail is something else), but
>> regarding "HBase Master Web UI showed only one region for the destination
>> table.", you probably want to pre-split your destination table.
>>
>> It's writing to one region, splitting, writing to those regions,
>> splitting, etc.
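>>
>> For example, something like this in the shell (column family and split
>> points below are made up; pick boundaries that match your actual key
>> distribution):
>>
>>   create 'dest_table', 'cf', {SPLITS => ['1000', '2000', '3000', '4000']}
>>
>> That way the job writes to all regions from the start instead of
>> funneling everything through one region and waiting on splits.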
>>
>>
>>
>>
>> On 12/5/12 10:42 AM, "David Koch" <[EMAIL PROTECTED]> wrote:
>>
>> >Hello,
>> >
>> >I can copy relatively small tables (10 GB, 7 million rows) using the
>> >built-in HBase (0.92.1-cdh4.0.1) CopyTable utility, but copying larger
>> >tables, say 150 GB, 100 million rows, does not work.
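>> >
>> >For reference, the invocation is essentially the stock one, along these
>> >lines (table names are placeholders):
>> >
>> >  hbase org.apache.hadoop.hbase.mapreduce.CopyTable \
>> >    --new.name=dest_table src_table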
>> >
>> >The failed CopyTable job required 128 mappers according to the Job
>> >Tracker UI. All of them failed on the first attempt after 15 minutes;
>> >the job then ran for another hour while remaining at 0%. However,
>> >according to the counters, many rows had apparently been mapped and
>> >emitted. Checking with the HBase shell, I could not perform any action
>> >on the destination table (scan, get, count), and the HBase Master Web
>> >UI showed only one region for the destination table. I checked the log
>> >file on that region server and have attached an extract below.
>> >
>> >What precautions should I take when copying tables? Do certain settings
>> >need to be deactivated for the duration of the job?
>> >
>> >Thank you,
>> >
>> >/David
>> >
>> >
>> >2012-12-05 15:50:40,406 INFO org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush of region _xxxxxxx_EH_xxx,{\xF0\xE4\xA2?!EQ\xB8\xC9tE\x19\x92 \x08,1354713876229.a75fba31d9883ed7be4ed4a7be0e592f. due to global heap pressure
>> >2012-12-05 15:50:49,086 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://xxxxx-1.xxxxxx.net:8020/hbase/_xxxxxxx_EH_xxx/a75fba31d9883ed7be4ed4a7be0e592f/t/1788b9f6f9594e2e9efe4ea5230d134c, entries=418152, sequenceid=1440152048, memsize=217.0m, filesize=145.9m
>> >2012-12-05 15:50:49,088 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~217.8m/228416264, currentsize=33.0m/34555320 for region _xxxxxxx_EH_xxx,{\xF0\xE4\xA2?!EQ\xB8\xC9tE\x19\x92 \x08,1354713876229.a75fba31d9883ed7be4ed4a7be0e592f. in 8682ms, sequenceid=1440152048, compaction requested=true
>> >2012-12-05 15:50:49,088 WARN org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Region _xxxxxxx_EH_xxx,,1354713876229.37825c623850b16013ab0bf902d02746. has too many store files; delaying flush up to 90000ms
>> >2012-12-05 15:50:49,760 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server Responder, call multi(org.apache.hadoop.hbase.client.MultiAction@44848967), rpc version=1, client version=29, methodsFingerPrint=54742778 from 5.39.67.13:56290: output error
>> >2012-12-05 15:50:49,760 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 9 on 60020 caught: java.nio.channels.ClosedChannelException
>> >    at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:133)
>> >    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324)
>> >    at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1663)
>> >    at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:934)
>> >    at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:1013)
>> >    at org.apache.hadoop.hbase.ipc.HBaseServer$Call.sendResponseIfReady(HBaseServer.java:419)
>> >    at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1356)
>> >
>> >2012-12-05 15:50:49,763 WARN org.apache.hadoop.ipc.HBaseServer: IPC