T Vinod Gupta
2012-01-12, 11:37
yuzhihong@...
2012-01-12, 13:13
T Vinod Gupta
2012-01-12, 17:12
Ted Yu
2012-01-12, 17:46
Stack
2012-01-12, 22:39
T Vinod Gupta
2012-01-13, 05:47
Stack
2012-01-13, 20:31
-
is there any way to copy data from one table to another while updating rowKey??
T Vinod Gupta 2012-01-12, 11:37
I am badly stuck and can't find a way out. I want to change my rowkey schema while copying data from one table to another, but a MapReduce job to do this won't work because of large row sizes (responseTooLarge errors). So I am left with two-step processing: exporting to HDFS files and then importing from them into the second table.

I wrote a custom exporter that changes the rowkey to newRowKey when doing context.write(newRowKey, result). But when I import these new files into the new table, it doesn't work because of this exception in put: "The row in the recently added ... doesn't match the original one ....".

Is there no way out for me? Please help.

Thanks
-
Re: is there any way to copy data from one table to another while updating rowKey??
yuzhihong@... 2012-01-12, 13:13
What version of HBase did you use?

Can you post the stack trace for the exception?

Thanks
-
Re: is there any way to copy data from one table to another while updating rowKey??
T Vinod Gupta 2012-01-12, 17:12
HBase version -
hbase(main):001:0> version
0.90.3-cdh3u1, r, Mon Jul 18 08:23:50 PDT 2011

Here are the different exceptions -

When copying table to another table -
12/01/12 11:06:41 INFO mapred.JobClient: Task Id : attempt_201201120656_0012_m_000001_0, Status : FAILED
org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 14 actions: NotServingRegionException: 14 times, servers with issues: ip-10-68-145-124.ec2.internal:60020,
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1227)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchOfPuts(HConnectionManager.java:1241)
        at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:826)
        at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:682)
        at org.apache.hadoop.hbase.client.HTable.put(HTable.java:667)
        at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:127)
        at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:82)
        at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:531)
        at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
        at com.akanksh.information.hbasetest.HBaseTimestampSwapper$SwapperMapper.map(HBaseTimestampSwapper.java:62)
        at com.akanksh.information.hbasetest.HBaseTimestampSwapper$SwapperMapper.map(HBaseTimestampSwapper.java:31)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:416)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
        at org.apache.hadoop.mapred.Child.main(Child.java:264)

Region server logs say this -
2012-01-10 00:00:52,545 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 9 on 60020, responseTooLarge for: next(-5685114053145855194, 50) from 10.68.145.124:44423: Size: 121.7m

When doing special export and then import, here is the stack trace -
java.io.IOException: The row in the recently added KeyValue 84784841:1319846400:daily:PotentialReach doesn't match the original one 84784841:PotentialReach:daily:1319846400
        at org.apache.hadoop.hbase.client.Put.add(Put.java:168)
        at org.apache.hadoop.hbase.mapreduce.Import$Importer.resultToPut(Import.java:70)
        at org.apache.hadoop.hbase.mapreduce.Import$Importer.map(Import.java:60)
        at org.apache.hadoop.hbase.mapreduce.Import$Importer.map(Import.java:45)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:416)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
        at org.apache.hadoop.mapred.Child.main(Child.java:264)
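Editor's note on the responseTooLarge warning in the message above: it reports 121.7m for a single next() call asking for 50 rows, which works out to roughly 2.4 MB per row on average. The sketch below is only back-of-the-envelope arithmetic, assuming "m" means mebibytes and that rows are of similar size. The class name, the helper rowsPerNext and the 10 MB budget are hypothetical and do not come from the thread. In the HBase client the resulting number is the kind of value one would pass to Scan.setCaching(int); for very wide rows, Scan.setBatch(int) can also limit how many columns come back per Result.

```java
public final class ScannerCachingEstimate {

    /**
     * Estimates how many rows one scanner next() call can fetch while staying
     * under a response-size budget, given one observed (bytes, rows) sample.
     * Always returns at least 1 so the scan can still make progress.
     */
    public static int rowsPerNext(double observedBytes, int observedRows, double budgetBytes) {
        if (observedRows <= 0 || observedBytes <= 0) {
            throw new IllegalArgumentException("need a positive sample");
        }
        double avgRowBytes = observedBytes / observedRows;
        int rows = (int) Math.floor(budgetBytes / avgRowBytes);
        return Math.max(1, rows);
    }

    public static void main(String[] args) {
        double mib = 1024.0 * 1024.0;
        // Sample taken from the region server warning: 121.7m for 50 rows.
        double observed = 121.7 * mib;
        // Hypothetical budget of 10 MB per next() call.
        int caching = rowsPerNext(observed, 50, 10 * mib);
        System.out.println("suggested scanner caching: " + caching);
    }
}
```

With the numbers from the log and a 10 MB budget this suggests fetching about 4 rows per call instead of 50. Whether that is enough to avoid the warning on this cluster is not something the thread confirms.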
-
Re: is there any way to copy data from one table to another while updating rowKey??
Ted Yu 2012-01-12, 17:46
I think you need to manipulate the keyvalue to match the new row.
Take a look at the check:

    //Checking that the row of the kv is the same as the put
    int res = Bytes.compareTo(this.row, 0, row.length,
        kv.getBuffer(), kv.getRowOffset(), kv.getRowLength());
    if(res != 0) {
      throw new IOException("The row in the recently added KeyValue " +

Cheers
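Editor's note: the advice above amounts to rebuilding each KeyValue on the new row before adding it to the Put, because every KeyValue carries its own copy of the row key. The sketch below models that idea in plain Java with no HBase dependency. Cell and MiniPut are hypothetical stand-ins for KeyValue and Put, and swapFields assumes the four-field, colon-separated key layout that appears in the exception message. With the real client, the equivalent step would presumably be creating a new KeyValue from the new row plus the old family, qualifier, timestamp and value.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public final class RowKeyRewriteSketch {

    /** Hypothetical stand-in for a KeyValue: each cell carries its own row bytes. */
    public static final class Cell {
        public final byte[] row;
        public final String column;
        public final byte[] value;

        public Cell(byte[] row, String column, byte[] value) {
            this.row = row;
            this.column = column;
            this.value = value;
        }
    }

    /** Hypothetical stand-in for a Put: rejects cells whose row differs from its own. */
    public static final class MiniPut {
        public final byte[] row;
        public final List<Cell> cells = new ArrayList<>();

        public MiniPut(byte[] row) {
            this.row = row;
        }

        public void add(Cell cell) {
            if (!Arrays.equals(row, cell.row)) {
                throw new IllegalArgumentException("cell row does not match put row");
            }
            cells.add(cell);
        }
    }

    /** Swaps the 2nd and 4th colon-separated fields of a four-field key. */
    public static String swapFields(String oldKey) {
        String[] p = oldKey.split(":");
        if (p.length != 4) {
            throw new IllegalArgumentException("expected 4 fields: " + oldKey);
        }
        return p[0] + ":" + p[3] + ":" + p[2] + ":" + p[1];
    }

    /** Builds a put for newRow by copying every old cell onto newRow first. */
    public static MiniPut rekey(List<Cell> oldCells, byte[] newRow) {
        MiniPut put = new MiniPut(newRow);
        for (Cell c : oldCells) {
            put.add(new Cell(newRow, c.column, c.value));
        }
        return put;
    }

    public static void main(String[] args) {
        String oldKey = "84784841:1319846400:daily:PotentialReach";
        byte[] oldRow = oldKey.getBytes(StandardCharsets.UTF_8);
        byte[] newRow = swapFields(oldKey).getBytes(StandardCharsets.UTF_8);
        List<Cell> cells = new ArrayList<>();
        cells.add(new Cell(oldRow, "f:q", new byte[] {1}));
        MiniPut put = rekey(cells, newRow);
        System.out.println(new String(put.row, StandardCharsets.UTF_8));
    }
}
```

Adding the old cells directly to a MiniPut keyed on the new row fails in the same way as the import in the thread, while rekey succeeds because it copies each cell onto the new row first.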
-
Re: is there any way to copy data from one table to another while updating rowKey??
Stack 2012-01-12, 22:39
And what is happening on the server ip-10-68-145-124.ec2.internal:60020 such that 14 attempts at getting a region failed? Is that region online during this time or being moved? If not online, why not? Was the server taking too long to open the region (because of high load)? Grep around the region name in the master log to see what was happening with it at the time of the failures you posted.

Folks copy from one table to the other all the time without needing an HDFS intermediary resting stop.

St.Ack
-
Re: is there any way to copy data from one table to another while updating rowKey??
T Vinod Gupta 2012-01-13, 05:47
Stack,
Here are some of the failures I'm getting now. I don't know what's wrong with my HBase right now. I literally stopped my main processes that write to the store. I wrote an app to delete a bunch of old data which we don't need any more, so that app is doing scans and deletes (specific columns of rows based on some custom logic).

2012-01-13 05:42:21,201 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 8 on 60020 caught: java.nio.channels.ClosedChannelException
        at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:144)
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:342)
        at org.apache.hadoop.hbase.ipc.HBaseServer.channelIO(HBaseServer.java:1389)
        at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1341)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:727)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:792)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1083)

2012-01-13 05:42:22,812 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server Responder, call multi(org.apache.hadoop.hbase.client.MultiAction@444ea383) from 10.68.145.124:35132: output error
2012-01-13 05:42:22,812 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 7 on 60020 caught: java.nio.channels.ClosedChannelException
        at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:144)
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:342)
        at org.apache.hadoop.hbase.ipc.HBaseServer.channelIO(HBaseServer.java:1389)
        at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1341)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:727)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:792)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1083)

On my master server I see these happening -
2012-01-13 04:48:37,301 WARN org.apache.hadoop.hbase.master.CatalogJanitor: Failed scan of catalog table
java.net.SocketTimeoutException: Call to /10.68.145.124:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/10.68.145.124:40155 remote=/10.68.145.124:60020]
        at org.apache.hadoop.hbase.ipc.HBaseClient.wrapException(HBaseClient.java:802)
        at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:775)
        at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
        at $Proxy6.getRegionInfo(Unknown Source)
        at org.apache.hadoop.hbase.catalog.CatalogTracker.verifyRegionLocation(CatalogTracker.java:424)
        at org.apache.hadoop.hbase.catalog.CatalogTracker.getMetaServerConnection(CatalogTracker.java:272)
        at org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMeta(CatalogTracker.java:331)
        at org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMetaServerConnectionDefault(CatalogTracker.java:364)
        at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:255)
        at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:237)
        at org.apache.hadoop.hbase.master.CatalogJanitor.scan(CatalogJanitor.java:116)
        at org.apache.hadoop.hbase.master.CatalogJanitor.chore(CatalogJanitor.java:85)
        at org.apache.hadoop.hbase.Chore.run(Chore.java:66)

I did see some archive threads on this, but I don't know what exactly is causing this or how to get out of it.

Thanks
-
Re: is there any way to copy data from one table to another while updating rowKey??
Stack 2012-01-13, 20:31
On Thu, Jan 12, 2012 at 9:47 PM, T Vinod Gupta <[EMAIL PROTECTED]> wrote:
> i wrote an app to delete bunch of old data which we dont need
> any more.. so that app is doing scans and deletes (specific columns of rows
> based on some custom logic).
>

You understand that you are writing a new entry per item you are deleting?

> 2012-01-13 05:42:21,201 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 8 on 60020 caught: java.nio.channels.ClosedChannelException
>        at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:144)
>        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:342)
>        at org.apache.hadoop.hbase.ipc.HBaseServer.channelIO(HBaseServer.java:1389)
>        at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1341)
>        at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:727)
>        at

Looks like the client went away before the server had time to respond. You seeing hard-working servers?

> 2012-01-13 04:48:37,301 WARN org.apache.hadoop.hbase.master.CatalogJanitor:
> Failed scan of catalog table
> java.net.SocketTimeoutException: Call to /10.68.145.124:60020 failed on
> socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout
> while wa

This looks like why the client went away... it didn't get a response within 60 seconds.

St.Ack