HBase >> mail # user >> Replication - some timestamps off by 1 ms


Re: Replication - some timestamps off by 1 ms
Yeah, increments won't work with verifyrep. The warning isn't very
visible, but one place you can see it is:

$ ./bin/hadoop jar ../hbase/hbase.jar
An example program must be given as the first argument.
Valid program names are:
  CellCounter: Count cells in HBase table
  completebulkload: Complete a bulk data load.
  copytable: Export a table from local cluster to peer cluster
  export: Write table data to HDFS.
  import: Import data written by Export.
  importtsv: Import data in TSV format.
  rowcounter: Count rows in HBase table
vvvv
  verifyrep: Compare the data from tables in two different clusters.
WARNING: It doesn't work for incrementColumnValues'd cells since the
timestamp is changed after being appended to the log.
^^^^

The problem is that increments' timestamps are different in the WAL
and in the final KV that's stored in HBase.
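
To check whether a given BADROWS entry is this case, one rough option is to
parse the two keyvalues dumps from the verifyrep error and confirm that the
mismatched cells differ only in their timestamp. The sketch below is an
illustration, not part of HBase: the names parse_cells and
timestamp_only_diffs are hypothetical, and it assumes each cell is printed
as <row>/<family>:<qualifier>/<timestamp>/<type>/vlen=<n>, with cells
separated by a comma plus whitespace, as in the error quoted further down.
The dump only carries the value length, so the values themselves are not
compared.

```python
import re
from typing import List, NamedTuple, Tuple


class Cell(NamedTuple):
    key: str        # "<row>/<family>:<qualifier>", kept exactly as printed
    timestamp: int  # cell timestamp in milliseconds
    type: str       # e.g. "Put"
    vlen: int       # value length in bytes


# Assumed textual layout of one cell in the verifyrep error message.
# The greedy key group lets the qualifier itself contain "/" characters.
_CELL_RE = re.compile(
    r"^(?P<key>.*)/(?P<ts>\d+)/(?P<type>[A-Za-z]+)/vlen=(?P<vlen>\d+)$",
    re.DOTALL,
)


def parse_cells(dump: str) -> List[Cell]:
    """Parse a 'keyvalues={cell, cell, ...}' string into Cell tuples."""
    body = dump[dump.index("{") + 1 : dump.rindex("}")]
    cells = []
    # Assumption: no qualifier contains a comma followed by whitespace.
    for part in re.split(r",\s+", body.strip()):
        m = _CELL_RE.match(part)
        if m is None:
            raise ValueError(f"unrecognised cell: {part!r}")
        cells.append(Cell(m["key"], int(m["ts"]), m["type"], int(m["vlen"])))
    return cells


def timestamp_only_diffs(
    source_dump: str, peer_dump: str, tolerance_ms: int = 1
) -> List[Tuple[str, int, int]]:
    """Return (key, source_ts, peer_ts) for cells that differ only by timestamp.

    Raises ValueError if the rows differ in any other way, since that would
    not be explained by the increment timestamp issue.
    """
    src, dst = parse_cells(source_dump), parse_cells(peer_dump)
    if len(src) != len(dst):
        raise ValueError("rows have a different number of cells")
    diffs = []
    for a, b in zip(src, dst):
        if a == b:
            continue
        same_rest = (a.key, a.type, a.vlen) == (b.key, b.type, b.vlen)
        if same_rest and abs(a.timestamp - b.timestamp) <= tolerance_ms:
            diffs.append((a.key, a.timestamp, b.timestamp))
        else:
            raise ValueError(f"cells differ beyond timestamp: {a} vs {b}")
    return diffs
```

Passing the two dumps as raw strings (so the \x escapes stay literal) gives
the list of timestamp-only mismatches. A row that raises ValueError differs
in some other way and would need a closer look.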

J-D

On Thu, Jul 11, 2013 at 12:19 PM, Patrick Schless
<[EMAIL PROTECTED]> wrote:
> It's possible, but I'm not sure. This is a live system, and we do use
> increment, and it's a smaller portion of our writes into HBase. I can try
> to duplicate it, but I can't say how these specific cells got written.
>
> Would incremented cells not get replicated correctly?
>
>
> On Thu, Jul 11, 2013 at 12:53 PM, Jean-Daniel Cryans <[EMAIL PROTECTED]> wrote:
>
>> Are those incremented cells?
>>
>> J-D
>>
>> On Thu, Jul 11, 2013 at 10:23 AM, Patrick Schless
>> <[EMAIL PROTECTED]> wrote:
>> > I have had replication running for about a week now, and have had a lot of
>> > data flowing to our slave cluster over that time. Now, I'm running the
>> > verifyrep MR job over a 1-hour period a couple days ago (which should be
>> > fully replicated), and I'm seeing a small number of "BADROWS".
>> > Spot-checking a few of them, the issue seems to be that the rows are
>> > present, and have the same values, but a single cell in the row will be
>> > off by 1ms.
>> >
>> > For instance, the log reports this error:
>> > java.lang.Exception: This result was different:
>> > keyvalues={01e581745c6a43aba01adf105af4e4a92013071015/data:!\xDF\xE0\x01/1373470622986/Put/vlen=8,
>> > 01e581745c6a43aba01adf105af4e4a92013071015/data:&s\xC0\x01/1373470923084/Put/vlen=8,
>> > 01e581745c6a43aba01adf105af4e4a92013071015/data:+\x07\xA0\x01/1373471223717/Put/vlen=8,
>> > 01e581745c6a43aba01adf105af4e4a92013071015/data:/\x9B\x80\x01/1373471523316/Put/vlen=8,
>> > 01e581745c6a43aba01adf105af4e4a92013071015/data:4/`\x01/1373471822913/Put/vlen=8}
>> > compared to
>> > keyvalues={01e581745c6a43aba01adf105af4e4a92013071015/data:!\xDF\xE0\x01/1373470622986/Put/vlen=8,
>> > 01e581745c6a43aba01adf105af4e4a92013071015/data:&s\xC0\x01/1373470923084/Put/vlen=8,
>> > 01e581745c6a43aba01adf105af4e4a92013071015/data:+\x07\xA0\x01/1373471223716/Put/vlen=8,
>> > 01e581745c6a43aba01adf105af4e4a92013071015/data:/\x9B\x80\x01/1373471523316/Put/vlen=8,
>> > 01e581745c6a43aba01adf105af4e4a92013071015/data:4/`\x01/1373471822913/Put/vlen=8}
>> >
>> > Some diffing reduces the issue down to:
>> > 01e581745c6a43aba01adf105af4e4a92013071015/data:+\x07\xA0\x01/1373471223717/Put/vlen=8
>> > compared to
>> > 01e581745c6a43aba01adf105af4e4a92013071015/data:+\x07\xA0\x01/1373471223716/Put/vlen=8.
>> >
>> > I'm assuming that the value before "/Put" is the cell's timestamp, which
>> > means that the copies are off by 1ms.
>> >
>> > Any idea what could cause this? So far (the job is still running), the
>> > problem seems rare (about 0.05% of rows).
>> >
>> > Thanks,
>> > Patrick
>>