HBase >> mail # user >> scan filtering column family return wrong cell


Re: scan filtering column family return wrong cell
OK, I can answer my own question...

You have to add a clone of each KeyValue to the Put, not the KeyValue itself. So
  p.add(kv);
becomes
  p.add(kv.clone());

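If not, I suppose every entry in the Put ends up pointing at the same
KeyValue object, because Hadoop reuses one instance while iterating over
the reducer's values, so only the last one is added to HBase (the result
is quite weird and should be fixed IMO).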
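
To illustrate the mechanism without needing a cluster, here is a minimal
sketch in plain Java. It does not use HBase or Hadoop at all: Cell is a
hypothetical stand-in for KeyValue, and reusingIterable() is a hypothetical
stand-in for the reducer's value iterable, assuming it hands back one shared
mutable instance on every iteration. The names ReusedValueDemo, Cell,
reusingIterable and collect are all made up for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class ReusedValueDemo {

    /** Hypothetical stand-in for KeyValue: a mutable, cloneable byte holder. */
    static final class Cell implements Cloneable {
        byte[] bytes = new byte[0];

        @Override
        public Cell clone() {
            Cell copy = new Cell();
            copy.bytes = bytes.clone(); // deep copy of the backing array
            return copy;
        }

        @Override
        public String toString() {
            return new String(bytes, StandardCharsets.UTF_8);
        }
    }

    /** Assumption: mimics a value iterable that refills ONE shared instance on each next(). */
    static Iterable<Cell> reusingIterable(String... values) {
        final Cell shared = new Cell();
        return () -> new Iterator<Cell>() {
            private int i = 0;

            @Override
            public boolean hasNext() {
                return i < values.length;
            }

            @Override
            public Cell next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                shared.bytes = values[i++].getBytes(StandardCharsets.UTF_8);
                return shared; // same object every time, new contents
            }
        };
    }

    /** Keeps every cell seen during iteration, either by reference or as a clone. */
    static List<String> collect(Iterable<Cell> cells, boolean cloneEach) {
        List<Cell> kept = new ArrayList<>();
        for (Cell c : cells) {
            kept.add(cloneEach ? c.clone() : c);
        }
        // Read the contents only after the iteration has finished.
        List<String> out = new ArrayList<>();
        for (Cell c : kept) {
            out.add(c.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(collect(reusingIterable("a", "b", "c"), false)); // [c, c, c]
        System.out.println(collect(reusingIterable("a", "b", "c"), true));  // [a, b, c]
    }
}
```

Without the clone, every kept reference shows the last value, which looks
like the duplicated cell in the get output quoted below. With the clone,
each value survives the iteration.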

Cheers,

--
Damien
2012/11/9 Damien Hardy <[EMAIL PROTECTED]>

> Hello,
>
> I am a bit confused here...
>
> I am trying to run a M/R job to import data into the HBase table 'Consultation'.
>
> Running on CDH4.1.2
>
> map function emits context.write(ImmutableBytesWritable, KeyValue)
>
> conf summary :
>     job.setOutputFormatClass(TableOutputFormat.class);
>     job.setInputFormatClass(DataDrivenDBInputFormat.class);
>     job.getConfiguration().set(TableOutputFormat.OUTPUT_TABLE,
> "Consultation");
>     job.setOutputKeyClass(ImmutableBytesWritable.class);
>     job.setOutputValueClass(KeyValue.class);
>
>
> The reduce class is :
>
>   static class ImportReducer
>   extends TableReducer<ImmutableBytesWritable, KeyValue,
> ImmutableBytesWritable> {
>     @Override
>     public void reduce(ImmutableBytesWritable row, Iterable<KeyValue> kvs,
> Reducer<ImmutableBytesWritable, KeyValue, ImmutableBytesWritable,
> Writable>.Context context)
>     throws java.io.IOException, InterruptedException {
>       Put p = new Put(row.copyBytes());
>       int i = 0;
>       byte[] rk = null;
>       for (KeyValue kv: kvs) {
>         p.add(kv);
>         if ( Bytes.compareTo(CF_VISITED, 0, CF_VISITED.length,
> kv.getBuffer(), kv.getFamilyOffset(), kv.getFamilyLength() ) == 0 ) {
>           i++;
>         }
>       }
>       p.add(CF_COUNTER,QA_COUNTER,Bytes.toBytes(i));
>       context.write(new ImmutableBytesWritable(row),p);
>     }
>   }
>
>
> hbase(main):038:0> scan 'Consultation', {COLUMNS=> *'visiting_tl'*, LIMIT
> => 10 }
> ROW
> COLUMN+CELL
>
>  00070db1aa26d1906a078a1e03f788cb-\x00\x13\x80\x15         column=*
> visited_tl:*\x7F\xFF\xFE\xD9\x00\xFC\xDB\xB7\x001\xC5\xA7,
> timestamp=1266998781000,
> value=\x00\x00\x00\x00
>
>  001316263fc8b454bbd86dff1587a347-\x00>t\x05               column=*
> visited_tl:*\x7F\xFF\xFE\xD7\x0F\xB8u_\x00\x08\xE1\xA0,
> timestamp=1275341540000,
> value=\x00\x00\x00\x00
>
>  001497e68d7c71a3cd281860484fa6be-\x00/\x0E^               column=*
> visited_tl:*\x7F\xFF\xFE\xD8\x06\x9B\xB0\xB7\x00(3S,
> timestamp=1271199453000,
> value=\x00\x00\x00\x00
>
>  001845aac2462a1c24b36eb90ab698cf-\x00\x04\x1E\xF5         column=*
> visited_tl:*\x7F\xFF\xFE\xD6\xA8\xB9-\xEF\x002Po,
> timestamp=1277069546000,
> value=\x00\x00\x00\x01
>
>  0019cec2c1f38c42b1c540ef7708c6a9-\x00;\xE0\x97            column=*
> visited_tl:*\x7F\xFF\xFE\xD8\xF9\xC7\x0C_\x00\x02?.,
> timestamp=1267119748000,
> value=\x00\x00\x00\x00
>
>  001de6b92754b0ef44ee10bf2bdfe3c3-\x00%\x1AV               column=*
> visited_tl:*\x7F\xFF\xFE\xD6\xE4H\x99\xC7\x00\x0F\x7F9,
> timestamp=1276070291000,
> value=\x00\x00\x00\x01
>
>  00217f082f96eb12108c139b99a3ccb7-\x00\x02w\x08            column=*
> visited_tl:*\x7F\xFF\xFE\xD8\xEB\x1B\x95\xEF\x00\x0A7\x19,
> timestamp=1267365866000,
> value=\x00\x00\x00\x00
>
>  0021cbfd559f56dd298e4b4fee7626a9-\x00r\xBF\xFA            column=*
> visited_tl:*\x7F\xFF\xFE\xD6\xA1\x0B-\x0F\x00\x03\xBC\x8B,
> timestamp=1277198390000,
> value=\x00\x00\x00\x02
>
>  00266c02a60f9a6efb5d24317e6032a0-\x00\x0E]+               column=*
> visited_tl:*\x7F\xFF\xFE\xD6\xBC\x0D\xD1\x7F\x00/ q,
> timestamp=1276745232000,
> value=\x00\x00\x00\x01
>
>  0026dbbd6562da5b79f1b09e94e3b973-\x00C[\x93               column=*
> visited_tl:*\x7F\xFF\xFE\xD7\xB0\xFA\xB7/\x00\x02~\x09,
> timestamp=1272636066000,
> value=\x00\x00\x00\x01
>
> 10 row(s) in 2.1130 seconds
>
>
> hbase(main):036:0> get  'Consultation',
> "00070db1aa26d1906a078a1e03f788cb-\x00\x13\x80\x15"
> COLUMN
> CELL
>
>  *visited_tl:\x7F\xFF\xFE\xD9\x00\xFC\xDB\xB7\x001\xC5\xA7*
> timestamp=1266998781000,
> value=\x00\x00\x00\x00
>
>  *visited_tl:\x7F\xFF\xFE\xD9\x00\xFC\xDB\xB7\x001\xC5\xA7*
> timestamp=1266998781000,
> value=\x00\x00\x00\x00
>
>  visits_count:_counter
Damien HARDY
IT Infrastructure Architect

Viadeo - 30 rue de la Victoire - 75009 Paris - France
T : +33 1 80 48 39 73 – F : +33 1 42 93 22 56