Hadoop >> mail # dev >> [VOTE] port HADOOP-6218 (Split TFile by Record Sequence Number) to hadoop 0.20/0.21


[VOTE] port HADOOP-6218 (Split TFile by Record Sequence Number) to hadoop 0.20/0.21
HADOOP-6218 exposed TFile's internal "Location" object as a global Record
Sequence Number (RecNum). The feature is useful in a number of ways:
(1) it supports progress reporting for upper layers (object file, Zebra);
(2) a secondary index can use a RecNum as a cursor; (3) it supports aligned
splits across multiple parallel TFiles. Given that TFile is still at an
early stage of adoption, I suggest that we backport the patch to
Hadoop 0.20/0.21 now.

-Hong
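
To make uses (1) and (3) above concrete, here is a minimal, self-contained sketch of how a global record sequence number could drive aligned splits and progress reporting. It does not use the TFile API or reproduce the HADOOP-6218 patch; the class RecNumSplits, its Range type, and the split and progress methods are hypothetical names chosen for illustration. The assumption is that parallel files hold the same number of records in the same order, so the same RecNum range selects matching rows in each file.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop: plain arithmetic on record numbers.
final class RecNumSplits {

    /** Half-open range [begin, end) of record sequence numbers. */
    static final class Range {
        final long begin;
        final long end;

        Range(long begin, long end) {
            this.begin = begin;
            this.end = end;
        }
    }

    /**
     * Divides records 0..totalRecords-1 into numSplits contiguous ranges whose
     * sizes differ by at most one. Applying the same ranges to every parallel
     * file (assumed to have identical record counts) keeps the splits aligned.
     */
    static List<Range> split(long totalRecords, int numSplits) {
        if (totalRecords < 0 || numSplits <= 0) {
            throw new IllegalArgumentException("totalRecords >= 0 and numSplits > 0 required");
        }
        List<Range> ranges = new ArrayList<>();
        long base = totalRecords / numSplits;   // minimum records per split
        long extra = totalRecords % numSplits;  // first 'extra' splits get one more
        long begin = 0;
        for (int i = 0; i < numSplits; i++) {
            long size = base + (i < extra ? 1 : 0);
            ranges.add(new Range(begin, begin + size));
            begin += size;
        }
        return ranges;
    }

    /** Fraction of a range consumed when the cursor sits at currentRecNum. */
    static double progress(Range range, long currentRecNum) {
        long length = range.end - range.begin;
        if (length == 0) {
            return 1.0; // an empty split is trivially complete
        }
        long done = Math.min(Math.max(currentRecNum - range.begin, 0), length);
        return (double) done / length;
    }
}
```

For example, split(10, 3) yields the ranges [0, 4), [4, 7), and [7, 10). A reader positioned at record 5 of the second range would report progress(range, 5) as one third. Because the ranges depend only on record counts rather than byte offsets, they can be computed once and reused for every file in a parallel set.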