Accumulo >> mail # dev >> Table entry count confusion


Table entry count confusion
I have an interesting dilemma: my Accumulo cluster overview says that
I have over 1.4 billion entries in the table, yet when I run a scan
that keeps track of unique row IDs, I get back a number (a little over
30 million) that is drastically lower than what the table claims to
have. I read the legend and it says, "Entries: Key/value pairs over each
instance, table or tablet." I was under the impression that Accumulo tables
did away with duplicate rows, hence my curiosity as to why there are
apparently about 45 times more entries than there should be. Do I need to
perform a compaction or some other action to rid my cluster of what I
believe to be duplicate entries?
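
To make the two numbers concrete, here is a minimal sketch of the comparison in Python. It is a toy in-memory model, not the Accumulo API: the `Entry` tuple, the `count_entries_and_rows` helper, and the sample data are all made up for illustration. It only shows that a count of key/value pairs and a count of distinct row IDs can differ when a row holds more than one column; whether that accounts for the gap on my cluster is an open question.

```python
from collections import namedtuple

# Hypothetical stand-in for one key/value pair; not the real Accumulo Key class.
Entry = namedtuple("Entry", ["row", "family", "qualifier", "value"])


def count_entries_and_rows(entries):
    """Return (number of key/value pairs, number of distinct row IDs)."""
    total = 0
    rows = set()
    for entry in entries:
        total += 1           # every key/value pair counts as one entry
        rows.add(entry.row)  # a row ID is counted once, however many columns it has
    return total, len(rows)


# Made-up sample data: two row IDs, three columns each.
sample = [
    Entry("row1", "meta", "name", "a"),
    Entry("row1", "meta", "size", "b"),
    Entry("row1", "data", "body", "c"),
    Entry("row2", "meta", "name", "d"),
    Entry("row2", "meta", "size", "e"),
    Entry("row2", "data", "body", "f"),
]

total, distinct = count_entries_and_rows(sample)
print("entries:", total, "distinct rows:", distinct)
```

In this toy data the entry count is three times the row count simply because each row carries three columns.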

Thanks,
Jeff

-----

--
View this message in context: http://apache-accumulo.1065345.n5.nabble.com/Table-entry-count-confusion-tp5629.html
Sent from the Developers mailing list archive at Nabble.com.