Hadoop >> mail # user >> What's the easiest way to count the number of <Key, Value> pairs in a directory?


Re: What's the easiest way to count the number of <Key, Value> pairs in a directory?
Are you storing the data in sequence files?

-Joey

On Fri, May 20, 2011 at 10:33 AM, W.P. McNeill <[EMAIL PROTECTED]> wrote:
> The keys are Text and the values are large custom data structures serialized
> with Avro.
>
> I also have counters for the job that generates these files, which give me
> this information, but sometimes... Well, it's a long story.  Suffice it to say
> that it's nice to have a post-hoc method too.  :-)
>
> The identity mapper sounds like the way to go.
>

--
Joseph Echeverria
Cloudera, Inc.
443.305.9434