

Re: What's the easiest way to count the number of <Key, Value> pairs in a directory?
Are you storing the data in sequence files?

-Joey

On Fri, May 20, 2011 at 10:33 AM, W.P. McNeill <[EMAIL PROTECTED]> wrote:
> The keys are Text and the values are large custom data structures serialized
> with Avro.
>
> I also have counters for the job that generates these files that give me
> this information, but sometimes... Well, it's a long story. Suffice it to say
> that it's nice to have a post-hoc method too.  :-)
>
> The identity mapper sounds like the way to go.
>

--
Joseph Echeverria
Cloudera, Inc.
443.305.9434