Re: Sorting ...

On May 22, 2011 03:21:53 Mark question wrote:
> I'm trying to sort Sequence files using the Hadoop-Example TeraSort. But
> after taking a couple of minutes .. output is empty.

<snip>

> I'm trying to find what the input format for the TeraSort is, but it is not
> specified.
>
> Thanks for any thought,
> Mark

TeraSort sorts lines of text, not SequenceFiles.  The InputFormat (for version 0.20.2) is defined in

hadoop-0.20.2/src/examples/org/apache/hadoop/examples/terasort/TeraInputFormat.java

The documentation at the top of the class says "An input format that reads the
first 10 characters of each line as the key and the rest of the line as the
value."
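
To make that concrete, here is a minimal sketch of the split that sentence
describes.  It is plain Java with no Hadoop dependency and is not the actual
TeraInputFormat source: the class name TeraLineSplit and the method split are
made up for illustration, it works on String characters rather than the raw
bytes a real record reader sees, and the handling of lines shorter than 10
characters is my assumption.

```java
// Illustrative sketch only; NOT the real TeraInputFormat implementation.
// TeraLineSplit and split() are hypothetical names.
final class TeraLineSplit {
    // The class documentation says the first 10 characters form the key.
    static final int KEY_LENGTH = 10;

    /** Returns a two-element array {key, value} for one line of text. */
    static String[] split(String line) {
        if (line.length() < KEY_LENGTH) {
            // Assumption: a short line becomes the whole key, with an empty value.
            return new String[] { line, "" };
        }
        return new String[] {
            line.substring(0, KEY_LENGTH),  // key: first 10 characters
            line.substring(KEY_LENGTH)      // value: rest of the line
        };
    }

    public static void main(String[] args) {
        String[] kv = split("0123456789rest of the record");
        System.out.println("key=[" + kv[0] + "] value=[" + kv[1] + "]");
    }
}
```

If your input is a SequenceFile rather than text, the bytes on disk will not
line up with this key/value layout, which may be why the job does not produce
the output you expect.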

HTH

--
Luca Pireddu
CRS4 - Distributed Computing Group
Loc. Pixina Manna Edificio 1
Pula 09010 (CA), Italy
Tel:  +39 0709250452