Hadoop, mail # user - Re: File formats in Hadoop


Re: File formats in Hadoop
Niels Basjes 2011-03-20, 11:04
And then there is the matter of how you put the data in the file. I've
heard that some people write their records as Protocol Buffers messages
into a SequenceFile.
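
To make the idea concrete, here is a rough, stdlib-only Java sketch of the
pattern: each record is first serialized to an opaque byte array, and the
container file only stores key/value pairs of bytes. Everything in it is an
assumption for illustration: RecordContainerSketch, encodeUser and decodeUser
are hypothetical names, encodeUser merely stands in for a generated protobuf
message's toByteArray(), and the length-prefixed layout is NOT the real
SequenceFile on-disk format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

final class RecordContainerSketch {

    // Hypothetical stand-in for a protobuf message's toByteArray():
    // turns one record into an opaque byte array.
    static byte[] encodeUser(int id, String name) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeInt(id);
        out.writeUTF(name);
        out.flush();
        return buf.toByteArray();
    }

    // Hypothetical stand-in for parsing the message back from bytes.
    static String decodeUser(byte[] bytes) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
        int id = in.readInt();
        String name = in.readUTF();
        return id + ":" + name;
    }

    // Simplified container: a run of length-prefixed key/value byte pairs.
    // The container never looks inside the value bytes.
    static byte[] writeRecords(Map<String, byte[]> records) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        for (Map.Entry<String, byte[]> e : records.entrySet()) {
            byte[] key = e.getKey().getBytes(StandardCharsets.UTF_8);
            out.writeInt(key.length);
            out.write(key);
            out.writeInt(e.getValue().length);
            out.write(e.getValue());
        }
        out.flush();
        return buf.toByteArray();
    }

    // Reads the pairs back in the order they were written.
    static Map<String, byte[]> readRecords(byte[] file) throws IOException {
        Map<String, byte[]> records = new LinkedHashMap<>();
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(file));
        while (in.available() > 0) {
            byte[] key = new byte[in.readInt()];
            in.readFully(key);
            byte[] value = new byte[in.readInt()];
            in.readFully(value);
            records.put(new String(key, StandardCharsets.UTF_8), value);
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        Map<String, byte[]> records = new LinkedHashMap<>();
        records.put("row-1", encodeUser(1, "alice"));
        records.put("row-2", encodeUser(2, "bob"));
        byte[] file = writeRecords(records);
        for (Map.Entry<String, byte[]> e : readRecords(file).entrySet()) {
            System.out.println(e.getKey() + " -> " + decodeUser(e.getValue()));
        }
    }
}
```

With real Hadoop and protobuf, the value bytes would presumably come from the
generated message class and go into something like a BytesWritable value of a
SequenceFile; the point is only that the container format and the record
serialization are two separate choices.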

2011/3/19 Harsh J <[EMAIL PROTECTED]>:
> Hello,
>
> On Sat, Mar 19, 2011 at 9:31 PM, Weishung Chung <[EMAIL PROTECTED]> wrote:
>> I am browsing through the hadoop.io package and was wondering what other
>> file formats are available in hadoop other than SequenceFile and TFile?
>
> Additionally, on Hadoop, there are MapFiles/SetFiles (derivatives of
> SequenceFiles, if you need maps/sets), and IFiles (used by the
> map-output buffers to produce a key-value file for Reducers to use;
> internal use only).
>
> Apache Hive uses RCFiles, which are very interesting too. Apache Avro
> provides Avro data files that are designed for use with Hadoop
> Map/Reduce + Avro-serialized data.
>
> I'm not sure about this one, but Pig was probably implementing a
> table-file-like solution of its own a while ago. Howl?
>
> --
> Harsh J
> http://harshj.com
>

--
Met vriendelijke groeten,

Niels Basjes