
Re: File formats in Hadoop
And then there is the matter of how you put the data in the file. I've
heard that some people serialize their records as Protocol Buffers and
store the resulting bytes in a SequenceFile.
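
To illustrate the idea only: the container just sees keys and opaque value bytes, and the serialization library owns what is inside the value. The sketch below is a toy length-prefixed key/value file in Python, not Hadoop's actual SequenceFile on-disk format, and the function names and payload bytes are made up for the example. In a real job the value would typically be the output of a Protocol Buffers message's toByteArray() wrapped in a BytesWritable.

```python
import io
import struct


def write_record(out, key: bytes, value: bytes) -> None:
    """Append one record: two 4-byte big-endian lengths, then key and value."""
    out.write(struct.pack(">II", len(key), len(value)))
    out.write(key)
    out.write(value)


def read_records(inp):
    """Yield (key, value) pairs until the stream is exhausted."""
    while True:
        header = inp.read(8)
        if not header:
            return  # clean end of file
        if len(header) < 8:
            raise ValueError("truncated record header")
        key_len, value_len = struct.unpack(">II", header)
        key = inp.read(key_len)
        value = inp.read(value_len)
        if len(key) != key_len or len(value) != value_len:
            raise ValueError("truncated record body")
        yield key, value


# Hypothetical payloads: placeholder bytes standing in for serialized messages.
# The container never interprets them.
records = [
    (b"user-1", b"\x08\x01\x12\x05alice"),
    (b"user-2", b"\x08\x02\x12\x03bob"),
]

buf = io.BytesIO()
for k, v in records:
    write_record(buf, k, v)

buf.seek(0)
round_tripped = list(read_records(buf))
```

The reader hands back exactly the bytes that were written, so decoding them is left to whatever schema the writer used.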

2011/3/19 Harsh J <[EMAIL PROTECTED]>:
> Hello,
> On Sat, Mar 19, 2011 at 9:31 PM, Weishung Chung <[EMAIL PROTECTED]> wrote:
>> I am browsing through the hadoop.io package and was wondering which
>> file formats are available in Hadoop besides SequenceFile and TFile?
> Additionally, in Hadoop there are MapFiles/SetFiles (derivatives of
> SequenceFiles, if you need maps/sets) and IFiles (used by the
> map-output buffers to produce a key-value file for Reducers to use;
> internal use only).
> Apache Hive uses RCFiles, which are very interesting too. Apache Avro
> provides Avro data files that are designed for use with Hadoop
> Map/Reduce + Avro-serialized data.
> I'm not sure about this one, but I think Pig was implementing a
> table-file-like solution of its own a while ago. Howl?
> --
> Harsh J
> http://harshj.com

Kind regards,

Niels Basjes