MapReduce, mail # user - Sequence File usage queries


Re: Sequence File usage queries
Ted Yu 2011-02-19, 17:17
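Option 2 is better: keep each file's layout in its SequenceFile.Metadata header.
Please see this createWriter overload in SequenceFile, which takes a Metadata argument: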
  public static Writer
    createWriter(FileSystem fs, Configuration conf, Path name,
                 Class keyClass, Class valClass, int bufferSize,
                 short replication, long blockSize,
                 CompressionType compressionType, CompressionCodec codec,
                 Progressable progress, Metadata metadata)
      throws IOException {
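
To make option 2 a bit more concrete, here is a rough sketch of how one file's column layout might be flattened into name/value string pairs before being put into the header. It uses only the JDK so it runs without Hadoop on the classpath. The class name LayoutMetadata, the key names ("layout.name" and so on), and the comma-joined encoding are all my own assumptions for illustration; none of them come from Hadoop. With Hadoop available, each pair would be copied into a SequenceFile.Metadata via set(new Text(key), new Text(value)) and that object passed as the last argument of createWriter; on the read side, SequenceFile.Reader.getMetadata() returns it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class LayoutMetadata {
    // Hypothetical key names chosen for this sketch; Hadoop does not define them.
    static final String KEY_LAYOUT = "layout.name";
    static final String KEY_COUNT = "layout.column.count";
    static final String KEY_COLUMNS = "layout.columns";

    // Flatten a layout into sorted name/value pairs, one pair per header entry.
    static TreeMap<String, String> encode(String layoutName, List<String> columns) {
        for (String column : columns) {
            // The comma is the separator, so reject names that would break decoding.
            if (column.isEmpty() || column.contains(",")) {
                throw new IllegalArgumentException("bad column name: '" + column + "'");
            }
        }
        TreeMap<String, String> entries = new TreeMap<>();
        entries.put(KEY_LAYOUT, layoutName);
        entries.put(KEY_COUNT, Integer.toString(columns.size()));
        entries.put(KEY_COLUMNS, String.join(",", columns));
        return entries;
    }

    // Recover the ordered column names from the pairs read back from a header.
    static List<String> decodeColumns(Map<String, String> entries) {
        String joined = entries.get(KEY_COLUMNS);
        if (joined == null || joined.isEmpty()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(Arrays.asList(joined.split(",")));
    }

    public static void main(String[] args) {
        // Example layout with three made-up columns.
        Map<String, String> entries = encode("layoutA", Arrays.asList("id", "name", "amount"));
        for (Map.Entry<String, String> e : entries.entrySet()) {
            // With Hadoop: metadata.set(new Text(e.getKey()), new Text(e.getValue()));
            System.out.println(e.getKey() + " = " + e.getValue());
        }
        System.out.println(decodeColumns(entries));
    }
}
```

This way the key and value classes can stay generic (for example Text for each line), and a reader can work out whether a file has 32 or 62 columns from the header alone.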
On Thu, Feb 17, 2011 at 1:16 PM, Mapred Learn <[EMAIL PROTECTED]> wrote:

> Hi,
> I have a use case to upload some terabytes of text files as sequence
> files on HDFS.
>
> These text files have several layouts ranging from 32 to 62 columns
> (metadata).
>
> What would be a good way to upload these files along with their metadata:
>
> i) create a key/value class per text file layout and use it to create
> and upload the sequence files?
>
> ii) create a SequenceFile.Metadata header in each file being uploaded as
> a sequence file individually?
>
> Any input is appreciated!
>
> Thanks
> -JJ
>