Re: Sequence File usage queries
I couldn't find a SequenceFile metadata viewer.
You will need to write some code for option ii) below.
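
As a rough sketch of what such code could look like without Hadoop on the classpath, the class below parses just the file header and returns the metadata name/value pairs. The class name `SeqFileMetadataViewer` and its methods are made up for illustration. It assumes a version 6 header laid out as SequenceFile.Writer writes it (magic bytes, key and value class names, two compression flags, optional codec name, then the metadata) and has only been reasoned through, not run against real files. With Hadoop available, `SequenceFile.Reader#getMetadata()` is the more direct route.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: reads only the header of a version 6 SequenceFile. */
public class SeqFileMetadataViewer {

    /** Decodes Hadoop's variable-length int (same scheme as WritableUtils.readVInt). */
    static int readVInt(DataInputStream in) throws IOException {
        byte first = in.readByte();
        if (first >= -112) {
            return first;                      // value fits in the first byte
        }
        boolean negative = first < -120;
        int extraBytes = negative ? -120 - first : -112 - first;
        long value = 0;
        for (int i = 0; i < extraBytes; i++) {
            value = (value << 8) | (in.readByte() & 0xFF);
        }
        return (int) (negative ? ~value : value);
    }

    /** Reads a vint length followed by that many UTF-8 bytes (how Text is serialized). */
    static String readText(DataInputStream in) throws IOException {
        int length = readVInt(in);
        if (length < 0) {
            throw new IOException("Negative string length: " + length);
        }
        byte[] bytes = new byte[length];
        in.readFully(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    /** Returns the metadata pairs stored in the header, sorted by name. */
    public static Map<String, String> readMetadata(InputStream raw) throws IOException {
        DataInputStream in = new DataInputStream(raw);
        byte[] magic = new byte[4];
        in.readFully(magic);
        if (magic[0] != 'S' || magic[1] != 'E' || magic[2] != 'Q') {
            throw new IOException("Not a SequenceFile");
        }
        if (magic[3] != 6) {
            // Assumption: only the version 6 header (the one with metadata) is handled.
            throw new IOException("Unsupported SequenceFile version: " + magic[3]);
        }
        readText(in);                          // key class name (skipped)
        readText(in);                          // value class name (skipped)
        boolean compressed = in.readBoolean();
        in.readBoolean();                      // block-compression flag (skipped)
        if (compressed) {
            readText(in);                      // codec class name (skipped)
        }
        int count = in.readInt();              // number of metadata entries
        if (count < 0) {
            throw new IOException("Invalid metadata size: " + count);
        }
        Map<String, String> metadata = new TreeMap<>();
        for (int i = 0; i < count; i++) {
            String name = readText(in);
            String value = readText(in);
            metadata.put(name, value);
        }
        return metadata;
    }
}
```

To use it, pass `readMetadata` any InputStream positioned at the start of the file and print the returned map. For a file on HDFS, one option is to pipe the output of `hadoop fs -cat` into a small `main` that hands `System.in` to `readMetadata`.
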

On Wed, Feb 23, 2011 at 4:24 PM, Mapred Learn <[EMAIL PROTECTED]> wrote:

> Thanks!
>
> In this case, how can we print the metadata associated with the data
> (sequence files) if a user accessing this data wants to see it?
> i) Is there any hadoop command that can do it?
> ii) Or will we have to provide some interface for the user to see the
> metadata?
>
> -JJ
>
> On Sat, Feb 19, 2011 at 9:17 AM, Ted Yu <[EMAIL PROTECTED]> wrote:
>
>> Option 2 is better.
>> Please see this in SequenceFile:
>>   public static Writer
>>     createWriter(FileSystem fs, Configuration conf, Path name,
>>                  Class keyClass, Class valClass, int bufferSize,
>>                  short replication, long blockSize,
>>                  CompressionType compressionType, CompressionCodec codec,
>>                  Progressable progress, Metadata metadata)
>>       throws IOException {
>>
>>
>>
>> On Thu, Feb 17, 2011 at 1:16 PM, Mapred Learn <[EMAIL PROTECTED]> wrote:
>>
>>> Hi,
>>> I have a use case to upload some terabytes of text files as sequence
>>> files on HDFS.
>>>
>>> These text files have several layouts ranging from 32 to 62 columns
>>> (metadata).
>>>
>>> What would be a good way to upload these files along with their metadata:
>>>
>>> i) create a key/value class per text file layout and use it to create
>>> and upload the sequence files?
>>>
>>> ii) create a SequenceFile.Metadata header in each file individually as
>>> it is uploaded as a sequence file?
>>>
>>> Any input is appreciated!
>>>
>>> Thanks
>>> -JJ
>>>
>>
>>
>