Hive, mail # user - using the key from a SequenceFile


Re: using the key from a SequenceFile
Dilip Joseph 2012-04-19, 15:46
An example input format for using SequenceFile keys in Hive is at
https://gist.github.com/2421795 .  The code just reverses how the key and
value are accessed in the standard SequenceFileRecordReader and
SequenceFileInputFormat that come with Hadoop.

You can use this custom input format by specifying the following when you
create the table:

STORED AS
    INPUTFORMAT 'com.mycompany.SequenceFileKeyInputFormat'
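
For reference, a fuller table definition might look roughly like the
sketch below.  The table name, the single STRING column (which assumes
the keys are Text), and the location are all hypothetical placeholders.
As far as I know, Hive's grammar wants an OUTPUTFORMAT whenever
INPUTFORMAT is spelled out, so the stock sequence-file output format is
filled in here; it only matters if you write to the table from Hive.

-- Sketch only: table name, column, and path are made up for illustration.
CREATE EXTERNAL TABLE my_seq_table (
    key_data STRING    -- assumes the SequenceFile keys are Text
)
STORED AS
    INPUTFORMAT 'com.mycompany.SequenceFileKeyInputFormat'
    OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat'
LOCATION '/path/to/sequencefiles';

The jar containing the custom input format also has to be on Hive's
classpath (for example via ADD JAR) before you query the table.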

Dilip

On Thu, Apr 19, 2012 at 6:09 AM, Owen O'Malley <[EMAIL PROTECTED]> wrote:

> On Thu, Apr 19, 2012 at 3:07 AM, Ruben de Vries <[EMAIL PROTECTED]>
> wrote:
> > I’m trying to migrate a part of our current hadoop jobs from normal
> > mapreduce jobs to hive,
> >
> > Previously the data was stored in SequenceFiles with the keys containing
> > valuable data!
>
> I think you'll want to define your table using a custom InputFormat
> that creates a virtual row based on both the key and value and then
> use the 'STORED AS INPUTFORMAT ...' clause.
>
> -- Owen
>

--
_________________________________________
Dilip Antony Joseph
http://csgrad.blogspot.com
http://www.marydilip.info