Pig, mail # user - Regarding Sequence File Loader in piggybank


Re: Regarding Sequence File Loader in piggybank
Ashutosh Chauhan 2011-10-28, 17:51
Please paste the error that you are getting.

Ashutosh
On Fri, Oct 28, 2011 at 05:49, Gayatri Rao <[EMAIL PROTECTED]> wrote:

> Sorry, that was a bug in my writeFields method. It's fixed now and I am
> able to load and dump the data.
> In SequenceFileLoader I have defined the corresponding key converter and
> value converter classes.
>
> So, when I say
>
> raw = load 'in.txt' using SequenceFileLoader;
> dump raw;
>
> it dumps the data, but when I try to project the fields, it gives an error.
> Do I have to explicitly specify the schema in load? Like:
>
> raw = load 'in.txt' using SequenceFileLoader as (t:(a:int,
> b:chararray,...))
>
>
>
> On Wed, Oct 26, 2011 at 1:27 PM, Dmitriy Ryaboy <[EMAIL PROTECTED]>
> wrote:
>
> > What do you expect to see, how did you create it, and what are the weird
> > values?
> > Any chance your compression settings are different for writing and
> > reading?
> >
> > On Tue, Oct 25, 2011 at 7:41 AM, Gayatri Rao <[EMAIL PROTECTED]> wrote:
> > > Thanks Dmitriy.
> > > I was trying to implement MyClassConverter for my custom class,
> > > and I overrode the following method:
> > >
> > > @Override
> > >    public Object bytesToObject(DataByteArray dataByteArray)
> > >            throws IOException {
> > >        MyClass o = (MyClass) ReflectionUtils.newInstance(MyClass.class, null);
> > >        o.readFields(new DataInputStream(
> > >                new ByteArrayInputStream(dataByteArray.get())));
> > >        return o;
> > >    }
> > >
> > > and my MyClass.readFields is as follows:
> > >
> > > @Override
> > >    public void readFields(DataInput in) throws IOException {
> > >        num = in.readInt();
> > >        list = new ArrayList<String>();
> > >        for (int i = 0; i < 3; i++) {
> > >            list.add(WritableUtils.readString(in));
> > >        }
> > >
> > >    }
> > >
> > > This puts some weird data in num and list. Any idea what I might be
> > > doing wrong?
> > >
> > >
> > > On Tue, Oct 25, 2011 at 9:03 AM, Dmitriy Ryaboy <[EMAIL PROTECTED]> wrote:
> > >
> > >> You can compile it with "ant -Dnothrift=true".
> > >>
> > >> There's also a "-Dnoprotobuf=true" option, but I just tried it and it
> > >> seems we do require protobufs in one place that's not excluded when we
> > >> skip protocol buffers, so you still need protoc version 2.3.
> > >>
> > >> D
> > >>
> > >> On Mon, Oct 24, 2011 at 6:52 PM, Gayatri Rao <[EMAIL PROTECTED]> wrote:
> > >>
> > >> > That's great, thanks. I'll check it out. Is Thrift a dependency for
> > >> > building?
> > >> >
> > >> > On Mon, Oct 24, 2011 at 6:49 PM, Dmitriy Ryaboy <[EMAIL PROTECTED]> wrote:
> > >> >
> > >> > > Correct -- it's completely rewritten.
> > >> > >
> > >> > > We haven't published EB to a public maven repo, though I believe we
> > >> > > did add a "maven-install" ant target to publish to your local maven
> > >> > > repo.
> > >> > >
> > >> > > D
> > >> > >
> > >> > > On Mon, Oct 24, 2011 at 6:33 PM, Gayatri Rao <[EMAIL PROTECTED]> wrote:
> > >> > >
> > >> > > > I have checked the SequenceFileLoader from elephantbird and it
> > >> > > > seems to use a different SequenceFileLoader, as opposed to the one
> > >> > > > in piggybank. Is there any reason for that?
> > >> > > >
> > >> > > > On Mon, Oct 24, 2011 at 5:57 PM, Gayatri Rao <[EMAIL PROTECTED]> wrote:
> > >> > > >
> > >> > > > > Thanks, Dmitriy. Are the jars available in a Maven repository?
> > >> > > > >
> > >> > > > > Thanks,
> > >> > > > > Gayatri
> > >> > > > >
> > >> > > > >
> > >> > > > > On Mon, Oct 24, 2011 at 11:55 AM, Dmitriy Ryaboy <[EMAIL PROTECTED]> wrote:
> > >> > > > >
> > >> > > > >> We have a massively improved (well, rewritten from scratch)
> > >> > > > >> SequenceFileLoader in elephantbird. Take a look here:
> > >> > > > >>
> > >> > > > >> https://github.com/kevinweil/elephant-bird/blob/master/src/java/com/twitter/elephantbird/pig/load/SequenceFileLoader.java
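
The "weird data" symptom in this thread was traced to a bug on the writing side, which suggests the write and read methods had drifted out of step. A minimal sketch of how to check that a pair of serialization methods is symmetric is below. Everything in it is an assumption made for illustration: Record is a hypothetical stand-in for MyClass, it does not implement Hadoop's Writable, and it uses the JDK's writeUTF/readUTF instead of WritableUtils so that it runs without Hadoop on the classpath. In real Hadoop code, WritableUtils.readString would need to be paired with WritableUtils.writeString on the writing side in the same way. The sketch also writes an explicit element count rather than hard-coding 3 as the original readFields does, which is a design choice and not something taken from the thread.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RoundTripSketch {

    // Hypothetical stand-in for the thread's MyClass. It deliberately does not
    // implement Hadoop's Writable, so the sketch runs with the JDK alone.
    static class Record {
        int num;
        List<String> list = new ArrayList<String>();

        Record() { }

        Record(int num, List<String> list) {
            this.num = num;
            this.list = new ArrayList<String>(list);
        }

        // Write the int, then an explicit element count, then each string.
        void write(DataOutput out) throws IOException {
            out.writeInt(num);
            out.writeInt(list.size());
            for (String s : list) {
                out.writeUTF(s);
            }
        }

        // Read the fields in exactly the order and encoding used by write().
        void readFields(DataInput in) throws IOException {
            num = in.readInt();
            int size = in.readInt();
            list = new ArrayList<String>(size);
            for (int i = 0; i < size; i++) {
                list.add(in.readUTF());
            }
        }
    }

    // Serialize a record to a byte array, as a sequence file value would be.
    static byte[] toBytes(Record r) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buffer);
        r.write(out);
        out.flush();
        return buffer.toByteArray();
    }

    // Rebuild a record from raw bytes, mirroring what bytesToObject does.
    static Record fromBytes(byte[] bytes) throws IOException {
        Record r = new Record();
        r.readFields(new DataInputStream(new ByteArrayInputStream(bytes)));
        return r;
    }

    public static void main(String[] args) throws IOException {
        Record original = new Record(42, Arrays.asList("a", "b", "c"));
        Record copy = fromBytes(toBytes(original));
        System.out.println(copy.num + " " + copy.list);
    }
}
```

Running a round trip like this before wiring the class into a converter makes it easier to tell a serialization bug apart from a loader or compression problem.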