Re: How to load data in Drill
Hi,

Jinfeng, do you want a copy of my parquet file too?  If so, I can send it
later tonight.

Cheers,

Tom

On 3 December 2013 07:38, Madhu Borkar <[EMAIL PROTECTED]> wrote:

> Hi Jason and Jinfeng,
> Thank you for taking the time to debug the problem. I have sent my
> data to Jinfeng.
> Other than a parquet file, can I put my data in HBase (or any other data
> source) and query it through Drill?
> Please let me know.
>
>
>
> On Mon, Dec 2, 2013 at 10:21 PM, Jinfeng Ni <[EMAIL PROTECTED]> wrote:
>
> > Hi Jason,
> >
> > Thanks for offering your help to look at this issue.
> >
> > I did try to see if the file PageReadStatus.java has been changed
> > recently.  The output of git log for that file shows the latest change is
> > Sep 9 for "DRILL-221 Add license header to all files".  I thought the
> > binary distribution was made after the license header was added.  But you
> > are right, there might have been changes after the binary distribution.
> >
> > Thanks,
> >
> > Jinfeng
> >
> >
> >
> > On Mon, Dec 2, 2013 at 10:03 PM, Jason Altekruse <[EMAIL PROTECTED]> wrote:
> >
> > > Hi Madhu,
> > >
> > > I would be happy to take a look at this as well. I wrote most of the code
> > > we are using to read parquet files, so I should be able to figure out why
> > > we are getting an NPE with the files you are reading. I took a look back at
> > > the previous thread where this issue was being discussed and noticed that
> > > you reported having installed Drill from binaries. Have you tried compiling
> > > Drill with a more recent version of the source from our repository?
> > >
> > > We ended up learning that Apache does not consider binary releases
> > > official. While we will obviously be providing them for users in future
> > > releases, we gave up on the binaries before we reached the end
> > > of the Apache approval process. As such, several bugs were fixed (not
> > > necessarily in the parquet reader) between this binary and our final m1
> > > source release. Since the release, there have also been code changes
> > > that may solve the issue you are having, so we can test it against the
> > > latest development code to see if changes still need to be made to solve
> > > the problem.
> > >
> > > Jinfeng,
> > > This also could mean that line 92 that you found in the source does not
> > > match what line 92 was at the time this release was built. Just something
> > > to keep in mind if you look at this again.
> > >
> > > Thanks,
> > > Jason Altekruse
> > >
> > >
> > > On Mon, Dec 2, 2013 at 11:38 PM, Jinfeng Ni <[EMAIL PROTECTED]> wrote:
> > >
> > > > Hi Madhu,
> > > >
> > > > Yes, the log is helpful; I can see the NPE is raised in the storage
> > > > engine component ParquetRecordReader, not in the query execution component.
> > > >
> > > > Unfortunately, I cannot reproduce this parquet reader NPE problem using
> > > > either the sample data (nation.parquet, region.parquet) or other TPCH
> > > > parquet files. From the log, I could see the NPE is raised in the
> > > > following code:
> > > >
> > > >     currentPage = new Page(
> > > >         bytesIn,
> > > >         pageHeader.data_page_header.num_values,
> > > >         pageHeader.uncompressed_page_size,
> > > >         ParquetStorageEngine.parquetMetadataConverter.getEncoding(pageHeader.data_page_header.repetition_level_encoding),
> > > >         ParquetStorageEngine.parquetMetadataConverter.getEncoding(pageHeader.data_page_header.definition_level_encoding),
> > > >         ParquetStorageEngine.parquetMetadataConverter.getEncoding(pageHeader.data_page_header.encoding)
> > > >     );
> > > >
> > > > My guess is that either pageHeader or its member data_page_header is
> > > > NULL. But without the parquet file to recreate this NPE, I do not have
> > > > a way to verify.
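
The guess above can be narrowed down with an explicit check before the Page is constructed. The following is only a minimal sketch, not Drill's actual code: the PageHeader and DataPageHeader classes here are hypothetical stand-ins that model just the fields named in the quoted snippet, and findNullReference is an illustrative helper name. In the Parquet format, data_page_header is an optional field of the page header, so one possibility (not confirmed in this thread) is that it is simply unset for the page being read.

```java
public class PageHeaderCheck {

    // Hypothetical stand-in for the data page header; only one field is modeled.
    static class DataPageHeader {
        int num_values;
    }

    // Hypothetical stand-in for the page header used in the quoted snippet.
    static class PageHeader {
        int uncompressed_page_size;
        DataPageHeader data_page_header; // stays null when the field is unset
    }

    // Returns the name of the first null reference on the path
    // pageHeader.data_page_header, or null if both are present.
    static String findNullReference(PageHeader pageHeader) {
        if (pageHeader == null) {
            return "pageHeader";
        }
        if (pageHeader.data_page_header == null) {
            return "pageHeader.data_page_header";
        }
        return null;
    }

    public static void main(String[] args) {
        // A header with no data_page_header set, as one possible failing case.
        PageHeader header = new PageHeader();
        String culprit = findNullReference(header);
        System.out.println(culprit == null
                ? "no null reference found"
                : culprit + " is null");
    }
}
```

Logging the result of such a check just before the constructor call would show which of the two references is null without needing the original parquet file.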
> > > >
> > > > Is it possible for you to share your parquet file (after removing any sensitive