Pig >> mail # user >> pig and sync


Re: pig and sync
Hey, anyone?
Am I the only one trying to do real-time analytics on Hadoop using Pig?
Really? I guess I'm not, so what is your approach then?

I would really appreciate some advice.

Thanks!
Lucas
On Tue, Feb 5, 2013 at 6:06 PM, Lucas Bernardi <[EMAIL PROTECTED]> wrote:

> Hello there, I'm starting to use Pig for processing events and I'm having
> one specific issue.
> Currently, the writing process writes a line to the file and syncs the
> file to readers
> (org.apache.hadoop.fs.FSDataOutputStream.sync()).
>
> If I try to read the file from another process, it works fine, at least
> using org.apache.hadoop.fs.FSDataInputStream.
>
> But it looks like Pig doesn't read any data. I tried PigStorage,
> CSVLoader, and CSVExcelStorage, but no luck.
>
> One weird thing is the following:
> Successfully read 0 records (376 bytes) from: "...."
>
> It looks like it is reading only 376 bytes, although the file is larger
> than one HDFS block (64 MB).
>
> I'm using Hadoop 1.0.3 and Pig 0.10.0.
>
> Thanks!
> Lucas
>
>
>
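
For readers following along, here is a minimal sketch of the write-then-sync pattern described in the quoted message. It is not the poster's code: it uses the local filesystem and plain java.io as a stand-in for HDFS, with flush() playing the role that FSDataOutputStream.sync() plays in the original setup. The class name SyncedLineDemo, its methods, and the sample event lines are hypothetical. It only shows that a separate reader can see flushed lines while the writer is still open; it does not reproduce or explain the Pig behavior.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

/** Local-filesystem stand-in for the HDFS writer/reader pair (hypothetical names). */
final class SyncedLineDemo {

    /** Appends one event line and flushes it, leaving the writer open. */
    static void writeAndFlush(BufferedWriter out, String event) throws IOException {
        out.write(event);
        out.newLine();
        // Assumption: flush() stands in for FSDataOutputStream.sync() here.
        out.flush();
    }

    /** Opens an independent reader and returns every line visible right now. */
    static List<String> readVisibleLines(Path file) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader in = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = in.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("events", ".csv");
        try (BufferedWriter out = Files.newBufferedWriter(
                file, StandardCharsets.UTF_8, StandardOpenOption.APPEND)) {
            writeAndFlush(out, "1,login");
            writeAndFlush(out, "2,click");
            // The writer is still open at this point; the reader is separate.
            System.out.println(readVisibleLines(file));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

Comparing what a plain reader like this sees against the record count Pig reports for the same file is one way to narrow down whether the loader or the input size seen by the job is at fault.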