MapReduce >> mail # user >> map reduce and sync


Re: map reduce and sync
Hello Hemanth, thanks for answering.
The file is held open by a separate process that is not related to map
reduce at all. You can think of it as a servlet that receives requests and
writes them to this file: every time a request is received, it is written
and org.apache.hadoop.fs.FSDataOutputStream.sync() is invoked.
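
The write-then-sync pattern described above can be sketched roughly as follows. This is a local-filesystem stand-in built only on java.io, not the HDFS code from this thread: the class name RequestLog and its helpers are hypothetical, and FileDescriptor.sync() merely plays the role that FSDataOutputStream.sync() plays in the real writer.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical local-file stand-in for the writer process.
// On HDFS the stream would be an FSDataOutputStream and the call after
// each write would be its sync() method, as stated in the message.
final class RequestLog implements AutoCloseable {
    private final FileOutputStream out;

    RequestLog(Path file) {
        try {
            // The second argument opens the file in append mode.
            this.out = new FileOutputStream(file.toFile(), true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Write one request as one line, then force it out before returning.
    void append(String request) {
        try {
            out.write((request + "\n").getBytes(StandardCharsets.UTF_8));
            out.flush();
            out.getFD().sync();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public void close() {
        try {
            out.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Helper for trying the sketch: read the file while the writer is open.
    static String readAll(Path file) {
        try {
            return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Helper for trying the sketch: create an empty temporary file.
    static Path newTempFile() {
        try {
            return Files.createTempFile("requests", ".log");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The file stays open between requests, so a reader in another process sees only what has been written and synced so far.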

At the same time, I want to run a map reduce job over this file. Simply
running the word count example doesn't seem to work; it is as if the file
were empty.

hadoop fs -tail works just fine, and reading the file using
org.apache.hadoop.fs.FSDataInputStream also works OK.

One last thing: the web interface doesn't show the contents, and the command
hadoop fs -ls reports the file as empty.
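
One possible reading of these symptoms, offered as an assumption and not confirmed anywhere in this thread: the length recorded in the file's metadata may lag behind the bytes that sync() has made readable, and a job that plans its input from that recorded length would then see nothing. The sketch below is a plain in-memory model of that idea, not Hadoop code; OpenFileModel and all of its methods are hypothetical names, and the rule that the recorded length changes only on close is an assumption of the model.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Simplified, hypothetical model of a file that is still open for writing.
final class OpenFileModel {
    // Bytes that the writer has written and synced.
    private final ByteArrayOutputStream synced = new ByteArrayOutputStream();
    // Length that a listing would report (assumed to change only on close).
    private long reportedLength = 0;

    // Models "write a line, then sync": the bytes become readable,
    // but the reported length is deliberately left unchanged.
    void writeAndSync(String line) {
        byte[] bytes = (line + "\n").getBytes(StandardCharsets.UTF_8);
        synced.write(bytes, 0, bytes.length);
    }

    // Models closing the file: the reported length catches up.
    void close() {
        reportedLength = synced.size();
    }

    long reportedLength() {
        return reportedLength;
    }

    // A reader that takes whatever bytes are currently readable.
    String readDirect() {
        return new String(synced.toByteArray(), StandardCharsets.UTF_8);
    }

    // A reader that stops at the reported length, the way a job that
    // plans its input from the listing would.
    String readBoundedByReportedLength() {
        byte[] all = synced.toByteArray();
        int limit = (int) Math.min(reportedLength, all.length);
        return new String(Arrays.copyOf(all, limit), StandardCharsets.UTF_8);
    }
}
```

In this model, after writeAndSync("hello world") and before close(), readDirect() returns the line while readBoundedByReportedLength() returns an empty string, which would match a direct read working while the listing and the job both see an empty file.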

What am I doing wrong?

Thanks!

Lucas

On Sat, Feb 23, 2013 at 4:37 AM, Hemanth Yamijala <[EMAIL PROTECTED]> wrote:

> Could you please clarify, are you opening the file in your mapper code and
> reading from there ?
>
> Thanks
> Hemanth
>
> On Friday, February 22, 2013, Lucas Bernardi wrote:
>
>> Hello there, I'm trying to use Hadoop map reduce to process an open file.
>> The writing process writes a line to the file and syncs the file to
>> readers (org.apache.hadoop.fs.FSDataOutputStream.sync()).
>>
>> If I try to read the file from another process, it works fine, at least
>> using org.apache.hadoop.fs.FSDataInputStream.
>>
>> hadoop fs -tail also works just fine.
>>
>> But it looks like map reduce doesn't read any data. I tried using the
>> word count example and got the same result: it is as if the file were
>> empty for the map reduce framework.
>>
>> I'm using Hadoop 1.0.3 and Pig 0.10.0.
>>
>> I need some help around this.
>>
>> Thanks!
>>
>> Lucas
>>
>