Hadoop >> mail # user >> org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException:'


Re: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException:'
Thanks J: just curious how you came to hypothesize (1), i.e. that
threads and the API components aren't thread-safe in my version of
Hadoop.

I think that's a really good guess, and I'd like to be able to make
those sorts of intelligent hypotheses myself. Any reading you can point
me to for further enlightenment?
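A rough local illustration of cause (1): HDFS grants a single-writer lease per file, so when two threads race to create the same output path, the loser gets AlreadyBeingCreatedException. The sketch below mimics that with `os.O_EXCL` as a stand-in for HDFS's exclusive create (the filename and labels here are made up, not Hadoop API):

```python
import os
import tempfile
import threading

# Stand-in for HDFS's exclusive create: O_EXCL makes os.open() fail
# if the file already exists, roughly like AlreadyBeingCreatedException.
def try_create(path, results, idx):
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        os.close(fd)
        results[idx] = "created"
    except FileExistsError:
        results[idx] = "already-being-created"

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "part-00000")
    results = [None, None]
    threads = [threading.Thread(target=try_create, args=(path, results, i))
               for i in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # O_EXCL is atomic, so exactly one thread wins the create and the
    # other sees the HDFS-style "already being created" error.
    print(sorted(results))  # → ['already-being-created', 'created']
```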

On Mon, Apr 2, 2012 at 3:16 PM, Harsh J <[EMAIL PROTECTED]> wrote:

> Jay,
>
> Without seeing the whole stack trace all I can say as cause for that
> exception from a job is:
>
> 1. You're using threads, and the API components you are using aren't
> thread-safe in your version of Hadoop.
> 2. Files are being written out to HDFS directories without following
> the OutputCommitter rules. (This is negated, per your response.)
>
> On Mon, Apr 2, 2012 at 7:35 PM, Jay Vyas <[EMAIL PROTECTED]> wrote:
> > No, my job does not write files directly to disk. It simply goes to some
> > web pages, reads data (in the reducer phase), and parses JSON into
> > Thrift objects, which are emitted to HDFS files via the standard
> > MultipleOutputs API.
> >
> > Any idea why Hadoop would throw the "AlreadyBeingCreatedException"?
> >
> > On Mon, Apr 2, 2012 at 2:52 PM, Harsh J <[EMAIL PROTECTED]> wrote:
> >
> >> Jay,
> >>
> >> What does your job do? Create files directly on HDFS? If so, do you
> >> follow this method?:
> >>
> >> http://wiki.apache.org/hadoop/FAQ#Can_I_write_create.2BAC8-write-to_hdfs_files_directly_from_map.2BAC8-reduce_tasks.3F
> >>
> >> A local filesystem may not complain if you re-create an existing file.
> >> HDFS's behavior here is different. This simple Python test is what I
> >> mean:
> >> >>> a = open('a', 'w')
> >> >>> a.write('f')
> >> >>> b = open('a', 'w')
> >> >>> b.write('s')
> >> >>> a.close(), b.close()
> >> >>> open('a').read()
> >> 's'
> >>
> >> Hence it is best to use the FileOutputCommitter framework as detailed
> >> in the mentioned link.
> >>
> >> On Mon, Apr 2, 2012 at 7:09 PM, Jay Vyas <[EMAIL PROTECTED]> wrote:
> >> > Hi guys:
> >> >
> >> > I have a MapReduce job that runs normally on the local file system from
> >> > Eclipse, *but* it fails on HDFS running in pseudo-distributed mode.
> >> >
> >> > The exception I see is
> >> >
> >> > *org.apache.hadoop.ipc.RemoteException:
> >> > org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException:*
> >> >
> >> >
> >> > Any thoughts on why this might occur in pseudo-distributed mode, but
> >> > not on the regular file system?
> >>
> >>
> >>
> >> --
> >> Harsh J
> >>
> >
> >
> >
> > --
> > Jay Vyas
> > MMSB/UCHC
>
>
>
> --
> Harsh J
>

--
Jay Vyas
MMSB/UCHC
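The FAQ rule Harsh links to comes down to this: each task attempt writes under its own attempt-scoped directory (or uses an attempt-unique filename), and the output committer promotes the winning attempt's files on commit, so speculative or retried attempts never race to create the same HDFS path. A small sketch of the naming idea, in Python with made-up attempt IDs (not the Hadoop API itself):

```python
import os
import tempfile

def side_file_path(work_dir, attempt_id, name):
    """Place a side file under a per-attempt directory so that
    concurrent attempts of the same task never create the same path."""
    attempt_dir = os.path.join(work_dir, "_temporary", attempt_id)
    os.makedirs(attempt_dir, exist_ok=True)
    return os.path.join(attempt_dir, name)

with tempfile.TemporaryDirectory() as out:
    # Two speculative attempts of the same reduce task write the same
    # logical file, but under different attempt directories: no collision.
    p1 = side_file_path(out, "attempt_201204021339_0001_r_000000_0", "part-r-00000")
    p2 = side_file_path(out, "attempt_201204021339_0001_r_000000_1", "part-r-00000")
    open(p1, "w").close()
    open(p2, "w").close()
    print(p1 != p2)  # → True
```

On commit, only the successful attempt's directory would be renamed into the final output location, which is what makes the scheme safe.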