Hadoop, mail # user - Re: Development against hadoop v 0.23.5


Re: Development against hadoop v 0.23.5
Harsh J 2012-12-20, 06:01
Hi Anil,

Usage-oriented questions should be directed at [EMAIL PROTECTED].
Also, you are running into an MR2 issue, not a YARN one: YARN is just
the platform that MR2 runs on (see [1]).

I've moved this to the proper thread (bcc'd yarn-dev). My replies inline.

On Thu, Dec 20, 2012 at 6:06 AM, anil chaurasia <[EMAIL PROTECTED]> wrote:
> Hi All,
>
> I have a small prototype application that was running against 1.0.X version
> of hadoop.
> We wanted to try out the hadoop-yarn to get some number using the yarn
> release and setup a single node hadoop system.
>
> When I run the job with hadoop 1.0.X it works fine but when I try to run it
> against the 0.23.5 I get following error:
>
> Exception in thread "main" java.lang.IncompatibleClassChangeError: Found
> interface org.apache.hadoop.mapreduce.JobContext, but class was expected

Upgrading jobs from 1.x to 2.x (or 0.23.x) requires that your jobs be
recompiled against the new version, because the upgrade includes
incompatible changes that a recompilation helps catch and also
resolve.
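
As the error message itself says, JobContext is an interface in 0.23.x,
while the 1.x-compiled job expects a class. If you want to confirm which
flavour is on your classpath, a reflection check is one option. This is
only a minimal sketch: JobContextCheck and describe are hypothetical
names, not part of Hadoop.

```java
public class JobContextCheck {

    // Hypothetical helper: reports whether a loaded type is an interface or a class.
    public static String describe(Class<?> type) {
        return type.getName() + " is " + (type.isInterface() ? "an interface" : "a class");
    }

    public static void main(String[] args) {
        // Defaults to the type named in the error; another class name may be passed as an argument.
        String name = args.length > 0 ? args[0] : "org.apache.hadoop.mapreduce.JobContext";
        try {
            // initialize=false: load the type without running its static initializers.
            Class<?> type = Class.forName(name, false, JobContextCheck.class.getClassLoader());
            System.out.println(describe(type));
        } catch (ClassNotFoundException e) {
            // No Hadoop jars on the classpath, or the name is wrong.
            System.out.println(name + " not found on the classpath");
        }
    }
}
```

Run it with the same classpath your job uses. If it reports an interface,
the job jar needs to be rebuilt against the 0.23.x dependencies.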

> I am sure that there are significant changes to the hadoop framework in
> YARN release.
> So I decided to change the code as per the new interface.

I don't think you need to change much code as the APIs are still the
same. A recompilation against proper new dependencies should do the
trick.

> But when I try to use the maven to import the dependencies, maven complains
> about the artifacts. ( The maven repo only has pom.xml files and no jar
> files are included . )

Can you share your <dependencies>?

In 1.x you may have had "hadoop-core" as one dependency. In 2.x (or
0.23.x, a subset), you have several broken out components so we
provide a wrapper dependency called "hadoop-client" which should be
all that you need to include. There is no more "hadoop-core" in 2.x
(or 0.23.x).
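
A minimal sketch of what that could look like in your pom.xml, assuming
the usual org.apache.hadoop group ID and the 0.23.5 version you are
testing against (adjust the version to match your cluster):

```xml
<dependencies>
  <!-- Replaces the 1.x "hadoop-core" dependency; pulls in the split-out client modules. -->
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>0.23.5</version>
  </dependency>
</dependencies>
```

Remove the old hadoop-core entry when you add this, so both versions do
not end up on the classpath together.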

> I was using maven dependency hadoop-core for ( 1.0.X) and it was working so
> I thought I would get the yarn-core/yarn-project/hadoop-yarn but all of them
> had missing artifacts.
>
> One thing I did notice that the other yarn modules such as
> hadoop-yarn-common did have artifacts.

The 0.23.5 artifacts are all available in the Apache Maven
repositories: http://repo1.maven.org/maven2/org/apache/hadoop/hadoop-client/0.23.5/

> So, I wanted to know what are all the modules required to build a hadoop
> application using hadoop-yarn.

Any regular hadoop application may simply include hadoop-client to
pull in all dependencies.

P.S. You aren't exactly writing a YARN app, but an MR2 job (MR is
itself an app; you do not need to write it, as it's already provided
to you). See [1] again.

> Is there a document on how to write a sample app using yarn somewhere on
> the wiki? If there is no wiki, could someone please point me to some
> documents where I can find information on how to write a new app using
> hadoop-yarn.

If you are interested in writing a distributed application (which is
not the same as writing an MR job), you can read [2].

[1] - http://www.cloudera.com/blog/2012/10/mr2-and-yarn-briefly-explained/
[2] - http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html
--
Harsh J