Re: Map-Reduce V/S Hadoop Ecosystem
Bejoy KS 2012-11-07, 19:23
Pretty much all common requirements fit well into Hive, Pig, etc. HiveQL and Pig Latin are translated by their respective parsers into MapReduce jobs. The MR code generated this way is generic and is based entirely on rules defined in the parser.
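To make "generic MR code" concrete, here is a minimal sketch in plain Python (no Hadoop involved) of the map, shuffle, and reduce steps that a query like `SELECT word, COUNT(*) ... GROUP BY word` roughly corresponds to. The function names and the query are assumptions for illustration only; this is not the code Hive or Pig actually generates.

```python
from collections import defaultdict


def map_phase(records):
    # Emit a (key, 1) pair per input record, as a generic
    # COUNT(*) ... GROUP BY mapper would.
    for word in records:
        yield word, 1


def shuffle(pairs):
    # Group values by key; stands in for the framework's shuffle/sort step.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    # Sum the values collected for each key.
    return {key: sum(values) for key, values in grouped.items()}


def run_group_by_count(records):
    # Hypothetical driver that chains the three steps in memory.
    return reduce_phase(shuffle(map_phase(records)))
```

The same three-step shape is reused for any grouping query, which is why the generated jobs are generic rather than tuned to one application.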
But say your requirement needs something more done within the job, like updating some stats in HBase in between your HDFS data processing. When you combine HDFS data processing with some HBase inserts/updates, Hive or Pig may do it in two sets of MR jobs. But if you write custom code, you may be able to integrate these HBase updates into the same Map/Reduce job that does the HDFS data processing. To summarize, custom MR code can limit the number of MR jobs in this case.
There can be any number of complex scenarios like this where your custom code turns out to be more efficient and performant.
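The HBase scenario above can be sketched in plain Python, under stated assumptions: the input is a list of hypothetical "user,amount" lines, an in-memory `Counter` stands in for the HBase table, and each loop over the input stands in for one MR job. Whether Hive or Pig really needs two jobs depends on the query and version, so treat this only as an illustration of the idea.

```python
from collections import Counter, defaultdict


def two_job_pipeline(lines):
    """Generic plan: one pass for the aggregation, a second for the stats."""
    passes = 0

    # Job 1: per-user totals from "user,amount" lines.
    totals = defaultdict(int)
    passes += 1
    for line in lines:
        user, amount = line.split(",")
        totals[user] += int(amount)

    # Job 2: re-read the same input to build the stats for the side store.
    stats = Counter()
    passes += 1
    for line in lines:
        user, _ = line.split(",")
        stats[user] += 1

    return dict(totals), dict(stats), passes


def single_job_pipeline(lines):
    """Custom plan: update the side store inside the same pass."""
    passes = 1
    totals = defaultdict(int)
    stats = Counter()  # stand-in for an HBase table of per-user record counts
    for line in lines:
        user, amount = line.split(",")
        totals[user] += int(amount)
        stats[user] += 1  # a real job would issue an HBase put/increment here
    return dict(totals), dict(stats), passes
```

Both functions produce the same totals and stats; the custom version just reads the input once instead of twice, which is the saving described above.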
Sent from handheld, please excuse typos.
From: yogesh dhari <[EMAIL PROTECTED]>
Date: Thu, 8 Nov 2012 00:37:44
To: hadoop helpforoum<[EMAIL PROTECTED]>
Reply-To: [EMAIL PROTECTED]
Subject: RE: Map-Reduce V/S Hadoop Ecosystem
Thanks Bejoy Sir,
I am always grateful to you for your help.
Please explain these words in simple language, with an example case if possible:
"If your requirement is that complex and you need very low-level control
of your code, MapReduce is better. If you are an expert in MapReduce, your
code can be more efficient, as it would be very specific to your app, while
the MR in Hive and Pig may be more generic."
If my requirement is complex, I would prefer the ecosystem (because it provides a simple interface to run Map-Reduce).
What does "low-level control of code" mean here? Please provide an example.
Please explain it in a simple way so that I can understand your point of view.
> Subject: Re: Map-Reduce V/S Hadoop Ecosystem
> To: [EMAIL PROTECTED]
> From: [EMAIL PROTECTED]
> Date: Wed, 7 Nov 2012 18:24:52 +0000
> Hi Yogesh,
> The development time in Pig and Hive is much lower than for the equivalent MapReduce code, and for generic cases they are very efficient.
> If your requirement is that complex and you need very low-level control of your code, MapReduce is better. If you are an expert in MapReduce, your code can be more efficient, as it would be very specific to your app, while the MR in Hive and Pig may be more generic.
> To just write your custom MapReduce functions, basic knowledge of Java is enough. The better you are with Java, the better you can understand the internals.
> Bejoy KS
> Sent from handheld, please excuse typos.
> -----Original Message-----
> From: <[EMAIL PROTECTED]>
> Date: Wed, 7 Nov 2012 15:33:07
> To: <[EMAIL PROTECTED]>
> Reply-To: [EMAIL PROTECTED]
> Subject: Map-Reduce V/S Hadoop Ecosystem
> Hello Hadoop Champs,
> Please give some suggestions.
> The Hadoop ecosystem (Hive, Pig...) internally runs Map-Reduce to do its processing.
> My questions are:
> 1) Where do Map-Reduce programs (written in Java, Python, etc.) outperform the Hadoop ecosystem?
> 2) What are the limitations of the Hadoop ecosystem compared with writing Map-Reduce programs?
> 3) For writing Map-Reduce jobs in Java, how much Java skill do we need, out of 10 (?/10)?
> Please shed some light on it.
> Thanks & Regards
> Yogesh Kumar