Hive >> mail # user >> Custom Mapper and Reducer vs HiveQL in terms of Performance


Re: Custom Mapper and Reducer vs HiveQL in terms of Performance
Sending this again, as I haven't received any reply. Any personal
experience would be appreciated.

*Raihan Jamal*

On Mon, Jul 9, 2012 at 3:37 PM, Raihan Jamal <[EMAIL PROTECTED]> wrote:

>  *Problem Statement:-*
>
> I need to compare two tables, Table1 and Table2, which both store the same
> kind of data. Table1 is the main (reference) table, so Table2 has to be
> compared against it. After the comparison I need to produce a report of the
> discrepancies found in Table2. Both tables hold a lot of data, around a TB.
> I have currently written HiveQL to do the comparison and get the data back.
>
> So my question is: which is better in terms of PERFORMANCE, writing a CUSTOM
> MAPPER and REDUCER for this kind of job, or will the HiveQL that I wrote
> be fine, given that I will be joining these two tables on millions of
> records? As far as I know, HiveQL internally (behind the scenes) generates
> optimized map-reduce jobs, submits them for execution, and gets back the results.
>
>
> *Raihan Jamal*
>
>
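
For readers following the thread, the comparison described above boils down to a join on a shared key followed by a per-key check. Below is a minimal, single-process Python sketch of that map / shuffle / reduce shape. It is an illustration only: the function names, the `(key, value)` row layout, and the discrepancy labels are all assumptions made for this example, not anything from the original tables, and it says nothing about how Hive actually plans the query or how either approach performs at TB scale.

```python
from collections import defaultdict


def map_phase(table_name, rows):
    """Tag every (key, value) row with the table it came from."""
    for key, value in rows:
        yield key, (table_name, value)


def shuffle(pairs):
    """Group tagged values by key, as the shuffle step would."""
    grouped = defaultdict(list)
    for key, tagged in pairs:
        grouped[key].append(tagged)
    return grouped


def reduce_phase(key, tagged_values):
    """Compare the Table2 values for one key against the Table1 values."""
    t1 = [v for name, v in tagged_values if name == "table1"]
    t2 = [v for name, v in tagged_values if name == "table2"]
    if not t2:
        return (key, "only_in_table1")
    if not t1:
        return (key, "only_in_table2")
    if sorted(t1) != sorted(t2):
        return (key, "value_mismatch")
    return None  # the two tables agree on this key


def compare_tables(table1_rows, table2_rows):
    """Return a list of (key, discrepancy) pairs, ordered by key."""
    pairs = list(map_phase("table1", table1_rows))
    pairs += list(map_phase("table2", table2_rows))
    grouped = shuffle(pairs)
    report = []
    for key in sorted(grouped):
        result = reduce_phase(key, grouped[key])
        if result is not None:
            report.append(result)
    return report


if __name__ == "__main__":
    # Hypothetical sample rows; Table1 is treated as the reference table.
    table1 = [(1, "a"), (2, "b"), (3, "c")]
    table2 = [(1, "a"), (2, "x"), (4, "d")]
    for key, problem in compare_tables(table1, table2):
        print(key, problem)
```

A hand-written mapper and reducer would have this same overall shape, and a HiveQL full outer join on the key expresses the same comparison declaratively, so the question in the thread is essentially whether hand-tuning this shape beats what Hive generates for it.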