Hadoop >> mail # user >> Pangool: easier Hadoop, same performance


Pangool: easier Hadoop, same performance
Hi,
I'd like to introduce you to Pangool <http://pangool.net/>, an easier
low-level MapReduce API for Hadoop. I'm one of the developers. We
open-sourced it yesterday.

Pangool is a low-level Java MapReduce API with the same flexibility and
performance as the plain Java Hadoop MapReduce API. The difference is
that it makes a lot of things easier to code and understand.

A few of Pangool's features:
- Tuple-based intermediate serialization (allowing easier development).
- Built-in, easy-to-use group by and sort by (removing boilerplate code for
things like secondary sort).
- Built-in, easy-to-use reduce-side joins (which are quite hard to
implement in Hadoop).
- Augmented Hadoop API: Built-in multiple inputs / outputs, configuration
via object instance.
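
To make the "group by / sort by" idea concrete, here is a minimal plain-Java
sketch of what a secondary sort delivers to a reducer: records grouped by one
field, with each group ordered by a second field. It does not use Pangool's
API or Hadoop at all; the Visit class, its fields and the groupAndSort method
are hypothetical names chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SecondarySortSketch {

    /** Hypothetical tuple-like record; not a Pangool or Hadoop class. */
    static final class Visit {
        final String user;
        final long timestamp;
        final String url;

        Visit(String user, long timestamp, String url) {
            this.user = user;
            this.timestamp = timestamp;
            this.url = url;
        }

        @Override
        public String toString() {
            return user + "@" + timestamp + ":" + url;
        }
    }

    /**
     * Groups visits by user and orders each group by timestamp.
     * In a real job the framework would do this during the shuffle;
     * here it is done in memory to show the resulting semantics only.
     */
    static Map<String, List<Visit>> groupAndSort(List<Visit> visits) {
        List<Visit> sorted = new ArrayList<>(visits);
        // Sort by the group-by field first, then by the secondary field.
        sorted.sort(Comparator.comparing((Visit v) -> v.user)
                .thenComparingLong(v -> v.timestamp));

        // LinkedHashMap keeps groups in the sorted key order.
        Map<String, List<Visit>> groups = new LinkedHashMap<>();
        for (Visit v : sorted) {
            groups.computeIfAbsent(v.user, k -> new ArrayList<>()).add(v);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<Visit> visits = Arrays.asList(
                new Visit("bob", 30L, "/c"),
                new Visit("alice", 20L, "/b"),
                new Visit("bob", 10L, "/a"),
                new Visit("alice", 5L, "/d"));

        // Each entry plays the role of one reduce call: a key plus its ordered values.
        for (Map.Entry<String, List<Visit>> e : groupAndSort(visits).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

With the plain Hadoop API, getting this ordering typically means writing a
composite key plus custom partitioner and comparator classes, which is the
boilerplate referred to above.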

Pangool aims to make Hadoop's steep learning curve a lot smoother
while retaining all its features, power and flexibility. It differs
from high-level tools like Pig or Hive in that it can be used as a
replacement for the low-level API. There is no performance or
flexibility penalty for using Pangool.

We ran an initial benchmark <http://pangool.net/benchmark.html> to
support this claim.

I'd be very interested in hearing your feedback, opinions and questions
about it.

Cheers,

Pere.