
Best practice for Pig + MySQL for metadata lookups

My setup is Pig + Hadoop + Cassandra for my "big data" and MySQL for my
"relational/metadata".  Up until now that has been fine, but now I need to
start creating metrics that "cross the lines".  In particular, I need to
aggregate Cassandra data based on lookups from MySQL.

After doing some research, it seems like my best option is to use something
like Sqoop to copy the relational/metadata I need from MySQL to HDFS, and
then load that HDFS copy in Pig for the actual lookups.  I'd like to confirm
that this general strategy is sound (any other tips are welcome).
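
To make the proposed data flow concrete, here is a minimal Python sketch of the lookup-then-aggregate step. It is only an illustration under assumptions, not a Pig or Sqoop script: the table layout (customer_id, region), the sample rows, and the function names are all hypothetical. It assumes the MySQL table has been exported as tab-separated text and is small enough to hold in memory, which is the same situation in which Pig's JOIN ... USING 'replicated' applies.

```python
import csv
import io
from collections import defaultdict

# Hypothetical tab-separated export of a small MySQL lookup table:
# customer_id <TAB> region
META_TSV = "1\tus-east\n2\teu-west\n3\tus-east\n"

# Hypothetical "big data" rows, standing in for Cassandra data:
# (customer_id, bytes_used)
EVENTS = [(1, 100), (2, 50), (1, 25), (3, 10), (4, 7)]


def load_lookup(tsv_text):
    """Parse the exported lookup table into an in-memory dict."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    return {int(row[0]): row[1] for row in reader}


def aggregate_by_region(events, lookup):
    """Sum bytes_used per region, resolving region via the lookup table."""
    totals = defaultdict(int)
    for customer_id, bytes_used in events:
        region = lookup.get(customer_id)
        if region is None:
            continue  # no matching metadata row: dropped, as in an inner join
        totals[region] += bytes_used
    return dict(totals)


lookup = load_lookup(META_TSV)
totals = aggregate_by_region(EVENTS, lookup)
print(totals)
```

The small table is loaded once and every large-side row is resolved against it locally, so the large data set never has to be shuffled for the join. If the metadata table is too big for memory, a regular join over the HDFS copy would be the fallback.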


Bill Graham 2012-09-11, 15:33
William Oberman 2012-09-11, 15:54
Bill Graham 2012-09-11, 16:58
William Oberman 2012-09-11, 18:09
William Oberman 2012-09-12, 14:41