Pig, mail # user - best practice for Pig + MySql for meta data lookups


William Oberman 2012-09-11, 15:17
Re: best practice for Pig + MySql for meta data lookups
Bill Graham 2012-09-11, 15:33
That approach makes sense. We have similar situations where we pull
relational data into HDFS and then join/aggregate against it via MR. In
other cases we export aggregated HDFS data into a relational DB and then
do additional aggregations using SQL. That option, of course, only works
if your data sizes are within reason.
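
To make the join/aggregate step concrete, here is a minimal sketch of the data flow in plain Python. It is only an illustration of the pattern, not Pig or Sqoop code: the tab-delimited text stands in for a small MySQL table exported to HDFS, the event tuples stand in for the large data set, and every name, column, and value is a made-up assumption.

```python
import csv
import io
from collections import defaultdict

# Hypothetical metadata export: what a small MySQL table might look like
# after being dumped to a tab-delimited file (columns: user_id, plan).
LOOKUP_TSV = "1\tfree\n2\tpro\n3\tpro\n"

# Hypothetical event records standing in for the large data set
# (columns: user_id, bytes_used).
EVENTS = [("1", 10), ("2", 25), ("3", 5), ("2", 40), ("9", 7)]


def load_lookup(tsv_text):
    """Read the small relational table into a dict keyed by user_id."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    return {user_id: plan for user_id, plan in reader}


def aggregate_by_plan(events, lookup):
    """Inner-join events to the lookup table and sum bytes_used per plan."""
    totals = defaultdict(int)
    for user_id, bytes_used in events:
        plan = lookup.get(user_id)
        if plan is None:
            continue  # no metadata row: dropped, as an inner join would do
        totals[plan] += bytes_used
    return dict(totals)


if __name__ == "__main__":
    print(aggregate_by_plan(EVENTS, load_lookup(LOOKUP_TSV)))
```

Holding the lookup side in memory like this only works while the relational table stays small, which is the same "within reason" caveat as above. In Pig the equivalent shape would be a LOAD of the exported file followed by a JOIN and a GROUP.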
On Tue, Sep 11, 2012 at 8:17 AM, William Oberman <[EMAIL PROTECTED]> wrote:

> Hello,
>
> My setup is Pig + Hadoop + Cassandra for my "big data" and MySql for my
> "relational/meta data".  Up until now that has been fine, but now I need to
> start creating metrics that "cross the lines".  In particular, I need to
> create aggregations of Cassandra data based on lookups from MySql.
>
> After doing some research, it seems like my best option is using something
> like Sqoop to map the meta/relational data I need from MySql -> HDFS, and
> then use HDFS inside of Pig for the actual lookups.  I'd like to confirm
> that general strategy is correct (or any other tips).
>
> Thanks!
>
> will
>

--
*Note that I'm no longer using my Yahoo! email address. Please email me at
[EMAIL PROTECTED] going forward.*