Pig >> mail # user >> storing intermediate results ?


Storing intermediate results?
Hello,

I'm new to Pig, and I have a bunch of statements that process the
same input, which is actually the result of a JOIN between two very
big data sets (millions of entries).

I wonder whether it is better (faster) to save the result of this JOIN
into a Hadoop file and then LOAD it, instead of just relying on
Pig's optimizations?

Thanks a lot for your help.
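
To make the two options concrete, here is a minimal Pig Latin sketch. The paths, relation names, and schemas are all hypothetical (none of them come from the original message), and it does not claim which option is faster.

```pig
-- Option 1: materialize the JOIN once, then reload it in later scripts.
-- All paths and field names below are made up for illustration.
a = LOAD 'input/a' AS (id:chararray, x:int);
b = LOAD 'input/b' AS (id:chararray, y:int);
joined = JOIN a BY id, b BY id;
STORE joined INTO 'tmp/joined';

-- In a later script: reload the stored result instead of re-running the JOIN.
-- The schema must be declared again, because plain-text storage does not keep it.
j = LOAD 'tmp/joined' AS (a_id:chararray, x:int, b_id:chararray, y:int);
big_x = FILTER j BY x > 100;
STORE big_x INTO 'output/big_x';

-- Option 2: keep everything in one script and let Pig plan it.
-- Both outputs below derive from the same 'joined' relation.
small_y = FILTER joined BY y < 10;
by_id = GROUP joined BY a::id;
counts = FOREACH by_id GENERATE group, COUNT(joined);
STORE small_y INTO 'output/small_y';
STORE counts INTO 'output/counts';
```

Whether Option 2 avoids recomputing the JOIN depends on the Pig version and on the statements sharing a single script, so it is worth checking the generated plan with EXPLAIN before deciding.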
Ashutosh Chauhan 2009-10-07, 16:33
zaki rahaman 2009-10-07, 20:08
Thejas Nair 2009-10-07, 20:16
Vincent BARAT 2009-10-08, 13:33
Alan Gates 2009-10-12, 18:50
Vincent BARAT 2009-10-08, 09:43