Pig >> mail # user >> Pig submit nodes


Pig submit nodes
Folks -- how are you handling the "productionalization" of your Pig
submit nodes?

For our PROD environment, I originally thought we'd just have a few VMs
from which Pig jobs would be submitted onto our cluster. But on our 8GB
VMs, I found that we were often hitting heap OOM errors with a relatively
small set of about 50 analytics jobs. As a short-term solution, we ended
up scaling these VMs horizontally, which seems a bit messy to me, since we
now have to manage which jobs are executed where.

Is this heap footprint (300-400 MB per Pig process) consistent with what
you see in your environment?
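
For reference, here is a rough back-of-the-envelope sketch of how many Pig
client processes fit on one submit VM given the numbers above. The 1 GB of
headroom for the OS is a hypothetical value, and the sketch assumes the
OOMs come from overall memory pressure on the VM with all jobs running at
once, which may not match how the jobs are actually scheduled. A JVM also
uses memory beyond its heap, so treat the results as an upper bound.

```python
# Back-of-the-envelope capacity estimate for a Pig submit node.
# VM size, heap range, and job count come from the message above;
# OS_HEADROOM_MB is an assumed value, not a measurement.

VM_MEMORY_MB = 8 * 1024          # 8 GB submit VM
HEAP_PER_PIG_MB = (300, 400)     # observed heap range per Pig process
JOB_COUNT = 50                   # approximate number of analytics jobs
OS_HEADROOM_MB = 1024            # assumption: memory kept free for the OS


def max_concurrent_clients(vm_mb: int, per_process_mb: int, headroom_mb: int = 0) -> int:
    """Return how many Pig client JVMs fit in memory at the same time."""
    return (vm_mb - headroom_mb) // per_process_mb


def vms_needed(jobs: int, per_vm: int) -> int:
    """Return the number of VMs needed if every job runs simultaneously."""
    return -(-jobs // per_vm)  # ceiling division


if __name__ == "__main__":
    for heap in HEAP_PER_PIG_MB:
        fit = max_concurrent_clients(VM_MEMORY_MB, heap, OS_HEADROOM_MB)
        print(f"{heap} MB/process: {fit} concurrent clients per VM, "
              f"{vms_needed(JOB_COUNT, fit)} VMs for {JOB_COUNT} simultaneous jobs")
```

Under these assumptions a single 8GB VM cannot hold all of the jobs at
once, which is in line with the OOM errors we saw.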

Norbert