
Pig submit nodes
Folks -- how are you handling the "productionalization" of your Pig
submit nodes?

For our PROD environment, I originally thought we'd just have a few VMs
from which Pig jobs would be submitted onto our cluster.  But on our 8GB
VMs, I found that we were often hitting heap OOM errors on a relatively
small set of approx. 50 analytics jobs.  As a short-term solution, we ended
up scaling these VMs horizontally, which seemed a bit messy to me, since we
have to manage which jobs are executed where.
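
A rough back-of-the-envelope sketch of the sizing problem, using the figures
quoted in this message (8GB VMs, approx. 50 jobs, 300-400 MB of heap per Pig
process). The assumptions are mine and may not match the real setup: that all
jobs could be running at once, that each Pig client holds its heap for the
life of the job, and that about 1 GB per VM is reserved for the OS. The
function names and the `reserved_mb` parameter are hypothetical.

```python
import math


def total_heap_mb(concurrent_jobs, heap_mb_per_job):
    """Total client-side heap if every job's Pig process is alive at once."""
    return concurrent_jobs * heap_mb_per_job


def vms_needed(concurrent_jobs, heap_mb_per_job, vm_ram_mb, reserved_mb=1024):
    """Number of submit VMs needed to hold all Pig client heaps at once.

    reserved_mb is an assumed allowance for the OS and other processes.
    """
    usable_mb = vm_ram_mb - reserved_mb
    if usable_mb < heap_mb_per_job:
        raise ValueError("a single Pig process does not fit on this VM")
    jobs_per_vm = usable_mb // heap_mb_per_job
    return math.ceil(concurrent_jobs / jobs_per_vm)


if __name__ == "__main__":
    for heap_mb in (300, 400):
        print(heap_mb, total_heap_mb(50, heap_mb), vms_needed(50, heap_mb, 8192))
```

Under those assumptions, 50 concurrent jobs need 15,000-20,000 MB of client
heap in total, which is more than one 8GB VM can hold, so the work has to be
spread over roughly three VMs or staggered in time.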

Is this heap footprint (300-400 MB per Pig process) consistent with your