Thanks Josh... nailed it.
Are we planning to ship 1.0.0 soon? Do you need any help with that?
Now that it's running, I also built the same pipeline directly in Spark
(without Crunch, using RDDs) and compared the DAGs Spark generated. They
look completely different: Crunch adds many intermediate operations within
each step (map, mapPartitions, and mapPartitionsWithIndex, to name a few).
Can you give me some insight into how Crunch submits jobs to Spark?
I am going to do some benchmarking, but should I expect these extra steps
to add overhead?
Thanks,
Ron.

On Fri, May 26, 2017 at 10:30 PM Josh Wills <[EMAIL PROTECTED]> wrote: