I think most of the issues you point out [validation, scheduling,
prefetching] are wiring issues. I reiterate that they can be solved.
Either of the two methods below gives the runner an answer to the
low-level question "which DoFn will need which side inputs":

1) Providing withSideInputs() builder methods on transforms that are
parameterized by user code. If only some side inputs should be made
available to particular bits of user code, provide more detailed
withBlahSideInputs() methods - this is up to the transform.

2) Inferring this from something annotation-driven as indicated in the
thread, e.g. capturing the PCollectionView in @SideInput-annotated public
fields. This can't be done on a lambda, because lambdas don't have fields
[so I think method #1 will keep being necessary], but it can be done on an
anonymous class.
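
To make the two methods concrete, here is a minimal, self-contained sketch.
It deliberately does not use the Beam SDK: View, SideInput, MapTransform and
inferSideInputs are hypothetical stand-ins I made up for illustration [View
plays the role of PCollectionView], and a real implementation inside a runner
or the SDK would likely look different.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

public class SideInputWiring {

    /** Hypothetical stand-in for PCollectionView: just a named handle. */
    public static final class View<T> {
        public final String name;
        public View(String name) { this.name = name; }
    }

    /** Hypothetical marker annotation for method 2 (not an existing Beam annotation). */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    public @interface SideInput {}

    /** Method 1: a transform parameterized by user code declares side inputs explicitly. */
    public static final class MapTransform<A, B> {
        public final Function<A, B> fn;
        public final List<View<?>> sideInputs;

        private MapTransform(Function<A, B> fn, List<View<?>> sideInputs) {
            this.fn = fn;
            this.sideInputs = Collections.unmodifiableList(sideInputs);
        }

        public static <A, B> MapTransform<A, B> via(Function<A, B> fn) {
            return new MapTransform<A, B>(fn, new ArrayList<View<?>>());
        }

        /** Returns a copy of this transform that also declares the given views. */
        public MapTransform<A, B> withSideInputs(View<?>... views) {
            List<View<?>> all = new ArrayList<View<?>>(sideInputs);
            all.addAll(Arrays.asList(views));
            return new MapTransform<A, B>(fn, all);
        }
    }

    /** Method 2: collect the views held in @SideInput-annotated public fields. */
    public static List<View<?>> inferSideInputs(Object userFn) {
        List<View<?>> found = new ArrayList<View<?>>();
        for (Field f : userFn.getClass().getFields()) {   // public fields only
            if (f.isAnnotationPresent(SideInput.class)
                    && View.class.isAssignableFrom(f.getType())) {
                try {
                    // Anonymous classes are not public, so open the field explicitly.
                    f.setAccessible(true);
                    found.add((View<?>) f.get(userFn));
                } catch (IllegalAccessException e) {
                    throw new IllegalStateException(e);
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        final View<String> prefixView = new View<String>("prefix");

        // Method 1 works for lambdas, because the declaration lives on the transform.
        MapTransform<String, String> explicit =
            MapTransform.<String, String>via(s -> s.toUpperCase()).withSideInputs(prefixView);
        System.out.println("declared on transform: " + explicit.sideInputs.size());

        // Method 2 works for an anonymous class, which can carry an annotated field.
        Function<String, String> anon = new Function<String, String>() {
            @SideInput public final View<String> prefix = prefixView;
            @Override public String apply(String s) { return s.toUpperCase(); }
        };
        System.out.println("inferred from anonymous class: " + inferSideInputs(anon).size());

        // A lambda has no annotated public fields, so nothing can be inferred from it.
        Function<String, String> lambda = s -> s.toUpperCase();
        System.out.println("inferred from lambda: " + inferSideInputs(lambda).size());
    }
}
```

In this sketch the "runner" would take the union of what the transform
declares and what it can infer from the user function, which is why the two
methods complement each other rather than compete.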

As for direct access being misleading: I'm not sure I agree. I think the
intuition for PCollectionView.get() is no more wrong than the intuition for
ValueProvider.get(): the return value is, logically, context-free [more
like: its context is an execution of the pipeline], so I have no issue with
it being accessed implicitly.

On Wed, Sep 13, 2017 at 2:05 PM Kenneth Knowles <[EMAIL PROTECTED]>