ValueProvider is global, PCollectionView is per-window, state is
So my unhappiness increases as we move through that list, adding more and
more constraints on correct use, none of which are reflected in the API.
Your description of "its context is an execution of the pipeline" is
accurate for ValueProvider. The question is not merely "which DoFn will
need which side inputs" but also in which methods the side input may be
accessed (forbidden in every DoFn method other than @ProcessElement and
@OnTimer).
As for lambdas being more universal - I agree! But the capabilities of
ParDo are not. I don't think we should transparently make them available
anywhere you have a lambda. For example, multiply triggered side inputs
fundamentally alter the semantics of MapElements and Filter to vary over
time. The only reason this isn't a showstopper is that multiply triggered
side inputs have very loose consistency already, and you can write
nondeterministic predicates and map functions anyhow. If either of those
were better, we'd want to keep them that way.
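
To make the point about semantics varying over time concrete, here is a
minimal plain-Java sketch. It uses no Beam APIs: SideInputDrift, threshold
and filter are hypothetical stand-ins for a multiply triggered side input
and for Filter, assumed purely for illustration.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Plain-Java illustration only; none of these names are Beam APIs.
public class SideInputDrift {
    // Stand-in for a side input whose contents change with each trigger firing.
    public static final AtomicInteger threshold = new AtomicInteger(3);

    // Stand-in for Filter: keeps the elements accepted by the predicate.
    public static List<Integer> filter(List<Integer> input, Predicate<Integer> keep) {
        return input.stream().filter(keep).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // The lambda looks like a pure predicate but reads the "side input".
        Predicate<Integer> aboveThreshold = x -> x > threshold.get();
        List<Integer> elements = List.of(1, 4, 6);

        System.out.println(filter(elements, aboveThreshold)); // [4, 6]
        threshold.set(5); // a later firing of the side input arrives
        System.out.println(filter(elements, aboveThreshold)); // [6]
    }
}
```

The same elements and the same lambda produce different output depending
on when the predicate runs, which is the change in semantics described
above.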
NewDoFn is somewhat tied to the alternative proposal, and there is the
point that, since lambdas are cross-language, we might reconsider the
ProcessContext (aka "pile of mud") style. But this universality - being the
lowest common denominator across languages - is not a goal. Python already
is quite different from Java, using | and >> and kwarg side inputs to good
effect. And those two languages are quite similar. Go will look entirely
different. For Java, annotation-driven APIs are common and offer important
advantages for readability, validation, and forward/backward compatibility.
And incidentally NewDoFn subsumes ProcessContext.
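
As a rough sketch of the validation advantage, the following plain-Java
example checks an annotated class by reflection before any element is
processed. The @Process annotation, GoodFn, BadFn and validate are
hypothetical names made up for this illustration; this is not how Beam's
DoFn machinery is implemented.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

// Hypothetical annotation and validator; not Beam's actual classes.
public class AnnotationValidation {
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    public @interface Process {}

    public static class GoodFn {
        @Process
        public void process(String element) {}
    }

    public static class BadFn {
        // No @Process method: the mistake is detectable up front.
        public void helper(String element) {}
    }

    // Counts the methods of fnClass that carry the @Process annotation.
    public static int countProcessMethods(Class<?> fnClass) {
        int count = 0;
        for (Method m : fnClass.getDeclaredMethods()) {
            if (m.isAnnotationPresent(Process.class)) {
                count++;
            }
        }
        return count;
    }

    // Rejects classes that do not declare exactly one @Process method.
    public static void validate(Class<?> fnClass) {
        int n = countProcessMethods(fnClass);
        if (n != 1) {
            throw new IllegalArgumentException(
                fnClass.getSimpleName() + " must declare exactly one @Process method, found " + n);
        }
    }

    public static void main(String[] args) {
        validate(GoodFn.class); // passes silently
        try {
            validate(BadFn.class);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Because the check inspects declared methods and their annotations, the
same approach could in principle also inspect parameter types, which is
where restrictions on what each method may access could be enforced.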
On Wed, Sep 13, 2017 at 2:32 PM, Eugene Kirpichov <[EMAIL PROTECTED]> wrote: