Flume, mail # dev - Default{Sink,Source,Channel}Factory question


Brock Noland 2012-10-09, 19:30
Hari Shreedharan 2012-10-09, 19:49
Brock Noland 2012-10-09, 20:02
Re: Default{Sink,Source,Channel}Factory question
Mike Percy 2012-10-09, 23:03
To me that seems like a symptom of the weirdness of reconfiguration in
Flume.

On the one hand, as Hari says you should be able to stop everything,
reconfigure, and start it back up again without losing state for components
that did not change or only changed their parameters (memory channel is the
prime example here).

On the other hand, say you remove a component from the configuration, run
for a while without it, then add a component back later with the same name.
I wouldn't expect the state to be saved, but it is. Ideally, the factories
would always construct new objects, and the component registry / object
caching mechanism would be explicit, separate from the factories.
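
As a rough illustration of that split (this is not Flume's actual code; every
type name below is a hypothetical stand-in, and a configuration is reduced to
a set of channel names), a factory that always constructs new objects could
sit behind an explicit registry that decides which instances survive a
reconfiguration:

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

// Hypothetical stand-in for a channel; not the real org.apache.flume.Channel API.
interface SketchChannel {
    String name();
    void put(String event);
    String take(); // returns null when the channel is empty
}

// In-memory channel: queued events are lost if this instance is discarded.
final class SketchMemoryChannel implements SketchChannel {
    private final String name;
    private final Queue<String> events = new ArrayDeque<>();

    SketchMemoryChannel(String name) { this.name = name; }

    public String name() { return name; }
    public void put(String event) { events.add(event); }
    public String take() { return events.poll(); }
}

// Factory with no memory of past calls: every call builds a fresh object.
final class SketchChannelFactory {
    SketchChannel create(String name) { return new SketchMemoryChannel(name); }
}

// Explicit registry: the only place that decides which instances are kept.
final class SketchChannelRegistry {
    private final SketchChannelFactory factory;
    private final Map<String, SketchChannel> live = new HashMap<>();

    SketchChannelRegistry(SketchChannelFactory factory) { this.factory = factory; }

    // Apply a new configuration, given as the set of channel names it declares.
    void reconfigure(Set<String> configuredNames) {
        // Drop channels the new configuration no longer mentions,
        // so their state cannot resurface later under the same name.
        live.keySet().retainAll(configuredNames);
        // Keep surviving instances as they are; create only the missing ones.
        for (String name : configuredNames) {
            live.computeIfAbsent(name, factory::create);
        }
    }

    SketchChannel get(String name) { return live.get(name); }
}
```

Under these assumptions, a channel named c1 keeps its queued events across a
reconfiguration that still declares c1, while removing c1 and adding it back
later yields a new, empty channel.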

Regards
Mike

On Tue, Oct 9, 2012 at 1:02 PM, Brock Noland <[EMAIL PROTECTED]> wrote:

> That makes sense.  There are two methods, unregister and
> getRegistryClone, which appear as though they can be removed with no
> net effect on said functionality.
>
> Brock
>
> On Tue, Oct 9, 2012 at 2:49 PM, Hari Shreedharan
> <[EMAIL PROTECTED]> wrote:
> > Brock,
> >
> > The Default factories reuse objects which were already created if a new
> > configuration has sources/sinks/channels with the same name. So during
> > reconfig it avoids creating new instances. It does not matter if this code
> > is removed from the source and the sink factories. But you need to track
> > this in the channel code, especially for the MemoryChannel. If we forget
> > about the old instance, and simply create a new instance, we lose the data
> > that is still in the channel. Since the only way we can really track events
> > in memory is by the name of the channel - caching channel objects is sort
> > of necessary for correctness.
> >
> >
> > Thanks
> > Hari
> >
> > --
> > Hari Shreedharan
> >
> >
> > On Tuesday, October 9, 2012 at 12:30 PM, Brock Noland wrote:
> >
> >> Hi,
> >>
> >> I am working on FLUME-1502 and I noticed that the
> >> Default{Sink,Source,Channel}Factory classes all track the
> >> sinks, sources, and channels they have created. Additionally, there is a
> >> getRegistryClone method which is only used in tests.
> >>
> >> 1) Is there a reason they track the instances they have created?
> >> 2) Any objection to removing this code since it's not being used?
> >>
> >> Brock
> >
>
>
>
> --
> Apache MRUnit - Unit testing MapReduce -
> http://incubator.apache.org/mrunit/
>
Brock Noland 2012-10-10, 00:03