Re: Bulk load from OSGi running client
Hi Amit,

Would you be able to open a ticket summarizing your findings? Can you
provide a sample project that demonstrates the behavior you're seeing? We
could use that to provide a fix and, I hope, some kind of unit or
integration test.

Thanks,
Nick
On Sun, Sep 22, 2013 at 6:10 AM, Amit Sela <[EMAIL PROTECTED]> wrote:

> I think I got it.
> Algorithm has a Configuration member that is instantiated when Algorithm is
> created. Since Algorithm is static, when I update a jar that doesn't cause
> the hbase bundle to update, the configuration still holds the old CL.
> I tested with GZ and changed the code to call new Configuration() (in the
> getCodec(Configuration conf) GZ override) and it works. Makes sense because
> it probably takes the current TCCL.
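
A minimal sketch of the capture behavior described above, assuming only stock
Hadoop semantics: Configuration records the thread context class loader (TCCL)
when it is constructed, so a fresh instance sees the current loader.

    import org.apache.hadoop.conf.Configuration;

    public class TcclCaptureSketch {
        public static void main(String[] args) {
            // Configuration takes its class loader from the current TCCL at
            // construction time (falling back to its own defining loader).
            Configuration conf = new Configuration();
            ClassLoader tccl = Thread.currentThread().getContextClassLoader();
            // Prints true here: the loader was captured from the TCCL, which
            // is why a Configuration created before a bundle update keeps the
            // old CL until someone constructs a new one.
            System.out.println(conf.getClassLoader() == tccl);
        }
    }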
>
> Since Algorithm.getCodec(Configuration conf) is only called from inside
> Algorithm, and the only (apparent) reason to have the configuration as a
> member of Algorithm is to avoid duplicating
> *conf.setBoolean("hadoop.native.lib", true)* for every codec, I think it
> would be better to duplicate that code here, even if it's not as pretty as
> it is now: on top of cloning the configuration properties, the clone
> constructor also clones the CL, and that may cause problems where the CL
> changes, as in an OSGi environment.
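
A sketch of that duplication, loosely modeled on the GZ override of
Compression.Algorithm.getCodec (the field name and return type here are
approximations, not HBase's exact code):

    // Build a fresh Configuration per codec instead of cloning the enum's
    // long-lived member, so the codec picks up the current TCCL.
    @Override
    CompressionCodec getCodec(Configuration conf) {
        if (codec == null) {
            ReusableStreamGzipCodec gz = new ReusableStreamGzipCodec();
            Configuration fresh = new Configuration();   // captures current TCCL
            fresh.setBoolean("hadoop.native.lib", true); // duplicated per codec
            gz.setConf(fresh);
            codec = gz;
        }
        return codec;
    }

The repeated setBoolean line is the price; the gain is that no codec inherits
a class loader captured before a bundle refresh.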
>
> Why the NPE triggers a recovery is still a mystery to me :)
>
>
> On Wed, Sep 11, 2013 at 3:34 PM, Amit Sela <[EMAIL PROTECTED]> wrote:
>
> > I did some more digging and I got this:
> >
> > When the HBase bundle is loaded (system start) the Compression.Algorithm
> > is probably created for the first time and the constructor calls new
> > Configuration().
> > When I update a bundle (but not the HBase bundle) it refreshes the
> > relevant packages, which don't include the HBase bundle, and once I try
> > to use getCodec() (GZ in my case) it creates a
> > new ReusableStreamGzipCodec() and sets new Configuration(*conf*) - where
> > *conf* is a private final member in Algorithm. Since *conf* holds the old
> > class loader (referring to the pre-update bundle), it passes that CL to
> > the new configuration created for the codec.
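
The cloning step is easy to demonstrate in isolation (a minimal sketch, again
assuming stock Hadoop Configuration, whose copy constructor carries over the
source's class loader):

    import org.apache.hadoop.conf.Configuration;

    public class CopyKeepsLoaderSketch {
        public static void main(String[] args) {
            Configuration original = new Configuration();  // CL captured from the TCCL now
            // Even if the TCCL later changes (e.g. after a bundle refresh),
            // the copy constructor reuses the source's loader, not the TCCL.
            Configuration copy = new Configuration(original);
            System.out.println(copy.getClassLoader() == original.getClassLoader()); // true
        }
    }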
> >
> > I still have NO IDEA why an NPE would cause Compression.Algorithm to
> > re-instantiate itself...
> > I think that calling new Configuration() and
> > setting this.conf.setBoolean("hadoop.native.lib", true) for each codec
> > would solve it, since the class loader that would be set is the TCCL.
> >
> > I'll give it a try and keep updating.
> >
> > Thanks,
> > Amit.
> >
> >
> >
> > On Mon, Sep 9, 2013 at 9:12 PM, Stack <[EMAIL PROTECTED]> wrote:
> >
> >> On Mon, Sep 9, 2013 at 12:14 AM, Amit Sela <[EMAIL PROTECTED]> wrote:
> >> ...
> >>
> >> > The main issue still remains: it looks like Compression.Algorithm's
> >> > configuration's class loader had a reference to the bundle in revision 0
> >> > (before the jar update) instead of revision 1 (after the jar update).
> >> > This could be because of caching (or statics) but then, why should it
> >> > work after I get a NullPointerException (it does, immediately, no
> >> > restarts or bundle updates)?
> >> >
> >>
> >> When you say configuration above, you mean Compression.Algorithm's
> >> reference to a Hadoop Configuration instance?  I've not looked at the
> >> code.  Is it coming in via a static?
> >>
> >> I am not sure why it would then start working after an NPE. I would
> >> expect it to stay broken rather than recover.
> >>
> >> St.Ack
> >>
> >
> >
>