Pig >> mail # dev >> Including wonderdog in Pig contrib

Alan Gates 2012-07-10, 16:23
Daniel Dai 2012-07-10, 17:58
Dmitriy Ryaboy 2012-07-10, 20:31
Alan Gates 2012-07-13, 18:41
Re: Including wonderdog in Pig contrib

I think a subproject can evolve a lot more rapidly if it's in github.

Companion projects in github make sense as long as each project can have
its own set of owners with commit privileges.
On Fri, Jul 13, 2012 at 11:41 AM, Alan Gates <[EMAIL PROTECTED]> wrote:

> I agree with Dmitriy.  Everything we have put in contrib so far has rotted
> there.
> Would it make sense in our github piggybank to allow others to put in
> companion projects like this?  We could provide boilerplate build,
> directory layout, etc. and let those who wanted to put their code there and
> specify which version(s) of Pig it works with.
> Alan.
> On Jul 10, 2012, at 1:31 PM, Dmitriy Ryaboy wrote:
> > I don't see the need for Pig to include much of anything in contrib
> > (and that includes things that are currently in contrib).
> >
> > Wonderdog is a great library, and it's fantastic that Russell wants to
> > make sure it works with latest Pig versions. I don't see how putting
> > the Apache process in Russell's way will help him achieve this goal,
> > though. As long as Pig faithfully publishes SNAPSHOT jars, any
> > interested project can ensure their tests pass with the latest
> > snapshot, and seek help from the dev and user lists if it does not.
> >
> > We had a whole discussion of the benefits of pulling piggybank out due
> > to the problems caused by the tight coupling; I don't see how those
> > don't apply in this case.
> >
> > D
> >
> > On Tue, Jul 10, 2012 at 10:58 AM, Daniel Dai <[EMAIL PROTECTED]> wrote:
> >> Who's the author of Wonderdog? Can Russell or the author talk about
> >> it in our next hackathon? Also, we need to discuss it with the
> >> author.
> >>
> >> On Tue, Jul 10, 2012 at 9:23 AM, Alan Gates <[EMAIL PROTECTED]> wrote:
> >>> From https://issues.apache.org/jira/browse/PIG-2803 posted yesterday
> >>> by Russell.  I'm copying it here because I think we need to discuss
> >>> this and decide what we want to do:
> >>>
> >>> I propose to add Wonderdog to Pig contrib/
> >>> Wonderdog is an Apache 2.0 licensed project that adds Hadoop and Pig
> >>> integration for ElasticSearch. This lets you index any Pig relation
> >>> with a single UDF call, which is very powerful. Both writing
> >>> searchable indexes and loading based on search queries are supported.
> >>> More information on Wonderdog is available at
> >>> https://github.com/infochimps-labs/wonderdog and a great introduction
> >>> to ElasticSearch is available at
> >>> http://www.elasticsearchtutorial.com/elasticsearch-in-5-minutes.html
> >>> Wonderdog broke in Pig 0.10.0, and was patched to work here:
> >>> https://github.com/infochimps-labs/wonderdog/pull/9 Even so, there is
> >>> the issue of Pig creating schema files when storing and loading JSON
> >>> that must be manually removed to make Wonderdog go.
> >>> Moving forward, I would like the Pig project to maintain Wonderdog in
> >>> contrib/ and verify that it works with each version increment.
> >>> Wonderdog is an incredibly useful library that is license compatible
> >>> with Pig itself. Along with ElasticSearch, it adds the ability for any
> >>> user to index his Pig relations and to load subsets of data by pushing
> >>> search queries down to ElasticSearch.
> >>> I use Wonderdog in production and in my book, so I volunteer to do the
> >>> maintenance on contrib/wonderdog.
*Note that I'm no longer using my Yahoo! email address. Please email me at
[EMAIL PROTECTED] going forward.*
Jeremy Hanna 2012-07-11, 05:45