Pig >> mail # dev >> A major addition to Pig. Working with spatial data


Re: A major addition to Pig. Working with spatial data
Nick: the only issue is that the way types are implemented in Pig doesn't
allow us to easily "plug in" types externally. Adding support for that
would be cool, but a fair bit of work.
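
To make the bytearray workaround discussed in the quoted messages below concrete, here is a minimal sketch of what "serialize/deserialize on every call" could look like. It assumes a 2D point stored in the OGC well-known binary (WKB) point layout, and it uses only the JDK. The class and method names (WkbPointSketch, encode, decode, distance) are hypothetical and are not part of Pig or JTS. A real UDF would presumably receive Pig's DataByteArray and hand the bytes to JTS (for example its WKBReader/WKBWriter) rather than parse them by hand.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, not part of Pig or JTS: keeps a 2D point in a byte[]
// using the OGC well-known binary (WKB) layout for a Point.
final class WkbPointSketch {
    private static final int WKB_POINT = 1;        // WKB geometry type code for Point
    private static final int WKB_POINT_SIZE = 21;  // 1 (byte order) + 4 (type) + 8 (x) + 8 (y)

    // Serialize (x, y) as little-endian WKB.
    static byte[] encode(double x, double y) {
        ByteBuffer buf = ByteBuffer.allocate(WKB_POINT_SIZE).order(ByteOrder.LITTLE_ENDIAN);
        buf.put((byte) 1);       // byte-order flag: 1 = little endian
        buf.putInt(WKB_POINT);
        buf.putDouble(x);
        buf.putDouble(y);
        return buf.array();
    }

    // Deserialize a WKB point back into {x, y}.
    static double[] decode(byte[] wkb) {
        if (wkb.length != WKB_POINT_SIZE) {
            throw new IllegalArgumentException("expected " + WKB_POINT_SIZE + " bytes");
        }
        ByteOrder order = (wkb[0] == 1) ? ByteOrder.LITTLE_ENDIAN : ByteOrder.BIG_ENDIAN;
        ByteBuffer buf = ByteBuffer.wrap(wkb).order(order);
        buf.get();               // skip the byte-order flag
        if (buf.getInt() != WKB_POINT) {
            throw new IllegalArgumentException("not a WKB point");
        }
        return new double[] { buf.getDouble(), buf.getDouble() };
    }

    // Stand-in for a UDF body: both arguments arrive as raw bytes and must be
    // decoded on every call, which is the overhead mentioned in the thread.
    static double distance(byte[] a, byte[] b) {
        double[] p = decode(a);
        double[] q = decode(b);
        return Math.hypot(p[0] - q[0], p[1] - q[1]);
    }

    public static void main(String[] args) {
        byte[] a = encode(0.0, 0.0);
        byte[] b = encode(3.0, 4.0);
        System.out.println(distance(a, b));
    }
}
```

The point of the sketch is only that every function pays the decode cost on each call, and that the schema sees an opaque bytearray rather than a Geometry type.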
2013/5/6 Nick Dimiduk <[EMAIL PROTECTED]>

> I'm not a lawyer, but I see no reason why this cannot be an external
> extension to Pig. It would behave the same way PostGIS is an external
> extension to Postgres. Any Apache issues would be toward general
> purpose enhancements, not specific to your project.
>
> Good on you!
> -n
>
> On Mon, May 6, 2013 at 10:12 AM, Ahmed Eldawy <[EMAIL PROTECTED]> wrote:
>
> > I contacted the Solr developers to see how JTS can be included in an Apache
> > project. See
> > http://mail-archives.apache.org/mod_mbox/lucene-dev/201305.mbox/raw/%3C1367815102914-4060969.post%40n3.nabble.com%3E/
> > As far as I understand, they did not include it in the main Solr project;
> > rather, they created a separate project (Spatial4j), which is still
> > licensed under the Apache license and refers to JTS. Users will have to
> > download the JTS libraries separately to make it run. That's pretty much the
> > same plan that Jonathan mentioned. We will still have the overhead of
> > serializing/deserializing the shapes each time a function is called. Also,
> > we will have to use the ugly bytearray data type for spatial data instead
> > of creating its own data type (e.g., Geometry).
> > I think using Spatial4j instead of JTS will not be sufficient for our case,
> > as we need to provide access to all spatial functions of JTS, such as
> > Union, Intersection, Difference, etc. This way we can claim conformity
> > with OGC standards, which gives visibility and appreciation in the spatial
> > community.
> > I think this also means I will not add any issues to JIRA, as it is now
> > a separate project. I'm planning to host it on GitHub and keep all the
> > issues there.
> > Let me know if you have any suggestions or comments.
> >
> > Thanks
> > Ahmed
> >
> >
> > Best regards,
> > Ahmed Eldawy
> >
> >
> > On Mon, May 6, 2013 at 9:53 AM, Jonathan Coveney <[EMAIL PROTECTED]>
> > wrote:
> >
> > > You can give them all the same label or tag and filter on that later on.
> > >
> > >
> > > 2013/5/6 Ahmed Eldawy <[EMAIL PROTECTED]>
> > >
> > > > Thanks all for taking the time to respond. Danial, I didn't know that Solr
> > > > uses JTS. This is a good finding and we can definitely ask them to see if
> > > > there is a workaround we can use. Jonathan, I thought of the same idea of
> > > > serializing/deserializing a bytearray each time a UDF is called. The
> > > > deserialization part is good for letting Pig auto-detect spatial types if
> > > > not set explicitly in the schema. What is the best way to start this? I
> > > > want to add an initial set of JIRA issues and start working on them, but I
> > > > also need to keep the work grouped in some sense, just for organization.
> > > >
> > > > Thanks
> > > > Ahmed
> > > >
> > > > Best regards,
> > > > Ahmed Eldawy
> > > >
> > > >
> > > > On Sat, May 4, 2013 at 4:47 PM, Jonathan Coveney <[EMAIL PROTECTED]>
> > > > wrote:
> > > >
> > > > > I agree that this is cool, and if other projects are using JTS it is worth
> > > > > talking to them to see how. I also agree that licensing is very frustrating.
> > > > >
> > > > > In the short term, however, while it is annoying to have to manage the
> > > > > serialization and deserialization yourself, you can have the geometry type
> > > > > be passed around as a bytearray type. Your UDFs will have to know this and
> > > > > treat it accordingly, but if you did this then all of the tools could be in
> > > > > an external project on GitHub instead of a branch in Pig. Then, if we can
> > > > > get the licensing done, we could add the Geometry type to Pig. Adding
> > > > > types, honestly, is kind of tedious but not super difficult, so once
> >