Pig >> mail # dev >> A major addition to Pig. Working with spatial data


Re: A major addition to Pig. Working with spatial data
Hi all,
  Thanks for your help. I've started the project with minimal
functionality. It's currently hosted on GitHub and licensed
under the Apache License to make it easier to merge with Pig.
So far it has only a few functions: I implemented one function from each
of several categories (e.g., aggregate and creation functions). I'll keep
adding functions, and any contributions to the project are welcome. To begin
with, I need an Ant build file that compiles the code, runs the tests, and
generates a jar file. I'm not familiar with Ant, so any help with this is
appreciated.
Here's the project home page:
https://github.com/aseldawy/pigeon
If you have any comments or suggestions, please contact me.
Best regards,
Ahmed Eldawy
On Mon, May 6, 2013 at 3:09 PM, Jonathan Coveney <[EMAIL PROTECTED]> wrote:

> Nick: the only issue is that the way types are implemented in Pig doesn't
> allow us to easily "plug in" types externally. Adding support for that
> would be cool, but a fair bit of work.
>
>
> 2013/5/6 Nick Dimiduk <[EMAIL PROTECTED]>
>
> > I'm not a lawyer, but I see no reason why this cannot be an external
> > extension to Pig. It would behave the same way PostGIS is an external
> > extension to Postgres. Any Apache issues would be toward general
> > purpose enhancements, not specific to your project.
> >
> > Good on you!
> > -n
> >
> > On Mon, May 6, 2013 at 10:12 AM, Ahmed Eldawy <[EMAIL PROTECTED]> wrote:
> >
> > > I contacted the Solr developers to see how JTS can be included in an
> > > Apache project. See
> > > http://mail-archives.apache.org/mod_mbox/lucene-dev/201305.mbox/raw/%3C1367815102914-4060969.post%40n3.nabble.com%3E/
> > > As far as I understand, they did not include it in the main Solr project;
> > > rather, they created a separate project (Spatial4j), which is still
> > > licensed under the Apache License and refers to JTS. Users will have to
> > > download the JTS libraries separately to make it run. That's pretty much
> > > the same plan that Jonathan mentioned. We will still have the overhead of
> > > serializing/deserializing the shapes each time a function is called. Also,
> > > we will have to use the ugly bytearray data type for spatial data instead
> > > of creating a dedicated data type (e.g., Geometry).
> > > I think using Spatial4j instead of JTS will not be sufficient for our
> > > case, as we need to provide access to all spatial functions of JTS, such
> > > as Union, Intersection, Difference, etc. This way we can claim conformity
> > > with OGC standards, which brings visibility and appreciation from the
> > > spatial community.
> > > I also think this means I will not add any issues to JIRA, as it is now
> > > a separate project. I'm planning to host it on GitHub and track all the
> > > issues there.
> > > Let me know if you have any suggestions or comments.
> > >
> > > Thanks
> > > Ahmed
> > >
> > >
> > > Best regards,
> > > Ahmed Eldawy
> > >
> > >
> > > On Mon, May 6, 2013 at 9:53 AM, Jonathan Coveney <[EMAIL PROTECTED]>
> > > wrote:
> > >
> > > > You can give them all the same label or tag and filter on that later on.
> > > >
> > > >
> > > > 2013/5/6 Ahmed Eldawy <[EMAIL PROTECTED]>
> > > >
> > > > > Thanks all for taking the time to respond. Danial, I didn't know that
> > > > > Solr uses JTS. This is a good finding, and we can definitely ask them
> > > > > to see if there is a workaround we can use. Jonathan, I thought of the
> > > > > same idea of serializing/deserializing a bytearray each time a UDF is
> > > > > called. The deserialization part is good for letting Pig auto-detect
> > > > > spatial types if not set explicitly in the schema. What is the best way
> > > > > to start this? I want to add an initial set of JIRA issues and start
> > > > > working on them, but I also need to keep the work grouped in some sense,
> > > > > just for organization.
> > > > >
> > > > > Thanks
> > > > > Ahmed
> > > > >
>
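
The bytearray round trip discussed in this thread can be illustrated with a small sketch. It is not code from the Pigeon project and does not use Pig or JTS, so that it stays self-contained; the class and method names (WkbPointSketch, toWkb, fromWkb, distance) are hypothetical. It assumes the shapes are stored as OGC Well-Known Binary (WKB) and handles only 2D points. The point is that a function receiving bytearrays has to decode every argument on every call, which is the overhead mentioned above.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration only: a 2D point stored as a WKB bytearray.
final class WkbPointSketch {
    private static final int WKB_POINT = 1;   // WKB geometry type code for Point
    private static final int POINT_SIZE = 21; // 1 (byte order) + 4 (type) + 8 (x) + 8 (y)

    // Serialize a point to big-endian WKB.
    public static byte[] toWkb(double x, double y) {
        ByteBuffer buf = ByteBuffer.allocate(POINT_SIZE).order(ByteOrder.BIG_ENDIAN);
        buf.put((byte) 0);      // byte-order flag: 0 = big endian, 1 = little endian
        buf.putInt(WKB_POINT);
        buf.putDouble(x);
        buf.putDouble(y);
        return buf.array();
    }

    // Deserialize a WKB point into {x, y}; rejects anything that is not a 2D point.
    public static double[] fromWkb(byte[] bytes) {
        if (bytes == null || bytes.length != POINT_SIZE) {
            throw new IllegalArgumentException("not a 2D WKB point");
        }
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        byte flag = buf.get();
        if (flag != 0 && flag != 1) {
            throw new IllegalArgumentException("invalid byte-order flag: " + flag);
        }
        buf.order(flag == 0 ? ByteOrder.BIG_ENDIAN : ByteOrder.LITTLE_ENDIAN);
        int type = buf.getInt();
        if (type != WKB_POINT) {
            throw new IllegalArgumentException("unsupported geometry type: " + type);
        }
        return new double[] { buf.getDouble(), buf.getDouble() };
    }

    // Stand-in for a spatial UDF body: both arguments arrive as bytearrays
    // and must be decoded again on each call before any geometry work happens.
    public static double distance(byte[] a, byte[] b) {
        double[] p = fromWkb(a);
        double[] q = fromWkb(b);
        return Math.hypot(p[0] - q[0], p[1] - q[1]);
    }

    public static void main(String[] args) {
        byte[] a = toWkb(0.0, 0.0);
        byte[] b = toWkb(3.0, 4.0);
        System.out.println(distance(a, b));
    }
}
```

The type check in fromWkb also hints at how deserialization could help detect spatial values when the schema does not declare them: a bytearray that fails to decode is simply not treated as a geometry. A real implementation would more likely rely on the WKB reader and writer that JTS provides and would cover all geometry types.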