Pig, mail # dev - A major addition to Pig. Working with spatial data

Re: A major addition to Pig. Working with spatial data
Russell Jurney 2013-05-29, 17:17
Awesome. This would be a great addition to Pig. Please create a JIRA.

Russell Jurney http://datasyndrome.com

On May 29, 2013, at 8:51 AM, Ahmed Eldawy <[EMAIL PROTECTED]> wrote:

> Hi all,
>
> Nick has pointed out to me an alternative GIS package that can replace JTS.
> ESRI has recently released a GIS package
> <https://github.com/Esri/geometry-api-java> under the Apache license.
> I changed Pigeon to work with that new package. I think it could be easier
> now to integrate this work with the main branch of Apache Pig. I will go on
> with the current project and add more spatial functionality. We can then
> add a new datatype to Apache Pig and link it to those functions.
>
> The ESRI package contains a class OGCGeometry
> <http://esri.github.io/geometry-api-java/javadoc/com/esri/core/geometry/ogc/OGCGeometry.html>
> which can be linked to a new datatype 'Geometry'. Do you think we can rely
> on the new package and integrate the work with Apache Pig?
>
> On May 23, 2013 11:40 PM, "Ahmed Eldawy" <[EMAIL PROTECTED]> wrote:
>
>> Hi all,
>> Thanks for your help. I've started the project with minimal
>> functionality. It's currently hosted on GitHub. It is licensed under the
>> Apache license to make it easier to merge with Pig. Currently it has only
>> a few functions: I implemented one function from each of several
>> categories (e.g., aggregate and create). I'll keep adding functions, and
>> any contributions to the project are welcome. To begin with, I need an Ant
>> build file that runs the tests, compiles, and generates a jar file. I'm
>> not familiar with Ant, so any help with this is welcome.
>> Here's the project home page:
>> https://github.com/aseldawy/pigeon
>>
>>
>> If you have any comments or suggestions, please contact me.
>>
>>
>> Best regards,
>> Ahmed Eldawy
>>
>>
>> On Mon, May 6, 2013 at 3:09 PM, Jonathan Coveney <[EMAIL PROTECTED]> wrote:
>>
>>> Nick: the only issue is that the way types are implemented in Pig doesn't
>>> allow us to easily "plug in" types externally. Adding support for that
>>> would be cool, but a fair bit of work.
>>>
>>>
>>> 2013/5/6 Nick Dimiduk <[EMAIL PROTECTED]>
>>>
>>>> I'm not a lawyer, but I see no reason why this cannot be an external
>>>> extension to Pig. It would behave the same way PostGIS is an external
>>>> extension to Postgres. Any Apache issues would be toward general
>>>> purpose enhancements, not specific to your project.
>>>>
>>>> Good on you!
>>>> -n
>>>>
>>>> On Mon, May 6, 2013 at 10:12 AM, Ahmed Eldawy <[EMAIL PROTECTED]> wrote:
>>>>
>>>>> I contacted the Solr developers to see how JTS can be included in an
>>>>> Apache project. See
>>>>> http://mail-archives.apache.org/mod_mbox/lucene-dev/201305.mbox/raw/%3C1367815102914-4060969.post%40n3.nabble.com%3E/
>>>>> As far as I understand, they did not include it in the main Solr
>>>>> project; rather, they created a separate project (Spatial4j) which is
>>>>> still licensed under the Apache license and refers to JTS. Users will
>>>>> have to download the JTS libraries separately to make it run. That's
>>>>> pretty much the same plan that Jonathan mentioned. We will still have
>>>>> the overhead of serializing/deserializing the shapes each time a
>>>>> function is called. Also, we will have to use the ugly bytearray data
>>>>> type for spatial data instead of creating its own data type (e.g.,
>>>>> Geometry).
>>>>> I think using Spatial4j instead of JTS will not be sufficient for our
>>>>> case as we need to provide access to all spatial functions of JTS such
>>>>> as Union, Intersection, Difference, etc. This way we can claim
>>>>> conformity with OGC standards, which gives visibility and appreciation
>>>>> from the spatial community.
>>>>> I think also that this means I will not add any issues to JIRA as it is
>>>>> now a separate project. I'm planning to host it on GitHub and have all
>>>>> the issues there.
>>>>> Let me know if you have any suggestions or comments.
>>>>>
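
As a rough illustration of the serialization overhead described in the oldest message of this thread, here is a minimal sketch of what happens when shapes travel between functions as raw byte arrays. It is not Pigeon's or Pig's actual code: the functions translate and scale are hypothetical stand-ins for spatial UDFs, and the only geometry handled is a 2D point encoded as little-endian OGC WKB.

```python
import struct

# WKB layout for a 2D point: byte-order flag (1 = little-endian),
# geometry type (uint32, 1 = Point), then x and y as float64.
_POINT = struct.Struct("<BIdd")


def point_to_wkb(x, y):
    """Serialize a point to little-endian WKB bytes."""
    return _POINT.pack(1, 1, x, y)


def point_from_wkb(data):
    """Deserialize little-endian WKB bytes into an (x, y) tuple."""
    order, geom_type, x, y = _POINT.unpack(data)
    if order != 1 or geom_type != 1:
        raise ValueError("this sketch only handles little-endian WKB points")
    return x, y


# Hypothetical UDF-style functions: bytes in, bytes out. Because the
# engine only sees a byte array, every call must decode and re-encode.
def translate(data, dx, dy):
    x, y = point_from_wkb(data)           # deserialize
    return point_to_wkb(x + dx, y + dy)   # serialize again


def scale(data, factor):
    x, y = point_from_wkb(data)           # deserialize again
    return point_to_wkb(x * factor, y * factor)


blob = point_to_wkb(1.0, 2.0)
result = scale(translate(blob, 1.0, 1.0), 2.0)
print(len(blob), point_from_wkb(result))
```

Chaining two functions already costs two decode and two encode steps. A native Geometry type, as proposed in the thread, would let the decoded object pass between functions directly.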