Re: [jira] [Updated] (PIG-2831) MR-Cube implementation (Distributed cubing for holistic measures)

zhangfenghua

From: Prasanth J (JIRA)
Date: 2012-10-09 06:26
To: pig-dev
Subject: [jira] [Updated] (PIG-2831) MR-Cube implementation (Distributed cubing for holistic measures)

     [ https://issues.apache.org/jira/browse/PIG-2831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasanth J updated PIG-2831:
----------------------------

    Attachment: PIG-2831.9.git.patch
    
> MR-Cube implementation (Distributed cubing for holistic measures)
> -----------------------------------------------------------------
>
>                 Key: PIG-2831
>                 URL: https://issues.apache.org/jira/browse/PIG-2831
>             Project: Pig
>          Issue Type: Sub-task
>            Reporter: Prasanth J
>            Assignee: Prasanth J
>         Attachments: PIG-2831.1.git.patch, PIG-2831.2.git.patch, PIG-2831.3.git.patch, PIG-2831.4.git.patch, PIG-2831.5.git.patch, PIG-2831.6.git.patch, PIG-2831.7.git.patch, PIG-2831.8.git.patch, PIG-2831.9.git.patch
>
>
> Implement distributed cube materialization for holistic measures, based on the MR-Cube approach described in http://arnab.org/files/mrcube.pdf.
> Primary steps involved:
> 1) Identify whether the measure is holistic
> 2) Determine the algebraic attribute (it can be detected automatically in a few cases; if automatic detection fails, the user should supply it as a hint)
> 3) Modify MRPlan to insert a sampling job that executes the naive cube algorithm and generates an annotated cube lattice (which contains the partitioning information for large groups)
> 4) Modify the plan to distribute the annotated cube lattice to all mappers using the distributed cache
> 5) Execute the actual cube materialization on the full dataset
> 6) Modify MRPlan to insert a post-processing job that combines the results of the actual cube materialization job
> 7) OOM exception handling
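
The steps above can be illustrated with a small single-process sketch. It is not the code in the attached patch and it does not use Pig or MapReduce APIs. It only mimics the idea for one holistic measure, a distinct count. All function names, the `reducer_limit` parameter, and the use of `hash(...) % factor` for value partitioning are assumptions made for illustration. The sampling pass estimates the largest group in each cube region and assigns a partition factor (steps 3 and 4). The materialization pass splits each group on the algebraic attribute (step 5). The final loop adds up the partial counts (step 6).

```python
import math
from collections import defaultdict
from itertools import combinations


def cube_regions(dims):
    """Return every subset of the dimension names, i.e. the cube lattice."""
    return [combo for r in range(len(dims) + 1) for combo in combinations(dims, r)]


def annotate_lattice(sample, dims, sample_fraction, reducer_limit):
    """Run a naive cube over a sample and return {region: partition factor}.

    reducer_limit is a hypothetical cap on the number of rows one reducer
    is assumed to handle; it is not a Pig setting.
    """
    factors = {}
    for region in cube_regions(dims):
        sizes = defaultdict(int)
        for row in sample:
            sizes[tuple(row[d] for d in region)] += 1
        # Scale the largest sampled group up to an estimate for the full data.
        estimated_max = max(sizes.values(), default=0) / sample_fraction
        factors[region] = max(1, math.ceil(estimated_max / reducer_limit))
    return factors


def materialize_distinct(rows, attr, factors):
    """Count distinct values of attr per cube group using value partitioning."""
    partial = defaultdict(set)
    for row in rows:
        for region, factor in factors.items():
            group = tuple(row[d] for d in region)
            # Equal attr values always land in the same partition of a group,
            # so the partitions of one group never share a value.
            part = hash(row[attr]) % factor
            partial[(region, group, part)].add(row[attr])

    # Post-processing: partitions are disjoint, so partial counts can be summed.
    result = defaultdict(int)
    for (region, group, _part), values in partial.items():
        result[(region, group)] += len(values)
    return dict(result)
```

Because the partitions of a group are disjoint on the algebraic attribute, the summed partial counts equal the distinct count computed over the whole group, which is what lets a holistic measure be split across reducers in this sketch.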

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira