Partitions in pig
abhishek 2012-12-18, 22:39

Hi all,

I have a use case that is implemented in Hive with partitions, say:

Customer_data/2012-12-18/....
             /2012-12-17/....
             /2012-12-16/....
             /
             /

I want to implement this in Pig. How will partitions work in Pig?

Regards,
Abhishek
Re: Partitions in pig
Russell Jurney 2012-12-18, 23:13

This is what HCatalog and its Pig interface, HCatLoader, are for: accessing Hive data from Pig. Unfortunately you are running CDH, which doesn't support the Apache HCatalog project. HDP includes Apache HCatalog:
http://hortonworks.com/hdp/hdp-hcatalog-metadata-services/

More info on Apache HCatalog is available here:
http://www.infoq.com/articles/HadoopMetadata

However, there is an RCFile loader in Piggybank:
http://svn.apache.org/viewvc/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/HiveColumnarLoader.java?view=markup

Russell Jurney http://datasyndrome.com
Re: Partitions in pig
abhishek 2012-12-19, 00:11

Hi Russell,

Thanks for the reply. How is the RCFile loader related to partitions? I did not get your point here.

Regards,
Abhi
Re: Partitions in pig
Russell Jurney 2012-12-19, 00:20

Are you doing a directory-based partition with Hive, or are you letting Hive's RCFile partition data for you?

Russell Jurney http://datasyndrome.com
Re: Partitions in pig
abhishek 2012-12-19, 00:27

Directory-based partitioning in Hive, partitioned by date.

Thanks,
Abhi
Re: Partitions in pig
Russell Jurney 2012-12-19, 00:43

It will work like so:
http://stackoverflow.com/questions/3515481/pig-latin-load-multiple-files-from-a-date-range-part-of-the-directory-structur

Russell Jurney http://datasyndrome.com
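For readers who want something concrete: with date-named directories like the ones in the original question, one common approach is to hand Pig's LOAD a Hadoop-style path glob that lists the wanted dates in braces. The Python sketch below only builds that glob and the text of a LOAD statement. It is an illustration under assumptions, not the method from the linked page: the helper names `partition_glob` and `load_statement`, the alias `customers`, and the relative path `Customer_data` are all hypothetical, and the statement omits a USING clause, which you would need to add to match your data's format.

```python
from datetime import date, timedelta


def partition_glob(base_dir: str, start: date, end: date) -> str:
    """Build a glob matching one date-named directory per day, inclusive.

    Assumes the layout <base_dir>/<YYYY-MM-DD>/..., as in the question.
    """
    if end < start:
        raise ValueError("end date precedes start date")
    days = [(start + timedelta(days=i)).isoformat()
            for i in range((end - start).days + 1)]
    if len(days) == 1:
        # A single day needs no brace alternation.
        return base_dir + "/" + days[0]
    return base_dir + "/{" + ",".join(days) + "}"


def load_statement(alias: str, path_glob: str) -> str:
    """Return the text of a Pig LOAD statement (hypothetical alias, no USING clause)."""
    return f"{alias} = LOAD '{path_glob}';"


if __name__ == "__main__":
    glob_path = partition_glob("Customer_data", date(2012, 12, 16), date(2012, 12, 18))
    print(load_statement("customers", glob_path))
```

The generated statement can be pasted into a Pig script or passed in as a parameter. Note that loading by path this way reads only the files: the date in the directory name does not become a column, which is the gap HCatalog fills for Hive-managed partitions.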
Re: Partitions in pig
abhishek 2012-12-19, 04:33

It works for me, thanks.

Regards,
Abhi
Re: Partitions in pig
abhishek 2012-12-19, 01:03

Hi Russell,

I will try this and get back to you.

Regards,
Abhishek
Re: Partitions in pig
Cheolsoo Park 2012-12-18, 23:43

To be clear, the next CDH release is going to include HCatalog.

Thanks,
Cheolsoo