abhishek 2012-12-18, 22:39
Hi all,
I have a use case that is implemented in Hive with partitions.
Say
Customer_data/2012-12-18/....
/2012-12-17/....
/2012-12-16/....
/
/
I want to implement this in Pig.
How do partitions work in Pig?
Regards Abhishek
Russell Jurney 2012-12-18, 23:13
This is what HCatalog and Pig's HCatStorage are for: accessing Hive data from Pig. Unfortunately you are running CDH, which doesn't support the Apache HCatalog project. HDP includes Apache HCatalog:
http://hortonworks.com/hdp/hdp-hcatalog-metadata-services/
More info on Apache HCatalog is available here:
http://www.infoq.com/articles/HadoopMetadata
However, there is an RCFile loader in Piggybank:
http://svn.apache.org/viewvc/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/HiveColumnarLoader.java?view=markup
Russell Jurney http://datasyndrome.com
abhishek 2012-12-19, 00:11
Hi Russell,
Thanks for the reply. How is the RCFile loader related to partitions? I did not get your point here.
Regards
Abhi
Russell Jurney 2012-12-19, 00:20
Are you doing a directory-based partition with Hive, or are you letting Hive's RCFile partition data for you?
Russell Jurney http://datasyndrome.com
abhishek 2012-12-19, 00:27
Directory-based partitioning in Hive, partitioned by date.
Thanks
Abhi
Russell Jurney 2012-12-19, 00:43
It will work like so:
http://stackoverflow.com/questions/3515481/pig-latin-load-multiple-files-from-a-date-range-part-of-the-directory-structur
Russell Jurney http://datasyndrome.com
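For readers who cannot follow the link: the usual approach for date-named directories is to give Pig's LOAD a path glob that matches only the wanted dates. The sketch below is one way to build such a glob for a date range. It is not taken from the linked answer. The helper names `date_range_glob` and `pig_load_statement`, the alias `customers`, and the choice of the default `PigStorage()` loader are all assumptions for illustration, and it relies on Pig passing Hadoop-style `{a,b,c}` globs through to the file system.

```python
from datetime import date, timedelta


def date_range_glob(base_dir: str, start: date, end: date) -> str:
    """Build a brace glob matching one directory per day, start..end inclusive.

    Hypothetical helper: assumes directories are named YYYY-MM-DD directly
    under base_dir, as in the Customer_data layout from the question.
    """
    if end < start:
        raise ValueError("end must not be before start")
    days = [start + timedelta(days=n) for n in range((end - start).days + 1)]
    names = [d.isoformat() for d in days]
    if len(names) == 1:
        # A single day needs no brace alternation.
        return f"{base_dir}/{names[0]}"
    return f"{base_dir}/{{{','.join(names)}}}"


def pig_load_statement(alias: str, path_glob: str) -> str:
    """Format a Pig Latin LOAD statement for the glob.

    PigStorage() with its default delimiter is an assumption; substitute
    whatever loader matches the actual file format.
    """
    return f"{alias} = LOAD '{path_glob}' USING PigStorage();"


if __name__ == "__main__":
    glob = date_range_glob("Customer_data", date(2012, 12, 16), date(2012, 12, 18))
    print(pig_load_statement("customers", glob))
```

The generated statement can be pasted into a script, or the glob alone can be handed to a script as a parameter. Note that with a plain glob load the date exists only in the path, so it does not show up as a column the way a Hive partition column does.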
abhishek 2012-12-19, 04:33
It works for me, thanks.
Regards
Abhi
abhishek 2012-12-19, 01:03
Hi Russell,
I will try this and get back to you.
Regards
Abhishek
Cheolsoo Park 2012-12-18, 23:43
To be clear, the next CDH release is going to include HCatalog.
Thanks,
Cheolsoo