Pig, mail # user - output partitioning


Re: output partitioning
Thejas Nair 2011-10-05, 15:30
-thejas.
typed on a tiny virtual keyboard
On Oct 5, 2011 5:21 AM, "Alex Rovner" <[EMAIL PROTECTED]> wrote:
> Alan,
>
> We are looking into integrating with HCatalog and I have the following
> questions:
>
> 1. In your opinion, how stable is HCatalog?
> 2. The install page mentions creating the Hive metastore db.
> What if we are already using Hive and have an existing metastore db in
> MySQL? Which versions of Hive is HCatalog compatible with?
>
> Thanks in advance
>
> Alex R
>
> On Tue, Oct 4, 2011 at 2:14 PM, Alan Gates <[EMAIL PROTECTED]> wrote:
>
>> That means one partition at a time, not the number of keys in the
>> partition. And in the 0.2 (just released), the one at a time restriction
>> is removed. So you can partition data by client id and date.
>>
>> Alan.
>>
>> On Oct 4, 2011, at 11:12 AM, Stan Rosenberg wrote:
>>
>> > On Tue, Oct 4, 2011 at 2:06 PM, Alan Gates <[EMAIL PROTECTED]> wrote:
>> >
>> >> Can you explain what you mean by secondary output partitioning?
>> >> HCatalog supports the same partitioning that Hive does.
>> >>
>> >
>> > "Currently HCatStorer only supports writing to one partition."
>> >
>> > We need to partition our data by client id, then by date, hence
>> > two-level partitioning.
>>
>>
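
For readers following the thread, the two-level partitioning discussed above (client id first, then date) can be pictured as nested Hive-style `key=value` partition directories. The sketch below is only an illustration of that layout in Python. It is not HCatStorer's implementation, and the base path, the field names (`client_id`, `date`), and both helper functions are hypothetical.

```python
from collections import defaultdict


def partition_path(base, client_id, date):
    """Build a nested Hive-style partition directory (hypothetical helper)."""
    return f"{base}/client_id={client_id}/date={date}"


def partition_records(records, base="/warehouse/events"):
    """Group records by (client_id, date), keyed by their partition directory.

    The base path and field names are assumptions made for illustration.
    """
    groups = defaultdict(list)
    for rec in records:
        groups[partition_path(base, rec["client_id"], rec["date"])].append(rec)
    return dict(groups)


records = [
    {"client_id": "c1", "date": "2011-10-04", "value": 1},
    {"client_id": "c1", "date": "2011-10-05", "value": 2},
    {"client_id": "c2", "date": "2011-10-04", "value": 3},
]

# Each distinct (client_id, date) pair ends up in its own directory.
layout = partition_records(records)
for path in sorted(layout):
    print(path, len(layout[path]))
```

Under the one-partition-at-a-time restriction described in the thread, a single store could target only one such directory. With the restriction lifted, one output can fan out across all of them.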