Pig >> mail # user >> output partitioning


Re: output partitioning
-thejas.
typed on a tiny virtual keyboard
On Oct 5, 2011 5:21 AM, "Alex Rovner" <[EMAIL PROTECTED]> wrote:
> Alan,
>
> We are looking into integrating with the HCatalog and I have the following
> questions:
>
> 1. In your opinion, how stable is the HCatalog?
> 2. On the install page it mentions the creation of the hive metastore db.
> What if we are already using Hive and have an existing metastore db in
> MySQL? Which versions of Hive is HCatalog compatible with?
>
> Thanks in advance
>
> Alex R
>
> On Tue, Oct 4, 2011 at 2:14 PM, Alan Gates <[EMAIL PROTECTED]> wrote:
>
>> That means one partition at a time, not the number of keys in the
>> partition. And in the 0.2 (just released), the one at a time restriction
>> is removed. So you can partition data by client id and date.
>>
>> Alan.
>>
>> On Oct 4, 2011, at 11:12 AM, Stan Rosenberg wrote:
>>
>> > On Tue, Oct 4, 2011 at 2:06 PM, Alan Gates <[EMAIL PROTECTED]> wrote:
>> >
>> >> Can you explain what you mean by secondary output partitioning?
>> >> HCatalog supports the same partitioning that Hive does.
>> >>
>> >
>> > "Currently HCatStorer only supports writing to one partition."
>> >
>> > We need to partition our data by client id, then by date, hence
>> > two-level partitioning.
>>
>>