Chukwa, mail # dev - [DISCUSSION] Making HBaseWriter default


RE: [DISCUSSION] Making HBaseWriter default
Deshpande, Deepak 2010-11-22, 18:47
I agree. Making HBase the default would make life difficult for some Chukwa users. In my setup, I don't need HDFS. I am using Chukwa merely as a log streaming framework. I have plugged in my own writer to write log files to the local file system (instead of HDFS). I evaluated Chukwa against other frameworks, and Chukwa had better built-in fault tolerance than the others. This made me recommend Chukwa over other frameworks.

Making HBase the default option would definitely make my life difficult :).

Thanks,
Deepak Deshpande

-----Original Message-----
From: Bill Graham [mailto:[EMAIL PROTECTED]]
Sent: Monday, November 22, 2010 1:23 PM
To: [EMAIL PROTECTED]
Subject: Re: [DISCUSSION] Making HBaseWriter default

Hi Eric,

I think we should have a default config that is easy to tweak to work
with or without HBase. My inclination would be to not have HBase
enabled by default, since it raises the barrier to entry for a basic
set-up that might not otherwise need HBase.

When I first installed Chukwa 0.3.0 for evaluation I spent a lot of
time setting up MySQL and HICC because I thought I had to, only to
realize later that those components weren't needed for my use cases
(this wasn't and still isn't clearly reflected in the quick start
documentation). Hence I think it's better to require a few extra steps
for people who have HBase, than to risk losing users to the extra
steps required to get a basic setup running without HBase.

thanks,
Bill

On Sat, Nov 20, 2010 at 9:02 PM, Eric Yang <[EMAIL PROTECTED]> wrote:
> Hi James,
>
> In my 10-node cluster, it used to take 7 minutes (3 minutes M/R + 4
> minutes load to MySQL) to process data and be able to visualize it on
> the HICC UI.  Now, it takes 50 milliseconds.  For data aggregation, it
> used to take 15-20 minutes to roll up a day's data for 2000 nodes;
> now it takes <5 minutes.  The improvement is 2100 times better for
> data load latency, and 3 times better for data analytics throughput
> with pig+hbase.
>
> regards,
> Eric
>
> On Sat, Nov 20, 2010 at 12:20 PM, James Seigel <[EMAIL PROTECTED]> wrote:
>> Hello!
>>
>> As a high-volume user, I was just wondering how the HBaseWriter compares with the current one under load?  Is it better or worse, and by how much?
>>
>> Cheers
>> James.
>>
>>
>> On 2010-11-20, at 1:15 PM, Eric Yang wrote:
>>
>>> Hi all,
>>>
>>> In order to use the full features of Chukwa in trunk, HBase is required to
>>> display data on HICC.  I am wondering if anyone has had good success
>>> using HBase+HICC?  I am leaning toward making HBase the default data
>>> storage for Chukwa, so that the default configuration for the Chukwa
>>> collector will make use of HBaseWriter.  How does the community feel about
>>> changing the default writer config?
>>>
>>> regards,
>>> Eric
>>
>>
>