Chukwa >> mail # user >> speeding up demux


Corbin Hoenes 2011-05-27, 03:23
Bill Graham 2011-05-27, 03:30
Eric Yang 2011-05-27, 15:58
James Seigel 2011-06-01, 20:01
Re: speeding up demux
Yes.

Trunk should be stable at this point.

Performance should be about the same as 0.3 or 0.4 -- most Chukwa
development has been geared to features and bugfixes.

CHANGES.txt in trunk is the changelist since 0.4

Trunk compiles, last I checked.

--Ari

On Wed, Jun 1, 2011 at 1:01 PM, James Seigel <[EMAIL PROTECTED]> wrote:
> Hello!
> I am seriously considering what you are suggesting in this email, even
> though it goes against what would seem to make sense.  I have a couple of
> questions if anyone has the time to answer.
> 1) How stable is trunk right now?
> 2) Any performance improvements/degradations since 0.3?
> 3) Is there a pseudo change log between “trunk” and 0.4 that I could take a
> peek at at this point?
> 4) Does it compile? ;)
> Cheers and thanks for your time!
> James.
>
> On 2011-05-27, at 9:58 AM, Eric Yang wrote:
>
> I would recommend to skip Chukwa 0.4 and go to the trunk.  In addition, use
> HBaseWriter to stream data into HBase in parallel, hence, the data can be
> processed in near real time for demux.
>
> Regards,
> Eric
>
> On 5/26/11 8:30 PM, "Bill Graham" <[EMAIL PROTECTED]> wrote:
>
> This seems possible, but one thing that would need to be changed is the
> directories that demux uses. For example:
> demuxProcessing/mrInput
> demuxProcessing/mrOutput
>
> These would need to be dynamic directories with the timestamp or something else
> in them to keep two jobs from interfering with each other.
>
> On Thu, May 26, 2011 at 8:23 PM, Corbin Hoenes <[EMAIL PROTECTED]> wrote:
>
> Finding demux to be a bit too slow for our needs.  It seems like only 1 runs
> at a time; is there some technical reason why we couldn't run a couple in
> parallel?  If so any hints on how difficult it would be to run multiple
> demuxers at a time?
>

--
Ari Rabkin [EMAIL PROTECTED]
UC Berkeley Computer Science Department
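
Editor's sketch: Bill Graham's note in the quoted thread says the fixed demuxProcessing/mrInput and demuxProcessing/mrOutput directories would have to become dynamic (timestamped or similar) before two demux jobs could run side by side. The snippet below only illustrates that naming idea. It is not Chukwa code (Chukwa itself is Java), and the function name demux_job_dirs, the "-<timestamp>-<suffix>" layout, and the random suffix are all assumptions made for illustration.

```python
import time
import uuid
from pathlib import PurePosixPath


def demux_job_dirs(base="demuxProcessing", now_ms=None, suffix=None):
    """Return per-job input/output directory names for one demux run.

    Hypothetical helper: each run gets its own mrInput/mrOutput pair so
    that two concurrent jobs never read or write the same directory.
    """
    if now_ms is None:
        # Millisecond timestamp, as suggested in the thread.
        now_ms = int(time.time() * 1000)
    if suffix is None:
        # Assumption: a short random suffix guards against two jobs
        # starting within the same millisecond.
        suffix = uuid.uuid4().hex[:8]

    job_id = f"{now_ms}-{suffix}"
    root = PurePosixPath(base)
    return {
        "job_id": job_id,
        "mr_input": str(root / f"mrInput-{job_id}"),
        "mr_output": str(root / f"mrOutput-{job_id}"),
    }


if __name__ == "__main__":
    # Two jobs started back to back get disjoint directories.
    print(demux_job_dirs())
    print(demux_job_dirs())
```

Anything that currently assumes the fixed directory names (cleanup, post-processing, error handling) would also need to learn the per-job names, which is likely where most of the real work in such a change would be.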
Eric Yang 2011-06-01, 20:33
James Seigel 2011-06-01, 20:40