Home | About | Sematext search-lucene.com search-hadoop.com
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
 Search Hadoop and all its subprojects:

Switch to Threaded View
Chukwa >> mail # dev >> Cluster-specific Adaptors


Copy link to this message
-
Re: Cluster-specific Adaptors
I've been thinking about the adaptor add syntax for a while.  It's
kinda clunky and hard to extend.

Here's something I've been considering.  Suppose instead of doing this
in the add command, we had a separate setTags command that controls
the tags for the next adaptor[s] started from that conf file or socket
connection.

so you'd have stuff like
settags cluster foo
add .....

And then the adaptors wouldn't even have to know about it; you could
handle this primarily on the framework side.

This would make the checkpointing somewhat more complex, but I think
that's potentially not so bad to patch up.  You'd keep a list for each
adaptor of supplemental commands that should be applied to it, and
dump those out before checkpointing the adaptor state.
--Ari

On Mon, Sep 20, 2010 at 5:15 PM, Bill Graham <[EMAIL PROTECTED]> wrote:
> I'd like to hear Ari's take on this, but this does feel a bit hacky to
> me. Plus, it would put the responsibility of parsing tags on each
> adaptor impl and would require a refactor of how each one currently
> parses args.
>
> Actually, we might be able to intercept the call to parseArgs in
> AbstractAdaptor and pull out the tags if they exist and pass the rest
> to the subclass, which would be none the wiser. Not the cleanest, but
> at lease not as intrusive on the adaptor implementations.
>
> Ari, also what about the getCurrentStatus() method? I'd think all the
> impls would somehow need to incorporate tags into that response as
> well, since AFAIR that's what's used to do Adaptor SerDe with the
> checkpoints file.
>
>
> On Mon, Sep 20, 2010 at 3:57 PM, Eric Yang <[EMAIL PROTECTED]> wrote:
>> Hi Bill,
>>
>> This might be hacky but it should be possible to have adaptor specific
>> params to include tags.  Ari, what do you think?
>>
>> Regards,
>> Eric
>>
>> On 9/20/10 2:58 PM, "Bill Graham" <[EMAIL PROTECTED]> wrote:
>>
>> Hi,
>>
>> In CHUKWA-515 we discussed the possibility being able to add an
>> adaptor bound to a given cluster:
>>
>> https://issues.apache.org/jira/browse/CHUKWA-515?focusedCommentId=12905811&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12905811
>>
>> I can actually see this being useful, especially now that it's easier
>> to add/remove agents with the Adaptor REST API. Looking into the code
>> it doesn't seem like it would be that hard to do, but I want to make
>> sure I'm not overlooking anything.
>>
>> It seems like we could support this with a few small changes:
>>
>> - Add the concept of tags to the Adaptor interface.
>> - AbstractAdator would support a getTags method which would return the
>> union of tags set on the Adaptor and the default tags on the
>> DataFactory.
>> - Internal tag implementations on each would change to store tags in
>> maps, instead of concat'ed strings. This would allow for a "last in
>> wins" type of functionality so tags could be overriden. This assumes
>> of course that there should never be more than one of the same tag key
>> value, which I _assume_ is the case.
>> - The ChunkImpl constructor will call getTags on the agent, instead of
>> getDefaultTags the data factory.
>>
>> The trickiest part as I see it is figuring out how to change the add
>> adaptor string syntax in ChukwaAgetn.processAddCommandE in a way that
>> both makes sense and doesn't break things. In it's current form it
>> doesn't have room for easy expansion except at the end of the line:
>>
>> add [name =] <adaptor_class_name> <datatype> <adaptor specific params>
>> <initial offset>
>>
>> Any thoughts or suggestions? There's also a potential gotcha with all
>> the impls of Adaptor.parseArgs either breaking or needing to
>> change....
>>
>> thanks,
>> Bill
>>
>>
>

--
Ari Rabkin [EMAIL PROTECTED]
UC Berkeley Computer Science Department
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB