Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Chukwa, mail # dev - Cluster-specific Adaptors


Copy link to this message
-
Re: Cluster-specific Adaptors
Ariel Rabkin 2010-09-21, 00:32
I've been thinking about the adaptor add syntax for a while.  It's
kinda clunky and hard to extend.

Here's something I've been considering.  Suppose instead of doing this
in the add command, we had a separate setTags command that controls
the tags for the next adaptor[s] started from that conf file or socket
connection.

so you'd have stuff like
settags cluster foo
add .....

And then the adaptors wouldn't even have to know about it; you could
handle this primarily on the framework side.

This would make the checkpointing somewhat more complex, but I think
that's potentially not so bad to patch up.  You'd keep a list for each
adaptor of supplemental commands that should be applied to it, and
dump those out before checkpointing the adaptor state.
--Ari

On Mon, Sep 20, 2010 at 5:15 PM, Bill Graham <[EMAIL PROTECTED]> wrote:
> I'd like to hear Ari's take on this, but this does feel a bit hacky to
> me. Plus, it would put the responsibility of parsing tags on each
> adaptor impl and would require a refactor of how each one currently
> parses args.
>
> Actually, we might be able to intercept the call to parseArgs in
> AbstractAdaptor and pull out the tags if they exist and pass the rest
> to the subclass, which would be none the wiser. Not the cleanest, but
> at lease not as intrusive on the adaptor implementations.
>
> Ari, also what about the getCurrentStatus() method? I'd think all the
> impls would somehow need to incorporate tags into that response as
> well, since AFAIR that's what's used to do Adaptor SerDe with the
> checkpoints file.
>
>
> On Mon, Sep 20, 2010 at 3:57 PM, Eric Yang <[EMAIL PROTECTED]> wrote:
>> Hi Bill,
>>
>> This might be hacky but it should be possible to have adaptor specific
>> params to include tags.  Ari, what do you think?
>>
>> Regards,
>> Eric
>>
>> On 9/20/10 2:58 PM, "Bill Graham" <[EMAIL PROTECTED]> wrote:
>>
>> Hi,
>>
>> In CHUKWA-515 we discussed the possibility being able to add an
>> adaptor bound to a given cluster:
>>
>> https://issues.apache.org/jira/browse/CHUKWA-515?focusedCommentId=12905811&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12905811
>>
>> I can actually see this being useful, especially now that it's easier
>> to add/remove agents with the Adaptor REST API. Looking into the code
>> it doesn't seem like it would be that hard to do, but I want to make
>> sure I'm not overlooking anything.
>>
>> It seems like we could support this with a few small changes:
>>
>> - Add the concept of tags to the Adaptor interface.
>> - AbstractAdator would support a getTags method which would return the
>> union of tags set on the Adaptor and the default tags on the
>> DataFactory.
>> - Internal tag implementations on each would change to store tags in
>> maps, instead of concat'ed strings. This would allow for a "last in
>> wins" type of functionality so tags could be overriden. This assumes
>> of course that there should never be more than one of the same tag key
>> value, which I _assume_ is the case.
>> - The ChunkImpl constructor will call getTags on the agent, instead of
>> getDefaultTags the data factory.
>>
>> The trickiest part as I see it is figuring out how to change the add
>> adaptor string syntax in ChukwaAgetn.processAddCommandE in a way that
>> both makes sense and doesn't break things. In it's current form it
>> doesn't have room for easy expansion except at the end of the line:
>>
>> add [name =] <adaptor_class_name> <datatype> <adaptor specific params>
>> <initial offset>
>>
>> Any thoughts or suggestions? There's also a potential gotcha with all
>> the impls of Adaptor.parseArgs either breaking or needing to
>> change....
>>
>> thanks,
>> Bill
>>
>>
>

--
Ari Rabkin [EMAIL PROTECTED]
UC Berkeley Computer Science Department