Chukwa >> mail # dev >> Cluster-specific Adaptors


Re: Cluster-specific Adaptors
+1 on staying stateless.

I think the challenge we're facing is that we're trying to support a
syntax that is simple and readable and can be done with a single line
(i.e. for the initial_adaptors file, the telnet API, the command line,
etc), but the configs can potentially be not-so-simple.

For example, here's how you might configure the JMS adaptor, which uses
dependency injection. That's a lot for a single line and there's
nowhere to add new global configs in front of the adaptor specific
configs without breaking things.

add jms.JMSAdaptor jms-events
failover:(tcp://jms-host.foo.com:61616,tcp://jms-host.foo.com:61616)
-q some.queue.name -s "id_type IN ('162')" -x
org.apache.hadoop.chukwa.datacollection.adaptor.
jms.JMSMessagePropertyTransformer -p
"event_time,id_type,id,srcurl,xref,xrq,title -r event_time,id_type,id"
0

What if we were to adopt a few flags into the syntax:

add [name =] <adaptor_class_name> <datatype> [--tags <tags>]
[--adaptor-params <adaptor specific params>|--adaptor-config-file
<file>]
<initial offset>

The '--*' flags could be reserved. This would allow us to keep with a
one-line syntax where that approach works, but allow for expansion.
Also, if an adaptor config got too complex, those configs could be
specified in a file instead.
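
To make the proposal concrete, here is a rough sketch of how an agent
might parse that syntax. This is not existing Chukwa code: the class and
field names (AddCommandSketch, AddCommand) are made up for illustration.
It assumes tokens are separated by whitespace, the last token is the
initial offset, and everything between one reserved '--' flag and the
next belongs to that flag.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the proposed "add" syntax; not part of Chukwa.
class AddCommandSketch {

    /** Parsed pieces of one "add" line. Field names are illustrative only. */
    static final class AddCommand {
        String name;          // optional "name =" prefix, null if absent
        String adaptorClass;
        String dataType;
        String tags;          // value of --tags, null if absent
        String adaptorParams; // value of --adaptor-params, null if absent
        String configFile;    // value of --adaptor-config-file, null if absent
        long offset;
    }

    static AddCommand parse(String line) {
        // Naive whitespace split: runs of spaces inside quoted params collapse to one.
        List<String> t = new ArrayList<>(Arrays.asList(line.trim().split("\\s+")));
        if (t.size() < 4 || !t.get(0).equals("add")) {
            throw new IllegalArgumentException("expected: add <class> <datatype> ... <offset>");
        }
        AddCommand cmd = new AddCommand();

        // The initial offset is always the last token.
        cmd.offset = Long.parseLong(t.remove(t.size() - 1));

        // Optional "name =" in front of the adaptor class name.
        int i = 1;
        if (t.get(2).equals("=")) {
            cmd.name = t.get(1);
            i = 3;
        }
        if (t.size() < i + 2) {
            throw new IllegalArgumentException("missing adaptor class or datatype");
        }
        cmd.adaptorClass = t.get(i++);
        cmd.dataType = t.get(i++);

        // Everything left must be a reserved "--" flag followed by its value.
        while (i < t.size()) {
            String flag = t.get(i++);
            if (!flag.startsWith("--")) {
                throw new IllegalArgumentException("unexpected token: " + flag);
            }
            int start = i;
            while (i < t.size() && !t.get(i).startsWith("--")) {
                i++;
            }
            String value = String.join(" ", t.subList(start, i));
            switch (flag) {
                case "--tags": cmd.tags = value; break;
                case "--adaptor-params": cmd.adaptorParams = value; break;
                case "--adaptor-config-file": cmd.configFile = value; break;
                default: throw new IllegalArgumentException("unknown reserved flag: " + flag);
            }
        }
        // The proposal treats inline params and a config file as alternatives.
        if (cmd.adaptorParams != null && cmd.configFile != null) {
            throw new IllegalArgumentException("use --adaptor-params or --adaptor-config-file, not both");
        }
        return cmd;
    }

    public static void main(String[] args) {
        AddCommand c = parse("add jms.JMSAdaptor jms-events --tags cluster=demo "
                + "--adaptor-params -q some.queue.name -x 0");
        System.out.println(c.adaptorClass + " " + c.dataType + " tags=" + c.tags
                + " params=[" + c.adaptorParams + "] offset=" + c.offset);
    }
}
```

Because only '--' is reserved, adaptor params that use single-dash flags
(like -q and -x above) pass through untouched and the adaptor still
parses them itself.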

On Mon, Sep 20, 2010 at 9:52 PM, Jerome Boulon <[EMAIL PROTECTED]> wrote:
> Hi,
> If I had to implement this, I would add an extra parameter
> (?extraParams=xyz).
> The adaptor impl will be the only one responsible for parsing this adaptor’s
> specific info.
> I don’t think that we could/should add new complexity in the parsing.
> The same thing should be done for getCurrentStatus(): a public result that
> is the same for all adaptors, in order to know if the adaptor is working or
> not, and a private section that will give extra information.
>
> Also, moving to a json input should simplify everything.
> /Jerome.
>
> On 9/20/10 5:15 PM, "Bill Graham" <[EMAIL PROTECTED]> wrote:
>
> I'd like to hear Ari's take on this, but this does feel a bit hacky to
> me. Plus, it would put the responsibility of parsing tags on each
> adaptor impl and would require a refactor of how each one currently
> parses args.
>
> Actually, we might be able to intercept the call to parseArgs in
> AbstractAdaptor and pull out the tags if they exist and pass the rest
> to the subclass, which would be none the wiser. Not the cleanest, but
> at least not as intrusive on the adaptor implementations.
>
> Ari, also what about the getCurrentStatus() method? I'd think all the
> impls would somehow need to incorporate tags into that response as
> well, since AFAIR that's what's used to do Adaptor SerDe with the
> checkpoints file.
>
>
> On Mon, Sep 20, 2010 at 3:57 PM, Eric Yang <[EMAIL PROTECTED]> wrote:
>> Hi Bill,
>>
>> This might be hacky but it should be possible to have adaptor specific
>> params to include tags.  Ari, what do you think?
>>
>> Regards,
>> Eric
>>
>> On 9/20/10 2:58 PM, "Bill Graham" <[EMAIL PROTECTED]> wrote:
>>
>> Hi,
>>
>> In CHUKWA-515 we discussed the possibility of being able to add an
>> adaptor bound to a given cluster:
>>
>>
>> https://issues.apache.org/jira/browse/CHUKWA-515?focusedCommentId=12905811&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12905811
>>
>> I can actually see this being useful, especially now that it's easier
>> to add/remove agents with the Adaptor REST API. Looking into the code
>> it doesn't seem like it would be that hard to do, but I want to make
>> sure I'm not overlooking anything.
>>
>> It seems like we could support this with a few small changes:
>>
>> - Add the concept of tags to the Adaptor interface.
>> - AbstractAdaptor would support a getTags method which would return the
>> union of tags set on the Adaptor and the default tags on the
>> DataFactory.
>> - Internal tag implementations on each would change to store tags in
>> maps, instead of concat'ed strings. This would allow for a "last in
>> wins" type of functionality so tags could be overridden. This assumes