How to use Flume Client perl API to send data to ElasticSearch? (Flume user mailing list)


Thread:
shushuai zhu 2013-06-13, 03:47
shushuai zhu 2013-06-14, 03:03
Edward Sargisson 2013-06-14, 15:53
shushuai zhu 2013-06-14, 18:52
Edward Sargisson 2013-06-18, 21:48
Re: How to use Flume Client perl API to send data to ElasticSearch?
Edward,
 
Thanks again. I found the cause of the problem based on what you pointed out: the index pattern was wrong in the Kibana config file. In summary, to make ES work with Kibana, the following was done (see the config sketch below):
 
Have "timestamp" in headers along with other fields
Use org.apache.flume.sink.elasticsearch.ElasticSearchLogStashEventSerializer as ES sink serializer
Modify Smart_index_pattern in Kibana config file
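A minimal sketch of the last two changes, assuming a Flume agent named "agent" with an ES sink named "es" and the Ruby Kibana's KibanaConfig.rb (agent/sink names and hosts here are placeholders, not from the thread):

    # Flume agent config: ES sink with the logstash-style serializer
    agent.sinks.es.type = org.apache.flume.sink.elasticsearch.ElasticSearchSink
    agent.sinks.es.hostNames = localhost:9300
    # daily indices come out as <indexName>-yyyy-MM-dd, e.g. flume-2013-06-14
    agent.sinks.es.indexName = flume
    agent.sinks.es.indexType = logs
    agent.sinks.es.serializer = org.apache.flume.sink.elasticsearch.ElasticSearchLogStashEventSerializer

and in KibanaConfig.rb, point the smart index pattern at the flume-style names instead of the logstash default:

    Smart_index_pattern = 'flume-%Y-%m-%d'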
 
Cheers,
 
Shushuai
 

________________________________
 From: Edward Sargisson <[EMAIL PROTECTED]>
To: user <[EMAIL PROTECTED]>
Sent: Tuesday, June 18, 2013 5:48 PM
Subject: Re: How to use Flume Client perl API to send data to ElasticSearch?
  
Hi Shushuai,
Your index naming isn't matching what Kibana is looking for.
Note that flume is writing to flume-2013-06-14 while logstash is writing to logstash-2013.05.28.
The first part is a setting in your Kibana config. I've never seen logstash write with periods as delimiters, so I'm not sure why that would work.

The other thing to be aware of is that the machine running ElasticSearchSink needs to be in the UTC timezone. There is a defect (since fixed) where ElasticSearchSink uses the local timezone to decide which index to write to, while Kibana always uses UTC.
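To make the timezone point concrete, here's a quick perl sketch (illustrative only, not from the thread) that generates the epoch-millisecond "timestamp" header and shows which UTC day, and therefore which daily index, it falls in:

    use POSIX qw(strftime);

    # Epoch milliseconds for the "timestamp" event header.
    my $ts_millis = time() * 1000;

    # Kibana resolves the day in UTC (gmtime); a sink host in another
    # timezone can pick a neighbouring day's index near midnight.
    my $utc_day = strftime("%Y-%m-%d", gmtime($ts_millis / 1000));
    print "flume-$utc_day\n";    # e.g. flume-2013-06-14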

Cheers,
Edward
"

Edward,
 
Thanks for the reply. I tried using both epoch time and a date string for the timestamp, but neither of them makes the ElasticSearch data viewable in Kibana. The perl call looks like:
 
my ($result, $response) = $ng_requestor->request('appendBatch', [{
    headers => {
        "\@source" => "default", "\@timestamp" => 1369769253000,
        "\@source_host" => "null", "\@source_path" => "default",
        "\@message" => "abc", "\@type" => "flume-input",
    },
    body => "hello, this is sent from perl (using FlumeNG)",
}]);
print "$response\n";    # response will be 'OK' on success

I can always query ElasticSearch with the REST API, though. A returned example result in JSON looks like:
 
{
    "_index": "flume-2013-06-14",
    "_type": "logs",
    "_id": "iug8xyVWRK6Qm7Z22ohCXw",
    "_score": 1,
    "_source": {
        "body": "hello, this is sent from perl (using FlumeNG)",
        "@timestamp": "2013-05-28T19:27:49.947Z",
        "@message": "abc",
        "@source": "default",
        "@type": "flume-input",
        "@source_host": "null",
        "@source_path": "default"
    }
}
 
A returned example result from the ElasticSearch instance populated via Redis+LogStash looks like:
 
{
    "_index": "logstash-2013.05.28",
    "_type": "redis-input",
    "_id": "I3AJHKzhQ1GXWc1Lb8WlFQ",
    "_score": 1,
    "_source": {
        "@source": "default",
        "@tags": [0],
        "@fields": {
            "OBS_FIELD_PRIORITY": "DEBUG",
            "OBS_LOG_ENTRY_CONTENT": "2013-05-28 12:34:21,224 [#SYSTEM_POOL#:WorkManagerSvc] DEBUG workmanager.WorkManagerStats logp.251 - WeightedAvg.summarize[RunningThreads] : Avg : 1.0, min : 1, max : 1, lastVal : 1, dur : 60014812861, totdur : 60014812861 ",
            "OBS_FIELD_CLASS": "workmanager.WorkManagerStats",
            "OBS_LOG_FILE": "/scratch/work/system/log/abc.trc",
            "OBS_FIELD_MESSAGE": "WeightedAvg.summarize[RunningThreads] : Avg : 1.0, min : 1, max : 1, lastVal : 1, dur : 60014812861, totdur : 60014812861 ",
            "OBS_FIELD_LOGGER": "#SYSTEM_POOL#:WorkManagerSvc",
            "OBS_LOG_ENTRY_LENGTH": "226",
            "OBS_FIELD_METHOD": "logp.251",
            "OBS_FIELD_TIME": "1369769661224"
        },
        "@timestamp": "2013-05-28T19:34:50.672Z",
        "@source_host": null,
        "@source_path": "default",
        "@type": "redis-input"
    }
}
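For reference, the REST query that returns documents like the above can be issued from perl as well; a minimal sketch, assuming ES listens on localhost:9200 (adjust host and index name):

    use LWP::UserAgent;

    # Match-all search against one day's flume index via the ES REST API.
    my $ua  = LWP::UserAgent->new;
    my $res = $ua->get('http://localhost:9200/flume-2013-06-14/_search?q=*:*');
    print $res->decoded_content, "\n";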
 
Not sure what exactly in ElasticSearch causes the issue for Kibana. The "body" name/value seems inconsistent with the other fields, but I could not control it (it still shows up with an empty string even when I removed it from the API call).
 
Another major issue I have with the perl API is: how can I add the @fields data? When I put another level of JSON in "headers", it came back as a stringified object when querying with REST.
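For illustration, a minimal sketch of what the nested-headers call looks like, with the inner hash serialized explicitly via core perl's JSON::PP (whether the ES sink will ever unwrap it, rather than index it as a plain string, is exactly the open question):

    use JSON::PP qw(encode_json);

    # Flume event headers are a flat string->string map, so a nested
    # hash has to be serialized; queried back over REST it still
    # appears as an object string rather than structured @fields.
    my $fields = encode_json({ OBS_FIELD_PRIORITY => "DEBUG" });
    my ($result, $response) = $ng_requestor->request('appendBatch', [{
        headers => { "\@timestamp" => 1369769253000, "\@fields" => $fields },
        body    => "hello, this is sent from perl (using FlumeNG)",
    }]);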
 
Shushuai"