Flume, mail # user - send the whole logs


Re: send the whole logs
Jeff Lord 2013-10-12, 06:05
If you use trunk and set the keepFields property to true, then the
timestamp and hostname are now preserved in the body of the event.

https://github.com/apache/flume/blob/trunk/flume-ng-doc/sphinx/FlumeUserGuide.rst#syslog-sources
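
For reference, a minimal flume.conf sketch of the setup discussed in this thread. It assumes a Flume build that includes keepFields on the syslog sources. The agent and sink names (agent, s3Sink) come from the thread; the source name, channel name, bind address, port, and HDFS path are placeholders made up for illustration, so adjust them to your environment.

```properties
# Placeholder names: syslogSrc, memCh, 0.0.0.0, port 5140, and /flume/syslog are illustrative only.
agent.sources = syslogSrc
agent.channels = memCh
agent.sinks = s3Sink

# Syslog UDP source; keepFields = true leaves the timestamp and hostname in the event body.
agent.sources.syslogSrc.type = syslogudp
agent.sources.syslogSrc.host = 0.0.0.0
agent.sources.syslogSrc.port = 5140
agent.sources.syslogSrc.channels = memCh
agent.sources.syslogSrc.keepFields = true

agent.channels.memCh.type = memory

# HDFS sink writing plain text; DataStream is assumed here so the serializer output is written as-is.
agent.sinks.s3Sink.type = hdfs
agent.sinks.s3Sink.channel = memCh
agent.sinks.s3Sink.hdfs.path = /flume/syslog
agent.sinks.s3Sink.hdfs.fileType = DataStream
# With keepFields the body already carries the original fields, so the text serializer should be enough.
agent.sinks.s3Sink.serializer = text
# Without keepFields, HEADER_AND_TEXT writes the parsed headers (epoch timestamp, host, ...) before the body:
# agent.sinks.s3Sink.serializer = HEADER_AND_TEXT
```

Note that HEADER_AND_TEXT prints the header map (for example timestamp=1381304766000), not the original "Oct  9 07:46:06 flume" prefix, so keepFields is probably the closer fit if the goal is the unmodified log line.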
On Fri, Oct 11, 2013 at 7:29 AM, David Sinclair <
[EMAIL PROTECTED]> wrote:

> Ah, now I understand. The syslog source is currently parsing out some of
> the fields and putting them in as headers, e.g. facility, severity,
> timestamp, hostname.
>
> If you want to have it output in the original format, you can implement
> an EventSerializer. You could take a look at SyslogAvroEventSerializer to
> see how it deals with the syslog Flume event. Or implement your
> own SyslogUDPSource that puts the entire message in the Flume event's body.
>
> dave
>
>
> On Fri, Oct 11, 2013 at 3:36 AM, Martinus m <[EMAIL PROTECTED]> wrote:
>
>> Hi David,
>>
>> Actually, the requirement is that I need to send the whole log message in
>> its original form, before someone else runs MR on it. Are there any other
>> options in the Flume configuration that would let me do this?
>>
>> Thanks.
>>
>> Martinus
>>
>>
>> On Wed, Oct 9, 2013 at 8:53 PM, David Sinclair <
>> [EMAIL PROTECTED]> wrote:
>>
>>> That is the original timestamp, just in milliseconds since the epoch, not
>>> formatted as a string. Could you parse that in MR to a date if you need to
>>> manipulate it as such?
>>>
>>>
>>> On Wed, Oct 9, 2013 at 3:52 AM, Martinus m <[EMAIL PROTECTED]> wrote:
>>>
>>>> Hi Hari,
>>>>
>>>> Thanks, it worked, but its timestamp information doesn't look like
>>>> the original one:
>>>>
>>>> {timestamp=1381304766000, host=flume, Severity=6, Facility=3}
>>>>
>>>> The original looks like this:
>>>>
>>>> Oct  9 07:46:06 flume
>>>>
>>>> Is there any other configuration that I am missing to make this header
>>>> look just like the original message?
>>>>
>>>> Thanks,
>>>>
>>>> Martinus
>>>>
>>>>
>>>> On Wed, Oct 9, 2013 at 3:29 PM, Hari Shreedharan <
>>>> [EMAIL PROTECTED]> wrote:
>>>>
>>>>> The text serializer does not write the headers; try HEADER_AND_TEXT.
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Hari
>>>>>
>>>>> On Tuesday, October 8, 2013 at 11:26 PM, Martinus m wrote:
>>>>>
>>>>> Hi Hari,
>>>>>
>>>>> I tried adding the serializer below to my flume.conf:
>>>>>
>>>>> agent.sinks.s3Sink.serializer = text
>>>>>
>>>>> And it still doesn't have the timestamp (date) info from the original
>>>>> log message:
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Martinus
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Oct 9, 2013 at 1:45 PM, Hari Shreedharan <
>>>>> [EMAIL PROTECTED]> wrote:
>>>>>
>>>>>  The timestamp is in the event header. You would need to use a
>>>>> serializer which also writes out the headers.
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Hari
>>>>>
>>>>> On Tuesday, October 8, 2013 at 7:35 PM, Martinus m wrote:
>>>>>
>>>>> Hi David,
>>>>>
>>>>> I'm using the Syslog UDP source, and each syslog message has its own
>>>>> timestamp. I also use the HDFS sink, but when I looked at the resulting
>>>>> messages in the HDFS folder, they didn't have the timestamp (date) info.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Martinus
>>>>>
>>>>>
>>>>> On Tue, Oct 8, 2013 at 9:01 PM, David Sinclair <
>>>>> [EMAIL PROTECTED]> wrote:
>>>>>
>>>>> Martinus,
>>>>>
>>>>> Can you give a few more details? It sounds like you want to use the
>>>>> Spooling Directory Source,
>>>>> http://flume.apache.org/FlumeUserGuide.html#spooling-directory-source,
>>>>> but if you can be clearer about your requirements, I may be able to help
>>>>> you better.
>>>>>
>>>>> dave
>>>>>
>>>>>
>>>>> On Tue, Oct 8, 2013 at 6:45 AM, Martinus m <[EMAIL PROTECTED]> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> What configuration should I put in flume.conf to get the whole log
>>>>> message written to the sink?
>>>>>
>>>>> Thanks.
>>>>>
>>>>> Martinus