Flume >> mail # user >> send the whole logs


Re: send the whole logs
Ah, now I understand. The syslog source is currently parsing out some of
the fields and putting them in as headers, e.g. facility, severity,
timestamp, hostname.

If you want to have it output in the original format, you can implement
an EventSerializer. You could take a look at SyslogAvroEventSerializer to
see how it deals with the syslog Flume event. Or implement your
own SyslogUDPSource that puts the entire message in the Flume event's body.
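
As a rough illustration of the first option, here is a minimal sketch of the
formatting step such a serializer could perform. It is plain Java with no Flume
dependency, so the class and method names (SyslogLineFormatter, format) are
hypothetical. It assumes the "timestamp" (epoch milliseconds) and "host"
headers shown further down this thread, and it assumes you know the time zone
the sending host logged in, since the syslog line itself does not carry one.
It does not rebuild the priority field from Facility/Severity.

```java
import java.nio.charset.StandardCharsets;
import java.time.Instant;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Flume: rebuilds a syslog-style line
// from the headers the syslog source extracted plus the event body.
final class SyslogLineFormatter {

    // Syslog-style timestamp such as "Oct  9 07:46:06";
    // "ppd" pads the day of month with a space to width 2.
    private static final DateTimeFormatter SYSLOG_TIME =
            DateTimeFormatter.ofPattern("MMM ppd HH:mm:ss", Locale.ENGLISH);

    // Assumes headers contain "timestamp" (epoch millis) and "host";
    // a missing timestamp makes Long.parseLong throw NumberFormatException.
    static String format(Map<String, String> headers, byte[] body, ZoneId zone) {
        long millis = Long.parseLong(headers.get("timestamp"));
        String stamp = SYSLOG_TIME.format(Instant.ofEpochMilli(millis).atZone(zone));
        String host = headers.get("host");
        String message = new String(body, StandardCharsets.UTF_8);
        return stamp + " " + host + " " + message;
    }
}
```

Inside a real EventSerializer you would call something like this from
write(Event), passing event.getHeaders() and event.getBody(), and write the
result plus a newline to the output stream. The sink's serializer property
would then be set to the fully qualified class name of your
EventSerializer.Builder implementation.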

dave
On Fri, Oct 11, 2013 at 3:36 AM, Martinus m <[EMAIL PROTECTED]> wrote:

> Hi David,
>
> Actually, the requirement is that I need to send the complete log messages
> in their original form, before someone else runs MR on them. Are there any
> other options in the Flume configuration that would let me do this?
>
> Thanks.
>
> Martinus
>
>
> On Wed, Oct 9, 2013 at 8:53 PM, David Sinclair <
> [EMAIL PROTECTED]> wrote:
>
>> That is the original timestamp; just in milliseconds since epoch, not
>> formatted as a string. Could you parse that in MR to a date if you need to
>> manipulate it as such?
>>
>>
>> On Wed, Oct 9, 2013 at 3:52 AM, Martinus m <[EMAIL PROTECTED]> wrote:
>>
>>> Hi Hari,
>>>
>>> Thanks, it worked, but its timestamp information doesn't look like
>>> the original:
>>>
>>> {timestamp=1381304766000, host=flume, Severity=6, Facility=3}
>>>
>>> The original looks like this:
>>>
>>> Oct  9 07:46:06 flume
>>>
>>> Is there any other configuration I am missing to make this header look
>>> just like the original message?
>>>
>>> Thanks,
>>>
>>> Martinus
>>>
>>>
>>> On Wed, Oct 9, 2013 at 3:29 PM, Hari Shreedharan <
>>> [EMAIL PROTECTED]> wrote:
>>>
>>>>  The text serializer does not write the headers; try HEADER_AND_TEXT.
>>>>
>>>>
>>>> Thanks,
>>>> Hari
>>>>
>>>> On Tuesday, October 8, 2013 at 11:26 PM, Martinus m wrote:
>>>>
>>>> Hi Hari,
>>>>
>>>> I tried adding the serializer below to my flume.conf:
>>>>
>>>> agent.sinks.s3Sink.serializer = text
>>>>
>>>> And it still doesn't have the timestamp (date) info from the original
>>>> log message.
>>>>
>>>> Thanks,
>>>>
>>>> Martinus
>>>>
>>>>
>>>>
>>>> On Wed, Oct 9, 2013 at 1:45 PM, Hari Shreedharan <
>>>> [EMAIL PROTECTED]> wrote:
>>>>
>>>>  The timestamp is in the event header. You would need to use a
>>>> serializer which also writes out the headers.
>>>>
>>>>
>>>> Thanks,
>>>> Hari
>>>>
>>>> On Tuesday, October 8, 2013 at 7:35 PM, Martinus m wrote:
>>>>
>>>> Hi David,
>>>>
>>>> I'm using the Syslog UDP source, and each syslog message carries its own
>>>> timestamp. I also use the HDFS sink, but when I looked at the resulting
>>>> messages in the HDFS folder, they did not have the timestamp (date) info.
>>>>
>>>> Thanks,
>>>>
>>>> Martinus
>>>>
>>>>
>>>> On Tue, Oct 8, 2013 at 9:01 PM, David Sinclair <
>>>> [EMAIL PROTECTED]> wrote:
>>>>
>>>> Martinus,
>>>>
>>>> Can you give a few more details? It sounds like you want to use the
>>>> Spooling Directory Source,
>>>> http://flume.apache.org/FlumeUserGuide.html#spooling-directory-source,
>>>> but if you can be clearer about your requirements, I may be able to help
>>>> you better.
>>>>
>>>> dave
>>>>
>>>>
>>>> On Tue, Oct 8, 2013 at 6:45 AM, Martinus m <[EMAIL PROTECTED]> wrote:
>>>>
>>>> Hi,
>>>>
>>>> What configuration should I put in flume.conf so that the whole log
>>>> message is written to the sink?
>>>>
>>>> Thanks.
>>>>
>>>> Martinus
>>>>