Flume, mail # user - flume agent with HDFS sink, syslog source and memory channel - stuck on hdfs IOException


Re: flume agent with HDFS sink, syslog source and memory channel - stuck on hdfs IOException
Suhas Satish 2014-01-15, 01:42
The patch has been tested and uploaded. It should fix Flume 1.4 and
earlier releases.
https://issues.apache.org/jira/browse/FLUME-1654
Cheers,
Suhas.
On Wed, Oct 16, 2013 at 5:15 PM, Suhas Satish <[EMAIL PROTECTED]> wrote:

> There already exists a JIRA. I have come up with a local fix which works.
> https://issues.apache.org/jira/browse/FLUME-1654
>
> Will be uploading a patch soon.
>
> Cheers,
> Suhas.
>
>
> On Tue, Oct 15, 2013 at 1:15 PM, Roshan Naik <[EMAIL PROTECTED]> wrote:
>
>> Paul,
>>    HDFS sink issue apart... it sounds like this is a setup where Hive is
>> being allowed to read through new files/directories flowing into the
>> partition while the HDFS sink is still writing to it. To my knowledge, a
>> Hive partition is considered immutable and should not be updated once it
>> has been created. So the previous directory should be exposed to Hive only
>> once the HDFS sink has rolled over to the next directory.
>> -roshan
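
One way to make that rollover boundary predictable is to bucket the HDFS sink
path by time and round the event timestamp, so a directory stops receiving new
files once its time bucket (plus the roll interval) has passed and can then be
safely exposed as a Hive partition. A minimal sketch using standard HDFS sink
properties; the agent name, sink name, and values below are assumptions for
illustration, not the configuration discussed in this thread:

  # Bucket output by hour; hdfs.round makes every event in the same
  # hour resolve to the same directory, so the directory stops
  # changing once that hour (plus the roll interval) has elapsed.
  agent.sinks.hdfs-sink.hdfs.path = /flume_import/%Y/%m/%d/%H
  agent.sinks.hdfs-sink.hdfs.round = true
  agent.sinks.hdfs-sink.hdfs.roundValue = 1
  agent.sinks.hdfs-sink.hdfs.roundUnit = hour
  # Close and rename each open file 5 minutes after it is created
  agent.sinks.hdfs-sink.hdfs.rollInterval = 300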
>>
>>
>> On Tue, Oct 15, 2013 at 11:23 AM, Paul Chavez <[EMAIL PROTECTED]> wrote:
>>
>>> I can’t speak for Suhas, but I face a similar issue in production. For
>>> me it occurs when someone queries a .tmp file from Hive or Pig. This causes
>>> the HDFS sink to lose the ability to close and rename the file, and the
>>> sink is then completely out of commission until the agent is restarted.
>>> We’ve mitigated this in our environment through careful Hive partition
>>> coordination, but it still crops up when people run ad-hoc queries they
>>> probably shouldn’t. We are waiting to get the latest CDH into production,
>>> which eliminates the .tmp file issue, but I would still like a more
>>> resilient HDFS sink, so I support development effort in this area.
>>>
>>>
>>>
>>> Thanks,
>>>
>>> Paul Chavez
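
A commonly used mitigation for the in-progress .tmp visibility problem (not
something proposed in this thread) is to give files that are still being
written a name that Hive and MapReduce input formats ignore, via the HDFS
sink's in-use prefix. A minimal sketch; the agent and sink names are
assumptions:

  # Files still being written get a leading underscore, which Hive and
  # MapReduce's FileInputFormat skip; on close the file is renamed
  # without the prefix/suffix.
  agent.sinks.hdfs-sink.hdfs.inUsePrefix = _
  agent.sinks.hdfs-sink.hdfs.inUseSuffix = .tmp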
>>>
>>>
>>>
>>>
>>>
>>> *From:* Roshan Naik [mailto:[EMAIL PROTECTED]]
>>> *Sent:* Tuesday, October 15, 2013 11:14 AM
>>> *To:* [EMAIL PROTECTED]
>>> *Cc:* [EMAIL PROTECTED]; [EMAIL PROTECTED]
>>> *Subject:* Re: flume agent with HDFS sink, syslog source and memory
>>> channel - stuck on hdfs IOException
>>>
>>>
>>>
>>> Sounds like a valid bug. I am curious though... is there a real use
>>> scenario you are facing in production?
>>>
>>>
>>>
>>> On Mon, Oct 14, 2013 at 7:39 PM, Suhas Satish <[EMAIL PROTECTED]>
>>> wrote:
>>>
>>> In summary, although the flume-agent JVM doesn't exit, once an HDFS IO
>>> exception occurs due to a deleted .tmp file, the agent doesn't recover and
>>> stops writing further HDFS sink output generated by the syslog source.
>>>
>>> I found only one JIRA in Apache remotely related to this HDFS sink issue
>>> whose fix we didn't already have. I tested by pulling the FLUME-2007 patch
>>> into flume-1.4.0.
>>>
>>> https://github.com/apache/flume/commit/5b5470bd5d3e94842032009c36788d4ae346674b
>>> https://issues.apache.org/jira/browse/FLUME-2007
>>>
>>> But it doesn't solve this issue.
>>>
>>> Should I open a new jira ticket?
>>>
>>>
>>>
>>> Thanks,
>>> Suhas.
>>>
>>>
>>> On Fri, Oct 11, 2013 at 4:13 PM, Suhas Satish <[EMAIL PROTECTED]> wrote:
>>>
>>>
>>> > Hi, I have the following Flume configuration file, flume-syslog.conf
>>> > (attached; a reconstructed sketch follows the output listing below) -
>>> >
>>> > 1.) I launch it with -
>>> >
>>> > bin/flume-ng agent -n agent -c conf -f conf/flume-syslog.conf
>>> >
>>> > 2.) Generate log output using loggen (provided by syslog-ng):
>>> > loggen -I 30 -s 300 -r 900 localhost 13073
>>> >
>>> > 3.) I verify that Flume output is generated under /flume_import/ on the
>>> > Hadoop cluster.
>>> >
>>> > It generates output of the form -
>>> >
>>> > -rwxr-xr-x   3 root root     139235 2013-10-11 14:35
>>> > /flume_import/2013/10/14/logdata-2013-10-14-35-45.1381527345384.tmp
>>> > -rwxr-xr-x   3 root root     138095 2013-10-11 14:35
>>> > /flume_import/2013/10/14/logdata-2013-10-14-35-46.1381527346543.tmp
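
For reference, a minimal sketch of what flume-syslog.conf likely contains,
reconstructed from the details above (agent name "agent", a syslog TCP source
on the port loggen targets, a memory channel, and an HDFS sink writing under
/flume_import). The property names are standard Flume 1.4 options, but the
component names, escape sequences, and specific values are assumptions:

  # Components on the agent named "agent" (matches -n agent above)
  agent.sources = syslog-src
  agent.channels = mem-ch
  agent.sinks = hdfs-sink

  # Syslog TCP source on the port that loggen sends to
  agent.sources.syslog-src.type = syslogtcp
  agent.sources.syslog-src.host = localhost
  agent.sources.syslog-src.port = 13073
  agent.sources.syslog-src.channels = mem-ch

  # In-memory channel
  agent.channels.mem-ch.type = memory
  agent.channels.mem-ch.capacity = 10000
  agent.channels.mem-ch.transactionCapacity = 1000

  # HDFS sink writing date-bucketed files under /flume_import
  agent.sinks.hdfs-sink.type = hdfs
  agent.sinks.hdfs-sink.channel = mem-ch
  agent.sinks.hdfs-sink.hdfs.path = /flume_import/%Y/%m/%d
  agent.sinks.hdfs-sink.hdfs.filePrefix = logdata
  agent.sinks.hdfs-sink.hdfs.fileType = DataStream
  agent.sinks.hdfs-sink.hdfs.rollInterval = 60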