Flume >> mail # user >> Flume startup takes ~ hour


Re: Flume startup takes ~ hour
Hari,

Maybe you can just send me the Java source for both classes?

Thanks
Anat
On Wed, Sep 25, 2013 at 9:29 AM, Anat Rozenzon <[EMAIL PROTECTED]> wrote:

> OK, I understand.
>
> I can't apply the patch; I get a "format failed" error, and I'm not sure why.
> Is this a diff from trunk or from some local version? I see some changes
> with no matching lines in the code.
>
> Many thanks
> Anat
>
>
> On Tue, Sep 24, 2013 at 9:15 PM, Hari Shreedharan <
> [EMAIL PROTECTED]> wrote:
>
>> That is actually a symptom of the real problem. The real problem is that
>> the remove method ends up hitting the main checkpoint data structure and
>> causes too many ops on the hash map. The real fix is in the patch I
>> mentioned, which reduces the number of ops tremendously.
>>
>>
>> Thanks,
>> Hari
>>
>> On Tuesday, September 24, 2013 at 6:12 AM, Anat Rozenzon wrote:
>>
>> For example this stack trace:
>>
>> "lifecycleSupervisor-1-2" prio=10 tid=0x00007f89141d8800 nid=0x5ac8 runnable [0x00007f89501ad000]
>>    java.lang.Thread.State: RUNNABLE
>>         at java.lang.Integer.valueOf(Integer.java:642)
>>         at org.apache.flume.channel.file.EventQueueBackingStoreFile.get(EventQueueBackingStoreFile.java:310)
>>         at org.apache.flume.channel.file.FlumeEventQueue.get(FlumeEventQueue.java:225)
>>         at org.apache.flume.channel.file.FlumeEventQueue.remove(FlumeEventQueue.java:195)
>>         - locked <0x00000006890f68f0> (a org.apache.flume.channel.file.FlumeEventQueue)
>>         at org.apache.flume.channel.file.ReplayHandler.processCommit(ReplayHandler.java:405)
>>         at org.apache.flume.channel.file.ReplayHandler.replayLog(ReplayHandler.java:328)
>>         at org.apache.flume.channel.file.Log.doReplay(Log.java:503)
>>         at org.apache.flume.channel.file.Log.replay(Log.java:430)
>>         at org.apache.flume.channel.file.FileChannel.start(FileChannel.java:302)
>>         - locked <0x00000006890ea360> (a org.apache.flume.channel.file.FileChannel)
>>         at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:251)
>>         - locked <0x00000006890ea360> (a org.apache.flume.channel.file.FileChannel)
>>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>         at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
>>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
>>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
>>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>         at java.lang.Thread.run(Thread.java:724)
>>
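[Editor's note: the top frame in the trace above, Integer.valueOf, points at autoboxing cost. The JLS only guarantees the Integer cache for values in -128..127, so map operations keyed by larger int values allocate a fresh Integer on every call. A tiny illustration, assuming default JVM settings (no extended -XX:AutoBoxCacheMax):]

```java
public class BoxingCacheDemo {
    public static void main(String[] args) {
        // Values in -128..127 come from the Integer cache, so valueOf
        // returns the same object each time.
        Integer a1 = Integer.valueOf(100);
        Integer a2 = Integer.valueOf(100);
        System.out.println(a1 == a2); // true: cached instance

        // Values outside the cache range are freshly allocated on each
        // call (under default settings), which is what makes millions of
        // boxed map ops during replay so expensive.
        Integer b1 = Integer.valueOf(100000);
        Integer b2 = Integer.valueOf(100000);
        System.out.println(b1 == b2); // usually false: two allocations
    }
}
```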
>>
>>
>> On Tue, Sep 24, 2013 at 4:10 PM, Anat Rozenzon <[EMAIL PROTECTED]> wrote:
>>
>> After a deeper dive, it seems the problem is with the HashMap usage
>> in EventQueueBackingStoreFile.
>>
>> Almost every time I run jstack, the JVM is inside
>> EventQueueBackingStoreFile.get(), doing either HashMap.containsKey() or
>> Integer.valueOf().
>> This is because overwriteMap is defined as a regular HashMap<Integer,
>> Long>().
>>
>> Does your fix solve this issue?
>>
>> I think maybe using a Long[] would be better.
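[Editor's note: a minimal sketch of the alternative Anat suggests. The class and method names here are illustrative, not Flume's actual internals: indexing a primitive long[] by queue slot avoids both the hashing work and the Integer.valueOf boxing visible in the stack traces, at the cost of pre-sizing the array and reserving a sentinel for "no entry".]

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class OverwriteMapSketch {
    // Sentinel meaning "no overwrite recorded for this slot".
    static final long EMPTY = Long.MIN_VALUE;

    // Boxed variant: each call autoboxes the int key via Integer.valueOf,
    // then hashes it — the pattern showing up hot in jstack.
    static Long lookupBoxed(Map<Integer, Long> overwriteMap, int slot) {
        return overwriteMap.get(slot);
    }

    // Primitive variant: plain array indexing, no boxing, no hashing.
    static long lookupPrimitive(long[] overwriteArray, int slot) {
        return overwriteArray[slot];
    }

    public static void main(String[] args) {
        Map<Integer, Long> boxed = new HashMap<>();
        long[] primitive = new long[1000];
        Arrays.fill(primitive, EMPTY);

        boxed.put(42, 123L);
        primitive[42] = 123L;

        System.out.println(lookupBoxed(boxed, 42));       // 123
        System.out.println(lookupPrimitive(primitive, 42)); // 123
    }
}
```

The trade-off is memory: the array must be sized to the full queue capacity up front, whereas the HashMap only holds slots that were actually overwritten.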
>>
>>
>> On Tue, Sep 24, 2013 at 2:34 PM, Anat Rozenzon <[EMAIL PROTECTED]> wrote:
>>
>> Thanks Hari, great news, I'll be glad to test it.
>>
>> However, I don't have an environment with trunk; is there any way I can
>> get it packaged somehow?
>>
>>
>> On Mon, Sep 23, 2013 at 8:50 PM, Hari Shreedharan <
>> [EMAIL PROTECTED]> wrote:
>>
>> How many events does the File Channel get every 30 seconds and how many
>> get taken out? This is one of the edge cases of the File Channel I have
>> been working on ironing out. There is a patch on