Take a look at these two settings: topology.spout.wait.strategy and topology.sleep.spout.wait.strategy.time.ms
-roshan
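
For illustration, the two settings named above could be set programmatically along these lines. This is only a sketch: it uses a plain map instead of Storm's Config class (which is itself a HashMap<String, Object>), the class name SpoutWaitConfigSketch and the 10 ms value are made up for the example, and the strategy class name assumes the sleep-based strategy shipped with Storm 1.x.

```java
import java.util.HashMap;
import java.util.Map;

class SpoutWaitConfigSketch {
    // Builds topology-level settings as a plain map. Storm's own Config class
    // is a HashMap<String, Object>, so the same put(...) calls would apply to it.
    static Map<String, Object> spoutWaitSettings(int sleepMs) {
        Map<String, Object> conf = new HashMap<>();
        // Assumed class name: the sleep-based wait strategy from Storm 1.x.
        conf.put("topology.spout.wait.strategy",
                 "org.apache.storm.spout.SleepSpoutWaitStrategy");
        // How long that strategy sleeps each time it is invoked, in milliseconds.
        conf.put("topology.sleep.spout.wait.strategy.time.ms", sleepMs);
        return conf;
    }

    public static void main(String[] args) {
        // 10 ms is an arbitrary illustrative value, not a recommendation.
        System.out.println(spoutWaitSettings(10));
    }
}
```

The same two keys can also go in the topology's YAML configuration; which value suits a given topology depends on how much idle latency is acceptable.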

From: "Ramin Farajollah (BLOOMBERG/ 731 LEX)" <[EMAIL PROTECTED]>
Reply-To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>, Ramin Farajollah <[EMAIL PROTECTED]>
Date: Tuesday, July 18, 2017 at 2:27 PM
To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
Subject: Re: blocking ISpout::nextTuple()

Thank you for your replies.

Cool. We've switched to non-blocking.
How does one avoid nextTuple() being called too fast (and potentially using up too much CPU)?

We have seen sleeps (in test spouts) and one colleague suggested a wait strategy. Which should we do, if any?
From: [EMAIL PROTECTED]
Subject: Re: blocking ISpout::nextTuple()
Avoid blocking inside nextTuple(). Use a non-blocking call (or a blocking call with a timeout) to check the external source for new data. Since the spouts (doing I/O) are already running asynchronously from the bolts (doing processing) … I personally prefer to keep the spout single-threaded if the source provides either a non-blocking or blocking-with-wait method of reading.

Blocking too long in nextTuple() can delay the spout's processing of acks (if acking is enabled) and its generation of updated metrics.
-roshan
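
A rough sketch of the "blocking call with timeout" option described above, in plain Java without Storm on the classpath. BoundedWaitSpoutSketch, its Consumer-based emitter, and the boolean return value are all hypothetical stand-ins; a real spout would implement nextTuple() as a void method and emit through its SpoutOutputCollector. The source is assumed to be a java.util.concurrent.BlockingQueue.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

class BoundedWaitSpoutSketch {
    private final BlockingQueue<String> source;   // assumed external source
    private final Consumer<String> emitter;       // stand-in for collector.emit(...)
    private final long maxWaitMs;                 // upper bound on time spent waiting

    BoundedWaitSpoutSketch(BlockingQueue<String> source,
                           Consumer<String> emitter,
                           long maxWaitMs) {
        this.source = source;
        this.emitter = emitter;
        this.maxWaitMs = maxWaitMs;
    }

    // Waits at most maxWaitMs for a message, so the calling thread is
    // released quickly and can go back to handling acks and metrics.
    // Returns true if a message was emitted.
    boolean nextTuple() {
        try {
            String msg = source.poll(maxWaitMs, TimeUnit.MILLISECONDS);
            if (msg == null) {
                return false;  // timed out, nothing to emit this round
            }
            emitter.accept(msg);
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();  // preserve the interrupt flag
            return false;
        }
    }
}
```

A short timeout also doubles as a throttle: when the source is idle, each call costs at most maxWaitMs of waiting rather than a tight spin.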
From: "Ramin Farajollah (BLOOMBERG/ 731 LEX)" <[EMAIL PROTECTED]>
Reply-To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>, Ramin Farajollah <[EMAIL PROTECTED]>
Date: Tuesday, July 18, 2017 at 9:10 AM
To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
Subject: Re: blocking ISpout::nextTuple()

Agree on populating the queue on another thread.
When you say "try to pop the next message", do you mean that it should block indefinitely?

My colleagues suggest that it should not. Instead, a wait strategy should be configured in the YAML:
https://storm.apache.org/releases/1.0.3/javadocs/org/apache/storm/spout/ISpoutWaitStrategy.html
From: [EMAIL PROTECTED]
Subject: Re: blocking ISpout::nextTuple()
I believe the docs around spouts say nextTuple() should never block.  In the spout implementations I've done, I typically spin off a separate work thread in the spout's open() method and push tuples into a shared concurrent queue.  nextTuple() simply tries to pop the next message off of that shared queue.
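
The pattern described above might look roughly like the following, written as plain Java so it runs without Storm. QueueBackedSpoutSketch, awaitReader(), the Consumer-based emitter, and the boolean return from nextTuple() are hypothetical conveniences for the example; in an actual spout, open() and nextTuple() have the signatures defined by ISpout and tuples go out through the collector.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.function.Consumer;

class QueueBackedSpoutSketch {
    // Shared queue: filled by the worker thread, drained by nextTuple().
    private final ConcurrentLinkedQueue<String> pending = new ConcurrentLinkedQueue<>();
    private final Consumer<String> emitter;  // stand-in for collector.emit(...)
    private Thread reader;

    QueueBackedSpoutSketch(Consumer<String> emitter) {
        this.emitter = emitter;
    }

    // Plays the role of open(): start a worker thread that reads the
    // (possibly blocking) source and pushes messages into the shared queue.
    void open(Iterable<String> source) {
        reader = new Thread(() -> {
            for (String msg : source) {
                pending.offer(msg);
            }
        }, "source-reader");
        reader.setDaemon(true);
        reader.start();
    }

    // Plays the role of nextTuple(): poll() returns null immediately when
    // the queue is empty, so this never blocks. Returns true if it emitted.
    boolean nextTuple() {
        String msg = pending.poll();
        if (msg == null) {
            return false;
        }
        emitter.accept(msg);
        return true;
    }

    // Test helper only: wait until the worker has consumed the whole source.
    void awaitReader() {
        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

One thing this sketch leaves out is a bound on the queue: if the source outpaces the topology, an unbounded queue grows without limit, so a bounded queue may be a safer choice in practice.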

On Tue, Jul 18, 2017 at 8:32 AM, Ramin Farajollah (BLOOMBERG/ 731 LEX) <[EMAIL PROTECTED]> wrote:
Hi,

Should a spout block on its source to produce a tuple?

For example, should it read its source from a blocking queue before emitting?
In this case, the spout will block indefinitely before it can emit a tuple.

I read about how ack and fail are processed on the same thread.

1) Will blocking in the spout's nextTuple() result in blocking the downstream bolts?
2) How about when a messageId is not specified?
3) How about when a tuple traverses multiple JVM instances on the same machine or across machines?

Kind regards
<< "A mind is like a parachute. It doesn't work if it is not open" Frank Zap >>