Blur >> mail # dev >> Optimizing for indexing vs searching


Optimizing for indexing vs searching
Hi,

I want to refresh my understanding, so here are a few hypothetical
situations. Let's say I am ingesting 25,000 docs per second (average size
10k). In some cases I want to optimize for indexing throughput, and in
others I want to optimize for search speed. How do I control these
individually?
My understanding is that having fewer but bigger shards improves search
performance. Is this right?
Also, does each shard correspond to one segment file (ignoring snapshots)?
I am trying to understand what happens when a shard is being searched and
someone writes to that same shard. Would a new segment be created? If so,
how do we control the merging of segments within a shard?
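
For a sense of scale, the ingest rate mentioned above can be worked out
with a quick back-of-the-envelope sketch. It assumes "10k" means 10 KiB
per document (the unit is not stated, so that is an assumption), and the
function and variable names are only illustrative, not part of Blur.

```python
# Rough ingest volume for the scenario in the question.
DOCS_PER_SECOND = 25_000      # figure taken from the question
AVG_DOC_BYTES = 10 * 1024     # assumption: "10k" means 10 KiB per document


def ingest_rate(docs_per_second: int, avg_doc_bytes: int) -> dict:
    """Return the raw (pre-index) input volume in a few convenient units."""
    bytes_per_second = docs_per_second * avg_doc_bytes
    return {
        "bytes_per_second": bytes_per_second,
        "mib_per_second": bytes_per_second / 2**20,
        "tib_per_day": bytes_per_second * 86_400 / 2**40,
    }


rate = ingest_rate(DOCS_PER_SECOND, AVG_DOC_BYTES)
print(f"{rate['mib_per_second']:.1f} MiB/s, {rate['tib_per_day']:.1f} TiB/day")
```

Under that assumption the raw input is roughly 244 MiB per second before
any indexing overhead, which is the volume the shards would have to absorb
in total.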

My apologies if this doesn't make a whole lot of sense.

Thank You.

- Rahul