Blur >> mail # dev >> Optimizing for indexing vs searching


Optimizing for indexing vs searching
Hi,

I want to refresh my understanding, so here are a few hypothetical
situations. Let's say I am ingesting 25000 docs per second (average size
10k). In some situations I want to optimize for indexing throughput, and
in others I want to optimize for search speed. How do I control these
individually?
My understanding is that having fewer but bigger shards improves search
performance. Is this right?
Also, does each shard correspond to one segment file (ignoring snapshots)? I
am trying to understand what happens when a shard is being searched and
someone writes to the same shard. Would a new segment be created?
(If so, how do we control the merging of segments within a shard?)
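
For a sense of scale, here is a rough sketch of the raw ingest volume implied
by the numbers above. It assumes "10k" means about 10,000 bytes per document,
which is an assumption since the unit is not stated, and it ignores index
overhead, replication, and compression. The function name is made up for
illustration.

```python
# Back-of-the-envelope ingest volume for the scenario in this message.
DOCS_PER_SECOND = 25_000
AVG_DOC_BYTES = 10_000  # assumption: "10k" read as roughly 10,000 bytes
SECONDS_PER_DAY = 86_400


def ingest_bytes_per_second(docs_per_second: int, avg_doc_bytes: int) -> int:
    """Raw bytes arriving per second, before any indexing overhead."""
    return docs_per_second * avg_doc_bytes


rate = ingest_bytes_per_second(DOCS_PER_SECOND, AVG_DOC_BYTES)
per_day = rate * SECONDS_PER_DAY

print(f"{rate / 1e6:.0f} MB/s raw input")
print(f"{per_day / 1e12:.1f} TB/day raw input")
```

Under that assumption the raw input is on the order of 250 MB/s, which is why
the trade-off between indexing and search tuning matters here.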

My apologies if this doesn't make a whole lot of sense.

Thank You.

- Rahul
