MapReduce >> mail # user >> Hadoop/Lucene + Solr architecture suggestions?


Re: Hadoop/Lucene + Solr architecture suggestions?
Interestingly, a few MapR customers have gone the other way, deliberately
having the indexer put the Solr shards directly into MapR and letting it
distribute them. That has made index management a cinch.

Otherwise they do run into what Tim alludes to.

On Wed, Oct 10, 2012 at 7:27 PM, Tim Williams <[EMAIL PROTECTED]> wrote:

> On Wed, Oct 10, 2012 at 10:15 PM, Lance Norskog <[EMAIL PROTECTED]> wrote:
> > In the LucidWorks Big Data product, we handle this with a reducer that
> > sends documents to a SolrCloud cluster. This way the index files are not
> > managed by Hadoop.
>
> Hi Lance,
> I'm curious whether you've gotten that to work with a decent-sized cluster
> (e.g. more than 250 nodes)? Even a trivial cluster seems to crush SolrCloud,
> at least the version from a few months ago...
>
> Thanks,
> --tim
>
> > ----- Original Message -----
> > | From: "Ted Dunning" <[EMAIL PROTECTED]>
> > | To: [EMAIL PROTECTED]
> > | Cc: "Hadoop User" <[EMAIL PROTECTED]>
> > | Sent: Wednesday, October 10, 2012 7:58:57 AM
> > | Subject: Re: Hadoop/Lucene + Solr architecture suggestions?
> > |
> > | I prefer to create indexes in the reducer personally.
> > |
> > | Also, you can avoid the copies if you use an advanced Hadoop-derived
> > | distro. Email me off list for details.
> > |
> > | Sent from my iPhone
> > |
> > | On Oct 9, 2012, at 7:47 PM, Mark Kerzner <[EMAIL PROTECTED]>
> > | wrote:
> > |
> > | > Hi,
> > | >
> > | > if I create a Lucene index in each mapper, locally, then copy them
> > | > under /jobid/mapid1, /jobid/mapid2, and then in the reducers
> > | > copy them to some Solr machine (perhaps even merging), does such
> > | > an architecture make sense for creating a searchable index with
> > | > Hadoop?
> > | >
> > | > Are there links for similar architectures and questions?
> > | >
> > | > Thank you. Sincerely,
> > | > Mark
> > |
>
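
For readers following the thread, here is a toy sketch of the reducer-side indexing Ted describes, written in plain Python rather than against the Hadoop or Lucene APIs. Everything in it is an assumption made for illustration: the function names (shard_for, map_phase, shuffle, reduce_phase, search), the CRC32-based sharding, and the dictionary standing in for a Lucene index are all hypothetical and do not come from any message in the thread. The idea it shows is that mappers only route documents to a shard, each reducer builds exactly one index shard, and a query fans out across the shards, so no per-mapper index needs to be copied or merged.

```python
from collections import defaultdict
from zlib import crc32


def shard_for(doc_id, num_shards):
    # Stable hash, so a given document always lands on the same shard.
    return crc32(doc_id.encode("utf-8")) % num_shards


def map_phase(docs, num_shards):
    # Mapper: emit (shard, document) pairs; no index is built here.
    for doc_id, text in docs:
        yield shard_for(doc_id, num_shards), (doc_id, text)


def shuffle(pairs):
    # Stand-in for the framework's shuffle: group values by key.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(shard_docs):
    # Reducer: build one toy inverted index (term -> sorted doc ids).
    index = defaultdict(set)
    for doc_id, text in shard_docs:
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def search(shards, term):
    # Query every shard and combine the hits.
    hits = []
    for index in shards.values():
        hits.extend(index.get(term.lower(), []))
    return sorted(hits)


docs = [
    ("d1", "Hadoop builds the index"),
    ("d2", "Solr serves the index"),
    ("d3", "Lucene inside Solr"),
]
grouped = shuffle(map_phase(docs, num_shards=2))
shards = {shard: reduce_phase(items) for shard, items in grouped.items()}
print(search(shards, "index"))
```

In a real job the reducer would write a Lucene index (or send documents to Solr, as Lance describes) instead of returning a dictionary; the sketch only illustrates how the work is divided between the map and reduce sides.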