MapReduce >> mail # user >> Hadoop/Lucene + Solr architecture suggestions?


Re: Hadoop/Lucene + Solr architecture suggestions?
That is very interesting, Lance, thank you.

Mark

On Wed, Oct 10, 2012 at 9:15 PM, Lance Norskog <[EMAIL PROTECTED]> wrote:

> In the LucidWorks Big Data product, we handle this with a reducer that
> sends documents to a SolrCloud cluster. This way the index files are not
> managed by Hadoop.
>
> ----- Original Message -----
> | From: "Ted Dunning" <[EMAIL PROTECTED]>
> | To: [EMAIL PROTECTED]
> | Cc: "Hadoop User" <[EMAIL PROTECTED]>
> | Sent: Wednesday, October 10, 2012 7:58:57 AM
> | Subject: Re: Hadoop/Lucene + Solr architecture suggestions?
> |
> | I prefer to create indexes in the reducer personally.
> |
> | Also, you can avoid the copies if you use an advanced Hadoop-derived
> | distro. Email me off list for details.
> |
> | Sent from my iPhone
> |
> | On Oct 9, 2012, at 7:47 PM, Mark Kerzner <[EMAIL PROTECTED]>
> | wrote:
> |
> | > Hi,
> | >
> | > If I create a Lucene index locally in each mapper, then copy them
> | > under /jobid/mapid1, /jobid/mapid2, and then in the reducers
> | > copy them to some Solr machine (perhaps even merging), does such
> | > an architecture make sense for creating a searchable index with
> | > Hadoop?
> | >
> | > Are there links for similar architectures and questions?
> | >
> | > Thank you. Sincerely,
> | > Mark
> |
>
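
For readers following the thread, here is a rough sketch of the reducer-side approach Lance and Ted describe: mappers route documents by key, and each reducer streams its documents in batches to an external search cluster, so no index files are written to or copied around in Hadoop. This is plain Python that only simulates the map, shuffle, and reduce steps. It is not the LucidWorks implementation and it does not use the Hadoop or Solr APIs. The names `map_record`, `reduce_shard`, `run_job`, and the `send_batch` callback (a stand-in for whatever client would post documents to SolrCloud) are all hypothetical, as are the shard count and batch size.

```python
import zlib
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple

Doc = Dict[str, str]
# Hypothetical stand-in for a search-cluster client: receives (shard, batch).
SendBatch = Callable[[int, List[Doc]], None]


def map_record(record: Doc, num_shards: int) -> Tuple[int, Doc]:
    """Map step: pick a shard from a deterministic hash of the document id."""
    shard = zlib.crc32(record["id"].encode("utf-8")) % num_shards
    return shard, record


def shuffle(pairs: Iterable[Tuple[int, Doc]]) -> Dict[int, List[Doc]]:
    """Simulated shuffle: group mapper output by shard key."""
    grouped: Dict[int, List[Doc]] = defaultdict(list)
    for shard, doc in pairs:
        grouped[shard].append(doc)
    return grouped


def reduce_shard(shard: int, docs: Iterable[Doc],
                 send_batch: SendBatch, batch_size: int = 100) -> int:
    """Reduce step: forward documents in batches instead of writing index files."""
    batch: List[Doc] = []
    sent = 0
    for doc in docs:
        batch.append(doc)
        if len(batch) >= batch_size:
            send_batch(shard, batch)
            sent += len(batch)
            batch = []  # start a fresh list so the sent batch is not mutated
    if batch:  # flush the final partial batch
        send_batch(shard, batch)
        sent += len(batch)
    return sent


def run_job(records: Iterable[Doc], num_shards: int,
            send_batch: SendBatch, batch_size: int = 100) -> int:
    """Run map, shuffle, and reduce in-process; return the number of docs sent."""
    grouped = shuffle(map_record(r, num_shards) for r in records)
    return sum(reduce_shard(shard, docs, send_batch, batch_size)
               for shard, docs in sorted(grouped.items()))


if __name__ == "__main__":
    sample = [{"id": f"doc{i}", "text": f"body {i}"} for i in range(5)]
    run_job(sample, num_shards=2,
            send_batch=lambda shard, batch: print(shard, [d["id"] for d in batch]),
            batch_size=2)
```

In a real job the callback would wrap an actual indexing client, and error handling and retries would matter. The point of the sketch is only the design choice discussed in the thread: the search cluster owns the index, and the reducers merely feed it.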