MapReduce >> mail # user >> Hadoop Real time help


Re: Hadoop Real time help
Lucene lets you build an inverted index, that is, a mapping from content
(terms) to document identifiers. Solr or ElasticSearch lets you scale that
process.
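
To make the idea concrete, here is a minimal sketch in plain Java of what an inverted index does. It is not Lucene's API: the class TinyInvertedIndex and its add/search methods are hypothetical names made up for this illustration, and the tokenization (lowercase, split on non-word characters) is a simplifying assumption. A real engine adds analysis, ranking, and on-disk storage.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical illustration class; this is NOT Lucene's API.
class TinyInvertedIndex {
    // term -> sorted ids of the documents that contain the term
    private final Map<String, SortedSet<Integer>> postings = new HashMap<>();

    // Index time: tokenize once and record which document each term came from.
    void add(int docId, String text) {
        for (String term : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
            }
        }
    }

    // Query time: one hash lookup instead of a scan over every document.
    List<Integer> search(String term) {
        SortedSet<Integer> ids = postings.get(term.toLowerCase(Locale.ROOT));
        return ids == null
                ? Collections.<Integer>emptyList()
                : new ArrayList<Integer>(ids);
    }

    public static void main(String[] args) {
        TinyInvertedIndex index = new TinyInvertedIndex();
        index.add(1, "Hadoop runs batch jobs");
        index.add(2, "Lucene builds an inverted index");
        index.add(3, "Solr scales a Lucene index");
        System.out.println(index.search("lucene")); // prints [2, 3]
        System.out.println(index.search("hbase"));  // prints []
    }
}
```

The cost of reading the text is paid once, when add is called; each search afterwards only touches the entry for the requested term.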

However, if I am reading you correctly, you are saying that you cannot
precompute a structure (such as an index) before the search?

If that's true and you need to process gigabytes of data, then you have to
accept some latency if you cannot hold everything in memory before the
search itself.

I can't be more precise than that. It will depend on your context. One
might ask: why can't you index the content of your database and your files?

Bertrand

On Sun, Aug 19, 2012 at 9:06 PM, mahout user <[EMAIL PROTECTED]> wrote:

> Thanks Mohit and Bertrand,
>
> I am looking into Hadoop for a search engine, as many others are. I know
> Lucene exists for that. In my case, though, I have implemented Java classes
> that search databases as well as CSV files. What I can't understand is how
> I can provide a real-time search service with Hadoop when there are
> gigabytes of data.
>
>
> On Sun, Aug 19, 2012 at 10:06 PM, Mohit Anchlia <[EMAIL PROTECTED]> wrote:
>
>>
>>
>> On Sun, Aug 19, 2012 at 8:44 AM, mahout user <[EMAIL PROTECTED]> wrote:
>>
>>> Hello folks,
>>>
>>>
>>>    I am new to Hadoop. I would like to know how the Hadoop framework is
>>> useful for a real-time service. Can anyone explain?
>>>
>>> Thanks.
>>>
>>
>> Can you specify your use case? Each use case calls for different design
>> considerations.
>>
>
>
--
Bertrand Dechoux