HBase, mail # user - Solr & HBase - Re: How is Data Indexed in HBase?


Re: Solr & HBase - Re: How is Data Indexed in HBase?
Bing Li 2012-02-23, 19:44
Dear Mr Gupta,

Your understanding of my solution is correct. Both HBase and Solr are now
used in my system. I hope it will work.
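
A minimal sketch of the split confirmed above: keyword lookup goes to the search index, and the frequently updated rank is fetched from a separate key-value store at query time. Both classes are in-memory stand-ins invented for illustration. `KeywordIndex`, `RankStore`, and `search_ranked` are hypothetical names, and no real Solr or HBase client API is used here.

```python
from collections import defaultdict


class KeywordIndex:
    """Assumed stand-in for the Solr side: an append-only inverted index."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, doc_id, text):
        # Documents are only appended, never updated in place.
        for term in text.lower().split():
            self._postings[term].add(doc_id)

    def search(self, keyword):
        # Return a copy so callers cannot mutate the postings.
        return set(self._postings.get(keyword.lower(), set()))


class RankStore:
    """Assumed stand-in for the HBase side: row key -> current rank."""

    def __init__(self):
        self._rows = {}

    def put_rank(self, doc_id, rank):
        # Ranks change all the time; each put overwrites the old value.
        self._rows[doc_id] = rank

    def get_rank(self, doc_id, default=0.0):
        return self._rows.get(doc_id, default)


def search_ranked(index, ranks, keyword):
    """Find hits by keyword, then order them by the rank looked up per hit."""
    hits = index.search(keyword)
    # Highest rank first; ties broken by document id for a stable order.
    return sorted(hits, key=lambda doc_id: (-ranks.get_rank(doc_id), doc_id))


index = KeywordIndex()
ranks = RankStore()
index.add("doc1", "HBase stores rows by key")
index.add("doc2", "Solr builds an inverted index")
index.add("doc3", "HBase and Solr together")
ranks.put_rank("doc1", 0.2)
ranks.put_rank("doc3", 0.9)

print(search_ranked(index, ranks, "hbase"))
ranks.put_rank("doc1", 1.5)  # rank update touches only the rank store
print(search_ranked(index, ranks, "hbase"))
```

Updating a rank never touches the index in this sketch, which is the point of keeping the two concerns apart.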

Thanks so much for your reply!

Best regards,
Bing

On Fri, Feb 24, 2012 at 3:30 AM, T Vinod Gupta <[EMAIL PROTECTED]> wrote:

> regarding your question on hbase support for high performance and
> consistency - i would say hbase is highly scalable and performant. how it
> does what it does can be understood by reading relevant chapters around
> architecture and design in the hbase book.
>
> with regard to ranking, i see your problem. but if you split the problem
> into an hbase-specific part and a solr-based part, you can probably achieve
> the results. maybe you do the ranking and store the rank in hbase, then use
> solr to get the results and use hbase as a lookup to get the rank. or you
> can put the rank in the document schema and index the rank too, for range
> queries and such. is my understanding of your scenario wrong?
>
> thanks
>
>
> On Wed, Feb 22, 2012 at 9:51 AM, Bing Li <[EMAIL PROTECTED]> wrote:
>
>> Mr Gupta,
>>
>> Thanks so much for your reply!
>>
>> Retrieving data by keyword is one of my use cases. I think Solr is a
>> proper choice for that.
>>
>> However, Solr's support for ranking is not flexible enough for my needs,
>> and Solr is also not well suited to frequent updates. So it is difficult
>> to retrieve data randomly based on values other than keyword frequency in
>> the text. For that case, I am attempting to use HBase.
>>
>> But I don't know how HBase supports high performance when it needs to keep
>> consistency in a large-scale distributed system.
>>
>> Now both of them are used in my system.
>>
>> I will check out ElasticSearch.
>>
>> Best regards,
>> Bing
>>
>>
>> On Thu, Feb 23, 2012 at 1:35 AM, T Vinod Gupta <[EMAIL PROTECTED]> wrote:
>>
>>> Bing,
>>> It's a classic battle on whether to use solr or hbase or a combination of
>>> both. both systems are very different but there is some overlap in the
>>> utility. they also differ vastly when it comes to computation power,
>>> storage needs, etc. so in the end, it all boils down to your use case. you
>>> need to pick the technology that is best suited to your needs.
>>> i'm still not clear on your use case though.
>>>
>>> btw, if you haven't started using solr yet - then you might want to
>>> check out ElasticSearch. I spent over a week comparing solr and ES
>>> and eventually chose ES on its merits.
>>>
>>> thanks
>>>
>>>
>>> On Wed, Feb 22, 2012 at 9:31 AM, Ted Yu <[EMAIL PROTECTED]> wrote:
>>>
>>>> There is no secondary index support in HBase at the moment.
>>>>
>>>> It's on our road map.
>>>>
>>>> FYI
>>>>
>>>> On Wed, Feb 22, 2012 at 9:28 AM, Bing Li <[EMAIL PROTECTED]> wrote:
>>>>
>>>> > Jacques,
>>>> >
>>>> > Yes. But I still have questions about that.
>>>> >
>>>> > In my system, when users search with an arbitrary keyword, the query is
>>>> > forwarded to Solr. No update operations other than appending new
>>>> > indexes are performed on the Solr-managed data.
>>>> >
>>>> > When I need to retrieve data based on ranking values, HBase is used.
>>>> > And, the ranking values need to be updated all the time.
>>>> >
>>>> > Is that correct?
>>>> >
>>>> > My question is that performance must be low if consistency is kept in a
>>>> > large-scale distributed environment. How does HBase handle this issue?
>>>> >
>>>> > Thanks so much!
>>>> >
>>>> > Bing
>>>> >
>>>> >
>>>> > On Thu, Feb 23, 2012 at 1:17 AM, Jacques <[EMAIL PROTECTED]> wrote:
>>>> >
>>>> > > It is highly unlikely that you could replace Solr with HBase. They're
>>>> > > really apples and oranges.
>>>> > >
>>>> > >
>>>> > > On Wed, Feb 22, 2012 at 1:09 AM, Bing Li <[EMAIL PROTECTED]> wrote:
>>>> > >
>>>> > >> Dear all,
>>>> > >>
>>>> > >> I wonder how data in HBase is indexed. Solr is currently used in my
>>>> > >> system because it manages data in an inverted index. Such an index is
>>>> > >> suitable for retrieving huge amounts of unstructured data. How does HBase deal