It is absolutely fine to have as many tables as we like. My point
was that a large number of tables might add some overhead in
locating the user region, as there would be a large number of
mappings from "user tables" to "region servers". The client would
also have to cache more of this location information, which takes up
additional memory. So I suggested having a small number of large
tables rather than a large number of small tables, if the data is
similar.
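
To make the single-table option concrete, here is a rough sketch of a
row key that carries the source, as Eric described. It uses plain
Python bytes and no HBase client at all. The function names, the
3-digit zero-padded source id, and the "|" separator are all
assumptions made up for illustration, not anything HBase prescribes.

```python
from typing import Tuple


def make_row_key(source_id: int, record_id: str) -> bytes:
    # Hypothetical layout: fixed-width source id, a separator, then the record id.
    # The fixed width keeps all rows of one source contiguous in byte order.
    return f"{source_id:03d}|{record_id}".encode("utf-8")


def source_scan_range(source_id: int) -> Tuple[bytes, bytes]:
    # Start row is the source prefix itself (inclusive).
    start = f"{source_id:03d}|".encode("utf-8")
    # Stop row (exclusive) is the prefix with its last byte incremented.
    stop = start[:-1] + bytes([start[-1] + 1])
    return start, stop


# Simulate the sorted key order of a table holding rows from two sources.
keys = sorted(
    make_row_key(s, r) for s, r in [(12, "a"), (3, "z"), (3, "b"), (12, "c")]
)

# Select only the rows that belong to source 3, as a range scan would.
start, stop = source_scan_range(3)
rows_for_source_3 = [k for k in keys if start <= k < stop]
```

With a key like this, reading one source becomes a scan between the
start and stop rows of the one large table, instead of a lookup in a
separate table per source.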
On Tue, Aug 7, 2012 at 5:30 PM, Eric Czech <[EMAIL PROTECTED]> wrote:
> Thanks Mohammad,
> By saying the major purpose is to host very large tables (implying a
> smaller number of them), are you referring to anything other than the
> memstores per column family taking up sizable portions of physical memory?
> Are there other components or design aspects that make using large numbers
> of tables inadvisable?
> On Sun, Aug 5, 2012 at 5:55 PM, Mohammad Tariq <[EMAIL PROTECTED]> wrote:
>> Hello sir,
>> Going for a single table for all 30+ sources would be a better choice,
>> if the data from all the sources is not very different. Since you are
>> considering HBase as your data store, it wouldn't be wise to have
>> several small tables. The major purpose of HBase is to host very large
>> tables that may go beyond billions of rows and millions of columns.
>> Mohammad Tariq
>> On Mon, Aug 6, 2012 at 3:18 AM, Eric Czech <[EMAIL PROTECTED]> wrote:
>>> I need to support data that comes from 30+ sources and the structure
>>> of that data is consistent across all the sources, but what I'm not
>>> clear on is whether or not I should use 30+ tables with roughly the
>>> same format or 1 table where the row key reflects the source.
>>> Anybody have a strong argument one way or the other?