HBase >> mail # user >> Rows vs. Columns


Michael Segel 2012-03-20, 09:44
Re: Rows vs. Columns
As the advice says, millions of columns are not a good idea. If your
user information will be sparse, e.g. only a few hundred users will associate
with a particular row, you'll be fine. However, if your matrix is complete,
you probably need to store the properties as rows. Also, you should check out
the advice (a JIRA issue covers this) about frequent flushes when using column
families of substantially different sizes, which applies if the blob is large
and the info is small.
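
To make the two layouts concrete, here is a minimal sketch in Python. It does not use the HBase client API; it only models cells as plain tuples so the difference in row keys is visible. The "/" separator, the "v" qualifier, the file ID "file42" and the sample property values are all made up for illustration.

```python
# Illustrative only: models HBase cells as plain tuples, not the HBase client API.
from typing import Dict, List, Tuple

Cell = Tuple[str, str, str, str]  # (row key, column family, qualifier, value)


def wide_cells(file_id: str, props: Dict[str, str]) -> List[Cell]:
    """Columns layout: one row per file, one 'info' column per user property."""
    return [(file_id, "info", name, value) for name, value in sorted(props.items())]


def tall_cells(file_id: str, props: Dict[str, str], sep: str = "/") -> List[Cell]:
    """Rows layout: one row per property, with the property name in the row key.

    The separator and the fixed qualifier "v" are assumptions of this sketch.
    """
    return [
        (f"{file_id}{sep}{name}", "info", "v", value)
        for name, value in sorted(props.items())
    ]


# Hypothetical sample data.
props = {"author": "K.", "lang": "pl"}
wide = wide_cells("file42", props)
tall = tall_cells("file42", props)
```

In the columns layout every property of a file sits in the same row, so one read by row key returns them all. In the rows layout the properties are spread over adjacent rows that share the "file42/" prefix, so they would be read back with a scan over that prefix.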
On Mar 19, 2012 1:07 PM, "Konrad Tendera" <[EMAIL PROTECTED]> wrote:

> Hello,
>
> I'm designing a schema for my use case and I'm considering which will be
> better: rows or columns. Here's what I need - my schema actually looks like
> this (it will be used for storing small PDF files or single pages of
> a larger document):
> table files:
>    family "info":
>        "info:pg" - keeps page number
>        "info:id" - sender ID
>        "info:nm" - pdf name
>        ***
>    family "data":
>        "data:blob" - blob of pdf file
>
> Now let's get back to ***: each user can add multiple additional
> properties ("name" - "value"), but let's assume that every user will be so
> creative that no two names will be the same. I don't know how to solve this
> problem: should each "name" be a new column ("info:name"), or should I try to
> do it as described here: http://hbase.apache.org/book.html#schema.smackdown.rowscols
> and make a new row for each property?
>
> K.
>