HBase >> mail # user >> HBase (BigTable) many to many with students and courses


Em 2012-05-28, 17:50
Re: HBase (BigTable) many to many with students and courses
Depends...
Try looking at a hierarchical model rather than a relational model...

One thing to remember is that joins are expensive in HBase.
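
As one possible reading of the hierarchical suggestion: the course details that a student page needs are copied into the student's own row, so a single row read answers "all courses for this student" without a second lookup. The sketch below models a row as a sorted map of column name to value in plain Java. It does not use the HBase client, and the column names (`info:name`, `course:<id>`) and the sample data are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory stand-in for a single wide student row; not the HBase client API.
class HierarchicalStudentRow {
    // Hypothetical column prefix that marks an embedded course entry.
    static final String COURSE_PREFIX = "course:";

    // Builds a sample row: student attributes plus one column per course,
    // with the course name stored as the value (denormalized on purpose).
    static TreeMap<String, String> sampleRow() {
        TreeMap<String, String> row = new TreeMap<>();
        row.put("info:name", "Alice");
        row.put(COURSE_PREFIX + "c101", "Databases");
        row.put(COURSE_PREFIX + "c205", "Distributed Systems");
        return row;
    }

    // Returns the course names held in the row, in column order.
    static List<String> courseNames(TreeMap<String, String> row) {
        List<String> names = new ArrayList<>();
        // Columns are sorted, so all course columns form one contiguous range.
        for (Map.Entry<String, String> e : row.tailMap(COURSE_PREFIX).entrySet()) {
            if (!e.getKey().startsWith(COURSE_PREFIX)) {
                break; // left the course range
            }
            names.add(e.getValue());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(courseNames(sampleRow()));
    }
}
```

The trade-off in a layout like this is on the write side: if a course is renamed, every student row that embeds that course has to be rewritten.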

Sent from a remote device. Please excuse any typos...

Mike Segel

On May 28, 2012, at 12:50 PM, Em <[EMAIL PROTECTED]> wrote:

> Hello list,
>
> I have some time now to try out HBase and want to use it for a private
> project.
>
> Questions like "How do I transfer one-to-many or many-to-many relations
> from my RDBMS's schema to HBase?" seem to be common.
>
> I hope we can collect all the best practices that are out there in this
> thread.
>
> As the wiki states:
> One should create two tables.
> One for students, another for courses.
>
> Within the students' table, one should add one column per selected
> course with the course_id besides some columns for the student itself
> (name, birthday, sex etc.).
>
> On the other hand one fills the courses table with one column per
> student_id besides some columns which describe the course itself (name,
> teacher, begin, end, year, location etc.).
>
> So far, so good.
>
> How do I access these tables efficiently?
>
> A common case would be to show all courses per student.
>
> To do so, one has to read the students table and fetch all of the
> student's course columns.
> Let's say their names are prefixed ids. One has to remove the prefix and
> then read the courses table to get all the courses and their
> metadata (name, teacher, location etc.).
>
> How do I do this kind of operation efficiently?
> The naive, brute-force approach seems to be to use one Get object per
> course and fetch the necessary data.
> Another approach seems to be to use the HTable class and unleash the
> power of "multigets" via its batch() method.
>
> All of the above is theoretical, since I have not used it in code yet
> (I am currently learning more about the fundamentals of HBase).
>
> That's why I put the question to you: How do you do this kind of
> operation using HBase?
>
> Kind regards,
> Em
>
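
The access pattern asked about above (read the student row, strip the column prefix, then fetch the course rows either one by one or in a single batch) can be sketched roughly as follows. This is an in-memory model in plain Java, not the HBase client: the maps stand in for the two tables, `courseTableRequests` is a hypothetical counter for simulated requests, and all names and sample data are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory stand-in for the two-table layout; not the HBase client API.
class TwoTableLookup {
    static final String COURSE_PREFIX = "course:"; // hypothetical column prefix

    // Row key -> (column name -> value); each map plays the role of a table.
    final Map<String, TreeMap<String, String>> students = new HashMap<>();
    final Map<String, TreeMap<String, String>> courses = new HashMap<>();
    int courseTableRequests = 0; // simulated requests to the courses table

    // Step 1: read the student row and strip the prefix to get course ids.
    List<String> courseIdsOf(String studentId) {
        List<String> ids = new ArrayList<>();
        TreeMap<String, String> row = students.get(studentId);
        if (row == null) {
            return ids;
        }
        for (String column : row.keySet()) {
            if (column.startsWith(COURSE_PREFIX)) {
                ids.add(column.substring(COURSE_PREFIX.length()));
            }
        }
        return ids;
    }

    // Naive step 2: one simulated request per course id.
    List<TreeMap<String, String>> getOneByOne(List<String> ids) {
        List<TreeMap<String, String>> rows = new ArrayList<>();
        for (String id : ids) {
            courseTableRequests++;
            rows.add(courses.get(id));
        }
        return rows;
    }

    // Batched step 2: all course ids travel in one simulated request.
    List<TreeMap<String, String>> getBatched(List<String> ids) {
        courseTableRequests++;
        List<TreeMap<String, String>> rows = new ArrayList<>();
        for (String id : ids) {
            rows.add(courses.get(id));
        }
        return rows;
    }

    // Builds a small sample data set with one student and two courses.
    static TwoTableLookup sample() {
        TwoTableLookup db = new TwoTableLookup();
        TreeMap<String, String> student = new TreeMap<>();
        student.put("info:name", "Alice");
        student.put(COURSE_PREFIX + "c101", "");
        student.put(COURSE_PREFIX + "c205", "");
        db.students.put("s1", student);
        for (String[] c : new String[][] {{"c101", "Databases"}, {"c205", "Distributed Systems"}}) {
            TreeMap<String, String> course = new TreeMap<>();
            course.put("info:name", c[1]);
            db.courses.put(c[0], course);
        }
        return db;
    }

    public static void main(String[] args) {
        TwoTableLookup db = sample();
        List<String> ids = db.courseIdsOf("s1");
        System.out.println(db.getBatched(ids) + " in " + db.courseTableRequests + " request(s)");
    }
}
```

Both variants return the same rows; the difference is only in the number of round trips. In a real cluster a batch is typically split per region server, so the saving is fewer round trips rather than a guaranteed single call.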
shashwat shriparv 2012-05-29, 10:49
Em 2012-05-29, 14:28
Ian Varley 2012-05-29, 15:08
Em 2012-05-29, 15:54
N Keywal 2012-05-29, 16:19
Ian Varley 2012-05-29, 17:49
Em 2012-05-29, 18:24
Ian Varley 2012-05-29, 19:26
Em 2012-05-29, 20:25
Ian Varley 2012-05-29, 23:30