Your question is a good and tough one. I haven't found anything that helps
guide schema design in the NoSQL world. There are general concepts,
but none of them comes close to SQL schema design, in which you can apply
some rules to guide your decisions.
The best presentation I have found about the general concepts in HBase
schema design is
search for Schema Design. From this presentation, you can learn why it is
so difficult to come up with a suggestion for your problem and learn some
best practices to start your own design.
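One widely used practice is to design the row key around the question you want answered, since HBase keeps rows sorted by row key in byte order and a scan over a key prefix is cheap. Below is a minimal sketch in plain Python, with no HBase client involved: a sorted list stands in for a table. The names `make_row_key` and `prefix_scan`, the host-based key layout, and the sample data are all assumptions made for illustration, not something taken from your logs or from the presentation.

```python
import bisect
import struct

MAX_TS = 2**63 - 1  # largest signed 64-bit value, used to reverse timestamps


def make_row_key(host: str, ts_millis: int) -> bytes:
    """Composite key: host, a zero-byte separator, then the reversed timestamp.

    Reversing the timestamp makes newer events sort first within a host.
    """
    reversed_ts = MAX_TS - ts_millis
    return host.encode("utf-8") + b"\x00" + struct.pack(">q", reversed_ts)


def prefix_scan(sorted_keys, prefix: bytes):
    """Return all keys starting with prefix, in stored (sorted) order."""
    start = bisect.bisect_left(sorted_keys, prefix)
    matches = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break  # keys are sorted, so no later key can match
        matches.append(key)
    return matches


# Toy stand-in for a table: row keys kept in byte order (hypothetical data).
keys = sorted(
    make_row_key(host, ts)
    for host, ts in [("web1", 1000), ("web2", 1500), ("web1", 3000), ("web1", 2000)]
)

# "Latest events for host web1" becomes a single prefix scan.
web1 = prefix_scan(keys, b"web1\x00")
newest_first = [MAX_TS - struct.unpack(">q", key[-8:])[0] for key in web1]
```

A question with a different access pattern (say, by user instead of by host) would usually get its own key layout, and often its own table, which is why one logical model rarely serves every question.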
On Thu, Nov 8, 2012 at 10:17 AM, Nick maillard <
[EMAIL PROTECTED]> wrote:
> Thanks for the answers.
> I'm trying to really make sense of NoSQL and HBase in particular. The
> part has a lot of loopholes and I'm still fighting off the compaction
> issue, so right now I would not say HBase is fast when it comes to writing.
> But my post was more about NoSQL schema thoughts. After so long on SQL schemas, it
> does take a little time to stop thinking that way, in terms of schema but
> also
> in terms of questions, or of interaction if you'd rather.
> So, contrary to SQL, I cannot think of a logical model for the data and then figure out
> what I'll want out of it.
> In my case I stated 10 TB, but this is very likely to grow since it is the
> starting scenario. I do believe having a 30-minute latency before
> logs is not an issue; however, the questions to HBase must be answered
> in a real-time manner.
> I have been trying to play with my questions and see how they can fit in a
> rowkey and/or column families, but since they are different in nature and
> purpose, I
> ended up supposing they would go into a number of different HBase tables in
> order to address the scope of the questions. One table for one or three
> The questions have joins and filters embedded in them.
> My post was about getting your insight on how you would go about answering
> this type of issue, what your schemas might be. Overall, how to switch from an SQL
> vision to a NoSQL vision.
> Coprocessors to create a couple of tables on the fly for all questions are an
> interesting way. To MapReduce the logs, however, I am afraid the performance
> would be too slow. I was thinking of answering in milliseconds if possible. But
> that might be me being new and not evaluating correctly.