HBase user mailing list - Designing table with auto increment key


Re: Designing table with auto increment key
Thanks everyone for the excellent ideas.

Ryan - I understand your suggestion up to a point.  If time permits,
please explain further.

What you are suggesting is to create a table with 99 rows with keys 'c_1',
'c_2' ... through 'c_99'.  Row c_1 would generate ids 1, 101, 201, and so on,
and row c_99 would generate 99, 199, and so on.  I got it this far.

But hypothetically speaking, let's say I am running a MapReduce job to
process a huge log file.  Each line of the log would be passed to a Map
function.  I am trying to figure out how I would distribute the load evenly
amongst c_1 through c_99.  Please explain.
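
A rough sketch of one way a map task could spread that load: pin each task to
one stripe based on its task index, so every task only ever increments its own
counter row.  The 'IDs' table, 'orders' family, and 'orderId' qualifier reuse
the illustrative names from the original post below; how the task index is
obtained (and the class itself) is an assumption, not something from the thread.

    import java.io.IOException;

    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.util.Bytes;

    // Each task owns one of the 99 counter rows c_1 .. c_99, so the counter
    // load spreads out as long as task indexes are spread across the stripes.
    public class StripedIdGenerator {
        private static final int STRIPES = 99;   // counter rows c_1 .. c_99
        private static final long SKIP = 100;    // c_1 -> 1, 101, 201, ...

        private final HTable idTable;
        private final int stripe;                // 1 .. 99, chosen from the task index
        private final byte[] counterRow;

        public StripedIdGenerator(HTable idTable, int taskIndex) {
            this.idTable = idTable;
            this.stripe = (taskIndex % STRIPES) + 1;
            this.counterRow = Bytes.toBytes("c_" + stripe);
        }

        public long nextId() throws IOException {
            // incrementColumnValue is atomic per row, so concurrent tasks
            // sharing a stripe still get distinct counts.
            long count = idTable.incrementColumnValue(counterRow,
                    Bytes.toBytes("orders"), Bytes.toBytes("orderId"), 1);
            return stripe + (count - 1) * SKIP;
        }
    }

The first increment on a missing cell returns 1, so c_1 yields 1, 101, 201, ...
and c_99 yields 99, 199, ..., which is the striping Ryan describes below.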
On Sun, Feb 13, 2011 at 10:18 PM, Ryan Rawson <[EMAIL PROTECTED]> wrote:

> You can also stripe, e.g.:
>
> c_1 starts at 1, skip=100
> c_2 starts at 2, skip=100
> c_$i starts at $i, skip=100 for i = 3..99
>
> Now you have 100x speed/parallelism.  If single-regionserver
> assignment becomes a problem, use multiple tables.
>
> On Sun, Feb 13, 2011 at 10:12 PM, Lars George <[EMAIL PROTECTED]>
> wrote:
> > Hi SS,
> >
> > Some people that do not need strictly contiguous IDs also use block
> > increments of, say, 100. Each app server then gets 100 IDs to hand out,
> > and in case it dies it gets its next assigned 100 IDs and leaves a
> > small gap behind. That way you can take the pressure off the counter if
> > that is going to be an issue for you. Depends on your insert frequency,
> > obviously.
> >
> > Lars
> >
> > On Sun, Feb 13, 2011 at 7:10 PM, Something Something
> > <[EMAIL PROTECTED]> wrote:
> >> Hello,
> >>
> >> Can you please tell me if this is the proper way of designing a table
> >> that's got an auto increment key?  If there's a better way please let me
> >> know that as well.
> >>
> >> After reading the mail archives, I learned that the best way is to use
> >> the 'incrementColumnValue' method of HTable.
> >>
> >> So, hypothetically speaking, let's say I have to create a "User -> Orders"
> >> relationship.  Every time a user creates an order, we will assign a
> >> system-generated (auto increment) id as the primary key for the order.
> >>
> >> I am thinking I could do this:
> >>
> >> 1)  Create a table of Ids for various objects such as "Order".  It will
> >> have just a single row with key "1" and column families for various
> >> objects.  When it's time to add a new order for a user I can do something
> >> like this:
> >>
> >> HTable tableIds = new HTable("IDs");
> >> Get get = new Get(Bytes.toBytes("1"));
> >> Result result = tableIds.get(get);
> >> long newOrderId = tableIds.incrementColumnValue(result.getRow(),
> >>     Bytes.toBytes("orders"), Bytes.toBytes("orderId"), 1);
> >>
> >> // In future I could use the same table for other objects as follows:
> >> // long newInvoiceId = tableIds.incrementColumnValue(result.getRow(),
> >> //     Bytes.toBytes("invoices"), Bytes.toBytes("invoiceId"), 1);
> >>
> >> 2)  Once the newOrderId is retrieved, I can add the info about the order
> >> to the UserOrder table with a key of the format userId + "*" + newOrderId.
> >> The "info" family of this table will have columns such as "orderAmount",
> >> "orderDate", etc.
> >>
> >>
> >> As per the documentation, 'incrementColumnValue' is done in an exclusive
> >> and serial fashion for each row, with a row lock.  In other words, even in
> >> a multi-threaded environment we are guaranteed to get a unique key per
> >> thread, correct?
> >>
> >> Is this a correct/good design for a table that needs an auto increment
> >> key?  Please let me know.  Thanks.
> >>
> >
>
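
Along the same lines, a rough sketch of the block-increment approach Lars
describes above: each app server reserves a block of 100 IDs with a single
increment and hands them out locally, leaving a small gap behind if it dies
before using the whole block.  The table and column names are the same
illustrative ones as in the original post; the class itself is an assumption.

    import java.io.IOException;

    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.util.Bytes;

    public class BlockIdAllocator {
        private static final long BLOCK_SIZE = 100;

        private final HTable idTable;
        private long next = 0;    // next id to hand out locally
        private long limit = 0;   // first id past the current block

        public BlockIdAllocator(HTable idTable) {
            this.idTable = idTable;
        }

        public synchronized long nextId() throws IOException {
            if (next == limit) {
                // Reserve the next block of 100 ids with one atomic counter bump.
                long upper = idTable.incrementColumnValue(Bytes.toBytes("1"),
                        Bytes.toBytes("orders"), Bytes.toBytes("orderId"),
                        BLOCK_SIZE);
                next = upper - BLOCK_SIZE + 1;
                limit = upper + 1;
            }
            return next++;
        }
    }

The counter row is touched once per 100 IDs instead of once per ID, which is
the pressure relief Lars mentions; the trade-off is that IDs are no longer
strictly contiguous if a server dies partway through its block.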