Thanks everyone for the excellent ideas.
Ryan - I kinda understand your suggestion to a point. If time permits,
please explain further.
What you are suggesting is to create a table with 99 rows with keys 'c_1',
'c_2'... thru 'c_99'. Row c_1 would generate ids 1, 101, 201.. so on, and
row c_99 would generate 99, 199, & so on. I got it this far.
But hypothetically speaking, let's say I am running a MapReduce to process a
huge log file. Each line of the log would be passed to a Map function.
I am trying to figure out how I would distribute the load evenly among c_1
through c_99. Please explain.
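
To make the striping arithmetic concrete, here is a minimal Java sketch. It is not from the thread. In-memory counters stand in for the HBase rows c_1 through c_99, and the class name, the method names, and the stripeForTask rule (task index modulo 99) are hypothetical. The modulo rule is only one possible way to spread tasks across rows. With HBase, each row would presumably be advanced by 100 through incrementColumnValue rather than by a local counter.

```java
import java.util.concurrent.atomic.AtomicLongArray;

// Sketch only: in-memory counters stand in for the HBase rows c_1 .. c_99.
final class StripedIds {
    static final int STRIPES = 99; // rows c_1 .. c_99, as described in the thread
    static final int SKIP = 100;   // every row advances in steps of 100

    // issued.get(i - 1) = number of ids row c_i has handed out so far.
    private final AtomicLongArray issued = new AtomicLongArray(STRIPES);

    // Hypothetical stripe choice (an assumption, not from the thread):
    // map a task index onto the range 1..99.
    static int stripeForTask(int taskIndex) {
        return Math.floorMod(taskIndex, STRIPES) + 1;
    }

    // Next id from row c_<stripe>: stripe, stripe + 100, stripe + 200, ...
    long nextId(int stripe) {
        long alreadyIssued = issued.getAndIncrement(stripe - 1);
        return stripe + alreadyIssued * SKIP;
    }

    public static void main(String[] args) {
        StripedIds ids = new StripedIds();
        System.out.println(ids.nextId(1));  // 1
        System.out.println(ids.nextId(1));  // 101
        System.out.println(ids.nextId(99)); // 99
        System.out.println(stripeForTask(7)); // 8
    }
}
```

Because every row starts at a different offset and uses the same step, ids drawn from different rows do not collide, whichever task draws them.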
On Sun, Feb 13, 2011 at 10:18 PM, Ryan Rawson <[EMAIL PROTECTED]> wrote:
> you can also stripe, eg:
> c_1 starts at 1, skip=100
> c_2 starts at 2, skip=100
> c_$i starts at $i, skip=100 for 3..99
> now you have 100x speed/parallelism. If single regionserver
> assignment becomes a problem, use multiple tables.
> On Sun, Feb 13, 2011 at 10:12 PM, Lars George <[EMAIL PROTECTED]> wrote:
> > Hi SS,
> > Some people who do not need strictly contiguous IDs also use block
> > increments of, say, 100. Each app server then gets 100 IDs to hand out,
> > and if it dies it is assigned its next 100 IDs and leaves a small gap
> > behind. That way you can take the pressure off the counter if that is
> > going to be an issue for you. It depends on your insert frequency,
> > obviously.
> > Lars
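
The block-increment idea described above can be sketched in Java roughly as follows. This is an illustration under stated assumptions, not code from the thread: an AtomicLong stands in for the shared counter, and the class name BlockIdAllocator and its methods are hypothetical. With HBase, the reservation step would presumably be a single incrementColumnValue call with an amount of 100.

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch only: each app server reserves 100 ids at a time from a shared counter.
final class BlockIdAllocator {
    static final int BLOCK_SIZE = 100;

    // Stand-in for the shared counter (assumption: an HBase counter cell
    // incremented by BLOCK_SIZE would play this role in a real deployment).
    private final AtomicLong sharedCounter;

    private long next = 1;     // next id to hand out locally
    private long blockEnd = 0; // last id of the reserved block; 0 = nothing reserved yet

    BlockIdAllocator(AtomicLong sharedCounter) {
        this.sharedCounter = sharedCounter;
    }

    // Hands out ids locally and touches the shared counter once per 100 ids.
    synchronized long nextId() {
        if (next > blockEnd) {
            blockEnd = sharedCounter.addAndGet(BLOCK_SIZE);
            next = blockEnd - BLOCK_SIZE + 1;
        }
        return next++;
    }

    public static void main(String[] args) {
        AtomicLong shared = new AtomicLong(0);
        BlockIdAllocator serverA = new BlockIdAllocator(shared);
        BlockIdAllocator serverB = new BlockIdAllocator(shared);
        System.out.println(serverA.nextId()); // 1
        System.out.println(serverB.nextId()); // 101
        System.out.println(serverA.nextId()); // 2
    }
}
```

If a server dies after handing out only a few ids, the rest of its block is never used. That is the small gap mentioned above.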
> > On Sun, Feb 13, 2011 at 7:10 PM, Something Something
> > <[EMAIL PROTECTED]> wrote:
> >> Hello,
> >> Can you please tell me if this is the proper way of designing a table
> >> that has an auto-increment key? If there's a better way, please let me
> >> know as well.
> >> After reading the mail archives, I learned that the best way is to use
> >> the 'incrementColumnValue' method of HTable.
> >> So hypothetically speaking let's say I have to create a "User -> Orders"
> >> relationship. Every time a user creates an order we will assign a
> >> system-generated (auto-increment) id as the primary key for the order.
> >> I am thinking I could do this:
> >> 1) Create a table of Ids for various objects such as "Order". It will
> >> have just a single row with key "1" and column families for various
> >> objects. When it's time to add a new order for a user I can do something
> >> like:
> >> HTable tableIds = new HTable("IDs");
> >> byte[] row = Bytes.toBytes("1");
> >> // incrementColumnValue takes (row, family, qualifier, amount) as byte[]
> >> // arguments; the "orders" family name is assumed here.
> >> long newOrderId = tableIds.incrementColumnValue(row,
> >> Bytes.toBytes("orders"), Bytes.toBytes("orderId"), 1);
> >> // In future I could use the same table for other objects as follows
> >> // long newInvoiceId = tableIds.incrementColumnValue(row,
> >> // Bytes.toBytes("invoices"), Bytes.toBytes("invoiceId"), 1);
> >> 2) Once the newOrderId is retrieved I can add the info about the order to
> >> the UserOrder table with a key of the format: userId + "*" + newOrderId.
> >> The "info" family of this table will have columns such as "orderAmount",
> >> "orderDate" etc.
> >> As per the documentation, 'incrementColumnValue' is performed atomically
> >> and in serial fashion for each row, under a row lock. In other words, even
> >> in a multi-threaded environment we are guaranteed to get a unique key per
> >> thread, correct?
> >> Is this a correct/good design for a table that needs an auto-increment key?
> >> Please let me know. Thanks.
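
The uniqueness question above can be illustrated with a small Java sketch. It is an analogy under stated assumptions, not a test of HBase itself: an AtomicLong stands in for the counter cell, and the class name UniqueIdCheck and the drawIds helper are hypothetical. If the increment is atomic, as the quoted documentation is said to describe, every call should return a distinct value even when several threads draw ids at once.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Sketch only: an AtomicLong stands in for an atomically incremented counter cell.
final class UniqueIdCheck {

    // Starts `threads` workers that each draw `perThread` ids from one shared
    // counter, and returns the set of distinct ids that were observed.
    static Set<Long> drawIds(int threads, int perThread) {
        AtomicLong counter = new AtomicLong(0);
        Set<Long> seen = ConcurrentHashMap.newKeySet();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    seen.add(counter.incrementAndGet());
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        // 8 threads x 1000 ids: with an atomic increment no id is drawn twice.
        System.out.println(drawIds(8, 1000).size()); // 8000
    }
}
```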