HBase, mail # user - Re: Get on a row with multiple columns


Re: Get on a row with multiple columns
Varun Sharma 2013-02-09, 08:29
Yeah, I meant true...
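
For anyone reading this thread later: the `tcpnodelay` settings discussed below correspond to the TCP_NODELAY socket option, and turning that option on is what disables Nagle's algorithm. The ~40 ms read latency Lars calls "magic" is commonly attributed to Nagle's algorithm interacting with delayed ACKs. Below is a minimal sketch in plain Java of what the flag does at the socket level. It is not HBase code; the class name `TcpNoDelayDemo` and the helper `connectWithNoDelay` are hypothetical names made up for illustration.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical demo class; it only illustrates the socket option, not HBase's RPC layer.
class TcpNoDelayDemo {

    // Opens a loopback connection, applies the requested TCP_NODELAY value,
    // and returns the value the socket reports afterwards.
    static boolean connectWithNoDelay(boolean noDelay) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 asks the OS for any free port; the connection completes via the listen backlog.
        try (ServerSocket server = new ServerSocket(0, 1, loopback);
             Socket client = new Socket(loopback, server.getLocalPort())) {
            // true  -> TCP_NODELAY on  -> Nagle's algorithm disabled (small writes sent immediately)
            // false -> TCP_NODELAY off -> Nagle's algorithm enabled (small writes may be coalesced)
            client.setTcpNoDelay(noDelay);
            return client.getTcpNoDelay();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("TCP_NODELAY enabled: " + connectWithNoDelay(true));
    }
}
```

In HBase itself the equivalent is the pair of properties named in this thread, `hbase.ipc.client.tcpnodelay` and `ipc.server.tcpnodelay`, both set to true in hbase-site.xml.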

On Sat, Feb 9, 2013 at 12:17 AM, lars hofhansl <[EMAIL PROTECTED]> wrote:

> Should be set to true. If tcpnodelay is set to true, Nagle's algorithm is disabled.
>
> -- Lars
>
>
>
> ________________________________
>  From: Varun Sharma <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]; lars hofhansl <[EMAIL PROTECTED]>
> Sent: Saturday, February 9, 2013 12:11 AM
> Subject: Re: Get on a row with multiple columns
>
>
> Okay I did my research - these need to be set to false. I agree.
>
>
> On Sat, Feb 9, 2013 at 12:05 AM, Varun Sharma <[EMAIL PROTECTED]> wrote:
>
> > I have ipc.client.tcpnodelay, ipc.server.tcpnodelay set to false and the
> > hbase one - [hbase].ipc.client.tcpnodelay set to true. Do these induce
> > network latency?
> >
> >
> >On Fri, Feb 8, 2013 at 11:57 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:
> >
> >>Sorry, I meant: set these two config parameters to true (not false, as I
> >>state below).
> >>
> >>
> >>
> >>
> >>----- Original Message -----
> >>From: lars hofhansl <[EMAIL PROTECTED]>
> >>To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
> >>Cc:
> >>Sent: Friday, February 8, 2013 11:41 PM
> >>Subject: Re: Get on a row with multiple columns
> >>
> >>Only somewhat related. Seeing the magic 40ms random read time there. Did
> >>you disable Nagle's?
> >>(set hbase.ipc.client.tcpnodelay and ipc.server.tcpnodelay to false in
> >>hbase-site.xml).
> >>
> >>________________________________
> >>From: Varun Sharma <[EMAIL PROTECTED]>
> >>To: [EMAIL PROTECTED]; lars hofhansl <[EMAIL PROTECTED]>
> >>Sent: Friday, February 8, 2013 10:45 PM
> >>Subject: Re: Get on a row with multiple columns
> >>
> >>The use case is like your Twitter feed: tweets from people you follow. When
> >>someone unfollows, you need to delete a bunch of his tweets from the
> >>following feed. So it's frequent, and we are essentially running into some
> >>extreme corner cases like the one above. We need high write throughput for
> >>this, since when someone tweets, we need to fan out the tweet to all the
> >>followers. We need the ability to do fast deletes (unfollow) and fast adds
> >>(follow) and also be able to do fast random gets when a real user loads
> >>the feed. I doubt we will be able to play much with the schema here since we
> >>need to support a bunch of use cases.
> >>
> >>@lars: It does not take 30 seconds to place 300 delete markers. It takes 30
> >>seconds to first find which of those 300 pins are in the set of columns
> >>present (this invokes 300 gets) and then place the appropriate delete
> >>markers. Note that we can have tens of thousands of columns in a single row,
> >>so a single get is not cheap.
> >>
> >>If we were to just place delete markers, that would be very fast. But when we
> >>started doing that, our random read performance suffered because of too
> >>many delete markers. The 90th percentile on random reads shot up from 40
> >>milliseconds to 150 milliseconds, which is not acceptable for our use case.
> >>
> >>Thanks
> >>Varun
> >>
> >>On Fri, Feb 8, 2013 at 10:33 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:
> >>
> >>> Can you organize your columns and then delete by column family?
> >>>
> >>> deleteColumn without specifying a TS is expensive, since HBase first has
> >>> to figure out what the latest TS is.
> >>>
> >>> Should be better in 0.94.1 or later since deletes are batched like Puts
> >>> (still need to retrieve the latest version, though).
> >>>
> >>> In 0.94.3 or later you can also use the BulkDeleteEndpoint, which basically
> >>> lets you specify a scan condition and then places specific delete markers
> >>> for all KVs encountered.
> >>>
> >>>
> >>> If you wanted to get really fancy, you could hook up a coprocessor to
> >>> the compaction process and simply filter all KVs you no longer want
> >>> (without ever placing any delete markers).
> >>>
> >>>
> >>> Are you saying it takes 15 seconds to place 300 version delete markers?!
> >>>
> >>>
> >>> -- Lars
> >>>
> >>>
> >>>
> >>> ________________________________