-
Controlling TableMapReduceUtil table split points
David Koch 2013-01-06, 12:37
Hello,
Is there a way to override the method that TableMapReduceUtil uses to split an HBase table across several mapper instances when running a MapReduce job over it?
Our current data model is:
Row-key: <user_id> Qualifier: <timestamp> Value: <event>
The distribution of "events per <user_id>" is long-tailed. Some users have on the order of 10^6 events, whereas the mean is around 100.
These occasional big rows cause problems when running M/R jobs over the table (timeouts, etc.). Also, according to the HBase book they are detrimental to the region splitting process. We currently use server-side filters to not retrieve these large rows but it's not a nice solution.
To overcome this, I was thinking about using a "tall and narrow" key design (term from HBase book), meaning in our case that rowkeys are formed by a composite of <user_id>_<tmst>.
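A minimal sketch of how such a composite key could be built, in plain Java with no HBase dependency. The class name `EventRowKey`, the 19-digit zero-padding, and the rule that a user id never contains the `_` delimiter are all assumptions made for illustration; none of them come from the thread. The padding is there so that lexicographic key order matches chronological order within one user.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper for "<user_id>_<tmst>" row keys in a tall-narrow table.
final class EventRowKey {
    private static final char DELIMITER = '_'; // assumed never to occur inside a user id
    private static final int TS_WIDTH = 19;    // number of digits in Long.MAX_VALUE

    // Builds the composite key; the timestamp is zero-padded to a fixed width
    // so that string order within one user equals chronological order.
    static String compose(String userId, long timestampMillis) {
        if (userId.indexOf(DELIMITER) >= 0 || timestampMillis < 0) {
            throw new IllegalArgumentException("unsupported user id or timestamp");
        }
        return userId + DELIMITER
                + String.format(Locale.ROOT, "%0" + TS_WIDTH + "d", timestampMillis);
    }

    // Recovers the user id, i.e. everything before the first delimiter.
    static String userOf(String rowKey) {
        return rowKey.substring(0, rowKey.indexOf(DELIMITER));
    }

    // Row keys are stored as bytes; UTF-8 is an assumption of this sketch.
    static byte[] toBytes(String rowKey) {
        return rowKey.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(compose("u42", 1357475820000L));
    }
}
```

With this layout every row of one user shares the prefix `<user_id>_`, so a user's history is a contiguous key range even though it is spread over many rows.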
However, and hence my original question, how would I process this table "user-wise" in an HBase M/R job? Specifically, how do I make sure that a user's history is not split across several Mapper instances?
Thank you,
/David
-
Re: Controlling TableMapReduceUtil table split points
Ted Yu 2013-01-06, 16:11
David: The determination of splits is done in TableInputFormatBase.getSplits() where table.getStartEndKeys() is called to get the boundaries of regions. You can take a look and see how you can customize the splits.
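One way to picture such a customization, assuming the composite `<user_id>_<tmst>` key from the question: snap every region start key back to the first possible key of its user, and merge boundaries that collapse onto the same user. The sketch below is stand-alone Java in which plain strings stand in for HBase's `byte[]` keys; `UserAlignedSplits` and its methods are hypothetical names. A real version would live in an overridden `getSplits()` and build split objects from these boundaries, which is not shown here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Stand-alone sketch: strings stand in for region start keys.
final class UserAlignedSplits {
    // Truncates a key such as "u7_0000000000000000123" to "u7_", which sorts
    // before every row of user u7. Assumes user ids never contain '_'.
    static String snapToUser(String key) {
        int i = key.indexOf('_');
        return i < 0 ? key : key.substring(0, i + 1);
    }

    // Input: sorted region start keys, the first being "" (start of table).
    // Output: start keys of splits in which no user straddles a boundary.
    // Split i covers [start(i), start(i+1)); the last split is open-ended.
    static List<String> alignedStartKeys(List<String> regionStartKeys) {
        List<String> out = new ArrayList<>();
        for (String start : regionStartKeys) {
            String snapped = snapToUser(start);
            // Two regions starting inside the same user collapse into one split.
            if (out.isEmpty() || !out.get(out.size() - 1).equals(snapped)) {
                out.add(snapped);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(alignedStartKeys(
                Arrays.asList("", "u3_0005", "u3_0009", "u7_0001")));
    }
}
```

The trade-off is that a snapped split can reach into a neighbouring region, so some reads are no longer local, and a very large user still ends up on a single mapper.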
> how do I make sure that a user's history is not split across several Mapper instances?
If events for one user are processed by a single mapper, I think you would continue to see timeouts in your map/reduce job.
Cheers
-
Re: Controlling TableMapReduceUtil table split points
David Koch 2013-01-06, 16:37
Hi Ted,
Thank you for your response. I will take a look.
With regard to the timeouts: I think changing the key design as outlined above would ameliorate the situation, since each map call only requests a small amount of data as opposed to what could be a large chunk. I remember that simply doing a get on one of the large outlier rows (~500 MB) brought down the region server involved.
/David
-
Re: Controlling TableMapReduceUtil table split points
Ted Yu 2013-01-06, 16:47
All right. Hopefully the new schema will solve that problem.
-
Re: Controlling TableMapReduceUtil table split points
Dhaval Shah 2013-01-06, 17:29
Another option to avoid the timeout/OOME issues is to use scan.setBatch(), so that the scanner functions normally for small rows but breaks up large rows into multiple Result objects. You can then use it in conjunction with scan.setCaching() to control how much data you get back.
This approach would not need a change in your schema design and would ensure that only 1 mapper processes the entire row (but in multiple calls to the map function).
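To get a feel for the numbers, here is a stand-alone arithmetic sketch in Java. It does not call HBase; `BatchMath` and its methods are hypothetical names, and the model (each Result holds at most `batch` cells, each scanner RPC returns at most `caching` Results) is my reading of how the two settings interact rather than something stated in this thread.

```java
// Stand-alone arithmetic sketch; no HBase calls are made here.
final class BatchMath {
    // Number of partial Result objects a row with `columns` cells is broken
    // into when the scan's batch size is `batch` (ceiling division).
    static long partialResults(long columns, int batch) {
        if (columns < 0 || batch <= 0) {
            throw new IllegalArgumentException("columns must be >= 0 and batch > 0");
        }
        return (columns + batch - 1) / batch;
    }

    // Upper bound on cells returned by one scanner RPC under the assumed
    // model: `caching` Results of at most `batch` cells each.
    static long maxCellsPerRpc(int batch, int caching) {
        return (long) batch * caching;
    }

    public static void main(String[] args) {
        // A 10^6-event outlier row versus a typical 100-event row.
        System.out.println(partialResults(1_000_000L, 1000));
        System.out.println(partialResults(100L, 1000));
        System.out.println(maxCellsPerRpc(1000, 10));
    }
}
```

Under these assumptions a typical row still arrives in one map call, while an outlier row is spread over many calls of bounded size.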
-
Re: Controlling TableMapReduceUtil table split points
David Koch 2013-01-06, 17:53
Hi Dhaval,
Good call on setBatch. I had forgotten about it. Just like changing the schema, it would involve changing map(...) to reflect the fact that only part of the user's data is returned in each call, but I would not have to manipulate table splits.
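The change to map(...) mostly comes down to carrying state from one call to the next. Below is a stand-alone Java sketch of that bookkeeping, using an event count as the per-user aggregate. `PerUserAccumulator` is a hypothetical class, the `emitted` list stands in for writing to the job's output, and the sketch assumes that all partial results of one row reach the same mapper back to back.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-alone sketch of mapper-side state when one row spans several map() calls.
final class PerUserAccumulator {
    private String currentUser;   // row key of the previous call, null if none pending
    private long eventCount;      // running aggregate for currentUser
    private final List<String> emitted = new ArrayList<>(); // stands in for job output

    // Call once per map() invocation with the row key and the number of
    // cells contained in that (possibly partial) result.
    void accept(String rowKey, int cellsInBatch) {
        if (currentUser != null && !currentUser.equals(rowKey)) {
            flush(); // a new row key means the previous user is complete
        }
        currentUser = rowKey;
        eventCount += cellsInBatch;
    }

    // Call from the mapper's cleanup step so the last user is not lost.
    void finish() {
        if (currentUser != null) {
            flush();
        }
    }

    private void flush() {
        emitted.add(currentUser + "=" + eventCount);
        currentUser = null;
        eventCount = 0;
    }

    List<String> emitted() {
        return emitted;
    }

    public static void main(String[] args) {
        PerUserAccumulator acc = new PerUserAccumulator();
        acc.accept("u1", 1000);
        acc.accept("u1", 500);
        acc.accept("u2", 3);
        acc.finish();
        System.out.println(acc.emitted());
    }
}
```

The same pattern would apply to the tall-narrow schema if the comparison used the user prefix of the row key instead of the whole key.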
The HBase book does suggest that it's bad practice to use the "logical" schema of lumping all user data into a single row(*) but I'll do some testing to see what works.
Thank you,
/David
(*) Chapter 9, section "Tall-Narrow Versus Flat-Wide Tables", 3rd ed., page 359
-
Re: Controlling TableMapReduceUtil table split points
Dhaval Shah 2013-01-22, 23:10
Hi David. We successfully use the "logical" schema approach and have not seen issues yet. Of course it all depends on the use case, and saying it would work for you because it works for us would be naive. However, if it does work, it will make your life much easier, because with a logical schema other problems become simpler: you can be sure that a single map function will process an entire row rather than a row going to multiple mappers, and if you are using filters that restrict queries to only a small subset of the data, even setBatch won't be needed for those use cases.
I did run into issues where I did not use setBatch and my mappers ran out of memory, but that was a simpler one to solve. By the way, if you are on CDH4, the HBase export utility also does not use setBatch, and your mapper will run out of memory if you have a large row. It's easy to put that line in as a config param, though, and this feature is available in future releases of HBase trunk.
Regards, Dhaval