|
Bill Graham
2011-06-13, 22:31
Barney Frank
2011-06-14, 12:04
Andrew Purtell
2011-06-16, 18:08
Bill Graham
2011-06-16, 18:29
Otis Gospodnetic
2011-06-18, 03:54
Hiller, Dean x66079
2011-06-19, 21:48
Bill Graham
2011-06-20, 17:06
Gary Helmling
2011-06-20, 17:56
Andrew Purtell
2011-06-20, 19:03
Bill Graham
2011-06-20, 21:15
|
-
any multitenancy suggestions for HBase? Bill Graham 2011-06-13, 22:31
Hello there,
We have a number of different groups within our organization who will soon be working within the same HBase cluster and we're trying to set up some best practices to keep things organized. Since there are no HBase ACLs and no concept of multiple databases in the cluster, we're looking to propose a simple convention that will hopefully keep people from stepping on each other's toes (or worse!).
Does anyone have any best/worst practices they're willing to share w.r.t. things like table/column naming schemes in a multitenant environment? For table names, for example, is there anything better than a basic dot-delimited naming convention with the group name as the first token?
Also, I assume there's no performance cost to using long table names like there is with long CF:col names. Please let me know if that's not the case.
thanks,
Bill
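A minimal sketch of the dot-delimited convention asked about above, with the group name as the first token. The helper names `tenant_table_name` and `tables_for_group` are hypothetical and not part of any HBase API, and the allowed character set is an assumption that should be checked against the HBase version in use.

```python
import re

# Assumed character set for one name token; verify against your HBase version.
# Dots are excluded so the first dot always separates group from table.
_TOKEN = re.compile(r"^[A-Za-z0-9_][A-Za-z0-9_-]*$")


def tenant_table_name(group: str, table: str) -> str:
    """Build '<group>.<table>' with the group name as the first token."""
    for token in (group, table):
        if not _TOKEN.match(token):
            raise ValueError(f"invalid name token: {token!r}")
    return f"{group}.{table}"


def tables_for_group(group: str, all_tables: list[str]) -> list[str]:
    """Return the tables whose first dot-delimited token is the given group."""
    return [t for t in all_tables if t.split(".", 1)[0] == group]
```

Rejecting dots inside a token keeps the group unambiguous, so a group named `ads` never picks up tables that belong to `adsx`.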
-
Re: any multitenancy suggestions for HBase? Barney Frank 2011-06-14, 12:04
Our implementation supports multiple customers that share the same tables and column families. We use the customerId as the first token of the row ID, e.g. "CUST123|someOtherRowQualifier". For all customer queries, we add their customerId as the row prefix and, of course, ensure that they are authorized within our app.
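The row-prefix scheme described above can be sketched as follows. This is an illustration in plain Python, not the poster's code: `row_key` and `prefix_scan_range` are hypothetical helpers, and the `|` separator is taken from the example key. The returned pair is what one would pass as the start and stop rows of a scan; an empty stop row conventionally means "scan to the end of the table".

```python
def row_key(customer_id: str, qualifier: str, sep: str = "|") -> bytes:
    """Build '<customerId>|<rest>' so that one customer's rows sort together."""
    if sep in customer_id:
        raise ValueError("customer id must not contain the separator")
    return f"{customer_id}{sep}{qualifier}".encode("utf-8")


def prefix_scan_range(customer_id: str, sep: str = "|") -> tuple[bytes, bytes]:
    """Return (start_row, stop_row) covering exactly the keys with this prefix.

    The stop row is the prefix with its last byte incremented: the smallest
    key that sorts after every key starting with the prefix.
    """
    start = f"{customer_id}{sep}".encode("utf-8")
    stop = bytearray(start)
    # Trailing 0xFF bytes cannot be incremented, so drop them first.
    while stop and stop[-1] == 0xFF:
        stop.pop()
    if not stop:
        return start, b""  # no upper bound
    stop[-1] += 1
    return start, bytes(stop)
```

Including the separator in the prefix matters: without it, a scan for `CUST123` would also return rows belonging to `CUST1234`.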
-
Re: any multitenancy suggestions for HBase? Andrew Purtell 2011-06-16, 18:08
> Since there are no HBase ACLs
This is true only until HBASE-3025 is ready and goes in, post 0.92. It may not be of immediate help now but yes HBase will have ACLs. They are on the roadmap.
Best regards,
- Andy
-
Re: any multitenancy suggestions for HBase? Bill Graham 2011-06-16, 18:29
Thanks Andy, that will be a big help once it's out.
What do people think about introducing the concept of "databases" to HBase? Just a container of tables really, but something that can be used to group ACLs around at some point, or to just keep the table list manageable. Has there been any discussion about this yet?
-
Re: any multitenancy suggestions for HBase? Otis Gospodnetic 2011-06-18, 03:54
Hi Bill,
At the recent HBase hackathon in Berlin there was some word of ACLs in (the next release of?) HBase from the Trend Micro guys, I believe. Check this: http://search-hadoop.com/?q=acl&fc_project=HBase&fc_type=jira
Otis
-
RE: any multitenancy suggestions for HBase? Hiller, Dean x66079 2011-06-19, 21:48
I don't know if it is good or bad, but we went down the route where all keys are prefixed or postfixed with "customerId": prefixed if you want a customer's data more isolated, or postfixed if you want customers sharing the same grid more.
We had some shared tables that are neither postfixed nor prefixed and are only touched by a committee when needed for everyone; obviously there is a tradeoff in doing that.
Later,
Dean
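The difference between prefixing and postfixing comes from the fact that HBase keeps rows sorted by the raw bytes of the row key. The sketch below uses made-up keys and hypothetical helper names, and assumes "sharing the same grid" refers to how tenants' rows are spread across the sorted key space. It lists which tenant owns each row, in sorted order, under both layouts.

```python
def prefixed(customer_id: str, key: str) -> bytes:
    """Tenant id first: each tenant's rows form one contiguous key range."""
    return f"{customer_id}|{key}".encode("utf-8")


def postfixed(customer_id: str, key: str) -> bytes:
    """Tenant id last: rows are ordered by key, so tenants interleave."""
    return f"{key}|{customer_id}".encode("utf-8")


def tenant_order(keys: list[bytes], position: int) -> list[str]:
    """Tenant id of each row, listed in sorted (storage) order."""
    return [k.decode("utf-8").split("|")[position] for k in sorted(keys)]


rows = [("A", "k1"), ("B", "k1"), ("A", "k2"), ("B", "k2")]
prefix_layout = tenant_order([prefixed(c, k) for c, k in rows], 0)
postfix_layout = tenant_order([postfixed(c, k) for c, k in rows], 1)
print(prefix_layout)   # ['A', 'A', 'B', 'B']
print(postfix_layout)  # ['A', 'B', 'A', 'B']
```

With the prefix layout a single range scan returns one tenant's data; with the postfix layout tenants end up side by side in the same key ranges, and per-tenant reads need the leading key component to be known.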
-
Re: any multitenancy suggestions for HBase? Bill Graham 2011-06-20, 17:06
Thanks Dean, that sounds similar to the approach we're considering.
Andy, I can see value in having ACLs on a per-column-pattern (or maybe just per-prefix to make multiple pattern conflict resolution simpler) basis. I know this isn't in scope for the initial release, but would the current design lend itself to be extended for this case? The use case is where a column prefix naming scheme is used for example, and certain groups should have write access to certain prefix patterns.
thanks,
Bill
-
Re: any multitenancy suggestions for HBase? Gary Helmling 2011-06-20, 17:56
Hi Bill,
The current security code supports per-column-qualifier ACLs, though not the pattern matching approach you describe. It's simply an exact match on column qualifier.
As an alternative (which would work with the current code), you could segment each set of access patterns to a separate column family, then set ACLs on the column family level. Of course this might not scale for many different sets of column patterns. The current recommendation is to have at most "a few" column families per table, but ideally just one family unless you really, really need more.
But doing a lot of regexes to match patterns on every single KeyValue is bound to have scalability problems of its own. I don't think it would be all that difficult if someone wanted to add pattern-based ACLs, but we don't have any plans for it at the moment.
Please let us know how it goes and if there's anything we can help with!
Gary
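To make the two granularities mentioned above concrete, here is a toy model of a permission check that honors a grant on a whole column family or on one exact column qualifier. It is only an illustration of the semantics described in this message, not the HBase security code; the `Grants` structure, the action names, and `is_allowed` are all assumptions made for the example.

```python
from typing import Optional

# (user, family, qualifier) -> set of actions.
# A qualifier of None stands for a grant on the whole column family.
Grants = dict[tuple[str, str, Optional[str]], set[str]]


def is_allowed(grants: Grants, user: str, action: str,
               family: str, qualifier: str) -> bool:
    """Allow if granted on the whole family or on this exact qualifier."""
    family_level = grants.get((user, family, None), set())
    exact = grants.get((user, family, qualifier), set())
    return action in family_level or action in exact


grants: Grants = {
    ("groupA", "d", None): {"READ"},        # family-level grant
    ("groupA", "d", "a_count"): {"WRITE"},  # exact-qualifier grant
}
```

Because the qualifier lookup is an exact match, a grant on `a_count` says nothing about `a_count2`; covering a family of similarly named columns would need either one grant per qualifier or a separate column family.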
-
Re: any multitenancy suggestions for HBase? Andrew Purtell 2011-06-20, 19:03
Hi Bill,
> From: Bill Graham <[EMAIL PROTECTED]>
> Andy, I can see value in having ACLs on a per-column-pattern (or maybe just per-prefix to make multiple pattern conflict resolution simpler) basis. I know this isn't in scope for the initial release, but would the current design lend itself to be extended for this case?
Yes this is feasible.
Best regards,
- Andy
-
Re: any multitenancy suggestions for HBase? Bill Graham 2011-06-20, 21:15
Thanks for your reply Gary, see below...
On Mon, Jun 20, 2011 at 10:56 AM, Gary Helmling <[EMAIL PROTECTED]> wrote:
> The current security code supports per-column-qualifier ACLs, though not the pattern matching approach you describe. It's simply an exact match on column qualifier.
Good to know. I was going off of the goals listed in HBASE-3025, which were only as granular as CF.
> As an alternative (which would work with the current code), you could segment each set of access patterns to a separate column family, then set ACLs on the column family level. Of course this might not scale for many different sets of column patterns. The current recommendation is to have at most "a few" column families per table, but ideally just one family unless you really, really need more.
Yeah, the scalability issues with coupling ACL to CF make it not work well for our use case due to CF bloat. Also our use case is such that we could have multiple sources for similar types of data, written by different groups/processes but accessed all together, which makes it logical to store them in the same CF.
> But doing a lot of regexes to match patterns on every single KeyValue is bound to have scalability problems of its own. I don't think it would be all that difficult if someone wanted to add pattern-based ACLs, but we don't have any plans for it at the moment.
Yes, I realized the same re scalability as I was typing my previous email. Hence the suggestion to instead just use a prefix-based approach which could be implemented more optimally. For now, we'll probably use a naming scheme that would allow us to introduce these ACLs into the security implementation once it's released.
> Please let us know how it goes and if there's anything we can help with!
Will do, thanks!
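One way the prefix-based idea could avoid regex matching is a longest-prefix lookup against a hash map, sketched below. This is purely illustrative: no such feature is described as existing in HBase in this thread, and `longest_prefix_grant`, the grant table, and the action names are hypothetical. Conflicts between overlapping prefixes are resolved by letting the longest registered prefix win.

```python
def longest_prefix_grant(prefix_grants: dict[bytes, set[str]],
                         qualifier: bytes) -> set[str]:
    """Return the actions granted by the longest registered prefix of qualifier.

    Costs at most len(qualifier) dictionary lookups and uses no regexes.
    """
    for end in range(len(qualifier), 0, -1):
        actions = prefix_grants.get(qualifier[:end])
        if actions is not None:
            return actions
    return set()  # no registered prefix matches: nothing is granted


# Hypothetical grants for one group on one column family.
ads_group = {
    b"ads_": {"WRITE"},
    b"ads_internal_": set(),  # more specific prefix revokes write access
}
```

Under these assumptions the group may write `ads_clicks`, but not `ads_internal_cost` (the longer prefix wins) and not `search_q` (no prefix matches).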