-
HBase without compactions?
Otis Gospodnetic 2013-02-19, 03:30
Hello,
It's kind of funny: we run SPM, which includes SPM for HBase (essentially a performance monitoring service/tool for HBase), and we currently store all performance metrics in HBase.
I see a ton of HBase development activity, which is great, but it just occurred to me that I don't recall seeing anything about getting rid of compactions. Yet compactions are the one thing that I know hurts us the most, and they are one thing that MapR somehow got rid of in their implementation.
Have there been any discussions, attempts, or thoughts about finding a way to avoid compactions?
Thanks,
Otis
--
HBASE Performance Monitoring - http://sematext.com/spm/index.html
-
Re: HBase without compactions?
Michael Segel 2013-02-19, 04:50
Take a look at MapR's M7.
For Apache-based Hadoop and HBase, you will need to evolve HDFS.
-
Re: HBase without compactions?
Ted Yu 2013-02-19, 04:54
Take a look at HBASE-7667, "Support stripe compaction".
Cheers
-
Re: HBase without compactions?
Otis Gospodnetic 2013-02-19, 05:01
Hi,
On Mon, Feb 18, 2013 at 11:50 PM, Michael Segel <[EMAIL PROTECTED]> wrote:
> Take a look at MapR's M7
>
> For Apache based Hadoop and HBase, you will need to evolve HDFS.
What do you mean by evolve HDFS? You mean HDFS would need to change if Apache HBase were to become compactionless?
Otis
-
Re: HBase without compactions?
Otis Gospodnetic 2013-02-19, 05:06
Hi,
HBASE-7667 sounds like an improvement whose details I don't fully understand, but it is not quite the same as eliminating compactions. And I don't understand HBASE-7667 well enough to have a feeling for how much less painful compactions would become after this. Any way to quantify that?
Thanks,
Otis
-
Re: HBase without compactions?
Ted Yu 2013-02-19, 05:09
We will do some testing along with code reviews. You can watch HBASE-7667 for further development.
You're right in that its goal is not to get rid of compaction.
Thanks
-
Re: HBase without compactions?
Michael Segel 2013-02-19, 05:46
He asked for a compactionless version. You still have compactions with stripe compaction.
-
Re: HBase without compactions?
Michael Segel 2013-02-19, 05:46
On Feb 18, 2013, at 11:01 PM, Otis Gospodnetic <[EMAIL PROTECTED]> wrote:
> What do you mean by evolve HDFS? You mean HDFS would need to change if
> Apache HBase were to become compactionless?
In a single word, yes.
Or rather, you can't have a compactionless HBase without fixing the deficiencies in HDFS.
-
Re: HBase without compactions?
Otis Gospodnetic 2013-02-19, 06:05
And MapR has their own, completely reimplemented HDFS without these deficiencies... and I can stop dreaming about compactionless Apache HBase?
Otis
-
Re: HBase without compactions?
Stack 2013-02-19, 06:09
On Mon, Feb 18, 2013 at 7:30 PM, Otis Gospodnetic < [EMAIL PROTECTED]> wrote:
> Have there been any discussions, attempts, or thoughts about finding a way
> to avoid compactions?
Any ideas on how it would work, Otis?
Anyone know what M7 does?
St.Ack
-
Re: HBase without compactions?
Michael Segel 2013-02-19, 06:47
Well, M7 is supposed to be in open public beta, I think.
I haven't had time to play with it, but MapR has a lot of nice features that can't really be done in HDFS. It's basically the benefit of being almost POSIX compliant.
The reason I mention M7 is that they supposedly get rid of compactions; however, I haven't seen it in action.
In theory I can see this happening: if you have a read-write filesystem, why would you need write-only files and then a compaction in which those files are merged? I would imagine that with some thought and time, HDFS could evolve to this ...
It's an interesting evolution on MapR's part, and you have to give those guys credit for doing something cool.
-
Re: HBase without compactions?
Andrew Purtell 2013-02-19, 07:12
MapR had a distributed key-value store internal to the FS for its metadata. Eventually they got the idea to put an API on it that mimics the HBase client API. This is not "removing compactions". I can't say for sure, but I feel pretty comfortable stating that it's an alternate architecture to BigTable: there was never a need for compactions in the first place, but instead some other tradeoffs. Apples and oranges.
-- Best regards,
- Andy
Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
-
Re: HBase without compactions?
lars hofhansl 2013-02-19, 08:12
If you store data in LSM trees you need compactions. The advantage is that your data files are immutable.
MapR has a mutable file system and they probably store their data in something more akin to B-Trees...? Or maybe they somehow avoid the expensive merge sorting of many small files. It seems that it has to be one or the other.
(Maybe somebody from MapR reads this and can explain how it actually works.)
Compactions let you trade random IO for sequential IO (just to state the obvious). It seems that you can't have it both ways.
-- Lars
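To make the point about immutable files and merge sorting concrete, here is a minimal sketch of what a compaction does in an LSM-style store. It is not HBase's implementation: the `Cell` layout and the `compact` function are made up for illustration, and it assumes each run is already sorted by key with the newest version first, and that a value of `None` marks a delete.

```python
import heapq
from typing import Iterable, List, Optional, Tuple

# Hypothetical cell layout: (key, sequence number, value).
# A value of None stands for a delete marker (tombstone).
Cell = Tuple[str, int, Optional[str]]


def compact(runs: Iterable[List[Cell]]) -> List[Cell]:
    """Merge immutable sorted runs into one new sorted run.

    Assumes every run is sorted by key, newest sequence number first.
    Only the newest cell per key survives, and deleted keys are dropped,
    roughly what a major compaction would do.
    """
    # heapq.merge reads each run front to back, i.e. sequentially.
    merged = heapq.merge(*runs, key=lambda cell: (cell[0], -cell[1]))
    out: List[Cell] = []
    last_key: Optional[str] = None
    for key, seq, value in merged:
        if key == last_key:
            continue  # an older version of a key we already handled
        last_key = key
        if value is not None:
            out.append((key, seq, value))
    return out


older = [("a", 1, "1"), ("b", 2, "2"), ("c", 3, "3")]
newer = [("b", 4, "20"), ("c", 5, None), ("d", 6, "4")]
print(compact([older, newer]))
```

With these two runs the result is `[("a", 1, "1"), ("b", 4, "20"), ("d", 6, "4")]`: the inputs are only ever read in order and the output is written in order, which is the sequential IO side of the trade.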
-
Re: HBase without compactions?
Enis Söztutar 2013-02-22, 01:15
From some of their presentations, I've gathered that they implement B-Trees instead of LSMs on top of their file system, which allows random writes. They also claim that they are converting random mutation requests to the B-Tree leaves into sequential writes. They are also talking about mini-WALs to do this, so there might be mini-LSMs going on. Not sure.
In any case, agreed: if there are LSMs, there are compactions. The LSM vs. B-Tree tradeoffs are well understood.
Enis
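The tradeoff mentioned above can be illustrated with a toy comparison. This is only a sketch under loud assumptions: `LsmStore` and `InPlaceStore` are hypothetical names, a sorted in-memory list stands in for a B-Tree, and nothing here reflects how MapR or HBase is actually built. It shows that without compaction an LSM read may have to search several runs, while an update-in-place structure searches one but has to modify data at arbitrary positions.

```python
import bisect
from typing import Dict, List, Optional, Tuple


class LsmStore:
    """Buffers writes in memory and flushes them as immutable sorted runs."""

    def __init__(self, flush_size: int = 2) -> None:
        self.flush_size = flush_size
        self.memstore: Dict[str, str] = {}
        self.runs: List[List[Tuple[str, str]]] = []  # newest run is last

    def put(self, key: str, value: str) -> None:
        self.memstore[key] = value
        if len(self.memstore) >= self.flush_size:
            # A flush writes one new sorted run; existing runs are never touched.
            self.runs.append(sorted(self.memstore.items()))
            self.memstore = {}

    def get(self, key: str) -> Tuple[Optional[str], int]:
        """Return (value, number of runs searched)."""
        if key in self.memstore:
            return self.memstore[key], 0
        searched = 0
        for run in reversed(self.runs):  # newest run first
            searched += 1
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1], searched
        return None, searched


class InPlaceStore:
    """Sorted structure updated in place (stand-in for a B-Tree)."""

    def __init__(self) -> None:
        self.keys: List[str] = []
        self.values: List[str] = []

    def put(self, key: str, value: str) -> None:
        i = bisect.bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            self.values[i] = value  # overwrite at an arbitrary position
        else:
            self.keys.insert(i, key)
            self.values.insert(i, value)

    def get(self, key: str) -> Tuple[Optional[str], int]:
        i = bisect.bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            return self.values[i], 1
        return None, 1
```

After the writes `a=1, b=2, c=3, a=10, d=4` with `flush_size=2`, `LsmStore.get("b")` has to search two runs before it finds the value, whereas `InPlaceStore` always searches its single structure. Merging the runs, as a compaction does, is what brings the LSM read cost back down.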