-
reason to do major compaction after split
Sergey Shelukhin 2013-03-07, 18:50
Hi. Is there a reason to do major compaction after split, instead of allowing the reference files to go away gradually as the normal compactions happen? I could think up two reasons - region with reference files currently cannot be split again (not clear why not though, could just create more references); and avoiding load on the same datanodes from both new regions. Are there some other reasons?
-
Re: reason to do major compaction after split
Jean-Daniel Cryans 2013-03-07, 18:54
Cleaning up the parent would be another one.
J-D
-
Re: reason to do major compaction after split
Stack 2013-03-07, 18:58
On Thu, Mar 7, 2013 at 10:50 AM, Sergey Shelukhin <[EMAIL PROTECTED]> wrote:
> I could think up two reasons - region with reference files currently cannot
> be split again (not clear why not though, could just create more
> references); and avoiding load on the same datanodes from both new regions.
> Are there some other reasons?

We could do references to references, but we were afraid the linkage would be too fragile and would break in hard-to-trace ways.
St.Ack
-
Re: reason to do major compaction after split
Enis Söztutar 2013-03-07, 19:03
I was thinking of allowing regions with refs to split again, but the parent-cleanup logic would get a lot messier.
Enis
-
Re: reason to do major compaction after split
Sergey Shelukhin 2013-03-07, 20:58
Can you create same-level references instead of references to references?
-
Re: reason to do major compaction after split
Enis Söztutar 2013-03-07, 21:14
We do not have to create references to references. We can find the original file and directly create a ref at the granddaughters. The messy part is in the cleanup for the parent region, where we have to recursively search for all successors to decide whether we can delete this region, and delete the hfile.
Enis
-
Re: reason to do major compaction after split
Stack 2013-03-07, 22:13
On Thu, Mar 7, 2013 at 1:14 PM, Enis Söztutar <[EMAIL PROTECTED]> wrote:
> We do not have to create references to references. We can find the
> original file and directly create a ref at the granddaughters. The messy
> part is in the cleanup for the parent region, where we have to recursively
> search for all successors to decide whether we can delete this region, and
> delete the hfile.

Yes. That is a few trips to the NN to list directory contents, and then some edits/reads of .META. We would have to introduce a QuarterHFile to go with our HalfHFile (or rename HalfHFile as PieceO'HFile).
St.Ack
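For readers following along, the recursive successor search described above could look roughly like the sketch below. It is only an illustration under stated assumptions: `Region`, `still_referenced`, and `can_delete_parent` are hypothetical names, the region tree is held in memory rather than read from the NN and .META., and none of this is HBase's actual cleanup code.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    """Hypothetical in-memory stand-in for a region entry (not an HBase class)."""
    name: str
    # Names of ancestor hfiles this region still reads through a reference.
    referenced_hfiles: set = field(default_factory=set)
    # Daughter regions created by splitting this region.
    daughters: list = field(default_factory=list)


def still_referenced(region, hfile):
    """Return True if any successor of `region` still references `hfile`."""
    for daughter in region.daughters:
        if hfile in daughter.referenced_hfiles:
            return True
        # Granddaughters may reference the original file directly,
        # so the search has to keep descending.
        if still_referenced(daughter, hfile):
            return True
    return False


def can_delete_parent(parent, parent_hfiles):
    """The parent may go only when no successor references any of its hfiles."""
    return not any(still_referenced(parent, h) for h in parent_hfiles)


# P was split into A and B; A was split again before compacting,
# so A1 and A2 hold references straight to P's file "f1".
a1 = Region("A1", {"f1"})
a2 = Region("A2", {"f1"})
p = Region("P", set(), [Region("A", set(), [a1, a2]), Region("B")])
print(can_delete_parent(p, ["f1"]))
```

Each level of the walk stands in for one of the directory listings or .META. reads mentioned above, which is where the cost and the messiness come from.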
-
Re: reason to do major compaction after split
Matteo Bertozzi 2013-03-07, 22:28
This seems to be going in a super messy direction. With HBASE-7806 the idea was to clean up all this crazy stuff (HFileLink, References, ...).
Unfortunately, the initial decision to tie the fs layout to the tables/regions/families is what forces all these workarounds whenever we want something cool.
If you put the files in one place, and the association in another you can avoid all this complexity.
/hbase/data/[file 1, file 2, file 3, file N]

table 1/region 1: [file 2]
table 1/region 2: [file 1 (from 0 to 50)]
table 1/region 3: [file 1 (from 50 to 100)]
table 2/region 1: [file 1, file 2]
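
The layout above can be sketched as a small mapping from regions to file slices. This is a minimal illustration, not a design from HBASE-7806: `FilesTable` and its methods are hypothetical names, and the 0 to 100 range is treated as an abstract key position, as in the listing.

```python
from collections import defaultdict


class FilesTable:
    """Hypothetical 'files table': region -> list of (file, start, end) slices."""

    def __init__(self):
        self.slices = defaultdict(list)

    def add(self, region, file, start=0, end=100):
        """Associate the slice [start, end) of `file` with `region`."""
        self.slices[region].append((file, start, end))

    def split(self, region, left, right, mid):
        """Replace `region` with two regions; only the association changes."""
        for file, start, end in self.slices.pop(region):
            if start < mid:
                self.slices[left].append((file, start, min(end, mid)))
            if end > mid:
                self.slices[right].append((file, max(start, mid), end))

    def deletable(self, all_files):
        """Files under /hbase/data that no region points at any more."""
        used = {f for entries in self.slices.values() for f, _, _ in entries}
        return set(all_files) - used


table = FilesTable()
table.add("table 1/region 1", "file 1")
table.split("table 1/region 1", "table 1/region 2", "table 1/region 3", 50)
# A region that came from a split can be split again: just narrower slices.
table.split("table 1/region 2", "table 1/region 4", "table 1/region 5", 25)
print(dict(table.slices))
print(table.deletable(["file 1", "file 2"]))
```

In this model a second split needs no references to references, and cleanup is a set difference instead of a walk over parent and daughter directories.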
-
Re: reason to do major compaction after split
Stack 2013-03-07, 22:56
On Thu, Mar 7, 2013 at 2:28 PM, Matteo Bertozzi <[EMAIL PROTECTED]> wrote:
> This seems to be going in a super messy direction.

Smile. I was thinking you'd show up on this thread.
Agree.

> If you put the files in one place, and the association in another you can
> avoid all this complexity.
>
> /hbase/data/[file 1, file 2, file 3, file N]
>
> table 1/region 1: [file 2]
> table 1/region 2: [file 1 (from 0 to 50)]
> table 1/region 3: [file 1 (from 50 to 100)]
> table 2/region 1: [file 1, file 2]

Any ideas on what migration from the current format to the above would be like, Matteo? We'd read the current layout, use it to populate a files table, new files would be written to the new /hbase/data/ dir, and for a while we'd span the old and new locations?
St.Ack
-
Re: reason to do major compaction after split
Matteo Bertozzi 2013-03-07, 23:09
On Thu, Mar 7, 2013 at 10:56 PM, Stack <[EMAIL PROTECTED]> wrote:
> Any ideas on what migration from current format to the above would be like
> Matteo? We'd read current layout, use it to populate a files table, new
> files would be written to the new /hbase/data/ dir, and for a while we'd
> span the old and new locations?

If you can shut down the whole cluster, the way is easy: move all the hfiles into /hbase/data and populate the "files table".
If you can't, you just have to keep the current code able to read the current fs layout and the archive: if there's something in that directory, it reads from there as it does today; if not, it goes to the files table. On write (flush, compaction), it adds the new file to the "files table" and to /hbase/data.
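
The online path might be sketched like this. It is only one reading of the message: `store_files` and `on_flush_or_compaction` are hypothetical helpers, plain dicts and a set stand in for the legacy directories, the files table, and /hbase/data, and the sketch assumes a region can have files in both places during the migration window, so it returns both (the message could also be read as strictly either/or).

```python
def store_files(region, legacy_layout, files_table):
    """Resolve a region's files while old and new layouts coexist.

    Files found in the legacy region directory are read as today;
    anything written since the switch is looked up in the files table.
    """
    legacy = list(legacy_layout.get(region, []))
    migrated = list(files_table.get(region, []))
    return legacy + migrated


def on_flush_or_compaction(region, new_file, files_table, data_dir):
    """New files only ever go to the new layout."""
    data_dir.add(new_file)                               # stands in for /hbase/data
    files_table.setdefault(region, []).append(new_file)  # the "files table"


legacy_layout = {"region 1": ["old-hfile-1"]}
files_table = {}
data_dir = set()

on_flush_or_compaction("region 1", "new-hfile-1", files_table, data_dir)
print(store_files("region 1", legacy_layout, files_table))
```

Once a region's legacy directory is empty, every read for it is served from the files table and the fallback code can eventually be dropped.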
-
Re: reason to do major compaction after split
Sergey Shelukhin 2013-03-07, 23:20
Hmm... should we have hardlinks (or use HDFS hardlinks, if any) to solve this problem? HalfHFile could be HFileWithRange :)
-
Re: reason to do major compaction after split
Sergey Shelukhin 2013-03-07, 23:22
Hmm... ranges sounds good, but for files, it would be nice if there were a hardlink mechanism. It should be trivial to do in HDFS if blocks could belong to several files. Then we don't have to have private cleanup code.
-
Re: reason to do major compaction after split
Matteo Bertozzi 2013-03-07, 23:36
Sure, having hardlink support (HDFS-3370 <https://issues.apache.org/jira/browse/HDFS-3370>) solves the HFileLink hack, but you still need to add extra metadata for splits (reference files).

Also, if instead of files you think about handling blocks directly, you can end up doing more stuff, like a proper compaction that requires less I/O if N blocks are not changed, some crazy deduplication on tables with the same content, and similar...
-
Re: reason to do major compaction after split
Enis Söztutar 2013-03-08, 00:11
> /hbase/data/[file 1, file 2, file 3, file N]
>
> table 1/region 1: [file 2]
> table 1/region 2: [file 1 (from 0 to 50)]
> table 1/region 3: [file 1 (from 50 to 100)]
> table 2/region 1: [file 1, file 2]

We do not necessarily have to have a separate dir for files. We can just keep the files in the region dir until there are no more references. The problem comes from the fact that we rely on hdfs ls for regions rather than META being the one and only authoritative source.
Enis
-
Re: reason to do major compaction after split
Andrew Purtell 2013-03-08, 01:42
> also, if instead of files you think about handling blocks directly you can
> end up doing more stuff, like a proper compaction that require less I/O if
> N blocks are not changed, some crazy deduplication on tables with same
> content & similar...

Sounds like a step toward using a block pool directly and avoiding the filesystem layer (Hadoop 2+).

--
Best regards,
- Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
-
Re: reason to do major compaction after split
Sergey Shelukhin 2013-03-08, 19:32
+1. That gives us a lot of freedom to do stuff in many scenarios.
-
Re: reason to do major compaction after split
Enis Söztutar 2013-03-08, 20:06
> Sounds like a step toward using a block pool directly and avoiding the
> filesystem layer (Hadoop 2+).

This has come up previously. With federation, we should be able to embed NN as a first cut, and own all the blocks in the hbase namespace.
Enis