HBase, mail # dev - reason to do major compaction after split


Sergey Shelukhin 2013-03-07, 18:50
Stack 2013-03-07, 18:58
Enis Söztutar 2013-03-07, 19:03
Sergey Shelukhin 2013-03-07, 20:58
Enis Söztutar 2013-03-07, 21:14
Stack 2013-03-07, 22:13
Matteo Bertozzi 2013-03-07, 22:28
Stack 2013-03-07, 22:56
Matteo Bertozzi 2013-03-07, 23:09
Sergey Shelukhin 2013-03-07, 23:22
Matteo Bertozzi 2013-03-07, 23:36
Enis Söztutar 2013-03-08, 00:11
Andrew Purtell 2013-03-08, 01:42
Sergey Shelukhin 2013-03-08, 19:32
Re: reason to do major compaction after split
Enis Söztutar 2013-03-08, 20:06
> Sounds like a step toward using a block pool directly and avoiding the
> filesystem layer (Hadoop 2+).

This has come up previously. With HDFS federation, we should be able to embed
the NN as a first cut, and own all the blocks in the hbase namespace.

Enis
On Fri, Mar 8, 2013 at 11:32 AM, Sergey Shelukhin <[EMAIL PROTECTED]> wrote:

> +1.
> That gives us a lot of freedom to do stuff in many scenarios.
>
> On Thu, Mar 7, 2013 at 5:42 PM, Andrew Purtell <[EMAIL PROTECTED]> wrote:
>
> > > Also, if instead of files you think about handling blocks directly,
> > > you can end up doing more stuff, like a proper compaction that
> > > requires less I/O if N blocks are not changed, some crazy deduplication
> > > on tables with the same content, and similar...
> >
> > Sounds like a step toward using a block pool directly and avoiding the
> > filesystem layer (Hadoop 2+).
> >
> >
> > On Fri, Mar 8, 2013 at 7:36 AM, Matteo Bertozzi <[EMAIL PROTECTED]> wrote:
> >
> > > Sure, having hardlink support
> > > (HDFS-3370 <https://issues.apache.org/jira/browse/HDFS-3370>)
> > > would solve the HFileLink hack,
> > > but you still need to add extra metadata for splits (reference files).
> > >
> > > Also, if instead of files you think about handling blocks directly,
> > > you can end up doing more stuff, like a proper compaction that
> > > requires less I/O if N blocks are not changed, some crazy deduplication
> > > on tables with the same content, and similar...
> > >
> > > On Thu, Mar 7, 2013 at 11:22 PM, Sergey Shelukhin <[EMAIL PROTECTED]> wrote:
> > >
> > > > Hmm... ranges sound good, but for files, it would be nice if there were a
> > > > hardlink mechanism.
> > > > It should be trivial to do in HDFS if blocks could belong to several files.
> > > > Then we don't have to have private cleanup code.
> > > >
> > > > On Thu, Mar 7, 2013 at 2:28 PM, Matteo Bertozzi <[EMAIL PROTECTED]> wrote:
> > > >
> > > > > This seems to be going in a super messy direction.
> > > > > With HBASE-7806 the idea was to clean up all this crazy stuff
> > > > > (HFileLink, References, ...)
> > > > >
> > > > > Unfortunately, the initial decision to tie together the fs layout
> > > > > and the tables/regions/families is leading to all these workarounds
> > > > > to have something cool.
> > > > >
> > > > > If you put the files in one place, and the association in another,
> > > > > you can avoid all this complexity.
> > > > >
> > > > > /hbase/data/[file1, file 2, file 3, file N]
> > > > >
> > > > > table 1/region 1: [file 2]
> > > > > table 1/region 2: [file 1 (from 0 to 50)]
> > > > > table 1/region 3: [file 1 (from 50 to 100)]
> > > > > table 2/region 1: [file 1, file 2]
> > > > >
> > > > > On Thu, Mar 7, 2013 at 10:13 PM, Stack <[EMAIL PROTECTED]> wrote:
> > > > >
> > > > > > Yes.  That is a few trips to the NN listing directory contents and
> > > > > > then some edits/reading of .META.  We would have to introduce a
> > > > > > QuarterHFile to go with our HalfHFile (or rename HalfHFile as
> > > > > > PieceO'HFile).
> > > > > >
> > > > > >
> > > > > > St.Ack
> > > > > >
> > > > >
> > > >
> > >
> >
> >
> >
> > --
> > Best regards,
> >
> >    - Andy
> >
> > Problems worthy of attack prove their worth by hitting back. - Piet Hein
> > (via Tom White)
> >
>
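
The layout Matteo describes in the quoted message (files in one place, the association in another) can be illustrated with a small sketch. This is not HBase code: the names `FileRange`, `Catalog`, `split`, and `unreferenced` are hypothetical, and the 0 to 100 positions simply mirror the "from 0 to 50" notation in the example, whose units the thread does not specify. Under those assumptions, a split becomes a metadata-only change, and cleanup reduces to finding files that no region references.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass(frozen=True)
class FileRange:
    """A slice of a shared store file (hypothetical; positions are abstract)."""
    file: str
    start: int = 0
    end: int = 100


@dataclass
class Catalog:
    """Maps (table, region) to the file ranges it reads; files live elsewhere."""
    regions: dict = field(default_factory=dict)

    def assign(self, table, region, ranges):
        self.regions[(table, region)] = list(ranges)

    def split(self, table, region, left, right, mid):
        """Replace one region with two that reference slices of the same files.

        No data file is copied or rewritten; only the association changes.
        """
        parent = self.regions.pop((table, region))
        self.regions[(table, left)] = [
            FileRange(r.file, r.start, min(r.end, mid)) for r in parent if r.start < mid
        ]
        self.regions[(table, right)] = [
            FileRange(r.file, max(r.start, mid), r.end) for r in parent if r.end > mid
        ]

    def ref_counts(self):
        """Count how many region entries reference each file."""
        return Counter(r.file for ranges in self.regions.values() for r in ranges)

    def unreferenced(self, all_files):
        """Files that no region points at, i.e. candidates for cleanup."""
        counts = self.ref_counts()
        return sorted(f for f in all_files if counts[f] == 0)


# Rebuild the example from the quoted message, starting from an unsplit region.
catalog = Catalog()
catalog.assign("table 1", "region 1", [FileRange("file 2")])
catalog.assign("table 1", "parent", [FileRange("file 1")])
catalog.assign("table 2", "region 1", [FileRange("file 1"), FileRange("file 2")])
catalog.split("table 1", "parent", "region 2", "region 3", mid=50)

for key, ranges in sorted(catalog.regions.items()):
    print(key, ranges)
print(catalog.unreferenced(["file 1", "file 2", "file 3"]))
```

In this sketch the reference counts play the role that hardlinks would play at the filesystem level: a file can be deleted only when its count drops to zero, so no per-feature cleanup logic is needed.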
Sergey Shelukhin 2013-03-07, 23:20
Jean-Daniel Cryans 2013-03-07, 18:54