HBase, mail # dev - reason to do major compaction after split


Re: reason to do major compaction after split
Matteo Bertozzi 2013-03-07, 23:36
Sure, having hardlink support
(HDFS-3370 <https://issues.apache.org/jira/browse/HDFS-3370>)
would solve the HFileLink hack,
but you would still need to add extra metadata for splits (reference files).

Also, if instead of files you think about handling blocks directly,
you can end up doing more, such as a proper compaction that
requires less I/O when N blocks are unchanged, or some crazy deduplication
on tables with the same content, and similar.

On Thu, Mar 7, 2013 at 11:22 PM, Sergey Shelukhin <[EMAIL PROTECTED]> wrote:

> Hmm... ranges sound good, but for files, it would be nice if there were a
> hardlink mechanism.
> It should be trivial to do in HDFS if blocks could belong to several files.
> Then we don't have to have private cleanup code.
>
> On Thu, Mar 7, 2013 at 2:28 PM, Matteo Bertozzi <[EMAIL PROTECTED]> wrote:
>
> > This seems to be going in a super messy direction.
> > With HBASE-7806 the idea was to clean up all this crazy stuff (HFileLink,
> > References, ...)
> >
> > Unfortunately, the initial decision to tie together the fs layout
> > and the tables/regions/families is leading to all these workarounds to have
> > something cool.
> >
> > If you put the files in one place, and the association in another, you can
> > avoid all this complexity.
> >
> > /hbase/data/[file 1, file 2, file 3, file N]
> >
> > table 1/region 1: [file 2]
> > table 1/region 2: [file 1 (from 0 to 50)]
> > table 1/region 3: [file 1 (from 50 to 100)]
> > table 2/region 1: [file 1, file 2]
> >
> > On Thu, Mar 7, 2013 at 10:13 PM, Stack <[EMAIL PROTECTED]> wrote:
> >
> > > Yes.  That is a few trips to the NN listing directory contents and then
> > > some edits/reading of .META.  We would have to introduce a QuarterHFile to
> > > go with our HalfHFile (or rename HalfHFile as PieceO'HFile).
> > >
> > >
> > > St.Ack
> > >
> >
>
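
The layout proposed in the quoted message (all files in one place, with a separate table/region association that may point at a range of a file) can be illustrated with a minimal sketch. This is not HBase code: `FileRange`, `split_region`, `unreferenced`, and the integer 0..100 ranges are hypothetical names and values chosen only to mirror the example in the thread.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Set, Tuple


@dataclass(frozen=True)
class FileRange:
    """Hypothetical reference to all or part of a shared store file."""
    file: str
    start: Optional[int] = None  # None means "from the beginning of the file"
    end: Optional[int] = None    # None means "to the end of the file"


# Key is (table, region); value is the list of file ranges that region reads.
Manifest = Dict[Tuple[str, str], List[FileRange]]


def split_region(manifest: Manifest, parent: Tuple[str, str],
                 low: Tuple[str, str], high: Tuple[str, str],
                 split_point: int) -> None:
    """Replace `parent` with two daughters that reference the same files by range.

    No file is copied or rewritten; only the association changes.
    """
    refs = manifest.pop(parent)
    manifest[low] = [FileRange(r.file, r.start, split_point) for r in refs]
    manifest[high] = [FileRange(r.file, split_point, r.end) for r in refs]


def unreferenced(all_files: Iterable[str], manifest: Manifest) -> Set[str]:
    """Files that no region references any more, i.e. candidates for cleanup."""
    live = {r.file for refs in manifest.values() for r in refs}
    return set(all_files) - live


# Rebuild the example from the thread: one region of table 1 owns file 1
# (assumed to span 0..100) and is then split at 50.
files = ["file 1", "file 2", "file 3"]
manifest: Manifest = {
    ("table 1", "region 1"): [FileRange("file 2")],
    ("table 1", "parent"): [FileRange("file 1", 0, 100)],
    ("table 2", "region 1"): [FileRange("file 1"), FileRange("file 2")],
}
split_region(manifest, ("table 1", "parent"),
             ("table 1", "region 2"), ("table 1", "region 3"), 50)
orphans = unreferenced(files, manifest)
```

In this sketch a split only edits the manifest, and a file becomes eligible for deletion once no entry mentions it, which is the role the private cleanup code and reference files play in the file-per-region layout discussed above.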