On Tue, Dec 3, 2013 at 3:07 PM, Enis Söztutar <[EMAIL PROTECTED]> wrote:
> On Tue, Dec 3, 2013 at 2:03 PM, Jonathan Hsieh <[EMAIL PROTECTED]> wrote:
> > On Tue, Dec 3, 2013 at 11:42 AM, Enis Söztutar <[EMAIL PROTECTED]> wrote:
> > > On Mon, Dec 2, 2013 at 10:20 PM, Jonathan Hsieh <[EMAIL PROTECTED]> wrote:
> > >
> > > > > Deveraj:
> > > > > Jonathan Hsieh, WAL per region (WALpr) would give you the locality
> > > > > (and hence HDFS short circuit) of reads if you were to couple it
> > > > > with the favored nodes. The cost is of course more WAL files...
> > > > > In the current situation (no WALpr) it would create quite some
> > > > > traffic cross machine, no?
> > > >
> > > > I think we all agree that WAL per region isn't efficient in today's
> > > > spinning hard drive world, where we are limited to a relatively low
> > > > budget of seeks (though it may be better in the future with SSDs).
> > > >
> > >
> > > WALpr makes sense in a fully SSD world and if HDFS had journaling for
> > > writes. I don't think anybody is working on this yet.
> > what do you mean by journaling for writes? do you mean where sync
> > operations update length at the nn on every call?
> I think the HDFS guys were using "super sync" to refer to that. I was
> referring to a journaling file system, where the writes to multiple files
> are persisted to a journal disk so that you do not pay the constant seeks
> for writing to a lot of files (for region WALs) in parallel.
Wait, we have a system that provides the ability to write data for a bunch
of buckets to a particular disk before rewriting them to others in a
split-out, read-optimized form.
Isn't this what HBase and its HLog basically provide? :)
Joking aside, can you give a quick example of the semantics it would have
so I can grok what you are talking about?
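
To make the pattern I described above concrete (one sequential log shared
by many buckets, split out per bucket later), here is a minimal toy sketch
in Python. It is not HBase's actual HLog code; the names SharedLog, append,
and split are made up for illustration, and an in-memory list stands in for
the single append-only file.

```python
from collections import defaultdict


class SharedLog:
    """Toy shared write-ahead log: one sequential stream for many regions."""

    def __init__(self):
        # Hypothetical stand-in for a single append-only file on one disk.
        self.entries = []

    def append(self, region, edit):
        # Every region's edit lands in the same sequential stream, so the
        # write path touches one file instead of one file per region.
        self.entries.append((region, edit))
        return len(self.entries) - 1  # sequence number of this entry

    def split(self):
        # Later (for example on recovery), regroup the entries per region,
        # preserving the original append order within each region.
        per_region = defaultdict(list)
        for region, edit in self.entries:
            per_region[region].append(edit)
        return dict(per_region)


log = SharedLog()
log.append("region-a", "put row1")
log.append("region-b", "put row9")
log.append("region-a", "delete row1")
by_region = log.split()
```

The question for the journaling proposal would be how its semantics differ
from this append-then-split shape.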
// Jonathan Hsieh (shay)
// Software Engineer, Cloudera
// [EMAIL PROTECTED]