Hadoop >> mail # user >> Memory mapped resources


Benson Margulies 2011-04-11, 22:57
Jason Rutherglen 2011-04-11, 23:05
Edward Capriolo 2011-04-12, 00:32
Ted Dunning 2011-04-12, 01:30
Jason Rutherglen 2011-04-12, 01:48
Ted Dunning 2011-04-12, 04:09
Kevin.Leach@... 2011-04-12, 12:51
Ted Dunning 2011-04-12, 15:07
Jason Rutherglen 2011-04-12, 13:32
Ted Dunning 2011-04-12, 15:08
Jason Rutherglen 2011-04-12, 15:24
Ted Dunning 2011-04-12, 15:35
Benson Margulies 2011-04-12, 17:40
Jason Rutherglen 2011-04-12, 18:09
Re: Memory mapped resources
Actually, it doesn't become trivial.  It just becomes total fail or total
win instead of almost always being a partial win.  It doesn't meet Benson's
need.
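
The "one block per file" approach Jason describes in the quoted message
below comes down to writing the resource with a per-file block size larger
than the file itself, so the whole file lands in a single HDFS block. A
rough sketch of that idea, using Hadoop's FileSystem.create() overload that
takes an explicit block size; the path, block size, and replication values
are made-up placeholders, not anything from the thread:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SingleBlockWrite {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Hypothetical path and sizes, for illustration only.
        Path model = new Path("/models/model.bin");
        int bufferSize = conf.getInt("io.file.buffer.size", 4096);
        short replication = 3;
        long blockSize = 4L * 1024 * 1024 * 1024; // assumed larger than the file

        // This create() overload takes a per-file block size, so the whole
        // file occupies one HDFS block and a datanode holds it complete.
        FSDataOutputStream out =
            fs.create(model, true, bufferSize, replication, blockSize);
        try {
            // ... write the model bytes here ...
        } finally {
            out.close();
        }
    }
}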

On Tue, Apr 12, 2011 at 11:09 AM, Jason Rutherglen <
[EMAIL PROTECTED]> wrote:

> To get around the chunks or blocks problem, I've been implementing a
> system that simply sets a max block size that is too large for a file
> to reach.  In this way there will only be one block per HDFS file, and
> so MMap'ing or other single file ops become trivial.
>
> On Tue, Apr 12, 2011 at 10:40 AM, Benson Margulies
> <[EMAIL PROTECTED]> wrote:
> > Here's the OP again.
> >
> > I want to make it clear that my question here has to do with the
> > problem of distributing 'the program' around the cluster, not 'the
> > data'. In the case at hand, the issue is a system that has a large
> > data resource that it needs in order to do its work. Every instance
> > of the code needs the entire model, not just some blocks or pieces.
> >
> > Memory mapping is a very attractive tactic for this kind of data
> > resource. The data is read-only. Memory-mapping it allows the
> > operating system to ensure that only one copy of the thing ends up in
> > physical memory.
> >
> > If we force the model into a conventional file (storable in HDFS) and
> > read it into the JVM in a conventional way, then we get as many copies
> > in memory as we have JVMs.  On a big machine with a lot of cores, this
> > begins to add up.
> >
> > For people who are running a cluster of relatively conventional
> > systems, just putting copies on all the nodes in a conventional place
> > is adequate.
> >
>
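
Benson's point in the quoted message is that a memory-mapped, read-only
file is backed by the OS page cache, so every JVM on a node that maps the
same local file shares one set of physical pages instead of each holding
its own heap copy. A minimal sketch using java.nio, assuming the resource
has already been copied out of HDFS onto a local filesystem (the path is a
made-up placeholder), since the standard HDFS client does not expose the
file for direct mmap:

import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class MappedModel {
    public static void main(String[] args) throws Exception {
        // Hypothetical local path where the model was copied out of HDFS.
        try (RandomAccessFile raf =
                 new RandomAccessFile("/local/models/model.bin", "r");
             FileChannel ch = raf.getChannel()) {
            // READ_ONLY mapping: pages come from the OS page cache, so every
            // process mapping this file shares one physical copy of the data.
            // Note: a single MappedByteBuffer tops out at 2 GB; a larger
            // resource needs several map() calls over successive ranges.
            MappedByteBuffer buf =
                ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            // ... read the model through buf.get(...), buf.getLong(...), etc.
        }
    }
}
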
Luke Lu 2011-04-12, 19:50
Luca Pireddu 2011-04-13, 07:21
M. C. Srivas 2011-04-13, 02:16
Ted Dunning 2011-04-13, 04:09
Benson Margulies 2011-04-13, 10:54
M. C. Srivas 2011-04-13, 14:33
Benson Margulies 2011-04-13, 14:35
Lance Norskog 2011-04-14, 02:41
Michael Flester 2011-04-12, 14:06