HDFS >> mail # dev >> [DISCUSS] Remove append?


Re: [DISCUSS] Remove append?
On Tue, Mar 20, 2012 at 5:37 PM, Eli Collins <[EMAIL PROTECTED]> wrote:

>
>
> Append introduces non-trivial design and code complexity, which is not
> worth the cost if we don't have real users.

The bulk of the complexity of HDFS-265 ("the new Append") was around
hflush, concurrent readers, the pipeline, etc. The code and complexity for
appending to a previously closed file were not that large.

> Removing append means we
> have the property that HDFS blocks, when finalized, are immutable.
> This significantly simplifies the design and code, which significantly
> simplifies the implementation of other features like snapshots,
> HDFS-level caching, dedupe, etc.
>

While snapshots are challenging with append, the problem is solvable - the snapshot
needs to remember the length of the file. (We have a working prototype - we
will be posting the design and the code soon.)
I agree that the notion of an immutable file is useful since it lets the
system and tools optimize certain things. A Xerox PARC file system in the
1980s had this feature, which the system exploited. I would support adding the
notion of an immutable file to Hadoop.
sanjay
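
The point above about a snapshot remembering the file length can be illustrated with a rough sketch. This is not the prototype mentioned in the message and it uses no HDFS code; AppendOnlyFile and Snapshot are hypothetical in-memory stand-ins, and the sketch assumes that append only ever adds bytes at the end and never rewrites existing ones. Under that assumption, recording the length at snapshot time is enough to keep a stable view of the file:

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class SnapshotLengthSketch {

    /** Hypothetical in-memory stand-in for an append-only file; not an HDFS class. */
    static final class AppendOnlyFile {
        private final ByteArrayOutputStream data = new ByteArrayOutputStream();

        /** Adds bytes at the end; existing bytes are never modified. */
        void append(byte[] bytes) {
            data.write(bytes, 0, bytes.length);
        }

        long length() {
            return data.size();
        }

        /** Returns the first 'limit' bytes of the file. */
        byte[] readPrefix(long limit) {
            return Arrays.copyOf(data.toByteArray(), (int) limit);
        }
    }

    /** Hypothetical snapshot: it stores only the length seen at creation time. */
    static final class Snapshot {
        private final AppendOnlyFile file;
        private final long lengthAtSnapshot;

        Snapshot(AppendOnlyFile file) {
            this.file = file;
            this.lengthAtSnapshot = file.length();
        }

        long length() {
            return lengthAtSnapshot;
        }

        /** Reads never go past the recorded length, so later appends are invisible. */
        byte[] read() {
            return file.readPrefix(lengthAtSnapshot);
        }
    }

    public static void main(String[] args) {
        AppendOnlyFile file = new AppendOnlyFile();
        file.append("hello".getBytes(StandardCharsets.UTF_8));

        Snapshot snapshot = new Snapshot(file);
        file.append(" world".getBytes(StandardCharsets.UTF_8));

        // The live file has grown; the snapshot still sees the old contents.
        System.out.println(file.length());                                        // 11
        System.out.println(new String(snapshot.read(), StandardCharsets.UTF_8)); // hello
    }
}
```

A real implementation would also have to deal with blocks, replicas and in-progress writes, none of which this sketch attempts to model.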