HDFS user mailing list: structured data split


臧冬松 2011-11-11, 07:43
Re: structured data split
Hi,
   Structured data is always liable to be split across different blocks, e.g. a
word or a line straddling a boundary.
   A MapReduce task reads HDFS data in units of *lines*: it will read the
whole line, continuing from the end of the previous block into the start of the
subsequent one, to obtain the rest of that line's record. So you do not need to
worry about incomplete structured data. HDFS itself does nothing for this
mechanism; it is handled on the MapReduce side.

-Regards
Denny Ye

On Fri, Nov 11, 2011 at 3:43 PM, 臧冬松 <[EMAIL PROTECTED]> wrote:

> Usually a large file in HDFS is split into blocks and stored on different
> DataNodes.
> A map task is assigned to deal with each block. I wonder what happens if a
> piece of structured data (e.g. a word) is split across two blocks?
> How do MapReduce and HDFS deal with this?
>
> Thanks!
> Donal
>
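To make the boundary handling concrete, here is a minimal sketch in Java of the rule described above: a reader that does not own byte 0 of the file skips the partial first line (it belongs to the previous split's reader), and every reader keeps going past its own split end to finish its last line. This is only an illustration of the idea, not Hadoop's actual LineRecordReader; the class and method names (SplitLineReaderSketch, readLinesForSplit) are made up for this example.

// Sketch of boundary-spanning line reading (illustration only, not Hadoop code):
// skip the partial first line unless this split owns byte 0, and read past the
// split end to finish the last line, so a line that crosses a block/split
// boundary is read exactly once.
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class SplitLineReaderSketch {

    // Return the lines "owned" by the byte range [start, end) of data.
    static List<String> readLinesForSplit(byte[] data, int start, int end) {
        List<String> lines = new ArrayList<>();
        int pos = start;

        // If this split does not begin at byte 0, the first (possibly partial)
        // line belongs to the previous split's reader, so skip it.
        if (start != 0) {
            while (pos < data.length && data[pos] != '\n') {
                pos++;
            }
            pos++; // step past the newline
        }

        // Emit every line that *starts* inside this split; the last one may
        // run past 'end' into the next block, and we still read it fully.
        while (pos < data.length && pos < end) {
            int lineStart = pos;
            while (pos < data.length && data[pos] != '\n') {
                pos++;
            }
            lines.add(new String(data, lineStart, pos - lineStart, StandardCharsets.UTF_8));
            pos++; // step past the newline
        }
        return lines;
    }

    public static void main(String[] args) {
        // Three records; an artificial "block boundary" at byte 20 falls in the
        // middle of the second record.
        byte[] data = "record-one\nrecord-two-spans-boundary\nrecord-three\n"
                .getBytes(StandardCharsets.UTF_8);
        int boundary = 20;

        System.out.println("split 1: " + readLinesForSplit(data, 0, boundary));
        System.out.println("split 2: " + readLinesForSplit(data, boundary, data.length));
        // split 1: [record-one, record-two-spans-boundary]   (finished past the boundary)
        // split 2: [record-three]                            (skipped the partial first line)
    }
}

In real Hadoop the same idea is applied by the input format's record reader over the HDFS input stream, so the map task that owns the earlier split transparently reads the few extra bytes from the next block (possibly over the network) when its last record crosses the boundary.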
Further replies in this thread:

臧冬松 2011-11-11, 10:11
Bejoy KS 2011-11-11, 11:01
臧冬松 2011-11-11, 12:46
bejoy.hadoop@... 2011-11-11, 13:25
Harsh J 2011-11-11, 13:54
Bejoy KS 2011-11-11, 14:38
Harsh J 2011-11-11, 16:06
Bejoy KS 2011-11-11, 16:27
臧冬松 2011-11-11, 14:12
Will Maier 2011-11-11, 14:26
Charles Earl 2011-11-11, 14:42
Bejoy KS 2011-11-11, 15:10
臧冬松 2011-11-11, 15:57
臧冬松 2011-11-14, 08:32