SequenceFile split question
I have a client program that creates a SequenceFile, essentially merging many
small files into one big file. I was wondering how a SequenceFile's data gets
split across nodes. When I start, the SequenceFile is empty. Does it get split
once it reaches dfs.block.size? If so, does that mean I am always writing to
just one node at any given point in time?
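
To make the question concrete, here is a minimal sketch of the model I am assuming; it is not taken from the Hadoop source. It treats the file as being cut into fixed-size blocks, so that only the last block is open for writing at any moment. The class name, the method names, and the 64 MB block size are all hypothetical, chosen for illustration; the real value would come from the cluster's dfs.block.size setting.

```java
public class BlockLayoutSketch {
    // Assumed block size of 64 MB (hypothetical; the actual value is
    // whatever dfs.block.size is set to on the cluster).
    static final long BLOCK_SIZE = 64L * 1024 * 1024;

    /** Number of fixed-size blocks needed to hold fileLength bytes. */
    static long blockCount(long fileLength, long blockSize) {
        if (fileLength < 0 || blockSize <= 0) {
            throw new IllegalArgumentException("fileLength must be >= 0 and blockSize > 0");
        }
        // Ceiling division: a partially filled last block still counts as a block.
        return (fileLength + blockSize - 1) / blockSize;
    }

    /** Zero-based index of the block that contains the given byte offset. */
    static long blockIndex(long offset, long blockSize) {
        if (offset < 0 || blockSize <= 0) {
            throw new IllegalArgumentException("offset must be >= 0 and blockSize > 0");
        }
        return offset / blockSize;
    }

    public static void main(String[] args) {
        long written = 150L * 1024 * 1024; // 150 MB appended so far

        // Under this model the file occupies 3 blocks (64 + 64 + 22 MB).
        System.out.println("blocks used: " + blockCount(written, BLOCK_SIZE));

        // The most recently written byte falls in block index 2, the last block.
        System.out.println("block being written: " + blockIndex(written - 1, BLOCK_SIZE));
    }
}
```

If this model is right, the writer only ever appends to the block at index blockCount - 1, which is what prompts the question about writing to a single node at a time.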

If I start a new client writing a new SequenceFile, is there a way to
select a different DataNode?