HDFS, mail # user - dfs.write.packet.size set to 2G

Re: dfs.write.packet.size set to 2G
Ted Dunning 2011-11-08, 09:37
By snapshots, I mean that you can freeze a copy of a portion of the
file system for later use as a backup or reference.  By mirror, I mean that
a snapshot can be transported to another location in the same cluster or to
another cluster, and the mirrored image will be updated atomically to the
new state.

See http://mapr.com/products/why-mapr for more info.

On Tue, Nov 8, 2011 at 3:42 AM, donal0412 <[EMAIL PROTECTED]> wrote:

> Thanks! That's exactly what I want.
> And Ted, what do you mean by "snapshots and mirrors" ?
> On 2011/11/8 16:21, Harsh J wrote:
>> Block sizes are per-file, not permanently set on the HDFS. So create
>> your files with a sufficiently large block size (2G is OK if it fits
>> your usecase well). This way you won't have block splits, as you
>> desire.
>> For example, to upload a file via the shell with a tweaked blocksize, I'd do:
>> hadoop dfs -Ddfs.block.size=2147483648 -copyFromLocal localFile remoteFile
>> Packet sizes are not what you want to tweak here.
>> On Tue, Nov 8, 2011 at 1:02 PM, donal0412<[EMAIL PROTECTED]>  wrote:
>>> Hi,
>>> I want to store lots of files in HDFS; the file size is <= 2G.
>>> I don't want a file to be split into blocks, because I need the whole file
>>> while processing it, and I don't want to transfer blocks to one node when
>>> processing it.
>>> An easy way to do this would be to set dfs.write.packet.size to 2G. I wonder
>>> if someone has similar experience or knows whether this is practicable.
>>> Will there be performance problems when setting the packet size to a big
>>> number?
>>> Thanks!
>>> donal
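
For readers of this archive, Harsh's per-file block-size suggestion can be sketched as a short shell session. The `localFile`/`remoteFile` names are placeholders taken from his message, the old-style `hadoop dfs` CLI matches the 2011-era command he quotes, and the `fsck` verification step is an assumed addition, not something shown in the thread:

```shell
# Sketch, not a verified recipe: compute 2 GiB in bytes and pass it as a
# per-file block size at upload time, so a <= 2G file fits in one block.
BLOCK_SIZE=$((2 * 1024 * 1024 * 1024))
echo "$BLOCK_SIZE"   # prints 2147483648

# Only attempt the upload when a hadoop CLI is actually on the PATH,
# since this requires a running HDFS cluster.
if command -v hadoop >/dev/null 2>&1; then
  # Per-file block-size override at copy time, as in Harsh's message
  hadoop dfs -Ddfs.block.size="$BLOCK_SIZE" -copyFromLocal localFile remoteFile
  # fsck reports how many blocks the file actually occupies
  hadoop fsck remoteFile -files -blocks
fi
```

Because the block size is a per-file attribute recorded at creation, this override affects only the file being copied; other files and the cluster default are untouched.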