-
what happens when a datanode rejoins?
mehul choube 2012-09-11, 08:39
Hi,

What happens when an existing (not new) datanode rejoins a cluster in the following scenarios?

a) Some of the blocks it was managing have been deleted or modified.
b) The block size has been changed, say from 64MB to 128MB.
c) The block replication factor was one (not the case in most deployments, but suppose it is). Does the namenode recreate a file once the datanode rejoins?

Thanks,
Mehul
-
Re: what happens when a datanode rejoins?
shashwat shriparv 2012-09-11, 08:58
Yes, the cluster will be rebalanced.
-- ∞ Shashwat Shriparv
-
Re: what happens when a datanode rejoins?
Harsh J 2012-09-11, 09:01
Hi Mehul,

Please do not send multiple mails with the same questions. We've already answered this at your other post, follow thread at: http://mail-archives.apache.org/mod_mbox/hadoop-user/201209.mbox/%[EMAIL PROTECTED]%3e

-- Harsh J
-
RE: what happens when a datanode rejoins?
Mehul Choube 2012-09-11, 09:06
> The namenode will asynchronously replicate the blocks to other datanodes in order to maintain the replication factor after a datanode has not been in contact for 10 minutes.

What happens when the datanode rejoins after the namenode has already re-replicated the blocks it was managing? Will the namenode ask the datanode to discard the blocks and start managing new blocks? Or will the namenode discard the new blocks which were replicated due to the unavailability of this datanode?

Thanks,
Mehul

From: George Datskos [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, September 11, 2012 12:56 PM
To: [EMAIL PROTECTED]
Subject: Re: what happens when a datanode rejoins?

Hi Mehul

> Some of the blocks it was managing are deleted/modified?

The namenode will asynchronously replicate the blocks to other datanodes in order to maintain the replication factor after a datanode has not been in contact for 10 minutes.

> The size of the blocks are now modified say from 64MB to 128MB?

Block size is a per-file setting, so new files will use 128MB blocks, but the old ones will remain at 64MB.

> What if the block replication factor was one (yea not in most deployments but say in case) so does the namenode recreate a file once the datanode rejoins?

(assuming you didn't perform a decommission) Blocks that lived only on that datanode will be declared "missing", and the files associated with those blocks will not be able to be fully read until the datanode rejoins.

George
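
The "10 minutes" figure mentioned above can be reconstructed from two settings. As a rough sketch, assuming the stock defaults of a 5-minute heartbeat recheck interval and a 3-second heartbeat interval, the namenode treats a datanode as dead after twice the recheck interval plus ten heartbeat intervals. The function and parameter names below are illustrative only; the actual property names differ between Hadoop versions, so check your own configuration.

```python
def dead_node_timeout_seconds(recheck_interval_ms=300_000, heartbeat_interval_s=3):
    """Seconds of silence after which a datanode is considered dead.

    Assumed defaults: 5-minute recheck interval, 3-second heartbeat.
    Timeout = 2 * recheck interval + 10 * heartbeat interval.
    """
    return 2 * (recheck_interval_ms / 1000) + 10 * heartbeat_interval_s


timeout = dead_node_timeout_seconds()
print(f"{timeout:.0f} s = {timeout / 60:.1f} min")  # 630 s = 10.5 min
```

With those defaults the timeout works out to 10.5 minutes, which is where the rounded "10 minutes" comes from. A datanode that returns before this window closes is never marked dead, so no re-replication is triggered for its blocks.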
-
RE: what happens when a datanode rejoins?
Mehul Choube 2012-09-11, 09:07
I apologize for this :( I thought the earlier mail didn't go through
-
Re: what happens when a datanode rejoins?
Narasingu Ramesh 2012-09-11, 09:08
Hi Mehul,

DataNode rejoins take care of only NameNode.

Thanks & Regards,
Ramesh.Narasingu
-
Re: what happens when a datanode rejoins?
Harsh J 2012-09-11, 09:10
Hi,
Inline.
On Tue, Sep 11, 2012 at 2:36 PM, Mehul Choube <[EMAIL PROTECTED]> wrote:
>> The namenode will asynchronously replicate the blocks to other datanodes in order to maintain the replication factor after a datanode has not been in contact for 10 minutes.
>
> What happens when the datanode rejoins after namenode has already re-replicated the blocks it was managing?

The block's replica count goes up by one, and the block is treated as an over-replicated one.

> Will namenode ask the datanode to discard the blocks and start managing new blocks?

Yes, this may happen.

> Or will namenode discard the new blocks which were replicated due to unavailability of this datanode?

It deletes extra blocks while still keeping the block placement policy in mind. It may delete any block replica as long as the placement policy is not violated by doing so.
-- Harsh J
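
The answer above can be illustrated with a toy model. This is only a sketch of the idea, not the HDFS implementation: `Replica`, `choose_excess_replicas`, and `free_gb` are hypothetical names, and the tie-break (delete from the node with the least free space) is an assumption made for illustration. The placement rule modeled here is simply "do not reduce the number of racks holding the block if it can be avoided".

```python
from collections import Counter, namedtuple

# Hypothetical record for one replica of a block.
Replica = namedtuple("Replica", ["node", "rack", "free_gb"])


def choose_excess_replicas(replicas, replication_factor):
    """Pick replicas to delete until only replication_factor remain."""
    remaining = list(replicas)
    to_delete = []
    while len(remaining) > replication_factor:
        rack_counts = Counter(r.rack for r in remaining)
        # Replicas on racks holding more than one copy can be removed
        # without reducing the number of racks that store the block.
        safe = [r for r in remaining if rack_counts[r.rack] > 1]
        candidates = safe if safe else remaining
        # Assumed tie-break: free up the fullest node first.
        victim = min(candidates, key=lambda r: r.free_gb)
        remaining.remove(victim)
        to_delete.append(victim)
    return to_delete


# dn3 went away, the block was re-replicated to dn4, then dn3 rejoined.
replicas = [
    Replica("dn1", "rackA", 200),
    Replica("dn2", "rackA", 120),
    Replica("dn3", "rackB", 50),
    Replica("dn4", "rackB", 300),
]
print([r.node for r in choose_excess_replicas(replicas, 3)])  # ['dn3']
```

In this run the rejoined node's copy happens to be the one removed, but with different free-space numbers the freshly created copy on dn4 could just as well be chosen, which matches the point that any replica may be deleted as long as the placement policy is respected.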
-
RE: what happens when a datanode rejoins?
Mehul Choube 2012-09-11, 09:11
> DataNode rejoins take care of only NameNode.

Sorry didn't get this