-
Wrong input split locations after enabling reverse DNS
Robert Dyer 2012-12-16, 09:16
I recently enabled reverse DNS on my test cluster. Now when I run an MR job, the HBase input split locations all have a period appended. For example:

/default-rack/foo-1.
/default-rack/foo-2.

Yet the machine locations are still correct:

/default-rack/foo-1
/default-rack/foo-2

Since those strings don't match, it isn't assigning the tasks locally. It actually thinks 100% of the map tasks are rack-local and 0% data-local (although in reality, some still wind up being data-local due to sheer luck).

What is the issue here? Note that I don't have this problem with the MR tasks using SequenceFile as input, only with HBase's TableMapper.
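To illustrate why the trailing period matters, here is a minimal sketch of an exact-string locality check. It is not the scheduler's actual code: the class `SplitLocalityDemo` and the helper `countDataLocal` are hypothetical names, and the only assumption is that split locations are compared to machine locations as plain strings, as the post describes.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class SplitLocalityDemo {

    // Hypothetical helper: counts the split locations that exactly match
    // one of the known machine locations (plain string equality).
    static int countDataLocal(String[] splitLocations, Set<String> machineLocations) {
        int matches = 0;
        for (String location : splitLocations) {
            if (machineLocations.contains(location)) {
                matches++;
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Machine locations as reported in the post.
        Set<String> machines = new HashSet<>(
                Arrays.asList("/default-rack/foo-1", "/default-rack/foo-2"));

        // Split locations as reported in the post, with the trailing period.
        String[] withPeriod = {"/default-rack/foo-1.", "/default-rack/foo-2."};
        // The same locations without the trailing period.
        String[] withoutPeriod = {"/default-rack/foo-1", "/default-rack/foo-2"};

        System.out.println(countDataLocal(withPeriod, machines));    // prints 0
        System.out.println(countDataLocal(withoutPeriod, machines)); // prints 2
    }
}
```

With exact matching, a single extra character is enough for every split to miss its host, which is consistent with the 0% data-local figure reported above.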
-
Re: Wrong input split locations after enabling reverse DNS
Stack 2012-12-17, 23:11
Looks like https://issues.apache.org/jira/browse/HBASE-4109 ?

St.Ack
-
Re: Wrong input split locations after enabling reverse DNS
Robert Dyer 2012-12-18, 01:26
That's what I thought too. Except I am running 0.94.2 and this fix was released in 0.90.4.
-
Re: Wrong input split locations after enabling reverse DNS
Jean-Daniel Cryans 2012-12-18, 02:21
Maybe TableInputFormatBase.getSplits is missing something similar to HBASE-4109?

J-D
-
Re: Wrong input split locations after enabling reverse DNS
Robert Dyer 2012-12-18, 02:39
Seems plausible. A simple grep reveals this:

    mapreduce/TableInputFormatBase.java: hostName = DNS.reverseDns(ipAddress, this.nameServer);

which is not doing the filtering that HBASE-4109 does.

Would this typically be filed as a new issue or brought up in comments on the closed issue?
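For readers unfamiliar with the filtering being discussed: a reverse (PTR) lookup can return a fully qualified name that ends in the root dot, so the result needs that dot removed before it is compared with hostnames. The sketch below shows the general idea only. The class `ReverseDnsNameSketch` and the method `stripTrailingDot` are hypothetical names for illustration, not the code from HBASE-4109 or the patch discussed in this thread.

```java
public class ReverseDnsNameSketch {

    // Hypothetical helper: removes a single trailing period from a name
    // returned by a reverse DNS lookup; other names are returned unchanged.
    static String stripTrailingDot(String dnsName) {
        if (dnsName == null || !dnsName.endsWith(".")) {
            return dnsName;
        }
        return dnsName.substring(0, dnsName.length() - 1);
    }

    public static void main(String[] args) {
        System.out.println(stripTrailingDot("foo-1.")); // prints foo-1
        System.out.println(stripTrailingDot("foo-1"));  // prints foo-1
    }
}
```

Under this assumption, the grep hit above would apply such a helper to the value returned by DNS.reverseDns before storing it in hostName.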
-
Re: Wrong input split locations after enabling reverse DNS
Jean-Daniel Cryans 2012-12-18, 02:45
New issue, the other one is too old.

Thx!

J-D
-
Re: Wrong input split locations after enabling reverse DNS
Robert Dyer 2013-01-28, 09:37
Just to follow up here, I did manage to test a patch on TableInputFormatBase.java and it resolved my issue. I filed https://issues.apache.org/jira/browse/HBASE-7693 and will attach the patch as soon as my Git updates.