Krishna Kishore Bonagiri 2013-08-06, 12:25
Omkar Joshi 2013-08-06, 18:11
YARN downloads a specified local resource onto the container's node from the URL specified. In all situations, the remote URL needs to be a fully qualified path. To verify that the file at the remote URL is still valid, YARN expects you to provide the length and last-modified timestamp of that file.
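As a rough illustration of what "fully qualified" means here, the sketch below uses only java.net.URI and java.nio (no Hadoop classes) to check that a string carries a scheme and an absolute path before it is handed to something like ConverterUtils.getYarnUrlFromURI(). The class and method names (QualifiedUri, isFullyQualified, qualifyLocal) are made up for this example and are not part of YARN.

```java
import java.net.URI;
import java.nio.file.Paths;

// Hypothetical helper, not part of YARN: checks whether a resource URI is
// fully qualified (scheme + absolute path) before it is used as a LocalResource URL.
public class QualifiedUri {

    /** Returns true when the URI has a scheme (file, hdfs, ...) and an absolute path. */
    static boolean isFullyQualified(URI uri) {
        String path = uri.getPath();
        return uri.getScheme() != null && path != null && path.startsWith("/");
    }

    /** Turns a plain local path into a file: URI using the default file system. */
    static URI qualifyLocal(String localPath) {
        return Paths.get(localPath).toAbsolutePath().toUri();
    }

    public static void main(String[] args) {
        URI bare = URI.create("/home_/dsadm/kishore/kk.ksh");
        URI qualified = URI.create("file:///home_/dsadm/kishore/kk.ksh");

        System.out.println(bare + " qualified? " + isFullyQualified(bare));
        System.out.println(qualified + " qualified? " + isFullyQualified(qualified));
        System.out.println("qualified form of bare path: " + qualifyLocal(bare.getPath()));
    }
}
```

A path without a scheme fails the check, while the file:/// form passes.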
If you use an hdfs path such as hdfs://namenode:port/<absolute path to file>, you will need to get the length and timestamp from HDFS.
If you use file:///, the file must exist on every node, with the same length and timestamp on each, for localization to work. (This works on a single-node setup but is harder to get right on a multi-node setup; deploying the file via an RPM should likely work.)
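For the file:/// case, here is a minimal sketch of reading the two values instead of passing 0. It uses only java.nio so that it runs without a cluster; LocalResourceStamp, sizeOf and timestampOf are names invented for this example. In a real client it is probably safer to read both values through Hadoop's own FileSystem.getFileStatus(), i.e. FileStatus.getLen() and FileStatus.getModificationTime(), since that is the view YARN compares against and its timestamp granularity may differ from what java.nio reports.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of YARN: reads the length and last-modified
// time of a local file so they can be passed to setSize() and setTimestamp().
public class LocalResourceStamp {

    /** Length of the file in bytes. */
    static long sizeOf(URI fileUri) throws IOException {
        return Files.size(Paths.get(fileUri));
    }

    /** Last-modified time in milliseconds since the epoch. */
    static long timestampOf(URI fileUri) throws IOException {
        return Files.getLastModifiedTime(Paths.get(fileUri)).toMillis();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the real script; a temp file keeps the example runnable anywhere.
        Path script = Files.createTempFile("kk", ".ksh");
        Files.write(script, "echo hello\n".getBytes(StandardCharsets.UTF_8));
        URI uri = script.toUri();

        // In the Distributed Shell code these would feed
        // shellRsrc.setSize(...) and shellRsrc.setTimestamp(...).
        System.out.println("resource  = " + uri);
        System.out.println("size      = " + sizeOf(uri));
        System.out.println("timestamp = " + timestampOf(uri));

        Files.delete(script);
    }
}
```

If the file is modified after these values are read, the timestamp no longer matches and localization should fail with a "changed on src filesystem" error like the one quoted further down in this thread.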
On Aug 6, 2013, at 11:11 AM, Omkar Joshi wrote:
> You need to match the timestamp. You can probably read the timestamp locally before adding the resource. The check is done deliberately, to ensure that the file has not been updated after the user makes the call, which avoids possible errors.
> Omkar Joshi
> Hortonworks Inc.
> On Tue, Aug 6, 2013 at 5:25 AM, Krishna Kishore Bonagiri <[EMAIL PROTECTED]> wrote:
> I tried the following and it works!
> String shellScriptPath = "file:///home_/dsadm/kishore/kk.ksh";
> But now I am getting a timestamp error like the one below, when I pass 0 to setTimestamp():
> 13/08/06 08:23:48 INFO ApplicationMaster: Got container status for containerID= container_1375784329048_0017_01_000002, state=COMPLETE, exitStatus=-1000, diagnostics=Resource file:/home_/dsadm/kishore/kk.ksh changed on src filesystem (expected 0, was 1367580580000
> On Tue, Aug 6, 2013 at 5:24 PM, Harsh J <[EMAIL PROTECTED]> wrote:
> Can you try passing a fully qualified local path? That is, including the file:/ scheme
> On Aug 6, 2013 4:05 PM, "Krishna Kishore Bonagiri" <[EMAIL PROTECTED]> wrote:
> Hi Harsh,
> The setResource() call on LocalResource expects an argument of type org.apache.hadoop.yarn.api.records.URL, which is converted from a string in the form of a URI. This happens in the following call in the Distributed Shell example:
> shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new URI( shellScriptPath)));
> So, if I give a local file, I get a parsing error like the one below. That is why I changed it to an HDFS file, thinking that it had to be given that way. Could you please give an example of how else it could be used with a local file, as you are suggesting?
> 2013-08-06 06:23:12,942 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Failed to parse resource-request
> java.net.URISyntaxException: Expected scheme name at index 0: :///home_/dsadm/kishore/kk.ksh
> at java.net.URI$Parser.fail(URI.java:2820)
> at java.net.URI$Parser.failExpecting(URI.java:2826)
> at java.net.URI$Parser.parse(URI.java:3015)
> at java.net.URI.<init>(URI.java:747)
> at org.apache.hadoop.yarn.util.ConverterUtils.getPathFromYarnURL(ConverterUtils.java:77)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourceRequest.<init>(LocalResourceRequest.java:46)
> On Tue, Aug 6, 2013 at 3:36 PM, Harsh J <[EMAIL PROTECTED]> wrote:
> To be honest, I've never tried loading an HDFS file onto the
> LocalResource this way. I usually just pass a local file and that
> works just fine. There may be something in the URI transformation
> that breaks an HDFS source, but try passing a local file - does
> that fail too? The Shell example uses a local file.
> On Tue, Aug 6, 2013 at 10:54 AM, Krishna Kishore Bonagiri
> <[EMAIL PROTECTED]> wrote:
> > Hi Harsh,
> > Please see if this is useful, I got a stack trace after the error has
> > occurred....
> > 2013-08-06 00:55:30,559 INFO
> > org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: CWD set
> > to /tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
> > > > file:/tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004