Re: hftp can list directories but won't send files
Arpit Gupta 2012-12-18, 22:49
Hi Robert
Does the cat work for you if you don't use hftp? Something like

hadoop fs -cat hdfs://hdenn00.trueffect.com:8020/user/username/logfiles/day_id=19991231/hour_id=1999123123/000008_0x

Or

hadoop fs -cat /user/username/logfiles/day_id=19991231/hour_id=1999123123/000008_0x

--
Arpit Gupta
Hortonworks Inc.
http://hortonworks.com/

On Dec 18, 2012, at 2:43 PM, Robert Rapplean <[EMAIL PROTECTED]> wrote:

> Hey, everyone. Just got finished reading about all of the unsubscribe messages in Sept-Oct, and was hoping someone has a clue about what my system is doing wrong. I suspect that this is a configuration issue, but I don't even know where to start looking for it. I'm a developer, and my sysadmin is tied up until the end of the year.
>
> I'm trying to move files from one cluster to another using distcp, using the hftp protocol as specified in their instructions.
>
> I can read directories over hftp, but when I attempt to get a file I get a 500 (internal server error). To eliminate the possibility of network and firewall issues, I'm using hadoop fs -ls and hadoop fs -cat commands on the source server in order to attempt to figure out this issue.
>
> This provides a directory of the files, which is correct.
>
> hadoop fs -ls ourlogs/day_id=19991231/hour_id=1999123123
> -rw-r--r-- 3 username supergroup 812 2012-12-16 17:21 logfiles/day_id=19991231/hour_id=1999123123/000008_0
>
> This gives me a "file not found" error, which is also correct because the file isn't there:
>
> hadoop fs -cat hftp://hdenn00.trueffect.com:50070/user/username/logfiles/day_id=19991231/hour_id=1999123123/000008_0x
> cat: `hftp://hdenn00.trueffect.com:50070/user/prodman/ods_fail/day_id=19991231/hour_id=1999123123/000008_0x': No such file or directory
>
> This line gives me a 500 internal server error. The file is confirmed to be on the server.
>
> hadoop fs -cat hftp://hdenn00.trueffect.com:50070/user/username/logfiles/day_id=19991231/hour_id=1999123123/000008_0
> cat: HTTP_OK expected, received 500
>
> Here is a stack trace of what distcp logs when I attempt this:
>
> java.io.IOException: HTTP_OK expected, received 500
>     at org.apache.hadoop.hdfs.HftpFileSystem$RangeHeaderUrlOpener.connect(HftpFileSystem.java:365)
>     at org.apache.hadoop.hdfs.ByteRangeInputStream.openInputStream(ByteRangeInputStream.java:119)
>     at org.apache.hadoop.hdfs.ByteRangeInputStream.getInputStream(ByteRangeInputStream.java:103)
>     at org.apache.hadoop.hdfs.ByteRangeInputStream.read(ByteRangeInputStream.java:187)
>     at java.io.DataInputStream.read(DataInputStream.java:83)
>     at org.apache.hadoop.tools.DistCp$CopyFilesMapper.copy(DistCp.java:424)
>     at org.apache.hadoop.tools.DistCp$CopyFilesMapper.map(DistCp.java:547)
>     at org.apache.hadoop.tools.DistCp$CopyFilesMapper.map(DistCp.java:314)
>     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:393)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:327)
>     at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
>     at org.apache.hadoop.mapred.Child.main(Child.java:262)
>
> Can someone tell me why hftp is failing to serve files, or at least where to look?