Oh, another link I should have included!
On Mon, Jan 14, 2013 at 2:19 PM, Andy Isaacson <[EMAIL PROTECTED]> wrote:
> Hadoop Streaming does not magically teach Python open() how to read
> from "hdfs://" URLs. You'll need to use a library or fork an "hdfs dfs
> -cat" process to read the file for you.
> A few links that may help:
> On Sat, Jan 12, 2013 at 12:30 AM, springring <[EMAIL PROTECTED]> wrote:
>> When I run the code below as a streaming job, the job fails with error N/A and is killed. Stepping through it, I found that the error occurs at
>> " file_obj = open(file) ". When I run the same code outside of Hadoop, everything works.
>> #!/bin/env python
>> import sys
>> for line in sys.stdin:
>>     offset,filename = line.split("\t")
>>     file = "hdfs://user/hdfs/catalog3/" + filename
>>     print line
>>     print filename
>>     print file
>>     file_obj = open(file)
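
For reference, here is a minimal sketch of the "fork hdfs dfs -cat" approach suggested above. It is an illustration under stated assumptions, not a tested mapper: it is written for Python 3, it assumes the "hdfs" launcher is on PATH on every task node, and the helper names (read_hdfs_lines, parse_record, process_records) are made up for this example. Two details differ from the quoted script on purpose. First, the trailing newline is stripped from the filename before it is appended to the directory. Second, the directory is written as the plain path "/user/hdfs/catalog3/", on the assumption that this is what was meant: in an "hdfs://" URL the component right after "//" is read as the namenode host, so "hdfs://user/..." would name a host called "user".

```python
import subprocess

# Assumption: the files live under this HDFS directory (taken from the
# quoted script, written as a plain path rather than an hdfs:// URL).
HDFS_DIR = "/user/hdfs/catalog3/"

# Assumption: the "hdfs" launcher is on PATH on every task node.
HDFS_CAT = ("hdfs", "dfs", "-cat")


def read_hdfs_lines(path, cat_cmd=HDFS_CAT):
    """Yield the lines of `path` by running `cat_cmd path` as a child process."""
    proc = subprocess.Popen(list(cat_cmd) + [path],
                            stdout=subprocess.PIPE,
                            universal_newlines=True)
    try:
        # Stream the child's stdout instead of loading the whole file.
        for line in proc.stdout:
            yield line
    finally:
        proc.stdout.close()
        returncode = proc.wait()
    if returncode != 0:
        raise IOError("%s %s exited with code %d"
                      % (" ".join(cat_cmd), path, returncode))


def parse_record(line):
    """Split one 'offset<TAB>filename' record, dropping the trailing newline."""
    offset, filename = line.rstrip("\n").split("\t", 1)
    return offset, filename


def process_records(records, cat_cmd=HDFS_CAT):
    """For each input record, yield (filename, line) for every line of that file."""
    for record in records:
        _offset, filename = parse_record(record)
        for line in read_hdfs_lines(HDFS_DIR + filename, cat_cmd):
            yield filename, line
```

In a streaming mapper this would be driven by something like "for filename, line in process_records(sys.stdin): ...". The cat_cmd parameter exists only so the command can be swapped out, for example to try the script on a machine without Hadoop.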