Re: executing files on hdfs via hadoop not possible? is JNI/JNA a reasonable solution?
You're conflating two things here. HDFS is a data-storage filesystem;
MapReduce, generally speaking, does not have anything to do with HDFS's
permission model. A reducer runs as a regular JVM on a provided node, and
can execute any program you'd like it to by downloading it onto its
configured local filesystem, marking it executable there, and running it.
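A minimal sketch of that idea, with no Hadoop dependencies: assume the binary has already been fetched to the local filesystem (e.g. out of the DistributedCache), then set the execute bit on the *local* copy (HDFS has no execute bit, but the local filesystem does) and run it as a child process. The class and method names here are hypothetical, and a tiny shell script stands in for the real binary:

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

// Hypothetical reducer-side helper: mark a locally-staged binary
// executable and run it, capturing its stdout.
public class LocalExec {

    static String runLocal(File binary, String... args)
            throws IOException, InterruptedException {
        // HDFS carries no execute bit, but the local copy can get one.
        if (!binary.setExecutable(true)) {
            throw new IOException("could not set execute bit on " + binary);
        }
        String[] cmd = new String[args.length + 1];
        cmd[0] = binary.getAbsolutePath();
        System.arraycopy(args, 0, cmd, 1, args.length);
        Process p = new ProcessBuilder(cmd).redirectErrorStream(true).start();
        byte[] out = p.getInputStream().readAllBytes();
        int rc = p.waitFor();
        if (rc != 0) {
            throw new IOException("child exited with code " + rc);
        }
        return new String(out);
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a binary pulled from the DistributedCache.
        File script = File.createTempFile("convert", ".sh");
        Files.write(script.toPath(), "#!/bin/sh\necho converted $1\n".getBytes());
        System.out.print(runLocal(script, "input.img"));  // prints "converted input.img"
        script.delete();
    }
}
```

In a real job you'd stage the binary via the DistributedCache (which symlinks it into the task's working directory) and call something like this from your reducer.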
If your goal is merely to run a regular program over data that is
sitting in HDFS, that can be achieved. If your library is in C, then
simply use a streaming program to run it and use libhdfs' HDFS API
(C/C++) to read data from HDFS files into your functions. Would this
work for you?
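However the bytes come out of HDFS (libhdfs in C, or FSDataInputStream in Java), the streaming hand-off to the external program is just a pipe over stdin/stdout. A self-contained sketch of that hand-off, with `tr` standing in for the real converter binary and the method names being my own invention:

```java
import java.io.IOException;

// Sketch of the streaming pattern: feed data that was read from HDFS
// into an external program's stdin and collect its stdout.
public class PipeThrough {

    static byte[] pipe(String[] cmd, byte[] input)
            throws IOException, InterruptedException {
        Process p = new ProcessBuilder(cmd).start();
        p.getOutputStream().write(input);
        p.getOutputStream().close();  // close stdin so the child sees EOF
        byte[] out = p.getInputStream().readAllBytes();
        p.waitFor();
        return out;
    }

    public static void main(String[] args) throws Exception {
        // Pretend these bytes came from an HDFS file.
        byte[] record = "pixel data from hdfs\n".getBytes();
        byte[] converted = pipe(new String[] {"tr", "a-z", "A-Z"}, record);
        System.out.print(new String(converted));  // prints "PIXEL DATA FROM HDFS"
    }
}
```

For large inputs you'd want to write and read on separate threads to avoid pipe-buffer deadlock; Hadoop Streaming handles that plumbing for you.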
On Sun, Mar 17, 2013 at 3:09 PM, Julian Bui <[EMAIL PROTECTED]> wrote:
> Hi hadoop users,
> I just want to verify that there is no way to put a binary on HDFS and
> execute it using the Hadoop Java API. If not, I would appreciate advice on
> creating an implementation that uses native libraries.
> "In contrast to the POSIX model, there are no sticky, setuid or setgid bits
> for files as there is no notion of executable files." Is there no way
> around this?
> A little bit more about what I'm trying to do. I have a binary that
> converts my image to another image format. I currently want to put it in
> the distributed cache and tell the reducer to execute the binary on the data
> on hdfs. However, since I can't set the execute permission bit on that
> file, it seems that I cannot do that.
> Since I cannot use the binary, it seems like I have to use my own
> implementation to do this. The challenge is that these libraries that I can
> use to do this are .a and .so files. Would I have to use JNI and package
> the libraries in the distributed cache and then have the reducer find and
> use those libraries on the task nodes? Actually, I wouldn't want to use
> JNI, I'd probably want to use java native access (JNA) to do this. Has
> anyone used JNA with hadoop and been successful? Are there problems I'll
> run into?
> Please let me know.