executing hadoop commands from python?
jamal sasha 2013-02-16, 22:47
Hi,
This might be more of a Python-centric question, but I was wondering if anyone has tried this.

I am trying to run a few hadoop commands from a Python program. For example, from the command line you can do:

    bin/hadoop dfs -ls /hdfs/query/path

and it returns all the files in the HDFS query path, much like ls on Unix.

Now I am trying to do the same thing from Python and then manipulate the result:

    exec_str = "path/to/hadoop/bin/hadoop dfs -ls " + query_path
    os.system(exec_str)

I want to capture this output so I can work with it, for example to count the number of files. os.system only returns the exit status, not the output. I looked into the subprocess module, but these are not native shell commands, so I am not sure whether those concepts apply.

How can I solve this?

Thanks
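
One possible approach, as a minimal sketch rather than a definitive answer: subprocess is not limited to native shell commands. It can launch any executable, including the hadoop launcher script, and hand back whatever the command prints. The function names hdfs_ls and count_files below are hypothetical, the path is the placeholder from the post, and the parsing assumes the usual listing layout (a "Found N items" header, then one line per entry whose first field is a permission string starting with "-" for files and "d" for directories). That layout may differ between Hadoop versions.

```python
import subprocess


def hdfs_ls(query_path, hadoop_bin="path/to/hadoop/bin/hadoop"):
    """Run `hadoop dfs -ls <query_path>` and return its stdout as text.

    hadoop_bin is the placeholder path from the post; adjust it locally.
    Raises subprocess.CalledProcessError if the command exits non-zero.
    """
    # Passing a list avoids shell quoting problems in query_path.
    raw = subprocess.check_output([hadoop_bin, "dfs", "-ls", query_path])
    return raw.decode("utf-8", errors="replace")


def count_files(ls_output):
    """Count regular files in the text printed by `hadoop dfs -ls`.

    Assumption: each entry line has at least 8 whitespace-separated
    fields and regular files have a permission string starting with '-'.
    """
    count = 0
    for line in ls_output.splitlines():
        fields = line.split()
        if len(fields) >= 8 and fields[0].startswith("-"):
            count += 1
    return count
```

Usage would then look like count_files(hdfs_ls(query_path)). Keeping the parsing separate from the subprocess call means the counting logic can be checked against a saved listing without a running cluster.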