HDFS, mail # user - Best way to write files to hdfs (from a Python app)

Re: Best way to write files to hdfs (from a Python app)
Philip Zeyliger 2010-08-09, 23:35
Hi Bjoern,

To give you an example of how this may be done: HUE, under the covers, pipes
your data to 'bin/hadoop fs -Dhadoop.job.ugi=user,group -put - path'.
 (That's from memory, but it's approximately right; the full python code is
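
For illustration only, and not HUE's actual code: a minimal Python sketch of the piping approach described above, assuming the 'hadoop' launcher is reachable at bin/hadoop and that the cluster honors the hadoop.job.ugi property quoted in this message (later, security-enabled Hadoop releases may ignore it). The function names build_put_command and put_stream are hypothetical. The data is streamed to the child process's stdin in chunks, so the web server never has to stage the whole file on local disk.

```python
import shutil
import subprocess


def build_put_command(dest_path, user, group, hadoop_bin="bin/hadoop"):
    """Return the argv list that streams stdin to dest_path in HDFS."""
    return [
        hadoop_bin, "fs",
        # Generic -D options go before the fs subcommand.
        "-Dhadoop.job.ugi=%s,%s" % (user, group),
        # "-" as the source tells put to read from stdin.
        "-put", "-", dest_path,
    ]


def put_stream(fileobj, dest_path, user, group, hadoop_bin="bin/hadoop"):
    """Pipe a binary file-like object to HDFS via 'hadoop fs -put -'."""
    cmd = build_put_command(dest_path, user, group, hadoop_bin)
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE)
    try:
        # Copy in chunks; the whole upload is never held in memory.
        shutil.copyfileobj(fileobj, proc.stdin)
    finally:
        # Closing stdin signals end-of-file to the hadoop process.
        proc.stdin.close()
    returncode = proc.wait()
    if returncode != 0:
        raise RuntimeError("hadoop fs -put exited with status %d" % returncode)
```

In a Django view this could be called with the uploaded file object, for example put_stream(request.FILES["upload"], "/user/web/upload.bin", "user", "group"), where the form field name and destination path are placeholders.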


-- Philip

On Mon, Aug 9, 2010 at 9:18 AM, Bjoern Schiessle <[EMAIL PROTECTED]> wrote:

> Hi all,
> I am developing a web application with Django (Python) which should access an
> HBase database and store large files in HDFS.
> I wonder what is the best way to write files to hdfs from my Django app?
> Basically I thought about two ways but maybe you know a better option:
> 1. First store the file on the local file system and then move it with
> the thrift interface to hdfs. (downside: always needs enough space on the
> web application server)
> 2. Use hdfs-fuse to mount the hdfs file system and write the file directly
> to hdfs. (downside: I don't know how well hdfs-fuse is supported and I'm
> not sure if it is a good idea to mount the file system and run large
> operations on it).
> Since I'm new to hdfs and Hadoop in general, I'm not sure what the best
> and least error-prone way is.
> What would be your recommendation?
> Thanks a lot!
> Björn