Re: Best way to write files to hdfs (from a Python app)
Hi Bjoern,

To give you an example of how this can be done: under the covers, HUE pipes
your data to 'bin/hadoop fs -Dhadoop.job.ugi=user,group -put - path'.
(That's from memory, but it's approximately right; the full Python code is at
http://github.com/cloudera/hue/blob/master/desktop/libs/hadoop/src/hadoop/fs/hadoopfs.py#L692)
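
The pipe-to-'hadoop fs' approach can be sketched in Python roughly as follows. This is a minimal illustration, not HUE's actual code: the names build_put_command and stream_to_hdfs and the hadoop_cmd parameter are made up for this example, it assumes a 'hadoop' launcher is on the PATH, and whether hadoop.job.ugi is honored depends on the Hadoop version and security setup.

```python
import shutil
import subprocess


def build_put_command(hdfs_path, hadoop_cmd=("hadoop",), user=None, group=None):
    """Build the argument list for 'hadoop fs [-D...] -put - <hdfs_path>'.

    hadoop_cmd is the launcher prefix (hypothetical parameter), for example
    ("bin/hadoop",). With '-put', a source of '-' means "read from stdin".
    """
    cmd = list(hadoop_cmd) + ["fs"]
    if user and group:
        # Assumption: the cluster honors hadoop.job.ugi, as in the command
        # quoted above. Generic -D options go before the fs subcommand.
        cmd.append("-Dhadoop.job.ugi=%s,%s" % (user, group))
    cmd += ["-put", "-", hdfs_path]
    return cmd


def stream_to_hdfs(fileobj, hdfs_path, chunk_size=64 * 1024, **kwargs):
    """Copy a binary file-like object to HDFS through the child's stdin."""
    cmd = build_put_command(hdfs_path, **kwargs)
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE)
    try:
        # Copy in chunks so the whole file is never held in memory.
        shutil.copyfileobj(fileobj, proc.stdin, chunk_size)
        proc.stdin.close()  # signals end-of-file to 'hadoop fs -put -'
    except BrokenPipeError:
        pass  # child exited early; the return code below reports the failure
    returncode = proc.wait()
    if returncode != 0:
        raise RuntimeError("%r exited with status %d" % (cmd, returncode))
```

In a Django view this might be called as stream_to_hdfs(request.FILES["upload"], "/some/hdfs/path"), where the field name and the path are placeholders. The function itself does not write the data to the local file system.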

Cheers,

-- Philip

On Mon, Aug 9, 2010 at 9:18 AM, Bjoern Schiessle <[EMAIL PROTECTED]> wrote:

> Hi all,
>
> I am developing a web application with Django (Python) that should access
> an HBase database and store large files in HDFS.
>
> What is the best way to write files to HDFS from my Django app?
> I have thought of two ways, but maybe you know a better option:
>
> 1. First store the file on the local file system and then move it to
> HDFS with the Thrift interface. (downside: always needs enough space on
> the web application server)
>
> 2. Use hdfs-fuse to mount the HDFS file system and write the file directly
> to HDFS. (downside: I don't know how well hdfs-fuse is supported, and I'm
> not sure whether it is a good idea to mount the file system and run large
> operations on it)
>
> Since I'm new to HDFS and Hadoop in general, I'm not sure which is the
> best and least error-prone way.
>
> What would be your recommendation?
>
> Thanks a lot!
> Björn
>
>