

Re: Streaming data access in HDFS: Design Feature
Hadoop Streaming allows you to create and run MapReduce jobs with any
executable or script as the mapper and/or the reducer. In other words, you
do not need to learn Java to write a simple MapReduce program.
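For illustration, here is a minimal sketch of such an executable: a word-count mapper that follows the stdin/stdout contract streaming expects (read lines from standard input, emit key<TAB>value lines on standard output). The class name and word-splitting logic are my own, not from the original post.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

public class WordCountMapper {
    // Turn one input line into "word<TAB>1" records -- the format the
    // streaming framework parses into key/value pairs for the reducer.
    static List<String> mapLine(String line) {
        List<String> records = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                records.add(word + "\t1");
            }
        }
        return records;
    }

    public static void main(String[] args) throws Exception {
        BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
        String line;
        while ((line = in.readLine()) != null) {
            for (String record : mapLine(line)) {
                System.out.println(record);
            }
        }
    }
}
```

Any program with this shape (a shell script, Python, a compiled binary) can be passed to the streaming jar via its -mapper and -reducer options; the framework handles the HDFS I/O around it.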

Streaming data access in HDFS, on the other hand, is something entirely
different. When the MapReduce framework reads data from or writes data to
HDFS blocks, it does so as byte streams. Bytes are always appended to the
end of a stream, and byte streams are guaranteed to be stored in the order
written.
The following code snippet shows how stream data is written to HDFS. If you
want to understand more of it, you can look at the codebase for any file
format, such as SequenceFile.
Hope this helps a bit

===
// Create a new file in HDFS and stream data into it. Here `fileSystem`
// is an org.apache.hadoop.fs.FileSystem, `path` is the destination
// org.apache.hadoop.fs.Path, and `source` names a local input file.
    FSDataOutputStream out = fileSystem.create(path);
    InputStream in = new BufferedInputStream(new FileInputStream(
        new File(source)));

    // Copy in 1 KB chunks; bytes are appended to the stream in the
    // order they are written.
    byte[] b = new byte[1024];
    int numBytes = 0;
    while ((numBytes = in.read(b)) > 0) {
        out.write(b, 0, numBytes);
    }

    // Close all the file descriptors
    in.close();
    out.close();
    fileSystem.close();
===
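The copy loop above can be exercised without a cluster: the same chunked read/write pattern works against any java.io streams, with FileSystem.create and FileSystem.open supplying the HDFS-backed streams in the real case. The class name and in-memory streams below are mine, used only to show that bytes come back in exactly the order they were written.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Arrays;

public class StreamCopy {
    // Copy an input stream to an output stream in fixed-size chunks --
    // the same loop used against FSDataOutputStream in the snippet above.
    static void copy(InputStream in, OutputStream out) throws Exception {
        byte[] b = new byte[1024];
        int numBytes;
        while ((numBytes = in.read(b)) > 0) {
            out.write(b, 0, numBytes);
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] data = new byte[5000];          // larger than one buffer
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i;
        }
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        copy(new ByteArrayInputStream(data), sink);
        // Bytes arrive in exactly the order they were written.
        System.out.println(Arrays.equals(data, sink.toByteArray())); // prints "true"
    }
}
```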
On Wed, Mar 5, 2014 at 2:25 PM, Radhe Radhe <[EMAIL PROTECTED]> wrote:
Nitin Pawar

 