
Re: Streaming data access in HDFS: Design Feature
Hadoop Streaming allows you to create and run Map/Reduce jobs with any
executable or script as the mapper and/or the reducer. In other words, you
do not need to learn Java programming to write simple MapReduce jobs.

Streaming data access in HDFS, on the other hand, is something totally
different. When the MapReduce framework reads or writes data from/to HDFS
blocks, it does so through byte streams. Bytes are always appended to the
end of a stream, and byte streams are guaranteed to be stored in the order
written.
The following code snippet shows how stream data is written to HDFS. If you
want to understand more, you can look at the codebase for any file format,
such as the SequenceFile format.
Hope this helps a bit.

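    // Create a new file in HDFS and write data to it.
    // fileSystem is an org.apache.hadoop.fs.FileSystem, path is an
    // org.apache.hadoop.fs.Path, and source is the local file name.
    FSDataOutputStream out = fileSystem.create(path);
    InputStream in = new BufferedInputStream(new FileInputStream(
        new File(source)));

    // Copy the local file into the HDFS output stream, 1 KB at a time.
    byte[] b = new byte[1024];
    int numBytes = 0;
    while ((numBytes = in.read(b)) > 0) {
        out.write(b, 0, numBytes);
    }

    // Close all the file descriptors
    in.close();
    out.close();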
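
If you want to try the copy loop without a cluster, here is a minimal
sketch that runs the same read/write pattern against in-memory streams.
The class name StreamCopySketch and the copy() helper are made up for
illustration. ByteArrayOutputStream is only a stand-in for
FSDataOutputStream (both are java.io.OutputStream subclasses), so this
shows the byte-ordering behaviour of the loop, not anything specific to
HDFS.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical class name, used only for this illustration.
public class StreamCopySketch {

    // Copies everything from in to out in 1 KB chunks, preserving byte
    // order, and returns the number of bytes copied.
    public static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[1024];
        long total = 0;
        int numBytes;
        // read() returns -1 once the end of the stream is reached.
        while ((numBytes = in.read(buffer)) != -1) {
            out.write(buffer, 0, numBytes);
            total += numBytes;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] source = "first record\nsecond record\n".getBytes(StandardCharsets.UTF_8);

        // In-memory stand-in for the HDFS output stream (an assumption of
        // this sketch, not how the data would be stored on a cluster).
        ByteArrayOutputStream sink = new ByteArrayOutputStream();

        // try-with-resources closes the input stream even if copy() throws.
        try (InputStream in = new ByteArrayInputStream(source)) {
            long copied = copy(in, sink);
            System.out.println("copied " + copied + " bytes");
        }

        // The bytes come back in exactly the order they were written.
        System.out.print(sink.toString("UTF-8"));
    }
}
```

Because copy() only depends on InputStream and OutputStream, the same
helper should work if you hand it the stream returned by
fileSystem.create(path) instead of the in-memory sink.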
On Wed, Mar 5, 2014 at 2:25 PM, Radhe Radhe <[EMAIL PROTECTED]> wrote:
Nitin Pawar