Flume >> mail # user >> Automatically upload files into HDFS


kashif khan 2012-11-19, 10:44
Mohammad Tariq 2012-11-19, 10:50
kashif khan 2012-11-19, 12:30
Mohammad Tariq 2012-11-19, 12:34
Mohammad Tariq 2012-11-19, 12:35
Alexander Alten-Lorenz 2012-11-19, 12:26
kashif khan 2012-11-19, 12:35
Mohammad Tariq 2012-11-19, 12:41
Re: Automatically upload files into HDFS
Thanks, M Tariq.

As I am new to Java and Hadoop and don't have much experience, I am trying
to first write a simple program to upload data into HDFS and gradually move
forward. I have written the following simple program to upload a file into
HDFS, but I don't know why it isn't working. Could you please check it if
you have time?

import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsData {
    public static void main(String[] args) throws IOException {
        try {
            // Point the client at the cluster configuration so that
            // FileSystem.get() returns HDFS, not the local file system.
            Configuration conf = new Configuration();
            conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
            conf.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"));
            FileSystem fileSystem = FileSystem.get(conf);

            String source = "/usr/Eclipse/Output.csv";
            String dest = "/user/hduser/input/";

            // Take the file name from the source path and append it to
            // the destination directory.
            String fileName = source.substring(source.lastIndexOf('/') + 1);
            if (dest.charAt(dest.length() - 1) != '/') {
                dest = dest + "/" + fileName;
            } else {
                dest = dest + fileName;
            }

            Path path = new Path(dest);
            if (fileSystem.exists(path)) {
                System.out.println("File " + dest + " already exists");
                return;
            }

            // Stream the local file into HDFS in chunks.
            FSDataOutputStream out = fileSystem.create(path);
            InputStream in = new BufferedInputStream(
                    new FileInputStream(new File(source)));
            byte[] buffer = new byte[4096];
            int numBytes;
            while ((numBytes = in.read(buffer)) > 0) {
                out.write(buffer, 0, numBytes);
            }

            in.close();
            out.close();
            fileSystem.close();
        } catch (Exception e) {
            System.out.println(e.toString());
        }
    }
}
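
As an aside, the FileSystem API also offers copyFromLocalFile(), which does
the same copy in a single call; a minimal sketch using the same paths and
configuration as above (the class name HdfsPut is only illustrative):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsPut {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
        conf.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"));
        FileSystem fs = FileSystem.get(conf);
        // copyFromLocalFile handles the open/read/write/close loop itself.
        fs.copyFromLocalFile(new Path("/usr/Eclipse/Output.csv"),
                new Path("/user/hduser/input/"));
        fs.close();
    }
}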
Thanks again,

Best regards,

KK
On Mon, Nov 19, 2012 at 12:41 PM, Mohammad Tariq <[EMAIL PROTECTED]> wrote:

> You can set up a cron job to execute the program every 5 seconds.
>
> Regards,
>     Mohammad Tariq
>
>
>
> On Mon, Nov 19, 2012 at 6:05 PM, kashif khan <[EMAIL PROTECTED]> wrote:
>
>> Well, I want to upload the files automatically, as the files are
>> generated about every 3-5 seconds and each file is about 3 MB in size.
>>
>> Is it possible to automate this with the put or cp command?
>>
>> I have read about Flume and WebHDFS but I am not sure whether they will work.
>>
>> Many thanks
>>
>> Best regards
>>
>>
>>
>>
>> On Mon, Nov 19, 2012 at 12:26 PM, Alexander Alten-Lorenz <
>> [EMAIL PROTECTED]> wrote:
>>
>>> Hi,
>>>
>>> Why don't you use HDFS-related tools like put or cp?
>>>
>>> - Alex
>>>
>>> On Nov 19, 2012, at 11:44 AM, kashif khan <[EMAIL PROTECTED]>
>>> wrote:
>>>
>>> > Hi,
>>> >
>>> > I am generating files continuously in a local folder on my base
>>> > machine. How can I now use Flume to stream the generated files from
>>> > the local folder to HDFS? I don't know exactly how to configure the
>>> > sources, sinks and HDFS.
>>> >
>>> > 1) location of the folder where files are generated: /usr/datastorage/
>>> > 2) namenode address: hdfs://hadoop1.example.com:8020
>>> >
>>> > Please help me.
>>> >
>>> > Many thanks
>>> >
>>> > Best regards,
>>> > KK
>>>
>>> --
>>> Alexander Alten-Lorenz
>>> http://mapredit.blogspot.com
>>> German Hadoop LinkedIn Group: http://goo.gl/N8pCF
>>>
>>>
>>
>
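
On the put/cp and cron suggestions quoted above: the stock HDFS CLI can be
driven from cron, though standard cron fires at most once per minute, not
every 5 seconds. A minimal sketch of a crontab entry (paths illustrative):

# every minute, push any CSVs from the local staging folder into HDFS
* * * * * hadoop fs -put /usr/datastorage/*.csv /user/hduser/input/

Note that -put refuses to overwrite an existing file, so a real job would
typically move or delete each local file after a successful upload.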
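For the Flume question in the quoted message, the usual shape is a
spooling-directory source feeding an HDFS sink. A minimal sketch of an agent
configuration, assuming Flume 1.3+ (the spooldir source is not in older
releases); agent, channel, and capacity values are illustrative:

agent.sources = src1
agent.channels = ch1
agent.sinks = sink1

# Watch the local folder; files must be fully written before they
# appear in the spool directory, or the source will fail on them.
agent.sources.src1.type = spooldir
agent.sources.src1.spoolDir = /usr/datastorage
agent.sources.src1.channels = ch1

agent.channels.ch1.type = memory
agent.channels.ch1.capacity = 10000
agent.channels.ch1.transactionCapacity = 1000

# Write events into the target HDFS directory as plain files.
agent.sinks.sink1.type = hdfs
agent.sinks.sink1.hdfs.path = hdfs://hadoop1.example.com:8020/user/hduser/input
agent.sinks.sink1.hdfs.fileType = DataStream
agent.sinks.sink1.channel = ch1

The agent would then be started with something like:
flume-ng agent -n agent -c conf -f agent.conf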
Mohammad Tariq 2012-11-19, 13:18
kashif khan 2012-11-19, 14:01
Mohammad Tariq 2012-11-19, 14:04
kashif khan 2012-11-19, 14:29
kashif khan 2012-11-19, 14:43
Mohammad Tariq 2012-11-19, 14:53
kashif khan 2012-11-19, 15:01
Mohammad Tariq 2012-11-19, 15:10
kashif khan 2012-11-19, 15:34
Mohammad Tariq 2012-11-19, 15:41
kashif khan 2012-11-20, 10:40
Mohammad Tariq 2012-11-20, 14:19
kashif khan 2012-11-20, 14:27
Mohammad Tariq 2012-11-20, 14:33
kashif khan 2012-11-20, 14:36
Mohammad Tariq 2012-11-20, 14:53
kashif khan 2012-11-20, 15:04
kashif khan 2012-11-20, 16:22
shekhar sharma 2012-11-20, 19:06
kashif khan 2012-11-21, 12:36
shekhar sharma 2012-11-26, 16:42
kashif khan 2012-11-27, 21:25