Flume, mail # user - Automatically upload files into HDFS


Re: Automatically upload files into HDFS
Mohammad Tariq 2012-11-19, 13:18
If you are just copying the files without any processing or changes, you can
use something like this:

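import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CopyData {

    public static void main(String[] args) throws IOException {

        // Load the cluster configuration (core-site.xml names the default file system)
        Configuration configuration = new Configuration();
        configuration.addResource(new Path("/home/mohammad/hadoop-0.20.205/conf/core-site.xml"));
        configuration.addResource(new Path("/home/mohammad/hadoop-0.20.205/conf/hdfs-site.xml"));
        FileSystem fs = FileSystem.get(configuration);
        Path inputFile = new Path("/home/mohammad/pc/work/FFT.java");  // local source file
        Path outputFile = new Path("/mapout/FFT.java");                // destination in HDFS
        fs.copyFromLocalFile(inputFile, outputFile);
        fs.close();
    }
}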

Obviously you will have to modify it to fit your requirements, for example by
continuously polling the target directory for new files.
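
The polling part could look roughly like the sketch below. It is only an
illustration, not tested against a cluster: DirectoryPoller, pollOnce and
the "uploader" callback are hypothetical names made up for this example, and
the upload itself is passed in as a callback so the sketch runs with the
plain JDK. With Hadoop on the classpath, the callback would presumably wrap
fs.copyFromLocalFile(...) from the program above. Note that java.nio.file.Path
and org.apache.hadoop.fs.Path share a simple name, so one of them has to be
written fully qualified if both are used in the same file.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

// Hypothetical helper (not part of Hadoop): remembers which files were
// already handed to the uploader and reports only the new ones.
class DirectoryPoller {

    private final Path watchedDir;
    private final Consumer<Path> uploader;
    private final Set<Path> alreadyUploaded = new HashSet<>();

    DirectoryPoller(Path watchedDir, Consumer<Path> uploader) {
        this.watchedDir = watchedDir;
        this.uploader = uploader;
    }

    // One polling pass: hand every regular file not seen before to the
    // uploader and return the files handled in this pass.
    List<Path> pollOnce() throws IOException {
        List<Path> uploaded = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(watchedDir)) {
            for (Path entry : entries) {
                if (Files.isRegularFile(entry) && !alreadyUploaded.contains(entry)) {
                    uploader.accept(entry);
                    alreadyUploaded.add(entry);  // mark only after a successful upload
                    uploaded.add(entry);
                }
            }
        }
        return uploaded;
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("poll-demo");
        // Stand-in uploader that only prints; a real one would copy the file to HDFS.
        DirectoryPoller poller = new DirectoryPoller(
                dir, p -> System.out.println("would upload " + p.getFileName()));

        Files.createFile(dir.resolve("Output1.csv"));
        for (int pass = 0; pass < 3; pass++) {  // a real service would loop until stopped
            poller.pollOnce();
            Thread.sleep(1000);                 // polling interval, chosen arbitrarily
        }
    }
}
```

The sketch does not check whether a file is still being written when it is
picked up, and it keeps the list of uploaded files only in memory, so both
points would need attention before using it for files that arrive every few
seconds.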

Regards,
    Mohammad Tariq

On Mon, Nov 19, 2012 at 6:23 PM, kashif khan <[EMAIL PROTECTED]> wrote:

> Thanks M Tariq
>
> As I am new to Java and Hadoop and do not have much experience, I am trying
> to first write a simple program to upload data into HDFS and gradually move
> forward. I have written the following simple program to upload the file
> into HDFS, but I don't know why it is not working. Could you please check it
> if you have time?
>
> import java.io.BufferedInputStream;
> import java.io.BufferedOutputStream;
> import java.io.File;
> import java.io.FileInputStream;
> import java.io.FileOutputStream;
> import java.io.IOException;
> import java.io.InputStream;
> import java.io.OutputStream;
> import java.nio.*;
> //import java.nio.file.Path;
>
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.FSDataInputStream;
> import org.apache.hadoop.fs.FSDataOutputStream;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.Path;
> public class hdfsdata {
>
>
> public static void main(String [] args) throws IOException
> {
>     try{
>
>
>     Configuration conf = new Configuration();
>     conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
>     conf.addResource(new Path ("/etc/hadoop/conf/hdfs-site.xml"));
>     FileSystem fileSystem = FileSystem.get(conf);
>     String source = "/usr/Eclipse/Output.csv";
>     String dest = "/user/hduser/input/";
>
>     //String fileName = source.substring(source.lastIndexOf('/') + source.length());
>     String fileName = "Output1.csv";
>
>     if (dest.charAt(dest.length() -1) != '/')
>     {
>         dest = dest + "/" +fileName;
>     }
>     else
>     {
>         dest = dest + fileName;
>
>     }
>     Path path = new Path(dest);
>
>
>     if(fileSystem.exists(path))
>     {
>         System.out.println("File" + dest + " already exists");
>     }
>
>
>    FSDataOutputStream out = fileSystem.create(path);
>    InputStream in = new BufferedInputStream(new FileInputStream(new File(source)));
>    File myfile = new File(source);
>    byte [] b = new byte [(int) myfile.length() ];
>    int numbytes = 0;
>    while((numbytes = in.read(b)) >= 0)
>
>    {
>        out.write(b,0,numbytes);
>    }
>    in.close();
>    out.close();
>    //bos.close();
>    fileSystem.close();
>     }
>     catch(Exception e)
>     {
>
>         System.out.println(e.toString());
>     }
>     }
>
> }
>
>
> Thanks again,
>
> Best regards,
>
> KK
>
>
>
> On Mon, Nov 19, 2012 at 12:41 PM, Mohammad Tariq <[EMAIL PROTECTED]> wrote:
>
>> You can set up a cron job to execute the program every 5 sec.
>>
>> Regards,
>>     Mohammad Tariq
>>
>>
>>
>> On Mon, Nov 19, 2012 at 6:05 PM, kashif khan <[EMAIL PROTECTED]> wrote:
>>
>>> Well, I want to upload the files automatically, as the files are
>>> generated about every 3-5 sec and each file is about 3 MB in size.
>>>
>>> Is it possible to automate the system using the put or cp command?
>>>
>>> I have read about Flume and WebHDFS, but I am not sure whether they will work.
>>>
>>> Many thanks
>>>
>>> Best regards
>>>
>>>
>>>
>>>
>>> On Mon, Nov 19, 2012 at 12:26 PM, Alexander Alten-Lorenz <
>>> [EMAIL PROTECTED]> wrote: