-
Re: calling C programs from Hadoop
Asif Jan 2010-05-29, 19:52
Look at Hadoop streaming, maybe it is helpful to you.

asif

On May 29, 2010, at 8:31 PM, Michael Robinson wrote:
> I am new to Hadoop. I have successfully run java programs from Hadoop and I
> would like to call C programs from Hadoop.
>
> Thank you for your help
>
> Michael
> --
> View this message in context: http://lucene.472066.n3.nabble.com/calling-C-programs-from-Hadoop-tp854833p854833.html
> Sent from the Hadoop lucene-users mailing list archive at Nabble.com.

Asif Jan
Gaia Project SixSq Sarl / ISDC Astrophysics Data Centre & Geneva Observatory
Chemin des Ecogia 16 CH-1290 Versoix Switzerland
E-mail : [EMAIL PROTECTED]
Tel. : +41 22 37 92198
Fax : +41 22 37 92133
-
Re: calling C programs from Hadoop
Owen O'Malley 2010-05-29, 20:35
On Sat, May 29, 2010 at 12:52 PM, Asif Jan <[EMAIL PROTECTED]> wrote:
> Look at Hadoop streaming, maybe it is helpful to you.

There is also Pipes, which is the C++ interface to MapReduce.
-- Owen
-
Re: calling C programs from Hadoop
Michael Robinson 2010-05-30, 01:14
Thanks for your answers.

I have read "hadoop streaming" and I think it is great; however, what I am trying to do is to run a C program that I have with its own data, and have Hadoop do the scheduling and make it run on multiple nodes as a distributed system.

The process I need does NOT fit the map and reduce pattern, so what I was thinking was either to feed the C program to Hadoop or to write a Java program that would call the C program and have Hadoop do its magic.

Thanks

Michael
-
Re: calling C programs from Hadoop
Jeff Bean 2010-05-30, 14:17
Hi Michael,

How come you can't specify the C program as the mapper in streaming and just have no reducers?

Jeff

On Sat, May 29, 2010 at 6:14 PM, Michael Robinson <[EMAIL PROTECTED]> wrote:
> Thanks for your answers.
>
> I have read "hadoop streaming" and I think it is great; however, what I am
> trying to do is to run a C program that I have with its own data, and have
> Hadoop do the scheduling and make it run on multiple nodes as a distributed
> system.
>
> The process I need does NOT fit the map and reduce pattern, so what I was
> thinking was either to feed the C program to Hadoop or to write a Java
> program that would call the C program and have Hadoop do its magic.
>
> Thanks
>
> Michael
-
Re: calling C programs from Hadoop
Brian Bockelman 2010-05-30, 20:38
Uh... So you want a batch system? Look up PBS (Torque/Maui), SGE, or Condor.

Brian

On May 29, 2010, at 8:17 PM, Michael Robinson wrote:
> Thanks for your answers.
>
> I have read "hadoop streaming" and I think it is great; however, what I am
> trying to do is to run a C program that I have with its own data, and have
> Hadoop do the scheduling and make it run on multiple nodes as a distributed
> system.
>
> The process I need does NOT fit the map and reduce pattern, so what I was
> thinking was either to feed the C program to Hadoop or to write a Java
> program that would call the C program and have Hadoop do its magic.
>
> Thanks
>
> Michael
-
Re: calling C programs from Hadoop
Michael Robinson 2010-05-31, 15:44
Hi Brian,

Yes, it is a batch process.

I am using Ubuntu Linux. Can you tell me how to open the p7s file you sent me? I googled for a p7s viewer and it seems they work on Windows and Mac only.

Thanks

Michael
-
Re: calling C programs from Hadoop
Michael Robinson 2010-05-31, 16:17
Hi Jeff,

I have a C program that processes very large data files which are compressed, so this program has to have full control of the process. However, the input data can be broken down into chunks, and a separate (distributed) process can be run for each chunk, which is what I am doing now, but manually.

I am looking to use a distributed system like Hadoop to do this so that it controls the scheduling and all those great things I have read about Hadoop.

I was wondering if I can have Hadoop run a batch file (.bat in Windows or .sh in Linux). I would also like to run this in virtual machines.

Thanks

Michael
-
Re: calling C programs from Hadoop
Jeff Bean 2010-05-31, 16:24
Hi Michael,

Why did you determine that Hadoop streaming was insufficient for you?

Jeff

On Mon, May 31, 2010 at 9:17 AM, Michael Robinson <[EMAIL PROTECTED]> wrote:
> Hi Jeff,
>
> I have a C program that processes very large data files which are
> compressed, so this program has to have full control of the process.
> However, the input data can be broken down into chunks, and a separate
> (distributed) process can be run for each chunk, which is what I am doing
> now, but manually.
>
> I am looking to use a distributed system like Hadoop to do this so that it
> controls the scheduling and all those great things I have read about
> Hadoop.
>
> I was wondering if I can have Hadoop run a batch file (.bat in Windows or
> .sh in Linux). I would also like to run this in virtual machines.
>
> Thanks
>
> Michael
-
Re: calling C programs from Hadoop
Michael Robinson 2010-05-31, 16:47
Jeff,

Reading "Hadoop Streaming" I found the following:

"How Does Streaming Work

In the above example, both the mapper and the reducer are executables that read the input from stdin (line by line) and emit the output to stdout. The utility will create a Map/Reduce job, submit the job to an appropriate cluster, and monitor the progress of the job until it completes."

I am beginning to think that my understanding of map/reduce is faulty. At this time I understand that the mapper takes in data and splits it into chunks, creating lists of (<key>, <values>), then it combines this output and sends the result to the reducer.

The C program I have reads each line in the input file and searches a master file looking for exact and similar matches, then it does computations based on how similar the results are, so there is no need for creating <key>, <values> lists.

Thanks very much

Michael