MapReduce >> mail # user >> Remote connection bottleneck?


Re: Remote connection bottleneck?
> In the new shell, I went to the hadoop/bin directory on my computer

Why didn't you issue the command from the window that had the ssh session open?
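
To illustrate the point: the stack trace quoted below shows the local JVM (RunJar, then JarFile and ZipFile) trying to open \home\mariom\ITESMCEMdebug.jar, so the path is being resolved on the machine where the command was typed, not on the cluster. The following is only a minimal sketch of that behaviour in plain Java. The class name JarPathCheck and its methods are hypothetical and are not part of Hadoop.

```java
import java.io.File;

public class JarPathCheck {
    // True only if the path names a readable regular file on THIS machine's
    // filesystem. A file that exists only on a remote server fails this check.
    static boolean isLocalJar(String path) {
        File f = new File(path);
        return f.isFile() && f.canRead();
    }

    // Human-readable summary of the check above (hypothetical helper).
    static String describe(String path) {
        if (isLocalJar(path)) {
            return "found locally: " + new File(path).getAbsolutePath();
        }
        return "not on this machine: " + path
                + " (if it was copied to the cluster, run the command there)";
    }

    public static void main(String[] args) {
        // Default to the path from the thread; pass another path as an argument.
        String jar = args.length > 0 ? args[0] : "/home/mariom/ITESMCEMdebug.jar";
        System.out.println(describe(jar));
    }
}
```

Run on the Windows/Cygwin machine, this check should fail for a jar that was only copied to the cluster. The same "hadoop jar" command typed inside the ssh session would resolve the path on the server instead.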

On Sat, Sep 25, 2010 at 6:53 PM, Mario M <[EMAIL PROTECTED]> wrote:

> Hi,
> what I did was this:
>
> I am working with Cygwin in Windows 7.
>
> - I copied my jar file ITESMCEMdebug.jar to the cluster in the directory
> /home/mariom . (I then connected with the ssh and confirmed that it is
> there).
>
> - I left the ssh window open and opened another cygwin shell.
>
> - In the new shell, I went to the hadoop/bin directory on my computer, and
> ran:
>
> "bash hadoop jar /home/mariom/ITESMCEMdebug.jar"
>
> (I omitted the arguments just to test, my program outputs the usage
> instructions when called without arguments)
>
> - I got this:
>
> Exception in thread "main" java.io.IOException: Error opening job jar:
> /home/mariom/ITESMCEMdebug.jar
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:90)
> Caused by: java.io.FileNotFoundException: \home\mariom\ITESMCEMdebug.jar
> (El sistema no puede encontrar la ruta especificada)
>         at java.util.zip.ZipFile.open(Native Method)
>         at java.util.zip.ZipFile.<init>(ZipFile.java:114)
>         at java.util.jar.JarFile.<init>(JarFile.java:133)
>         at java.util.jar.JarFile.<init>(JarFile.java:70)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:88)
>
> - If I run my local jar file with "bash hadoop jar ITESMCEMdebug.jar", it
> works fine (it outputs the usage instructions).
>
> Also, is it ok that I have to write "bash" every time? The examples I have
> seen just seem to use "hadoop jar etc", I guess this is Cygwin specific,
> otherwise it will say bash: hadoop: command not found.
>
> Thanks again :) for your time.
>
> Mario Maqueo
> ITESM-CEM
>
>
>
> PS: "El sistema no puede encontrar la ruta especificada" = "The system
> cannot find the path specified", in case the Spanish text is confusing.
>
>
> 2010/9/25 Ted Yu <[EMAIL PROTECTED]>
>
>> Mario:
>> Can you show us the error when you run the following ?
>> "hadoop jar <route where I placed the file with the ssh connection>
>> <arguments>"
>>
>>
>>
>>>> Hello,
>>>> please excuse my ignorance, but how can I run it from there?
>>>> Up to now I've been running the programs with "hadoop jar <localfile>
>>>> <arguments>".
>>>>
>>>> I tried copying the jar to the HDFS and using "hadoop jar <HDFS route>
>>>> <arguments>" but that didn't work (file not found), so I went to the ssh
>>>> connection and copied the jar to my directory in there, but now I don't know
>>>> how to run it from there.  "hadoop jar <route where I placed the file with
>>>> the ssh connection> " didn't work.
>>>>
>>>> I am not very experienced with ssh, so I am sorry if this is basic
>>>> stuff.
>>>>
>>>> Thanks,
>>>>
>>>> Mario Maqueo
>>>> ITESM-CEM
>>>>
>>>> 2010/9/25 Ted Yu <[EMAIL PROTECTED]>
>>>>
>>>>> Mario:
>>>>> Please produce a jar, place it on one of the servers in the cloud and
>>>>> run from there.
>>>>>
>>>>>
>>>>> On Sat, Sep 25, 2010 at 7:46 AM, Raja Thiruvathuru <
>>>>> [EMAIL PROTECTED]> wrote:
>>>>>
>>>>>> MapReduce doesn't download the actual data, but it reads metadata
>>>>>> before it starts the MapReduce job.
>>>>>>
>>>>>>
>>>>>> On Sat, Sep 25, 2010 at 7:55 AM, Mario M <[EMAIL PROTECTED]> wrote:
>>>>>>
>>>>>>> Hello,
>>>>>>> I am having a problem that might be expected behaviour. I am using a
>>>>>>> cloud with Hadoop remotely through ssh. I have a program that runs for about
>>>>>>> a minute; it processes a 200 MB file using NLineInputFormat, and the user
>>>>>>> chooses the number of lines used to divide the file. However, before the
>>>>>>> map-reduce phase starts, the part of the program that divides the input runs
>>>>>>> locally on my computer. With a 100 Mbps connection to the cloud this isn't
>>>>>>> much of a problem, but at home, with a 1 Mbps connection, the program takes
>>>>>>> about 30 minutes or more to process this
>>>>>>> input. Apparently it is downloading the full 200 MB, processing them to
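
A rough check of the figures in the original question, as a sketch under stated assumptions. Dividing a text file every N lines means finding the newline boundaries, which means reading every byte of the file on whichever machine computes the divisions. As far as I know, NLineInputFormat does this on the machine that submits the job, but treat that as an assumption rather than a confirmed diagnosis. The class SplitScanSketch and its methods are hypothetical, plain Java with no Hadoop dependency. It also assumes 1 MB = 1,000,000 bytes and a link that sustains its nominal speed.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class SplitScanSketch {
    // Byte offset at which each group of linesPerSplit lines begins.
    // Note that the whole stream is consumed to locate the '\n' boundaries.
    static List<Long> splitStarts(InputStream in, int linesPerSplit) {
        List<Long> starts = new ArrayList<>();
        long pos = 0;            // offset of the byte about to be read
        int linesInSplit = 0;    // complete lines seen in the current group
        boolean lineOpen = false;
        try {
            int b;
            while ((b = in.read()) != -1) {
                if (!lineOpen) {
                    if (linesInSplit == 0) {
                        starts.add(pos); // first byte of a new group
                    }
                    lineOpen = true;
                }
                pos++;
                if (b == '\n') {
                    lineOpen = false;
                    linesInSplit = (linesInSplit + 1) % linesPerSplit;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return starts;
    }

    // Seconds needed to move the given number of bytes over a link of the
    // given nominal speed in megabits per second (10^6 bits per second).
    static double transferSeconds(long bytes, double megabitsPerSecond) {
        return bytes * 8.0 / (megabitsPerSecond * 1_000_000.0);
    }

    public static void main(String[] args) {
        byte[] sample = "a\nbb\nccc\ndddd\ne\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(splitStarts(new ByteArrayInputStream(sample), 2));
        System.out.println(transferSeconds(200_000_000L, 1.0) / 60.0 + " minutes at 1 Mbps");
        System.out.println(transferSeconds(200_000_000L, 100.0) + " seconds at 100 Mbps");
    }
}
```

Under these assumptions, 200 MB takes about 1600 seconds (roughly 27 minutes) at 1 Mbps and about 16 seconds at 100 Mbps. That is in line with the "30 minutes or more" reported above, and consistent with the whole file crossing the link before the job starts.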