Re: load a serialized object in hadoop
Luke Lu 2010-10-13, 21:28
On Wed, Oct 13, 2010 at 2:21 PM, Shi Yu <[EMAIL PROTECTED]> wrote:
> Hi, thanks for the advice. I tried with your settings,
> $ bin/hadoop jar Test.jar OOloadtest -D HADOOP_CLIENT_OPTS=-Xmx4000m
> but there is still no effect. Or is this a system variable? Should I
> export it? How do I configure it?

HADOOP_CLIENT_OPTS is an environment variable, so you should run the command as

HADOOP_CLIENT_OPTS=-Xmx1000m bin/hadoop jar Test.jar OOloadtest

if you use an sh-derived shell (bash, ksh, etc.); prepend env for other shells.
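For example, under csh/tcsh the same invocation (using the jar and class names from your message) would be:

env HADOOP_CLIENT_OPTS=-Xmx1000m bin/hadoop jar Test.jar OOloadtest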

--Luke
> Shi
>
>  java -Xms3G -Xmx3G -classpath
> .:WordCount.jar:hadoop-0.19.2-core.jar:lib/log4j-1.2.15.jar:lib/commons-collections-3.2.1.jar:lib/stanford-postagger-2010-05-26.jar
> OOloadtest
>
>
> On 2010-10-13 15:28, Luke Lu wrote:
>>
>> On Wed, Oct 13, 2010 at 12:27 PM, Shi Yu<[EMAIL PROTECTED]>  wrote:
>>
>>>
>>> I haven't implemented anything in map/reduce yet for this issue. I
>>> just tried to invoke the same Java class using the bin/hadoop command.
>>> The thing is that a very simple program can be executed with plain
>>> java, but not through the bin/hadoop command.
>>>
>>
>> If you are just using the "bin/hadoop jar your.jar" command, your code
>> runs in a local client JVM, and mapred.child.java.opts has no effect.
>> You should run it with
>> HADOOP_CLIENT_OPTS=-Xmx1000m bin/hadoop jar your.jar
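(One way to check which heap limit the client JVM actually picked up is to print it from the program itself; a small sketch that could go into OOloadtest's main:)

    // Print the JVM's effective max heap; compare runs with and
    // without HADOOP_CLIENT_OPTS to see whether the setting applied.
    System.out.println("max heap = "
        + Runtime.getRuntime().maxMemory() / (1024 * 1024) + " MB");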
>>
>>
>>>
>>> I think that if I can't get through this first stage, a map/reduce
>>> program would fail in the same way even if I had one. I am using
>>> Hadoop 0.19.2. Thanks.
>>>
>>> Best Regards,
>>>
>>> Shi
>>>
>>> On 2010-10-13 14:15, Luke Lu wrote:
>>>
>>>>
>>>> Can you post your mapper/reducer implementation? Or are you using
>>>> Hadoop streaming, for which mapred.child.java.opts doesn't apply to
>>>> the JVM you care about? BTW, what's the Hadoop version you're using?
>>>>
>>>> On Wed, Oct 13, 2010 at 11:45 AM, Shi Yu<[EMAIL PROTECTED]>    wrote:
>>>>
>>>>
>>>>>
>>>>> Here is my code. There is no Map/Reduce in it. I can run this code
>>>>> with java -Xmx1000m; however, when using bin/hadoop with -D
>>>>> mapred.child.java.opts=-Xmx3000M it fails with a "not enough heap
>>>>> space" error. I have tried other programs in Hadoop with the same
>>>>> settings, so the memory is available on my machines.
>>>>>
>>>>>
>>>>> // Requires: import java.io.FileInputStream;
>>>>> //           import java.io.ObjectInputStream;
>>>>> //           import java.util.Map;
>>>>> public static void main(String[] args) {
>>>>>     try {
>>>>>         String myFile = "xxx.dat";
>>>>>         FileInputStream fin = new FileInputStream(myFile);
>>>>>         // Deserialize the HashMap that was written with ObjectOutputStream.
>>>>>         ObjectInputStream ois = new ObjectInputStream(fin);
>>>>>         Map margintagMap = (Map) ois.readObject();
>>>>>         ois.close();
>>>>>         fin.close();
>>>>>     } catch (Exception e) {
>>>>>         e.printStackTrace(); // don't swallow the failure silently
>>>>>     }
>>>>> }
>>>>>
>>>>> On 2010-10-13 13:30, Luke Lu wrote:
>>>>>
>>>>>
>>>>>>
>>>>>> On Wed, Oct 13, 2010 at 8:04 AM, Shi Yu<[EMAIL PROTECTED]>
>>>>>>  wrote:
>>>>>>
>>>>>>>
>>>>>>> As a follow-up to my own question: I think invoking the JVM
>>>>>>> through Hadoop requires much more memory than an ordinary JVM.
>>>>>>>
>>>>>>
>>>>>> That's simply not true. The default MapReduce task Xmx is 200M,
>>>>>> which is much smaller than the standard JVM default of 512M, and
>>>>>> most users don't need to increase it. Please post the code reading
>>>>>> the object (in HDFS?) in your tasks.
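(For a real MapReduce job, the per-task heap is set via mapred.child.java.opts; a minimal sketch against the 0.19-era JobConf API, where OOLoadJob is a hypothetical job class:)

import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

// Set the child-task JVM heap for a submitted job (0.19-era API).
JobConf conf = new JobConf(OOLoadJob.class); // OOLoadJob is hypothetical
conf.set("mapred.child.java.opts", "-Xmx512m");
JobClient.runJob(conf);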
>>>>>>
>>>>>>>
>>>>>>> I found that instead of serializing the object, maybe I could
>>>>>>> create a MapFile as an index to permit lookups by key in Hadoop. I
>>>>>>> have also compared the performance of MongoDB and Memcache. I will
>>>>>>> let you know the result after I try the MapFile approach.
>>>>>>>
>>>>>>> Shi
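(A minimal sketch of the MapFile idea, assuming Text keys and values and the 0.19-era org.apache.hadoop.io.MapFile API; the file name and entries are illustrative:)

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.io.MapFile;
import org.apache.hadoop.io.Text;

public class MapFileLookup {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Build the MapFile once; keys must be appended in sorted order.
        MapFile.Writer writer =
            new MapFile.Writer(conf, fs, "lookup.map", Text.class, Text.class);
        writer.append(new Text("apple"), new Text("fruit"));
        writer.append(new Text("beet"), new Text("vegetable"));
        writer.close();

        // A lookup reads the small in-memory index plus one seek into the
        // data file, so the whole map never has to fit in the JVM heap.
        MapFile.Reader reader = new MapFile.Reader(fs, "lookup.map", conf);
        Text value = new Text();
        if (reader.get(new Text("beet"), value) != null) {
            System.out.println("beet -> " + value);
        }
        reader.close();
    }
}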
>>>>>>>
>>>>>>> On 2010-10-12 21:59, M. C. Srivas wrote:
>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, Oct 12, 2010 at 4:50 AM, Shi Yu<[EMAIL PROTECTED]>
>>>>>>>>>  wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I want to load a serialized HashMap object in hadoop. The file of
>>>>>>>>>> stored