Hadoop, mail # general - Error while random accessing Lzo file in Hadoop


Re: Error while random accessing Lzo file in Hadoop
Hong Tang 2010-04-20, 05:56
Yuting,

Thanks for reporting the bug, I will take a look.

-Hong

On Apr 19, 2010, at 7:30 PM, Yuting Lin wrote:

> Hi Hong and Ryan
>
> Thanks for your suggestion.
>
> I updated the JDK to 1.6.0_20 and tried again, but that doesn't solve the
> problem. The NOFILE limit doesn't affect the code here, since only one
> thread reads the file, and it is the second pair of seek() and next()
> calls that causes the error.
>
> I have reported the bug at
> http://code.google.com/p/hadoop-gpl-compression/issues/list and shared the
> test program.
>
> Thanks.
> -
> Regards
> Yuting
>
> On Tue, Apr 20, 2010 at 3:10 AM, Ryan Rawson <[EMAIL PROTECTED]> wrote:
>
>> Hey,
>>
>> You are running an extremely old JVM.  Could you try with JDK
>> 1.6.0_19?  (or at least 14)
>>
>> Also, your ulimit -n is fairly low; 1024 file handles is not enough.
>> Try 32k.
>>
>> -ryan
>>
>> On Mon, Apr 19, 2010 at 6:33 AM, Yuting Lin <[EMAIL PROTECTED]> wrote:
>>> Hi all
>>>
>>> I am trying to randomly access a sequence file that is block-compressed
>>> with "com.hadoop.compression.lzo.LzoCodec.class". However, when the
>>> program contains more than one seek(offset) followed by next(), it leads
>>> to an unexpected error in Java. (The offset passed to seek() is the
>>> beginning position of the block.)
>>>
>>> If the sequence file is compressed with
>>> "org.apache.hadoop.io.compress.GzipCodec.class", there is no such error.
>>>
>>> My OS is 32-bit Linux 2.6.27, the LZO version is 0.1.0, and the Java
>>> version is 1.6.0. The error log is shown below.
>>>
>>> Is there any way to randomly access an LZO file in Hadoop? Thanks.
>>> -
>>> Regards
>>> Yuting
>>>
>>> ---------------  S Y S T E M  ---------------
>>>
>>> OS:lenny/sid
>>>
>>> uname:Linux 2.6.27-14-generic #1 SMP Mon Aug 31 13:01:41 UTC 2009 i686
>>> libc:glibc 2.8.90 NPTL 2.8.90
>>> rlimit: STACK 8192k, CORE 0k, NPROC 26607, NOFILE 1024, AS infinity
>>> load average:0.03 0.11 0.14
>>>
>>> CPU:total 2 (2 cores per cpu, 1 threads per core) family 6 model 15 stepping 11, cmov, cx8, fxsr, mmx, sse, sse2, sse3, ssse3
>>>
>>> Memory: 4k page, physical 3369768k(1217868k free), swap 1140572k(1140572k free)
>>>
>>> vm_info: Java HotSpot(TM) Server VM (11.0-b15) for linux-x86 JRE (1.6.0_10-b33), built on Sep 26 2008 01:05:20 by "java_re" with gcc 3.2.1-7a (J2SE release)
>>>
>>> ---------------  T H R E A D  ---------------
>>>
>>> Current thread (0x09542000):  JavaThread "main" [_thread_in_native,
>>> id=22346, stack(0xb7e6e000,0xb7ebf000)]
>>>
>>> siginfo:si_signo=SIGSEGV: si_errno=0, si_code=1 (SEGV_MAPERR),
>>> si_addr=0x09a82000
>>>
>>> Registers:
>>> EAX=0x00000000, EBX=0x7181cff4, ECX=0x01048f83, EDX=0x00074de0
>>> ESP=0xb7ebd6c4, EBP=0xb7ebd6f8, ESI=0x09a0c7b5, EDI=0x09a0d21b
>>> EIP=0x716d644b, CR2=0x09a82000, EFLAGS=0x00210212
>>>
>>> Top of Stack: (sp=0xb7ebd6c4)
>>> 0xb7ebd6c4:   00fd419e 00000000 01048f82 09a0d21f
>>> 0xb7ebd6d4:   09a0c7b9 099fc003 09a0d21b 099fc21e
>>> 0xb7ebd6e4:   095c644c 00000022 7181cff4 09542114
>>> 0xb7ebd6f4:   09542000 b7ebd7f8 7181a994 099fc000
>>> 0xb7ebd704:   00000003 09a0d000 b7ebd764 00000000
>>> 0xb7ebd714:   00000000 b4dec439 716d6200 099fc002
>>> 0xb7ebd724:   00000000 ae190570 b7ebd72c 72a18ae0
>>> 0xb7ebd734:   b7ebd848 00000003 00010000 72a18af0
>>>
>>> Instructions: (pc=0x716d644b)
>>> 0x716d643b:   26 00 00 00 00 8b 44 16 04 8b 7d e4 83 6d cc 04
>>> 0x716d644b:   89 44 17 04 83 c2 04 83 7d cc 03 77 e8 8d 41 fb
>>> ...
>>>
>>
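
For readers following the thread, the access pattern Yuting describes (seek to the start of a compressed block, read it, then seek again) can be sketched with the JDK alone. This is only an illustration under stated assumptions: it does not use Hadoop's SequenceFile or the LZO codec, it substitutes java.util.zip DEFLATE for LZO, and the class name BlockSeekSketch, the method names, and the length-prefixed block layout are all made up for this example. It shows why repeated seek-then-read pairs are expected to work when each block is decoded with fresh decompressor state; it does not reproduce or diagnose the crash reported above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical stand-in for a block-compressed file; NOT the SequenceFile format.
public class BlockSeekSketch {

    /** Writes each payload as one independently compressed block and
     *  returns the file offset at which each block starts. */
    static List<Long> writeBlocks(Path file, List<String> payloads) throws IOException {
        List<Long> offsets = new ArrayList<>();
        try (RandomAccessFile out = new RandomAccessFile(file.toFile(), "rw")) {
            out.setLength(0);
            for (String payload : payloads) {
                byte[] raw = payload.getBytes(StandardCharsets.UTF_8);
                byte[] packed = deflate(raw);
                offsets.add(out.getFilePointer());
                out.writeInt(raw.length);     // uncompressed length
                out.writeInt(packed.length);  // compressed length
                out.write(packed);
            }
        }
        return offsets;
    }

    /** Seeks to a block start and decodes that one block. */
    static String readBlockAt(RandomAccessFile in, long offset)
            throws IOException, DataFormatException {
        in.seek(offset);
        int rawLen = in.readInt();
        int packedLen = in.readInt();
        byte[] packed = new byte[packedLen];
        in.readFully(packed);

        // A fresh Inflater per block: no decompressor state survives a seek.
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(packed);
            byte[] raw = new byte[rawLen];
            int n = 0;
            while (n < rawLen && !inflater.finished()) {
                int got = inflater.inflate(raw, n, rawLen - n);
                if (got == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    break; // truncated or corrupt block
                }
                n += got;
            }
            if (n != rawLen) {
                throw new IOException("short block at offset " + offset);
            }
            return new String(raw, StandardCharsets.UTF_8);
        } finally {
            inflater.end();
        }
    }

    /** Compresses a byte array with DEFLATE (used here in place of LZO). */
    static byte[] deflate(byte[] raw) {
        Deflater deflater = new Deflater();
        try {
            deflater.setInput(raw);
            deflater.finish();
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            while (!deflater.finished()) {
                int n = deflater.deflate(chunk);
                buf.write(chunk, 0, n);
            }
            return buf.toByteArray();
        } finally {
            deflater.end();
        }
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("blocks", ".bin");
        try {
            List<Long> offsets = writeBlocks(file,
                    Arrays.asList("first block", "second block", "third block"));
            try (RandomAccessFile in = new RandomAccessFile(file.toFile(), "r")) {
                // Several seek-then-read pairs on one handle, out of order.
                System.out.println(readBlockAt(in, offsets.get(2)));
                System.out.println(readBlockAt(in, offsets.get(0)));
                System.out.println(readBlockAt(in, offsets.get(1)));
            }
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

The choice that matters in this sketch is resetting decoder state at every block boundary. Whether the LZO decompressor in hadoop-gpl-compression carries state across seek() calls is not established anywhere in this thread, so treat that only as one possible place to look.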