Re: cannot find DeprecatedLzoTextInputFormat
Hi Jessica,

Sorry for the delay. I don't know of a pre-built version of the LZO
libraries that has the fix. I also couldn't quite tell which source
versions might have it. The easiest thing to do would be to pull the
source from github, make any changes, and build it locally:

https://github.com/kevinweil/hadoop-lzo

-Joey
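
Once a jar has been built locally, it may be worth confirming that it actually contains the input format class before deploying it. The following is a minimal sketch, not part of the hadoop-lzo project: the helper name jar_contains_class is hypothetical, and the package com.hadoop.mapred used in the usage note is an assumption about where DeprecatedLzoTextInputFormat lives. It only checks that the class is present; it cannot tell whether the HIVE-2395 fix is included.

```python
import zipfile


def jar_contains_class(jar_path, class_name):
    """Return True if the jar has a .class entry for the given class.

    jar_path   -- path to a jar file (a jar is an ordinary zip archive)
    class_name -- fully qualified class name, e.g. "a.b.MyClass"
    """
    # Class files are stored as path entries: a.b.MyClass -> a/b/MyClass.class
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()
```

Usage, assuming a hypothetical build output path: jar_contains_class("build/hadoop-lzo.jar", "com.hadoop.mapred.DeprecatedLzoTextInputFormat"). Running the same check against the jar on each cluster is one quick way to compare the two environments.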

On Mon, Oct 10, 2011 at 7:54 PM, Jessica Owensby
<[EMAIL PROTECTED]> wrote:
> I understood the comments in the JIRA ticket to say that hadoop-lzo
> 0.4.8.jar from gerrit had the fix for
> HIVE-2395<https://issues.apache.org/jira/browse/HIVE-2395>.
>  I wasn't able to find a good pre-built version of 0.4.8 (I found
> this, but there appear to be some issues with it:
> http://hadoop-gpl-packing.googlecode.com/svn-history/r18/trunk/src/main/resources/lib/hadoop-lzo-0.4.8.jar).
> And hadoop-lzo-0.4.13.jar (
> http://hadoop-gpl-packing.googlecode.com/svn-history/r39/trunk/hadoop/src/main/resources/lib/hadoop-lzo-0.4.13.jar)
> doesn't contain the fix.  Is there a version of the jar built with the
> HIVE-2395 fix?  I thought I would ask before I build it myself.
>
> Lastly, I didn't mention before that this issue appears in only one of our 2
> environments - both running cdh3u1.  I've done a number of comparisons
> between the environments and am still unable to find a dissimilarity that
> might be resulting in the 'No LZO codec found' error.  So, it
> would surprise me if we required the fix in one environment and did not in
> another -- but that may just show my lack of understanding about hadoop. :-)
>
> Jessica
>
> On Wed, Oct 5, 2011 at 4:27 PM, Jessica Owensby
> <[EMAIL PROTECTED]> wrote:
>
>> Great.  Thanks!  Will give that a try.
>> Jessica
>>
>>
>> On Wed, Oct 5, 2011 at 4:22 PM, Joey Echeverria <[EMAIL PROTECTED]> wrote:
>>
>>> It sounds like you're hitting this:
>>>
>>> https://issues.apache.org/jira/browse/HIVE-2395
>>>
>>> You might need to patch your version of DeprecatedLzoLineRecordReader
>>> to ignore the .lzo.index files.
>>>
>>> -Joey
>>>
>>> On Wed, Oct 5, 2011 at 4:13 PM, Jessica Owensby
>>> <[EMAIL PROTECTED]> wrote:
>>> > Alex,
>>> > The task trackers have been restarted many times across the cluster since
>>> > this issue was first seen.
>>> >
>>> > Hmmm, I hadn't tried to explicitly add the lzo jar to my classpath in the
>>> > hive shell, but I just tried it and got the same errors.
>>> >
>>> > Do you see /usr/lib/hadoop-0.20/lib/hadoop-lzo-20110217.jar in the child
>>> > classpath when the task is executed (use 'ps aux' on the node)?
>>> >
>>> >
>>> > While the job wasn't running, I did this and I got back the tasktracker
>>> > process:  ps aux | grep java | grep lzo.
>>> > Do I have to run this while the task is running on that node?
>>> >
>>> > Joey,
>>> > Yes, the lzo files are indexed.  They are indexed using the following
>>> > command:
>>> >
>>> > hadoop jar /usr/lib/hadoop/lib/hadoop-lzo-20110217.jar
>>> > com.hadoop.compression.lzo.LzoIndexer /user/hive/warehouse/foo/bar.lzo
>>> >
>>> > Jessica
>>> >
>>> > On Wed, Oct 5, 2011 at 3:52 PM, Joey Echeverria <[EMAIL PROTECTED]> wrote:
>>> >> Are your LZO files indexed?
>>> >>
>>> >> -Joey
>>> >>
>>> >> On Wed, Oct 5, 2011 at 3:35 PM, Jessica Owensby
>>> >> <[EMAIL PROTECTED]> wrote:
>>> >>> Hi Joey,
>>> >>> Thanks. I forgot to say that; yes, the lzocodec class is listed in
>>> >>> core-site.xml under the io.compression.codecs property:
>>> >>>
>>> >>> <property>
>>> >>>  <name>io.compression.codecs</name>
>>> >>>  <value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec,org.apache.hadoop.io.compress.BZip2Codec</value>
>>> >>> </property>
>>> >>>
>>> >>> I also added the mapred.child.env property to mapred site:
>>> >>>
>>> >>>  <property>
>>> >>>    <name>mapred.child.env</name>
>>> >>>    <value>JAVA_LIBRARY_PATH=/usr/lib/hadoop-0.20/lib</value>
>
Joseph Echeverria
Cloudera, Inc.
443.305.9434
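
The workaround discussed in this thread is to have the record reader ignore the .lzo.index files that LzoIndexer writes next to each .lzo file. The sketch below illustrates only that filtering idea on plain path strings. It is not the actual Java patch to DeprecatedLzoLineRecordReader, and the names LZO_INDEX_SUFFIX, is_lzo_index, and data_files are hypothetical.

```python
# Suffix of the index files written alongside indexed .lzo files.
LZO_INDEX_SUFFIX = ".lzo.index"


def is_lzo_index(path):
    """Return True if the path names an LZO index file rather than data."""
    return path.endswith(LZO_INDEX_SUFFIX)


def data_files(paths):
    """Keep only the paths that should be read as input data."""
    return [p for p in paths if not is_lzo_index(p)]


if __name__ == "__main__":
    listing = [
        "/user/hive/warehouse/foo/bar.lzo",
        "/user/hive/warehouse/foo/bar.lzo.index",
    ]
    # Only the data file remains after filtering.
    print(data_files(listing))
```

A real fix would apply the same test inside the input format or record reader, so that index files in the table directory are never handed to the LZO codec as input.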