Avro, mail # user - Avro and Hadoop streaming


Re: Avro and Hadoop streaming
Miki Tebeka 2011-06-15, 16:26
Still didn't work.

I'm pretty new to the Hadoop world. I probably need to place the Avro jar
somewhere on the classpath of the nodes,
but I have no idea how to do that.

On Wed, Jun 15, 2011 at 3:33 AM, Harsh J <[EMAIL PROTECTED]> wrote:
> Miki,
>
> You'll need to provide the entire canonical class name
> (org.apache.avro.mapred…).
>
> On Wed, Jun 15, 2011 at 5:31 AM, Miki Tebeka <[EMAIL PROTECTED]> wrote:
>> Greetings,
>>
>> I've tried to run a job with the following command:
>>
>> hadoop jar ./hadoop-streaming-0.20.2-cdh3u0.jar \
>>    -input /in/avro \
>>    -output $out \
>>    -mapper avro-mapper.py \
>>    -reducer avro-reducer.py \
>>    -file avro-mapper.py \
>>    -file avro-reducer.py \
>>    -cacheArchive /cache/avro-mapred-1.6.0-SNAPSHOT.jar \
>>    -inputformat AvroAsTextInputFormat
>>
>> However I get
>> -inputformat : class not found : AvroAsTextInputFormat
>>
>> I'm probably missing something obvious to do.
>>
>> Any ideas?
>>
>> Thanks!
>> --
>> Miki
>>
>> On Fri, Jun 3, 2011 at 1:43 AM, Doug Cutting <[EMAIL PROTECTED]> wrote:
>>> Miki,
>>>
>>> Have you looked at AvroAsTextInputFormat?
>>>
>>> http://avro.apache.org/docs/current/api/java/org/apache/avro/mapred/AvroAsTextInputFormat.html
>>>
>>> Also, release 1.5.2 will include AvroTextOutputFormat:
>>>
>>> https://issues.apache.org/jira/browse/AVRO-830
>>>
>>> Are these perhaps what you're looking for?
>>>
>>> Doug
>>>
>>> On 06/02/2011 11:30 PM, Miki Tebeka wrote:
>>>> Greetings,
>>>>
>>>> I'd like to use hadoop streaming with Avro files.
>>>> My plan is to write an inputformat class that emits json records, one
>>>> per line. This way the streaming application can read one record per
>>>> line.
>>>> (http://hadoop.apache.org/common/docs/r0.15.2/streaming.html#Specifying+Other+Plugins+for+Jobs)
>>>>
>>>> I couldn't find any documentation/help about writing inputformat
>>>> classes. Can someone point me to the right direction?
>>>>
>>>> Thanks,
>>>> --
>>>> Miki
>>>
>>
>
>
>
> --
> Harsh J
>
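
For reference, the plan from the original message (one JSON record per line, read by the streaming script) could look roughly like the sketch below on the Python side. It is only an illustration under stated assumptions, not the contents of the avro-mapper.py used in this thread: it assumes each input line carries the JSON text of one record as the key, followed by a tab and an empty value, and the field name "name" and the helpers parse_record and map_records are hypothetical.

```python
#!/usr/bin/env python
import json
import sys


def parse_record(line):
    """Return the decoded JSON record from one streaming input line, or None."""
    # Assumption: the line looks like "<json text>\t<empty value>", so only the
    # part before the first tab is kept. Valid JSON escapes tabs inside strings,
    # so a raw tab should only appear as the key/value separator.
    text = line.rstrip("\n").split("\t", 1)[0]
    if not text.strip():
        return None
    return json.loads(text)


def map_records(lines, field="name"):
    """Yield (value, 1) for every record that has `field` (a hypothetical name)."""
    for line in lines:
        record = parse_record(line)
        # Skip blank lines and records that are not objects or lack the field.
        if isinstance(record, dict) and field in record:
            yield record[field], 1


if __name__ == "__main__":
    # Emit tab-separated key/value pairs, the usual streaming output layout.
    for key, count in map_records(sys.stdin):
        sys.stdout.write("%s\t%d\n" % (key, count))
```

Keeping the parsing in small functions makes it possible to try the script locally on a few sample lines before submitting a job.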