Avro >> mail # user >> Avro and Hadoop streaming


Miki Tebeka 2011-06-02, 21:30
Doug Cutting 2011-06-03, 08:43
Tatu Saloranta 2011-06-03, 16:18
Miki Tebeka 2011-06-15, 00:01
Re: Avro and Hadoop streaming
Miki,

You'll need to pass the fully qualified class name to -inputformat
(org.apache.avro.mapred…).
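
As a rough sketch of that suggestion, the command quoted below could be adjusted as follows. The full class name here is an assumption inferred from the Javadoc URL Doug cites further down the thread (org/apache/avro/mapred/AvroAsTextInputFormat.html); it is not spelled out in this reply. The `streaming_cmd` helper and the default value of `$out` are hypothetical, added only so the command can be printed and reviewed before it is run.

```shell
#!/bin/sh
# Assumption: full class name inferred from the Javadoc URL quoted in this
# thread; the reply itself only gives the package prefix.
INPUT_FORMAT="org.apache.avro.mapred.AvroAsTextInputFormat"

# Hypothetical default; the original command leaves $out to the caller.
out="${out:-/out/avro}"

# Print the streaming command instead of running it, so it can be reviewed.
# Only the -inputformat argument differs from the original command.
streaming_cmd() {
    printf '%s ' hadoop jar ./hadoop-streaming-0.20.2-cdh3u0.jar \
        -input /in/avro \
        -output "$out" \
        -mapper avro-mapper.py \
        -reducer avro-reducer.py \
        -file avro-mapper.py \
        -file avro-reducer.py \
        -cacheArchive /cache/avro-mapred-1.6.0-SNAPSHOT.jar \
        -inputformat "$INPUT_FORMAT"
    printf '\n'
}

streaming_cmd
```

Once the printed command looks right, it can be pasted into a shell on a machine where `hadoop` is available. This only addresses the class name; whether the Avro jar is actually visible on the job's classpath is a separate question.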

On Wed, Jun 15, 2011 at 5:31 AM, Miki Tebeka <[EMAIL PROTECTED]> wrote:
> Greetings,
>
> I've tried to run a job with the following command:
>
> hadoop jar ./hadoop-streaming-0.20.2-cdh3u0.jar \
>    -input /in/avro \
>    -output $out \
>    -mapper avro-mapper.py \
>    -reducer avro-reducer.py \
>    -file avro-mapper.py \
>    -file avro-reducer.py \
>    -cacheArchive /cache/avro-mapred-1.6.0-SNAPSHOT.jar \
>    -inputformat AvroAsTextInputFormat
>
> However, I get:
> -inputformat : class not found : AvroAsTextInputFormat
>
> I'm probably missing something obvious.
>
> Any ideas?
>
> Thanks!
> --
> Miki
>
> On Fri, Jun 3, 2011 at 1:43 AM, Doug Cutting <[EMAIL PROTECTED]> wrote:
>> Miki,
>>
>> Have you looked at AvroAsTextInputFormat?
>>
>> http://avro.apache.org/docs/current/api/java/org/apache/avro/mapred/AvroAsTextInputFormat.html
>>
>> Also, release 1.5.2 will include AvroTextOutputFormat:
>>
>> https://issues.apache.org/jira/browse/AVRO-830
>>
>> Are these perhaps what you're looking for?
>>
>> Doug
>>
>> On 06/02/2011 11:30 PM, Miki Tebeka wrote:
>>> Greetings,
>>>
>>> I'd like to use Hadoop streaming with Avro files.
>>> My plan is to write an inputformat class that emits JSON records, one
>>> per line. This way the streaming application can read one record per
>>> line.
>>> (http://hadoop.apache.org/common/docs/r0.15.2/streaming.html#Specifying+Other+Plugins+for+Jobs)
>>>
>>> I couldn't find any documentation/help about writing inputformat
>>> classes. Can someone point me in the right direction?
>>>
>>> Thanks,
>>> --
>>> Miki
>>
>

--
Harsh J
Miki Tebeka 2011-06-15, 16:26
Matt Pouttu-Clarke 2011-06-15, 16:30
Scott Carey 2011-06-15, 16:53
Miki Tebeka 2011-06-15, 17:36
Mona Gandhi 2011-07-12, 00:36
Miki Tebeka 2011-10-03, 23:21