Avro >> mail # user >> Avro Container file and JsonEncoding.


Re: Avro Container file and JsonEncoding.
AvroJob by default uses AvroInputFormat, which reads Avro Data Files.  You
can write your own InputFormat that returns Avro objects if you wish, but
you will be overriding more and more of the Avro mapreduce implementation.
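
As a quick sanity check while debugging, it can help to confirm that an
input really is an Avro Data File before handing it to the job.  The sketch
below relies only on the fact that an Avro object container file begins with
the four bytes 'O', 'b', 'j', 1; the class and method names (AvroMagicCheck,
hasAvroMagic, looksLikeAvroDataFile) are hypothetical helpers written for
this illustration and are not part of the Avro API.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper for illustration; not part of the Avro library.
final class AvroMagicCheck {
    // An Avro object container file starts with the bytes 'O', 'b', 'j', 1.
    private static final byte[] MAGIC = {'O', 'b', 'j', 1};

    /** Returns true if the given leading bytes begin with the Avro magic. */
    static boolean hasAvroMagic(byte[] head) {
        return head.length >= MAGIC.length
                && Arrays.equals(Arrays.copyOf(head, MAGIC.length), MAGIC);
    }

    /** Reads at most four bytes from the file and checks them. */
    static boolean looksLikeAvroDataFile(String path) throws IOException {
        byte[] head = new byte[MAGIC.length];
        int filled = 0;
        try (InputStream in = Files.newInputStream(Paths.get(path))) {
            while (filled < head.length) {
                int n = in.read(head, filled, head.length - filled);
                if (n < 0) {
                    break; // file is shorter than the magic
                }
                filled += n;
            }
        }
        return hasAvroMagic(Arrays.copyOf(head, filled));
    }

    public static void main(String[] args) throws IOException {
        // Report on every path given on the command line.
        for (String path : args) {
            String verdict = looksLikeAvroDataFile(path)
                    ? "Avro data file"
                    : "not an Avro data file";
            System.out.println(path + ": " + verdict);
        }
    }
}
```

A hand-edited JSON text file will fail this check, which is why it cannot be
fed to the default input format directly.  One option I believe is available
is the avro-tools 'fromjson' tool (the counterpart of 'tojson'), which should
be able to turn such a file into a data file given the schema; check its
usage output for the exact options in your version.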

If you have use cases that are in need of easier configuration, debugging,
or greater flexibility please capture the request and use cases in a JIRA
ticket.  It will be useful for others who choose to volunteer time to
enhance that part of Avro.

Thanks!

On 2/8/12 10:06 AM, "karthik ramachandran" <[EMAIL PROTECTED]> wrote:

> I'm in the process of writing/debugging a MapReduce job and the Avro MapRed
> API seems to require that the input file be a proper Avro container file.
>
>
> I was hoping to be able to use the AvroMapper interface, feeding it a JSON
> file just as a debugging step.  That way I can use vi to modify values in the
> JSON structure.  However, if the Avro file format has binary delimiters, then
> this is probably not a viable approach.
>
> Thanks,
> Karthik
>
>
> On Wed, Feb 8, 2012 at 12:57 PM, Scott Carey <[EMAIL PROTECTED]> wrote:
>>
>>
>> On 2/8/12 7:14 AM, "karthik ramachandran" <[EMAIL PROTECTED]> wrote:
>>
>>> Hi,
>>>
>>> I'm trying to figure out if it's possible to create an Avro container file
>>> with JsonEncoding.  It doesn't appear to be:
>>> org.apache.avro.file.DataFileWriter seems to use a binary encoder by
>>> default.
>>
>> One thing to note is that if you write it to an Avro container file in binary
>> it will be significantly smaller.  You can extract the contents as JSON using
>> either the C command line tools or the Java 'tojson' tool.  If the reason you
>> want it in JSON is for human readability, this is all you need.
>>
>> For example, I often do the following:
>>
>> java -jar avro-tools.jar tojson my_avro_file.avro | grep ...
>>
>> or pipe it to other tools to view or interpret as JSON.
>>
>>>
>>> Is there another FileWriter class that I should be using?
>>
>> See Doug's comments.  It doesn't make sense to store JSON in an Avro Data
>> File because it is delimited with binary markers and contains binary
>> metadata.
>>>
>>>
>>> Karthik
>>>
>>> --
>>> Karthik Ramachandran
>>>
>
>
>
> --
> Karthik Ramachandran