Avro, mail # user - Re: Is it possible to append to an already existing avro file


Re: Is it possible to append to an already existing avro file
Harsh J 2013-02-07, 16:56
I *completely* missed that, although I've worked with it in the past. Thanks, Doug!

I updated my example: https://gist.github.com/QwertyManiac/4724582.
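
For readers who cannot open the gist, the pattern discussed in the quoted thread below can be sketched without any Avro or Hadoop dependency. This is a toy illustration only, not Avro's real file layout or API: the class ToyAppend, the SeekableSource interface, and the record format are all hypothetical. It shows why an append needs two handles on the same file: a seekable input to read the existing header (here a 16-byte sync marker), and an output stream opened in append mode to write new blocks that reuse that marker. On HDFS, per the messages below, those two roles would be filled by FsInput and by the stream returned from FileSystem#append().

```java
import java.io.Closeable;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.security.SecureRandom;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy layout (NOT Avro's): [16-byte sync] then blocks of [int length][payload][sync]. */
class ToyAppend {
    static final int SYNC_SIZE = 16;

    /** Hypothetical stand-in for a seekable input: random-access reads of an existing file. */
    interface SeekableSource extends Closeable {
        void seek(long pos) throws IOException;
        int read(byte[] buf, int off, int len) throws IOException;
    }

    /** Local-file implementation of the hypothetical interface. */
    static SeekableSource open(File file) throws IOException {
        final RandomAccessFile raf = new RandomAccessFile(file, "r");
        return new SeekableSource() {
            @Override public void seek(long pos) throws IOException { raf.seek(pos); }
            @Override public int read(byte[] b, int off, int len) throws IOException { return raf.read(b, off, len); }
            @Override public void close() throws IOException { raf.close(); }
        };
    }

    /** Writes one block per record, each terminated by the file's sync marker. */
    private static void writeBlocks(OutputStream out, byte[] sync, String... records) throws IOException {
        DataOutputStream data = new DataOutputStream(out);
        for (String record : records) {
            byte[] payload = record.getBytes(StandardCharsets.UTF_8);
            data.writeInt(payload.length);
            data.write(payload);
            data.write(sync);
        }
        data.flush();
    }

    /** Creates a new file: a fresh random sync marker as the header, then the blocks. */
    static void create(File file, String... records) throws IOException {
        byte[] sync = new byte[SYNC_SIZE];
        new SecureRandom().nextBytes(sync);
        try (OutputStream out = new FileOutputStream(file)) {
            out.write(sync);
            writeBlocks(out, sync, records);
        }
    }

    /** Appends: reads the existing sync marker via the seekable input, writes blocks to the append stream. */
    static void appendTo(SeekableSource in, OutputStream out, String... records) throws IOException {
        byte[] sync = new byte[SYNC_SIZE];
        in.seek(0);
        int filled = 0;
        while (filled < SYNC_SIZE) {
            int n = in.read(sync, filled, SYNC_SIZE - filled);
            if (n < 0) throw new EOFException("file too short to contain a header");
            filled += n;
        }
        writeBlocks(out, sync, records);
    }

    /** Reads every record back, checking that each block ends with the header's sync marker. */
    static List<String> readAll(File file) throws IOException {
        ByteBuffer buf = ByteBuffer.wrap(Files.readAllBytes(file.toPath()));
        byte[] sync = new byte[SYNC_SIZE];
        buf.get(sync);
        List<String> records = new ArrayList<>();
        while (buf.hasRemaining()) {
            byte[] payload = new byte[buf.getInt()];
            buf.get(payload);
            byte[] marker = new byte[SYNC_SIZE];
            buf.get(marker);
            if (!Arrays.equals(sync, marker)) throw new IOException("sync marker mismatch");
            records.add(new String(payload, StandardCharsets.UTF_8));
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("toy", ".bin");
        file.deleteOnExit();
        create(file, "a", "b");
        try (SeekableSource in = open(file);
             OutputStream out = new FileOutputStream(file, true)) { // true = append mode
            appendTo(in, out, "c");
        }
        System.out.println(readAll(file)); // prints [a, b, c]
    }
}
```

The appended block only validates on read-back because it reuses the marker taken from the existing header, which is why the append call takes both an input and an output.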

On Thu, Feb 7, 2013 at 10:21 PM, Doug Cutting <[EMAIL PROTECTED]> wrote:
> The avro-mapred module includes a Seekable implementation that works
> with HDFS called FsInput:
>
> http://avro.apache.org/docs/current/api/java/org/apache/avro/mapred/FsInput.html
>
> With this, your example can be made considerably smaller.
>
> Doug
>
>
>
> On Thu, Feb 7, 2013 at 8:28 AM, Harsh J <[EMAIL PROTECTED]> wrote:
>> I assume by non-trivial you meant the extra Seekable stuff I needed to
>> wrap around the DFS output streams to let Avro take it as append-able?
>> I don't think it's possible for Avro to carry it since Avro (core) does
>> not reverse-depend on Hadoop. Should we document it somewhere though?
>> Do you have any ideas on the best place to do that?
>>
>> On Thu, Feb 7, 2013 at 6:12 AM, Michael Malak <[EMAIL PROTECTED]> wrote:
>>> Thanks so much for the code -- it works great!
>>>
>>> Since it is a non-trivial amount of code required to achieve append, I suggest attaching that code to AVRO-1035, in the hopes that someone will come up with an interface that requires just one line of user code to achieve append.
>>>
>>> --- On Wed, 2/6/13, Harsh J <[EMAIL PROTECTED]> wrote:
>>>
>>>> From: Harsh J <[EMAIL PROTECTED]>
>>>> Subject: Re: Is it possible to append to an already existing avro file
>>>> To: [EMAIL PROTECTED]
>>>> Date: Wednesday, February 6, 2013, 11:17 AM
>>>> Hey Michael,
>>>>
>>>> FileSystem#append() returns an FSDataOutputStream, which is a regular Java
>>>> OutputStream, as seen in the API:
>>>> http://hadoop.apache.org/docs/current/api/org/apache/hadoop/fs/FSDataOutputStream.html
>>>>
>>>> Here's a sample program that works on Hadoop 2.x in my tests:
>>>> https://gist.github.com/QwertyManiac/4724582
>>>>
>>>> On Wed, Feb 6, 2013 at 9:00 AM, Michael Malak <[EMAIL PROTECTED]> wrote:
>>>> > I don't believe a Hadoop FileSystem is a Java OutputStream?
>>>> >
>>>> > --- On Tue, 2/5/13, Doug Cutting <[EMAIL PROTECTED]> wrote:
>>>> >
>>>> >> From: Doug Cutting <[EMAIL PROTECTED]>
>>>> >> Subject: Re: Is it possible to append to an already existing avro file
>>>> >> To: [EMAIL PROTECTED]
>>>> >> Date: Tuesday, February 5, 2013, 5:27 PM
>>>> >> It will work on an OutputStream that supports append.
>>>> >>
>>>> >> http://avro.apache.org/docs/current/api/java/org/apache/avro/file/DataFileWriter.html#appendTo(org.apache.avro.file.SeekableInput, java.io.OutputStream)
>>>> >>
>>>> >> So it depends on how well HDFS implements FileSystem#append(), not on
>>>> >> any changes in Avro.
>>>> >>
>>>> >> http://hadoop.apache.org/docs/current/api/org/apache/hadoop/fs/FileSystem.html#append(org.apache.hadoop.fs.Path)
>>>> >>
>>>> >> I have no recent personal experience with append in HDFS. Does anyone
>>>> >> else here?
>>>> >>
>>>> >> Doug
>>>> >>
>>>> >> On Tue, Feb 5, 2013 at 4:10 PM, Michael Malak <[EMAIL PROTECTED]> wrote:
>>>> >> > My understanding is that this will append to a file on the local filesystem, but not to a file on HDFS.
>>>> >> >
>>>> >> > --- On Tue, 2/5/13, Doug Cutting <[EMAIL PROTECTED]> wrote:
>>>> >> >
>>>> >> >> From: Doug Cutting <[EMAIL PROTECTED]>
>>>> >> >> Subject: Re: Is it possible to append to an already existing avro file
>>>> >> >> To: [EMAIL PROTECTED]
>>>> >> >> Date: Tuesday, February 5, 2013, 5:08 PM
>>>> >> >> The Jira is:
>>>> >> >>
>>>> >> >> https://issues.apache.org/jira/browse/AVRO-1035
>>>> >> >>
>>>> >> >> It is possible to append to an existing Avro file:
>>>> >> >>
>>>> >> >> http://avro.apache.org/docs/current/api/java/org/apache/avro/file/DataFileWriter.html#appendTo(java.io.File)
>>>> >> >>
>>>> >> >> Should we close that issue as "fixed"?
>>>> >> >>
>>>> >> >> Doug
>>>> >> >>
>>>> >> >> On Fri, Feb 1, 2013 at 11:32 AM, Michael Malak
