Michael Malak 2013-02-01, 19:32
Re: Is it possible to append to an already existing avro file
The Jira is:

https://issues.apache.org/jira/browse/AVRO-1035

It is possible to append to an existing Avro file:

http://avro.apache.org/docs/current/api/java/org/apache/avro/file/DataFileWriter.html#appendTo(java.io.File)

Should we close that issue as "fixed"?

Doug

On Fri, Feb 1, 2013 at 11:32 AM, Michael Malak <[EMAIL PROTECTED]> wrote:
> Was a JIRA ticket ever created regarding appending to an existing Avro file on HDFS?
>
> What is the status of such a capability, a year out from when the issue below was raised?
>
> On Wed, 22 Feb 2012 10:57:48 +0100, "Vyacheslav Zholudev" <[EMAIL PROTECTED]> wrote:
>
>> Thanks for your reply, I suspected this.
>>
>> I will create a JIRA ticket.
>>
>> Vyacheslav
>>
>> On Feb 21, 2012, at 6:02 PM, Scott Carey wrote:
>>
>>>
>>> On 2/21/12 7:29 AM, "Vyacheslav Zholudev" <[EMAIL PROTECTED]>
>>> wrote:
>>>
>>>> Yep, I saw that method as well as the stackoverflow post. However, I'm
>>>> interested in how to append to a file on an arbitrary file system, not
>>>> only on the local one.
>>>>
>>>> I want to get an OutputStream based on the Path and the FileSystem
>>>> implementation and then pass it for appending to avro methods.
>>>>
>>>> Is that possible?
>>>
>>> It is not possible without modifying DataFileWriter. Please open a JIRA
>>> ticket.
>>>
>>> It could not simply append to an OutputStream, since it must either:
>>> * Seek to the start to validate the schemas match and find the sync
>>> marker, or
>>> * Trust that the schemas match and find the sync marker from the last
>>> block
>>>
>>> DataFileWriter cannot refer to Hadoop classes such as FileSystem, but we
>>> could add something to the mapred module that takes a Path and
>>> FileSystem and returns something that implements an interface that
>>> DataFileWriter can append to.  This would be something that is both a
>>> SeekableInput
>>> (http://avro.apache.org/docs/1.6.2/api/java/org/apache/avro/file/SeekableInput.html)
>>> and an OutputStream, or has both an InputStream from the start of the
>>> existing file and an OutputStream at the end.
>>>
>>>> Thanks,
>>>> Vyacheslav
>>>>
>>>> On Feb 21, 2012, at 5:29 AM, Harsh J wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> Use the appendTo feature of the DataFileWriter. See
>>>>>
>>>>> http://avro.apache.org/docs/1.6.2/api/java/org/apache/avro/file/DataFileWriter.html#appendTo(java.io.File)
>>>>>
>>>>> For a quick setup example, read also:
>>>>>
>>>>> http://stackoverflow.com/questions/8806689/can-you-append-data-to-an-existing-avro-data-file
>>>>>
>>>>> On Tue, Feb 21, 2012 at 3:15 AM, Vyacheslav Zholudev
>>>>> <[EMAIL PROTECTED]> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> is it possible to append to an already existing avro file when it was
>>>>>> written and closed before?
>>>>>>
>>>>>> If I use
>>>>>> outputStream = fs.append(avroFilePath);
>>>>>>
>>>>>> then later on I get: java.io.IOException: Invalid sync!
>>>>>>
>>>>>> Probably because the schema is written twice and some other issues.
>>>>>>
>>>>>> If I use outputStream = fs.create(avroFilePath); then the avro file
>>>>>> gets overwritten.
>>>>>>
>>>>>> Thanks,
>>>>>> Vyacheslav
>>>>>
>>>>> --
>>>>> Harsh J
>>>>> Customer Ops. Engineer
>>>>> Cloudera | http://tiny.cloudera.com/about
>
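
For readers following the quoted discussion: the first option Scott Carey describes (read the start of the existing file to validate the schema and find the sync marker, then write new blocks at the end) can be sketched in plain Java roughly as below. This is only an illustration under loud assumptions. The container layout here is a made-up toy format, not Avro's actual file format, and the names ToyContainer, create, writeBlock, openForAppend and readAll are hypothetical helpers, not part of DataFileWriter. In real code the existing appendTo method linked above is the thing to use for local files.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/**
 * Toy container (NOT Avro's real layout, an assumption for illustration):
 * header = magic, schema string, 16-byte sync marker;
 * block  = record count, records, the same sync marker.
 */
final class ToyContainer {
    static final int MAGIC = 0x544F5931; // ASCII "TOY1"
    static final int SYNC_SIZE = 16;

    /** Writes a fresh header. Doing this again on an existing file is the naive mistake. */
    static void create(OutputStream out, String schema, byte[] sync) throws IOException {
        DataOutputStream d = new DataOutputStream(out);
        d.writeInt(MAGIC);
        d.writeUTF(schema);
        d.write(sync);
        d.flush();
    }

    /** Writes one block of records, terminated by the file's sync marker. */
    static void writeBlock(OutputStream out, byte[] sync, List<String> records) throws IOException {
        DataOutputStream d = new DataOutputStream(out);
        d.writeInt(records.size());
        for (String r : records) d.writeUTF(r);
        d.write(sync);
        d.flush();
    }

    /** Reads the header from the start of the existing data, checks the schema, returns the sync marker. */
    static byte[] openForAppend(InputStream start, String expectedSchema) throws IOException {
        DataInputStream d = new DataInputStream(start);
        if (d.readInt() != MAGIC) throw new IOException("Not a toy container");
        String schema = d.readUTF();
        if (!schema.equals(expectedSchema)) throw new IOException("Schema mismatch: " + schema);
        byte[] sync = new byte[SYNC_SIZE];
        d.readFully(sync);
        return sync;
    }

    /** Reads every record, verifying the sync marker after each block. */
    static List<String> readAll(InputStream in) throws IOException {
        DataInputStream d = new DataInputStream(in);
        if (d.readInt() != MAGIC) throw new IOException("Not a toy container");
        d.readUTF(); // schema, not needed for this sketch
        byte[] sync = new byte[SYNC_SIZE];
        d.readFully(sync);
        List<String> records = new ArrayList<>();
        byte[] seen = new byte[SYNC_SIZE];
        while (true) {
            int count;
            try { count = d.readInt(); } catch (EOFException eof) { return records; }
            for (int i = 0; i < count; i++) records.add(d.readUTF());
            d.readFully(seen);
            if (!Arrays.equals(sync, seen)) throw new IOException("Invalid sync!");
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] sync = new byte[SYNC_SIZE];
        Arrays.fill(sync, (byte) 42); // fixed here for repeatability; a real writer would randomize
        ByteArrayOutputStream file = new ByteArrayOutputStream(); // stands in for the file
        create(file, "string", sync);
        writeBlock(file, sync, Arrays.asList("a", "b"));
        // Later: reopen for append by reading the header from the start of the data.
        byte[] found = openForAppend(new ByteArrayInputStream(file.toByteArray()), "string");
        writeBlock(file, found, Arrays.asList("c"));
        System.out.println(readAll(new ByteArrayInputStream(file.toByteArray()))); // prints [a, b, c]
    }
}
```

The point of the sketch is that an append needs two handles on the same data: an input positioned at the start (for the header) and an output positioned at the end (for new blocks). That is the shape of the interface Scott suggests the mapred module could provide for a Path and FileSystem.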
