Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Avro >> mail # dev >> Memory leak in resolved-writer.c in Avro C, with patch


Copy link to this message
-
Memory leak in resolved-writer.c in Avro C, with patch
Hi -

I was able to isolate a memory leak in the new schema resolver
(resolved-writer.c) in Avro C on the avro-trunk (v1.6.0), in which the
code attempts to access and free memory that has already been freed. I
was able to fix it with a simple patch. What is the best way to proceed?

Longer Version:

I created a small test program to test schema resolution in C. The
program contains a writer schema with two integer elements and a reader
schema with three integer elements. Since the reader schema has elements
that the writer doesn't contain, and default values are not yet
supported in C, the function try_record() goes into the "error" section
(by design) and tries to clean up after itself. This cleanup process has
a memory bug in it.

Specifically, if the reader schema contains two elements of the same
type (in my case two ints), try_record() uses the same resolver for both
elements. This common resolver is used because the function
avro_resolved_writer_new_memoized() tests to see if the schema's of the
two elements are the same, and if they are matched, it returns the
resolver, and does not create a new resolver. So, multiple elements of
the array of field_resolvers[] can contain pointers to the same
resolver.

During the cleanup process, there is a loop that checks if the pointer
field_resolvers[i] is non-NULL, and tries to free all non-NULL
resolvers. Since we have two elements of the field_resolvers[] array
pointing to the same resolver, the code tries to call
avro_value_iface_decref() on an already freed resolver.

This can be simply fixed by checking the field_resolvers[] array for
duplicate entries, and setting the duplicates to NULL. This is what I do
in my patch, appended below. I also verified the patch using valgrind.
Multiple valgrind errors were seen in try_record() and
avro_refcount_dec() before applying the patch, and these errors went
away after applying the patch.

I would be happy to provide my test program, upload my bug report and
patch to JIRA, or make any requested changes to the patch. Please let me
know how to proceed.

Thanks,
Vivek Nadkarni
*******************************************
Start patch
*******************************************

diff --git a/lang/c/src/resolved-writer.c b/lang/c/src/resolved-writer.c
old mode 100644
new mode 100755
index bc7a93e..bb219b1
--- a/lang/c/src/resolved-writer.c
+++ b/lang/c/src/resolved-writer.c
@@ -2431,7 +2431,15 @@ error:
        unsigned int  i;
        for (i = 0; i < wfields; i++) {
            if (field_resolvers[i]) {
+               unsigned int j;
                avro_value_iface_decref(&field_resolvers[i]->parent);
+               for (j = i+1; j < wfields; j++) {
+                   /* Compare pointers to detect and remove duplicates.
*/
+                   if ( field_resolvers[j] == field_resolvers[i] ) {
+                       field_resolvers[j] = NULL;
+                   }
+               }
+               field_resolvers[i] = NULL;
            }
        }
    }
*******************************************
End patch
*******************************************