This sounds like the Ruby implementation is not encoding strings as UTF-8
on your platform. It may be a bug, but I am not familiar enough with the
Ruby implementation to know for sure.
The Avro specification states that "a string is encoded as a long followed
by that many bytes of UTF-8 encoded character data."
If you think that the Ruby implementation does not adhere to the spec,
please file a bug in JIRA.
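
To make the spec's rule concrete, here is a minimal Ruby sketch of what a
conforming encoder would write for a string. It is not the Avro gem's code:
encode_long and encode_string are hypothetical helper names, and it assumes
Ruby 1.9 or later, where String#encode is available.

```ruby
# Illustrative sketch only; these helpers are NOT part of the Avro Ruby gem.

# Avro writes a long as a zigzag-mapped, variable-length integer.
def encode_long(n)
  # Zigzag: map signed values to unsigned (0, -1, 1, -2 become 0, 1, 2, 3).
  z = ((n << 1) ^ (n >> 63)) & 0xFFFFFFFFFFFFFFFF
  bytes = []
  while z > 0x7F
    bytes << ((z & 0x7F) | 0x80) # low 7 bits, continuation bit set
    z >>= 7
  end
  bytes << z
  bytes.pack("C*") # binary (ASCII-8BIT) string
end

# A string is the byte length as a long, followed by the UTF-8 bytes.
def encode_string(s)
  utf8 = s.encode(Encoding::UTF_8).b # transcode, then view as raw bytes
  encode_long(utf8.bytesize) + utf8
end
```

With this sketch, a string tagged as ISO-8859-1 is transcoded before it is
written, so the length prefix counts UTF-8 bytes rather than characters.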
On 1/4/12 3:59 AM, "kafka0102 kafka0102" <[EMAIL PROTECTED]> wrote:
> I use Avro's Java and Ruby clients. When they communicate, the Ruby client
> always encodes (decodes) multi-byte UTF-8 characters as Latin-1. For now,
> when the data contains multi-byte characters, I first convert it with
> Iconv.conv("UTF8", "LATIN1",data) in the Ruby client, and then decode it
> with Utils.conv(data, "ISO-8859-1","UTF-8"); in the Java server. It works,
> but it is too ugly. I see that the Avro Ruby client uses StringIO to pack
> the data, but I cannot find a way to make it support multi-byte characters.
> Can anyone help me?