Storing the keys as native UTF-16 or UTF-32 will not help. Even
strings in UTF-8 format already sort by code point when compared as
bytes. Unfortunately, code point order is not really useful for
collation: a character like "è" (U+00E8) should appear between "e"
(U+0065) and "f" (U+0066), but the code points do not allow this.
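
To make the point concrete, here is a minimal Python sketch (not HBase
code). It sorts the three characters above by their encoded bytes, the
way a byte-ordered store would. The helper names are made up for this
example, and crude_collation_key is only a rough stand-in for a real
collation key: it strips accents via NFD decomposition and is not the
Unicode Collation Algorithm.

```python
import unicodedata


def byte_order_sort(words, encoding):
    """Sort strings by their encoded bytes, as a byte-ordered store would."""
    return sorted(words, key=lambda w: w.encode(encoding))


def crude_collation_key(word):
    """Hypothetical stand-in for a collation key (NOT the real UCA).

    Decomposes the string, drops combining marks to get the base
    letters, and uses the original string as a tie-breaker.
    """
    decomposed = unicodedata.normalize("NFD", word)
    base = "".join(c for c in decomposed if not unicodedata.combining(c))
    return (base, word)


words = ["f", "\u00e8", "e"]  # "\u00e8" is "è"

# Big-endian variants are used so no byte order mark is prepended.
for encoding in ("utf-8", "utf-16-be", "utf-32-be"):
    print(encoding, byte_order_sort(words, encoding))

print("crude collation", sorted(words, key=crude_collation_key))
```

In all three encodings "è" ends up after "f", because U+00E8 is larger
than U+0066 however it is serialized. Only the key-based sort places
it between "e" and "f".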
On Fri, Jun 8, 2012 at 11:14 AM, Stack <[EMAIL PROTECTED]> wrote:
> On Fri, Jun 8, 2012 at 9:35 AM, Tom Brown <[EMAIL PROTECTED]> wrote:
>> Is there any way to introduce a different ordering scheme from
>> the base comparable bytes? My use case is that I am using UTF-8 data
>> for my keys, and I would like to have scans use UTF-8 collation.
>> Could this be done by providing an alternate implementation of
>> Thanks in advance!
> Unfortunately no Tom. The database is all sorted the same way.
> Different sorts per table would complicate system interactions (the
> catalog tables would have to change sort by table). It might be
> doable but it would take some work.
> Can you store your data UTF-16 or UTF-32? It's been a while since I dealt
> w/ this stuff but IIRC, their sort order is byte order? (WARNING! I
> could be way off here).