mysql and utf8 handling in FreeRADIUS 3.2.x

Thu Apr 3 13:43:02 UTC 2025

On Apr 3, 2025, at 7:35 AM, Bjørn Mork via Freeradius-Users <freeradius-users at lists.freeradius.org> wrote:
> I have several questions after hitting this hard:
> 
> Should FR handle the situation better, detecting character set
> mismatches and automatically escaping utf8 multi-byte sequences when the
> target table uses a different charset?

  In general, detecting character sets is impossible.  Many byte sequences are valid in more than one character set, so any detection attempt is likely to get things wrong.

> Should there be a way to configure rlm_sql into always escaping utf8
> multi-byte sequences?

  Not right now.

> Does the default aggressive ascii escaping really align with allowing
> any multi-byte utf8?  The end results look strange and unexpected IMHO.

  SQL databases use ASCII characters for quotes, statement separation, etc.  They treat UTF-8 multi-byte characters as ordinary data, the same way they treat upper- and lower-case letters.  So valid UTF-8 characters (with the high bit set) don't need to be escaped.

> Any thoughts outside "don't do that then"?

  This is definitely a "don't do that" scenario.  Attributes of type 'octets' can contain anything.  If you're going to store them as printable strings, you have to be careful.  Either verify that their content matches the expected character set, or encode them as hex / base64 / etc.
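
  As a rough illustration of the "verify or encode" choice above, here is a minimal Python sketch.  It is not FreeRADIUS code.  The function name to_storable and the "0x" hex prefix are assumptions made for this example.  It keeps the value as text only when it is valid, printable UTF-8, and falls back to hex otherwise.

```python
def to_storable(value: bytes) -> str:
    """Return value as text if it is valid, printable UTF-8; else as hex.

    Hypothetical helper for illustration only. The "0x" prefix is an
    assumed convention, not something rlm_sql produces.
    """
    try:
        text = value.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8: store the raw bytes as hex instead of guessing.
        return "0x" + value.hex()
    if text.isprintable():
        return text
    # Valid UTF-8 but contains control characters: hex is the safe form.
    return "0x" + value.hex()
```

  The same check could be done against any other expected character set by changing the codec name passed to decode().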

  The short-term solution is perhaps to write a small patch which double-checks that the data matches the latin1 character set you're using.  Any data which doesn't match can be converted to escape sequences.
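
  The check described above could look roughly like the following Python sketch.  It is only an illustration of the logic, not the patch itself, which would have to be written in C inside the server.  The function name escape_non_latin1 and the "=XX" escape format are assumptions for this example.  Input is assumed to arrive as UTF-8 bytes.

```python
def escape_non_latin1(value: bytes) -> str:
    """Keep characters that fit in latin1, escape everything else as =XX.

    Hypothetical helper for illustration only. The =XX format (one escape
    per byte, upper-case hex) is an assumed convention.
    """
    try:
        text = value.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8 at all: escape every byte.
        return "".join(f"={b:02X}" for b in value)

    out = []
    for ch in text:
        if ch == "=":
            # Escape the escape character itself so the result is reversible.
            out.append("=3D")
            continue
        try:
            ch.encode("latin-1")
            out.append(ch)
        except UnicodeEncodeError:
            # Outside latin1: emit the UTF-8 bytes as escape sequences.
            out.extend(f"={b:02X}" for b in ch.encode("utf-8"))
    return "".join(out)
```

  Note that Python's latin-1 codec accepts every code point up to U+00FF, including control characters, so a real patch would probably want a stricter test than this one.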

  Alan DeKok.