Freeradius 3.0.12 problems - redis and mysql pool connections
aland at deployingradius.com
Fri Dec 23 02:51:37 CET 2016
On Dec 22, 2016, at 4:09 PM, Michal Tomaszewski <Michal.Tomaszewski at cca.pl> wrote:
>> That's likely a source of the problem. If the HA proxy isn't closing bad connections, FreeRADIUS has no idea that the connection is bad,
> Not exactly.
> It doesn't happen in case of MySQL.
It doesn't matter to FreeRADIUS what the back-end database is.
> And please note - the same situation appears when you stop and start haproxy service. Connections are closed for sure as haproxy is stopped, so such connection does not exist any more.
Then the OS is responsible for telling FreeRADIUS that the connection is closed.
i.e. when FreeRADIUS tries to use the connection, it gets told "connection closed", and it tries another one. The server *may* keep an idle connection open, even if it's dead. That's because the server doesn't know it's dead.
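To illustrate the point outside of FreeRADIUS, here is a minimal Python sketch. It is not FreeRADIUS code. It assumes a local socket pair as a stand-in for a pooled connection and its peer, and the helper name dead_on_use is made up for this example. While the connection sits idle, nothing announces that the peer has gone away. Only an attempt to use the socket lets the OS report the close.

```python
import socket

def dead_on_use(conn, timeout=0.5):
    """Try to use the connection; return True if the OS reports it closed."""
    conn.settimeout(timeout)
    try:
        # MSG_PEEK looks at pending data without consuming it.
        # An empty result means the peer closed the connection cleanly.
        return conn.recv(1, socket.MSG_PEEK) == b""
    except socket.timeout:
        # Nothing to read, but the OS did not report a close either.
        return False
    except (ConnectionResetError, BrokenPipeError):
        return True

# Hypothetical stand-ins: "pooled" is the idle connection, "peer" is the far end.
pooled, peer = socket.socketpair()
peer.close()

# The idle socket object still looks open; no callback told us otherwise.
print(pooled.fileno() != -1)
# Only using the connection reveals that it is dead.
print(dead_on_use(pooled))
pooled.close()
```

If the far end (or a proxy in between) never closes its side, the same check reports nothing wrong, which is the "looks alive but is really dead" case.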
> What's more, there is no way other than haproxy to point freeradius at the master redis-server, since redis has no other mechanism to check which server is the master and redirect requests. Using haproxy is the preferred approach. It sends PING/PONG to check whether the redis server is alive.
That's really the responsibility of the underlying library. It's also bad practice to have FreeRADIUS periodically "ping" every connection. There may be many connections, and if the server is idle, the pings serve no purpose.
> And in freeradius we have no other tools to choose master redis-server. Writes to slave are not replicated to master and thus are not persistent...
Version 4 implements full Redis cluster support.
> Freeradius does not check the state if the connection is alive.
If it's not using the connection, it doesn't matter if the connection is dead. If it is using the connection, it will discover when the connection is dead.
> Maybe redis-tools (or the redis library) is broken and does not offer this possibility, but you can always verify whether a connection is alive using the PING/PONG mechanism.
Then fix the library. We just don't have time to implement work-arounds for bugs in DB libraries.
> So changing idle_timeout to 60 secs makes it work much better. But with MySQL, for example, the connection state is checked much better.
Probably because the MySQL client library is smarter than the Redis one.
> I've changed it and it helps. But there is still 60-second delay to have redis-server responding.
I have no idea why.
Again, if the connection is dead, then the server discovers that immediately. If the connection *looks* alive but is really dead, then blame HA Proxy or the OS. They're responsible for closing connections.
>> I don't see that ERROR as much of a problem. The server tried to use an old connection, it failed, and when the server tried another connection, that one failed, too.
> Yes. But the connection has expired, and probably something in the connection pool management can be improved.
How? Make a concrete suggestion. Or send a patch.
> The server is not down. It is working perfectly all the time. Those messages do not appear because the MySQL server is down; they appear in normal operation, when the server is 100% available. They probably appear because idle_timeout closes the connection and freeradius does not record the fact that the connection no longer exists.
No. And why guess? The source is available to you.
The idle_timeout configuration is enforced *before* the SQL module gets a connection. The SQL module just asks for a connection, and then gets one.
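As a rough illustration of that ordering, here is a toy pool in Python. It is a sketch only, not the FreeRADIUS implementation. The Pool and FakeConn classes, the injected clock, and the 60-second value are all assumptions made for this example. The idle check runs inside get(), so a caller never receives a connection that has already exceeded idle_timeout.

```python
import time
from collections import deque

class FakeConn:
    """Hypothetical stand-in for a database connection."""
    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

class Pool:
    """Toy pool that enforces idle_timeout before handing out a connection."""
    def __init__(self, create, idle_timeout, clock=time.monotonic):
        self._create = create
        self._idle_timeout = idle_timeout
        self._clock = clock
        self._idle = deque()  # entries are (connection, time of last use)

    def get(self):
        now = self._clock()
        while self._idle:
            conn, last_used = self._idle.pop()
            if now - last_used < self._idle_timeout:
                return conn  # still fresh: reuse it
            conn.close()     # idle too long: close it, never hand it out
        return self._create()  # nothing usable left: open a new connection

    def release(self, conn):
        self._idle.append((conn, self._clock()))

# Drive the pool with a fake clock so the example runs instantly.
now = [0.0]
pool = Pool(FakeConn, idle_timeout=60, clock=lambda: now[0])

first = pool.get()
pool.release(first)      # released at t=0
now[0] = 30.0
reused = pool.get()      # idle for 30s: the same connection comes back
pool.release(reused)     # released at t=30
now[0] = 91.0
second = pool.get()      # idle for 61s: closed and replaced

print(reused is first)                             # True
print(second is first, first.closed, second.closed)  # False True False
```

In this sketch the module-side caller only ever calls get() and release(); it has no way to end up holding a connection that the pool has already expired.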
>> Then set idle_timeout to a smaller value. That way idle connections will get closed more quickly, even if you have "lifetime" set to one day.
> Idle_timeout is in default value of 60 seconds.
> If I change it to e.g. 2 secs I have much more ERROR messages.
I don't see how that happens.
>> Perhaps I could turn this problem around. What do you want FreeRADIUS to *do* when the SQL server goes down? How does FreeRADIUS know that SQL is down? What happens when SQL goes down?
> SQL is working all the time without any interruption. It is not this case.
>> These aren't simple questions to answer. FreeRADIUS does its best with the information it has.
> In my opinion the connection is closed because of idle_timeout, but freeradius still tries to use the closed connection...
That's just not possible in the current code.
Again, why guess? The code is available to you. Go read it.