segmentation fault freeradius 2.1.7 using rlm_sql

Tue Aug 2 15:48:47 CEST 2011

>> Upgraded freeradius to 2.1.11 (built from source)

> Don't use 2.1.11; it segfaults. Check out the head of the 2.1.X branch in git

> Notice how I DIDN'T suggest upgrading to 2.1.11, but to the v2.1.x
> branch in git? There's a reason for that, and you just found out the
> hard way.

"Houston, we have a problem" ;-)

This is not the first time a FreeRADIUS release was not ready for
production when it was released. Those of us who package upstream
projects for distribution worry a lot about stability and robustness.
I've said this before, so forgive me for repeating it. Please don't get
mad at the messenger; I have only the best intentions with these
observations.

FreeRADIUS has some problems which other projects have avoided.

* FreeRADIUS has no notion of a "stable release". Many projects maintain
both a stable production version and a current version. The current
version is not the same as the "tip": it is tagged in source code
control, tested, and released just like any other release; it just has a
few more features than the rock solid stable release. The rock solid
stable release has been field proven, should have the absolute
confidence of system administrators, and should be viable for multiple
years (in other words, you can install it and be confident that once
it's put in production you're good to go for several years).
Occasionally a stable release needs a bug or security fix. When that
occurs, the stable release is surgically modified to fix exactly that
one issue and its minor version number is bumped. System administrators
are never told to upgrade to a significant new version because of the
bug/security issue; instead they reinstall a patched version of
"stable".

* FreeRADIUS has way too much churn for a critical system service. Think
about other system services: how often do you see Kerberos, BIND,
iptables, PAM, MySQL, etc. going through significant revisions? Are the
administrators of those services constantly being told to upgrade the
service because of the bug/feature du jour?

* The QE component of FreeRADIUS has proven to be inadequate. I know
Alan runs a set of tests and he calls for testing prior to a new
release. But the amount of testing that actually occurs is inadequate:
releases have gone out with significant problems, and those releases
have been pushed into production. I think part of the problem is the
frequent release schedule (measured in months) and the lack of a
coordinated beta testing program. Releases should not occur until after
they've successfully navigated a beta program.

I would humbly suggest the following:

* Create and maintain a "stable" version.

* Organize a rigorous beta test program.

* Slow down the release schedule and avoid the temptation to cut a new
release because of minor new features. If production servers can't run
successfully without a feature, that's an indication the prior release
was too hasty. Critical bug fixes should go into the release branch, and
the release branch should then be re-released. The release interval for
a system service like FreeRADIUS should be measured in years, not months
or weeks.

Comments? Thoughts? Do you agree/disagree?

John

-- 
John Dennis <jdennis at redhat.com>
