mschap via ntlm_auth over a socket
p.mayers at imperial.ac.uk
Wed Dec 3 14:00:09 CET 2014
On 03/12/14 00:33, Matthew Newton wrote:
> We've been hit with the same issue that a lot of other sites seem
> to have seen, which is that one freeradius server just doesn't
> seem to be able to authenticate over a certain number of users per
> second, somewhere around 30 or so, to our AD domain controllers.
> Standard ntlm_auth -> winbind -> AD stuff.
Ah, the AD auth spiral of doom. Very puzzling that one. Never really got
to the bottom of what exactly the cause was - I think it was a mix of
hardware, kernel, samba and AD problems at our site.
> We've done things like tweak "winbind max domain connections" and
> "winbind max clients", but can't seem to get winbind to connect to
> more than one DC, or seemingly parallelise anything in any way.
Which version of Samba are you using, and how are you determining
there's no parallelism?
We're running Samba 3.6.9 on RHEL6 with "winbind max domain
connections = 12", and with "lsof -i :445" we see many winbind
processes and separate TCP connections after spikes of load.
However, the parallelism is complex - in Samba 3.x it's only to one DC,
not several, and you run into issues with failed auths being punted to
the PDC emulator.
> Though reading archives it looks like we may have to use Samba 4
> for that (though I still don't understand the reason for the max
> connections option if it can't/won't do it; I must be missing
As above, works for us.
> I've done some digging, and looking at the winbind debug logs, it
> seems to be taking around 3ms to do an auth, give or take.
> However, using sysdig to watch the ntlm_auth process, it takes
I've got C source for a tiny wrapper that logs process start/stop times
to an append file, which I found useful for instrumentation.
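For illustration, here is roughly what such a wrapper does, as a minimal
Python sketch rather than the C source mentioned above. The log path and
the record layout (pid, start, stop, duration, exit code) are made up for
this example. The logged interval covers only the child process, but the
interpreter's own startup still adds to every auth, so a compiled wrapper
is the better choice on a loaded server.

```python
import os
import subprocess
import sys
import time

LOG_PATH = "/tmp/ntlm_auth_times.log"  # hypothetical location


def run_and_log(argv, log_path=LOG_PATH):
    """Run argv, append one timing record to log_path, return the exit code."""
    start = time.time()
    # stdin/stdout are inherited, so the child's output still reaches the caller.
    proc = subprocess.Popen(argv)
    returncode = proc.wait()
    stop = time.time()
    line = f"{proc.pid} {start:.6f} {stop:.6f} {stop - start:.6f} {returncode}\n"
    # O_APPEND plus a single write per record keeps concurrent wrappers
    # from interleaving their lines.
    fd = os.open(log_path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        os.write(fd, line.encode())
    finally:
        os.close(fd)
    return returncode


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: wrapper.py /usr/bin/ntlm_auth <the usual ntlm_auth arguments>
    sys.exit(run_and_log(sys.argv[1:]))
```

Pointing the mschap ntlm_auth command at the wrapper instead of the real
binary then gives one line per auth, which can be compared against the
winbind debug log timings.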
You might also find it useful to set up a rolling tcpdump ring-buffer
capture of traffic to the DCs, and use "tshark -T fields" to dump out
the msrpc header and packet time - although the payload is encrypted,
you can match the request ID in each request with the one in its
response to get on-the-wire auth times for AD.
We found the latter very useful.
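The correlation step can be sketched as below. This assumes the capture
was dumped with something like

  tshark -r capture.pcap -Y dcerpc -T fields -e frame.time_epoch \
      -e tcp.stream -e dcerpc.cn_call_id -e dcerpc.pkt_type

The choice of fields is an assumption for this example, not necessarily
what was used at our site. It also assumes packet type 0 is a request and
2 a response, printed as plain numbers, and it skips frames carrying more
than one PDU.

```python
def rpc_latencies(lines):
    """Yield (tcp_stream, call_id, seconds) for each request/response pair.

    Each input line is expected to hold four tab-separated fields:
    epoch time, TCP stream number, DCE/RPC call ID, DCE/RPC packet type.
    """
    pending = {}  # (stream, call_id) -> request timestamp
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 4 or "" in parts:
            continue  # not a complete record
        ts, stream, call_id, pkt_type = parts
        if "," in call_id or "," in pkt_type:
            continue  # several PDUs in one frame: ignored in this sketch
        key = (stream, call_id)
        if pkt_type == "0":  # request
            pending[key] = float(ts)
        elif pkt_type == "2" and key in pending:  # response to a seen request
            yield stream, call_id, float(ts) - pending.pop(key)
```

Feeding it the tshark output line by line (for example an open file
object) gives one latency per matched call. Keying on the TCP stream as
well as the call ID matters because each winbind connection numbers its
calls independently.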
> The second is to add a new method, "ntlmauth_socket", which uses
> a connection pool to talk to a UNIX socket to send/receive auth
> data to ntlm_auth.
Handy. FWIW I think this is a better solution long-term for big/busy
sites - it avoids process startup overhead completely.
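To show why a pool helps, here is a minimal sketch of the general idea
in Python. It is not the proposed FreeRADIUS code: the class name and the
one-line-in, one-line-out wire format are invented for this example, and
the real format would be whatever the listening helper expects. The
connections are opened once and reused, so each auth costs a socket
write and read rather than a fork and exec.

```python
import queue
import socket


class UnixSocketPool:
    """Fixed-size pool of persistent connections to one UNIX socket."""

    def __init__(self, path, size=4):
        self._idle = queue.Queue()
        for _ in range(size):
            sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
            sock.connect(path)
            # One buffered reader per connection, kept for its whole
            # lifetime, so no bytes are lost between requests.
            self._idle.put((sock, sock.makefile("rb")))

    def request(self, line):
        """Send one line (bytes) and return the one-line reply, newline stripped."""
        conn = self._idle.get()  # blocks until a connection is free
        sock, reader = conn
        try:
            sock.sendall(line + b"\n")
            reply = reader.readline()
        finally:
            self._idle.put(conn)
        return reply.rstrip(b"\n")

    def close(self):
        """Close every idle connection."""
        while not self._idle.empty():
            sock, reader = self._idle.get_nowait()
            reader.close()
            sock.close()
```

A production version would also need to detect dead connections and
reopen them instead of returning them to the pool.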
I have no time to test right now, unfortunately, and as with you, ITIL
has kept 3.x away from our RADIUS servers :o(