mschap via ntlm_auth over a socket

Matthew Newton mcn4 at leicester.ac.uk
Wed Dec 3 15:00:09 CET 2014


Hi,

On Wed, Dec 03, 2014 at 01:00:09PM +0000, Phil Mayers wrote:
> On 03/12/14 00:33, Matthew Newton wrote:
> >We've done things like tweak "winbind max domain connections" and
> >"winbind max clients", but can't seem to get winbind to connect to
> >more than one DC, or seemingly parallelise anything in any way.
> 
> Which version of Samba are you using, and how are you determining
> there's no parallelism?

Debian wheezy - winbind 3.6.6.

lsof shows 5 connections to a single DC, as you say. But tcpdump
shows essentially all connections coming from a single TCP source
port. Argus (network flow logging) shows the same. Only a negligible
percentage comes from the other source ports, even during busy
periods.

Also, winbind debug logs (-d4) show each request with "child
daemon request 14" / "Finished processing child request 14", which
seems to be a connection number. Very occasionally I see a
number other than 14, which I take to be a parallel request, but
this is a) not often and b) not when the system is under heavy
auth load, which is when I'd expect to see it.

But I may have misunderstood the "child request" number...

> We're running on Samba 3.6.9 on RHEL6, and have "winbind max domain
> connections = 12", and with "lsof -i :445" we see many winbind
> processes and separate TCP connections after spikes of load.

Yes, many winbinds running (one per connection plus a parent).

> >I've done some digging, and looking at the winbind debug logs, it
> >seems to be taking around 3ms to do an auth, give or take.
> >However, using sysdig to watch the ntlm_auth process, it takes
> 
> I've got C source for a tiny wrapper that logs process start/stop
> times to an append file, which I found useful for instrumentation.
> 
> You might also find it useful to set up a rolling tcpdump ringbuffer
> capture to the DCs, and use "tshark -T fields" to dump out the msrpc
> header and packet time - although payload is encrypted, you can
> correlate request ID in request/response payload to get on-the-wire
> auth times for AD.

Thanks. Actually, on a quiet RADIUS server it looks like the
normal request time is just over 1ms. I guess the question is
whether it goes up significantly on a busy server, which that
would show.
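
For what it's worth, the request/response correlation suggested above
could be sketched roughly as follows. This is only an illustration: it
assumes the capture has already been dumped with "tshark -T fields" as
four tab-separated columns (packet time in epoch seconds, TCP stream
number, DCE/RPC call ID, DCE/RPC packet type). I'd expect the field
names to be frame.time_epoch, tcp.stream, dcerpc.cn_call_id and
dcerpc.pkt_type, but check them against your tshark version. The
function name is made up for the example.

```python
import csv

# DCE/RPC packet type values: 0 is a request, 2 is a response.
REQUEST, RESPONSE = 0, 2


def rpc_latencies(lines):
    """Yield (stream, call_id, seconds) for each request/response pair.

    `lines` is an iterable of tab-separated text lines with the columns
    time, tcp stream, call id, packet type (an assumed layout).
    """
    pending = {}
    for row in csv.reader(lines, delimiter="\t"):
        if len(row) != 4 or not all(row):
            continue  # not a DCE/RPC packet, or fields missing
        try:
            when = float(row[0])
            ptype = int(row[3])
        except ValueError:
            continue  # e.g. several PDUs in one frame; skipped here
        # Call IDs repeat across connections, so key on the stream too.
        key = (row[1], row[2])
        if ptype == REQUEST:
            pending[key] = when
        elif ptype == RESPONSE and key in pending:
            yield row[1], row[2], when - pending.pop(key)
```

Feeding it the lines of the tshark output (for example an open file
object) gives one latency per completed call, which can then be
compared between quiet and busy periods.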

Running winbind in debug (-d4) produces some interesting logs,
which show (with the child daemon request / Finished processing
lines) the start/stop of each auth, which seems to be a fairly
constant 3ms, even in busy times.
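
As a rough illustration of how those start/stop lines can be turned
into per-request timings, here is a minimal sketch. It assumes each log
entry has a header line of the form "[YYYY/MM/DD HH:MM:SS.ffffff, level] ..."
(so sub-second timestamps must be enabled) followed by the message on
the next line, and that one child's log is being read at a time. It
pairs each "child daemon request N" with the next "Finished processing
child request N", without assuming what N actually means. The function
name is just for the example; adjust the regexes to whatever your logs
really look like.

```python
import re
from datetime import datetime

# Assumed header layout: "[2014/12/03 13:00:09.123456,  4] file:line(func)"
HEADER = re.compile(r"^\[(\d{4}/\d\d/\d\d \d\d:\d\d:\d\d\.\d+),")
START = re.compile(r"child daemon request (\d+)")
END = re.compile(r"Finished processing child request (\d+)")


def request_durations(lines):
    """Yield (request_number, milliseconds) for each start/finish pair."""
    stamp = None   # timestamp from the most recent header line
    started = {}   # request number -> start timestamp
    for line in lines:
        m = HEADER.match(line)
        if m:
            stamp = datetime.strptime(m.group(1), "%Y/%m/%d %H:%M:%S.%f")
            continue
        if stamp is None:
            continue  # message line before any header; ignore
        m = START.search(line)
        if m:
            started[m.group(1)] = stamp
            continue
        m = END.search(line)
        if m and m.group(1) in started:
            delta = stamp - started.pop(m.group(1))
            yield int(m.group(1)), delta.total_seconds() * 1000.0
```

Running this over logs from a quiet and a busy period would show
whether the per-request time really does stay flat under load.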

> >The second is to add a new method, "ntlmauth_socket", which uses
> >a connection pool to talk to a UNIX socket to send/receive auth
> >data to ntlm_auth.
> 
> Handy. FWIW I think this is a better solution long-term for big/busy
> sites - it avoids process startup overhead completely.
> 
> I have no time to test right now unfortunately, and like you ITIL
> has kept 3.x away from our radius servers :o(

Yeah. I might be able to use this as an excuse to get it in
somehow. But really, it shouldn't be that hard. I guess progress
these days is made by standing still.
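
To make the connection pool idea a bit more concrete, here is a
minimal sketch in Python. It is not the actual module code. The class
name, the socket path and the "one line out, one line back" framing are
all placeholders, and it assumes something is already listening on the
UNIX socket and answering on behalf of ntlm_auth.

```python
import queue
import socket


class UnixSocketPool:
    """Fixed number of reusable connections to a UNIX stream socket."""

    def __init__(self, path, size=4):
        self._path = path
        self._slots = queue.Queue()
        for _ in range(size):
            self._slots.put(None)  # None means "connect on first use"

    def _connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.connect(self._path)
        return sock

    def request(self, line, timeout=5.0):
        """Send one newline-terminated request and return one reply line.

        The line-based framing is an assumption made for this sketch.
        """
        # Blocks until a slot is free; raises queue.Empty on timeout.
        sock = self._slots.get(timeout=timeout)
        try:
            if sock is None:
                sock = self._connect()
            sock.sendall(line.encode() + b"\n")
            reply = b""
            while not reply.endswith(b"\n"):
                chunk = sock.recv(4096)
                if not chunk:
                    raise ConnectionError("peer closed the socket")
                reply += chunk
        except OSError:
            # Drop the broken connection; the slot reconnects next time.
            if sock is not None:
                sock.close()
            sock = None
            raise
        finally:
            self._slots.put(sock)
        return reply.decode().rstrip("\n")
```

The point is simply that each auth reuses an already-open connection
rather than paying for a fork/exec of ntlm_auth, and that the pool size
caps how many requests are in flight at once.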

Cheers,

Matthew


-- 
Matthew Newton, Ph.D. <mcn4 at le.ac.uk>

Systems Specialist, Infrastructure Services,
I.T. Services, University of Leicester, Leicester LE1 7RH, United Kingdom

For IT help contact helpdesk extn. 2253, <ithelp at le.ac.uk>


More information about the Freeradius-Devel mailing list