Redundant Oracle instances

Anders Holm anders.holm at sysadmin.ie
Tue Mar 3 18:16:50 CET 2009


Hi folks.

I've got FR 2.1.3 running hooked up to an Oracle instance. While testing
failure scenarios I'm finding that the module never fails. I'm testing
failures where the server has initially been able to connect to the database
and then the database subsequently goes away. I simulate this with a nasty
ifdown on the interface to drop network connectivity, so this covers
disaster-type situations where something suddenly severs our connectivity.

What I see when running a radtest to localhost is that FR tries the initial
SELECT query we have defined and then sits doing nothing until something
eventually times out about 18 minutes later. Only then does it proceed to
process whatever else has been sent to it.

I'd be curious to know how this timeout can be tweaked, as 18 minutes is
way too long for us, but I've been unable to find any documentation
leading me to an answer. It seems this may be somewhere on the Oracle side
of things, but to be honest I'm really not sure.

I'd also very much like to know how to return an Access-Accept even though
we have not actually been able to authenticate the account, since the DB
that holds all the credentials is down. The Fail-Over Wiki has a section on
if-else branching which may be useful here, as I'd only want to send
Access-Accept when the DB has truly failed. However, the wiki states
"Documentation will be updated later..." and doesn't go into any detail on
how this could be achieved. The reason we want this "Fail-Open", as we have
dubbed it, is that our end customers shouldn't be blocked unnecessarily
from using our services.
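
As a rough, untested sketch of the fail-open idea: the fragment below lets
the authorize section continue when sql returns fail, then branches on that
return code to force an accept. It assumes that the "fail = 1" return-code
override and the "if (fail)" condition behave on 2.1.3 as the fail-over and
unlang documentation describe; neither has been verified against this setup.

```
authorize {
    # Assumption: lowering the priority of "fail" keeps the section
    # from returning immediately when the SQL module fails.
    sql {
        fail = 1
    }

    # Assumption: this tests the return code of the module that ran
    # last, i.e. sql. Only a real DB failure should take this branch.
    if (fail) {
        update control {
            Auth-Type := Accept
        }
    }

    pap
}
```

Note that this only helps once the module actually returns fail, so it does
nothing about the 18 minute wait itself.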

Relevant debug lines (yes, we have tweaked the SQL queries to suit our
needs):
Tue Mar  3 15:29:22 2009 : Info: [sql]  expand: SELECT
id,UserName,Attribute,Value,op FROM radcheck WHERE Username =
'%{SQL-User-Name}' ORDER BY id -> SELECT id,UserName,Attribute,Value,op FROM
radcheck WHERE Username = 'test2' ORDER BY id
Tue Mar  3 15:29:22 2009 : Debug: SELECT id,UserName,Attribute,Value,op FROM
radcheck WHERE Username = 'test2' ORDER BY id

... time passes ...

Tue Mar  3 15:46:56 2009 : Error: rlm_sql_oracle: query failed in
sql_select_query: ORA-03135: connection lost contact
Tue Mar  3 15:46:56 2009 : Error: rlm_sql_oracle: OCI_SERVER_NORMAL
Tue Mar  3 15:46:56 2009 : Error: rlm_sql_getvpdata: database query error
Tue Mar  3 15:46:56 2009 : Error: [sql] SQL query error; rejecting user
Tue Mar  3 15:46:56 2009 : Debug: rlm_sql (sql): Released sql socket id: 3
Tue Mar  3 15:46:56 2009 : Info: +++[sql] returns fail
Tue Mar  3 15:46:56 2009 : Info: +++[ok] returns ok
Tue Mar  3 15:46:56 2009 : Info: ++- policy redundant returns ok
Tue Mar  3 15:46:56 2009 : Info: [pap] WARNING! No "known good" password
found for the user.  Authentication may fail because of this.
Tue Mar  3 15:46:56 2009 : Info: ++[pap] returns noop
Tue Mar  3 15:46:56 2009 : Info: No authenticate method (Auth-Type)
configuration found for the request: Rejecting the user
Tue Mar  3 15:46:56 2009 : Info: Failed to authenticate the user.
Tue Mar  3 15:46:56 2009 : Auth: Login incorrect: [test2/test2] (from client
localhost port 1812)
Tue Mar  3 15:46:56 2009 : Info: Using Post-Auth-Type Reject
Tue Mar  3 15:46:56 2009 : Info: +- entering group REJECT {...}
Tue Mar  3 15:46:56 2009 : Info: [attr_filter.access_reject]    expand:
%{User-Name} -> test2
Tue Mar  3 15:46:56 2009 : Debug:  attr_filter: Matched entry DEFAULT at
line 11
Tue Mar  3 15:46:56 2009 : Info: ++[attr_filter.access_reject] returns
updated
Sending Access-Reject of id 253 to 127.0.0.1 port 33032
Tue Mar  3 15:46:56 2009 : Info: Finished request 0.
Tue Mar  3 15:46:56 2009 : Debug: Going to the next request
Tue Mar  3 15:46:56 2009 : Debug: Waking up in 4.9 seconds.

Here's another interesting part. As the server continues its work, it then
works through the remaining requests sitting in the buffers:

rad_recv: Access-Request packet from host 127.0.0.1 port 33032, id=253,
length=57
Tue Mar  3 15:46:56 2009 : Info: Sending duplicate reply to client localhost
port 33032 - ID: 253
Sending Access-Reject of id 253 to 127.0.0.1 port 33032
Tue Mar  3 15:46:56 2009 : Debug: Waking up in 4.9 seconds.
rad_recv: Access-Request packet from host 127.0.0.1 port 33032, id=253,
length=57
Tue Mar  3 15:46:56 2009 : Info: Sending duplicate reply to client localhost
port 33032 - ID: 253
Sending Access-Reject of id 253 to 127.0.0.1 port 33032

... and so on and so forth for all requests remaining in the buffer ...

Of course, getting this timeout down to seconds rather than minutes, or
even below a second, would be preferable. Has anyone done this before, and
if so, could I get a snippet of your configuration showing how to achieve
it?

Thanks in advance.

//anders