Proposed behaviour for rlm_mruby (which might impact rlm_perl and rlm_python too)

Sat Nov 26 21:04:55 CET 2016

This weekend I've been trying to get some work done on this very old
ticket[1] to remove rlm_ruby and replace it with rlm_mruby. mruby is a
kind of minified ruby that is suitable for embedding in processes (as
opposed to ruby, which does things like hooking its own signal handlers
into freeradius). It was definitely an interesting experience, not least
because documentation like this[2] is far from exceptional.

The code I currently have is feature complete[3], but during 
construction I thought of a few limitations that I would like to solve. 
This reminded me of the document I wanted to write earlier this year
about my vision for the rlm_language-modules, and how to solve the
current problems.

There are currently 4 modules for scripting languages: rlm_perl, 
rlm_python, rlm_ruby and rlm_lua. I haven't looked at that last one, so 
for the sake of simplicity I'll just forget about that one for the rest 
of this mail. rlm_python and rlm_ruby work pretty much the same: You 
call a method with an array containing all attributes of the request 
list, and the return value of that method can be either an integer (that 
maps to things like RLM_MODULE_OK), or a list of 3 items: the first one 
is the aforementioned integer, the second one is the updates to the 
reply list, the last one is the updates to the control list (or those 
might be the other way around, which indicates what a terrible interface 
it actually is). A few improvements have been added to rlm_python, but 
those are just minor differences and don't really change the 
architecture. rlm_perl works completely differently: there are a number of
global hash variables (for those not familiar with Perl lingo: a hash is 
a key-value-map). Changes can be made to those hashes, there are no 
parameters for the methods, and the return value is just a simple 
integer (which is defined by some constants in that file, and copied 
from the example.pl file, instead of injected by freeradius).
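
For readers who have not used these modules, the rlm_python/rlm_ruby
calling convention described above looks roughly like this. It is only a
minimal sketch, not the real module: RLM_MODULE_OK is defined locally as
a stand-in (its value here is for illustration), and the
reply-before-control ordering is the one described above.

```python
# Stand-in for the return code that FreeRADIUS would normally provide;
# the numeric value is an assumption for illustration only.
RLM_MODULE_OK = 2


def authorize(request):
    """rlm_python/rlm_ruby style: one argument, the request list as pairs."""
    # Finding a single attribute means scanning the whole sequence.
    user_name = None
    for name, value in request:
        if name == 'User-Name':
            user_name = value
            break

    reply = (('Reply-Message', 'Hello %s' % user_name),)
    control = (('Cleartext-Password', 'hello'),)
    # Only the position in the tuple tells reply and control apart.
    return (RLM_MODULE_OK, reply, control)


result = authorize((('User-Name', 'bob'), ('User-Password', 'hello')))
```

Nothing in the return value names the lists, which is why the order is
so easy to get wrong.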

At the moment, rlm_perl is by far the most powerful of the three, 
because it can access all lists. Although the request list is probably 
enough when you call the module in the authorize section, it is almost 
useless in the post-auth. There have been requests to add more lists to 
rlm_python[4], but I don't like the idea of adding extra arguments to 
the method (the ticket only mentions config and reply, but what about 
session-state, proxy-request and proxy-reply? That makes a total of 6 
arguments).
(As a side note: the same limitations of the input currently exist in
the rlm_rest module; I submitted a pull request[5] to fix this. The
output there is a bit more flexible.)

Querying the value of an attribute is another thing that differs 
greatly between rlm_{python,ruby} and rlm_perl. The main problem here is 
that RADIUS allows us to have multiple values for an attribute. Perl 
uses a hash, where each value can be either a scalar or an array (for 
the people that know perl: yes, it's an arrayref instead of an array, 
but I'd rather skip those implementation details to make it easier
to understand for those who don't know perl). This means your code to
read a value first performs a hash lookup, then has to check if the
value is a scalar or an array. Output values are even worse: if the 
attribute/key is not present in the hash: add the value as a scalar, if 
it is a scalar: create an array with that value and your own value, 
otherwise push it into the existing array. Output in rlm_python and
rlm_ruby is a bit better (rlm_ruby lacks functionality here, but we
want to get rid of it anyway), you can define an attribute, an operator
and a value, so you're basically writing FreeRADIUS config in your code,
like `['Tmp-String-0', ':=', 'foo']`. The input is a bit different here,
you get an array of arrays, every attribute/value-pair has got its own
array (an example: `[['User-Name', 'bob'], ['User-Password', 'hello']]`).
This works with duplicate values, but a lookup requires you to loop over 
the array, where a hash lookup could be more efficient.
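
The two access patterns can be compared side by side. The sketch below
is written in Python purely for illustration (the hash variant mimics
what a Perl script has to do), and the helper names hash_first, hash_add
and pairs_all are made up for this example.

```python
# Perl-style: a mapping whose values are either a scalar or a list.
def hash_first(attrs, name):
    """Read one value: hash lookup, then check scalar versus list."""
    value = attrs.get(name)
    if isinstance(value, list):
        return value[0] if value else None
    return value


def hash_add(attrs, name, value):
    """Add a value, handling the three cases described above."""
    if name not in attrs:
        attrs[name] = value                  # absent: store a scalar
    elif isinstance(attrs[name], list):
        attrs[name].append(value)            # already a list: push
    else:
        attrs[name] = [attrs[name], value]   # scalar: promote to a list


# rlm_python/rlm_ruby style: a list of [name, value] pairs.
def pairs_all(pairs, name):
    """Duplicates are no problem, but every lookup scans the whole list."""
    return [v for n, v in pairs if n == name]


reply = {}
for message in ('one', 'two', 'three'):
    hash_add(reply, 'Reply-Message', message)

request = [['User-Name', 'bob'], ['User-Password', 'hello']]
user_names = pairs_all(request, 'User-Name')
```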

I don't like any of these interfaces, so I'm trying to propose a new 
one. We could use that in every rlm_language-module, so switching 
between languages would be easier and functionality will be preserved.

Use an object for the input. This object has methods like request,
reply and session_state to get the correct lists. This means no more
global variables (perl) and no excessive lists of arguments (python,
ruby).
The result of these method calls would be objects too. These objects
have methods like get_attribute and get_attributes. The first one
returns a scalar value; if there are multiple attributes it returns the
first one. If there is no attribute, it does something that is expected
in the language (perl and ruby would return a NULL-like value here, in
python a KeyError would be more suitable). The second method always
returns a list, which might contain 0 or 1 elements. In the general case,
people are only interested in the first value.
Updates should be performed via those objects too, so we could use code 
like `control.set_attribute('Cleartext-Password', 'hello')`. This means 
we can update all lists from rlm_python/rlm_ruby as well, and we no 
longer have to remember in what order "control" and "reply" were.
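
A rough Python sketch of what such an interface could look like
follows. Only get_attribute, get_attributes, set_attribute and the list
names come from the proposal above; the class names AttributeList and
RequestContext, the locally defined RLM_MODULE_OK, and the choice that
set_attribute replaces any existing values are assumptions made for this
example.

```python
RLM_MODULE_OK = 2  # stand-in; FreeRADIUS would normally inject this


class AttributeList:
    """One list (request, reply, control, ...) wrapped in an object."""

    def __init__(self, pairs=()):
        self._pairs = [(name, value) for name, value in pairs]

    def get_attributes(self, name):
        # Always a list, possibly with 0 or 1 elements.
        return [v for n, v in self._pairs if n == name]

    def get_attribute(self, name):
        # First value only; a missing attribute raises KeyError in Python.
        values = self.get_attributes(name)
        if not values:
            raise KeyError(name)
        return values[0]

    def set_attribute(self, name, value):
        # Assumption: replace any existing values of this attribute.
        self._pairs = [(n, v) for n, v in self._pairs if n != name]
        self._pairs.append((name, value))


class RequestContext:
    """The single object handed to the script."""

    def __init__(self, **lists):
        self._lists = {n: AttributeList(p) for n, p in lists.items()}

    def _get(self, name):
        return self._lists.setdefault(name, AttributeList())

    def request(self):
        return self._get('request')

    def reply(self):
        return self._get('reply')

    def control(self):
        return self._get('control')

    def session_state(self):
        return self._get('session_state')


def authorize(ctx):
    user_name = ctx.request().get_attribute('User-Name')
    ctx.control().set_attribute('Cleartext-Password', 'hello')
    ctx.reply().set_attribute('Reply-Message', 'Hello %s' % user_name)
    return RLM_MODULE_OK


ctx = RequestContext(request=[('User-Name', 'bob'), ('User-Password', 'hello')])
code = authorize(ctx)
```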

One of the drawbacks of the current implementation is that a lot of 
attributes have to be copied from the request and converted into 
language-specific strings. It is not unlikely that we copy and convert a 
load of EAP-data, while the called script is only interested in the 
User-Name attribute. By converting everything to objects we can make 
these conversions on-demand, instead of doing everything up front.
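
The on-demand idea can be sketched as follows. The real conversion
would happen in C inside the module; here a Python callable stands in
for it, and the class name LazyAttributeList and the converter to_str
are hypothetical names used only for this example.

```python
class LazyAttributeList:
    """Converts raw values only when a script actually asks for them."""

    def __init__(self, raw_pairs, convert):
        self._raw = list(raw_pairs)
        self._convert = convert
        self._cache = {}

    def get_attributes(self, name):
        if name not in self._cache:
            # Only the values of the requested attribute are converted.
            self._cache[name] = [
                self._convert(v) for n, v in self._raw if n == name
            ]
        return list(self._cache[name])


conversions = []


def to_str(raw):
    conversions.append(raw)   # record every conversion that takes place
    return raw.decode('utf-8')


attrs = LazyAttributeList(
    [('User-Name', b'bob'), ('EAP-Message', b'\x02\x00' * 500)],
    to_str,
)
names = attrs.get_attributes('User-Name')
```

In this sketch the large EAP-Message value is never touched unless the
script asks for it, and a repeated lookup of User-Name is served from
the cache.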

We might even want to skip the parameter to the methods, and just 
define an abstract base class that has to be subclassed in the script. 
The downside of this being that it would become hard to test the script 
without freeradius, because the base class has been implemented there. 
Then again, the same problem arises with the rest of this proposal.
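
A minimal sketch of the base class variant, using Python's abc module.
The class name RadiusModule and its constructor are hypothetical; in the
proposal the base class would be provided by FreeRADIUS itself. Plain
dicts stand in for the list objects to keep the example short.

```python
from abc import ABC, abstractmethod

RLM_MODULE_OK = 2  # stand-in; FreeRADIUS would normally inject this


class RadiusModule(ABC):
    """Hypothetical base class that would live inside FreeRADIUS."""

    def __init__(self, lists):
        self._lists = lists

    def request(self):
        return self._lists['request']

    def control(self):
        return self._lists['control']

    @abstractmethod
    def authorize(self):
        """Scripts implement this; no parameters besides self."""


class MyModule(RadiusModule):
    def authorize(self):
        if 'User-Name' in self.request():
            self.control()['Cleartext-Password'] = 'hello'
        return RLM_MODULE_OK


# Outside FreeRADIUS, a test has to supply the lists (and the base
# class) itself, which is the testing downside mentioned above.
module = MyModule({'request': {'User-Name': 'bob'}, 'control': {}})
code = module.authorize()
```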

The biggest drawbacks here are that everything has to be coded, and
that the language modules become backwards incompatible with older
versions. The changes for rlm_python are relatively small (I just hope
nobody is using rlm_ruby at the moment, so nothing would be backwards 
incompatible here), but the behaviour of rlm_perl really changes here.

I would like to hear your comments on this proposal, I can't be the 
only one who didn't like the interface of the scripting language 
modules.

[1] https://github.com/FreeRADIUS/freeradius-server/issues/990
[2] http://mruby.org/docs/api/headers/mruby_2Fvariable.h.html
[3] https://github.com/herwinw/freeradius-server/tree/rlm_mruby/src/modules/rlm_mruby
[4] https://github.com/FreeRADIUS/freeradius-server/issues/1464
[5] https://github.com/FreeRADIUS/freeradius-server/pull/1843

-- 
Herwin Weststrate