Error: Cannot update core dump limit

Wed Jul 13 17:41:44 CEST 2016

> On Jul 13, 2016, at 10:46 AM, Alan DeKok <aland at deployingradius.com> wrote:
> 
> 
>> On Jul 13, 2016, at 10:28 AM, Jakob Hirsch <jh at plonk.de> wrote:
>> 
>> Arran Cudbard-Bell wrote on 2016-07-13 15:17:
>>>> but FR uses setrlimit() with rlim_max=0, and you can only lower
>>>> rlim_max, but not raise it (unless you have the CAP_SYS_RESOURCE
>>>> capability).
>>> OK... So there's probably an issue there.  Could we get the stack trace so it's clear what logic is being executed.
>> 
>> Sure:
> 
>  This code is run before the configuration is read:
> 
> 	{
> 		char const *panic_action = NULL;
> 
> 		panic_action = getenv("PANIC_ACTION");
> 		if (!panic_action) panic_action = main_config.panic_action;
> 
> 		if (panic_action && (fr_fault_setup(panic_action, argv[0]) < 0)) {
> 			fr_perror("Failed configuring panic action: %s", main_config.name);
> 			fr_exit(EXIT_FAILURE);
> 		}
> 	}
> 
>  Note that main_config.panic_action will always be NULL here.  Because it hasn't read the configuration files yet.
> 
>  The solution is to have an "initialization" function which is separate from the "do work" function.

That code doesn't touch the dumpable flag, rlim_cur or rlim_max, and it doesn't initialize anything to do with core dumps.
The core dump code just happens to live in the same source file as fr_fault_setup.

OP's idea of not setting rlim_max will work.  We still need to be able to change rlim_cur, because IIRC the process needs
to be temporarily 'dumpable' in order for PT_ATTACH (PTRACE_ATTACH on Linux) to work in some cases.
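
As a rough sketch of what "leave rlim_max alone, only move rlim_cur" could look like (core_soft_limit_set is a
made-up name for illustration, it is not a function in debug.c):

```c
#include <sys/resource.h>

/*
 * Hypothetical helper, not FreeRADIUS code: allow or forbid core dumps
 * by changing only the soft limit.  rlim_max is read and written back
 * unchanged, so an unprivileged process can raise rlim_cur again later.
 *
 * Returns 0 on success, -1 on error (errno set by getrlimit/setrlimit).
 */
int core_soft_limit_set(int allow)
{
	struct rlimit lim;

	if (getrlimit(RLIMIT_CORE, &lim) < 0) return -1;

	/* The soft limit may move freely between 0 and the hard limit. */
	lim.rlim_cur = allow ? lim.rlim_max : 0;

	return setrlimit(RLIMIT_CORE, &lim);
}
```

Calling core_soft_limit_set(0) and later core_soft_limit_set(1) should work without CAP_SYS_RESOURCE, because the
hard limit is never lowered.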

A big part of why the code in debug.c is so complex is that the Linux/BSD/POSIX APIs for controlling this stuff
are awful. Whether the process can have its core dumped or not shouldn't affect whether you can attach a debugger to it,
but it does.

There are three different facilities, each controlling a slightly different part of this functionality:

- Linux capabilities (CAP_*)		- Controls process privileges, such as permission to PT_ATTACH.
- getrlimit/setrlimit (RLIMIT_CORE)	- Controls the size of cores, and transitively whether core dumps are allowed.
- prctl (PR_SET_DUMPABLE)		- A boolean; again, whether core dumps are allowed.

So get/set rlimit do essentially the same thing as PR_SET_DUMPABLE.  I guess PR_SET_DUMPABLE must have some interaction
with CAP that get/set rlimit doesn't...
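
For what it's worth, prctl(2) does document that a process which is not dumpable cannot be attached to with
PTRACE_ATTACH by an unprivileged tracer, which would explain the difference.  A minimal Linux-only sketch showing
that the two knobs are independent (the helper names are invented for this mail, they don't exist in the server):

```c
#include <sys/prctl.h>
#include <sys/resource.h>

/* Hypothetical helper: 1 if dumpable, 0 if not, -1 on error. */
int dumpable_get(void)
{
	return prctl(PR_GET_DUMPABLE, 0UL, 0UL, 0UL, 0UL);
}

/* Hypothetical helper: set or clear the dumpable flag; 0 on success. */
int dumpable_set(int on)
{
	return prctl(PR_SET_DUMPABLE, on ? 1UL : 0UL, 0UL, 0UL, 0UL);
}

/* Hypothetical helper: read the soft core limit; 0 on success. */
int core_soft_limit_get(rlim_t *out)
{
	struct rlimit lim;

	if (getrlimit(RLIMIT_CORE, &lim) < 0) return -1;
	*out = lim.rlim_cur;

	return 0;
}
```

Clearing the flag with dumpable_set(0) leaves the value reported by core_soft_limit_get() untouched, so code that
wants to restore the previous state has to track both separately.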

There's no good test for whether a process is currently being debugged either.  We attempt a PT_ATTACH to try and figure
it out, but that doesn't always work, because we might not have the capability to do that, even though the process
is only trying to attach to itself.  It makes absolutely no sense.

fr_fault_setup needs to be run once before we read the config (to set the panic action from the environment),
and once after (if the config specifies a panic action), to set the new panic action.

Its position in radiusd.c was changed recently, because the debug callbacks weren't being installed early enough to
catch issues in the dictionary code.  The code needs to be changed so it's called a second time in mainconfig.c with
the new panic_action (if one was provided).
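
Roughly the shape I have in mind, as a standalone sketch.  None of these names exist in the tree, and having the
config value replace the one from the environment is an assumption of this sketch, not a decision:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical storage for the currently installed panic action. */
static char panic_action[512];

/*
 * Install or replace the panic action.  Safe to call more than once.
 * A NULL cmd keeps whatever was installed before.
 * Returns 0 on success, -1 if cmd is too long.
 */
int panic_action_set(char const *cmd)
{
	if (!cmd) return 0;
	if (strlen(cmd) >= sizeof(panic_action)) return -1;
	strcpy(panic_action, cmd);

	return 0;
}

/* Phase 1: before the config is read, only the environment is known. */
int panic_action_init_early(void)
{
	return panic_action_set(getenv("PANIC_ACTION"));
}

/* Phase 2: after the config is parsed, apply its value if it has one. */
int panic_action_init_late(char const *from_config)
{
	return panic_action_set(from_config);
}

/* Returns the current action, or NULL if none was ever installed. */
char const *panic_action_get(void)
{
	return panic_action[0] ? panic_action : NULL;
}
```

The point is only that the setter is idempotent, so calling it from two places costs nothing.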

I'm going to split the debug code and the core dump control code into separate files to make it easier to see
what's going on.  They're not really related.

-Arran