Question about Clickhouse DB to Store Accounting
Terry Burton
terry.burton at gmail.com
Fri Mar 6 14:00:39 UTC 2026
On Tue, 3 Mar 2026 at 20:53, Ibrahim Almahfooz
<ibrahim.nezar at sevennet.net> wrote:
> We are working on a project to store every accounting start, update and
> stop for special reporting purposes, and we are thinking of using ClickHouse
> due to its capabilities, such as fast analytical queries and its
> suitability for reporting and BI dashboards.
>
> Are there any concerns or suggestions if we decide to use rlm_sql_unixodbc
> with it, such as compatibility or performance concerns?
I recently did some brief testing using ClickHouse as an LE archive,
using forced accounting stops only. The requirements were (1) to beat
regular SQL ingest rates; (2) to overcome the problem that, above
certain ingest rates and retention periods, it is not reasonable to
expect to recover a typical streaming-replication solution after a
replica failure; and (3) to look up users/circuits by IPv4/v6 address
within a well-constrained time period, then invert the search to find
other IPv4/v6 addresses used by the identified users/circuits within a
larger period of time.
I wrote out the accounting requests as one JSON message per line
(rlm_linelog) into per-minute files:
- linelog.filename = ${radacctdir}/acct-%Y%m%d%H%G
- linelog.format = "%{json_encode:&request:[*]}"
- json.encode.output_mode = object_simple
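The settings above correspond roughly to the following module
configuration (a sketch only, for FreeRADIUS 3.x syntax; the instance
names and surrounding layout are illustrative, not taken from the
original setup):

```
# rlm_linelog: one JSON object per accounting packet, per-minute files
linelog {
        filename = ${radacctdir}/acct-%Y%m%d%H%G
        format = "%{json_encode:&request:[*]}"
}

# rlm_json: controls how the json_encode xlat renders attributes
json {
        encode {
                output_mode = object_simple
        }
}
```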
I named the ClickHouse columns directly after the FreeRADIUS dictionary
attributes, which allows you to batch-insert the data by reading the
closed/unlocked files without any additional processing:
INSERT INTO radacct FORMAT JSONEachRow
{ <JSON-encoded accounting message> }
{ <JSON-encoded accounting message> }
...
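For illustration, a minimal table along those lines might look like the
following (a sketch only; the column set, types, and ordering key here
are my assumptions, not the original schema - note that hyphenated
dictionary names must be backtick-quoted in ClickHouse):

```sql
CREATE TABLE radacct
(
    `Acct-Status-Type`   LowCardinality(String),
    `Acct-Session-Id`    String,
    `User-Name`          String,
    `Framed-IP-Address`  IPv4,
    `Event-Timestamp`    DateTime
)
ENGINE = MergeTree
ORDER BY (`Event-Timestamp`, `Acct-Session-Id`);
```

With matching column names, JSONEachRow rows taken straight from the
linelog files insert without any transformation step.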
A bespoke batch-inserter process used sendfile(2) to perform zero-copy
publishing of the files' data directly to the ClickHouse socket(s). It
reads batches at offsets within a file upon POSIX unlock and provides
"plugging": batching, but without waiting for the file to close.
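The zero-copy step can be sketched in Python (names hypothetical; a
real inserter would also frame the data for the ClickHouse HTTP or
native protocol, track per-file offsets, and handle retries):

```python
import os
import socket


def publish_range(out_sock: socket.socket, path: str,
                  offset: int, length: int) -> int:
    """Publish file bytes [offset, offset + length) to a connected
    socket via sendfile(2), i.e. without copying the data through
    userspace. Returns the number of bytes actually sent."""
    sent = 0
    with open(path, "rb") as f:
        in_fd = f.fileno()
        while sent < length:
            # Kernel copies straight from the page cache to the socket.
            n = os.sendfile(out_sock.fileno(), in_fd,
                            offset + sent, length - sent)
            if n == 0:  # EOF reached before 'length' bytes existed
                break
            sent += n
    return sent
```

The offset/length arguments are what make the "plugging" behaviour
possible: each POSIX unlock event just advances the offset over the
newly appended region of the still-open file.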
(ClickHouse has a frontend cache that batches INSERTs into "parts"
before committing them to disk, providing good ingest performance.
However, this doesn't overcome the protocol overhead, so a
well-designed "batch inserter" provides considerable improvement. The
canonical design pattern here is of course to use something like Kafka
and do batching in the consumer.)
The ingest rate of a virtualised 3-node, 3-replica-count cluster
exceeded 15k rows/sec, with all three nodes running on a single, modest
laptop (Linux KVM). Node replacement was automatic and transparent
with ~1B rows. YMMV.
ClickHouse isn't a regular SQL database. Make sure you understand the
workloads that it supports, how it encodes data, and the properties of
the various index strategies. Don't expect efficient lookups by any
key!
You will need to design some materialised columns carefully to support
efficient lookups, for example to look up by IPv6 address, given that
RADIUS data likely carries v6 prefixes in one or more attributes.
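As a sketch of what that might look like (attribute and column names
here are illustrative assumptions, not from the original schema): a
materialised column that extracts the network address of a delegated
prefix at insert time, plus a bloom-filter skip index over it, so that
point lookups avoid parsing prefix strings at query time:

```sql
ALTER TABLE radacct
    ADD COLUMN `Delegated-IPv6-Net` IPv6 MATERIALIZED
        IPv6CIDRToRange(
            toIPv6(splitByChar('/', `Delegated-IPv6-Prefix`)[1]),
            toUInt8OrZero(splitByChar('/', `Delegated-IPv6-Prefix`)[2])
        ).1;

ALTER TABLE radacct
    ADD INDEX idx_delegated_v6 (`Delegated-IPv6-Net`)
        TYPE bloom_filter GRANULARITY 4;
```

A skip index like this only prunes granules; it is still not a B-tree
style point lookup, which is why the primary ORDER BY key and the time
constraint on queries matter so much.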
More information about the Freeradius-Users
mailing list