Epoll Reactor¶
Overview¶
A high-performance, single-threaded event loop built on Linux epoll for scalable I/O multiplexing. It supports edge-triggered (ET) and level-triggered (LT) modes and is designed for low-latency networking: each wake-up dispatches the whole batch of ready events (up to REACTOR_MAX_EVENTS) before the next syscall, the reactor thread can be pinned to a CPU, and timer FDs run on the same loop for periodic scheduling.
Status: Implemented
Full implementation with timer FD integration, CPU affinity, batch dispatch, and non-blocking poll mode.
Architecture¶
┌───────────────────────────────────────────────────────────────────┐
│ Single-threaded reactor (run-to-completion per event): │
│ │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ epoll_wait() system call │ │
│ │ Returns batch of N ready file descriptors │ │
│ └──────────────────────┬──────────────────────────────┘ │
│ │ │
│ ┌──────────────────────▼──────────────────────────────┐ │
│ │ for (i = 0..N): │ │
│ │ handler = handlers[events[i].data.fd] │ │
│ │ handler->callback(fd, events, userdata) │ │
│ └──────────────────────────────────────────────────────┘ │
│ │
│ No locks needed: single-thread ownership of all FDs. │
│ Pre-allocated event array: zero dynamic allocation in hot path. │
└───────────────────────────────────────────────────────────────────┘
sequenceDiagram
participant App as Application
participant R as Reactor
participant K as Kernel (epoll)
participant T as TimerFD
App->>R: reactor_init()
App->>R: reactor_add(socket_fd, EPOLLIN | EPOLLET)
App->>R: reactor_add_timer(1ms, callback)
App->>R: reactor_run(-1)
loop Event Loop
R->>K: epoll_wait(events[], 256, timeout)
K-->>R: N ready events
R->>R: dispatch handlers[0..N]
Note over R: Batch processing - all events<br/>dispatched before next syscall
end
T-->>K: timer expiration
K-->>R: EPOLLIN on timer_fd
R->>App: timer_callback(fd, EPOLLIN, userdata)
API Reference¶
Data Structures¶
typedef void (*reactor_callback_t)(int fd, uint32_t events, void *userdata);
typedef struct {
reactor_callback_t callback;
void *userdata;
bool active;
} reactor_handler_t;
typedef struct {
int epoll_fd; /* epoll instance */
struct epoll_event events[REACTOR_MAX_EVENTS]; /* Event batch buffer */
reactor_handler_t handlers[REACTOR_MAX_FDS]; /* FD → handler map */
volatile bool running; /* Loop control flag */
uint64_t total_events; /* Statistics */
uint64_t total_iterations; /* Loop iterations */
int cpu_affinity; /* Pinned CPU (-1 = none) */
} reactor_t;
Constants¶
#define REACTOR_MAX_EVENTS 256 /* Max events per epoll_wait batch */
#define REACTOR_MAX_FDS 4096 /* Max registered file descriptors */
Reactor Lifecycle¶
reactor_init¶
Create the epoll instance and zero-initialize all handler slots. Uses EPOLL_CLOEXEC to prevent FD leaks across exec().
Returns: 0 on success, -errno on failure.
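The init contract above (close-on-exec epoll instance, zeroed state, 0 or -errno) can be sketched roughly as follows. This is not the library's source: `mini_reactor_t`, `mini_reactor_init` and `mini_reactor_destroy` are hypothetical names, and the struct is a trimmed-down stand-in for `reactor_t`.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical, trimmed-down reactor state (illustration only). */
typedef struct {
    int epoll_fd;     /* epoll instance */
    int cpu_affinity; /* pinned CPU, -1 = none */
} mini_reactor_t;

/* Returns 0 on success, -errno on failure. */
int mini_reactor_init(mini_reactor_t *r)
{
    if (r == NULL)
        return -EINVAL;
    memset(r, 0, sizeof(*r));
    r->cpu_affinity = -1;
    /* EPOLL_CLOEXEC closes the descriptor automatically across exec(). */
    r->epoll_fd = epoll_create1(EPOLL_CLOEXEC);
    if (r->epoll_fd < 0)
        return -errno;
    return 0;
}

/* Closes the epoll descriptor; safe to call twice. */
void mini_reactor_destroy(mini_reactor_t *r)
{
    assert(r != NULL);
    if (r->epoll_fd >= 0) {
        close(r->epoll_fd);
        r->epoll_fd = -1;
    }
}
```

Saving the result of `mini_reactor_init()` and checking it for a negative value before registering any FD keeps a failed `epoll_create1()` from surfacing later as a confusing EBADF.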
reactor_set_affinity¶
Pin the reactor thread to a specific CPU core. Call it before reactor_run() so the loop never runs on another core. Combining it with the isolcpus= kernel boot parameter keeps other tasks off that core and reduces latency jitter.
| Parameter | Type | Description |
|---|---|---|
| cpu | int | CPU core number (0-based) |
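One plausible way to pin the calling thread on Linux is `sched_setaffinity()` with a single-CPU mask. The sketch below assumes that approach; the document does not say which call the reactor uses, and `pin_current_thread` is a hypothetical helper.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sched.h>

/* Pin the calling thread to one CPU. Returns 0 on success, -errno on failure. */
int pin_current_thread(int cpu)
{
    if (cpu < 0 || cpu >= CPU_SETSIZE)
        return -EINVAL;

    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    /* A pid of 0 means "the calling thread". */
    if (sched_setaffinity(0, sizeof(set), &set) != 0)
        return -errno;

    /* Sanity check for the sketch: the mask now holds exactly that CPU. */
    cpu_set_t check;
    CPU_ZERO(&check);
    if (sched_getaffinity(0, sizeof(check), &check) != 0)
        return -errno;
    assert(CPU_ISSET(cpu, &check) && CPU_COUNT(&check) == 1);
    return 0;
}
```

The call fails with EINVAL when the requested core is outside the set the process is allowed to use (for example inside a restricted cpuset), so the return value is worth checking.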
reactor_destroy¶
Close the epoll file descriptor.
FD Registration¶
reactor_add¶
int reactor_add(reactor_t *reactor, int fd, uint32_t events,
reactor_callback_t callback, void *userdata);
Register a file descriptor for monitoring.
| Parameter | Type | Description |
|---|---|---|
| fd | int | File descriptor to monitor |
| events | uint32_t | Epoll event mask (EPOLLIN, EPOLLOUT, EPOLLET, etc.) |
| callback | reactor_callback_t | Function called when events fire |
| userdata | void * | Context pointer passed to callback |
Edge-Triggered vs Level-Triggered
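Use EPOLLET (edge-triggered) for high-throughput sockets, and in the callback read or write until the call fails with EAGAIN: an edge-triggered FD is not reported again until new data arrives, so any bytes left behind stay unread. Use level-triggered (the default) for simpler semantics: the FD keeps being reported while data remains, so processing one chunk per wake-up is safe.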
reactor_modify¶
Change the event mask for an already-registered FD.
reactor_remove¶
Remove a file descriptor from the reactor.
Timer Integration¶
reactor_add_timer¶
int reactor_add_timer(reactor_t *reactor, uint32_t interval_ms,
reactor_callback_t callback, void *userdata);
Create a periodic timer using timerfd_create(CLOCK_MONOTONIC) and register it with the reactor. The timer fires at interval_ms intervals.
Returns: Timer file descriptor on success, -errno on failure.
Timer Callback Contract
Inside the timer callback, you must call reactor_timer_drain(fd) to consume the expiration counter. Until the counter is read the timer FD stays readable, so with its level-triggered registration the callback fires again on every loop iteration.
reactor_timer_drain¶
Read and return the number of expirations since last drain. Call this inside your timer callback.
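The timer mechanics described in this section can be sketched with the raw timerfd calls: arm a periodic `CLOCK_MONOTONIC` timer, then read the 8-byte expiration counter to drain it. `periodic_timerfd` and `timerfd_drain` are hypothetical stand-ins for illustration, and the non-blocking flag is an assumption rather than a documented property of `reactor_add_timer`.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/timerfd.h>
#include <time.h>
#include <unistd.h>

/* Create a periodic, non-blocking timer. Returns the FD or -errno. */
int periodic_timerfd(uint32_t interval_ms)
{
    if (interval_ms == 0)
        return -EINVAL; /* a zero it_value would disarm the timer */

    int fd = timerfd_create(CLOCK_MONOTONIC, TFD_NONBLOCK | TFD_CLOEXEC);
    if (fd < 0)
        return -errno;

    struct itimerspec its = {0};
    its.it_interval.tv_sec = interval_ms / 1000;
    its.it_interval.tv_nsec = (long)(interval_ms % 1000) * 1000000L;
    its.it_value = its.it_interval; /* first expiry after one interval */

    if (timerfd_settime(fd, 0, &its, NULL) != 0) {
        int err = errno;
        close(fd);
        return -err;
    }
    return fd;
}

/* Return the expirations since the last read; 0 if none are pending. */
uint64_t timerfd_drain(int fd)
{
    assert(fd >= 0);
    uint64_t expirations = 0;
    ssize_t n = read(fd, &expirations, sizeof(expirations));
    return n == (ssize_t)sizeof(expirations) ? expirations : 0;
}
```

A drain result greater than 1 means the callback ran late and several periods elapsed, which is a cheap way to detect overruns in a periodic task.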
Event Loop¶
reactor_run¶
Run the event loop. Blocks until reactor_stop() is called from a callback or another thread.
| Parameter | Type | Description |
|---|---|---|
| timeout_ms | int | Per-iteration timeout: -1 = block until an event arrives, 0 = return immediately (poll), >0 = wait at most that many milliseconds |
Handles EINTR gracefully (retries on signal interruption).
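A single loop iteration with EINTR handling might look like the sketch below: retry `epoll_wait()` when a signal interrupts it, then dispatch the whole batch. The names are hypothetical, and for brevity one callback handles every FD instead of the per-FD handler table the reactor uses.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <unistd.h>

#define SKETCH_BATCH 64 /* hypothetical; the reactor uses REACTOR_MAX_EVENTS */

typedef void (*sketch_cb_t)(int fd, uint32_t events, void *userdata);

/* Sample handler: consume one byte and bump the counter behind userdata. */
void count_and_consume(int fd, uint32_t events, void *userdata)
{
    char byte;
    if ((events & EPOLLIN) && read(fd, &byte, 1) == 1)
        (*(int *)userdata)++;
}

/*
 * One iteration: wait (retrying on EINTR), then dispatch every ready event.
 * Returns the number of events dispatched, or -errno on failure.
 */
int wait_and_dispatch(int epfd, int timeout_ms, sketch_cb_t cb, void *userdata)
{
    assert(cb != NULL);
    struct epoll_event evs[SKETCH_BATCH];
    int n;

    do {
        n = epoll_wait(epfd, evs, SKETCH_BATCH, timeout_ms);
    } while (n < 0 && errno == EINTR);
    if (n < 0)
        return -errno;

    for (int i = 0; i < n; i++)
        cb(evs[i].data.fd, evs[i].events, userdata);
    return n;
}
```

Note that this simple retry restarts the full timeout after each signal; a loop that needs an exact deadline would have to recompute the remaining time. Passing a timeout of 0 gives the non-blocking, poll-once behavior.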
reactor_poll_once¶
Single non-blocking poll iteration. Returns number of events processed. Useful for integrating the reactor into an external loop.
reactor_stop¶
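Signal the reactor to exit its run loop by clearing the running flag. The flag is declared volatile, which stops the compiler from caching it but provides neither atomicity nor memory-ordering guarantees under the C11 memory model, so calling this from a callback on the reactor thread is the safe case. When it is called from another thread, a reactor blocked in epoll_wait() with a timeout of -1 may only notice the flag after its next wake-up.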
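A common pattern for stop requests that must work across threads combines a C11 atomic flag with an eventfd registered in the epoll set, so that the write wakes a blocked `epoll_wait()`. The sketch below shows that alternative; it is not how the reactor described here is implemented, and every `stoppable_*` name is hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

typedef struct {
    int epoll_fd;
    int wake_fd;         /* eventfd used only to interrupt epoll_wait() */
    atomic_bool running;
} stoppable_loop_t;

/* Returns 0 on success, -errno on failure. */
int stoppable_init(stoppable_loop_t *l)
{
    assert(l != NULL);
    l->epoll_fd = epoll_create1(EPOLL_CLOEXEC);
    if (l->epoll_fd < 0)
        return -errno;
    l->wake_fd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
    if (l->wake_fd < 0) {
        int err = errno;
        close(l->epoll_fd);
        return -err;
    }
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = l->wake_fd };
    if (epoll_ctl(l->epoll_fd, EPOLL_CTL_ADD, l->wake_fd, &ev) != 0) {
        int err = errno;
        close(l->wake_fd);
        close(l->epoll_fd);
        return -err;
    }
    atomic_init(&l->running, true);
    return 0;
}

/* Callable from any thread: clear the flag, then wake the loop. */
void stoppable_stop(stoppable_loop_t *l)
{
    assert(l != NULL);
    atomic_store(&l->running, false);
    uint64_t one = 1;
    ssize_t n = write(l->wake_fd, &one, sizeof(one));
    (void)n; /* a failed wake-up only delays the exit */
}

/* Runs until stopped; returns the number of wake-ups seen. */
long stoppable_run(stoppable_loop_t *l)
{
    assert(l != NULL);
    struct epoll_event evs[16];
    long wakeups = 0;
    while (atomic_load(&l->running)) {
        int n = epoll_wait(l->epoll_fd, evs, 16, -1);
        if (n < 0 && errno != EINTR)
            break;
        wakeups++; /* a real loop would dispatch evs[0..n) here */
    }
    return wakeups;
}
```

The flag is stored before the eventfd write, so by the time the loop wakes up it is guaranteed to see `running == false`.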
Utility¶
make_nonblocking¶
Set O_NONBLOCK on a file descriptor via fcntl.
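Setting `O_NONBLOCK` via `fcntl` is a read-modify-write of the file status flags, so the existing flags are preserved. A minimal sketch, with the hypothetical name `set_nonblocking_sketch` and an assumed 0 / -errno return convention (the document does not state what `make_nonblocking` returns):

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>

/* Add O_NONBLOCK to the FD's status flags. Returns 0 or -errno. */
int set_nonblocking_sketch(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -errno;
    if (flags & O_NONBLOCK)
        return 0; /* already non-blocking */
    if (fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        return -errno;
    /* Sanity check for the sketch: the flag should now be visible. */
    assert(fcntl(fd, F_GETFL, 0) & O_NONBLOCK);
    return 0;
}
```

Sockets created with `SOCK_NONBLOCK` or accepted with `accept4(..., SOCK_NONBLOCK)` already have the flag set, so a helper like this is mainly needed for descriptors obtained elsewhere.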
Usage Examples¶
Basic TCP Echo Server¶
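#define _GNU_SOURCE /* accept4() */
#include "epoll_reactor.h"
#include <errno.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <stdio.h>
#include <unistd.h>

static reactor_t reactor;

static void drop_client(int fd)
{
    reactor_remove(&reactor, fd);
    close(fd);
}

void on_client(int fd, uint32_t events, void *userdata)
{
    (void)userdata;
    if (!(events & EPOLLIN))
        return;
    /* Edge-triggered: keep reading until the socket would block. */
    for (;;) {
        char buf[4096];
        ssize_t n = read(fd, buf, sizeof(buf));
        if (n < 0 && errno == EINTR)
            continue;
        if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
            return; /* Drained; wait for the next event */
        if (n <= 0) { /* EOF or hard error */
            drop_client(fd);
            return;
        }
        /* Echo back. A production server would buffer short writes. */
        if (write(fd, buf, (size_t)n) < 0) {
            drop_client(fd);
            return;
        }
    }
}

void on_accept(int fd, uint32_t events, void *userdata)
{
    (void)events; (void)userdata;
    int client = accept4(fd, NULL, NULL, SOCK_NONBLOCK | SOCK_CLOEXEC);
    if (client < 0)
        return;
    if (reactor_add(&reactor, client, EPOLLIN | EPOLLET, on_client, NULL) < 0)
        close(client); /* Registration failed */
}

int main(void)
{
    reactor_init(&reactor);
    reactor_set_affinity(&reactor, 2); /* Pin to CPU 2 */
    int srv = socket(AF_INET, SOCK_STREAM | SOCK_NONBLOCK, 0);
    struct sockaddr_in addr = { .sin_family = AF_INET, .sin_port = htons(8080) };
    if (srv < 0 || bind(srv, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(srv, 128) < 0) {
        perror("listen socket");
        return 1;
    }
    reactor_add(&reactor, srv, EPOLLIN, on_accept, NULL);
    reactor_run(&reactor, -1);
    close(srv);
    reactor_destroy(&reactor);
    return 0;
}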
Timer-Driven Periodic Task¶
#define _GNU_SOURCE /* clock_gettime() under -std=c11 */
#include "epoll_reactor.h"
#include <inttypes.h>
#include <stdio.h>
#include <time.h>

void on_timer(int fd, uint32_t events, void *userdata)
{
    (void)events;
    uint64_t *count = (uint64_t *)userdata;
    /* Drain first, or the timer FD stays readable and fires again at once. */
    uint64_t expirations = reactor_timer_drain(fd);
    *count += expirations;
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    printf("[%ld.%09ld] Timer fired (total: %" PRIu64 ")\n",
           (long)ts.tv_sec, ts.tv_nsec, *count);
}

int main(void)
{
    reactor_t reactor;
    reactor_init(&reactor);
    uint64_t counter = 0;
    int tfd = reactor_add_timer(&reactor, 10, on_timer, &counter); /* 10ms */
    if (tfd < 0) { /* -errno on failure */
        fprintf(stderr, "reactor_add_timer failed: %d\n", tfd);
        return 1;
    }
    printf("Timer fd: %d\n", tfd);
    reactor_run(&reactor, -1);
    reactor_destroy(&reactor);
    return 0;
}
Integration with Raw Socket¶
#include "epoll_reactor.h"
#include "raw_socket.h"

static raw_socket_t raw_sock;

void on_raw_packet(int fd, uint32_t events, void *userdata)
{
    (void)fd; (void)events; (void)userdata;
    uint32_t len;
    const void *frame;
    /* Edge-triggered: consume every frame currently queued. This assumes
       raw_socket_mmap_recv() returns NULL once no frame is ready. */
    while ((frame = raw_socket_mmap_recv(&raw_sock, &len)) != NULL) {
        /* Process frame in-place (zero-copy) */
        const eth_frame_t *eth = (const eth_frame_t *)frame;
        (void)eth; /* ... handle frame ... */
        raw_socket_mmap_release(&raw_sock);
    }
}

int main(void)
{
    reactor_t reactor;
    reactor_init(&reactor);
    reactor_set_affinity(&reactor, 1);
    raw_socket_open(&raw_sock, "eth0", ETH_P_ALL);
    raw_socket_setup_mmap(&raw_sock, 256, 2048);
    reactor_add(&reactor, raw_sock.fd, EPOLLIN | EPOLLET,
                on_raw_packet, NULL);
    reactor_run(&reactor, -1);
    raw_socket_close(&raw_sock);
    reactor_destroy(&reactor);
    return 0;
}
Build & Run¶
# Compile
gcc -Wall -Wextra -Werror -O3 -std=c11 -o epoll_reactor_demo \
epoll_reactor.c -lpthread
# Run (no special privileges needed for basic usage)
./epoll_reactor_demo
# Optional: isolate the pinned cores from the general scheduler via the kernel
# command line (reactor_set_affinity() does the pinning itself):
# GRUB_CMDLINE_LINUX="isolcpus=2,3"
Performance Characteristics¶
| Metric | Value | Notes |
|---|---|---|
| FD add/remove | O(log n) | Handler table update is O(1); epoll_ctl walks a kernel red-black tree over the n registered FDs |
| Event dispatch | O(N) | N = number of ready FDs, not total |
| epoll_wait overhead | ~1 µs | Per syscall (amortized over batch) |
| Timer resolution | ~50 µs | timerfd on standard kernel |
| Timer resolution | ~1 µs | With CONFIG_HIGH_RES_TIMERS |
| Max batch size | 256 events | REACTOR_MAX_EVENTS |
CPU Pinning Strategy¶
┌─────────────────────────────────────────────────────────────────┐
│ Recommended CPU Layout (NUMA-aware): │
│ │
│ CPU 0: OS / interrupts │
│ CPU 1: NIC IRQ affinity (set via /proc/irq/N/smp_affinity) │
│ CPU 2: Reactor thread (reactor_set_affinity) │
│ CPU 3: Application worker (if needed) │
│ │
│ Kernel boot parameters: │
│ isolcpus=2,3 │
│ nohz_full=2,3 │
│ rcu_nocbs=2,3 │
└─────────────────────────────────────────────────────────────────┘
Test Output¶
$ ./build/epoll_reactor_demo
[reactor] Initialized (epoll_fd=3, max_fds=4096)
[reactor] CPU affinity set to core 2
[reactor] Timer registered: fd=4, interval=10ms
[reactor] Listening on :8080 (fd=5)
[reactor] Running event loop...
[reactor] Timer: 100 expirations in 1.002s (10.02ms avg)
[reactor] Echo: 10000 messages in 48ms (4.8µs avg RTT)
[reactor] Stats: 15234 events, 1102 iterations
[PASS] All epoll_reactor tests passed