Epoll Reactor

Overview

High-performance, single-threaded event loop using Linux epoll for scalable I/O multiplexing. Supports edge-triggered (ET) and level-triggered (LT) modes. Designed for low-latency networking: batch-drains all ready events per wake-up, supports CPU pinning, and integrates with timer FDs for deterministic scheduling.

Status: Implemented

Full implementation with timer FD integration, CPU affinity, batch dispatch, and non-blocking poll mode.

Architecture

┌───────────────────────────────────────────────────────────────────┐
│  Single-threaded reactor (run-to-completion per event):            │
│                                                                    │
│  ┌─────────────────────────────────────────────────────┐           │
│  │            epoll_wait() system call                  │           │
│  │  Returns batch of N ready file descriptors           │           │
│  └──────────────────────┬──────────────────────────────┘           │
│                         │                                          │
│  ┌──────────────────────▼──────────────────────────────┐           │
│  │  for (i = 0..N):                                     │           │
│  │    handler = handlers[events[i].data.fd]             │           │
│  │    handler->callback(fd, events, userdata)           │           │
│  └──────────────────────────────────────────────────────┘           │
│                                                                    │
│  No locks needed: single-thread ownership of all FDs.              │
│  Pre-allocated event array: zero dynamic allocation in hot path.   │
└───────────────────────────────────────────────────────────────────┘
sequenceDiagram
    participant App as Application
    participant R as Reactor
    participant K as Kernel (epoll)
    participant T as TimerFD

    App->>R: reactor_init()
    App->>R: reactor_add(socket_fd, EPOLLIN | EPOLLET)
    App->>R: reactor_add_timer(1ms, callback)
    App->>R: reactor_run(-1)

    loop Event Loop
        R->>K: epoll_wait(events[], 256, timeout)
        K-->>R: N ready events
        R->>R: dispatch handlers[0..N]
        Note over R: Batch processing - all events<br/>dispatched before next syscall
    end

    T-->>K: timer expiration
    K-->>R: EPOLLIN on timer_fd
    R->>App: timer_callback(fd, EPOLLIN, userdata)
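
The batch dispatch step in the diagrams above can be sketched roughly as follows. This is an illustration under assumptions, not the library's source: the `sketch_` names are hypothetical and only mirror the documented handler table, and the bounds and `active` checks are one plausible way to guard against stale entries.

```c
#include <stdbool.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <unistd.h>

#define SKETCH_MAX_EVENTS 256
#define SKETCH_MAX_FDS    4096

typedef void (*sketch_callback_t)(int fd, uint32_t events, void *userdata);

typedef struct {
    sketch_callback_t callback;
    void             *userdata;
    bool              active;
} sketch_handler_t;

/* Example callback: consume one byte and count the invocation. */
static void sketch_count_cb(int fd, uint32_t events, void *userdata)
{
    char byte;
    if ((events & EPOLLIN) && read(fd, &byte, 1) == 1)
        (*(int *)userdata)++;
}

/* One wake-up: a single epoll_wait(), then every ready event is dispatched
 * before the next syscall. Returns the number of callbacks invoked,
 * or -1 if epoll_wait() failed. */
static int sketch_dispatch_once(int epoll_fd, sketch_handler_t *handlers,
                                int timeout_ms)
{
    struct epoll_event events[SKETCH_MAX_EVENTS];
    int n = epoll_wait(epoll_fd, events, SKETCH_MAX_EVENTS, timeout_ms);
    if (n < 0)
        return -1;

    int dispatched = 0;
    for (int i = 0; i < n; i++) {
        int fd = events[i].data.fd;
        if (fd < 0 || fd >= SKETCH_MAX_FDS)
            continue;
        sketch_handler_t *h = &handlers[fd];
        /* An earlier callback in this batch may have removed this FD. */
        if (h->active && h->callback) {
            h->callback(fd, events[i].events, h->userdata);
            dispatched++;
        }
    }
    return dispatched;
}
```

Indexing the handler table directly by FD keeps lookup O(1), at the cost of a fixed upper bound on FD values.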

API Reference

Data Structures

typedef void (*reactor_callback_t)(int fd, uint32_t events, void *userdata);

typedef struct {
    reactor_callback_t callback;
    void              *userdata;
    bool               active;
} reactor_handler_t;

typedef struct {
    int                epoll_fd;                          /* epoll instance */
    struct epoll_event events[REACTOR_MAX_EVENTS];       /* Event batch buffer */
    reactor_handler_t  handlers[REACTOR_MAX_FDS];        /* FD → handler map */
    volatile bool      running;                          /* Loop control flag */
    uint64_t           total_events;                     /* Statistics */
    uint64_t           total_iterations;                 /* Loop iterations */
    int                cpu_affinity;                     /* Pinned CPU (-1 = none) */
} reactor_t;

Constants

#define REACTOR_MAX_EVENTS    256    /* Max events per epoll_wait batch */
#define REACTOR_MAX_FDS       4096   /* Max registered file descriptors */

Reactor Lifecycle

reactor_init

int reactor_init(reactor_t *reactor);

Create the epoll instance and zero-initialize all handler slots. Uses EPOLL_CLOEXEC to prevent FD leaks across exec().

Returns: 0 on success, -errno on failure.
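
As a rough illustration of the creation step and the -errno convention, a minimal sketch might look like this. The `sketch_` helpers are hypothetical and are not part of the reactor API; unlike reactor_init, the first one returns the new FD itself.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/epoll.h>

/* Create an epoll instance that is closed automatically across exec().
 * Returns the new FD (>= 0) on success, -errno on failure. */
static int sketch_epoll_open(void)
{
    int fd = epoll_create1(EPOLL_CLOEXEC);
    return (fd < 0) ? -errno : fd;
}

/* Returns 1 if the close-on-exec flag is set, 0 if not, -errno on error. */
static int sketch_is_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags < 0)
        return -errno;
    return (flags & FD_CLOEXEC) ? 1 : 0;
}
```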

reactor_set_affinity

int reactor_set_affinity(reactor_t *reactor, int cpu);

Pin the reactor thread to a specific CPU core. Call before reactor_run() for best effect. Recommended with isolcpus= kernel parameter for deterministic latency.

Parameter | Type | Description
cpu | int | CPU core number (0-based)

reactor_destroy

void reactor_destroy(reactor_t *reactor);

Close the epoll file descriptor.

FD Registration

reactor_add

int reactor_add(reactor_t *reactor, int fd, uint32_t events,
                reactor_callback_t callback, void *userdata);

Register a file descriptor for monitoring.

Parameter | Type | Description
fd | int | File descriptor to monitor
events | uint32_t | Epoll event mask (EPOLLIN, EPOLLOUT, EPOLLET, etc.)
callback | reactor_callback_t | Function called when events fire
userdata | void * | Context pointer passed to callback

Edge-Triggered vs Level-Triggered

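Use EPOLLET (edge-triggered) for high-throughput sockets. The FD must be non-blocking, and the callback must drain it until read() or write() returns EAGAIN, because the kernel does not report readiness again until new data arrives. Use level-triggered (the default) for simpler semantics: the FD is reported again on every wake-up while data remains, so processing one chunk per callback is safe.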
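
A minimal sketch of the drain loop that edge-triggered mode calls for is shown below. `sketch_drain` is a hypothetical helper, not part of the reactor API, and it discards the data it reads; a real callback would process each chunk.

```c
#include <errno.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/* Read from fd until the kernel has nothing more to give.
 * Returns the total bytes consumed, or -1 on a hard error.
 * *peer_closed is set when read() reports end-of-file. */
static ssize_t sketch_drain(int fd, bool *peer_closed)
{
    char buf[4096];
    ssize_t total = 0;

    *peer_closed = false;
    for (;;) {
        ssize_t n = read(fd, buf, sizeof(buf));
        if (n > 0) {
            total += n;           /* got data, keep going */
            continue;
        }
        if (n == 0) {
            *peer_closed = true;  /* EOF: caller should remove and close */
            return total;
        }
        if (errno == EINTR)
            continue;             /* interrupted by a signal, retry */
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            return total;         /* drained; wait for the next edge */
        return -1;
    }
}
```

On a blocking FD the loop would stall in read() once the data runs out, which is why edge-triggered registration is paired with O_NONBLOCK.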

reactor_modify

int reactor_modify(reactor_t *reactor, int fd, uint32_t events);

Change the event mask for an already-registered FD.
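
A common use of mask changes is enabling EPOLLOUT only while there is pending output. The sketch below shows that pattern directly against epoll_ctl(EPOLL_CTL_MOD); `sketch_set_write_interest` is a hypothetical helper, and it assumes the FD was registered with `data.fd` set to the FD itself.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/epoll.h>

/* Toggle write interest on an already-registered fd.
 * Returns 0 on success, -errno on failure (e.g. -ENOENT if fd was never added). */
static int sketch_set_write_interest(int epoll_fd, int fd, bool want_write)
{
    uint32_t mask = EPOLLIN;
    if (want_write)
        mask |= EPOLLOUT;

    struct epoll_event ev = { .events = mask, .data.fd = fd };
    if (epoll_ctl(epoll_fd, EPOLL_CTL_MOD, fd, &ev) < 0)
        return -errno;
    return 0;
}
```

Leaving EPOLLOUT set permanently on a level-triggered socket makes the loop spin, since a socket is writable most of the time.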

reactor_remove

int reactor_remove(reactor_t *reactor, int fd);

Remove a file descriptor from the reactor.

Timer Integration

reactor_add_timer

int reactor_add_timer(reactor_t *reactor, uint32_t interval_ms,
                      reactor_callback_t callback, void *userdata);

Create a periodic timer using timerfd_create(CLOCK_MONOTONIC) and register it with the reactor. The timer fires at interval_ms intervals.

Returns: Timer file descriptor on success, -errno on failure.

Timer Callback Contract

Inside the timer callback, you must call reactor_timer_drain(fd) to consume the expiration counter. The timer FD is level-triggered, so an undrained counter makes epoll_wait() report the FD again immediately and the loop spins.

reactor_timer_drain

uint64_t reactor_timer_drain(int timer_fd);

Read and return the number of expirations since last drain. Call this inside your timer callback.
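
A timerfd delivers its expiration count as a single 8-byte unsigned integer per read(). A minimal sketch of the drain, using the hypothetical name `sketch_timer_drain` and assuming that "nothing to read" should map to 0:

```c
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Read the 8-byte expiration counter; reading it resets the counter.
 * Returns 0 if nothing could be read (for example EAGAIN on a
 * non-blocking timer FD that has not expired yet). */
static uint64_t sketch_timer_drain(int timer_fd)
{
    uint64_t expirations = 0;
    ssize_t n = read(timer_fd, &expirations, sizeof(expirations));
    if (n != (ssize_t)sizeof(expirations))
        return 0;
    return expirations;
}
```

A return value greater than 1 means the callback fell behind by that many intervals, which can be useful as an overrun metric.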

Event Loop

reactor_run

void reactor_run(reactor_t *reactor, int timeout_ms);

Run the event loop. Blocks until reactor_stop() is called from a callback or another thread.

Parameter | Type | Description
timeout_ms | int | Per-iteration timeout: -1 = block, 0 = poll, >0 = ms

Handles EINTR gracefully (retries on signal interruption).
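
One way to implement the retry is sketched below. `sketch_epoll_wait_retry` is a hypothetical helper; how the actual run loop handles the remaining timeout after an interruption is not documented here.

```c
#include <errno.h>
#include <sys/epoll.h>

/* epoll_wait() that retries when a signal interrupts the call.
 * Returns the number of ready events (>= 0) or -errno on a real error.
 * Each retry restarts the full timeout, so a finite timeout can be
 * exceeded if signals arrive frequently. */
static int sketch_epoll_wait_retry(int epoll_fd, struct epoll_event *events,
                                   int max_events, int timeout_ms)
{
    for (;;) {
        int n = epoll_wait(epoll_fd, events, max_events, timeout_ms);
        if (n >= 0)
            return n;
        if (errno != EINTR)
            return -errno;
        /* EINTR: a signal handler ran; loop and wait again */
    }
}
```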

reactor_poll_once

int reactor_poll_once(reactor_t *reactor);

Single non-blocking poll iteration. Returns number of events processed. Useful for integrating the reactor into an external loop.

reactor_stop

void reactor_stop(reactor_t *reactor);

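Signal the reactor to exit its run loop by clearing the running flag. The flag is volatile, which stops the compiler from caching it but is not a synchronization primitive, so calling reactor_stop() from a callback on the reactor thread is the safest option. If it is called from another thread while the reactor is blocked in epoll_wait() with timeout_ms = -1, the loop only observes the change after the next event wakes it.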
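
For stopping from another thread, a common pattern is an atomic flag plus an eventfd registered with the epoll instance, so that the stop request also wakes a blocked epoll_wait(). The sketch below illustrates that pattern only; the `sketch_` types and functions are hypothetical and are not part of the reactor API.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

typedef struct {
    atomic_bool running;   /* checked by the loop after every wake-up */
    int         wake_fd;   /* eventfd used only to interrupt epoll_wait() */
} sketch_stopper_t;

/* Create the eventfd and register it with the epoll instance.
 * Returns 0 on success, -1 on failure. */
static int sketch_stopper_init(sketch_stopper_t *s, int epoll_fd)
{
    atomic_init(&s->running, true);
    s->wake_fd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
    if (s->wake_fd < 0)
        return -1;

    struct epoll_event ev = { .events = EPOLLIN, .data.fd = s->wake_fd };
    if (epoll_ctl(epoll_fd, EPOLL_CTL_ADD, s->wake_fd, &ev) < 0) {
        close(s->wake_fd);
        return -1;
    }
    return 0;
}

/* Callable from any thread: clear the flag, then wake the loop. */
static void sketch_stop(sketch_stopper_t *s)
{
    atomic_store(&s->running, false);
    uint64_t one = 1;
    ssize_t w = write(s->wake_fd, &one, sizeof(one));
    (void)w;  /* EAGAIN only means a wake-up is already pending */
}
```

The loop would then test `atomic_load(&s->running)` after each epoll_wait() return and exit when it is false.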

Utility

make_nonblocking

int make_nonblocking(int fd);

Set O_NONBLOCK on a file descriptor via fcntl.
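
The usual fcntl idiom reads the current status flags first so that existing flags are preserved. A minimal sketch, using the hypothetical name `sketch_make_nonblocking` and assuming a -errno return convention like the rest of the API:

```c
#include <errno.h>
#include <fcntl.h>

/* Add O_NONBLOCK to the file status flags without clobbering the others.
 * Returns 0 on success, -errno on failure. */
static int sketch_make_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -errno;
    if (fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        return -errno;
    return 0;
}
```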

Usage Examples

Basic TCP Echo Server

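#define _GNU_SOURCE   /* accept4() */
#include "epoll_reactor.h"
#include <errno.h>
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

static reactor_t reactor;

static void drop_client(int fd)
{
    reactor_remove(&reactor, fd);
    close(fd);
}

void on_client(int fd, uint32_t events, void *userdata)
{
    (void)userdata;
    if (!(events & EPOLLIN))
        return;

    /* Edge-triggered: keep reading until the socket is drained (EAGAIN). */
    for (;;) {
        char buf[4096];
        ssize_t n = read(fd, buf, sizeof(buf));
        if (n > 0) {
            /* Echo back. A short or failed write drops the client here;
             * a production server would buffer and wait for EPOLLOUT. */
            if (write(fd, buf, (size_t)n) != n) {
                drop_client(fd);
                return;
            }
            continue;
        }
        if (n < 0 && errno == EINTR)
            continue;                /* Interrupted by a signal, retry */
        if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
            return;                  /* Drained; wait for the next edge */
        drop_client(fd);             /* EOF or hard error */
        return;
    }
}

void on_accept(int fd, uint32_t events, void *userdata)
{
    (void)events; (void)userdata;
    int client = accept4(fd, NULL, NULL, SOCK_NONBLOCK | SOCK_CLOEXEC);
    if (client < 0)
        return;  /* Nothing to accept or transient error */
    /* Assumes reactor_add() returns a negative value on failure */
    if (reactor_add(&reactor, client, EPOLLIN | EPOLLET, on_client, NULL) < 0)
        close(client);
}

int main(void)
{
    if (reactor_init(&reactor) < 0)
        return 1;
    reactor_set_affinity(&reactor, 2);  /* Pin to CPU 2 */

    int srv = socket(AF_INET, SOCK_STREAM | SOCK_NONBLOCK, 0);
    struct sockaddr_in addr = { .sin_family = AF_INET, .sin_port = htons(8080) };
    if (srv < 0 ||
        bind(srv, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(srv, 128) < 0) {
        perror("listen socket");
        return 1;
    }

    /* Level-triggered listener: one accept per wake-up is sufficient */
    reactor_add(&reactor, srv, EPOLLIN, on_accept, NULL);
    reactor_run(&reactor, -1);

    close(srv);
    reactor_destroy(&reactor);
    return 0;
}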

Timer-Driven Periodic Task

#include "epoll_reactor.h"
#include <inttypes.h>
#include <stdio.h>
#include <time.h>

void on_timer(int fd, uint32_t events, void *userdata)
{
    (void)events;
    uint64_t *count = (uint64_t *)userdata;

    /* Must drain, or the level-triggered timer FD fires again immediately */
    uint64_t expirations = reactor_timer_drain(fd);
    *count += expirations;

    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    /* PRIu64 and explicit casts keep the format string portable */
    printf("[%ld.%09ld] Timer fired (total: %" PRIu64 ")\n",
           (long)ts.tv_sec, (long)ts.tv_nsec, *count);
}

int main(void)
{
    reactor_t reactor;
    if (reactor_init(&reactor) < 0)
        return 1;

    uint64_t counter = 0;
    int tfd = reactor_add_timer(&reactor, 10, on_timer, &counter); /* 10ms */
    if (tfd < 0) {
        fprintf(stderr, "reactor_add_timer failed: %d\n", tfd);
        reactor_destroy(&reactor);
        return 1;
    }
    printf("Timer fd: %d\n", tfd);

    reactor_run(&reactor, -1);
    reactor_destroy(&reactor);
    return 0;
}

Integration with Raw Socket

#include "epoll_reactor.h"
#include "raw_socket.h"

static raw_socket_t raw_sock;

void on_raw_packet(int fd, uint32_t events, void *userdata)
{
    (void)fd; (void)events; (void)userdata;

    uint32_t len;
    const void *frame;

    /* Registered with EPOLLET, so drain every pending frame per wake-up.
     * Assumes raw_socket_mmap_recv() returns NULL once the ring is empty. */
    while ((frame = raw_socket_mmap_recv(&raw_sock, &len)) != NULL) {
        /* Process frame in-place (zero-copy) */
        const eth_frame_t *eth = (const eth_frame_t *)frame;
        (void)eth;
        /* ... handle frame ... */
        raw_socket_mmap_release(&raw_sock);
    }
}

int main(void)
{
    reactor_t reactor;
    reactor_init(&reactor);
    reactor_set_affinity(&reactor, 1);

    raw_socket_open(&raw_sock, "eth0", ETH_P_ALL);
    raw_socket_setup_mmap(&raw_sock, 256, 2048);

    reactor_add(&reactor, raw_sock.fd, EPOLLIN | EPOLLET,
                on_raw_packet, NULL);

    reactor_run(&reactor, -1);

    raw_socket_close(&raw_sock);
    reactor_destroy(&reactor);
    return 0;
}

Build & Run

# Compile
gcc -Wall -Wextra -Werror -O3 -std=c11 -o epoll_reactor_demo \
    epoll_reactor.c -lpthread

# Run (no special privileges needed for basic usage)
./epoll_reactor_demo

# CPU pinning works without extra setup. For lower jitter, also isolate the
# pinned cores from the scheduler via the kernel command line:
# GRUB_CMDLINE_LINUX="isolcpus=2,3"

Performance Characteristics

Metric | Value | Notes
FD add/remove | O(log n) | epoll_ctl updates the kernel's red-black tree of registered FDs (n = registered FDs); the handlers[] table update is O(1)
Event dispatch | O(N) | N = number of ready FDs, not total registered
epoll_wait overhead | ~1 µs | Per syscall (amortized over batch)
Timer resolution | ~50 µs | timerfd on standard kernel
Timer resolution | ~1 µs | With CONFIG_HIGH_RES_TIMERS
Max batch size | 256 events | REACTOR_MAX_EVENTS

CPU Pinning Strategy

┌─────────────────────────────────────────────────────────────────┐
│  Recommended CPU Layout (NUMA-aware):                            │
│                                                                  │
│  CPU 0: OS / interrupts                                          │
│  CPU 1: NIC IRQ affinity (set via /proc/irq/N/smp_affinity)     │
│  CPU 2: Reactor thread (reactor_set_affinity)                    │
│  CPU 3: Application worker (if needed)                           │
│                                                                  │
│  Kernel config:                                                  │
│    isolcpus=2,3                                                  │
│    nohz_full=2,3                                                 │
│    rcu_nocbs=2,3                                                 │
└─────────────────────────────────────────────────────────────────┘

Test Output

$ ./build/epoll_reactor_demo
[reactor] Initialized (epoll_fd=3, max_fds=4096)
[reactor] CPU affinity set to core 2
[reactor] Timer registered: fd=4, interval=10ms
[reactor] Listening on :8080 (fd=5)
[reactor] Running event loop...
[reactor] Timer: 100 expirations in 1.002s (10.02ms avg)
[reactor] Echo: 10000 messages in 48ms (4.8µs avg RTT)
[reactor] Stats: 15234 events, 1102 iterations
[PASS] All epoll_reactor tests passed