TCP Stack (Userspace State Machine)
Status: Planned
This module is not yet implemented. The design below represents the target architecture.
Overview
A minimal userspace TCP implementation built on top of raw_socket, intended for applications that need deterministic latency and want to avoid the kernel TCP stack's buffering, congestion-window management, and scheduling overhead. The design targets the TCP state machine from RFC 793 (since obsoleted by RFC 9293) with selective acknowledgment (SACK, RFC 2018).
TCP State Machine
stateDiagram-v2
[*] --> CLOSED
CLOSED --> SYN_SENT : connect() / send SYN
CLOSED --> LISTEN : listen()
LISTEN --> SYN_RCVD : recv SYN / send SYN+ACK
SYN_SENT --> ESTABLISHED : recv SYN+ACK / send ACK
SYN_RCVD --> ESTABLISHED : recv ACK
ESTABLISHED --> FIN_WAIT_1 : close() / send FIN
ESTABLISHED --> CLOSE_WAIT : recv FIN / send ACK
FIN_WAIT_1 --> FIN_WAIT_2 : recv ACK
FIN_WAIT_1 --> CLOSING : recv FIN / send ACK
FIN_WAIT_2 --> TIME_WAIT : recv FIN / send ACK
CLOSING --> TIME_WAIT : recv ACK
TIME_WAIT --> CLOSED : 2MSL timeout
CLOSE_WAIT --> LAST_ACK : close() / send FIN
LAST_ACK --> CLOSED : recv ACK
Planned API
#include <stddef.h>     /* size_t */
#include <stdint.h>     /* uint16_t, uint32_t, uint64_t */
#include <sys/types.h>  /* ssize_t */

/* raw_socket_t is provided by the raw_socket module. */

typedef enum {
    TCP_CLOSED, TCP_LISTEN, TCP_SYN_SENT, TCP_SYN_RCVD,
    TCP_ESTABLISHED, TCP_FIN_WAIT_1, TCP_FIN_WAIT_2,
    TCP_CLOSING, TCP_TIME_WAIT, TCP_CLOSE_WAIT, TCP_LAST_ACK,
} tcp_state_t;

typedef struct {
    tcp_state_t state;
    uint32_t local_ip;
    uint32_t remote_ip;
    uint16_t local_port;
    uint16_t remote_port;
    uint32_t seq_num;       /* Send sequence number */
    uint32_t ack_num;       /* Expected receive sequence */
    uint32_t window_size;   /* Advertised window */
    uint64_t rtt_ns;        /* Smoothed RTT, in nanoseconds */
    raw_socket_t *raw;      /* Underlying raw socket */
} tcp_conn_t;

/* Connection lifecycle */
int tcp_listen(tcp_conn_t *conn, uint16_t port);
int tcp_connect(tcp_conn_t *conn, uint32_t remote_ip, uint16_t remote_port);
int tcp_accept(tcp_conn_t *listener, tcp_conn_t *new_conn);
int tcp_close(tcp_conn_t *conn);

/* Data transfer */
ssize_t tcp_send(tcp_conn_t *conn, const void *data, size_t len);
ssize_t tcp_recv(tcp_conn_t *conn, void *buf, size_t len, int timeout_ms);

/* Event processing (call from reactor) */
int tcp_process_packet(tcp_conn_t *conn, const void *pkt, size_t len);
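The state diagram above can be read as a pure transition function. The following is a minimal sketch under stated assumptions, not the planned implementation: `tcp_event_t` and `tcp_next_state()` are hypothetical names that do not appear in the API, and the segments that accompany each transition (SYN, ACK, FIN) are only noted in comments.

```c
/* Same state set as the planned tcp_state_t, repeated so the sketch
 * compiles on its own. */
typedef enum {
    TCP_CLOSED, TCP_LISTEN, TCP_SYN_SENT, TCP_SYN_RCVD,
    TCP_ESTABLISHED, TCP_FIN_WAIT_1, TCP_FIN_WAIT_2,
    TCP_CLOSING, TCP_TIME_WAIT, TCP_CLOSE_WAIT, TCP_LAST_ACK,
} tcp_state_t;

/* Hypothetical event type: one value per edge label in the diagram. */
typedef enum {
    EV_CONNECT, EV_LISTEN, EV_CLOSE,
    EV_RECV_SYN, EV_RECV_SYN_ACK, EV_RECV_ACK, EV_RECV_FIN,
    EV_TIMEOUT_2MSL,
} tcp_event_t;

/* Returns the next state. An event with no edge from the current
 * state leaves the state unchanged. */
tcp_state_t tcp_next_state(tcp_state_t s, tcp_event_t ev)
{
    switch (s) {
    case TCP_CLOSED:
        if (ev == EV_CONNECT) return TCP_SYN_SENT;          /* send SYN */
        if (ev == EV_LISTEN)  return TCP_LISTEN;
        break;
    case TCP_LISTEN:
        if (ev == EV_RECV_SYN) return TCP_SYN_RCVD;         /* send SYN+ACK */
        break;
    case TCP_SYN_SENT:
        if (ev == EV_RECV_SYN_ACK) return TCP_ESTABLISHED;  /* send ACK */
        break;
    case TCP_SYN_RCVD:
        if (ev == EV_RECV_ACK) return TCP_ESTABLISHED;
        break;
    case TCP_ESTABLISHED:
        if (ev == EV_CLOSE)    return TCP_FIN_WAIT_1;       /* send FIN */
        if (ev == EV_RECV_FIN) return TCP_CLOSE_WAIT;       /* send ACK */
        break;
    case TCP_FIN_WAIT_1:
        if (ev == EV_RECV_ACK) return TCP_FIN_WAIT_2;
        if (ev == EV_RECV_FIN) return TCP_CLOSING;          /* send ACK */
        break;
    case TCP_FIN_WAIT_2:
        if (ev == EV_RECV_FIN) return TCP_TIME_WAIT;        /* send ACK */
        break;
    case TCP_CLOSING:
        if (ev == EV_RECV_ACK) return TCP_TIME_WAIT;
        break;
    case TCP_TIME_WAIT:
        if (ev == EV_TIMEOUT_2MSL) return TCP_CLOSED;
        break;
    case TCP_CLOSE_WAIT:
        if (ev == EV_CLOSE) return TCP_LAST_ACK;            /* send FIN */
        break;
    case TCP_LAST_ACK:
        if (ev == EV_RECV_ACK) return TCP_CLOSED;
        break;
    }
    return s;
}
```

Keeping the transition logic free of I/O would make it easy to unit-test every edge of the diagram without a raw socket; a function such as tcp_process_packet could classify the incoming segment into an event and then call it.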
TCP Header Format
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
├─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┤
│          Source Port          │       Destination Port        │
├───────────────────────────────┴───────────────────────────────┤
│                        Sequence Number                        │
├───────────────────────────────────────────────────────────────┤
│                     Acknowledgment Number                     │
├───────┬───────────┬─┬─┬─┬─┬─┬─┬───────────────────────────────┤
│ Data  │           │U│A│P│R│S│F│                               │
│Offset │ Reserved  │R│C│S│S│Y│I│          Window Size          │
│       │           │G│K│H│T│N│N│                               │
├───────┴───────────┴─┴─┴─┴─┴─┴─┼───────────────────────────────┤
│           Checksum            │        Urgent Pointer         │
├───────────────────────────────┴───────────────────────────────┤
│                      Options (variable)                       │
└───────────────────────────────────────────────────────────────┘
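The layout above maps onto a fixed 20-byte header followed by optional option bytes. As a hedged illustration of the planned "header construction and parsing" step, the sketch below serializes and parses the fixed part in network byte order. The names `tcp_hdr_t`, `tcp_hdr_pack()`, `tcp_hdr_parse()`, and the `TCP_FLAG_*` constants are assumptions made for this example and are not part of the planned API.

```c
#include <stddef.h>
#include <stdint.h>

#define TCP_HDR_MIN_LEN 20u  /* fixed header, no options */

/* Flag bits as they appear in byte 13 of the header. */
enum {
    TCP_FLAG_FIN = 0x01, TCP_FLAG_SYN = 0x02, TCP_FLAG_RST = 0x04,
    TCP_FLAG_PSH = 0x08, TCP_FLAG_ACK = 0x10, TCP_FLAG_URG = 0x20,
};

/* Hypothetical host-order view of the fixed header fields. */
typedef struct {
    uint16_t src_port, dst_port;
    uint32_t seq, ack;
    uint8_t  data_offset;   /* header length in 32-bit words (5..15) */
    uint8_t  flags;         /* TCP_FLAG_* bits */
    uint16_t window, checksum, urgent_ptr;
} tcp_hdr_t;

static void put_be16(uint8_t *p, uint16_t v) { p[0] = (uint8_t)(v >> 8); p[1] = (uint8_t)v; }
static void put_be32(uint8_t *p, uint32_t v) { put_be16(p, (uint16_t)(v >> 16)); put_be16(p + 2, (uint16_t)v); }
static uint16_t get_be16(const uint8_t *p) { return (uint16_t)((p[0] << 8) | p[1]); }
static uint32_t get_be32(const uint8_t *p) { return ((uint32_t)get_be16(p) << 16) | get_be16(p + 2); }

/* Writes the 20 fixed bytes into buf; options are the caller's job.
 * Returns 0 on success, -1 on a short buffer or invalid data offset. */
int tcp_hdr_pack(const tcp_hdr_t *h, uint8_t *buf, size_t len)
{
    if (len < TCP_HDR_MIN_LEN || h->data_offset < 5 || h->data_offset > 15)
        return -1;
    put_be16(buf + 0, h->src_port);
    put_be16(buf + 2, h->dst_port);
    put_be32(buf + 4, h->seq);
    put_be32(buf + 8, h->ack);
    buf[12] = (uint8_t)(h->data_offset << 4);  /* low 4 bits reserved */
    buf[13] = (uint8_t)(h->flags & 0x3f);      /* top 2 bits reserved */
    put_be16(buf + 14, h->window);
    put_be16(buf + 16, h->checksum);
    put_be16(buf + 18, h->urgent_ptr);
    return 0;
}

/* Reads the fixed header from buf. Returns 0 on success, -1 if the
 * buffer is too short or the data offset is inconsistent with len. */
int tcp_hdr_parse(const uint8_t *buf, size_t len, tcp_hdr_t *h)
{
    if (len < TCP_HDR_MIN_LEN)
        return -1;
    h->src_port    = get_be16(buf + 0);
    h->dst_port    = get_be16(buf + 2);
    h->seq         = get_be32(buf + 4);
    h->ack         = get_be32(buf + 8);
    h->data_offset = (uint8_t)(buf[12] >> 4);
    h->flags       = (uint8_t)(buf[13] & 0x3f);
    h->window      = get_be16(buf + 14);
    h->checksum    = get_be16(buf + 16);
    h->urgent_ptr  = get_be16(buf + 18);
    if (h->data_offset < 5 || (size_t)h->data_offset * 4 > len)
        return -1;
    return 0;
}
```

Byte-wise accessors are used instead of casting the buffer to a packed struct so that the sketch does not depend on host endianness or alignment.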
Performance Targets
| Metric | Target | Notes |
|---|---|---|
| 3-way handshake | < 10 µs | On loopback / LAN |
| Send latency | < 3 µs | Application to wire |
| Receive latency | < 3 µs | Wire to application callback |
| Retransmit timeout | Configurable | Microsecond-resolution timers |
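The table leaves the retransmit timeout configurable without naming an algorithm. One common choice is the estimator from RFC 6298, sketched here in nanoseconds to match the `rtt_ns` field. This is an assumption, not a stated design decision: `rto_est_t`, `rto_init()`, and `rto_update()` are hypothetical names, the clock-granularity term from the RFC is omitted, and the minimum and maximum bounds are parameters because the RFC's recommended 1-second floor would presumably be far too large for the latency targets listed here.

```c
#include <stdint.h>

/* Hypothetical estimator state; all times in nanoseconds. */
typedef struct {
    uint64_t srtt_ns;     /* smoothed RTT */
    uint64_t rttvar_ns;   /* RTT variation */
    uint64_t rto_ns;      /* current retransmit timeout */
    uint64_t min_rto_ns;  /* configurable lower bound */
    uint64_t max_rto_ns;  /* configurable upper bound */
    int      has_sample;  /* 0 until the first RTT measurement */
} rto_est_t;

void rto_init(rto_est_t *e, uint64_t initial_rto_ns,
              uint64_t min_rto_ns, uint64_t max_rto_ns)
{
    e->srtt_ns = 0;
    e->rttvar_ns = 0;
    e->rto_ns = initial_rto_ns;
    e->min_rto_ns = min_rto_ns;
    e->max_rto_ns = max_rto_ns;
    e->has_sample = 0;
}

/* Feeds one RTT measurement into the estimator (RFC 6298, with
 * alpha = 1/8 and beta = 1/4, integer arithmetic). */
void rto_update(rto_est_t *e, uint64_t sample_ns)
{
    if (!e->has_sample) {
        /* First measurement: SRTT = R, RTTVAR = R / 2. */
        e->srtt_ns = sample_ns;
        e->rttvar_ns = sample_ns / 2;
        e->has_sample = 1;
    } else {
        /* RTTVAR is updated first, using the previous SRTT. */
        uint64_t diff = e->srtt_ns > sample_ns ? e->srtt_ns - sample_ns
                                               : sample_ns - e->srtt_ns;
        e->rttvar_ns = (3 * e->rttvar_ns + diff) / 4;
        e->srtt_ns = (7 * e->srtt_ns + sample_ns) / 8;
    }

    /* RTO = SRTT + 4 * RTTVAR, clamped to the configured bounds. */
    uint64_t rto = e->srtt_ns + 4 * e->rttvar_ns;
    if (rto < e->min_rto_ns) rto = e->min_rto_ns;
    if (rto > e->max_rto_ns) rto = e->max_rto_ns;
    e->rto_ns = rto;
}
```

For example, with bounds of 1 µs and 1 s, a first sample of 8 µs yields an RTO of 24 µs, and a second sample of 16 µs moves it to 29 µs.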
Implementation Roadmap
- TCP header construction and parsing
- Connection state machine (RFC 793)
- 3-way handshake (SYN, SYN+ACK, ACK)
- Sliding window flow control
- Retransmission timer (timerfd integration)
- Selective ACK (SACK, RFC 2018)
- Connection teardown (FIN/RST)
- Integration with epoll_reactor
- Checksum offload (if supported by NIC)
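Where the NIC does not offer checksum offload, the Checksum field has to be computed in software. The sketch below shows the standard Internet checksum (RFC 1071) applied to the IPv4 pseudo-header plus the TCP segment. The function names are hypothetical, and the IP addresses are assumed to be passed in host byte order; neither detail comes from the design above.

```c
#include <stddef.h>
#include <stdint.h>

/* Adds the big-endian 16-bit words of buf to a running sum. An odd
 * trailing byte is padded with a zero low byte. The 32-bit sum cannot
 * overflow for inputs up to 64 KiB, which covers any TCP segment. */
static uint32_t csum_add(uint32_t sum, const uint8_t *buf, size_t len)
{
    while (len > 1) {
        sum += ((uint32_t)buf[0] << 8) | buf[1];
        buf += 2;
        len -= 2;
    }
    if (len)
        sum += (uint32_t)buf[0] << 8;
    return sum;
}

/* Folds the carries back into 16 bits and returns the one's complement. */
static uint16_t csum_fold(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffffu) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Plain Internet checksum over a byte buffer. */
uint16_t inet_checksum(const uint8_t *buf, size_t len)
{
    return csum_fold(csum_add(0, buf, len));
}

/* TCP checksum over the IPv4 pseudo-header and the segment (header
 * plus payload). The checksum field inside seg must be zero when
 * computing; when verifying a received segment, a result of 0 means
 * the checksum is valid. */
uint16_t tcp_checksum(uint32_t src_ip, uint32_t dst_ip,
                      const uint8_t *seg, size_t seg_len)
{
    const uint8_t pseudo[12] = {
        (uint8_t)(src_ip >> 24), (uint8_t)(src_ip >> 16),
        (uint8_t)(src_ip >> 8),  (uint8_t)src_ip,
        (uint8_t)(dst_ip >> 24), (uint8_t)(dst_ip >> 16),
        (uint8_t)(dst_ip >> 8),  (uint8_t)dst_ip,
        0,                        /* zero byte */
        6,                        /* IP protocol number for TCP */
        (uint8_t)(seg_len >> 8), (uint8_t)seg_len,  /* TCP length */
    };
    uint32_t sum = csum_add(0, pseudo, sizeof pseudo);
    sum = csum_add(sum, seg, seg_len);
    return csum_fold(sum);
}
```

A sender would zero bytes 16 and 17 of the header, call `tcp_checksum()`, and store the result there in network byte order; a receiver would run the same function over the segment as received and accept it only if the result is 0.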