July 17, 2025

Robot episode 1: create a server in C

Over the last few months, for health-related reasons, I have had to stay at home and not move too much. That’s part of why I started this blog and, more generally, got back into coding and Data Science. I was a teacher for seven years (you can see some of my work here, but in French). It was wonderful, it was hard, it was inspiring, it was exhausting. But for the last two years I’ve been thinking more and more about coming back to the dark side: engineering. As life has decided that I should be stuck at home for a few months, it was time for the career change (here is my resume, and here is my LinkedIn, you never know…).

So I started with Data Science projects, since it was my speciality before becoming a teacher. But I’ve been a bit frustrated with finding exciting data, and it did not completely fulfill my need for challenging and fun problem-solving. And to add more context, my partner is a software engineer, and a very talented one. So he has slowly convinced me to work on more coding-oriented projects, and to use his help to gain skills in this subject. Side effect: the field is probably a little less saturated than Data Science, so maybe more open to my atypical profile.

And that’s how I’ve started this project. You can follow it on my GitHub here, and see what it does here. The idea is to move a little robot. But it’s an excuse for working on basic programming skills. So to add more fun, we have decided that I should do it in C, using only the standard library. I also want this article to act as a cheatsheet if I need to do something similar in the future.

Enough of this far too long introduction, let’s get started.

The objective

I just want to create a socket, listen for clients and read what they have for me, and for now, just write back the same request that I received.

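But first, what is a socket? It is an endpoint for network communication that lives in the kernel. When we create one, the kernel hands us something like a case number (we call it a file descriptor), and we give that number back in every later call to tell the kernel which socket we are talking about.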

The code

C
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>
#include <string.h>

int main(void) {
    /*
    Create a socket and get its file descriptor.
    
    A socket is like a case number (a file descriptor).
    Here, we ask for the IPv4 protocol family (AF_INET)
    and the TCP protocol (SOCK_STREAM).
    */
    int sockfd = socket(AF_INET, SOCK_STREAM, 0);
    if (sockfd < 0) {
        perror("socket() failed");
        return 1;
    }
    printf("Socket created, sockfd: %d\n", sockfd);

    // Set this option to avoid a bind() error when the server is restarted quickly
    int optval = 1;
    int ret = setsockopt(sockfd, SOL_SOCKET, SO_REUSEPORT, &optval, sizeof optval);
    if (ret < 0) {
        perror("setsockopt() failed");
        return 1;
    }

    // Define IP address and port
    const char *interface = "0.0.0.0";
    struct in_addr mysinaddr;
    ret = inet_aton(interface, &mysinaddr);
    if (ret == 0) {
        fprintf(stderr, "Invalid IP address: %s\n", interface);
        return 1;
    }

    struct sockaddr_in myaddr = {
        .sin_family = AF_INET,
        .sin_port = htons(8000),
        .sin_addr = mysinaddr,
    };

    // Attach socket to the previously defined address and port
    ret = bind(sockfd, (struct sockaddr*) &myaddr, sizeof myaddr);
    if (ret < 0) {
        perror("bind() failed");
        return 1;
    }

    // Mark the socket as ready to receive incoming connections
    ret = listen(sockfd, 1);
    if (ret < 0) {
        perror("listen() failed");
        return 1;
    }

    // Repeat indefinitely for each new client
    while (1) {
        // Accept a client connection and get its address
        struct sockaddr_in client_addr;
        socklen_t client_addr_len = sizeof client_addr;
        int clientfd = accept(sockfd, (struct sockaddr*) &client_addr, &client_addr_len);
        if (clientfd < 0) {
            perror("accept() failed");
            return 1;
        }
        printf("\n --- NEW CONNEXION RECEIVED, clientfd: %d ---\n", clientfd);
        // Get client IP address in '0.0.0.0' format for printing
        char dst[16];
        const char* ret2 = inet_ntop(AF_INET, &client_addr.sin_addr, dst, sizeof dst);
        if (ret2 == NULL) {
            perror("inet_ntop() failed");
            return 1;
        }
        printf("Client IP address: %s\n", dst);


        char buf[1000];
        // Repeat indefinitely for each new request from current client
        while (1) {
            // Read data sent from client
            // -1 to keep the last byte for the terminating 0
            ssize_t n = read(clientfd, buf, (sizeof buf) - 1);
            if (n == 0) {
                printf("Client %d disconnected\n", clientfd);
                printf("-------------------------------------\n");
                printf("-------------------------------------\n");
                break;
            } else if (n < 0) {
                perror("read() failed");
                break;
            }
            buf[n] = 0;
            printf("Data received, size: %zi\n", n);
            printf("DATA:\n");
            printf("-------------------------------------\n");
            printf("%s\n", buf);
            printf("-------------------------------------\n");

            // Create header for response
            char header[100];
            // n is a ssize_t, so it must be printed with %zd (not %d)
            int w = snprintf(header, sizeof header, "HTTP/1.0 200 OK\r\nContent-Length: %zd\r\n\r\n", n);
            if (w < 0) {
                perror("snprintf() for header failed");
                break;
            }

            // Concatenate header and data of response
            char str[w + n + 1];
            ret = snprintf(str, sizeof str, "%s%s", header, buf);
            if (ret < 0) {
                perror("snprintf() for concatenation failed");
                break;
            }

            // Send response to client (write to client file descriptor)
            ret = write(clientfd, str, w + n);
            if (ret < 0) {
                perror("write() failed");
                break;
            }
        }
    }
}

Note that I am using the code from one of my older commits here, to avoid having too much noise. So this code really only creates the server, listens and answers: no processing is applied, and at this stage it only sends back the request it received.

Break down

socket – Create a socket and get its file descriptor

C
int sockfd = socket(AF_INET, SOCK_STREAM, 0);
if (sockfd < 0) {
    perror("socket() failed");
    return 1;
}
printf("Socket created, sockfd: %d\n", sockfd);

// Set this option to avoid a bind() error when the server is restarted quickly
int optval = 1;
int ret = setsockopt(sockfd, SOL_SOCKET, SO_REUSEPORT, &optval, sizeof optval);
if (ret < 0) {
    perror("setsockopt() failed");
    return 1;
}

We first use the function socket defined in <sys/socket.h>. The complete documentation is here, but what I understand and need from it is the following:

int socket(int domain, int type, int protocol);

This function creates an endpoint for communication and returns a file descriptor that refers to that endpoint. It is this file descriptor that we are going to use in all the following steps.

  • Arguments:
    • int domain : communication domain. There are many, but I used AF_INET, which is for IPv4 Internet protocols.
    • int type : communication semantics. Again, there are several (see the doc), but I used SOCK_STREAM, which with AF_INET means TCP (SOCK_DGRAM is the datagram type, which with AF_INET means UDP).
    • int protocol : 0 here, as there is a single protocol for this socket type.
  • Return value:
    • int sockfd : the famous file descriptor. If it is negative (-1), then there was an error.

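Then I use setsockopt to set the SO_REUSEPORT option on the socket. To be honest, I’ve just been following this article: what I know is that without it, bind fails with an “address already in use” error when you run the server a second time quickly after stopping it. (For this particular problem, SO_REUSEADDR is the option you will most often see recommended.)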
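
For comparison, here is a minimal sketch of the same step with SO_REUSEADDR, the option most often recommended for quick restarts. This is not what my project does (it uses SO_REUSEPORT), and the helper name make_reusable_socket is invented for the example.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>
#include <stdio.h>

/* Hypothetical helper: create a TCP/IPv4 socket with SO_REUSEADDR enabled.
   Returns the file descriptor, or -1 on error. */
int make_reusable_socket(void) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) {
        perror("socket() failed");
        return -1;
    }
    int optval = 1;  /* any nonzero value turns the option on */
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &optval, sizeof optval) < 0) {
        perror("setsockopt() failed");
        close(fd);   /* do not leak the descriptor on failure */
        return -1;
    }
    return fd;
}
```

The option has to be set before calling bind, otherwise it comes too late to help.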

inet_aton – Define IP address and port

C
const char *interface = "0.0.0.0";
struct in_addr mysinaddr;
ret = inet_aton(interface, &mysinaddr);
if (ret == 0) {
    fprintf(stderr, "Invalid IP address: %s\n", interface);
    return 1;
}

struct sockaddr_in myaddr = {
    .sin_family = AF_INET,
    .sin_port = htons(8000),
    .sin_addr = mysinaddr,
};

I first use the function inet_aton defined in <arpa/inet.h> whose documentation can be found here.

int inet_aton(const char *cp, struct in_addr *inp);

This function converts an IPv4 address from number-and-dots notation into binary form.

  • Arguments:
    • const char *cp : host IPv4 address in number-and-dots format. I’ve used 0.0.0.0, which means “listen on every available network interface” from what I understand.
    • struct in_addr *inp : the result is stored in the structure pointed to by inp. This structure is a struct in_addr, which is defined in <netinet/in.h> as follows:
C
typedef uint32_t in_addr_t;
struct in_addr
  {
    in_addr_t s_addr;
  };

So it’s just a 32-bit unsigned integer, holding the address in network byte order.

  • Return value: 0 if invalid address.

Then I define myaddr which is a struct sockaddr_in (also defined in <netinet/in.h>) containing the following elements:

  • sa_family_t (unsigned integer type) sin_family : still AF_INET for IPv4.
  • in_port_t (uint16_t) sin_port : port number, I used 8000. It must be stored in network byte order, which is why I pass it through htons.
  • struct in_addr sin_addr : internet address, filled by the previous function.
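
To make the byte order point concrete, here is a minimal sketch that fills a sockaddr_in from a string and a port. The helper name make_ipv4_addr is invented, and I swapped inet_aton for inet_pton, a stricter relative that only accepts the full dotted form.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: fill *out with an IPv4 address and port.
   Uses inet_pton instead of inet_aton for the text-to-binary conversion.
   Returns 0 on success, -1 if ip is not a valid dotted address. */
int make_ipv4_addr(const char *ip, uint16_t port, struct sockaddr_in *out) {
    memset(out, 0, sizeof *out);   /* also clears the padding field */
    out->sin_family = AF_INET;
    out->sin_port = htons(port);   /* host byte order -> network byte order */
    if (inet_pton(AF_INET, ip, &out->sin_addr) != 1) {
        return -1;
    }
    return 0;
}
```

With port 8000 (0x1F40), the two bytes stored in sin_port are 0x1F then 0x40 whatever the machine, because network byte order is big-endian. That is exactly what htons takes care of.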

bind – Attach socket to the previously defined address and port

C
ret = bind(sockfd, (struct sockaddr*) &myaddr, sizeof myaddr);
if (ret < 0) {
    perror("bind() failed");
    return 1;
}

As its name suggests, bind (defined in <sys/socket.h> and documented here) will…bind our socket to our address and port:

int bind(int sockfd, const struct sockaddr *addr, socklen_t addrlen);
  • Arguments:
    • int sockfd : file descriptor of our previously created socket.
    • const struct sockaddr *addr : previously defined IP address and port. Note that it was a sockaddr_in and now it is a sockaddr. In fact, sockaddr is the generic structure for socket addresses, but we have defined an IPv4 one (that’s what the _in in sockaddr_in means). So by adding (struct sockaddr*) before &myaddr we are telling the compiler to interpret it as a pointer to a generic socket address.
    • socklen_t addrlen : size in bytes of the address structure pointed to by addr.
  • Return value: 0 if success, -1 otherwise.
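
A small experiment helps to see what bind actually reserves. The sketch below uses two invented helpers, bind_loopback and bound_port. It binds to 127.0.0.1 rather than 0.0.0.0, and it relies on two things my server does not use: port 0, which asks the kernel to pick any free port, and getsockname, which tells us which one it picked.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: bind fd to 127.0.0.1 on the given port.
   Port 0 asks the kernel to pick any free port.
   Returns 0 on success, -1 on error (errno is set by bind). */
int bind_loopback(int fd, uint16_t port) {
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    return bind(fd, (struct sockaddr *) &addr, sizeof addr);
}

/* Hypothetical helper: return the port fd is bound to (host byte order),
   or -1 on error. */
int bound_port(int fd) {
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    if (getsockname(fd, (struct sockaddr *) &addr, &len) < 0) {
        return -1;
    }
    return ntohs(addr.sin_port);
}
```

If a second socket then tries bind_loopback with the port returned by bound_port, I would expect it to fail with “Address already in use”, which is the error perror prints when the server is started twice.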

listen – Mark the socket as ready to receive incoming connections

C
ret = listen(sockfd, 1);
if (ret < 0) {
    perror("listen() failed");
    return 1;
}

Now we can finally listen to potential clients!

It’s defined in <sys/socket.h> and the documentation is here.

int listen(int sockfd, int backlog);

For once, it’s an easy one:

  • Arguments:
    • int sockfd : file descriptor of our socket.
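    • int backlog : maximum length of the queue of pending connections, that is, clients that have connected but that we have not accepted yet. Just 1 for now, so while we are busy with one client, very few others can wait in line; the rest may be refused or have to retry.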
  • Return value: 0 if success, -1 otherwise.
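
Since I want this article to work as a cheatsheet, here is a minimal sketch that gathers the steps so far (socket, setsockopt, bind, listen) in one function. The name make_listener is invented, it uses SO_REUSEADDR rather than the SO_REUSEPORT of my code, and it accepts port 0 to let the kernel choose a port.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: socket + setsockopt + bind + listen in one call.
   Listens on every interface (0.0.0.0); port 0 lets the kernel choose.
   Returns the listening file descriptor, or -1 on error. */
int make_listener(uint16_t port, int backlog) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) {
        return -1;
    }

    int optval = 1;
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = htonl(INADDR_ANY);

    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &optval, sizeof optval) < 0
        || bind(fd, (struct sockaddr *) &addr, sizeof addr) < 0
        || listen(fd, backlog) < 0) {
        close(fd);  /* do not leak the descriptor on failure */
        return -1;
    }
    return fd;
}
```

A call like make_listener(8000, 1) would then stand in for the first half of my main function, at the price of less precise error messages.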

accept – Accept a client connection and get its address

C
struct sockaddr_in client_addr;
socklen_t client_addr_len = sizeof client_addr;
int clientfd = accept(sockfd, (struct sockaddr*) &client_addr, &client_addr_len);
if (clientfd < 0) {
    perror("accept() failed");
    return 1;
}
printf("\n --- NEW CONNEXION RECEIVED, clientfd: %d ---\n", clientfd);
// Get client IP address in '0.0.0.0' format for printing
char dst[16];
const char* ret2 = inet_ntop(AF_INET, &client_addr.sin_addr, dst, sizeof dst);
if (ret2 == NULL) {
    perror("inet_ntop() failed");
    return 1;
}
printf("Client IP address: %s\n", dst);

Now everything is ready, so we can accept a new client! (From now on, we will repeat the following steps indefinitely for each new client.)

We use accept, defined in <sys/socket.h> and documented here.

int accept(int sockfd, struct sockaddr *_Nullable restrict addr, socklen_t *_Nullable restrict addrlen);

This function extracts the first connection request from the queue of pending connections and returns a new file descriptor, which now refers to the connection with that client. By default, if the queue is empty, accept waits until a client connects.

  • Arguments:
    • int sockfd : file descriptor of listening socket.
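    • struct sockaddr *_Nullable restrict addr : pointer to a sockaddr structure in which accept stores the client address (same as before: we are working with IPv4, so we declare a sockaddr_in and cast its address to a generic sockaddr pointer).
    • socklen_t *_Nullable restrict addrlen : pointer to the size of the address. Before the call, it must contain the size of the structure pointed to by addr; on return, it contains the actual size of the client address.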
  • Return value:
    • int: -1 if error, client socket file descriptor otherwise.

Then we convert the client IP address into the number-and-dots format, but that was only necessary when I wanted to print it at the beginning of the project. At the time I am writing this article, I have removed this part. But I used the inet_ntop function, defined in <arpa/inet.h> and documented here (I won’t go into details because I am not using it anymore).
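
In case I need the client address again one day, here is a minimal sketch that wraps accept and inet_ntop together. The helper name accept_with_ip is invented, and it uses the constant INET_ADDRSTRLEN from <netinet/in.h> instead of my hard-coded 16.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: accept one client on listenfd and write its IPv4
   address as text into ip (which must hold INET_ADDRSTRLEN bytes).
   Returns the client file descriptor, or -1 on error. */
int accept_with_ip(int listenfd, char ip[INET_ADDRSTRLEN]) {
    struct sockaddr_in client_addr;
    socklen_t len = sizeof client_addr;  /* in: buffer size, out: address size */
    int clientfd = accept(listenfd, (struct sockaddr *) &client_addr, &len);
    if (clientfd < 0) {
        return -1;
    }
    if (inet_ntop(AF_INET, &client_addr.sin_addr, ip, INET_ADDRSTRLEN) == NULL) {
        close(clientfd);  /* do not leak the client descriptor */
        return -1;
    }
    return clientfd;
}
```

A client connecting from the same machine through the loopback interface should show up as 127.0.0.1.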

read – Read data sent from client

C
char buf[1000];
// …
ssize_t n = read(clientfd, buf, (sizeof buf) - 1);
if (n == 0) {
    printf("Client %d disconnected\n", clientfd);
    printf("-------------------------------------\n");
    printf("-------------------------------------\n");
    break;
} else if (n < 0) {
    perror("read() failed");
    break;
}
buf[n] = 0;
printf("Data received, size: %zi\n", n);
printf("DATA:\n");
printf("-------------------------------------\n");
printf("%s\n", buf);
printf("-------------------------------------\n");

This one is a bit tricky and I need to go a bit deeper into what happens when we want to read what the client sent us. When the client sends data to my server, the network interface card writes it in a kernel buffer (or maybe the kernel does, but at our level, it doesn’t change anything). We cannot directly access a kernel buffer, only the kernel can. So we are using the read function to ask the kernel to copy its buffer into a buffer that we can use (the one called buf in my code). The thing is, the kernel doesn’t know if the network interface card has received all the data sent by the client, and the kernel copies all it can copy. Which means that we are making two assumptions here to make our lives easier:

  • We suppose that the whole client request has been received (but we can have fun with the nc command: start the server in a terminal, then open a new terminal and run the command nc 127.0.0.1 8000 and hit Return, then type the beginning of a request: GET / HTTP/1.1 and hit Return. The server sends you a response immediately, before you have any chance to finish sending your request).
  • We suppose that our buffer buf is big enough to get the full request. If it’s not, read will copy what it can from the request, but not return an error. The rest of the request will remain available in kernel memory. So we end up with an incomplete request, which may (or may not) be a problem. That’s why for now, I am using a large buffer by doing char buf[1000];. But we will see later how to deal with this by checking if the kernel buffer is empty or not, and reallocating memory for buf when necessary.

Anyway, now let’s see how read works. It is defined in <unistd.h> and documented here.

ssize_t read(int fd, void buf[.count], size_t count);
  • Arguments:
    • int fd : file descriptor from where we want to read. So the client file descriptor.
    • void *buf : beginning of the buffer into which we want the kernel to write.
    • size_t count : number of bytes that we want to read. I use (sizeof buf) - 1 to keep one byte for the terminating 0 that is required by functions I use later.
  • Return value:
    • On success: the number of bytes read (can be smaller than count if there are fewer bytes actually available than requested).
    • On error: -1.
    • If end of file: 0. For a socket, this means that the client has closed the connection.

Note: what has been read is removed from kernel memory; we cannot read the same data twice.
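
About the first assumption (the whole request has arrived): one way to drop it is to keep reading until the blank line that ends HTTP request headers shows up. The sketch below is not what my server does. The helper name read_headers is invented, it still uses a fixed-size buffer, and it ignores any request body.

```c
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: keep calling read() until buf contains the blank
   line ("\r\n\r\n") that ends HTTP request headers, the client closes the
   connection, or buf is full. buf is always 0-terminated on success.
   Returns the number of bytes stored, or -1 on error. */
ssize_t read_headers(int fd, char *buf, size_t cap) {
    if (cap == 0) {
        return -1;  /* no room even for the terminating 0 */
    }
    size_t len = 0;
    buf[0] = 0;
    while (len < cap - 1) {
        ssize_t n = read(fd, buf + len, cap - 1 - len);
        if (n < 0) {
            return -1;
        }
        if (n == 0) {
            break;  /* client closed the connection */
        }
        len += (size_t) n;
        buf[len] = 0;
        if (strstr(buf, "\r\n\r\n") != NULL) {
            break;  /* end of headers reached */
        }
    }
    return (ssize_t) len;
}
```

With this loop, the nc experiment above would not get an answer until the empty line has been typed.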

write – Send response to client

C
ret = write(clientfd, str, w + n);
if (ret < 0) {
    perror("write() failed");
    break;
}

Finally, we can answer! And this one is easier. write is defined in <unistd.h> and documented here.

ssize_t write(int fd, const void buf[.count], size_t count);
  • Arguments:
    • int fd : file descriptor to which we want to write (so the client file descriptor).
    • const void *buf : beginning of the buffer containing what we want to write (so our answer, for example an HTML file).
    • size_t count : number of bytes that we want to write (so should be the size of buf if we want to send all its content).
  • Return value: -1 if error, number of bytes written otherwise (it can be smaller than count, in which case the rest has to be sent with another call).
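
Like read, write is allowed to handle fewer bytes than we asked for. My code does not check for that yet, so here is a minimal sketch of the usual loop. The helper name write_all is invented.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: call write() until all count bytes have been sent.
   Returns 0 on success, -1 on error. */
int write_all(int fd, const void *buf, size_t count) {
    const char *p = buf;
    while (count > 0) {
        ssize_t n = write(fd, p, count);
        if (n < 0) {
            if (errno == EINTR) {
                continue;  /* interrupted by a signal: just retry */
            }
            return -1;
        }
        p += n;                 /* skip what has already been sent */
        count -= (size_t) n;
    }
    return 0;
}
```

In my server, write_all(clientfd, str, w + n) could replace the single call to write.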

close – Close client’s file descriptor

Well, I had forgotten it at the time, but you should close the client’s file descriptor when the client is disconnected (I found out because I had a bug when I kept my finger on the F5 key, refreshing until death… I had no available file descriptors anymore). Luckily, that’s pretty easy:

C
// Close client file descriptor
int r = close(clientfd);
if (r < 0) {
    perror("close() failed");
    return 1;
}

close() is documented here.

int close(int fd);
  • Arguments:
    • int fd: client’s file descriptor to close.
  • Return value: -1 if error, 0 otherwise.
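
To see why closing matters, here is a tiny sketch showing that a closed descriptor number becomes available again. The function name fd_is_recycled is invented, and it assumes that nothing else in the program opens a file descriptor in between (a new descriptor gets the lowest free number).

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

/* Hypothetical demonstration: open a socket, close it, open another one.
   Returns 1 if the second socket reuses the first descriptor number,
   0 if not, -1 on error. */
int fd_is_recycled(void) {
    int first = socket(AF_INET, SOCK_STREAM, 0);
    if (first < 0) {
        return -1;
    }
    if (close(first) < 0) {
        return -1;
    }
    int second = socket(AF_INET, SOCK_STREAM, 0);
    if (second < 0) {
        return -1;
    }
    int same = (second == first);  /* the freed number is handed out again */
    close(second);
    return same;
}
```

Without the close, every new client takes a fresh number until the process hits its limit, which is what my F5 marathon ran into.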

poll – Go further: use poll to read properly everything in the kernel buffer

At the read stage, we cheated a bit by defining a big buffer and hoping that the request would fit in. But what if we want to do it a bit more properly: create a small buffer, read, check whether everything has been retrieved and, if not, realloc memory for our buffer and read again… until all the data has been recovered?

Well, the function we need for that is poll, defined in <poll.h> and documented here.

int poll(struct pollfd *fds, nfds_t nfds, int timeout);
  • Arguments:
    • struct pollfd *fds : array of pollfd structures, each with the following elements:
      • int fd : file descriptor (so for me the client file descriptor).
      • short events : requested events (I used POLLIN to know if I can read, the list is in the documentation).
      • short revents : returned events (it is the output parameter of poll).
    • nfds_t nfds : length of the array, so for me 1.
    • int timeout : number of milliseconds that poll should block waiting for a file descriptor to become ready. I used 0 so that poll returns immediately.
  • Return value: -1 if error, number of elements whose revents are nonzero (so for me, 0 or 1) otherwise.

This function waits for the file descriptor (could be many but we are using only one here) to be ready for input/output.
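
Before plugging poll into the server, here is a minimal sketch of the “can I read right now?” question on its own. The names readable_now and poll_demo are invented, and the demonstration uses a pipe instead of a client socket to stay self-contained.

```c
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: 1 if fd has data to read right now, 0 if not,
   -1 if poll itself failed. */
int readable_now(int fd) {
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int r = poll(&pfd, 1, 0);  /* timeout 0: return immediately */
    if (r < 0) {
        return -1;
    }
    return (pfd.revents & POLLIN) ? 1 : 0;
}

/* Demonstration on a pipe: returns 1 if readable_now() says "nothing to
   read" before a write and "readable" after it, 0 otherwise, -1 on error. */
int poll_demo(void) {
    int p[2];
    if (pipe(p) < 0) {
        return -1;
    }
    int before = readable_now(p[0]);   /* pipe is still empty */
    ssize_t w = write(p[1], "x", 1);
    int after = readable_now(p[0]);    /* one byte is waiting */
    close(p[0]);
    close(p[1]);
    return w == 1 && before == 0 && after == 1;
}
```

The same readable_now check works on a client file descriptor, since poll does not care whether the descriptor is a pipe or a socket.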

Here is how I use it:

C
// Read data sent from client
size_t buf_size = 10;
size_t data_len = 0;
char* buf = malloc(buf_size);
if (buf == NULL) {
    perror("malloc() failed");
    return;
}

// Repeat indefinitely for each new request from current client
while (1) {
    // Reinitialize data_len
    data_len = 0;

    struct pollfd client_pollfd;
    client_pollfd.fd = clientfd;
    client_pollfd.events = POLLIN; // can I read?

    // First read: blocks until the client sends something (or disconnects).
    // We pass &buf because read_client may move the buffer with realloc.
    int n = read_client(clientfd, &buf, &data_len, &buf_size);
    if (n == 0) {
        free(buf);
        return;
    }
    if (n < 0) {
        fprintf(stderr, "%s:%d - read_client() failed\n", __FILE__, __LINE__);
        free(buf);
        return;
    }

    while (1) {
        int r = poll(&client_pollfd, 1, 0); // timeout = 0 causes poll() to return immediately, even if no file descriptors are ready
        if (r < 0) {
            perror("poll() failed");
            free(buf);
            return;
        }

        if ((client_pollfd.revents & POLLIN) == 0) {
            break;
        }

        // Fill buf from where we stopped
        n = read_client(clientfd, &buf, &data_len, &buf_size);
        if (n == 0) {
            break;
        }
        if (n < 0) {
            fprintf(stderr, "%s:%d - read_client() failed\n", __FILE__, __LINE__);
            free(buf);
            return; // buf has been freed, so we must not use it again
        }
    }
    // Finally, last realloc to have the exact needed size and add final 0
    char* tmp = realloc(buf, data_len + 1);
    if (tmp == NULL) {
        perror("realloc() failed");
        free(buf);
        return;
    }
    buf = tmp;
    buf_size = data_len + 1;
    buf[data_len] = 0;

    // … (process the request stored in buf, then loop for the next one)
}

read_client is just a wrapper around read, so that the checks are not written twice. It takes a pointer to buf (a char**) so that the caller sees the new address when realloc moves the buffer:

C
int read_client(int clientfd, char** p_buf, size_t* p_data_len, size_t* p_buf_size) {
    /*
    Reading of client buffer into *p_buf at position data_len (data_len is the
    size of already written data) for as many characters as possible (to fill
    the buffer or to read everything).

    Returns an int:
        0: client disconnected
        -1: fail
        n: read response, number of characters read
    */
    ssize_t n = read(clientfd, *p_buf + *p_data_len, *p_buf_size - *p_data_len);
    if (n == 0) {
        printf("Client %d disconnected\n", clientfd);
        printf("-------------------------------------\n");
        printf("-------------------------------------\n");
        return 0;
    }

    if (n < 0) {
        perror("read() failed");
        return -1;
    }

    *p_data_len += n;

    if (*p_data_len == *p_buf_size) {
        // Buf is full, need to realloc (keep the old pointer if realloc fails)
        char* tmp = realloc(*p_buf, *p_buf_size * 2);
        if (tmp == NULL) {
            perror("realloc() failed");
            return -1;
        }
        *p_buf = tmp;
        *p_buf_size *= 2;
        // printf("REALLOC!! New size: %zu\n", *p_buf_size);
    }
    return n;
}

Here’s roughly what my code does:

  • It creates a small buffer buf. buf_size contains the current size of the buffer, and data_len the size already used in the buffer (so the remaining size is buf_size - data_len).
  • It reads a first time. This first read is necessary before using poll because read is the function that waits for something to arrive. Without this first read_client, we would loop on empty requests until the first one arrives.
  • Then it loops until everything has been retrieved:
    • Use poll to check if there is still something to read.
    • If not ((client_pollfd.revents & POLLIN) == 0), then the whole request has been retrieved (or at least everything that has arrived so far) and we can quit the loop.
    • Otherwise, we read everything we can. Note that in the function read_client, I check whether buf is full. If it is, I double its size to be able to read more on the next loop if necessary. We might reallocate once too often at the end though (if we fill the buffer with exactly everything that was left to read).
  • Once everything has been retrieved, we reallocate one last time to the exact needed size and add the final 0 (the reallocation itself is not strictly required).

And here you have it! Now we just have to process the request and make something interesting out of it! But it will be for the next episode…