Serving index HTML at small scale
To serve static content, a CDN is the go-to approach. The main issue is that if a CDN is compromised or serves malicious content, arbitrary code can be injected into your web app, since index.html is the entry point. To mitigate that, Subresource Integrity, also known as SRI, comes to the rescue. But while SRI protects individual assets, it creates a chicken-and-egg problem: to use SRI, you need the asset hashes in your index.html, yet serving that index.html still requires making sure the entry point itself comes from a trusted environment.
Let's dip our toes into what the design of index-file-serving machinery could look like for an SPA at a small scale, while still letting us serve thousands of RPS with ease.
Storing the files
One obvious solution is to store the index file on a filesystem. Distribution then becomes a challenge - rsync to machines? That breaks down with ephemeral infrastructure. NFS? Please no. You could run a distributed file system for that, but it seems like overkill. You probably use a relational DB already, so why not make use of that?
The overall approach seems viable. Just compress the file before storing it in the table field to avoid consuming extra space, and as long as index files are reasonably small, it should be no problem. The table is straightforward and may look something like this:
CREATE TABLE index_files (
    id BIGSERIAL NOT NULL PRIMARY KEY,
    version TEXT NOT NULL,
    checksum BYTEA NOT NULL,
    content BYTEA NOT NULL,
    created_at TIMESTAMP NOT NULL,
    UNIQUE (version, checksum)
);
Every web client release simply inserts the index file into the table along with the checksum and the version of the app. Don't forget to clean up old entries to avoid bloat, which is why the created_at field is there, allowing for a simple SQL query to delete records older than N days.
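For illustration, the release insert and the periodic cleanup could be as simple as this (the parameter placeholders and the 30-day window are just examples):

-- Run on every web client release.
INSERT INTO index_files (version, checksum, content, created_at)
VALUES ($1, $2, $3, now());

-- Run periodically to keep the table from bloating.
DELETE FROM index_files
WHERE created_at < now() - interval '30 days';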
Picking the index file to serve
Continuing with the data model, we need to somehow mark which index file should be served. The most straightforward approach is to just have a boolean flag set on one of the files, but we can do better. Since we are designing custom serving machinery, we can roll releases out in stages. Each stage can be pinned to a browser via a cookie, with the initial assignment done by a hashmod of the IP address. Then it's a matter of specifying your stages and the percentage of traffic served per stage. The table is also very basic:
CREATE TABLE index_file_stages (
    deployment TEXT NOT NULL,
    stage TEXT NOT NULL,
    index_file_id BIGINT NOT NULL REFERENCES index_files (id),
    updated_at TIMESTAMP NOT NULL,
    PRIMARY KEY (deployment, stage)
);
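Pointing a stage at a freshly uploaded file is then a single upsert, something along these lines (the deployment and stage names here are purely illustrative):

INSERT INTO index_file_stages (deployment, stage, index_file_id, updated_at)
VALUES ('web', 'canary', $1, now())
ON CONFLICT (deployment, stage)
DO UPDATE SET index_file_id = EXCLUDED.index_file_id,
              updated_at = EXCLUDED.updated_at;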
The stage could be a text-encoded enum, with each stage mapped to a rollout percentage, something along these lines:
#[derive(Clone, Copy, PartialEq, Eq)]
pub enum Stage {
    Default,
    Canary,
    Preview,
}

impl Stage {
    // Default is the fallback, so it is not part of the staged rollout list.
    const STAGES: &'static [Stage] = &[Stage::Canary, Stage::Preview];

    fn percentage(&self) -> u64 {
        match self {
            Stage::Default => 100,
            Stage::Canary => 5,
            Stage::Preview => 25,
        }
    }
}
The actual selection logic is not complicated. First, check for the cookie in the request; if a stage is present there, serve the index file for that stage directly. If not, extract the peer IP address, hash it, and iterate through the stages, picking the first match (and falling back to Default when nothing matches):
for stage in Stage::STAGES {
    if ip_hash % 100 < stage.percentage() {
        return *stage;
    }
}
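Put together, a minimal sketch of the selection could look like the snippet below. The helper names are illustrative, and std's DefaultHasher is assumed to be good enough for bucketing (it is deterministic within a single Rust release, which is all we need here):

use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::net::IpAddr;

// Hash the peer IP into a 0..100 bucket.
fn ip_bucket(ip: IpAddr) -> u64 {
    let mut hasher = DefaultHasher::new();
    ip.hash(&mut hasher);
    hasher.finish() % 100
}

// Pick the stage for an unpinned client, falling back to Default.
fn stage_for_ip(ip: IpAddr) -> Stage {
    let ip_hash = ip_bucket(ip);
    for stage in Stage::STAGES {
        if ip_hash < stage.percentage() {
            return *stage;
        }
    }
    Stage::Default
}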
Serving the actual files
Here's where it gets interesting. Loading data from the DB is easy, but handing out the file is non-trivial. Clients may request different content encodings, and we need to serve the correct one. The good thing about building our own machinery is that we can optimize this by pre-compressing the index file contents and storing them in memory. The next bit to take into account is updates to index files on a running server: the server needs to re-read the data from the database and swap the files while they are being actively requested. A concurrency problem. The pattern is read-heavy with infrequent writes. Sounds familiar? Exactly! My tiny MVCC can help us here - that's the pattern where it shines.
First off, let's tackle content encoding. I decided to represent index files with the following data model on the app server side:
struct IndexFiles {
    contents: Vec<IndexFileContents>,
}

struct IndexFileContents {
    id: i64,
    bytes: Vec<(header::ContentEncoding, Bytes)>,
    known: Vec<header::Encoding>,
    stage: Stage,
    version: Version,
}
The bytes vector stores pre-compressed bytes for every available content encoding. On my side I selected identity, gzip and brotli.
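As a sketch of what the pre-compression could look like, assuming the flate2 and brotli crates (the helper name and the compression settings are illustrative, not the exact implementation):

use std::io::Write;

use actix_web::http::header;
use bytes::Bytes;
use flate2::{write::GzEncoder, Compression};

// Compress the raw index.html once per refresh, so request handling
// only has to pick the right pre-built buffer.
fn precompress(raw: &[u8]) -> std::io::Result<Vec<(header::ContentEncoding, Bytes)>> {
    let mut gz = GzEncoder::new(Vec::new(), Compression::best());
    gz.write_all(raw)?;
    let gzipped = gz.finish()?;

    let mut brotlied = Vec::new();
    // 4 KiB buffer, quality 11, 22-bit window.
    let mut br = brotli::CompressorWriter::new(&mut brotlied, 4096, 11, 22);
    br.write_all(raw)?;
    drop(br); // finish the stream before reading the buffer

    Ok(vec![
        (header::ContentEncoding::Identity, Bytes::copy_from_slice(raw)),
        (header::ContentEncoding::Gzip, Bytes::from(gzipped)),
        (header::ContentEncoding::Brotli, Bytes::from(brotlied)),
    ])
}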
The trick is to negotiate the content with the client. I'm not going to cover the whole semantics (you can read the spec), but that's where the known struct field comes in. I use actix as a web server in my project, and luckily it provides a method to negotiate the encoding, making the code dead simple:
impl IndexFileContents {
    fn negotiate(
        &self,
        accept_encoding: Option<&header::AcceptEncoding>,
    ) -> Option<(header::ContentEncoding, Bytes)> {
        let identity = self
            .bytes
            .iter()
            .find(|(enc, _)| *enc == header::ContentEncoding::Identity)?;
        let accept_encoding = match accept_encoding {
            Some(accept_encoding) => accept_encoding,
            None => return Some((identity.0, identity.1.clone())),
        };
        match accept_encoding.negotiate(self.known.iter()) {
            Some(header::Encoding::Known(encoding)) => self
                .bytes
                .iter()
                .find(|(enc, _)| *enc == encoding)
                .map(|(enc, bytes)| (*enc, bytes.clone()))
                .or_else(|| Some((identity.0, identity.1.clone()))),
            _ => Some((identity.0, identity.1.clone())),
        }
    }
}
Next up, concurrency. All we need is to store IndexFiles in a vlock, read it for every request, and update it on every refresh event. The minimal handler may look like this:
async fn serve(
    req: HttpRequest,
) -> Result<impl Responder, Error> {
    let stage = Stage::from(&req);
    let data: &Data<vlock::VLock<IndexFiles, 2>> = req.app_data()
        .ok_or_else(internal_server_error)?;
    let (encoding, body) = {
        let files = data.read();
        let contents = files.get(stage).ok_or_else(internal_server_error)?;
        let accept_encoding = header::AcceptEncoding::parse(&req).ok();
        contents.negotiate(accept_encoding.as_ref()).ok_or_else(internal_server_error)
    }?;
    let mut response = HttpResponse::Ok();
    // Add any extra relevant headers.
    response
        .insert_header(header::ContentType::html())
        .insert_header(encoding)
        .append_header((
            header::VARY,
            header::HeaderValue::from_static("Accept-Encoding"),
        ))
        .insert_header(header::CacheControl(vec![
            header::CacheDirective::Private,
            header::CacheDirective::MaxAge(300),
        ]));
    if stage != Stage::Default {
        let cookie = cookie::Cookie::build(
            Stage::COOKIE_NAME,
            u8::from(stage).to_string(),
        )
        .path("/")
        .http_only(false)
        .secure(true)
        .same_site(cookie::SameSite::Strict)
        .max_age(cookie::time::Duration::days(7))
        .finish();
        response.cookie(cookie);
    }
    Ok(response.body(body))
}
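For completeness, the cookie-or-hashmod resolution from earlier could live in a From impl, which is what Stage::from(&req) above hints at. A rough sketch, reusing the illustrative helpers from before and assuming a particular stage-to-cookie-value mapping:

impl From<&HttpRequest> for Stage {
    fn from(req: &HttpRequest) -> Self {
        // Prefer the pinning cookie if the browser already carries one.
        if let Some(cookie) = req.cookie(Stage::COOKIE_NAME) {
            match cookie.value() {
                "1" => return Stage::Canary,
                "2" => return Stage::Preview,
                _ => {}
            }
        }
        // Otherwise fall back to the IP hashmod.
        req.peer_addr()
            .map(|addr| stage_for_ip(addr.ip()))
            .unwrap_or(Stage::Default)
    }
}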
To refresh, just add an internal handler that goes to the database, pre-compresses the index files into the IndexFiles struct, and writes it to the shared state lock via this call:
let files = load_and_compress_files().await?;
index_files.update_default(|_, index_files| {
    *index_files = files;
});
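Wrapped into the internal handler itself, it could look roughly like this, with load_and_compress_files standing in for whatever reads the rows and pre-compresses them:

// Internal refresh endpoint; authentication and error details omitted.
async fn refresh(
    data: Data<vlock::VLock<IndexFiles, 2>>,
) -> Result<impl Responder, Error> {
    let files = load_and_compress_files().await?;
    data.update_default(|_, index_files| {
        *index_files = files;
    });
    Ok(HttpResponse::NoContent().finish())
}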
In summary
This adds a little bit of complexity, but what can you do if you want to avoid code injection? As a bonus, you get a very high performance path that cannot be improved much - serving content directly from memory is about as efficient as it gets. I've lost the results of the load testing, but from what I remember this path alone could handle a thousand RPS at a few percent CPU utilization, with the refresh running every 5 minutes.
You do lose CDN benefits - caching the content close to the client. You may have noticed that the cache control directive is set to private; that's because we need to control how the stages are handed off. If you simplify the code to drop the stages, you could cache with the public directive and have the application server sit behind a CDN.
If you do run behind a CDN or a load balancer, the hashmod by IP may not distribute the files correctly if the client IP address is masked. In some setups this is not a problem at all, but be mindful of how traffic flows to your servers.
I don't know if this is going to work well long-term, given I'm using an experimental concurrency library here, but so far it has worked fine. The real world will tell.