🚀 Feature request
I've been investigating the best way to add a middleware to my razzle server to detect whenever it serves a file built by razzle that includes a webpack [hash:8] or [contenthash:8] in the filename. I first discussed some of the problems I'm running into here: https://github.com/jaredpalmer/razzle/pull/1368#issuecomment-664015050
I would like razzle to generate and expose the list of files/assets that are safe to consider "immutable" (for the purposes of setting Cache-Control headers in responses) in a way that is easy to consume without extra transformation of the chunks.json and/or assets.json files.
NOTE: When setting long-lived & immutable Cache-Control responses, I want to avoid any kind of "approximation" of whether a file can be considered immutable (e.g. a regex to detect a hash in the filename), because a false positive can lead to a file being immutably cached for a long time. That wouldn't be fixable by a server-side cache invalidation, which makes it a very painful problem to work around.
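To illustrate the kind of false positive I mean, here is a minimal sketch. The regex and the second filename are hypothetical (made up for this example, not something razzle does); the first filename is one of the hashed outputs quoted later in this issue.

```typescript
// Hypothetical heuristic: treat any ".<8 hex characters>." segment as a build/content hash.
const looksHashed = (file: string): boolean => /\.[0-9a-f]{8}\./.test(file);

// A genuinely hashed razzle output.
const hashedBundle = "/static/js/bundle.6fc534aa.js";

// Hypothetical file copied verbatim from public/. The date in its name is
// eight digits, and digits are valid hex characters, so the heuristic matches.
const datedReport = "/downloads/report.20200727.pdf";

const hashedBundleMatches = looksHashed(hashedBundle);
const datedReportMatches = looksHashed(datedReport);

console.log(hashedBundleMatches); // true
console.log(datedReportMatches); // true, which is a false positive
```

If the report were later replaced under the same name, clients that had already cached it as immutable would keep serving the old copy.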
Current Behavior
TL;DR of why trying to use the currently exposed json files is difficult:
- In order to get the concrete list of all files that are safe to cache immutably (because they have build or content hashes in them) I need to use both chunks.json and assets.json. chunks.json includes sourcemap files, and assets.json has files like png/fonts etc. which chunks.json doesn't.
- assets.json and chunks.json aren't in the same format (this is possibly a problem that only manifests for me because I let webpack split things across multiple chunks), so they require different ad-hoc transformations to collate the complete list of all files/assets. Some of the differences are:
- It seems that for any chunk that isn't in (assets.json).client (e.g. "client": { "js": "/static/js/bundle.6fc534aa.js" }), assets.json groups all other assets under an empty string (e.g. "": { "js": "/static/js/0.cb47cee9.chunk.js" }).
- If only one file is present in a chunks.json group, it will be an array with one item in it (e.g. "client": { "css": ["filename.css"] }); if there's only one file present in assets.json, it will instead just be a single string (e.g. "client": { "css": "filename.css" }).
- My assets.json currently contains "json": "/../chunks.json", which I don't think should be in there (I'm not sure if this is a bug or not), but I have to manually strip it out when making the list of files that can be given long-lived Cache-Control response headers.
- The plan to add a chunks: ["1", "2", "3"] array to chunks.json is somewhat annoying because it means I have to do extra work to filter out (chunks.json).client.chunks, since it doesn't contain an array of files like (chunks.json).client.css and (chunks.json).client.js etc.
- Before the change I made here, files not in the client chunk weren't even appearing in the chunks.json file. I suggested changing it to use the chunk number(s) as the key because at least they then appear in the file. The downside of this is that chunks.json and assets.json now diverge further in their schema when dealing with chunks that aren't the primary named chunk ("client": {/* blah */ }).
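To make the differences above concrete, here is a rough sketch of the normalisation they force on a consumer. It assumes only the shapes described in the bullets (they are not an official razzle schema), and the names Manifest, collectFiles, CHUNK_IDS_KEY and STRAY_MANIFEST_ENTRY are hypothetical.

```typescript
// Shapes as described in this issue: groups keyed by chunk name ("client", "",
// or a chunk number), each mapping a file type to a single path or a list of paths.
type Manifest = Record<string, Record<string, string | string[]>>;

// Per the bullets above: "chunks" holds chunk ids rather than files, and
// "/../chunks.json" is the stray entry I currently see in my assets.json.
const CHUNK_IDS_KEY = "chunks";
const STRAY_MANIFEST_ENTRY = "/../chunks.json";

function collectFiles(...manifests: Manifest[]): string[] {
  const files = new Set<string>();
  for (const manifest of manifests) {
    for (const group of Object.values(manifest)) {
      for (const [key, value] of Object.entries(group)) {
        if (key === CHUNK_IDS_KEY) continue;
        // assets.json may hold a bare string where chunks.json holds an array.
        const list = Array.isArray(value) ? value : [value];
        for (const file of list) {
          if (file !== STRAY_MANIFEST_ENTRY) files.add(file);
        }
      }
    }
  }
  // The Set removes files listed in both manifests; sort for stable output.
  return Array.from(files).sort();
}

// Sample data built from the snippets quoted in the bullets above.
const assets: Manifest = {
  client: { js: "/static/js/bundle.6fc534aa.js", json: "/../chunks.json" },
  "": { js: "/static/js/0.cb47cee9.chunk.js" },
};
const chunks: Manifest = {
  client: { js: ["/static/js/bundle.6fc534aa.js"], chunks: ["1", "2", "3"] },
};

const collected = collectFiles(assets, chunks);
console.log(collected);
```

Even this small version needs two special cases that depend on knowing the quirks of both files, which is the fragility I'd like to avoid.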
Using assets.json and chunks.json
Currently, using assets.json and chunks.json, this is roughly what I've had to do so far.
I haven't yet:
- Added loading of assets.json and resolved the differences between the formats
- Filtered out files/fields in the json that I know aren't meant to be there, like "chunks": ["1", "2", "3"] and "json": "/../chunks.json"
function razzleCacheableFiles() {
  // TODO: Add loading the assets.json file to support (png/txt files etc)
  // eslint-disable-next-line @typescript-eslint/no-non-null-assertion
  const chunks = require(process.env.RAZZLE_CHUNKS_MANIFEST!);
  // Merge every chunk's files into a single map keyed by file type (js, css, ...)
  const filesByType = Object.entries(chunks).reduce(
    (chunkAcc: any, [, chunk]) => {
      const types = Object.entries(chunk as any).reduce(
        (typeAcc: any, [fileType, files]) => {
          return {
            // Keep the file types collected so far
            ...typeAcc,
            [fileType]: typeAcc[fileType]
              ? [...typeAcc[fileType], ...(files as string[])]
              : files,
          };
        },
        // Start from the result accumulated over the previous chunks
        chunkAcc,
      );
      return types;
    },
    {},
  );
  // Flatten the per-type lists into one array of filenames
  const files = Object.entries(filesByType).reduce(
    (acc: any[], [, files]) => [...acc, ...(files as string[])],
    [],
  );
  return files;
}
const cacheableFiles = razzleCacheableFiles();
// Serve static files located under `process.env.RAZZLE_PUBLIC_DIR`
const assetCaching = {
  immutable: {
    maxAge: CacheFor.OneMonth,
    sMaxAge: CacheFor.OneYear,
  },
  default: {
    maxAge: CacheFor.OneDay,
    sMaxAge: CacheFor.OneWeek,
  }
};
app.use(
  serve(process.env.RAZZLE_PUBLIC_DIR!, {
    setHeaders(res, path) {
      const filename = path.replace(process.env.RAZZLE_PUBLIC_DIR!, "");
      const hasHashInFilename = cacheableFiles.includes(filename);
      if (hasHashInFilename) {
        const { immutable } = assetCaching;
        res.setHeader(
          "Cache-Control",
          `max-age=${immutable.maxAge},s-maxage=${immutable.sMaxAge},immutable`,
        );
        return;
      }
      res.setHeader(
        "Cache-Control",
        `max-age=${assetCaching.default.maxAge},s-maxage=${assetCaching.default.sMaxAge}`,
      );
    },
  }),
);
Desired Behavior
There would probably be many ways to do this, but the primary thing I want is just a way to load an array of all cacheable/immutable assets generated by razzle build. The result could look something like this:
// File: caching.json
// contains all files/assets with a hash in them regardless of what type of file they are.
{
  "immutable": [
    "/static/js/0.cb47cee9.chunk.js",
    "/static/js/0.cb47cee9.chunk.js.map",
    "/static/js/0.cb47cee9.chunk.js.LICENSE.txt",
    "/static/media/ferris-error.407b714e.png"
  ],
  // I'm not even sure if this is required because I don't think razzle generates any files that don't have hashes in them?
  // Possibly files copied in from the `public/` directory during build, but I'm not even sure if that'd be useful.
  "standard": []
}
// RAZZLE_CACHING_MANIFEST is probably a silly name but
const cacheableFiles = require(process.env.RAZZLE_CACHING_MANIFEST!);
// Serve static files located under `process.env.RAZZLE_PUBLIC_DIR`
const assetCaching = {
  immutable: {
    maxAge: CacheFor.OneMonth,
    sMaxAge: CacheFor.OneYear,
  },
  default: {
    maxAge: CacheFor.OneDay,
    sMaxAge: CacheFor.OneWeek,
  }
};
app.use(
  serve(process.env.RAZZLE_PUBLIC_DIR!, {
    setHeaders(res, path) {
      const filename = path.replace(process.env.RAZZLE_PUBLIC_DIR!, "");
      const hasHashInFilename = cacheableFiles.immutable.includes(filename);
      if (hasHashInFilename) {
        const { immutable } = assetCaching;
        res.setHeader(
          "Cache-Control",
          `max-age=${immutable.maxAge},s-maxage=${immutable.sMaxAge},immutable`,
        );
        return;
      }
      res.setHeader(
        "Cache-Control",
        `max-age=${assetCaching.default.maxAge},s-maxage=${assetCaching.default.sMaxAge}`,
      );
    },
  }),
);
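With a manifest of that shape, the header choice could also be pulled out into a small pure function, which would be easy to unit test without a server. This is only a sketch: CachingManifest and makeCacheControl are hypothetical names, and the CacheFor values are assumed durations in seconds (30 days for a month, 365 days for a year), since they aren't defined anywhere in this issue.

```typescript
// Assumed durations in seconds.
const CacheFor = {
  OneDay: 86400,
  OneWeek: 604800,
  OneMonth: 2592000,
  OneYear: 31536000,
};

// Hypothetical type for the caching.json shape proposed above.
interface CachingManifest {
  immutable: string[];
  standard: string[];
}

// Returns a lookup function that maps a served filename to its Cache-Control value.
function makeCacheControl(manifest: CachingManifest): (filename: string) => string {
  // A Set gives constant-time lookups instead of scanning the array per request.
  const immutableFiles = new Set(manifest.immutable);
  return (filename: string): string =>
    immutableFiles.has(filename)
      ? `max-age=${CacheFor.OneMonth},s-maxage=${CacheFor.OneYear},immutable`
      : `max-age=${CacheFor.OneDay},s-maxage=${CacheFor.OneWeek}`;
}

const cacheControlFor = makeCacheControl({
  immutable: ["/static/js/0.cb47cee9.chunk.js"],
  standard: [],
});

console.log(cacheControlFor("/static/js/0.cb47cee9.chunk.js"));
// max-age=2592000,s-maxage=31536000,immutable
console.log(cacheControlFor("/favicon.ico"));
// max-age=86400,s-maxage=604800
```

Inside setHeaders this would reduce to res.setHeader("Cache-Control", cacheControlFor(filename)).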
Suggested Solution
I haven't fully investigated what a good solution would be, but after trying to put together this list of "cacheable assets" at runtime using assets.json and chunks.json, I'm pretty convinced that, at a minimum, the best way to accomplish this would be at build time with some kind of webpack plugin, bypassing the inconsistencies of those two files.
For my purposes I'll probably start by looking into how to accomplish this with a plugin rather than at runtime as I've been doing, but I think there'd be significant value in having this baked into razzle by default. Being able to set long-lived Cache-Control on hashed files is largely why they get hashed to begin with, so exposing a list of all those files seems appropriate.
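As a starting point for discussion, a build-time plugin might look roughly like the sketch below. It rests on assumptions that would need checking against the webpack version razzle ships: that compilation.getAssets() is available, and that webpack sets the immutable flag in an asset's info when its filename was produced from a hash placeholder (files emitted by loaders may or may not carry the flag). The plugin name, the caching.json filename, and the leading "/" public path are hypothetical. The small structural interfaces stand in for webpack's real types so the sketch compiles on its own.

```typescript
// Structural stand-ins for the parts of webpack's API the sketch touches.
interface AssetLike {
  name: string;
  info: { immutable?: boolean };
}
interface CompilationLike {
  getAssets(): AssetLike[];
  assets: Record<string, { source(): string; size(): number }>;
}
interface CompilerLike {
  hooks: {
    emit: { tap(name: string, fn: (compilation: CompilationLike) => void): void };
  };
}

// Pure helper: split asset names by the (assumed) `immutable` asset-info flag.
function buildCachingManifest(assets: AssetLike[]): { immutable: string[]; standard: string[] } {
  const toUrl = (name: string): string => "/" + name; // assumes assets are served from "/"
  return {
    immutable: assets.filter((a) => a.info.immutable === true).map((a) => toUrl(a.name)).sort(),
    standard: assets.filter((a) => a.info.immutable !== true).map((a) => toUrl(a.name)).sort(),
  };
}

// Hypothetical plugin that writes the manifest alongside the other build output.
class CachingManifestPlugin {
  filename: string;

  constructor(filename: string = "caching.json") {
    this.filename = filename;
  }

  apply(compiler: CompilerLike): void {
    compiler.hooks.emit.tap("CachingManifestPlugin", (compilation) => {
      const manifest = buildCachingManifest(compilation.getAssets());
      const json = JSON.stringify(manifest, null, 2);
      // Character count is used as an approximate size; fine for ASCII JSON.
      compilation.assets[this.filename] = {
        source: () => json,
        size: () => json.length,
      };
    });
  }
}
```

The point of the design is that the decision comes from what the bundler itself records about each asset rather than from inspecting filenames, so no regex is involved.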
Who does this impact? Who is this for?
Any users who want to set appropriate long-lived & immutable cache-control response headers for files generated & hashed by razzle.
Describe alternatives you've considered
- Generate the list of all immutable/cacheable files at runtime by cobbling together chunks.json and assets.json (seems error-prone and fragile).
- Create an external plugin to pre-generate the list of cacheable files at build time (seems possibly fragile across razzle versions for a feature that seems like it should be baked-in/stable).
- Add it as an internal plugin to razzle's default config and expose a way to access it by default, e.g. require(process.env.RAZZLE_CACHING_MANIFEST!).
Additional context
I'd be willing to help/contribute towards making this change but I might need a bit of a point in the right direction (and of course whether or not this is a change that would be accepted/welcomed).
Also a thought: having something like this might make it easier to have some tests/stability around ensuring that things are using [contenthash:8] instead of [hash:8] (build hash) if/when they can: https://github.com/jaredpalmer/razzle/issues/1331