Asset bundling from the command line

I wanted a simple solution to bundle and fingerprint the assets of my homepage, which is built with Jekyll.

Existing solutions seemed like overkill. It turned out that I could achieve the same result with a simple script.

Web Assets Management

Asset Bundling. The performance of web apps and websites can be improved by bundling assets. Roughly speaking, the process consists of concatenating all stylesheets into one file and all JavaScript sources into another, removing whitespace and other unnecessary characters (a process sometimes called minification or uglifying), and optionally compressing the resulting files to further reduce their size. This reduces the number and size of the network requests required to load and render our content.
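The concatenation and compression steps can be sketched with standard tools alone. This is a minimal illustration, not the script used for the site: the two stylesheets (reset.css and theme.css) are hypothetical stand-ins, everything happens in a temporary directory, and minification is left out because it needs a dedicated tool.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Work in a throwaway directory so the sketch does not touch a real site.
workdir=$(mktemp -d)

# Two tiny hypothetical stylesheets standing in for real assets.
printf 'body { margin: 0; }\n' > "$workdir/reset.css"
printf 'h1 { color: navy; }\n' > "$workdir/theme.css"

# Bundle: concatenate the stylesheets into a single file
# (one network request instead of two).
cat "$workdir/reset.css" "$workdir/theme.css" > "$workdir/bundle.css"

# Compress: write a gzipped copy next to the bundle, keeping the original.
gzip -9 -c "$workdir/bundle.css" > "$workdir/bundle.css.gz"

# Report the sizes in bytes.
wc -c "$workdir/bundle.css" "$workdir/bundle.css.gz"
```

With files this small the gzipped copy may well be larger than the original, since gzip adds a fixed header; the savings only show up on real-sized assets.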

Cache Busting. Another good practice is implementing a cache busting strategy, that is, a mechanism that invalidates the cache of browsers when our assets change.

Browsers usually keep local copies of the files they download when visiting a site. Some of these files, such as stylesheets and JavaScript, tend to change rarely, if at all. By caching these files, browsers can render a website faster on subsequent visits, since there is no need to download again files that are already available locally.

This, however, can be a problem when the cached assets are actually changed: if the browser already has a local copy, it might not check for new versions and our website might not render as intended.

There are various techniques to avoid this issue. A simple approach consists of revving (or fingerprinting) assets, that is, changing their filenames when their content changes. Common techniques use timestamps, version numbers, or a hash of the file's content (for instance, its SHA-256 digest). If you are interested in the topic, see Cache Bust that Asset for a review of cache busting techniques and HTTP Caching for an excellent introduction to caching.
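To make the hash-based variant concrete, here is a small sketch. The helper name fingerprinted_name is hypothetical and the stylesheet is a throwaway file in a temporary directory; the point is only that a different content yields a different filename.

```shell
#!/usr/bin/env bash
set -euo pipefail

# fingerprinted_name FILE: print FILE's name with its SHA-256 digest inserted
# before the extension (hypothetical helper, for illustration only).
fingerprinted_name() {
  local file=$1
  local sha
  sha=$(sha256sum "$file" | cut -f1 -d" ")
  local base=${file%.*}   # path without the extension
  local ext=${file##*.}   # extension without the dot
  printf '%s-%s.%s\n' "$base" "$sha" "$ext"
}

workdir=$(mktemp -d)
printf 'h1 { color: navy; }\n' > "$workdir/main.css"
before=$(fingerprinted_name "$workdir/main.css")

# Changing the content changes the digest, and therefore the filename.
printf 'h1 { color: teal; }\n' > "$workdir/main.css"
after=$(fingerprinted_name "$workdir/main.css")

echo "$before"
echo "$after"
```

Unlike a timestamp, a content hash stays the same when a file is rebuilt without changes, so browsers keep using their cached copy.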

Tools. There are many tools to automatically bundle and fingerprint your assets. For instance, if you are on Ruby on Rails or Middleman, asset bundling is automatically managed by Sprockets.

If you are on Jekyll, the situation is a bit more complex. There are, in fact, various plugins, such as Jekyll Assets, Jekyll Minibundle, and Jekyll::Littlefinger, providing different degrees of automation. However, integrating them requires a bit of work and, in some cases, there are some quirks: for instance, I could not get Jekyll::Littlefinger to work with Jekyll’s support for Sass.

A different approach is using one of the many tools available for JavaScript, such as Gulp, Webpack, and Grunt. Of these, the one most often used with Jekyll seems to be Gulp, for which there are some nice tutorials, such as Building a production website with Hugo and GulpJS and Using Gulp asset versioning with Hugo data files.

The Command Line to the Rescue

My website has two stylesheets written in Sass and two JavaScript files. When I moved from Foundation to Spectre.css I had an issue with cache busting, but incorporating Sprockets, Gulp, or any other build system into my toolchain seemed like overkill.

Fortunately, the operations required to bundle assets can be easily performed from the command line and, thus, I decided to write a bash script to bundle and fingerprint the assets of my website.

Here it is:

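#!/usr/bin/env bash
# stop at the first failing command, so a broken build is never fingerprinted
set -euo pipefail

# use Sass to compile stylesheets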
sassc -I node_modules/spectre.css/src _sass/main.scss > assets/css/main.css

# fingerprint the compiled stylesheet: rename it to include its SHA-256 digest
# (it stays in assets/css, a directory Jekyll understands)
SHA=$(sha256sum assets/css/main.css  | cut -f1 -d" ")
mv assets/css/main.css assets/css/main-${SHA}.css

# keep track of the filename with the SHA in _data/assets.yml 
# (the > ensures the old version of assets.yml gets overwritten)
cat > _data/assets.yml <<EOF
main.css:
  digested: assets/css/main-${SHA}.css
EOF

# use UglifyJS to concatenate all JavaScript files
uglifyjs node_modules/zepto/dist/zepto.min.js _js/main.js > assets/js/main.js

# fingerprint the JavaScript bundle: rename it to include its SHA-256 digest
# (it stays in assets/js, a directory Jekyll understands)
SHA=$(sha256sum assets/js/main.js  | cut -f1 -d" ")
mv assets/js/main.js assets/js/main-${SHA}.js

# keep track of the filename with the SHA in _data/assets.yml
# (the >> ensures we are appending to _data/assets.yml)
cat >> _data/assets.yml <<EOF
main.js:
  digested: assets/js/main-${SHA}.js
EOF

One important bit of the script is the generation of the assets.yml file, which keeps the correspondence between original and fingerprinted filenames. With fingerprinting, in fact, an asset's name changes every time its content is modified, so layouts that reference assets must be able to look up the current name (otherwise we would have to update the references manually after every change).
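If more assets were added, the rename-and-record step could be factored into a function and applied in a loop. The following is only a sketch under assumptions: digest_asset is a hypothetical helper name, the site layout is recreated in a temporary directory with placeholder files, and the YAML layout mirrors the one produced by the script above.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway site layout, so the sketch leaves a real project untouched.
site=$(mktemp -d)
mkdir -p "$site/assets/css" "$site/assets/js" "$site/_data"
printf 'h1 { color: navy; }\n' > "$site/assets/css/main.css"
printf 'console.log("hi");\n'  > "$site/assets/js/main.js"

manifest="$site/_data/assets.yml"
: > "$manifest"   # start from an empty manifest on every run

# digest_asset PATH: rename PATH (relative to $site) to include its SHA-256
# digest and record the original -> fingerprinted mapping in the manifest.
# (Hypothetical helper; not part of the original script.)
digest_asset() {
  local path=$1
  local sha
  sha=$(sha256sum "$site/$path" | cut -f1 -d" ")
  local digested="${path%.*}-${sha}.${path##*.}"
  mv "$site/$path" "$site/$digested"
  printf '%s:\n  digested: %s\n' "$(basename "$path")" "$digested" >> "$manifest"
}

for asset in assets/css/main.css assets/js/main.js; do
  digest_asset "$asset"
done

cat "$manifest"
```

Keying the manifest on the original basename (main.css, main.js) keeps lookups from the layouts stable even though the fingerprinted names change.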

Jekyll data files and, more specifically, the assets.yml we generate in the script come to the rescue, since we can use them in our layouts to load the proper asset, as follows:

<link rel="stylesheet" type="text/css" href="https://www.ict4g.net/adolfo/assets/css/main-3af90e380762c3447e7d390e86069ca447ba3144bdd87265f77deab6737d67bf.css">
<script type="text/javascript" src="https://www.ict4g.net/adolfo/assets/js/main-8dfe33580f223aba67dcb6990190b56475af4de1009eaf1cb7dce78ee38dfede.js"></script>

The important bit in the layout is the Liquid expression site.data.assets['main.css'].digested, which gets the name of the fingerprinted main.css asset (the tags above show the output after Jekyll has rendered it). See Example: Accessing a specific author for more details.

Now, every time I change my assets, I run the script above to generate fresh copies of my assets.

The solution might be simple and naive (it does not delete old fingerprinted assets, to name one limitation), but it works, it is blazing fast, and it helps limit both the number of external dependencies and the complexity of the build process. If you have spent time debugging a toolchain that stopped working after some components were updated, you probably know what I mean.
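The missing cleanup could be added with a single rm before the build. The sketch below is an assumption about how one might do it, not part of the script I actually run; it uses a temporary directory and made-up filenames in place of assets/css.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway directory standing in for assets/css.
css_dir=$(mktemp -d)

# Pretend two earlier builds left fingerprinted stylesheets behind,
# next to an unrelated file that must survive.
touch "$css_dir/main-aaaa.css" "$css_dir/main-bbbb.css" "$css_dir/print.css"

# Remove every previously fingerprinted main stylesheet before rebuilding.
# -f keeps rm quiet when nothing matches yet (for example on the first run).
rm -f "$css_dir"/main-*.css

ls "$css_dir"
```

The same pattern with main-*.js would cover the JavaScript bundle.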

Conclusions and Take-Home Lesson

Asset bundling and revving can improve the performance of your website or web app, and there are many nice tools that support us very well in complex workflows.

There are situations, however, where a simpler and more robust solution is at hand: we just need to think about what our requirements really are and what we really need from the tools we use.
