Bibliography caching for Jekyll-Scholar
20 May 2020

The bibliography on my About page is generated automatically. Because I already maintain my bibliography in dedicated reference-management software, extracting the relevant entries automatically ensures a unified style and keeps the updating effort minimal.

I have been using Jekyll for a long time to produce a static webpage. The Jekyll-Scholar plugin is the natural choice for generating a bibliography page. However, using it can lead to very long webpage regeneration times. Here, I provide a small fragment caching plugin that offers a significant speed gain.

The usual bibliography source for Jekyll-Scholar is a dedicated BibTeX file, each entry of which is rendered on the bibliography page. However, I prefer to avoid manually maintaining such a file. Instead, I reference a BibTeX file that contains my whole personal library of papers and is exported from Zotero. From this collection, I instruct Jekyll-Scholar to select the relevant entries by pattern matching on the author name and the desired entry type.

For example, the following Liquid template tag inserts a bibliography of books where at least one of the authors bears my last name:
{% bibliography --query @book[author~=Sedding] %}

Loading a large BibTeX file, however, complicates the workflow in another respect: every change to the webpage triggers a regeneration of the bibliography page. In my case, using a 1.5 MB BibTeX file to generate four bibliographies, each with a different entry type such as book or article, takes about five seconds.

The long loading time is aggravated when using Jekyll’s “serve” mode to preview changes locally: for some reason, each regeneration becomes slower than the last, up to the point of being unbearable. After several edits, one literally has to wait about 50 seconds until the webpage is regenerated.

A typical remedy is to cache the bibliography page to avoid regenerating it unnecessarily. Unfortunately, Jekyll-Scholar provides no caching feature. Also, Jekyll’s own page-based cache is of no use here: although Jekyll 4.0 seems to support caching of Markdown pages, the bibliographic parts in such pages still get executed in each regeneration.

However, we can achieve a major speedup by calling the Jekyll cache explicitly.

This is done with a custom fragment caching plugin, which provides the Liquid tag “cached_bibliography”. Its usage is simple: each standard bibliography tag is replaced one-to-one by cached_bibliography.

Then, the call in the previous example simply changes to:
{% cached_bibliography --query @book[author~=Sedding] %}

The speed gain speaks for itself: once the fragments are cached, the regeneration time drops back to the original 0.05 seconds.
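The principle behind the cached tag can be illustrated without Jekyll. The following is a minimal sketch, not the plugin itself: FragmentCache is a hypothetical, in-memory stand-in for Jekyll's cache (which additionally persists its entries in .jekyll-cache), and render_bibliography is a hypothetical placeholder for the slow rendering step. The tag's argument text serves as the cache key, so the expensive block runs only on the first call.

```ruby
# Simplified, in-memory stand-in for Jekyll's cache (illustration only).
# FragmentCache and render_bibliography are hypothetical names.
class FragmentCache
  def initialize
    @store = {}
  end

  # Return the stored value for key; on a miss, run the block once
  # and store its result under key.
  def getset(key)
    return @store[key] if @store.key?(key)

    @store[key] = yield
  end
end

render_calls = 0

# Placeholder for the slow step: parsing the BibTeX file and rendering entries.
render_bibliography = lambda do |query|
  render_calls += 1
  "<ol class=\"bibliography\"><!-- entries for #{query} --></ol>"
end

cache = FragmentCache.new
query = '--query @book[author~=Sedding]'

# The first call renders the fragment; the second call reuses it.
first  = cache.getset(query) { render_bibliography.call(query) }
second = cache.getset(query) { render_bibliography.call(query) }

puts "rendered #{render_calls} time(s), fragments identical: #{first == second}"
```

Because the key is only the argument text, two tags with identical arguments share one cached fragment, and a changed BibTeX file does not by itself invalidate the entry.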

If you would like to use fragment caching as well, you are welcome to use my plugin’s source provided below. To install it, paste it into a new file within the plugins subfolder, such as _plugins/cached_bibliography.rb. Note that the fragments are stored, like all of Jekyll’s other cached elements, in the .jekyll-cache subfolder. This folder can be deleted whenever necessary, for example after adding a new bibliography entry.

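# _plugins/cached_bibliography.rb
require 'jekyll/scholar'

module Jekyll
  # Drop-in replacement for the bibliography tag that caches its rendered output.
  class CachedBibliographyTag < Jekyll::Scholar::BibliographyTag
    def initialize(tag_name, text, tokens)
      # Keep the raw tag arguments; they serve as the cache key.
      @text = text
      super
    end

    def render(context)
      # Render via Jekyll-Scholar only on a cache miss; otherwise reuse the stored fragment.
      Jekyll::Cache.new("Jekyll::CachedBibliographyTag").getset(@text) do
        super
      end
    end
  end
end

# Make the tag available in Liquid templates as {% cached_bibliography ... %}.
Liquid::Template.register_tag(
  'cached_bibliography', Jekyll::CachedBibliographyTag)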
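The plugin above uses only the tag text as its cache key, which is why the cache has to be deleted by hand after the BibTeX file changes. One possible variation, which is not part of the plugin and is shown only as a sketch, is to mix the file's modification time and size into the key, so that a changed file leads to a cache miss. The helper bibliography_cache_key is a hypothetical name. How the plugin would obtain the BibTeX path depends on the Jekyll-Scholar configuration and is not shown here.

```ruby
require 'digest'
require 'tempfile'

# Hypothetical helper (not part of Jekyll or Jekyll-Scholar): derives a cache
# key from the tag text plus the BibTeX file's modification time and size.
def bibliography_cache_key(tag_text, bib_path)
  stat = File.stat(bib_path)
  Digest::SHA256.hexdigest([tag_text, bib_path, stat.mtime.to_f, stat.size].join("\0"))
end

# Demonstration with a temporary file standing in for the Zotero export.
bib = Tempfile.new(['library', '.bib'])
bib.write("@book{a, title = {A}}\n")
bib.flush

tag_text   = '--query @book[author~=Sedding]'
key_before = bibliography_cache_key(tag_text, bib.path)
key_again  = bibliography_cache_key(tag_text, bib.path)

# Adding an entry changes the file size (and usually the mtime), hence the key.
bib.write("@book{b, title = {B}}\n")
bib.flush
key_after = bibliography_cache_key(tag_text, bib.path)

bib.close! # close and delete the temporary file

puts "unchanged file, same key: #{key_before == key_again}"
puts "changed file, new key:    #{key_before != key_after}"
```

With such a key, stale fragments would simply stop being looked up. They would still remain in .jekyll-cache until that folder is deleted.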