Bibliography caching for Jekyll-Scholar 20 May 2020

The bibliography on my About page is generated automatically. Because I already maintain my bibliography in dedicated software, automatically extracting the relevant entries to build the webpage’s bibliography ensures a unified style with minimal updating effort.

I have been using Jekyll for a long time to produce a static webpage. Naturally, the Jekyll-Scholar plugin lends itself to generating a bibliography page. However, using it can lead to very long webpage regeneration times. Here, I provide a new fragment caching plugin, which offers a significant speed gain.

The usual bibliography source for Jekyll-Scholar is a dedicated BibTeX file, each entry of which is rendered on the bibliography page. However, I prefer to avoid the usual process of manually maintaining such a BibTeX file. Instead, I reference a BibTeX file that contains my whole personal library of papers and is generated from Zotero. From this collection, I instruct Jekyll-Scholar to filter the relevant entries by pattern matching on the author name and the desired entry type.

For example, the following Liquid template tag inserts a bibliography of books where at least one of the authors bears my last name:
{% bibliography --query @book[author~=Sedding] %}

Loading a large BibTeX file, however, complicates the workflow at another point: every change to the webpage triggers a regeneration of the bibliography page. In my case, using a 1.5 MB BibTeX file to generate four bibliographies, each with a different entry type such as book or article, takes about five seconds.
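The remaining bibliographies follow the same pattern with a different entry type in the query; for example, a list of articles might look like this (the article type is my illustration, not necessarily the exact set of types I use):

{% bibliography --query @article[author~=Sedding] %}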

The long loading time is accentuated when using Jekyll’s “serve” mode to preview changes locally: for some reason, each regeneration becomes slower than the previous one, up to a point at which it is unbearably slow. After several edits, one has to wait about 50 seconds until the webpage is regenerated.

A typical remedy is to cache the bibliography page to avoid regenerating it unnecessarily. Unfortunately, Jekyll-Scholar provides no caching feature. Jekyll’s own page-based cache is of no use here either: although Jekyll 4.0 supports caching of Markdown pages, the bibliography tags in such pages are still executed on each regeneration.

However, we can achieve a major speed-up by calling Jekyll’s cache explicitly.
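The key primitive is the getset method of Jekyll 4’s cache API: it evaluates its block only on a cache miss and returns the stored value on every later call with the same key. As a minimal sketch of this pattern, with FragmentCache as a hypothetical in-memory stand-in for Jekyll::Cache:

```ruby
# Sketch of the getset pattern used by Jekyll's cache:
# the block runs only on a cache miss; subsequent calls with the
# same key return the stored value without re-rendering.
# FragmentCache is an illustrative stand-in, not Jekyll's own class.
class FragmentCache
  def initialize
    @store = {}
  end

  # Return the cached value for key, computing and storing it
  # with the given block only if it is not present yet.
  def getset(key)
    @store.fetch(key) { @store[key] = yield }
  end
end

cache = FragmentCache.new
renders = 0
first  = cache.getset('--query @book') { renders += 1; '<ol>…</ol>' }
second = cache.getset('--query @book') { renders += 1; '<ol>…</ol>' }
# The expensive block ran only for the first call; the second was a hit.
```

Applied to a bibliography, the expensive block would be the Jekyll-Scholar rendering, keyed by the tag’s argument string.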

I achieve this with a custom fragment caching plugin, which provides the Jekyll Liquid tag “cached_bibliography”. Its usage is simple: each standard bibliography call is replaced one-to-one by cached_bibliography.

Then, the call in the previous example simply changes to:
{% cached_bibliography --query @book[author~=Sedding] %}

The speed gain speaks for itself: once the fragments are cached, the regeneration time is back down to the original 0.05 seconds.

If you would like to use fragment caching as well, you are welcome to use my plugin’s source provided below. It is installed by pasting it into a new file within the _plugins subfolder, e.g. _plugins/cached_bibliography.rb. Note that the fragments are stored, like all of Jekyll’s other cached elements, in the .jekyll-cache subfolder, which can be deleted whenever necessary, for example after adding a new bibliography entry.

require 'jekyll/scholar'

module Jekyll
  class CachedBibliographyTag < Jekyll::Scholar::BibliographyTag
    def initialize(tag_name, text, tokens)
      super
      @text = text
    end

    def render(context)
      # Cache the rendered bibliography fragment, keyed by the tag's
      # argument string; the expensive rendering in super only runs
      # on a cache miss.
      Jekyll::Cache.new("Jekyll::CachedBibliographyTag").getset(@text) do
        super
      end
    end
  end
end

Liquid::Template.register_tag('cached_bibliography', Jekyll::CachedBibliographyTag)

QML for desktop apps 21 December 2012

The Qt Developer Days 2012 in Berlin were quite eventful, also for me, because attendees had the chance to see our product, IPO.Log. I gave a presentation about some generic internals that are useful for building QML applications on the desktop.

A screencast of my talk is available on YouTube.

In addition, the presentation slides are provided here:

Slides PDF

First publication 04 September 2012


My first scientific contribution has just been published. What an amazing moment. (“Massively Parallel Multiclass Object Recognition”, together with F. Deger, H. Dammertz, J. Bouecke and H. P. A. Lensch, in Proc. of VMV 2010)


A citing paper (Orchard et al., 2013) appears in IEEE Transactions on Neural Networks and Learning Systems, mentioning us as their predecessor and as a performance reference. They bring our approach to the next level of integration (to FPGAs).


It all started when I attended a lecture called Massively Parallel Computing by Hendrik Lensch. This lecture taught me how one can speed up computer calculations by about a factor of 100.

How is this possible? Usually, to speed up an algorithm, one invents a more sophisticated algorithm with reduced complexity. But if you are already using the best possible, or at least the best known, algorithm, how can you be quicker? Say you are already using the best-suited processor instructions. Can we run it on faster hardware? As computers get quicker, usually doubling their speed every 18 months, one could simply wait 10 years and then buy a shiny new PC to achieve a factor of 100. But after waiting that long, you probably don’t need the results anymore.

But how is it then possible to speed up calculations that much on your PC? You already felt it coming: with the calculation units on your graphics processing unit (GPU).

The GPU has vast numbers of processing units, about 500 of them. They have a very simple instruction set, and this is their advantage: little cruft from the past, and no need for general-purpose computation. These units dedicate all their processing power to one task, your algorithm.

And this is where your brain gets involved again: how can you redesign your algorithm to fit the GPU? A GPU works best on data-parallel instructions, where all processing units perform the same calculation at the same time. Thus, you need to distribute the data in such a way that it can be munched through in lockstep. Imagine a bar mower: flowers, grass and weeds are treated equally when a row is cut by this mower. But the mower only works efficiently if it cuts with the full length of its bar. Hence, you need to take care to find the best covering path through your field of grass. Only such an algorithm keeps all processing units fully utilized.

What do we do now with this power? Find a hard problem, find a suitable algorithm, tailor it to run on the GPU, and compare it to the important existing solutions.

This is what we did. We asked our faculty neighbors in Computer Vision for interesting problems. This is how we found work on multiclass object recognition, which classifies image content into known categories. As a basis for our parallelized implementation, we used a visual-cortex-like model by Jim Mutch and David G. Lowe. Such biologically motivated models compute in feed-forward layers, which naturally lend themselves to parallel computation. Additionally, the graphical nature of an image recognition system suits the architecture of GPUs quite well.

Suffice it to say, the implementation went smoothly, and moreover, it opened up new possibilities: the use of object recognition even on low-end mobile computers. For this, you can watch the short demonstration video on YouTube.

In comparison with other implementations of the selected object recognition model, our algorithm was faster while still achieving the same quality of results. That means our work was not only working well, but also compared favorably to other published results.


This is why we spent the following weeks in the lab to further optimize our implementation, measure results, write our first drafts, correct them, correct them even more, send the paper in – and finally be accepted at VMV 2010. Hard work, great results, and just a little bit of luck of having the right people in the right place. I thank you all for these amazing moments.

Helmut Sedding; Ferdinand Deger; Holger Dammertz; Jan Bouecke; Hendrik P. A. Lensch:
Massively Parallel Multiclass Object Recognition
Proceedings of the 15th Vision, Modeling and Visualization Workshop 2010, pp. 251-257
[paper] [poster] [source code] [BibTeX] [video] [doi:10.2312/PE/VMV/VMV10/251-257]