Benki → Lazy Chat

Matthias #

Netty, gRPC, and accounting for direct memory allocation

When the Java Virtual Machine determines whether it may grow the heap or has to force the garbage collector to run in order to free space, it relies on the information it has about the various memory spaces that it is managing. In general, however, this is not all the memory that the application uses. For one, there is the C heap, which native libraries may allocate memory from. Second, the Java Virtual Machine itself allocates some memory outside the Java heap to perform its basic operations. And then there is off-heap memory allocated by Java code, a mechanism that is called direct allocation.

Native Memory Tracking, turned on by the -XX:NativeMemoryTracking=summary command-line flag, enables the Java Virtual Machine to collect statistics on how much native memory of which kind it itself is using and expose them to jcmd. Direct allocations are listed in the “Other” category.
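For example, with the flag enabled you can query the statistics of a running process like this (1234 stands in for your process ID):

jcmd 1234 VM.native_memory summary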

If you use ByteBuffer#allocateDirect to allocate a direct ByteBuffer, the Java Virtual Machine will keep track of the allocation. If you have set a memory limit with the help of the -XX:MaxRAMPercentage= command-line flag, it will also reduce the maximum heap size accordingly.

Now, for better or worse, there are ways to circumvent ByteBuffer#allocateDirect, and some Java libraries do just that in the name of performance, either by calling methods of sun.misc.Unsafe or by calling into native libraries. One particularly popular offender is Netty.

There are several different knobs you can turn. For example, you can run your program with -Dio.netty.noUnsafe=true to disable the use of sun.misc.Unsafe; you can try -Dio.netty.allocator.type=unpooled -Dio.netty.noPreferDirect=true to avoid direct allocations as far as possible; or (and this is what I recommend) you can run your program with -Dio.netty.maxDirectMemory=0, which makes direct allocations go through ByteBuffer#allocateDirect without inhibiting other unsafe performance shenanigans.

Now with that out of the way, you would think that Netty behaves. But perhaps you are using gRPC in your code base. If you expected the above system properties to take effect there as well, you would be wrong: gRPC ships with its own shaded copy of Netty, which changes the names of the system properties that it reads.

So what you actually have to do is set -Dio.netty.maxDirectMemory=0 -Dio.grpc.netty.shaded.io.netty.maxDirectMemory=0.
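Putting it all together, a full invocation might look like this (the heap percentage and jar name are placeholders):

java -XX:MaxRAMPercentage=75.0 \
     -XX:NativeMemoryTracking=summary \
     -Dio.netty.maxDirectMemory=0 \
     -Dio.grpc.netty.shaded.io.netty.maxDirectMemory=0 \
     -jar app.jar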

Now the Java Virtual Machine knows what is going on and limits allocations and the heap size according to -XX:MaxRAMPercentage= to the best of its ability.

Matthias #

Boehm–Berarducci encoding in Java

Last time we looked at how to apply Scott encoding to algebraic data types in Java. This time we are going to look at Boehm–Berarducci encoding, which is related.

Again we take the well-known persistent binary tree data type with labeled leaves:

Bᵨ := ρ + (Bᵨ × Bᵨ)

Or, in Haskell notation:

data BinaryTree t = Leaf t | Fork (BinaryTree t) (BinaryTree t)

Recall the Scott encoding of this data structure:

interface BinaryTreeScott<T> {
  <R> R visit(Function<T, R> onLeaf, BiFunction<BinaryTreeScott<T>, BinaryTreeScott<T>, R> onFork);
}

From a type-theoretic perspective, there is a problem with this: BinaryTreeScott<T> is defined in terms of itself. Perhaps you want to work in a type system that does not support recursive types.

Encoding

The Boehm–Berarducci encoding comes to the rescue. For our binary tree example it is:

interface BinaryTreeBoehmBerarducci<T> {
  <R> R fold(Function<T, R> foldLeaf, BiFunction<R, R, R> foldFork);
}

Note how the only technical difference is in how the self-type is encoded. In other words, the two encodings coincide on non-recursive types.

The intuitive difference is that while the Scott encoding encodes a data type using its pattern match, Boehm–Berarducci encodes it using its fold.

Implementation

The implementation is straightforward:

interface BinaryTreeBoehmBerarducci<T> {

  <R> R fold(Function<T, R> foldLeaf, BiFunction<R, R, R> foldFork);

  static <T> BinaryTreeBoehmBerarducci<T> ofLeaf(T value) {
    // Does not compile (lambdas cannot implement methods that declare their own type parameters):
    //return (foldLeaf, foldFork) -> foldLeaf.apply(value);

    return new BinaryTreeBoehmBerarducci<>() {
      @Override
      public <R> R fold(Function<T, R> foldLeaf, BiFunction<R, R, R> foldFork) {
        return foldLeaf.apply(value);
      }
    };
  }

  static <T> BinaryTreeBoehmBerarducci<T> ofFork(BinaryTreeBoehmBerarducci<T> left, BinaryTreeBoehmBerarducci<T> right) {
    // Does not compile (lambdas cannot implement methods that declare their own type parameters):
    //return (foldLeaf, foldFork) -> foldFork.apply(left.fold(foldLeaf, foldFork), right.fold(foldLeaf, foldFork));

    return new BinaryTreeBoehmBerarducci<>() {
      @Override
      public <R> R fold(Function<T, R> foldLeaf, BiFunction<R, R, R> foldFork) {
        return foldFork.apply(left.fold(foldLeaf, foldFork), right.fold(foldLeaf, foldFork));
      }
    };
  }
}

Summing up the leaves is now even easier than it was before, without even requiring explicit recursion:

class BinaryTreeBoehmBerarducciOps {

  static Integer sum(BinaryTreeBoehmBerarducci<Integer> b) {
    return b.fold(
        n -> n,
        (l, r) -> l + r);
  }
}
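For illustration, a quick usage sketch (the tree shape is arbitrary):

BinaryTreeBoehmBerarducci<Integer> tree =
    BinaryTreeBoehmBerarducci.ofFork(
        BinaryTreeBoehmBerarducci.ofLeaf(1),
        BinaryTreeBoehmBerarducci.ofFork(
            BinaryTreeBoehmBerarducci.ofLeaf(2),
            BinaryTreeBoehmBerarducci.ofLeaf(3)));

Integer six = BinaryTreeBoehmBerarducciOps.sum(tree);   // 1 + 2 + 3 = 6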

Pattern matching

But how do you perform a mere pattern match without recursing? This has now become quite tricky to do.

Your first idea might be to convert the Boehm–Berarducci-encoded data type into its Scott encoding:

interface BinaryTreeBoehmBerarducci<T> {

  <R> R fold(Function<T, R> foldLeaf, BiFunction<R, R, R> foldFork);

  static <T> BinaryTreeBoehmBerarducci<T> ofLeaf(T value) { ... }
  static <T> BinaryTreeBoehmBerarducci<T> ofFork(BinaryTreeBoehmBerarducci<T> left, BinaryTreeBoehmBerarducci<T> right) { ... }

  default BinaryTreeScott<T> toScott() {
    return fold(
        BinaryTreeScott::ofLeaf,
        BinaryTreeScott::ofFork);
  }
}

This works, but now we depend on the recursively defined type of BinaryTreeScott<T> again, which is what we wanted to avoid. To get around this limitation, we define a non-recursive helper type that fulfills the same role as BinaryTreeScott<T>, but based on BinaryTreeBoehmBerarducci<T>:

interface Deconstructor<T> {
  <W> W visit(Function<T, W> onLeaf, BiFunction<BinaryTreeBoehmBerarducci<T>, BinaryTreeBoehmBerarducci<T>, W> onFork);
}

Then we fold our BinaryTreeBoehmBerarducci<T> into a Deconstructor<T>, on which we can call visit to perform our pattern match:

interface BinaryTreeBoehmBerarducci<T> {

  <R> R fold(Function<T, R> foldLeaf, BiFunction<R, R, R> foldFork);

  static <T> BinaryTreeBoehmBerarducci<T> ofLeaf(T value) { ... }
  static <T> BinaryTreeBoehmBerarducci<T> ofFork(BinaryTreeBoehmBerarducci<T> left, BinaryTreeBoehmBerarducci<T> right) { ... }

  default <R> R visit(
      Function<T, R> onLeaf,
      BiFunction<BinaryTreeBoehmBerarducci<T>, BinaryTreeBoehmBerarducci<T>, R> onFork) {

    interface Deconstructor<T> {
      <W> W visit(Function<T, W> onLeaf, BiFunction<BinaryTreeBoehmBerarducci<T>, BinaryTreeBoehmBerarducci<T>, W> onFork);
    }

    return
        this.<Deconstructor<T>>fold(
                v ->
                    new Deconstructor<>() {
                      @Override
                      public <W> W visit(
                          Function<T, W> onLeaf1,
                          BiFunction<BinaryTreeBoehmBerarducci<T>, BinaryTreeBoehmBerarducci<T>, W> onFork1) {
                        return onLeaf1.apply(v);
                      }
                    },

                (left, right) ->
                    new Deconstructor<>() {
                      @Override
                      public <W> W visit(
                          Function<T, W> onLeaf1,
                          BiFunction<BinaryTreeBoehmBerarducci<T>, BinaryTreeBoehmBerarducci<T>, W> onFork1) {
                        return onFork1.apply(
                            left.visit(BinaryTreeBoehmBerarducci::ofLeaf, BinaryTreeBoehmBerarducci::ofFork),
                            right.visit(BinaryTreeBoehmBerarducci::ofLeaf, BinaryTreeBoehmBerarducci::ofFork));
                      }
                    })

            .visit(onLeaf, onFork);
  }
}

BinaryTreeBoehmBerarducci#visit now works as BinaryTreeScott#visit did before.

The only problem is that it is horrendously inefficient as it traverses the whole tree and constructs a complete mirror tree of Deconstructor<T> objects for just a single pattern match.

Optimization: lazy fold

We can remedy the pathological inefficiency by making the fold operation lazy in the recursive argument:

interface BinaryTreeBoehmBerarducciLazy<T> {
  <R> R fold(Function<T, R> foldLeaf, BiFunction<Supplier<R>, Supplier<R>, R> foldFork);
}

The complete code is thus:

@FunctionalInterface
public interface BinaryTreeBoehmBerarducciLazy<T> {

  <R> R fold(Function<T, R> foldLeaf, BiFunction<Supplier<R>, Supplier<R>, R> foldFork);

  static <T> BinaryTreeBoehmBerarducciLazy<T> ofLeaf(T value) {
    return new BinaryTreeBoehmBerarducciLazy<>() {
      @Override
      public <R> R fold(Function<T, R> foldLeaf, BiFunction<Supplier<R>, Supplier<R>, R> foldFork) {
        return foldLeaf.apply(value);
      }
    };
  }

  static <T> BinaryTreeBoehmBerarducciLazy<T> ofFork(BinaryTreeBoehmBerarducciLazy<T> left, BinaryTreeBoehmBerarducciLazy<T> right) {
    return new BinaryTreeBoehmBerarducciLazy<>() {
      @Override
      public <R> R fold(Function<T, R> foldLeaf, BiFunction<Supplier<R>, Supplier<R>, R> foldFork) {
        return foldFork.apply(() -> left.fold(foldLeaf, foldFork), () -> right.fold(foldLeaf, foldFork));
      }
    };
  }

  default <R> R visit(
      Function<T, R> onLeaf,
      BiFunction<BinaryTreeBoehmBerarducciLazy<T>, BinaryTreeBoehmBerarducciLazy<T>, R> onFork) {

    interface Deconstructor<T> {
      <W> W visit(Function<T, W> onLeaf, BiFunction<BinaryTreeBoehmBerarducciLazy<T>, BinaryTreeBoehmBerarducciLazy<T>, W> onFork);
    }

    return
        this.<Deconstructor<T>>fold(
                value ->
                    new Deconstructor<>() {
                      @Override
                      public <W> W visit(
                          Function<T, W> onLeaf1,
                          BiFunction<BinaryTreeBoehmBerarducciLazy<T>, BinaryTreeBoehmBerarducciLazy<T>, W> onFork1) {
                        return onLeaf1.apply(value);
                      }
                    },

                (left, right) ->
                    new Deconstructor<>() {
                      @Override
                      public <W> W visit(
                          Function<T, W> onLeaf1,
                          BiFunction<BinaryTreeBoehmBerarducciLazy<T>, BinaryTreeBoehmBerarducciLazy<T>, W> onFork1) {
                        return onFork1.apply(
                            left.get().visit(BinaryTreeBoehmBerarducciLazy::ofLeaf, BinaryTreeBoehmBerarducciLazy::ofFork),
                            right.get().visit(BinaryTreeBoehmBerarducciLazy::ofLeaf, BinaryTreeBoehmBerarducciLazy::ofFork));
                      }
                    })

            .visit(onLeaf, onFork);
  }
}
Matthias #

Scott encoding in Java

Assume you have your typical persistent binary tree data type with labeled leaves:

Bᵨ := ρ + (Bᵨ × Bᵨ)

Or, in Haskell notation:

data BinaryTree t = Leaf t | Fork (BinaryTree t) (BinaryTree t)

We can represent it in modern Java in a straightforward way:

sealed interface BinaryTree<T> {
  record Leaf<T>(T value) implements BinaryTree<T> {}
  record Fork<T>(BinaryTree<T> left, BinaryTree<T> right) implements BinaryTree<T> {}
}

How do you pattern-match on it?

You could wait for pattern matching for switch to exit preview. Then you will be able to write a function that sums up all the leaves in a tree of integers in a straightforward way:

class BinaryTreeOps {

  static Integer sum(BinaryTree<Integer> b) {
    return switch (b) {
      case BinaryTree.Leaf<Integer> leaf -> leaf.value();
      case BinaryTree.Fork<Integer> fork -> sum(fork.left()) + sum(fork.right());
    };
  }
}

If you do not want to wait, here is an alternative. Add a visit method that takes a deconstructor function for each case and implement it by calling the corresponding deconstructor in each case class:

sealed interface BinaryTree<T> {

  <R> R visit(Function<T, R> onLeaf, BiFunction<BinaryTree<T>, BinaryTree<T>, R> onFork);

  record Leaf<T>(T value) implements BinaryTree<T> {

    @Override public <R> R visit(Function<T, R> onLeaf, BiFunction<BinaryTree<T>, BinaryTree<T>, R> onFork) {
      return onLeaf.apply(value);
    }
  }

  record Fork<T>(BinaryTree<T> left, BinaryTree<T> right) implements BinaryTree<T> {

    @Override public <R> R visit(Function<T, R> onLeaf, BiFunction<BinaryTree<T>, BinaryTree<T>, R> onFork) {
      return onFork.apply(left, right);
    }
  }
}

Now you can sum up a tree of integers by folding it using the visit method:

class BinaryTreeOps {

  static Integer sum(BinaryTree<Integer> b) {
    return b.visit(
        n -> n, 
        (l, r) -> sum(l) + sum(r));
  }
}

Once you realize that pattern matching is a universal operation and therefore all you ever need out of a data structure (if we ignore performance concerns, that is), it becomes evident that you can get rid of the record middlemen and take visit as the data structure itself:

@FunctionalInterface
interface BinaryTree<T> {

  <R> R visit(Function<T, R> onLeaf, BiFunction<BinaryTree<T>, BinaryTree<T>, R> onFork);

  static <T> BinaryTree<T> ofLeaf(T value) {
    return (onLeaf, onFork) -> onLeaf.apply(value);   // does not compile; see below
  }

  static <T> BinaryTree<T> ofFork(BinaryTree<T> left, BinaryTree<T> right) {
    return (onLeaf, onFork) -> onFork.apply(left, right);   // does not compile; see below
  }
}

No records involved. In fact, notice how there are no data structures in a classical sense whatsoever. Instead, this implementation uses pure closures as data containers.

This would be a valid representation of a binary tree; alas, Java does not permit lambda expressions to implement interface methods with type parameters, so you have to forgo the syntactic sugar and write it out in the old-fashioned way:

interface BinaryTree<T> {

  <R> R visit(Function<T, R> onLeaf, BiFunction<BinaryTree<T>, BinaryTree<T>, R> onFork);

  static <T> BinaryTree<T> ofLeaf(T value) {
    return new BinaryTree<>() {
      @Override
      public <R> R visit(Function<T, R> onLeaf, BiFunction<BinaryTree<T>, BinaryTree<T>, R> onFork) {
        return onLeaf.apply(value);
      }
    };
  }

  static <T> BinaryTree<T> ofFork(BinaryTree<T> left, BinaryTree<T> right) {
    return new BinaryTree<>() {
      @Override
      public <R> R visit(Function<T, R> onLeaf, BiFunction<BinaryTree<T>, BinaryTree<T>, R> onFork) {
        return onFork.apply(left, right);
      }
    };
  }
}

More verbose, but the same thing, and this time it compiles and works just fine.
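As a quick check, the sum function from above works unchanged on the closure-based representation:

BinaryTree<Integer> tree =
    BinaryTree.ofFork(
        BinaryTree.ofLeaf(1),
        BinaryTree.ofFork(BinaryTree.ofLeaf(2), BinaryTree.ofLeaf(3)));

Integer six = BinaryTreeOps.sum(tree);   // 1 + 2 + 3 = 6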

This encoding of an algebraic data structure is called the Scott encoding of the data structure.

Clearly it is not very Enterprise-ready, but it does demonstrate how you can summon data structures out of nothing in a language that supports lexical closures.

Matthias #

Simulating bad drive blocks with Device Mapper

Say you have a 0.5 MiB (= 1,024 sectors of 512 bytes each) drive at /dev/loop0 and would like to boot it with QEMU while simulating a broken sector at position 256.

You can use dm-error for this.
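(If you need such a device to play with, you can create one from a scratch file; the file name is arbitrary:)

dd if=/dev/zero of=drive.img bs=512 count=1024
losetup /dev/loop0 drive.img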

Write the following into a file and call it broken-drive.dm:

0 256 linear /dev/loop0 0
256 1 error
257 767 linear /dev/loop0 257

Alternatively, you can make use of dm-flakey to simulate a sector that is only sometimes broken, or that does something even worse, such as dropping any writes made to it. For example:

0 256 linear /dev/loop0 0
256 1 flakey /dev/loop0 256 5 5
257 767 linear /dev/loop0 257

Refer to the documentation of dm-flakey for how exactly it works and what parameters it expects.

Create a virtual device at /dev/mapper/broken-drive using dmsetup create:

dmsetup create broken-drive <broken-drive.dm

You can now use it with QEMU just like any other drive or drive image.
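For example (a hypothetical invocation; adjust machine type and options to taste):

qemu-system-x86_64 -drive file=/dev/mapper/broken-drive,format=raw,if=virtio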

Matthias #

Comments are now supported

This web site now supports comments.

Comments are simple and anonymous by default. They support a restricted number of HTML tags and their Markdown equivalents:

  • blocks: p, blockquote, pre
  • inline style: em, strong, code, sub, sup, s, ins, del
  • lists: ul, ol, li, dl, dt, dd
  • accessibility hints: abbr, acronym

To comment, click the permalink hash mark (#) at the top of the post in question. Be sure to enable JavaScript in your web browser.

Matthias #

YAML style recommendations

While I’m not a particularly big fan of YAML overall, it does have some clear benefits over both JSON and XML for human-editable configuration files. And while there are some pretty compelling alternatives, YAML is currently ubiquitous, so it makes sense to make the best of it.

Here are my favorite simple style rules. I apply them aggressively whenever I see some YAML that isn’t bolted to the wall and fixed with superglue.

Rule 1. Indent enumerations.

Rule: Indent enumerated items relative to the key that defines them.

Reason: Adhering to the Rectangle Rule makes it easier to find where blocks begin and end.

Bad:

list1:
- item1
- item2
list2:
- item3
- item4

Good:

list1:
  - item1
  - item2
list2:
  - item3
  - item4

This is by far my favorite rule. Even if you do nothing else, it improves the readability of any YAML file by an order of magnitude.

Rule 2. Separate nested blocks by white space.

Rule: Use empty lines to separate blocks of nested content.

Hint: A good rule of thumb is that any block that has children should be surrounded by white space on both sides.

Reason: Empty lines make it easier to find where blocks begin and end.

Bad:

metadata:
  name: mulkcms2
  namespace: mulk
  labels:
    name: mulkcms2-web
    app: mulkcms2
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mulkcms2
      group: mulk
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%

Good:

metadata:
  name: mulkcms2
  namespace: mulk

  labels:
    name: mulkcms2-web
    app: mulkcms2

spec:
  replicas: 1

  selector:
    matchLabels:
      app: mulkcms2
      group: mulk

  strategy:
    type: RollingUpdate

    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%

This is a rule that I apply not just to YAML, but to any type of code whatsoever. It helps not just with readability, but also with editability (by enabling me to navigate by block).

Matthias #

Well-maintained (or not) OpenJDK Docker images

Here is a list of major OpenJDK vendors and the container images they offer.

Vendor     Image name                                          Tag         Release cycle  Base OS              Remarks
Azul       docker.io/azul/zulu-openjdk                         17          LTS            Ubuntu
Azul       docker.io/azul/zulu-openjdk-alpine                  17          LTS            Alpine Linux
Azul       docker.io/azul/zulu-openjdk-centos                  17          LTS            CentOS
Azul       docker.io/azul/zulu-openjdk-debian                  17          LTS            Debian
BellSoft   docker.io/bellsoft/liberica-openjdk-alpine          17          LTS            Alpine (glibc)
BellSoft   docker.io/bellsoft/liberica-openjdk-alpine          latest      non-LTS        Alpine (glibc)
BellSoft   docker.io/bellsoft/liberica-openjdk-alpine-musl     17          LTS            Alpine (musl)
BellSoft   docker.io/bellsoft/liberica-openjdk-alpine-musl     latest      non-LTS        Alpine (musl)
BellSoft   docker.io/bellsoft/liberica-openjdk-centos          17          LTS            CentOS
BellSoft   docker.io/bellsoft/liberica-openjdk-centos          latest      non-LTS        CentOS
BellSoft   docker.io/bellsoft/liberica-openjdk-debian          17          LTS            Debian
BellSoft   docker.io/bellsoft/liberica-openjdk-debian          latest      non-LTS        Debian
Eclipse    docker.io/library/eclipse-temurin                   latest      non-LTS        Ubuntu               recommended non-LTS (1)
Eclipse    docker.io/library/eclipse-temurin                   17          LTS            Ubuntu               recommended LTS (1)
Eclipse    docker.io/library/eclipse-temurin                   17-alpine   LTS            Alpine
Google     gcr.io/distroless/java17-debian11                   latest      LTS            Debian
Microsoft  mcr.microsoft.com/openjdk/jdk                       17-ubuntu   LTS            Ubuntu
Microsoft  mcr.microsoft.com/openjdk/jdk                       17-mariner  LTS            CentOS (derivative)
Microsoft  mcr.microsoft.com/openjdk/jdk                       17-cbld     LTS            Debian (derivative)
Oracle     container-registry.oracle.com/java/openjdk          latest      non-LTS        Oracle Linux         recommended non-LTS (2)
Red Hat    registry.access.redhat.com/ubi8/openjdk-17          latest      LTS            RHEL (UBI) (4)       recommended LTS (3)
Red Hat    registry.access.redhat.com/ubi8/openjdk-17-runtime  latest      LTS            RHEL (UBI) (4)

General remarks:

As is apparent from the list, most vendors do not offer a rolling non-LTS image. Be careful when using a non-LTS image pinned to a specific version, as its time under support will be quite limited. Rolling non-LTS images that always update to the latest OpenJDK version are fine (and may in fact be more secure and reliable than any LTS image, considering that the OpenJDK Updates projects primarily consist of backports from later versions).

Generally speaking, Docker images, particularly OpenJDK images, tend to lag behind the latest package updates of the base OS underlying them. It is probably a good idea to build your own runtime image (perhaps based on something like UBI Micro (manual)) and keep it up to date through a nightly CI job.
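A minimal sketch of what such an image build might look like (the base images and the module list are assumptions; adapt them to your application):

FROM docker.io/library/eclipse-temurin:17 AS builder
RUN jlink --add-modules java.base,java.logging \
    --strip-debug --no-man-pages --output /opt/jre

FROM registry.access.redhat.com/ubi8/ubi-micro
COPY --from=builder /opt/jre /opt/jre
ENV PATH=/opt/jre/bin:$PATH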

I cannot recommend any Alpine-based images at present because the ecosystem has too many dependencies on glibc specifics, and using glibc on Alpine is not a supported configuration.

Footnotes:

  1. Being a widely deployed image with lots of attention given to it, the Temurin image is probably the one you want if you prefer Ubuntu over RHEL.

  2. Oracle is the main sponsor of OpenJDK. New OpenJDK releases tend to find their way into their image promptly. Oracle Linux is also a generally well-maintained and secure base; do note, however, that the OpenJDK image is typically only updated when a new OpenJDK is released, so you have to install system package updates yourself.

  3. Red Hat is the second-largest contributor to OpenJDK (after Oracle), one of the sponsors of the OpenJDK 17 Updates project, and typically quick to release security patches. UBI8 is also a well-maintained and secure image base.

  4. UBI is a trimmed-down version of RHEL that Red Hat distributes free of charge as part of their container image offerings.

Matthias #

How do I create smaller initramfs images on Ubuntu?

If you are running Ubuntu, your initramfs images may be quite large. On my system, for instance, each initramfs took up about 100 MiB of space. Because I did not pay enough attention when setting up the computer, I ended up with a very small boot partition, which prompted me to look for a way to make the initramfs images generated by update-initramfs smaller.

Caution: Any of the below may make your system unbootable. Please only copy the steps if you understand what they do.

Step 1. Fewer kernel modules.

Create a file called, say, /etc/initramfs-tools/conf.d/zzz-custom (the exact name of the file does not matter much) and fill it with the following:

MODULES=dep

This causes mkinitramfs to guess the set of kernel modules required to boot your system based on what is currently loaded and what hardware is present instead of indiscriminately including whatever could be useful to make a computer boot.
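For the change to take effect, regenerate the images:

update-initramfs -u -k all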

This saved me about 50 MiB, reducing the size of the initramfs from 100 MiB to 50 MiB.

Step 2. No GPU.

Assuming you do not need to interact with the initramfs (to debug boot problems or to type in a disk encryption password, say), you can disable the scripts that deal with setting up a graphics frame buffer. Doing so gets rid of GPU firmware, which at least for the amdgpu driver is a pretty sizable amount of data.

Adding the following to /etc/initramfs-tools/conf.d/zzz-custom may or may not be good enough:

FRAMEBUFFER=n

In my case it was not good enough. Since I boot from ZFS, the /usr/share/initramfs-tools/conf-hooks.d/zfs hook was active, forcing FRAMEBUFFER to y regardless of what /etc/initramfs-tools/conf.d says.

But of course you can add your own configuration hook to /usr/share/initramfs-tools/conf-hooks.d. You just have to ensure it runs after the zfs one by giving it a lexicographically higher name. So create a file called /usr/share/initramfs-tools/conf-hooks.d/zzz-custom and fill it with the same content as above:

FRAMEBUFFER=n

As long as you do not need a disk encryption passphrase prompt in the initramfs, this should not break anything. If you do, it is probably a bad idea.

This saved me another 30 MiB, reducing the size of the initramfs from 50 MiB to 20 MiB.

Matthias #

How do I access binary data using Vert.x-redis or Quarkus Redis Client?

Vert.x-redis provides the RedisAPI class to access most Redis functionality. The methods provided by RedisAPI only take Strings as arguments, which it encodes in UTF-8. With Quarkus you get to choose between RedisClient and ReactiveRedisClient, both of which mirror RedisAPI, again only containing String-based methods.

So how do you store binary strings?

With Vert.x-redis, use the send method on the Redis class, which is the lower-level interface underlying RedisAPI. Similarly, with Quarkus, instead of injecting a RedisClient or ReactiveRedisClient, inject the lower-level MutinyRedis:

@Inject
MutinyRedis mutinyRedis;

Uni<Void> set(String key, byte[] data) {
    return mutinyRedis
        .send(
            Request.cmd(Command.SET)
                .arg(key)
                .arg(data))
        .replaceWithVoid();
}

Uni<byte[]> get(String key) {
    return mutinyRedis
        .send(
            Request.cmd(Command.GET)
                .arg(key))
        .onItem().ifNotNull()
        .transform(Response::toBytes);
}
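A hypothetical round trip through these helpers (blocking for brevity; avoid this on an event-loop thread):

byte[] stored = {1, 2, 3};
set("blob", stored).await().indefinitely();
byte[] loaded = get("blob").await().indefinitely();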
Matthias #

I tried out GNOME 40 today.

I had already read about it online, with quite a few articles suggesting that it was going in the wrong direction UI-wise. One common complaint was that it seemed more touch-oriented and less efficient to use with a mouse.

After trying it out, I am pleasantly surprised. I find I am quite happy with the gesture-based input paradigm that GNOME 40 is built around. In particular, navigating between the activity overview and the main desktop has become significantly quicker for me.

Perhaps my fondness for the MacBook’s way of treating its trackpad as a first-class input device has rubbed off on me and I am now attuned to it. What is surprising, however, is that notwithstanding the substantial amount of time I have spent in macOS, I seem to prefer GNOME’s implementation over the Mac’s. It feels, to me, even more efficient and to the point.

Matthias #

Even with popular support for the CSU dwindling, it still seems unlikely for the left to capture a significant number of direct mandates (first-vote seats assigned by first-past-the-post voting) in Bavaria due to vote splitting between the three left-leaning parties (the Left Party, the Social Democrats, and the Greens).

If I were responsible for any of the three parties’ Bavarian election strategies, I would propose to the other two to hold a joint primary election in each voting district. This would likely improve the odds for both the Social Democrats and the Greens while not really changing them for the Left Party (while still giving them the benefit of hurting the other end of the political spectrum plus the opportunity to build alliances within their own).

Matthias #

The big issue that I have with how the German government does COVID politics is that they can’t seem to handle the trolley problem.

The default choice appears to be inaction. A decision to act is made only when there is (seemingly) irrefutable scientific consensus that it causes no harm. That is arguably the correct approach for a doctor to take with a patient who cannot make an informed decision for themselves. But a pandemic is more akin to war than to a doctor’s visit. There will be casualties, some of them innocent bystanders—the only question is how many. In that kind of situation, inaction is not the safe choice. There is no safe choice.

Matthias #

I just finished migrating my Mailcow installation from a native Kubernetes port that I had made by hand, and that was becoming impossible to update, to a more streamlined (albeit wasteful with resources) deployment in which I wrap docker-compose and a dedicated Docker instance in a Kata container that I run as a Kubernetes pod.

The container is built as a Nix expression and is available in my public Kubeia repository, which contains part of my Kubernetes deployment configuration and image build scripts. A README is available too, in case you would like to try running it yourself.

Do note that the Kubernetes deployment file is just provided as an example. It contains some pretty specific references to my particular deployment – view it as something like a template that you will have to copy and fill with your own data. Another idiosyncrasy is that I really, really dislike running multiple database servers on a single piece of hardware, so I kluged something in that makes Mailcow use my already provisioned MariaDB instance rather than its own. In other words, your mileage may very much vary.

Matthias #

On unsafe Rust

A common misunderstanding is that unsafe Rust is more liberal than safe Rust. It is not. The same invariants apply, but instead of the compiler, it’s your job to uphold them.

Unsafe Rust is what you write when in any other language you would have written a C extension. It is a very rare thing to do.

Matthias #

I was recently made aware that double forms as in “Studentinnen und Studenten” do not count as inclusive, since they include only women and men, but not people who see themselves as neither. Only with the gender star, I was told, is it inclusive: “Student*innen”.

Let me propose an alternative. How about we went back to adopting more Anglo-Saxon culture and simply said “Studenten”? Revolutionary, I know.

Matthias #

Remember: The closer we get to a vaccine, the easier it is to justify a stricter lockdown.

If the logic behind that statement doesn’t seem obvious to you, think about the extreme cases. If we were in a pandemic with no chance of ever getting rid of it or finding any sort of treatment, a lockdown would make little sense: since everyone would contract the virus eventually, very few people would be saved by the measures, but more people would suffer or even die due to the economic consequences of a lockdown whose duration would have to be indefinite. If, on the other hand, we were just two weeks away from eradicating the pandemic at the snap of a finger, then it would clearly be correct to impose a strict lockdown for those two weeks in order to maximize the number of lives saved, since each person who manages to avoid the virus for just another two weeks would be saved from it for good.

Matthias #

Guess what one of the top disk latency inducers is on my (functionally mostly idle) server.

# zfsslower
Tracing ZFS operations slower than 10 ms
TIME     COMM           PID     T BYTES   OFF_KB   LAT(ms) FILENAME
09:14:10 async_49       2675004 S 0       0          18.29 journal.jif

The mysteriously named async_49 represents Mnesia as used by RabbitMQ as part of… Zulip.

Have I mentioned that a Zulip instance hosting 3 users is a waste of resources? Oh, I have, haven’t I?

Matthias #

How do I fix CGit’s display of a repository’s time of last update?

If you copied Git repositories into CGit at one point, you may have done so without keeping their mtimes intact. In this case, CGit will display an incorrect time of last update for the affected repositories, as it does not determine it based on the most recent Git commit but rather the time the default branch was last touched on the local file system.

By default, CGit uses the mtime of refs/heads/master (assuming that master is your default branch) to determine the time of last update. So, from inside the repository, this is how you can fix the displayed time to match the commit date of the most recent commit:

touch -c refs/heads/master -t $(date +"%Y%m%d%H%M.%S" --date=@$(git show -s --format=%ct HEAD))
Matthias #

You can now subscribe to this web site via a weekly email newsletter. The content is the same as in the public news feed. Be warned: The code is beta quality. The very first issue of the newsletter is also going to contain all posts ever made up to this point.

Matthias #

On Reddit there is currently a discussion going on about the newly written Haskell committee guidelines for respectful communication. There’s nothing new about it—one side says it’s long overdue, the other that it’s the usual overly draconian, one-sided snowflake nonsense, with little nuanced commentary in between.

Now, I haven’t read the document being discussed nearly carefully enough, and from the cursory look I’ve given it, it actually looks like one of the better ones: I can find little in it to disagree with. So I’ll not be commenting on it specifically. But I do have an opinion on codes of conduct like it in general, and I’ll make use of its wording to illustrate it (in part because it feels like such an okay code of conduct to me, all things considered).

So let’s look at codes of conduct in more generality.

In my opinion, while their goals are clearly noble and worthy of support, the concrete regulations stipulated in such documents often go way overboard and, if implemented, can be harmful and unfair to the parties involved.

To illustrate why, take the following statement the document linked above makes:

In our communication, we consistently honour and affirm the … good intentions of others.

Which sounds very reasonable. But then contrast it with this part:

Our response should usually be to apologise … Even if we feel we have been misinterpreted or unfairly accused, the chances are good there was something we could have communicated better …

There is an apparent contradiction here depending on how you read the word “should.” Under the assumption that person B accuses person A of insulting them, should we:

  1. assume that person A had only good intentions and therefore there is no need to apologize, or
  2. assume that person A is obliged to apologize because if in doubt, person B was probably right that person A said something wrong?

I think the fundamental problem lies in the three different forms that the communication in question takes and the nontrivial mappings between them:

                               ┌────┐  ┏━━━━━━━━━━━━━━┓   ┌────┐                   
╔═══════════════════╗          │g[B]│  ┃              ┃   │f[A]│                   
║realm of expression║         ┌┴────┴──┃ what is said ┃◀─┴────┴┐                  
╚═══════════════════╝         │        ┃              ┃         │                  
                              │        ┗━━━━━━━━━━━━━━┛         │                  
                              ▼                                │                  
                       ┏━━━━━━━━━━━━┳───────────┐       ┏━━━━━━━━━━━━┳───────────┐
╔═══════════════════╗  ┃  what is   │ Person B  │       ┃  what is   │ Person A  │
║ realm of meaning  ║  ┃ understood ┣───────────┘       ┃  intended  ┣───────────┘
╚═══════════════════╝  ┗━━━━━━━━━━━━┛                   ┗━━━━━━━━━━━━┛            

If what was said causes person B to be offended, do you automatically assume that what was intended was an offense? Or, if person A maintains that they did not mean to offend, do you completely discount what was understood? Is either extreme reasonable? Clearly not. Hence the contradictory phrasing in the document.

What you need to realize to untangle this mess is that both f and g are dependent on both the person executing them and the situation that they are in at the moment they do so (which I’m going to simplify notationally by assuming that the situation is a part of the person). With this realization we can now rephrase the problem:

How much responsibility do we put on person A to anticipate g[B] for any given person B?

Clearly we cannot expect person A to anticipate g[B] for all possible B in a discussion on a public mailing list, since the space {g[B] | B ∈ Audience} is, for all intents and purposes, infeasible to compute when Audience is sufficiently large. On the other hand it is also obvious that if what is said is in the context of a face-to-face meeting between two individuals who know each other well, the challenge is much simpler and we can expect person A to be more considerate of what they can reasonably expect person B to understand based on what they express.

(What people will actually do in practice when they do not know the audience well is to substitute B := A, which may seem overly simplistic, but is really as good an approximation as any when Audience = World.)

In practice, what this means is that as an outside observer C, the wise thing to do is probably to assume both that:

  1. when evaluating person B’s conduct, the only correct way to interpret what was said is in the most malevolent way reasonably interpretable, and
  2. when evaluating person A’s conduct, the only correct way to interpret what was said is in the most benevolent way reasonably interpretable.

Which results in the contradiction mentioned previously. As far as I know, there is no way to resolve it, so all we can do is accept it.

Matthias #

Zulip is ridiculous.

Here are the biggest RAM eaters in my Kubernetes cluster, all of which are pretty much completely idle right now:

NAME            CPU(cores)   MEMORY(bytes)
gerrit          5m           315Mi
keycloak        3m           440Mi
mulkcms2        2m           325Mi
zulip           31m          2698Mi

Remember, Keycloak is built on top of JBoss.

Let that sink in for a bit.

What’s going on there? Well, Zulip consists of 20 microservices. It’s the modern way.

Matthias #

How to switch from Docker to Containerd on a kubeadm-managed Kubernetes node:

  1. Install Containerd.

  2. Edit /etc/default/kubelet and add the following line:

    KUBELET_EXTRA_ARGS=--container-runtime=remote --runtime-request-timeout=15m --container-runtime-endpoint=unix:///run/containerd/containerd.sock

    If KUBELET_EXTRA_ARGS exists already, add the additional parameters to it instead.

  3. Restart kubelet.

  4. Uninstall Docker.

  5. To make kubeadm aware of the change (so that kubeadm upgrade apply works): Let ${NODE_NAME} be the name of your node. Run:

    kubectl edit node ${NODE_NAME}

    Look for the kubeadm.alpha.kubernetes.io/cri-socket annotation. Change its value to /run/containerd/containerd.sock.

Now everything should be set up for Containerd and you can do such fun things as running Kubernetes pods in Kata containers.

Presumably, these instructions will work for a migration to CRI-O and other container runtimes as well, but I have not tried.
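To double-check which runtime each node now reports, you can look at the CONTAINER-RUNTIME column of:

kubectl get nodes -o wide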

Matthias #

This website now has a search bar.

Since I am using PostgreSQL, which has a basic form of full text indexing built in, it was pretty easy to implement. You can find the implementation on Gerrit.

One interesting question was how to integrate a clause using the full-text search operator @@ into a Criteria query. I experimented a bit and found that if you define an IMMUTABLE function and use it in a query, PostgreSQL has no trouble inlining and optimizing it, so it’s a great way to call into PostgreSQL-specific functionality from within a Criteria query. For instance, it will make perfectly fine use of indices where possible:

mulkcms=# EXPLAIN
            SELECT cached_description_html 
              FROM benki.bookmark_texts
             WHERE post_matches_websearch(search_terms, 'en', 'test');

                        QUERY PLAN
---------------------------------------------------------------
 Bitmap Heap Scan on bookmark_texts
   Recheck Cond: (search_terms @@ '''test'''::tsquery)
   ->  Bitmap Index Scan on bookmark_texts_search_terms_idx
         Index Cond: (search_terms @@ '''test'''::tsquery)

Just as you expect from a high-quality database system.
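For reference, the function in question is defined along these lines (a sketch, not the exact definition from the repository; it assumes a text search configuration matching the language code exists):

CREATE FUNCTION post_matches_websearch(terms tsvector, lang regconfig, query text)
  RETURNS boolean
  LANGUAGE SQL IMMUTABLE AS $$
    SELECT terms @@ websearch_to_tsquery(lang, query)
  $$;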

Matthias #

How to run Docker in a Kubernetes pod powered by a Kata container:

  1. Make sure that you are running containerd >= 1.3.
  2. Configure containerd as described at Kata Containers as a Runtime Class in the Kata Containers documentation.
  3. Add privileged_without_host_devices = true to the [plugins.cri.containerd.runtimes.kata] section of containerd’s config.toml file. This ensures that privileged Kata containers can only access the guest VM managed by the Kata containers runtime and not also the host system.
  4. Create a Kubernetes pod running an ubuntu:20.04 container with securityContext: {privileged: true} set and runtimeClassName: kata (see the example manifest after this list). You may wish to double-check that host devices are really inaccessible (for example by checking whether the host’s root disk is visible in /dev) before you proceed.
  5. Enter the Kubernetes pod, install Docker by running apt update; apt install -y --no-install-recommends docker.io, and type dockerd --storage-driver=vfs. Docker should now be running.
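A minimal manifest for step 4 might look like this (all names are placeholders):

apiVersion: v1
kind: Pod
metadata:
  name: docker-in-kata

spec:
  runtimeClassName: kata

  containers:
    - name: docker
      image: ubuntu:20.04
      command: ["sleep", "infinity"]
      securityContext:
        privileged: true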

If you are migrating an existing kubeadm-managed, Docker-based cluster to Containerd, see my post on how to migrate kubeadm to Containerd.
