Benki → All Posts

⇠ previous page next page ⇢
Matthias #

Scott encoding in Java

Assume you have your typical persistent binary tree data type with labeled leaves:

Bᵨ := ρ + (Bᵨ × Bᵨ)

Or, in Haskell notation:

data BinaryTree t = Leaf t | Fork (BinaryTree t) (BinaryTree t)

We can represent it in modern Java in a straight-forward way:

sealed interface BinaryTree<T> {
  record Leaf<T>(T value) implements BinaryTree<T> {}
  record Fork<T>(BinaryTree<T> left, BinaryTree<T> right) implements BinaryTree<T> {}
}

How do you pattern-match on it?

You could wait for pattern matching for switch to exit preview. Then you will be able to write a function that sums up all the leaves in a tree of integers in a straight-forward way:

class BinaryTreeOps {

  static Integer sum(BinaryTree<Integer> b) {
    return switch(b) {
      BinaryTree.Leaf leaf -> leaf.value;
      BinaryTree.Fork fork -> sum(fork.left) + sum(fork.right);
    };
  }
}

If you do not want to wait, here is an alternative. Add a visit method that takes a deconstructor function for each case and implement it by calling the corresponding deconstructor in each case class:

sealed interface BinaryTree<T> {

    <R> R visit(Function<T, R> onLeaf, BiFunction<BinaryTree<T>, BinaryTree<T>, R> onFork);

    record Leaf<T>(T value) implements BinaryTree<T> {

      @Override public <R> R visit(Function<T, R> onLeaf, BiFunction<BinaryTree<T>, BinaryTree<T>, R> onFork) {
        return onLeaf.apply(value);
      }        
    }

    record Fork<T>(BinaryTree<T> left, BinaryTree<T> right) implements BinaryTree<T> {

      @Override public <R> R visit(Function<T, R> onLeaf, BiFunction<BinaryTree<T>, BinaryTree<T>, R> onFork) {
        return onFork.apply(left, right);
      }        
    }
}

Now you can sum up a tree of integers by folding it using the visit method:

class BinaryTreeOps {

  static Integer sum(BinaryTree<Integer> b) {
    return b.visit(
        n -> n, 
        (l, r) -> sum(l) + sum(r));
  }
}

Once you realize that pattern matching is a universal operation and therefore all you ever need out of a data structure (if we ignore performance concerns, that is), it becomes evident that you can get rid of the record middlemen and take visit as the data structure itself:

@FunctionalInterface
interface BinaryTree<T> {

  <R> R visit(Function<T, R> onLeaf, BiFunction<BinaryTree<T>, BinaryTree<T>, R> onFork);

  static <T> BinaryTree<T> ofLeaf(T value) {
    return (onLeaf, onFork) -> onLeaf.apply(value);   // does not compile; see below
  }

  static <T> BinaryTree<T> ofFork(BinaryTree<T> left, BinaryTree<T> right) {
    return (onLeaf, onFork) -> onFork.apply(left, right);   // does not compile; see below
  }
}

No records involved. In fact, notice how there are no data structures in a classical sense whatsoever. Instead, this implementation uses pure closures as data containers.

This would be a valid representation of a binary tree; alas, Java does not permit lambda expressions to implement interface methods with type parameters, so you have to forgo the syntactic sugar and write it out in the old-fashioned way:

interface BinaryTree<T> {

  <R> R visit(Function<T, R> onLeaf, BiFunction<BinaryTree<T>, BinaryTree<T>, R> onFork);

  static <T> BinaryTree<T> ofLeaf(T value) {
    return new BinaryTree<>() {
      @Override
      public <R> R visit(Function<T, R> onLeaf, BiFunction<BinaryTree<T>, BinaryTree<T>, R> onFork) {
        return onLeaf.apply(value);
      }
    };
  }

  static <T> BinaryTree<T> ofFork(BinaryTree<T> left, BinaryTree<T> right) {
    return new BinaryTree<>() {
      @Override
      public <R> R visit(Function<T, R> onLeaf, BiFunction<BinaryTree<T>, BinaryTree<T>, R> onFork) {
        return onFork.apply(left, right);
      }
    };
  }
}

More verbose, but the same thing, and this time it compiles and works just fine.

This encoding of an algebraic data structure is called the Scott encoding of the data structure.

Clearly it is not very Enterprise-ready, but it does demonstrate how you can summon data structures out of nothing in a language that supports lexical closures.

Economic modeling software by Steve Keen, the author of Debunking Economics.

A book about the great scientific discoveries of the 20th century. What makes this book special is that works with the actual original scientific papers.

An ML-family, statically typed, functional programming language for data-parallel computations. Targets CUDA, OpenCL, and CPUs.

An interpretation of quantum mechanics that—as far as I understand—is an extension of the Copenhagen interpretation, with the main difference being that it rejects the notion of a single physical reality that is independent of the observer. Each observer, it seems to go, experiences its own Everett branch. This resolves the measurement problem.

The main difference between it and Many Worlds seems to be that Everett branches are indexed by observers rather than possibilities. In other words, observers do not split. From this it seems to me that it follows that Everett branches cannot diverge under Relational Quantum Mechanics; as soon as two observers interact with each other, their realities must synchronize, which puts constraints on the set of possible histories.

To put it in terms of Schrödinger’s Cat: If the cat survives according to its own experience, then the physicist who opens its box must find it alive. This is not true under Many Worlds, where there will be copies of the physicist who find it alive and those who find it dead.

Personally speaking, as far as Copenhagen-based interpretations go, this one seems at least not quite so insane.

I still find Many Worlds (plus decoherence) more intuitive, though.

Interval tree clocks for Haskell.

An interval tree clock is like a vector clock except it can shrink as well as grow.

Matthias #

Simulating bad drive blocks with Device Mapper

Say you have a 0.5 MiB (= 1,024 sectors of 512 bytes each) drive at /dev/loop1 and would like to boot it with QEMU while simulating a broken sector at position 256.

You can use dm-error for this.

Write the following into a file and call it broken-drive.dm:

0 256 linear /dev/loop0 0
256 1 error
257 767 linear /dev/loop0 257

Alternatively, you can make use of dm-flakey to simulate a sector that is only sometimes broken, or that does something even worse such as drop any writes made to it. For example:

0 256 linear /dev/loop0 0
256 1 flakey /dev/loop0 256 5 5
257 767 linear /dev/loop0 257

Refer to the documentation of dm-flakey on how exactly it works and what the parameters are that it expects.

Create a virtual device at /dev/mapper/broken-drive using dmsetup create:

dmsetup create broken-drive <broken-drive.dm

You can now use it with QEMU just like any other drive or drive image.

A Java source code transformation engine, usable for refactoring, API migration, and such.

⇠ previous page next page ⇢