Benki → All Posts


A dependently typed, imperative, bare-metal programming language.

Code manual (i.e. documentation) generator with Javadoc and OpenAPI integration.

His preferred tax system: a personal expenditures tax. It works as follows:

  • Idea: Tax only consumption, but in a progressive way.
  • How? Tax people’s income (similar to how it is done today, but regardless of the source of income), deducting what they leave in savings (be it checking accounts, stocks, whatever).

What’s nice about this is that it is income-agnostic (capital gains and wages are taxed the same) and progressive while still incentivizing people to save.

Plus a Georgist land value tax, of course.

A small functional programming and proof language with accessible syntax.

Runtime for functional programs. Claims to be very efficient. Lazy, garbage-collection-free, parallel, and beta-optimal.

Comparison between different monadic effect systems.

I am still hoping Scala 3 gets a built-in effect system eventually so that all of this can go away. Monadic style does not fit the language well.

Functional programming language for the JVM with row types, algebraic effects, and embedded Datalog.

A framework for secure booting, building on top of whatever secure booting functionality the firmware provides and then taking over from there.

Matthias #

Netty, gRPC, and accounting for direct memory allocation

When the Java Virtual Machine determines whether it may grow the heap or has to force the garbage collector to run in order to free space, it relies on the information it has about the various memory spaces that it is managing. In general, however, this is not all the memory that the application uses. First, there is the C heap, which native libraries may allocate memory from. Second, the Java Virtual Machine itself allocates some memory outside the Java heap to perform its basic operations. And then there is off-heap memory allocated by Java code, a mechanism that is called direct allocation.

Native Memory Tracking, turned on by the -XX:NativeMemoryTracking=summary command-line flag, makes the Java Virtual Machine collect statistics on how much native memory of each kind it is using for its own purposes and expose them through jcmd. Direct allocations are listed in the “Other” category.
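For example, a session might look like this (app.jar and the process id are placeholders):

```shell
# Start the JVM with native memory tracking enabled …
java -XX:NativeMemoryTracking=summary -jar app.jar

# … then, from a second terminal, ask it for a summary.
# Direct allocations appear in the “Other” category.
jcmd <pid> VM.native_memory summary
```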

If you use ByteBuffer#allocateDirect to allocate a direct ByteBuffer, the Java Virtual Machine will keep track of the allocation. If you have set a memory limit with the help of the -XX:MaxRAMPercentage= command-line flag, it will also reduce the maximum heap size accordingly.
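As a minimal illustration of such a tracked direct allocation:

```java
import java.nio.ByteBuffer;

public class DirectBufferExample {
  public static void main(String[] args) {
    // 1 MiB allocated outside the Java heap; the JVM counts this
    // against its direct-memory accounting.
    ByteBuffer buffer = ByteBuffer.allocateDirect(1 << 20);
    buffer.putInt(0, 42);
    System.out.println(buffer.isDirect());  // prints true
    System.out.println(buffer.getInt(0));   // prints 42
  }
}
```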

Now, for better or worse, there are ways to circumvent ByteBuffer#allocateDirect, and some Java libraries do just that in the name of performance, either by calling methods of sun.misc.Unsafe or by calling into native libraries. One particularly popular offender is Netty.

There are several different knobs you can turn. For example, you can run your program with -Dio.netty.noUnsafe=true to disable the use of sun.misc.Unsafe; you can try -Dio.netty.allocator.type=unpooled -Dio.netty.noPreferDirect=true to avoid direct allocations as far as possible; or (and this is what I recommend) you can run your program with -Dio.netty.maxDirectMemory=0, which makes direct allocations go through ByteBuffer#allocateDirect without inhibiting other unsafe performance shenanigans.

Now with that out of the way, you would think that Netty behaves. But perhaps you are using gRPC in your code base. If you expected the above system properties to have an effect there, you would be wrong: gRPC ships with its own shaded Netty, which changes the names of the system properties that it reads.

So what you actually have to do is set -Dio.netty.maxDirectMemory=0 -Dio.grpc.netty.shaded.io.netty.maxDirectMemory=0.
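Putting it all together, a launch command might look like this (the jar name and the percentage are placeholders; the flags are the ones discussed above):

```shell
java \
  -XX:MaxRAMPercentage=75.0 \
  -Dio.netty.maxDirectMemory=0 \
  -Dio.grpc.netty.shaded.io.netty.maxDirectMemory=0 \
  -jar app.jar
```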

Now the Java Virtual Machine knows what is going on and limits allocations and the heap size according to -XX:MaxRAMPercentage= to the best of its ability.

Converts a pattern description in English into a regular expression with the help of artificial intelligence.

A dependently-typed programming language and interactive theorem prover with an extrinsic type system based on the untyped lambda calculus.

Matthias #

Boehm–Berarducci encoding in Java

Last time we looked at how to apply Scott encoding to algebraic data types in Java. This time we are going to look at Boehm–Berarducci encoding, which is related.

Again we take the well-known persistent binary tree data type with labeled leaves:

Bᵨ := ρ + (Bᵨ × Bᵨ)

Or, in Haskell notation:

data BinaryTree t = Leaf t | Fork (BinaryTree t) (BinaryTree t)

Recall the Scott encoding of this data structure:

interface BinaryTreeScott<T> {
  <R> R visit(Function<T, R> onLeaf, BiFunction<BinaryTreeScott<T>, BinaryTreeScott<T>, R> onFork);
}

From a type-theoretic perspective, there is a problem with this: BinaryTreeScott<T> is defined in terms of itself. Perhaps you want to work in a type system that does not support recursive types.

Encoding

The Boehm–Berarducci encoding comes to the rescue. For our binary tree example it is:

interface BinaryTreeBoehmBerarducci<T> {
  <R> R fold(Function<T, R> foldLeaf, BiFunction<R, R, R> foldFork);
}

Note how the only technical difference is in how the self-type is encoded. In other words, the two encodings coincide on non-recursive types.

The intuitive difference is that while the Scott encoding encodes a data type using its pattern match, Boehm–Berarducci encodes it using its fold.

Implementation

The implementation is straightforward:

interface BinaryTreeBoehmBerarducci<T> {

  <R> R fold(Function<T, R> foldLeaf, BiFunction<R, R, R> foldFork);

  static <T> BinaryTreeBoehmBerarducci<T> ofLeaf(T value) {
    // A lambda would read nicely here:
    //   return (foldLeaf, foldFork) -> foldLeaf.apply(value);
    // but it does not compile, because lambdas cannot implement generic methods like fold.

    return new BinaryTreeBoehmBerarducci<>() {
      @Override
      public <R> R fold(Function<T, R> foldLeaf, BiFunction<R, R, R> foldFork) {
        return foldLeaf.apply(value);
      }
    };
  }

  static <T> BinaryTreeBoehmBerarducci<T> ofFork(BinaryTreeBoehmBerarducci<T> left, BinaryTreeBoehmBerarducci<T> right) {
    //return (foldLeaf, foldFork) -> foldFork.apply(left.fold(foldLeaf, foldFork), right.fold(foldLeaf, foldFork));

    return new BinaryTreeBoehmBerarducci<>() {
      @Override
      public <R> R fold(Function<T, R> foldLeaf, BiFunction<R, R, R> foldFork) {
        return foldFork.apply(left.fold(foldLeaf, foldFork), right.fold(foldLeaf, foldFork));
      }
    };
  }
}

Summing up the leaves is now even easier than it was before, without even requiring explicit recursion:

class BinaryTreeBoehmBerarducciOps {

  static Integer sum(BinaryTreeBoehmBerarducci<Integer> b) {
    return b.fold(
        n -> n,
        (l, r) -> l + r);
  }
}
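To make the fold concrete, here is a self-contained sketch (with its own minimal copy of the interface from above) that builds a small tree and sums its leaves:

```java
import java.util.function.BiFunction;
import java.util.function.Function;

public class SumExample {

  // Minimal copy of the BinaryTreeBoehmBerarducci interface from the post.
  interface BinaryTreeBoehmBerarducci<T> {
    <R> R fold(Function<T, R> foldLeaf, BiFunction<R, R, R> foldFork);

    static <T> BinaryTreeBoehmBerarducci<T> ofLeaf(T value) {
      return new BinaryTreeBoehmBerarducci<>() {
        @Override
        public <R> R fold(Function<T, R> foldLeaf, BiFunction<R, R, R> foldFork) {
          return foldLeaf.apply(value);
        }
      };
    }

    static <T> BinaryTreeBoehmBerarducci<T> ofFork(
        BinaryTreeBoehmBerarducci<T> left, BinaryTreeBoehmBerarducci<T> right) {
      return new BinaryTreeBoehmBerarducci<>() {
        @Override
        public <R> R fold(Function<T, R> foldLeaf, BiFunction<R, R, R> foldFork) {
          return foldFork.apply(left.fold(foldLeaf, foldFork), right.fold(foldLeaf, foldFork));
        }
      };
    }
  }

  static int sum(BinaryTreeBoehmBerarducci<Integer> tree) {
    // The fold threads the recursion for us; no explicit traversal needed.
    return tree.fold(n -> n, Integer::sum);
  }

  public static void main(String[] args) {
    // The tree (1 (2 3)).
    var tree =
        BinaryTreeBoehmBerarducci.ofFork(
            BinaryTreeBoehmBerarducci.ofLeaf(1),
            BinaryTreeBoehmBerarducci.ofFork(
                BinaryTreeBoehmBerarducci.ofLeaf(2),
                BinaryTreeBoehmBerarducci.ofLeaf(3)));

    System.out.println(sum(tree)); // prints 6
  }
}
```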

Pattern matching

But how do you perform a mere pattern match without recursing? This has now become quite tricky to do.

Your first idea might be to convert the Boehm–Berarducci-encoded data type into its Scott encoding:

interface BinaryTreeBoehmBerarducci<T> {

  <R> R fold(Function<T, R> foldLeaf, BiFunction<R, R, R> foldFork);

  static <T> BinaryTreeBoehmBerarducci<T> ofLeaf(T value) { ... }
  static <T> BinaryTreeBoehmBerarducci<T> ofFork(BinaryTreeBoehmBerarducci<T> left, BinaryTreeBoehmBerarducci<T> right) { ... }

  default BinaryTreeScott<T> toScott() {
    return fold(
        BinaryTreeScott::ofLeaf,
        BinaryTreeScott::ofFork);
  }
}

This works, but now we depend on the recursively defined type BinaryTreeScott<T> again, which is what we wanted to avoid. To get around this limitation, we define a non-recursive helper type that fulfills the same role as BinaryTreeScott<T>, but based on BinaryTreeBoehmBerarducci<T>:

interface Deconstructor<T> {
  <W> W visit(Function<T, W> onLeaf, BiFunction<BinaryTreeBoehmBerarducci<T>, BinaryTreeBoehmBerarducci<T>, W> onFork);
}

Then we fold our BinaryTreeBoehmBerarducci<T> into a Deconstructor<T>, on which we can call visit to perform our pattern match:

interface BinaryTreeBoehmBerarducci<T> {

  <R> R fold(Function<T, R> foldLeaf, BiFunction<R, R, R> foldFork);

  static <T> BinaryTreeBoehmBerarducci<T> ofLeaf(T value) { ... }
  static <T> BinaryTreeBoehmBerarducci<T> ofFork(BinaryTreeBoehmBerarducci<T> left, BinaryTreeBoehmBerarducci<T> right) { ... }

  default <R> R visit(
      Function<T, R> onLeaf,
      BiFunction<BinaryTreeBoehmBerarducci<T>, BinaryTreeBoehmBerarducci<T>, R> onFork) {

    interface Deconstructor<T> {
      <W> W visit(Function<T, W> onLeaf, BiFunction<BinaryTreeBoehmBerarducci<T>, BinaryTreeBoehmBerarducci<T>, W> onFork);
    }

    return
        this.<Deconstructor<T>>fold(
                v ->
                    new Deconstructor<>() {
                      @Override
                      public <W> W visit(
                          Function<T, W> onLeaf1,
                          BiFunction<BinaryTreeBoehmBerarducci<T>, BinaryTreeBoehmBerarducci<T>, W> onFork1) {
                        return onLeaf1.apply(v);
                      }
                    },

                (left, right) ->
                    new Deconstructor<>() {
                      @Override
                      public <W> W visit(
                          Function<T, W> onLeaf1,
                          BiFunction<BinaryTreeBoehmBerarducci<T>, BinaryTreeBoehmBerarducci<T>, W> onFork1) {
                        return onFork1.apply(
                            left.visit(BinaryTreeBoehmBerarducci::ofLeaf, BinaryTreeBoehmBerarducci::ofFork),
                            right.visit(BinaryTreeBoehmBerarducci::ofLeaf, BinaryTreeBoehmBerarducci::ofFork));
                      }
                    })

            .visit(onLeaf, onFork);
  }
}

BinaryTreeBoehmBerarducci#visit now works as BinaryTreeScott#visit did before.

The only problem is that it is horrendously inefficient as it traverses the whole tree and constructs a complete mirror tree of Deconstructor<T> objects for just a single pattern match.

Optimization: lazy fold

We can remedy the pathological inefficiency by making the fold operation lazy in the recursive argument:

interface BinaryTreeBoehmBerarducciLazy<T> {
  <R> R fold(Function<T, R> foldLeaf, BiFunction<Supplier<R>, Supplier<R>, R> foldFork);
}

The complete code is thus:

@FunctionalInterface
public interface BinaryTreeBoehmBerarducciLazy<T> {

  <R> R fold(Function<T, R> foldLeaf, BiFunction<Supplier<R>, Supplier<R>, R> foldFork);

  static <T> BinaryTreeBoehmBerarducciLazy<T> ofLeaf(T value) {
    return new BinaryTreeBoehmBerarducciLazy<>() {
      @Override
      public <R> R fold(Function<T, R> foldLeaf, BiFunction<Supplier<R>, Supplier<R>, R> foldFork) {
        return foldLeaf.apply(value);
      }
    };
  }

  static <T> BinaryTreeBoehmBerarducciLazy<T> ofFork(BinaryTreeBoehmBerarducciLazy<T> left, BinaryTreeBoehmBerarducciLazy<T> right) {
    return new BinaryTreeBoehmBerarducciLazy<>() {
      @Override
      public <R> R fold(Function<T, R> foldLeaf, BiFunction<Supplier<R>, Supplier<R>, R> foldFork) {
        return foldFork.apply(() -> left.fold(foldLeaf, foldFork), () -> right.fold(foldLeaf, foldFork));
      }
    };
  }

  default <R> R visit(
      Function<T, R> onLeaf,
      BiFunction<BinaryTreeBoehmBerarducciLazy<T>, BinaryTreeBoehmBerarducciLazy<T>, R> onFork) {

    interface Deconstructor<T> {
      <W> W visit(Function<T, W> onLeaf, BiFunction<BinaryTreeBoehmBerarducciLazy<T>, BinaryTreeBoehmBerarducciLazy<T>, W> onFork);
    }

    return
        this.<Deconstructor<T>>fold(
                value ->
                    new Deconstructor<>() {
                      @Override
                      public <W> W visit(
                          Function<T, W> onLeaf1,
                          BiFunction<BinaryTreeBoehmBerarducciLazy<T>, BinaryTreeBoehmBerarducciLazy<T>, W> onFork1) {
                        return onLeaf1.apply(value);
                      }
                    },

                (left, right) ->
                    new Deconstructor<>() {
                      @Override
                      public <W> W visit(
                          Function<T, W> onLeaf1,
                          BiFunction<BinaryTreeBoehmBerarducciLazy<T>, BinaryTreeBoehmBerarducciLazy<T>, W> onFork1) {
                        return onFork1.apply(
                            left.get().visit(BinaryTreeBoehmBerarducciLazy::ofLeaf, BinaryTreeBoehmBerarducciLazy::ofFork),
                            right.get().visit(BinaryTreeBoehmBerarducciLazy::ofLeaf, BinaryTreeBoehmBerarducciLazy::ofFork));
                      }
                    })

            .visit(onLeaf, onFork);
  }
}