Sanely-automatic derivation - or how type class derivation works and why everyone else is doing it wrong

In Scala, we generate quite a lot of code. A lot of that involves the compiler generating so-called type class instances. This mechanism is quite widespread, yet very few people actually understand it. As a result, we, the community, have settled on suboptimal practices, resulting in suboptimal performance and user experience.

In this post, I’ll try to explain:

  • what is a “type class” and “type class derivation”
  • what is “automatic” and “semi-automatic” derivation
  • why there is a lot of misconception about them
  • why the Shapeless/Mirrors approaches have won so far
  • and why macros could be used to improve the UX in ways inaccessible to either Shapeless or Mirrors

In other words: I’ll try to explain how sanely-automatic derivation can mop the floor with anything based on the current mainstream approach, saving you frustration, time and CPU cycles.

If you are familiar with what a type class is and how automatic/semi-automatic derivation works, you might want to jump into The cost of “convenience”.

If you only want to read about the sanely-automatic derivation pattern, you can jump into the eponymous section - I’d still recommend reading why it would be better, I am not a fan of cargo cult programming.

Type classes

Putting aside all the Haskell-y esotericism, which might be interesting historical trivia but is hardly useful in the everyday $work, a “type class” is an interface. You define it as an interface, you implement it as an interface, and you use it as an interface.

It has one special purpose though: to be passed around (and maybe even instantiated) by the compiler. “But any interface might have multiple implementations!” you might object. Exactly. So to make it possible, we add some assumptions:

  • this interface has a type parameter (is a generic if you prefer)
  • each implementation is defined for a different type

Sounds kind of abstract, so maybe we need an example:

// An interface
trait SafePrinter[A] {
  def safeToString(value: A): String
}
object SafePrinter {
  given safeString: SafePrinter[String] =
    str => str
  given safeInt: SafePrinter[Int] =
    int => int.toString
  given safeArray[A](using A: SafePrinter[A]): SafePrinter[Array[A]] =
    array => array.map(A.safeToString).mkString("Array[", ", ", "]")
}

def safeToString[A](value: A)(using A: SafePrinter[A]): String =
  A.safeToString(value)

// Some custom type, that we can add an implementation for
case class Confidential[A](value: A) extends AnyVal
object Confidential {
  given safeConfidential[A]: SafePrinter[Confidential[A]] =
    _ => "[redacted]"
}

// usage
safeToString("value") // "value"
safeToString(10) // "10"
safeToString(Array(1, 2, 3)) // "Array[1, 2, 3]"
safeToString(Confidential("login:password")) // "[redacted]"

So, what happens here?

  • we define SafePrinter[A] as an interface - the intent is that we can convert a value to a String, removing some sensitive information (PII, credentials, etc), instead of relying on its toString method
  • we define its implementation for some types (String, Int, any Array that has an implementation for its elements)
  • we define some type (Confidential) and implement SafePrinter for it that would print just "[redacted]" when using our interface

It relies on the given/using mechanism, which we will explain in detail later, but for now you have to understand that:

  • when you define some value/method, you can “tag” it with given keyword
  • then, when you call a method, and its parameter list is “tagged” with using keyword, the compiler will look at all definitions available in the scope of that call, check if there is a definition tagged as given with its type matching the sought parameter’s type
  • if it can find exactly 1 such definition for each using parameter, it uses it (duh), but if it cannot, then the compilation fails

The last element of a “type class”, which is not mandatory but very common, are extension methods:

extension [A](value: A)
  def safeToString(using A: SafePrinter[A]): String = A.safeToString(value)

"value".safeToString // "value"
10.safeToString // "10"
Array(1, 2, 3).safeToString // "Array[1, 2, 3]"
Confidential("login:password").safeToString // "[redacted]"

When you combine: an interface with type params, using/givens resolving it by its type, and extension methods delegating implementation to that interface, what you get is a way to extend and customize the behavior even for the types that you are not controlling: coming from other people’s modules, external libraries, generated by codegens, etc.
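For instance, nothing stops us from providing an instance for a type we do not control. A minimal sketch (java.util.UUID is just an arbitrary “foreign” type picked for illustration):

import java.util.UUID

// keep only the last 4 characters, mask the rest
given safeUuid: SafePrinter[UUID] =
  uuid => "****-" + uuid.toString.takeRight(4)

safeToString(UUID.randomUUID()) // e.g. "****-d3ad"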

The main issue is: even if you find it useful, how do you provide an implementation for tens, maybe hundreds of data types that you have in your project?

Derivation

Let’s start by saying: if the implementation is somewhat special, and you cannot tell what it should do, just by looking at the type definition, you have to write that implementation by hand.

But quite a lot of implementations (that fit the type class pattern) just follow this recipe:

  • if it’s handling a case class, take each field’s value and run the implementation for the field’s type, and then combine the individual results
  • if it’s constructing a case class, for each field take the implementation that constructs its type, construct each field and call the constructor
  • if it’s an enum, pattern-match on the value, and for each case run the implementation for the subtype

Of course, it requires that each field’s/subtype’s implementation exists or can be constructed (by the same recipe).
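Following that recipe by hand for a small case class looks roughly like this (a sketch, reusing the SafePrinter instances defined earlier):

// a case class whose fields already have SafePrinter instances
case class Person(name: String, age: Int)

// the recipe applied manually: print each field with its own instance,
// then combine the individual results
given personPrinter(using
    name: SafePrinter[String],
    age: SafePrinter[Int]
): SafePrinter[Person] =
  person =>
    "Person(name = " + name.safeToString(person.name) +
      ", age = " + age.safeToString(person.age) + ")"

Writing this by hand for every type is exactly the boilerplate we would like the compiler to generate for us.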

// If this method worked...
def deriveSafePrinter[A]: SafePrinter[A] = ???

// ...then this data type...
case class Foo(a: Int, b: String, c: Array[Int])

// ...should get implementation for free, as long as:
//  - SafePrinter[Int] (for a)
//  - SafePrinter[String] (for b)
//  - SafePrinter[Array[Int]] (for c)
// exist.
deriveSafePrinter[Foo]

We could say that such Foo implementation would be derived from the Foo shape: its fields and their types. (So, it has nothing to do with integrals and derivatives).

But, how could we implement such a method?

Mirrors and inline defs

The trick to generating such implementations is based on 2 foundations:

  • you can convert to and from a tuple:
    • case class value can be converted to and from a tuple value
    • case class field names can be represented with a type that is a tuple of String literal singletons
    • enum type can be represented with a type that is a tuple of all subtypes
    • enum subtype names can be represented with a type that is a tuple of String literal singletons
  • tuples can be treated like lists:
    • EmptyTuple is similar to Nil
    • value *: tuple works similarly to value :: list
    • so you can start with an empty tuple and prepend values, one by one, until you build a whole tuple
    • but you can also pattern-match on a tuple, handling one value at a time
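A tiny illustration of that second point, in plain Scala 3 (nothing derivation-specific yet):

// a tuple built by prepending values to an empty tuple...
val tuple: Int *: String *: EmptyTuple = 1 *: "two" *: EmptyTuple

// ...and taken apart by pattern-matching, one element at a time
def describe(t: Tuple): String = t match {
  case EmptyTuple   => "end"
  case head *: tail => s"$head, then ${describe(tail)}"
}

describe(tuple) // "1, then two, then end"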

So e.g. creating a case class could be done by:

  • finding its field types represented as a tuple
  • constructing that tuple by prepending elements one by one
  • converting a tuple into a case class
// Some case class
case class Foo(int: Int, double: Double, string: String)

// Tuple with Foo's fields in the order of declaration
// Built by prepending values to an empty tuple
val fooTuple = 42 *: 0.7 *: "Wololusa" *: EmptyTuple

// Turning a tuple into a Foo
import scala.deriving.Mirror
val foo = summon[Mirror.ProductOf[Foo]].fromProduct(fooTuple)

// Turning whole Foo into a Tuple
val tupleFromFoo = Tuple.fromProduct(foo)

We can make it more generic thanks to this Mirror interface; we just need to build fooTuple programmatically:

import scala.compiletime.*

// Will make a tuple out of thin air.
// As long as you don't mind that it only:
//  - fills Int with 42
//  - fills Double with 0.7
//  - fills String with "Wololusa"
// and fails compilation for any other type of field.
inline def makeTuple[A <: Tuple]: A = inline erasedValue[A] match {
  case _: EmptyTuple =>
    // To prove that the constructed value is of
    // the type A without asInstanceOf :|
    inline EmptyTuple match {
      case b: A => b
    }
  case _: (Int *: a) =>
    inline (42 *: makeTuple[a]) match {
      case b: A => b
    }
  case _: (Double *: a) =>
    inline (0.7 *: makeTuple[a]) match {
      case b: A => b
    }
  case _: (String *: a) =>
    inline ("Wololusa" *: makeTuple[a]) match {
      case b: A => b
    }
}

import scala.deriving.Mirror

// Will create a tuple and convert it into a case class
// (if makeTuple supports its tuple representation)
inline def makeViaTuple[P](using P: Mirror.ProductOf[P]): P = {
  val tuple = makeTuple[P.MirroredElemTypes]
  P.fromProduct(tuple)
}
// Some case class
case class Foo(int: Int, double: Double, string: String)

// Weee, it works!
val foo = makeViaTuple[Foo]

// This already worked, so no surprise
val tupleFromFoo = Tuple.fromProduct(foo)

These examples introduced a few new concepts:

  • inline def - this function behaves as if we copy-pasted its code in the place where we are using it. Which means that we know exactly which types it uses. Which is useful for e.g.
  • inline match, where the pattern-matching can be done during compilation, removing all the branches that did not match. Since we have no value of the type which we are still constructing, we are matching against
  • erasedValue, which pretends that we have such a value - it would fail compilation if it ended up anywhere in the actual code… but since inline match erases the whole match (once it finds the matching branch), we can use it as something to match against
  • Mirror[A] - a value that has some types defined, e.g. P.MirroredElemTypes for a case class is a tuple of the types of its fields (and of the types of children for an enum), P.MirroredElemLabels is a tuple of String literal singleton types (which we could turn into Strings with ValueOf) that represent the names of the case class fields (or enum children)
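For instance, here is a minimal sketch of turning MirroredElemLabels into runtime Strings (it uses constValueTuple from scala.compiletime, which materializes the values of a tuple of literal singleton types):

import scala.compiletime.constValueTuple
import scala.deriving.Mirror

// materialize the field names of a case class as a List[String]
inline def fieldNames[A](using m: Mirror.ProductOf[A]): List[String] =
  constValueTuple[m.MirroredElemLabels]
    .productIterator
    .map(_.asInstanceOf[String])
    .toList

case class Foo(int: Int, double: Double, string: String)
fieldNames[Foo] // List("int", "double", "string")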

These utilities will come in handy in understanding how the derivation is actually implemented (with Mirrors).

Automatic vs semi-automatic

So far we were calling our instance-generating method ourselves. But the earlier examples used givens to provide an implementation to an extension method in a seamless way. Let’s do exactly that:

import scala.compiletime.*
import scala.deriving.Mirror

trait SafePrinter[A] {
  def safeToString(value: A): String
}
object SafePrinter {

  given safeString: SafePrinter[String] =
    str => str
  given safeLong: SafePrinter[Long] =
    long => long.toString
  given safeArray[A](using A: SafePrinter[A]): SafePrinter[Array[A]] =
    array => array.view.map(A.safeToString).mkString("Array[", ", ", "]")

  // This provides SafePrinter for any case class!
  inline given automatic[A](using A: Mirror.Of[A]): SafePrinter[A] =
    inline A match {
      // Handling case class Mirror
      case p: Mirror.ProductOf[A] =>
        val name = valueOf[p.MirroredLabel]

        type ValuesOfFieldNames = Tuple.Map[p.MirroredElemLabels, ValueOf]
        val fieldNames =
          summonAll[ValuesOfFieldNames].productIterator
            .asInstanceOf[Iterator[ValueOf[String]]]
            .map(_.value)

        type SafePrinters = Tuple.Map[p.MirroredElemTypes, SafePrinter]
        lazy val safePrinters = summonAll[SafePrinters].productIterator
          .asInstanceOf[Iterator[SafePrinter[Any]]]

        CaseClassPrinter[A](name, fieldNames.zip(safePrinters))
      
      // Handling enum Mirror
      case s: Mirror.SumOf[A] =>
        type SafePrinters = Tuple.Map[s.MirroredElemTypes, SafePrinter]
        lazy val safePrinters = summonAll[SafePrinters].productIterator
          .asInstanceOf[Iterator[SafePrinter[A]]]

        EnumPrinter(s.ordinal, safePrinters)
    }

  // Newer versions of Scala 3 complain about creating anonymous classes
  // in inline code, probably because you end up making tons of classes,
  // which kills perf. And quite often you can just extract a few values
  // and pass them into one class.
  
  class CaseClassPrinter[A](
      name: String,
      makeFields: => Iterator[(String, SafePrinter[Any])]
  ) extends SafePrinter[A] {
    private lazy val fields = makeFields.toSeq

    override def safeToString(value: A): String =
      name + "(" + {
        fields
          .zip(value.asInstanceOf[Product].productIterator)
          .map { case ((fieldName, safePrinter), field) =>
            fieldName + " = " + safePrinter.safeToString(field)
          }
          .mkString(", ")
      } + ")"
  }

  class EnumPrinter[A](
      select: A => Int,
      makeChildren: => Iterator[SafePrinter[A]]
  ) extends SafePrinter[A] {
    private lazy val children = makeChildren.toArray

    override def safeToString(value: A): String =
      children(select(value)).safeToString(value)
  }
}

Now, when we need an instance of SafePrinter for some case class it will be created in-place.

case class Foo(a: Long, b: String, c: Array[Long])
case class Bar(foo: Foo)
case class Baz(bar: Bar)

Foo(42, "test", Array(1L, 2L, 3L)).safeToString
// Foo(a = 42, b = test, c = Array[1, 2, 3])
Bar(Foo(42, "test", Array(1L, 2L, 3L))).safeToString
// Bar(foo = Foo(a = 42, b = test, c = Array[1, 2, 3]))
Baz(Bar(Foo(42, "test", Array(1L, 2L, 3L)))).safeToString
// Baz(bar = Bar(foo = Foo(a = 42, b = test, c = Array[1, 2, 3])))

As we can see, it works even when there are nested structures! Similarly for enums/sealed traits:

Either.cond(false, 1L, "string").safeToString
// Left(value = string)

But what, if there is a recursive data structure?

enum Tree[A]:
  case Leaf(value: A)
  case Node(left: Tree[A], right: Tree[A])

Tree.Node(Tree.Node(Tree.Leaf(1L), Tree.Leaf(2L)), Tree.Leaf(3L)).safeToString
// We got compilation error:
// No given instance of type SafePrinter[Tree[Long]] was found.

Hmm, it seems that there is a circular dependency issue:

  • Tree is an enum made of Leaf and Node, so it requires their instances

  • Node is a case class whose fields are Trees

  • if both instances were lazy vals, they could refer to each other safely (that’s why we made fields’/children’s instances lazy vals - to make it possible!)

  • but they aren’t: currently we are trying to build a single expression, and compiler sees that it would have to generate inlines infinitely, and gives up after exceeding some inlining limit

  • also we cannot do:

    given treePrinter: SafePrinter[Tree[A]] =
      summon[SafePrinter[Tree[A]]]
    

    because we would just create:

    given treePrinter: SafePrinter[Tree[A]] =
      treePrinter
    

    with extra steps (it might compile, but with a NullPointerException/ExceptionInInitializerError/StackOverflowError at runtime)

The bottom line is, we would avoid this issue if we could create an instance in such a way that does not rely on implicits always being in scope, but allows creating them on demand and binding them to values.

object SafePrinter {
  
  // ...

  // Let's replace
  // inline given automatic[A](using A: Mirror.Of[A]): SafePrinter[A] =
  // with
  inline def derived[A](using A: Mirror.Of[A]): SafePrinter[A] = ...
  
  // ...
}
enum Tree[A]:
  case Leaf(value: A)
  case Node(left: Tree[A], right: Tree[A])
object Tree {
  given leafPrinter[A: SafePrinter]: SafePrinter[Leaf[A]] =
    SafePrinter.derived[Leaf[A]]
  given nodePrinter[A: SafePrinter]: SafePrinter[Node[A]] =
    SafePrinter.derived[Node[A]]
  given treePrinter[A: SafePrinter]: SafePrinter[Tree[A]] =
    SafePrinter.derived[Tree[A]]
}


Tree.Node(Tree.Node(Tree.Leaf(1L), Tree.Leaf(2L)), Tree.Leaf(3L)).safeToString
// Node(left = Node(left = Leaf(value = 1), right = Leaf(value = 2)), right = Leaf(value = 3))

or

// The "derives TypeClass" creates a given in the companion object using
// the "TypeClass.derived" method - a method has to be named that way,
// it's a convention.
enum Tree[A] derives SafePrinter:
  case Leaf(value: A)
  case Node(left: Tree[A], right: Tree[A])
object Tree {
  // Sorry, no "case Leaf(...) derives SafePrinter" for ya!
  given leafPrinter[A: SafePrinter]: SafePrinter[Leaf[A]] =
    SafePrinter.derived[Leaf[A]]
  given nodePrinter[A: SafePrinter]: SafePrinter[Node[A]] =
    SafePrinter.derived[Node[A]]
}

Tree.Node(Tree.Node(Tree.Leaf(1L), Tree.Leaf(2L)), Tree.Leaf(3L)).safeToString
// Node(left = Node(left = Leaf(value = 1), right = Leaf(value = 2)), right = Leaf(value = 3))

It works!

But it requires us to define a separate given for each nested type (which reaallly sucks for enums).

The former is called an automatic derivation. It puts this instance-creating method as a given, which makes nesting supported out-of-the-box, but since we are (usually) obtaining it via using, it is also prone to circular dependency on initialization.

The latter is called a semi-automatic derivation. It does not put anything into implicit scope, requiring an explicit call, so nesting is not supported out-of-the-box. On the other hand, it does not rely on summoning the created type, so it has no issues with circular dependencies.

These different properties mean that people use them for different things:

  • quick prototyping - automatic derivation
  • making sure that your whole codebase uses the same instance for some type - semi-automatic derivation
  • deeply nested structures - depending on “taste” a lot of semi-automatic derivation, or semi-automatic derivation of the final value with automatic derivation enabled locally for the nested types’ instances

Of course, one can mix them together, e.g. have automatic derivation enabled for easy derivation of nested values, and semi-automatic for binding the results to givens in the companion object.
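A sketch of such a mix, assuming the library keeps the automatic given in a separate object (here called auto - an illustrative name, similar to what e.g. circe does) instead of the type class’ companion:

import scala.deriving.Mirror

object auto {
  // automatic: a given wrapping the same derivation
  inline given automatic[A](using Mirror.Of[A]): SafePrinter[A] =
    SafePrinter.derived[A]
}

case class Inner(a: Long)
case class Outer(inner: Inner)
object Outer {
  // semi-automatic: the result is bound once, as a given in the companion...
  given outerPrinter: SafePrinter[Outer] = {
    import auto.given // ...with automatic derivation enabled locally for Inner
    SafePrinter.derived[Outer]
  }
}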

That is much easier in Scala 3, where we can simply define inline given derived[A] in the companion object, but in Scala 2, where there were no inlines, only implicits, semi-automatic derivation often involved some tricks - like a class ExportedTypeClass[A](val typeClass: TypeClass[A]) wrapper - to create an instance by summoning an implicit of a different type than the type class.

Additionally, in Scala 2, we often had implicit def automaticTypeClass[A] and def deriveTypeClass[A] defined in separate my.library.auto._ and my.library.semiauto._ imports. The difference was much more pronounced when you had to add an extra import of either or both of them for the derivation to work.

Now, we have fewer reasons to implement the derivation twice - we can use the same mechanics for both cases - yet some people still do it.

However, we cannot hide that, more often than not, the division into automatic vs semi-automatic exists only because there are some serious flaws in the whole derivation design.

The cost of “convenience”

The automatic vs semi-automatic pattern has existed since the very first versions of the Shapeless library, back in the Scala 2.9 days. inline defs were not a thing yet, but implicit vals, implicit defs (givens’ predecessors) and implicit parameters (using’s predecessors) already were.

Then Shapeless happened - tuples back then couldn’t be built by prepending, so it provided a way of converting between records/enums and tuple-like structures - and people discovered with joy that they could generate code just by defining a bunch of implicits. No new language to learn, everything is just a bunch of functions! But then they ran into some issues.

Implicit resolution and long compilation times

We don’t have the fastest compiler. When someone is profiling it, usually the blame is on the typer phase. The conventional wisdom is that “macros are to blame”. That macro-heavy libraries slow everything down.

But you hardly ever hear about someone digging deeper and pointing out that these super slow macros were often Shapeless’ whitebox macros. More often you’d hear people proudly state that their library is macro-free while being based on Shapeless.

Also, you hardly ever hear that the slowness is not only macro-related, and that another contributing factor is that implicit resolution (or given resolution for usings, in Scala 3 terminology) is not a direct and straightforward process.

How does it actually work?

Well, there are specifications (e.g. for Scala 2.13 and 3.4), but we’d prefer some human language instead. So, let’s talk this through by examples.

The first observation is that using will find some given only if it’s visible in the scope. We can put it there ourselves:

class Foo(val a: String)

locally {
  given a: Foo = new Foo("a")
  println(summon[Foo].a) // prints "a"
}

If there is no given in scope, compilation would fail.

class Foo(val a: String)

locally {
  given a: Foo = new Foo("a")
}

println(summon[Foo].a)
// We got compilation error:
// No given instance of type Foo was found for parameter x of method summon in object Predef

We can also put it in scope via an import or by extending a trait/class that has given values.

class Foo(val a: String)

trait FooGivens {
  given a: Foo = new Foo("a")
}

// givens from FooGivens are in scope inside FooUsage
object FooUsage extends FooGivens {
  println(summon[Foo].a) // prints "a"
}

// givens imported into the scope via import
locally {
  import FooUsage.given
  println(summon[Foo].a) // prints "a"
}

But that’s not the only way some given can start being considered. There are also companion objects.

class Foo(val a: String)
object Foo {
  // givens in the companion are in the implicit scope ALWAYS
  given a: Foo = new Foo("a")
}

locally {
  // Foo visible without imports, mixins nor defining it explicitly
  println(summon[Foo].a) // prints "a"
}

And… this is when things start getting hairy. When it comes to givens, the companion objects are a blessed destination for all instances that should be supported out-of-the-box. But that might be a lot of types. And there might be some overlap.

class Bar[A](val a: String)
object Bar {
  
  given anyBar[A]: Bar[A] = new Bar("some value")
  
  given listBar[A]: Bar[List[A]] = new Bar("some list")
}

locally {

  // Both `anyBar[List[Int]]` and `listBar[Int]` would
  // match the same type - ambiguity!
  println(summon[Bar[List[Int]]].a) // does not compile!
}

If we just put everything in one place and more than one given matches, there is an ambiguity. And ambiguity means a compilation error. How do we solve it? By telling the compiler that some givens are more important than others. In the old days, when givens were still called implicits, this was known as implicit priorities, and was achieved by splitting the companion into a series of traits - the further away from the actual companion object type (in the inheritance hierarchy) a given is declared, the lower its priority.

class Bar[A](val a: String)

object Bar extends BarImplicits0
// The most specific givens are higher in priority,
// e.g. givens without any type parameters should be the top.
trait BarImplicits0 extends BarImplicits1 { this: Bar.type =>
  
  given stringBar: Bar[String] = new Bar("some string")
  given intBar: Bar[Int] = new Bar("some int")
}
// Map and Option are both Iterable, so if we are
// handling all collections with a single given,
// but need to specialise for Map/Option,
// they should be higher than Iterable given.
trait BarImplicits1 extends BarImplicits2 { this: Bar.type =>
  
  given mapBar[K, V](using K: Bar[K], V: Bar[V]): Bar[Map[K, V]] =
    new Bar(s"some map of ${K.a} -> ${V.a}")
  given optBar[A](using A: Bar[A]): Bar[Option[A]] =
    new Bar(s"some option of ${A.a}")
}
// Given for Iterables is still more specific than
// something that would e.g. catch all case classes
// and enums.
trait BarImplicits2 extends BarImplicits3 { this: Bar.type =>

  given iterableBar[A, CC <: Iterable[A]](using A: Bar[A]): Bar[CC] =
    new Bar(s"some iterable of ${A.a}")
}
// (Automatic) derivation should always be the last resort.
trait BarImplicits3 { this: Bar.type =>
  
  import scala.deriving.Mirror
  given derived[A](using Mirror.Of[A]): Bar[A] =
    new Bar("something with a Mirror")
}

// Here we have 4 layers, but it's not a hardcoded value,
// you do as many as you need to. And then tell MiMa to shut up.

Actually, this rule holds no matter where the given comes from (companion object, import, mix-in, etc) - the compiler makes a list of all givens whose type matches the sought type (calling them eligible), sorts them by priority (the “distance” from the place where they are being used), and checks that there is exactly one closest given (no ex aequo 1st place).
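A small illustration of the “distance” part (a sketch): a given defined in the lexical scope is found before the implicit scope (companions) is even consulted.

class Qux(val a: String)
object Qux {
  given fromCompanion: Qux = new Qux("companion")
}

locally {
  given local: Qux = new Qux("local")
  summon[Qux].a // "local" - the companion's given is only a fallback
}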

And there are plenty of opportunities for ex aequo 1st places, because it’s not only the type class’ companion object and its givens that are automatically put into the scope. It happens for the companions of all the types involved.

class Bar[A](val a: String)

object Bar {
  
  // Someone put something generic straight into companion, hmm
  given listBar[A](using A: Bar[A]): Bar[List[A]] =
    new Bar(s"some list of ${A.a}")
}

// Someone defined their own type...
case class Baz()
object Baz {
  
  // ...and put instance for it in its companion
  // - like everyone recommends.
  given listBazBar[A]: Bar[List[Baz]] =
    new Bar(s"list of Baz")
}


// Oh noez, both Bar.listBar and Baz.listBazBar are contestants for
// the best implicit! The match ended with a draw, and compilation fails!
summon[Bar[List[Baz]]]

For extra giggles we can consider this case with our previous SafePrinter type class:

import scala.compiletime.*
import scala.deriving.Mirror

trait SafePrinter[A] {
  def safeToString(value: A): String
}
object SafePrinter {

  given safeString: SafePrinter[String] =
    str => str
  given safeLong: SafePrinter[Long] =
    long => long.toString
  given safeArray[A](using A: SafePrinter[A]): SafePrinter[Array[A]] =
    array => array.view.map(A.safeToString).mkString("Array[", ", ", "]")

  inline given automatic[A](using A: Mirror.Of[A]): SafePrinter[A] = ...
}

If we did this:

case class Example1()
List(Example1()).safeToString

the compilation would fail, because List would be handled as an enum - after all it’s a sum of Nil and :: - but via ::[A](head: A, tail: List[A]) it’s also a recursive structure, so it would require a semi-automatic derivation!

given listPrinter[A: SafePrinter]: SafePrinter[List[A]] =
  SafePrinter.derived[List[A]]
given consPrinter[A: SafePrinter]: SafePrinter[::[A]] =
  SafePrinter.derived[::[A]]
given nilPrinter: SafePrinter[Nil.type] = SafePrinter.derived[Nil.type]

List(Example1()).safeToString
// "::(head = Example1(), tail = Nil())" - WTH?

And the output… Not what anyone would want! We need to specialize! But then the specialization for List cannot be placed at the same priority as the generic derivation - to avoid ambiguity. And then all of these specializations have to be shoved down the implicit priority hierarchy, to avoid ambiguity when the user puts something into their companion. Uff!
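Such a specialization could look like this (a sketch - replacing the derived listPrinter from above with a hand-written one; the hard part is not writing it, but placing it in the priority hierarchy):

given listPrinter[A](using A: SafePrinter[A]): SafePrinter[List[A]] =
  list => list.map(A.safeToString).mkString("List(", ", ", ")")

List(Example1()).safeToString // "List(Example1())"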

Or, wait! The fun does not end there! After all, some given might require some using parameters, and they have to be resolved if we want to tell if it’s eligible! So the compiler might need to sort the potential givens by their priorities, and then try to find the first eligible one! (Which might involve resolving another given)!

And this is where macros can add some spice.

So, when we’re checking if some candidate for our given is the one, we only have to look at the signature - what it returns, are there some type parameters to resolve, can its using be constructed - but would that work for a macro?

No, it would not.

You can write Expr.summon[SomeType] inside a macro and it will return Option[Expr[SomeType]]. It’s not reflected in the signature. Actually, even non-macro inline defs that use summonAll (or summonInline, or summonFrom) make use of this mechanism. It lets you figure out which givens to summon before summoning them. Because how would you write a type signature for the derivation if it would have to be different for every single type?
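As a small illustration (not something from the library code above), summonFrom lets an inline def decide what to do based on whether a given can be found - a decision that is invisible in its signature:

import scala.compiletime.summonFrom

inline def showOrFallback[A](value: A): String =
  summonFrom {
    case printer: SafePrinter[A] => printer.safeToString(value)
    case _                       => value.toString // no instance in scope
  }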

Hmm. When a normal given has a signature that does not match, it is skipped, and the typer moves on to test the next one. But it cannot work the same way for macros! Only when you expand them can the typer learn whether they work or not! That’s why, when the typer reaches an inline given as the possible candidate, it will be the last thing to test: if it succeeds, we got our implicit, if it fails we get a compilation error (and the author of that macro can define the error message). So, you need to pay attention to which macros have signatures that make them candidates for some types, and what their priority is.

Except, just like inline givens are exceptions among givens, transparent inline givens are exceptions among inline givens.

transparent inline defs exist because sometimes you cannot predict the returned type just from the types of the input. E.g. let’s say we want a type class that converts a case class into its corresponding Tuple:

trait AsTuple[A] {
  type Result <: Tuple
  def asTuple(value: A): Result
}

How could we define a signature for the method that returns it? And make sure that Result is a concrete Tuple type?

inline def makeAsTuple[A]: AsTuple[A] { type Result = ??? }

Actually, for this particular case, we could cheat, and do something like:

inline def makeAsTuple[A](m: Mirror.ProductOf[A]):
    AsTuple[A] { type Result = m.MirroredElemTypes }

but in general we have no guarantee that our path-dependent type can be obtained from some existing utility. Sometimes we are writing such utility.

That’s where transparent inline defs come in handy: they allow you to define a returned type and then make it even more specific during expansion:

// pseudocode, I just want to show the idea
transparent inline def makeAsTuple[A]: AsTuple[A] = {
  val p = summonInline[Mirror.ProductOf[A]]
  new AsTuple[A] {
    type Result = p.MirroredElemTypes
    def asTuple(value: A): Result =
      Tuple.fromProduct(value.asInstanceOf[Product]).asInstanceOf[Result]
  }
}

case class Foo(a: String, b: Int)
makeAsTuple[Foo] // : AsTuple[Foo] { type Result = (String, Int) }

Here, it was used to add a refinement. But nothing stops you from writing:

transparent inline given derive: TypeClass[?]

where all that is known is that the result is a TypeClass - but is it a type class of a List? Of a primitive? Of a case class? We don’t know until we expand it. So can we exclude it from our list of given candidates? No, we don’t have enough information. But it would be a pain if it failed the compilation because the type does not match or there is some error, since we cannot do anything to exclude it!

And that’s why kids, transparent inline def (and whitebox macros on Scala 2) on their own do not fail the compilation. If there is an error during their expansion, the typer assumes that it was not matched and moves on to test next implicits. You would only see their error message if you called such transparent inline given directly. If you are still wondering what’s the fuss:

// pseudocode
trait TypeClass[A]

object TypeClass extends TypeClassImplicits0 {
  transparent inline given maybeThisOne: TypeClass[?] = ...
}
trait TypeClassImplicits0 extends TypeClassImplicits1 {
  transparent inline given orThisOne: TypeClass[?] = ...
}
trait TypeClassImplicits1 extends TypeClassImplicits2 {
  transparent inline given orThatOne: TypeClass[?] = ...
}
...
summon[TypeClass[Int]]

You can very easily end up with a situation where the compiler will expand some transparent macro - perhaps it will trigger implicit searches for some other givens that also need to be constructed - burn a lot of CPU to construct a whole correct expression, and then flush it down the toilet and move on. With no error message or any other indication of the screwup.

Why would anyone need such crazy things?

  1. Well, not every transparent inline def is a transparent inline given, some are e.g. macros for turning _.field1.field2 arguments of a method into some type information (e.g. in Chimney’s withFieldConst(_.field, value))
  2. And sometimes there is no better way, e.g. when you need to refine the type based on what you’ll find about some other type…
  3. …like with the most used macros in the whole Scala 2 ecosystem: Shapeless’ Generic and LabelledGeneric whitebox macros (giving us the stuff that Scala 3 delegates to Mirrors - which are a bit cheaper to create, since the compiler just runs .asInstanceOf on the companion object to add refinements)

When you hear complaints about the compilation being slow “due to macros”, it is very likely that it was actually Shapeless proving or refuting that there is some HList/Coproduct for your case class/sealed trait. Actually, blaming it solely on the macros was not very fair. It’s more about how they were used. For example:

// On Scala 2, pseudocode

import shapeless._

trait SafePrinter[A] {
  def safeToString(value: A): String
}
object SafePrinter {
  
  // Here we would have some printers for String, Int, etc
  // but I don't want to write it, you don't want to read it,
  // so just let's imagine they're there, ok?
  
  // And let's imagine that there are some priorities done
  // as well.
  
  // Explaining it with Scala 3 terminology:
  // we are taking SafePrinter of a NamedTuple
  // and converting it into SafePrinter of a case class
  implicit def caseClassPrinter[A, Repr <: HList](
    implicit
    gen: LabelledGeneric.Aux[A, Repr],
    classTag: ClassTag[A],
    hlistPrinter: SafePrinter[Repr]
  ): SafePrinter[A] =
    (value: A) => {
      val name = classTag.runtimeClass.getName
      val fields = hlistPrinter.safeToString(
        gen.to(value) // A -> AHList
      )
      s"$name($fields)"
    }
  
  // But how to get this NamedTuple?
  // Build it, one element at a time
  
  // This HNil is like an EmptyTuple
  implicit val emptyTuplePrinter: SafePrinter[HNil] =
    (_: HNil) => ""
  
  // This is something like (HeadName : HeadType) *: Tail
  implicit def prependFieldToTuplePrinter[
    HeadName <: Symbol,
    HeadType,
    Tail <: HList
  ](
    implicit 
    headName: Witness.Aux[HeadName], // Like ValueOf[HeadName <: String]
    headPrinter: SafePrinter[HeadType],
    tailPrinter: SafePrinter[Tail]
  ): SafePrinter[FieldType[HeadName, HeadType] :: Tail] =
    (value: FieldType[HeadName, HeadType] :: Tail) => {
      val fieldName = headName.value.name
      val (head :: tail) = value
      val headStr = headPrinter.safeToString(head)
      val tailStr = tailPrinter.safeToString(tail)
      if (tailStr.isEmpty) s"$fieldName = $headStr"
      else s"$fieldName = $headStr, $tailStr"
    }
}
case class Foo(a: Int, b: String, c: Bar)
case class Bar(d: Double, e: Char, f: Baz)
case class Baz(g: Short)
implicitly[SafePrinter[Foo]] // how would it be generated?

So, all this complicated code should achieve the following:

  • Baz is like (g: Short), so start by handling the EmptyTuple case, then “prepend” logic handling g via some SafePrinter[Short], and convert the printer for that named tuple into a printer for Baz
  • Bar is like (d: Double, e: Char, f: Baz), so start by handling EmptyTuple, then “prepend” f using some SafePrinter[Baz], then prepend e handled by some SafePrinter[Char], and so on
  • Foo is like (a: Int, b: String, c: Bar)… you get the gist.

But it involves quite a lot of work.

  1. we’re starting with SafePrinter[Foo] - the compiler will have to look for the implicits in the current scope, SafePrinter companion and Foo companion objects
  2. only SafePrinter has some implicits, so they have to be investigated
  3. it starts by pruning types that surely do not match and sorting what’s left by implicit priorities
  4. then it needs to test them in the order of priorities
  5. it would reject all instances for some specific types, all generic specializations, including implicits for HLists!
  6. finally, it gets to caseClassPrinter - the type might match, but it requires some implicits as inputs! So we need to resolve them!
  7. it looks for LabelledGeneric.Aux[A, Repr] - trying to find if there is some Repr for which this thing exists, and that requires expanding a whitebox macro
  8. let’s say it succeeded - yay! - then we can look for other implicits
  9. next is ClassTag[Foo] - let’s say that it’s doable
  10. the last one would be SafePrinter[FieldType['a, Int] :: FieldType['b, String] :: FieldType['c, Bar] :: HNil] - such a pleasure to read! - if we can find this instance, we can create the SafePrinter[Foo]!
    1. so, the compiler starts looking at: the current scope and the companion objects of: SafePrinter, Int, String, Bar, FieldType, HNil and :: (H-cons) (if they exist)
    2. all the work about initial filtering and sorting implicits starts again, but for a different final type
    3. all the signature testing has to be done, again
    4. we got to prependFieldToTuplePrinter case - nice! If we resolve Witness and 2 SafePrinters we are done!
    5. hmm, the Witness requires some macro expansion on Scala 2.12 (on 2.13 it can use ValueOf)… but we got it
    6. the head’s SafePrinter… it’s yet another nested implicit resolution, I’ll skip it, you can re-read points 1-3 again if you want to, then it finds SafePrinterInt defined in the companion
    7. the tail’s SafePrinter[FieldType['a, Int] :: FieldType['b, String] :: FieldType['c, Bar] :: HNil]… do I really need to explain?
    8. the game repeats mostly the same, until we get into SafePrinter[Bar] because then we have to expand Shapeless’ whitebox macro to learn what Bar is made of!
    9. and then we repeat the whole process for Bar generic representation, and then for Baz

Sounds scary, isn’t it? We:

  • call a whitebox macro once for every case class that didn’t have an implicit already provided in the scope
  • (on 2.12) call another macro for every field in a case class because we want to learn its name
  • force the typer to create:
    • 1 instance for every case class
    • and additionally 1 instance for every field of a case class
    • after testing and rejecting all the other options before it got to this result

(It also means that our simple example called caseClassPrinter 3 times and prependFieldToTuplePrinter 7 times).

This explosion of type and signature testing is the actual reason why type class derivation might be slow. It is also the true reason why many people prefer semi-automatic derivation - they want to force the compiler to pay the price only once per type! On Scala 2.12 it was much worse than on 2.13 - on 2.13 Scala learned to cache the result if it could reuse some derivation between several fields - but the worst case is still bad. Even now you can easily write an example where a single short file takes e.g. 40 seconds to compile.
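The usual way of paying that price once looks roughly like this (a sketch - deriveSafePrinter stands in for whatever semi-automatic entry point a library exposes):

// Scala 2 flavour, to match the example above
case class Foo(a: Int, b: String, c: Bar)
object Foo {
  // derive once, cache the result as an implicit val in the companion;
  // every SafePrinter[Foo] lookup from now on reuses this single instance
  implicit val fooPrinter: SafePrinter[Foo] = deriveSafePrinter[Foo]
}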

This approach also has another issue. If your method signatures and their locations drive the whole logic… how do you fix a bug? By changing type signatures or shuffling the location of methods. This basically forces you to break backward compatibility each time you want to fix something.

(In Scala 3 the situation should be better - less implicit-driven derivation - but recursive match types on Tuples paired with recursive inline defs seem to bully the typer like the Generics of old).

In my experience, every author of a Shapeless/Mirrors-derivation-based library should be familiar with all these mechanics.

In my experience, almost none of them is.

We have whole ecosystems running on top of something hardly anyone understands, where making things “good enough” requires multiple iterations, and each fix is a potential breaking change. We’ve built our house on sand.

Allocations, allocations everywhere

Well, it’s not great, but all can be forgiven if the generated code is correct and fast. Because it is, right? After all, we hoped for

// case class Foo(a: Int, b: String, c: Bar)
// case class Bar(d: Double, e: Char, f: Baz)
// case class Baz(g: Short)
implicitly[SafePrinter[Foo]]

to generate the code behaving more or less like:

val baz = new SafePrinter[Baz] {
  override def safeToString(value: Baz): String =
    "Baz(g = " +
      SafePrinter.shortPrinter.safeToString(value.g) +
      ")"
}
val bar = new SafePrinter[Bar] {
  override def safeToString(value: Bar): String =
    "Bar(d = " +
       SafePrinter.doublePrinter.safeToString(value.d) +
       ", e = " +
       SafePrinter.charPrinter.safeToString(value.e) +
       ", f = " +
       baz.safeToString(value.f) +
       ")"
}
val foo = new SafePrinter[Foo] {
  override def safeToString(value: Foo): String =
    "Bar(a = " +
       SafePrinter.intPrinter.safeToString(value.a) +
       ", b = " +
       SafePrinter.stringPrinter.safeToString(value.b) +
       ", c = " +
       bar.safeToString(value.c) +
       ")"
}
implicitly(foo)

Sure, it won’t be as pretty, there might be a few extra steps, but it would be close enough.

Well, what we got certainly behaves as-if it looked like that, but it actually looks more like:

val baz = SafePrinter.caseClassPrinter( // new
  new LabelledGeneric { ... }, // new, generated by macro
  classTag[Baz],
  SafePrinter.prependFieldToTuplePrinter( // new
    Witness(...), // new, wrapper for "g"
    SafePrinter.shortPrinter,
    SafePrinter.emptyTuplePrinter
  )
)
val bar = SafePrinter.caseClassPrinter( // new
  new LabelledGeneric { ... }, // new, generated by macro
  classTag[Bar],
  SafePrinter.prependFieldToTuplePrinter( // new
    Witness(...), // new, wrapper for "d"
    SafePrinter.doublePrinter,
    SafePrinter.prependFieldToTuplePrinter( // new
      Witness(...), // new, wrapper for "e"
      SafePrinter.charPrinter,
      SafePrinter.prependFieldToTuplePrinter( // new        
        Witness(...), // new, wrapper for "f"
        baz,
        SafePrinter.emptyTuplePrinter
      )
    )
  )
)
val foo = SafePrinter.caseClassPrinter( // new
  new LabelledGeneric { ... }, // new, generated by macro
  classTag[Foo],
  SafePrinter.prependFieldToTuplePrinter( // new
    Witness(...), // new, wrapper for "a"
    SafePrinter.intPrinter,
    SafePrinter.prependFieldToTuplePrinter( // new
      Witness(...), // new, wrapper for "b"
      SafePrinter.stringPrinter,
      SafePrinter.prependFieldToTuplePrinter( // new        
        Witness(...), // new, wrapper for "c"
        bar,
        SafePrinter.emptyTuplePrinter
      )
    )
  )
)
implicitly(foo)

Hmm. I mean, modern JVMs are quite advanced and surely can optimize it and stuff? It should not be a problem?

Unfortunately, it is a problem, and if such code is on your hot path, you might feel it. All these allocations… and if the result is not stored in some val, you would create all of these anew, every time the code is run. And with some pointer chasing too. The JVM will try to do something about it, but such code will give it a run for its money. Show this code to some performance engineer and get ready for a waterboarding session! But at least you are sure that you are always correct, because you generated this code in a principled way.

This isn’t the instance you’re looking for

Because generating it in a principled way guarantees all correctness, the behavior will be exactly what we want, etc. Oh, wait.

trait Default[A] {
  def value(): A
}
object Default extends DefaultImplicits0
trait DefaultImplicits0 { this: Default.type =>
  // Built-in instances
  given defaultString: Default[String] =
    () => ""
  given defaultList[A](using A: Default[A]): Default[List[A]] =
    () => List(A.value())
}

locally {
  // Here we customized the behavior.
  given customListDefault[A]: Default[List[A]] = () => Nil

  summon[Default[List[List[String]]]].value() // Nil
}

locally {
  // Here we didn't customize the behavior.
  // But it still compiles! Oops.

  summon[Default[List[List[String]]]].value() // List(List(""))
}

And before you start “well, the defaults are sane…”. No. There are no defaults that always work. Sometimes you might have less dangerous defaults, but virtually every library will have cases where there might be multiple correct behaviors - which one is correct depends on the project, conventions, particular requirements…

I once saw an argument against one of my libraries (Chimney): “well, I see no reason in using such a library if I still have to write tests checking what it does”. I sure hope you write tests for your JSON codecs, mate (or do you write them all by hand?), because the amount of assumptions that can be made there… virtually every single JSON library has a different “default” behavior, and each of them has a group defending it and calling all the others insane.

There is no such thing as a default that works for everyone, for every case. You will need to override the default behavior. And no amount of screeching about being principled can save you from the fact that type-checking, working code that does something - but not what you want - will still compile. I’m afraid that even if you can write fewer tests, you still have to write some, even for things that you generated. Because, even if you “shouldn’t test someone else’s library”, you surely can test whether you configured it correctly, because this is where everyone screws up.

I get it, if you want to guarantee that your whole codebase uses the same implementation, you should define it once and then reuse it. But you still have to verify that your code actually reuses that one implementation. (It’s one of the reasons many people demanded that automatic derivation should be opt-in only).

HumanReadableErrorMessage not found

could not find implicit value for parameter encoder: TypeClass[A]

– Scala 2

No given instance of type TypeClass[A] was found for parameter paramName of method methodName

– Scala 3

Our favourite error message. It was so problematic that Magnolia (originally by Jon Pretty) came to be and stole some users from Shapeless, because it offered better error messages:

magnolia: could not find Typeclass for type ParamType
     in parameter 'paramName' of product type A

Scala 3 managed to improve the error messages a bit. But everyone - Shapeless (Scala 2), Mirrors (Scala 3), Magnolia - fails to provide a good error message for a missing implicit in a nested structure.

Yet, in my experience, you’ll find a lot of senior developers claiming that it’s just a “skill issue”. And that debugging implicits is an easy and essential skill for every Scala developer.

In my experience, a lot of people who decided to GTFO Scala considered such people to be fugitives from some mental institution who somehow took over the building and started gaslighting the hostages that this is the only way (even if a crappy-runtime-reflection-based solution from Java can be fixed quicker by a junior Java dev than a missing implicit in a deep hierarchy by a senior Scala dev).

“But at least it’s not macros”

As a matter of fact, it is not only the users of such code that have issues. So do the maintainers.

One of the premises of inline defs in Scala 3 was that macros would be less needed than in Scala 2. That much more could be implemented simply with:

  • inline def
  • inline if-else
  • inline match
  • etc

That you could implement things by just writing regular code, slapping the inline keyword here and there until it works. Then cake.

For instance, I remember that once someone on Reddit had an interesting idea:

  • define case class Tagged[Tag, Value](value: Value) that represents some runtime value tagged with a compile time type

  • then use match types and inline defs to define a type safe zip on tuples:

    type ZipTagged[Tags <: Tuple, Values <: Tuple] <: Tuple
      
    inline def zipTagged[Tags <: Tuple, Values <: Tuple](
        values: Values
    ): ZipTagged[Tags, Values]
    
  • so that you could zip ("name", "another name") with (Type, AnotherType)

How could we implement such a thing?

We would have to start with a Shapeless-like tuple decomposition: either we have an empty tuple, or some value prepended to another tuple. Then we would have to use mathematical induction: zipping 2 empty tuples should be an empty tuple, zipping 2 non-empty tuples should be a tagged value prepended to the zipping of the 2 tail tuples.

type ZipTagged[Tags <: Tuple, Values <: Tuple] <: Tuple =
  (Tags, Values) match {
    case (EmptyTuple, EmptyTuple) =>
      EmptyTuple
    case (tagHead *: tagTail, valueHead *: valueTail) =>
      Tagged[tagHead, valueHead] *: ZipTagged[tagTail, valueTail]
   }

Ok, that computes what should be the return type of the zipping. But what would be the zipping itself? My attempt ended up as:

inline def zipTagged[Tags <: Tuple, Values <: Tuple](
    values: Values
): ZipTagged[Tags, Values] =
  inline (erasedValue[Tags], erasedValue[Values]) match {
    case _: (EmptyTuple, EmptyTuple) =>
      // ZipTagged[Tags, Values] =
      //   EmptyTuple.type
      EmptyTuple
    case _: (tagHead *: tagTail, valueHead *: valueTail) =>
      // ZipTagged[Tags, Values] =
      //   Tagged[tagHead, valueHead] *: ZipTagged[tagTail, valueTail]
      val vs: (valueHead *: valueTail) =
        values.asInstanceOf[(valueHead *: valueTail)]
      (Tagged[tagHead, valueHead](vs.head) *: zipTagged[
        tagTail,
        valueTail
      ](vs.tail))
  }

Hmm, it’s a bit messy. Why?

  1. if we want to prove that the result is ZipTagged, we have to follow the same matches as in the type match
  2. that means we have to perform type matches (case _: (A, B) =>), we cannot rely on extractors (case (a: A, b: B) =>)
  3. additionally, we would have to match on the values we are still constructing, so we have to work around that by using erasedValue
  4. and the branch needs to be picked during compilation, so we need an inline match as well
  5. unfortunately, I had no idea how to make the compiler believe me that values: Values is proven to be of type valueHead *: valueTail, so I did the same thing I saw in many of the standard library’s methods that work on Tuples - I used asInstanceOf

But later, someone in that thread made me aware that we can use this one trick to make the compiler aware that values is of the right type:

inline def zipTagged[Tags <: Tuple, Values <: Tuple](
    values: Values
): ZipTagged[Tags, Values] =
  inline (erasedValue[Tags], erasedValue[Values]) match {
    case _: (EmptyTuple, EmptyTuple) =>
      // ZipTagged[Tags, Values] =
      //   EmptyTuple.type
      EmptyTuple
    case _: (tagHead *: tagTail, valueHead *: valueTail) =>
      // ZipTagged[Tags, Values] =
      //   Tagged[tagHead, valueHead] *: ZipTagged[tagTail, valueTail]
      inline values match { // <-- compile-time cast, \/ note the backticks
        case values: (`valueHead` *: `valueTail`) =>
          (Tagged[tagHead, valueHead](values.head) *: zipTagged[
            tagTail,
            valueTail
          ](values.tail))
      }
  }

Inside the case that picks the right branch, you can make another single-case inline match - and that allows the compiler to prove that the type of values is valueHead *: valueTail, so the type of the returned value will be correctly inferred as Tagged[tagHead, valueHead] *: ZipTagged[tagTail, valueTail] (which is the ZipTagged for 2 non-empty tuples).

It doesn’t look that bad, does it? You only need to remember 1 trick and you are ok?

Since that post a year ago, I have had to dig up this post several times, either when I was doing something with match types, or when some other experienced maintainer of an advanced Scala 3 library asked about this crap and got stuck on it for hours.

Sadly, I don’t bookmark all of such trivial-code-turned-IQ-tests examples, but these principled, good for teaching type theory, low-entry bar inline utilities surprisingly often grow in complexity so fast, that every other library based on them could be a PhD thesis on some university. After the initial dopamine rush from the few hello world examples, one can speedrun into the wall with a head first, the moment they want to do something remotely useful. People have no clear intuitions what is done during the compilation, what is done in the runtime, what can only be done during the compilation, what can only be done during the runtime, and indicating the difference only by inline may be way too little. Especially, when you have utilities like summonInline, summonAll, summonFrom which are inlined without the keyword (and anyone can provide more of them!).

Actually, there is one more pitfall with inlines (relevant for both macros and “macro-free” inline def development) that I can think of off the top of my head: did you know that JVM methods have a size limit?

There is a hard cap on the amount of bytecode in a single method (baked into the class file format), above which a virtual machine will show you the middle finger. The Scala compiler has to respect that limit, and if your method grows beyond a certain size, it will fail the compilation. You cannot change that value (it’s not -Xmax-inlines); you have to fix the code.

I hit that issue over a year ago, when I was preparing benchmarks that compared different ways of deriving a type class. You see, when you are deriving a type class using an automatic derivation, you are calling inline def, within an inline def, within an inline def… which, in some nasty cases, might expand your current method into some abomination that exceeds the size limit.

But you know what can fix it? Not computing the result of that nested expansion as a val. Do it in a non-inline def and the compiler will chop the mega-method into several smaller methods that call each other. It’s true for macros as well.
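A sketch of that workaround, using the SafePrinter.derived from earlier in this post (Big and hotPath are illustrative names):

case class Big(a: Long, b: String, c: Array[Long])

// the whole inline expansion now lives in its own JVM method...
def bigPrinter: SafePrinter[Big] = SafePrinter.derived[Big]

// ...so the method that uses it stays small
def hotPath(value: Big): String = bigPrinter.safeToString(value)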

It’s one of these non-obvious things, that you can only be aware of if you know that byte code is a thing. (Perhaps, I should shill my e-book about JVM, presentations and this blog, a bit more, because a year after I made a presentation about macros where I mentioned the issue with vals in inline defs, someone rediscovered it).

Now you know why I believe that while all these new things - inlines, match types, named tuples… - allow some interesting things, and are great educational and prototyping tools, they will never be production ready. It’s precisely because they try to interleave what happens during compilation and what happens at runtime, hiding all of that complexity, except some things only work during compilation, some only at runtime, and most people have no good intuitions about which is which and why.

“When are they going to fix this?”

At this point, you might ask: if it’s so bad, why won’t the compiler team just fix it?

Hmm, exactly, why?

The first issue is that, if you think about it, a large part of Shapeless/Mirrors is about “Prolog-like programming on top of declarations”:

  • implicit val/given with no arguments is like a predicate “X exists”
  • implicit def/given with using arguments is like an implication “(all using values exist) imply X exists”
  • implicit priorities are like pattern-matching: the first match decides the branch, if there’s no match then it’s an error

Except: it’s not written directly as a byte code that JVM interprets. It’s written as val/def signatures, and compiler has to run this with typer as an ad-hoc interpreter, where every simple if-else instruction becomes an expensive type-matching exercise. With a lot of allocations, tree traversals, hopeless pruning of exploding possible choices.

And with no context: every other use case needs a separate type class, which has to be allocated and defined as a new type, including all intermediate types, because this approach does not allow you to just define a def that does the job: such a def would have to be composed from other defs, and to distinguish which defs are relevant for the use case and which are not, they all have to be wrapped into JVM objects extending the same interface.

And if you need to generate a behavior for a case class which contains a case class which contains a case class… you always end up with:

// You cannot let the Bar => String expression walk around
// without their legal guardian in the form of
// a fully adult class instance, right? It's unethical!
val anon$1 = new SafePrinter[Baz] {
  override def safeToString(value: Baz): String = ...
}
// The String comes to a Bar...
// Wait! It's the other way round.
// A String comes out of a Bar...
val anon$2 = new SafePrinter[Bar] {
  override def safeToString(value: Bar): String = ...
}
// How do you like them allocations?
val anon$3 = new SafePrinter[Foo] {
  override def safeToString(value: Foo): String = ...
}

instead of:

val anon$1 = new SafePrinter[Foo] {
  override def safeToString(value: Foo): String = {
    // So, you're saying a bunch of local methods
    // is "morally equal" to a bunch of instances
    // that would merely wrap these methods?
    def printBaz(baz: Baz): String = ...
    def printBar(bar: Bar): String = ...
    def printFoo(foo: Foo): String = ...
    // Actually, since we have these here as methods already...
    // they could also handle recursive data structures without
    // the lazy val/Lazy[A] bullshit?
    printFoo(value)
  }
}

because:

  • the automatic/semi-automatic pattern doesn’t really support such an approach (from the library maintainer’s side)
  • the compiler has too little information about WTH we’re doing to do anything smarter than figuring “I may reuse that implicit, instead of generating it again for another field in the same derivation”

All the information - how the derivation itself could be optimized, how its output could be optimized, and how to improve the errors for the user - lives in library maintainers’ heads. And it simply cannot be expressed well enough through these interfaces.

(It’s as if these patterns were doomed to provide bad UX by design. As if they were insane derivation).

Macros, however, can do this.

Enter the Macros

If we decide to look at a Scala 3 macro, we will see something like this:

inline def macroMethod[A](inline expr: A): String =
  ${ macroMethodImplementation('{ expr }) }

import scala.quoted.*

def macroMethodImplementation[A: Type](expr: Expr[A])(
  using quotes: Quotes
): Expr[String] = {
  import quotes.*, quotes.reflect.*
  val exprCode = expr.asTerm.show(using Printer.TreeAnsiCode)
  val exprTree = expr.asTerm.show(using Printer.TreeStructure)
  val result = s"""The following code:
                  |$exprCode
                  |is constructed like this:
                  |$exprTree""".stripMargin
  Expr(result)
}

Similarly to all the other inline things, it is an inline def, but this inline def’s body has only one element: a ${} block that calls a single method.

This method has to be stable (top-level, in a package or an object), is only allowed to take Expr and Type arguments plus a single using Quotes argument, and it has to return a single Expr.

That’s virtually all of the limitations.

You want to write if-elses? Sure.

if (TypeRepr.of[A] <:< TypeRepr.of[Unit]) {
  ... // A is Unit
} else {
  ... // A is not Unit
}

Pattern-matching on values? Why not.

Expr.summon[TypeClass[B]] match {
  case Some(expr) => ... // given resolved
  case None => ... // given not found or ambiguous
}

Have some actual engineering, where you organise things into methods and data objects, log your decisions, and accumulate errors instead of failing fast? Go for it!

// This will let us remember how we got to where the result/log is.
enum Path {
  case Root
  case FieldSelected(path: Path, field: String)
  case SubtypeSelected(path: Path, subtype: String)
  
  override def toString: String = this match {
    case Root => "_"
    case FieldSelected(path, field) => s"$path.$field"
    case SubtypeSelected(path, subtype) => s"$path.asInstanceOf[$subtype]"
  }
}

// Let's bundle all useful information together.
import scala.collection.mutable

case class DerivationContext[A](
  expr: Expr[A],
  tpe:  Type[A],
  path: Path = Path.Root,
  logs: mutable.ListBuffer[String] = mutable.ListBuffer.empty[String],
  ... // whatever else you need, e.g. some config?
) {
  
  def log(msg: String): Unit = logs += s"[$path]: $msg"
  
  def downField[B: Type](
    fieldName: String,
    fieldValue: Expr[B]
  ): DerivationContext[B] = copy(
    expr = fieldValue,
    tpe  = summon[Type[B]],
    path = Path.FieldSelected(this.path, fieldName)
  )
  
  def downSubtype[B: Type](
    expr: Expr[B]
  ): DerivationContext[B] = copy(
    expr = expr,
    tpe  = summon[Type[B]],
    path = Path.SubtypeSelected(this.path, TypeRepr.of[B].show)
  )
}

given currentContextType[A](using ctx: DerivationContext[A]): Type[A] =
  ctx.tpe

// It's easier to e.g. group errors together by their cause
// and sort by place, if they are NOT raw Strings.
enum DerivationError {
  case TypeNotSupported(msg: String, path: String)
  case InvalidConfig(msg: String, path: String)
  ...
}

extension (errors: List[DerivationError])
  def render: String = ...

// We can also not be animals and handle logic in macros
// as we would in any other high quality code:
// by not removing the errors that the user should see.
type DerivationResult[A] = Either[List[DerivationError], A]

extension [A](list: List[A]) {
  def parTraverse[B](f: A => DerivationResult[B]): DerivationResult[List[B]] =
    ...
}

// And organize the logic into smaller self contained rules:
//  - if rule applies, it might succeed or fail the whole derivation
//  - if rule does not apply, it yields to another rule
// Then the whole logic becomes a chain of responsibility.

type DerivationAttempt[A] = DerivationResult[Option[A]]

extension [A](attempt: DerivationAttempt[A]) {
  def orElseTry(or: => DerivationAttempt[A]): DerivationAttempt[A] = ...
  def orFail(error: => DerivationError): DerivationResult[A] = ...
}

def attemptUsingGiven[A](ctx: DerivationContext[A]):
    DerivationAttempt[Expr[String]] = ...
def attemptBuiltInTypesSupport[A](ctx: DerivationContext[A]):
    DerivationAttempt[Expr[String]] = ...
def attemptCaseClassSupport[A](ctx: DerivationContext[A]):
    DerivationAttempt[Expr[String]] = ...
def attemptEnumSupport[A](ctx: DerivationContext[A]):
    DerivationAttempt[Expr[String]] = ...

def derivationLogic[A](ctx: DerivationContext[A]):
    DerivationResult[Expr[String]] =
  attemptUsingGiven(ctx)
    .orElseTry(attemptBuiltInTypesSupport(ctx))
    .orElseTry(attemptCaseClassSupport(ctx))
    .orElseTry(attemptEnumSupport(ctx))
    .orFail(DerivationError.TypeNotSupported(..., ctx.path.toString))

// And we can write some high level utilities to make the logic
// in the attempts above easier to read and maintain.

def typeAsCaseClass[A: Type, B](
  onField: [Val] => Type[Val] ?=> (String, Expr[Val]) => DerivationResult[B]
): Option[Expr[A] => DerivationResult[List[B]]] = ...

def typeAsEnum[A: Type, B](
  onCase: [Case] => Type[Case] ?=> Expr[Case] => DerivationResult[B]
): Option[Expr[A] => DerivationResult[List[B]]] = ...

And it is absolutely obvious which part is done by the compiler and which is performed at runtime: do you see Expr as the type, or ${} or '{}? That’s the code that will be created, so it will be run by the user. You don’t see them? Then it’s run by the compiler during macro expansion.
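
A tiny sketch of that rule (the names here are made up for the example): typeName is computed by the compiler during expansion, while the quoted block is the code that will run later in the user’s program.

import scala.quoted.*

def describe[A: Type](expr: Expr[A])(using Quotes): Expr[String] = {
  val typeName: String = Type.show[A]               // runs at compile time
  '{ $expr.toString + ": " + ${ Expr(typeName) } }  // runs at runtime
}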

And we can use all of that to our advantage.

Let’s say we are using semi-automatic derivation. No implicits in scope. It’s a pain if we need to generate it for nested structures, right?

Wrong! We are writing normal Scala code, so we can handle nestings with a plain old recursion!

def attemptCaseClassSupport[A](ctx: DerivationContext[A]):
    DerivationAttempt[Expr[String]] =
  typeAsCaseClass[A, Expr[String]] { [Field] =>
      (Field: Type[Field]) ?=>
      (name: String, field: Expr[Field]) =>
    
    // Handles each field by a recursion in a macro!
    derivationLogic(ctx.downField(name, field)).map { result =>
      
      // builds `name = value`
      '{ ${Expr(name)} + " = " + ${result} }
    }
  } match {
    // This type is handled as a case class, let's do it!
    case Some(handleCaseClass) =>
      handleCaseClass(ctx.expr).map { exprs =>
        
        // builds `TypeName(name = value, name2 = value2, ...)`
        val name = TypeRepr.of[A].show
        val params = exprs.reduceOption { (a, b) =>
          '{ ${a} + ", " + ${b} }
        }.getOrElse(Expr(""))
        Some('{ ${Expr(name)} + "(" + ${params} + ")" })
      }
    // It's not handled, yield to another rule.
    case None => Right(None)
  }

// This uses attemptCaseClassSupport as one of the derivation rules.
def derivationLogic[A](ctx: DerivationContext[A]):
    DerivationResult[Expr[String]] = ...

As a matter of fact, we can still try implicits - for overrides/customization rather than as the only game in town (the Jsoniter Scala way).

// Instead of defining implicit priorities in the companion
// we can use if-else or pattern-match on a type inside a macro!
def attemptBuiltInTypesSupport[A](ctx: DerivationContext[A]):
    DerivationAttempt[Expr[String]] =
  ctx.tpe match {
    case '[String] => ...
    case '[Int] => ...
    ...
    case '[Option[a]] => ...
    case '[Either[l, r]] => ...
    ...
  }

def derivationLogic[A](ctx: DerivationContext[A]):
    DerivationResult[Expr[String]] =
  // Givens are still used... but only to override the defaults!
  attemptUsingGiven(ctx)
    // No given? Then let's provide built-in support via a macro!
    .orElseTry(attemptBuiltInTypesSupport(ctx))
    .orElseTry(attemptCaseClassSupport(ctx))
    .orElseTry(attemptEnumSupport(ctx))
    .orFail(DerivationError.TypeNotSupported(..., ctx.path.toString))

All of that within a single macro expansion, creating a single type class instance, avoiding expensive implicit searches where possible. And since it’s all inside a single macro, you can micro-optimize and improve the UX, e.g.:

  • resolve implicits/givens only once
  • avoid unnecessary allocations
  • store each Expr[A] => Expr[String] (or whatever your computation is) as a local/private def
  • handle recursive data structures within a macro - by passing around symbols to the def you started implementing
  • provide complete paths to the nested fields/types when an error happens and the derivation is not possible
  • show users the derivation logic/time it took to expand a macro/the final result

“But if you make this derivation method an implicit/given, it will always resolve to calling itself when looking for implicits. So automatic derivation is not possible this way!” - you might say. And everyone would agree.

Sanely-automatic derivation

Well. There is a trick:

// TypeClass has a parent, with the same API!
trait TypeClass[A] extends TypeClass.AutoDerived[A]
object TypeClass {
  
  // Semi-automatic derivation - and instances hand-written
  // by users - would use TypeClass[A] type.
  inline def derived[A]: TypeClass[A] = ${ typeClassMacro[A] }
  
  sealed trait AutoDerived[A] {
    // define the same methods as TypeClass needs
  }
  object AutoDerived extends TypeClassAutoDerivedLowPriorityGiven
  
  import scala.quoted.*
  def typeClassMacro[A: Type](using Quotes): Expr[TypeClass[A]] = {
    // Inside a macro we would use
    //
    //   Expr.summon[TypeClass[...]]
    //
    // !
    // It would NOT summon instances of TypeClass.AutoDerived[...],
    // so the given TypeClass.AutoDerived.autoDerived[A] is ignored!
    ...
  }
}

trait TypeClassAutoDerivedLowPriorityGiven { this: TypeClass.AutoDerived.type =>
  
  // Automatic derivation would upcast to that parent type!
  inline given autoDerived[A]: AutoDerived[A] = ${ typeClassMacro[A] }
}

extension [A](value: A)
  // Then extension methods would use that super type.
  def useTypeClass(using tc: TypeClass.AutoDerived[A]) = ...

With this trick:

  • instances provided by users will have a higher priority than the one created by the macro - so you can override the behavior
  • the macro will not trigger itself when it starts - it only looks for a TypeClass, while its own given provides TypeClass.AutoDerived - so it can safely handle the recursion internally
  • you can still do semi-automatic derivation if you want to
  • errors will be reported even when something goes wrong in a nested structure
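
For example, from the user’s perspective (a sketch reusing the names above - Secret and Payload are made up for illustration):

// Hand-written instance in the companion: wins over the macro-provided AutoDerived.
case class Secret(value: String)
object Secret {
  given TypeClass[Secret] = ... // e.g. always print "[redacted]"
}
case class Payload(id: Int, secret: Secret)

// useTypeClass asks for TypeClass.AutoDerived[Payload]: the macro fires,
// derives support for Payload, and for the `secret` field its
// Expr.summon[TypeClass[Secret]] finds the user's given - no recursion into itself.
Payload(1, Secret("hunter2")).useTypeClass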

So, all the advantages of semi-automatic and automatic derivation, and more! It also works on Scala 2! Are there any drawbacks? Well, there are; the biggest one is that… Scala 3.7 broke this pattern with its change to given resolution. But it gave us something else to compensate for it, Expr.summonIgnoring:

trait TypeClass[A] {
  // all the type class methods
}
object TypeClass extends TypeClassLowPriorityGiven {

  import scala.quoted.*
  def typeClassMacro[A: Type](using Quotes): Expr[TypeClass[A]] = {
    // Inside a macro we would use:
    //
    //   val derivedSymbol = Symbol
    //     .classSymbol("com.example.TypeClass")
    //     .companionModule
    //     .methodMember("derived")
    //     .head
    //   Expr.summonIgnoring[TypeClass[...]](derivedSymbol)
    //
    // !
    ...
  }
}

trait TypeClassLowPriorityGiven { this: TypeClass.type =>
  
  // One given serves both: call it explicitly for semi-automatic derivation,
  // let implicit resolution find it for automatic derivation.
  inline given derived[A]: TypeClass[A] = ${ typeClassMacro[A] }
}

extension [A](value: A)
  // Then extension methods use the type class directly.
  def useTypeClass(using tc: TypeClass[A]) = ...

With that, we can simplify to have:

  • 1 interface again!
  • only 1 method for built-in types, automatic and semi-automatic derivation! (It enormously helps with keeping things backward compatible!)
  • that handles recursion internally!

allowing for more optimal code, faster compilation and better error messages!

That’s what I decided to call sanely-automatic derivation. Summarizing, it’s a derivation that:

  • supports both semi-automatic and automatic derivation
  • where you’re picking semi-automatic only as a means to preserve correctness (“a single source of truth”), rather than as a micro-optimisation mechanism
  • because it compiles fast enough even as automatic derivation
  • it also supports reporting errors for nested structures
  • it does not require wrapper types for intermediate data structures - the recursion is handled internally in a macro for both automatic and semi-automatic derivation
  • generating more optimal code

In other words, the “automated” part is sane, that’s why it’s “sanely-automatic”.

(And if you want automatic derivation to require an explicit import, to e.g. avoid accidental derivation, you can still define this given inline def outside the companion - just reuse the logic from the inline given derived, with the symbol of that imported inline given also excluded in its Expr.summonIgnoring).
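
Sketched, that could look roughly like this (the AutoDerivation name and its placement are made up, the idea is the same as above):

object AutoDerivation {
  // automatic derivation only kicks in after `import AutoDerivation.given`
  inline given autoDerived[A]: TypeClass[A] = ${ TypeClass.typeClassMacro[A] }
  // the macro would then also pass the symbol of this `autoDerived`
  // to Expr.summonIgnoring, next to the symbol of `derived`
}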

(And this pattern is implementable in Scala 2.13 as well, since I backported Expr.summonIgnoring to Scala 2.13.17 as c.inferImplicitValueIgnoring).

The elephant in the room

If it’s so awesome, why is no one else using it?

Good question, I ask it myself constantly.

One of the reasons is that Shapeless (and then Mirrors) were good enough to deliver something. If it compiles, it works; if it doesn’t, you have a bug. Complaining about the UX is for sissies, true men know how to binary-search for the root cause of an implicit resolution failure. It’s a skill issue, nothing wrong with the approach itself!

Another reason is that macros are hard: there was no manual, the API was low-level with next to no documentation, and you learned by printing trees of something that works and trying to reproduce it. Nobody taught them. Nobody shared any findings. There were no established best practices. Everyone treated them (and wrote them) as the worst kind of untyped JavaScript: finish it ASAP so that you can move on to greener pastures.

Then there is the issue of debugging: macros are a black box, and you have no idea what they do (as opposed to a gazillion implicits, pulled from god knows where).

Nowadays we also have a bonus: Scala 2 and Scala 3 have separate macro systems. Write once, then… write it again.

Still. Shapeless/Mirrors are not without issues. Why has nobody tried to make macros saner to use? It’s still code! You can refactor it, write reusable utilities, treat it as any other business code which should be guided by types, use self-explaining names, put things into a library if you see the same problem again and again…

I cannot understand why everyone missed this obvious opportunity for doing something so impactful: writing a library that would lower the bar for writing a good macro. With some good practices. And better UX for the users than what Shapeless/Mirrors could ever offer.

So, I decided to do it myself. Which is a PITA, but I still believe it’s necessary.

Hearth

This is a golden opportunity to write a good pitch for my “standard library for macros”, and turn this whole article into a crypto advertisement.

But I am already tired of writing this many words, so you are probably also tired of reading them.

And the point I wanted to make is not “you should download Hearth”, but: better UX is possible, macros are a way of achieving it, and if you think they are unusable, that is a solved problem - go nag your favourite library’s maintainers to step up their game, or the next generation of competitors will leave them in the dust!

More info?

If you want to learn what made me think that Hearth should become a thing, and that sanely-auto is the way, here are some presentations I did:

I believe these are a good source of knowledge about macros in general. Maybe I really should shill them more?

Summary

In this post we learned why yours truly considers Circe and all libraries based on its derivation mechanics a dead end, and all macro-free metaprogramming as fun toys that should not be used anywhere near production.

Hopefully, we also learned why he is not delirious: there already is a library that implements all the improvements described here (its name is Chimney, it precedes Hearth, and it has implemented sanely-automatic derivation since 0.8.0, using summonIgnoring in the 2.0.0 line), and people are rather happy with how it works.

Bad derivation UX/DX is a real problem. Slow compilation times mean a slow feedback loop, which means lost focus, which means we feel (and are) unproductive. Poor runtime performance might turn into a long, annoying debugging session in raw bytecode, because you won’t see the slowness in the code you wrote yourself. Crappy error messages slow down experienced developers and scare juniors. We all deserve better, and it can be fixed.

I hope that even if you don’t like the solution I proposed today, you acknowledge that it’s at least a proof by construction that we can do better. And then maybe some more people will start challenging the status quo.