Implicits, type classes, and extension methods, part 2: implicit derivation

In the previous post, we learned a bit about type classes, as they were the major reason for introducing the mechanism. We could see that it is a great way of implementing the open-closed principle in a functional way: while the existing implementations are unchanged we can extend the behavior for new types. We haven’t seen though how to address one issue with them: how to provide behavior for potentially thousands of cases?

Type class derivation

Type classes are surely useful, but they wouldn’t as nearly useful as they are if not for the ability to create a new type class basing on existing instances and rules of how to compose them together automatically.

For instance:

trait Show[T] {
  
  def show(value: T): String
}

object Show {
  
  def apply[T: Show]: Show[T] = implicitly[Show[T]]
  def show[T: Show](value: T): String = apply[T].show(value)
  
  implicit val showString: Show[String] = s => s
  implicit val showInt: Show[Int] = _.toString
  
  implicit def showTuple[T: Show, U: Show]: Show[(T, U)] =
    tuple => s"(${Show.show[T](tuple._1)}, ${Show.show[U](tuple._2)})"
}
Show.show(12 -> "34") // "(12, 34)"

Deriving type classes like that would be quite limiting, while is where Shapeless comes in. It allows us to represent many classes (surely all algebraic data types that Scala supports: products and coproducts) in a generic form, which we can inductively iterate and build the final type class step-by-step. Usually, the process works like this:

  • find a generic representation for a type (heterogeneous list for products, coproduct for sum types) (1),
  • for generic representation try to find an implicit for the head, and implicit for the tail (2),
  • support recursion stop condition (empty HList/Coproduct) (3),
  • use a Generic to adjust code working on generic representation for your specific representation.
import shapeless._

object Show {
  
  ... // instances for the "primitives" we would build upon
  
  // (1)
  implicit def showProduct[S, T <: HList](implicit gen: Generic.Aux[S, T],
                                          show: Show[T]): Show[S] =
  s => {
    // (4)
    val hlist: T = gen.to(s) // our concrete S to hlist T translation!
    val result = show.show(hlist)
    // in some cases here we also get result basing
    // on generic representation and we have to convert it back.
    // it is not the case here, but might be in genreal
    result.substring(0, result.length - 2) // drop ", "
  }
  
  // (2)
  implicit def showCons[H: Show, T <: HList: Show]: Show[H :: T] =
    cons => Show[H].show(cons.head) + ", " + Show[T].show(cons.tail)
  
  // (3)
  implicit val showHNil: Show[HNil] = _ => ""
}
case class MyNewType(str: String, int: Int)

Show.show[MyNewType](NuNewType("test", 10)) // "test, 10"

Why do we need shapeless?

Let’s be clear: I don’t intend to provide a full lecture on shapeless here. There is already an excellent (and free) book about it named The type astronaut’s guide to shapeless. I only want to describe a general idea to encourage you to read it.

If we try to describe data mathematically we might find out that we will have to deal with one of 3 cases:

  • our data type is a primitive of a sort. We cannot (or don’t want to) split it into smaller parts,
  • it is a container of a sort. It might be a container of some number of elements of the same type (like lists or sets). It might be a container, where elements might have different type and we tell them apart from their position (tuples) or a name/label associated with them (a record/class attribute/field). We can describe them all as different variants of a Cartesian product. When it comes to types we call them product types,
  • our type is an alternative of 2 or more other types. Though each of them could exist in separation, we also want to be able to tell: T is one of X, Y, Z. Mathematically, the set of such type’s values is a sum of sets of values of each of the component types. Because of that, they are called sum types, or (as a complement to product types) coproducts. When we use this concept in the context of a category theory and FP, we would put an additional add constraint: sets of values of each component type must be disjoint so that you could tell unambiguously to which subset your value belongs. Because of that we also call them disjoint unions.

Products and coproducts together form algebraic data types (ADTs).

If we wanted to use such knowledge to get a type class, the first case is easy. How, about other two?

case class A(s: String, i: Int)
case class B(x: String, y: Int)
type C = (String, Int)

We can see here, that all of the types are in a way one and the same: various implementations of the same Cartesian product String×IntString \times Int. As such, we could conclude, that whatever mechanisms we have for them could be shared (that is, as long as labels don’t matter).

A way of handling it is by providing some common representation. If possible, it should be one that could be inductively built. As a matter of the fact, there is such generic representation for a product type called heterogeneous list (HList).

sealed trait HList

// empty HList
sealed trait HNil extends HList {
  
  def ::[H](head: H): H :: HNil = ::(head, HNil)
}
case object HNil

// prepend element
final case class ::[H, T <: HList](head: H, tail: T) extends HList
"test" :: 10 :: HNil // String :: Int :: HNil

If we obtain such representation it is easy, to build up a type class for a whole generic representation one implicit at a time.

object Show {
  
  ...
  
  implicit def showCons[H: Show, T <: HList: Show]: Show[H :: T] =
    cons => Show[H].show(cons.head) + ", " + Show[T].show(cons.tail)
  
  implicit val showHNil: Show[HNil] = _ => ""
}

The most problematic thing would be: how to translate to generic and back? This is exactly the reason why shapeless was made: to provide such standard implementation as well as a mechanism to handle the translation. This mechanism is Generics.

object Show {
  
  ..
  
  implicit def showProduct[S, T <: HList](implicit gen: Generic.Aux[S, T],
                                          show: Show[T]): Show[S] =
  s => {
    val hlist: T = gen.to(s)
    val result = show.show(hlist)
    result.substring(0, result.length - 2) // drop ", "
  }
}

An instance of a Generic will be generated automatically by shapeless (which underneath uses macros to create them) for cases where it is possible (simplifying: as long as you use sealed traits for sum types and case classes for product types, you are good to go).

Each Generic has 2 methods: to taking the specific representation and translating it to generic representation and from which translates to specific representation from generic one.

There is also a twin generic representations for coproducts, also handled by generics.

sealed trait Coproduct

sealed trait :+:[+H, +T <: Coproduct]

// value is the "left" element of the sum (head)
case class Inl[+H, +T <: Coproduct](head : H) extends :+:[H, T]

// value is one of types in the tail
case class Inr[+H, +T <: Coproduct](tail : T) extends :+:[H, T]

// exists only for the sake of pattern matching and implementation simplicity
sealed trait CNil extends Coproduct
object Show {
  
  ...
  
  implicit def showCons[H: Show, T <: Coproduct: Show]: Show[H :+: T] = {
    case Inl(head) => Show[H].show(head)
    case Inr(tail) => Show[T].show(tail)
    case CNil      => ""
  }
  
  implicit def showCoproduct[S, T <: Coproduct](implicit gen: Generic.Aux[S, T],
                                                show: Show[T]): Show[S] =
  s => {
    val coproduct: T = gen.to(s)
    show.show(coproduct)
  }
}

Notice, that there are no separate Generic types for HLists and Coproducts. What you’ll get depends only on type bounds you place on an implicit.

Implicit evidence and aux pattern

You might have noticed something interesting about the generic representation in the example above.

Generic.Aux[T, U]

The definition of a Generic in shapeless looks like this:

trait Generic[T] extends Serializable {
  /** The generic representation type for {T},
      which will be composed of {Coproduct} and {HList} types  */
  type Repr

  /** Convert an instance of the concrete type to the generic value
      representation */
  def to(t : T) : Repr

  /** Convert an instance of the generic representation to an instance of
      the concrete type */
  def from(r : Repr) : T
}

So, each Generic contains a type Repr with information what is the generic representation for a specific representation T. If we used path-dependent types we would refer to it as:

Generic[T]#Repr // or
gen.Repr

However, this would make the Repr virtually useless to us due to nature of path-dependent types. We could try to access it like:

Generic[T]#Repr =:= U

and then try to access the U. In such way, we would attempt to prove, that there is U type which is identical to Generic[T]#Repr. Such proof is named implicit evidence and if we don’t use it for anything else we usually name the variable ev.

import shapeless.=:!=

def changeType[A, B](list: List[A])(f: A => B)(implicit ev: A =:!= B): List[B] =
    list.map(f)

changeType(List(1,2,3,4))(_.toString)  // List("1", "2", "3", "4")

changeType(List(1,2,3,4))(_ * 2) // compilation fails

Scala provides us with 2 generalized type constraint and shapeless adds another one:

  • =:= proves that 2 types are equal,
  • <:< proves that one is strictly a supertype of another but not equal,
  • <%< used to prove that one must be viewable as the other - it allows both upcasting as well as implicit conversion, which is a reason why it was deprecated and removed (we expand on this concept later on),
  • =:!= proves that the types are not equal. This one is provided by shapeless.

Other useful proof is e.g. ClassTag. It proves that type has a Class[_] instance and provides it. It is not a general rule, but most of the proofs’ instances mentioned here are generated by the compiler when we ask for them.

In the context of Generic it means that what we try to achieve is to have 2 types, S and T, and we want to let the compiler figure out one of them and prove that they could be transformed to and from via a Generic instance.

Sadly, due to path dependent types and limitations of the Scala compiler, this would not work: the compiler will simply give up and won’t return any implicit proof.

However, we might trick it by representing the types in a way, where types don’t depend on each other:

object Generic {
  
  type Aux[T, Repr0] = Generic[T] { type Repr = Repr0 }
}

This trick which lets the compiler figure out the missing type and create a Generic representation for us is known as aux pattern. Aux means helper and indeed it is a type declaration used solely as a utility to help the compiler figure out the types.

Other shapeless utilities

You probably noticed that Generic doesn’t give you any information about labels each field had originally, nor what was the original name of a type in a coproduct.

This is why next to generic there are also LabelledGenerics which encode the original label using Witnesses (a workaround for pre-2.13 Scala not having literal types).

sealed trait A
case class A1(int: Int) extends A
case class A2(str: String) extends A

def labelledH[S, T <: HList](
  get: S)(implicit gen: LabelledGeneric.Aux[S, T]) = gen.to(get)
def labelledC[S, T <: Coproduct](
  get: S)(implicit gen: LabelledGeneric.Aux[S, T]) = gen.to(get)

labelledH(A1(10))
// Int with labelled.KeyTag[Symbol with tag.Tagged[int], Int] :: HNil = 10 :: HNil

labelledC(A1(10): A)
// A1 with labelled.KeyTag[Symbol with tag.Tagged[A1], A1] :+:
//   A2 with labelled.KeyTag[Symbol with tag.Tagged[A2], A2] :+:
//   CNil = Inl(A1(10))

As a matter of the fact, LabelledGenerics are a mechanism which is used in libraries like Circe to automatically derive codecs for ADTs.

Summary

In this post, we touched the idea of a type-class-derivation. While this may not be enough for you to cover all use cases, it should be enough to let you understand where derivation is used and whether you could use it for handling your use case. If so I again recommend you to take a look at The type astronaut’s guide to shapeless.

In a next post we’ll talk a bit about implicit conversions - the implicits use case that is the most controversial. We also introduce few patterns, that would be impossible without them.