Kinds of types in Scala, part 2: take type, return type or type parameters

In the previous post, we laid the foundation for understanding the type system in Scala. But concrete types alone would not be enough to make the language truly expressive. So now we’ll try to parametrize them.

Type constructors and parametric types

We can think of a type as a set. But we can also think of a function as a set: a set of argument-result pairs in which each argument appears at most once. This lets us think of a function signature as a type as well: the set of all such sets of pairs whose arguments belong to one type and whose results belong to another:

  • type T - a set of values

  • t: T - a value belonging to the set T (t ∈ T)

  • type S => T - the set of subsets of S × T such that each element is a function S → T (which means that the first element of each pair must be unique within a subset):

    \{ f : (f: S \rightarrow T) \} \iff \{ f : f \subset S \times T \land \forall_{(s_1, t_1), (s_2, t_2) \in f} (s_1 = s_2 \implies t_1 = t_2) \}
  • f: S => T - a single such subset of pairs, where the first element of each pair is unique

    f: S \rightarrow T \iff (f \subset S \times T \land \forall_{(s_1, t_1), (s_2, t_2) \in f} (s_1 = s_2 \implies t_1 = t_2))

A little bit mind-numbing, but not as much as the next discovery: we could pass a set as an argument of a function and receive another set as a returned value.

Let’s build one such function:

  • at first, let’s build a set of all possible types:

    T' = \{ T : isType(T) \}
  • we can define a list of a type T as a function from an index (a natural number) to a value t: T (which is not necessarily a total function - it has no defined values after its last index)

    isList_T(l) = \forall_{(i, t) \in l} ( i \in \mathbb{N} \land t: T \land \forall_{(i_2, t_2) \in l} (i = i_2 \implies t = t_2 ))

    or shorter

    isList_T(l) = l: \mathbb{N} \rightarrow T
  • now, we can build the set of all possible lists of a type T:

    L_T = \{ l : isList_T(l) \}
  • the next step would be to build the set of all possible lists of a type T, for each possible type T:

    L = \{ L_T : isType(T) \}
  • finally, we can define a function which takes one T ∈ T' and returns another set L_T ∈ L:

    List: T' \rightarrow L
    List(T) = \{ l : isList_T(l) \}

What we built here is a function that takes a type T and returns… a type List_T, or List[T] using Scala’s notation.

Such a function, which takes a type and builds another type, is called a type constructor. In Scala, we could denote the List type constructor as List[_].
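
For a hands-on feel, a type constructor can be as simple as a parameterized type alias. Here is a tiny sketch (StringMap is a made-up name, not anything from the standard library):

// a user-defined type constructor: given a type V, it builds the type Map[String, V]
type StringMap[V] = Map[String, V]
// applying it to a type argument yields a proper type
val prices: StringMap[Double] = Map("apple" -> 1.5)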

Type constructors are interesting to us because they are a mathematical way of defining parametric types, known as class templates to C++ programmers and generics to Java programmers (though, to be honest, from a mathematician’s point of view those are crippled implementations of parametric types). (You will also see this term frequently when you start reading the definitions of type classes on the Haskell wiki).

Where parametric types differ from what you might know from C++ or Java is in the consequences of their functional definition. You can have a higher-order function: a function that takes and/or returns another function. The same goes for type constructors: you can have a type constructor that takes and/or returns another type constructor. Many type class definitions take a type constructor as a parameter:

trait Functor[F[_]] { // <- F is a type constructor
  
  def map[A, B](fa: F[A])(f: A => B): F[B]
}
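
To make this concrete, here is a minimal sketch of an instance of the Functor trait above for Option (not tied to any particular library's Functor):

val optionFunctor: Functor[Option] = new Functor[Option] {
  // Option is passed where the type constructor F[_] is expected
  def map[A, B](fa: Option[A])(f: A => B): Option[B] = fa.map(f)
}

optionFunctor.map(Some(2))(_ * 3) // Some(6)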

Kinds and higher-kinded types

These types of types (a type, a type constructor, etc.) are known as kinds. Scala lets us investigate them in the REPL using :kind:

scala> :kind String
String's kind is A

scala> :kind List
List's kind is F[+A]

scala> :kind Either
Either's kind is F[+A1,+A2]

scala> import scala.language.higherKinds
scala> trait NeedTC[F[_]]
scala> :kind NeedTC
NeedTC's kind is X[F[A]]

You can add the -v flag to get a detailed description with type-theory notation (* is read as type):

scala> :kind -v String
String's kind is A
*
This is a proper type.

scala> :kind -v List
List's kind is F[+A]
* -(+)-> *
This is a type constructor: a 1st-order-kinded type.

scala> :kind -v Either
Either's kind is F[+A1,+A2]
* -(+)-> * -(+)-> *
This is a type constructor: a 1st-order-kinded type.

scala> import scala.language.higherKinds
scala> trait NeedTC[F[_]]
scala> :kind -v NeedTC
NeedTC's kind is X[F[A]]
(* -> *) -> *
This is a type constructor that takes type constructor(s): a higher-kinded type.

But functions allow us to do partial application. Can we do the same for types in Scala? Well, we can, but in a messy way.

scala> :kind ({ type T[A] = Map[String, A] })#T
scala.collection.immutable.Map[String,?]'s kind is F[+A]

scala> :kind -v ({ type T[A] = Map[String, A] })#T
scala.collection.immutable.Map[String,?]'s kind is F[+A]
* -(+)-> *
This is a type constructor: a 1st-order-kinded type.

(Such a thing is called a type lambda, but we’ll talk more about it when we learn more about structural and path-dependent types. Let’s skip detailed explanations for now).
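
The practical payoff is that such a partially-applied type can be passed wherever a type constructor is expected, e.g. to the NeedTC trait defined above (a sketch; MapTC is a made-up name):

// Map[String, _] with its first parameter fixed has kind * -> *,
// so it fits NeedTC's F[_] parameter
class MapTC extends NeedTC[({ type T[A] = Map[String, A] })#T]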

Among the detailed descriptions, there are terms like 1st-order-kinded types and higher-kinded types. Indeed, the same way functions taking/returning functions are called higher-order functions, type constructors taking/returning type constructors are called higher-kinded types.

Let’s take a look at how values and functions correspond to types, and how it is reflected in types and type constructors correspondence to kinds:

| Value | Type | Notes |
|-------|------|-------|
| "test" | String | a plain value |
| (i: Int) => i*2 | Int => Int | a 1st-order function |
| (a: Int) => (b: Int) => a+b | Int => Int => Int | a higher-order function |
| ((i: Int) => i*2)(2) | Int | a plain value |
| ((a: Int) => (b: Int) => a+b)(2) | Int => Int | a 1st-order function |

and on a type level:

| Type | Kind | Notes |
|------|------|-------|
| String | * | a proper type |
| Int => Int | * | a proper type (a 1st-order function) |
| Int => Int => Int | * | a proper type (a higher-order function) |
| List | * -> * | a type constructor, a 1st-order-kinded type |
| Either | * -> * -> * | a type constructor, a 1st-order-kinded type |
| List[A] | * | a proper type |
| ({ type T[A] = Either[String, A] })#T | * -> * | a type constructor, a 1st-order-kinded type |

This table shows us how elegantly values correspond to types, functions correspond to type constructors, and function application corresponds to passing a type to a type constructor. Values can be thought of as nullary (0-argument) functions, while types can be thought of as nullary type constructors, and the correspondence still works perfectly! Kinds are the same idea as types, just one level of abstraction higher.

As far as I know, there are even higher levels in the values-types-kinds hierarchy, such as sorts, but they hardly ever have practical applications.

Type constraints

We already talked about type bounds, and how we might impose restrictions on the inferred type just by giving it values it has to infer against. With parametric types, we might be able to do it more directly.

Let’s say we have an ADT like this:

sealed trait User { val id: Int }
case class Member(id: Int, name: String) extends User
case class Admin(id: Int, access: Set[String]) extends User

We would like to have a function that takes a set of users and returns a map from id to user… except we would like that map to adjust: if we pass a set of Users it should return a Map[Int, User], and if we pass a set of Members it should return a Map[Int, Member].

Our first attempt would probably look like this:

def byId(users: Set[User]): Map[Int, User] =
  users.map { u => u.id -> u }.toMap

It works, but it is not generic: even for a Set[Member] it returns a Map[Int, User]. We wanted it to be generic.

def byId[U](users: Set[U])(getId: U => Int): Map[Int, U] =
  users.map { u => getId(u) -> u }.toMap

It also works, but it forces us to pass an extractor every time - not a bad idea if it did something different each time, but here it only needs to read the same property that all our cases share via the User trait. If only there were a way to suggest to the type system that U needs to extend User (or has User as an upper bound), and then make use of its members…

As a matter of fact, there is.

def byId[U <: User](users: Set[U]): Map[Int, U] =
  users.map { u => u.id -> u }.toMap
byId(users: Set[Member]) // Map[Int, Member]

<: denotes an upper bound on a type parameter. It looks like this so that the parser won’t confuse it with <, but its meaning is similar - the type on the left is smaller (lies lower in the hierarchy) than the type on the right. There is also a notation for a lower bound: >:

def recover[E, A, B >: A](
    either: Either[E, A])(f: E => B): Either[E, B] =
  either match {
    case Left(e)  => Right(f(e))
    case Right(a) => Right(a)
  }
recover[String, Admin, User](err: Either[String, Admin]) { _ =>
  fallback: Member
}
// Either[String, User]

These are the type constraints provided by the language. Besides them, there are also generalized type constraints, which exist as type classes provided by the compiler (if the compiler cannot create an instance, it means that the types fail the constraint):

  • <:< - lets us require that one type is a subtype of the other.

    def upcast[A, B](set: Set[A])(
      implicit ev: A <:< B
    ): Set[B] = set.map(ev(_))
      
    upcast[Member, User](Set(m: Member)) // Set[User]
    

    It is defined inside a scala.Predef.

  • =:= - lets us require that one type is equal to the other.

    def update[A, B](set: Set[A])(f: A => B)(
      implicit ev: A =:= B
    ): Set[B] = set.map(f)
      
    val members: Set[Member]
      
    update[Member, Member](members)(identity) // ok
    update[Member, User](members) { member =>
      member: User
    } // compilation error!
    

    It is defined inside a scala.Predef as well.

  • <%< - already removed from Scala, but it used to denote that type A is either a subtype of B or could be implicitly converted to it.

  • =:!= - provided by shapeless. Proves that two types are different (see the sketch below).
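
    A minimal sketch of how it might be used, assuming shapeless is on the classpath (changeType is a hypothetical helper named only for illustration):

    import shapeless.=:!=

    def changeType[A, B](set: Set[A])(f: A => B)(
      implicit ev: A =:!= B
    ): Set[B] = set.map(f)

    changeType[Member, User](members) { member =>
      member: User
    } // ok
    changeType[Member, Member](members)(identity) // compilation error!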

If you require that there be a type class in scope for your type parameter:

def doSomething[F[_]: Monad]: F[String]
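
This context-bound syntax is just sugar for an implicit parameter, so the "constraint" is an implicit Monad instance the compiler must find (doSomethingDesugared is a made-up name for illustration):

// the context bound above desugars to an implicit parameter list
def doSomethingDesugared[F[_]](implicit monad: Monad[F]): F[String]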

I have met people who would also call this implicit syntax for type classes a type constraint (here: I require the type F to have a Monad instance).

Variance

Type constraints are useful when we want to describe our expectations about a type parameter and/or the returned type. But there are use cases where they are not enough.

Let’s take a look at Option. The first implementation would be something like:

sealed trait Option[T] {}
case class Some[T](value: T) extends Option[T]
case class None[T]() extends Option[T]

It is not something we are used to. We don’t want to create a new None for each type. Actually, we want just one case object, which we could use as a valid Option[T] everywhere. However, if we tried to:

sealed trait Option[T] {}
case class Some[T](value: T) extends Option[T]
case object None extends Option[Nothing]

then, when we tried to use None as an Option[String] or an Option[Int], we would get a type error.

def f(stringOpt: Option[String]): Unit
f(None) // compilation error!

The reason for that is that we defined Option as an invariant parametric type. Invariance means that even if B <: A, Option[B] still cannot be substituted for Option[A]. It is the default, and a good one!

A long time ago, Java let us do such things with arrays. In Java, you can create an Array[Member] and pass it as a valid parameter to a function expecting Array[User]. That function assumes that the array can contain anything as long as it’s a User, so it can try to insert an Admin into the array. Then you end up with a java.lang.ArrayStoreException: otherwise you would suddenly have a reference to an array of Members that contains an Admin, and the JVM recognizes such a state as invalid. To avoid a repetition of this problem, when Java introduced generics, it made them invariant without an option to change the behavior.

But the issue only appears if you share mutable state. If you have an immutable object, you can, for example, pass a List[Member] as a List[User], or an Option[B] as an Option[A] (where B <: A), and you will never end up with invalid behavior (if we modified something, we would create a modified copy, so the original would still fulfill its requirements).

A situation in which we explicitly state that if B :> A then F[B] :> F[A] is called covariance. To mark a type parameter as covariant we use +:

sealed trait Option[+T] {}
case class Some[+T](value: T) extends Option[T]
case object None extends Option[Nothing]
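
Now the earlier example compiles, because Option[Nothing] is a subtype of Option[String]:

def f(stringOpt: Option[String]): Unit
f(None) // works: Option[Nothing] <: Option[String]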

Covariance propagates down, so if we make T in Option covariant, we will receive complaints if we don’t do the same for T in Some. Usually, when we mark the parameter as covariant and we extend the type as an object, e.g. an empty container (None, Nil, EmptySet), we use Nothing as the hardcoded type - as the bottom type it subtypes all types, so such an object can always be upcast. Additionally, it is a nice indication that this container is empty, as it is literally a container of nothing.

(As we can guess, most immutable collections would be covariant, while most, if not all, mutable collections are invariant).

It is quite common to have a situation where we want to pass a value of a specific parametric type where a more generic one is expected. What about the opposite situation, when we have a consumer of generic values and we want to use it where a consumer of more specific values is expected:

trait Consumer[T] {
  def consume(value: T): Unit
}

val members: List[Member]
def consumer: Consumer[User]
members.foreach(consumer.consume) // error!

The case where we want to say that A <: B implies F[A] >: F[B] is called contravariance, as it is the opposite of covariance. We denote it with a - sign:

trait Consumer[-T] {
  def consume(value: T): Unit
}

val members: List[Member]
def consumer: Consumer[User]
members.foreach(consumer.consume) // works

By analogy with covariance and the bottom type Nothing, Any (the top type) would be the type parameter of a consumer that consumes any value.
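
For example, a quick sketch on top of the Consumer trait above (printAnything is a made-up name):

val printAnything: Consumer[Any] = new Consumer[Any] {
  def consume(value: Any): Unit = println(value)
}
// thanks to contravariance, a Consumer[Any] can be used as a Consumer[Member]
val memberConsumer: Consumer[Member] = printAnything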

Covariance and contravariance are very important to functional programming in Scala. They allow for intuitive behavior for the central element of the function-as-a-first-class-citizen approach, which is the function itself:

trait Function[-A, +B] {
  def apply(a: A): B
}

It is obvious to us that a function that takes a User should also take a Member, and a function returning a Member could be used as a substitute for a function returning a User. This also shows us that invariance, covariance, and contravariance - known together as variance - describe a type parameter and not the whole type. Here A is a contravariant type parameter of Function while B is a covariant type parameter.

(Actually, Scala’s function definition looks a little bit different - and depends on its arity - but the principle holds. It also holds for a type derived from a function, a PartialFunction).
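
A quick illustration with the User/Member types from earlier (a sketch; the val names are made up):

val idOf: User => Int = _.id
val memberId: Member => Int = idOf    // ok: contravariant in the argument type
val asIs: Member => Member = identity
val widened: Member => User = asIs    // ok: covariant in the result type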

Type theory also allows bivariance - a case where a type parameter is covariant and contravariant at the same time - but Scala doesn’t allow such a situation.

Existential types

We talked a bit about type parameters, how we can put constraints on them, and how they translate to a whole type being a super/subtype. But what should we do when we don’t care?

def count[T](seqs: Seq[T]*): Int = seqs.map(_.size).sum
count(Seq(1,2,3), Seq("test")) // 4

Here, we don’t actually use the type T at all. It should work for all Ts we pass (if we add some type constraints, then for all T that match them). So, we can describe this function with a universal quantifier:

\forall_T count: Seq_{Seq_T} \rightarrow int

Because we can describe it with a universal quantifier, we could call the type Seq[T] (and Seq[Seq[T]]) a universal type.

But what if our function count were part of another function or a class, and that was the only place where T was needed, so we would have to declare it there just to pass it down? Or what if we wanted to store sequences in a map but didn’t care about their element type, because each of them could be of some other type (ClassTag[T] -> Set[T])?

If this were Java, we could use ? to indicate that we don’t care about the type:

int count(java.util.List<?>... seqs) {
   return Arrays.stream(seqs).mapToInt(seq -> seq.size()).sum();
}

This ? tells us something like: we know that there is some concrete T we could put here, but we don’t care about it at the moment. We don’t know any details about this T and, to be honest, we don’t care. All we need is the assumption that there is one. In math, we could express that with an existential quantifier:

seq: Seq_? \iff \exists_{T} seq: Seq_T
count: Seq_{Seq_?} \rightarrow int

By analogy, we could name such a type an existential type. Scala lets us express it in two ways:

def count(seqs: (Seq[T] forSome {type T})*): Int = 
  seqs.map(_.size).sum

forSome is syntax that exists solely for existential types, I believe. The other, shorter version is:

def count(seqs: Seq[_]*): Int = 
  seqs.map(_.size).sum

which looks dangerously similar to type constructor syntax. As a matter of fact, we can distinguish these two different usages merely by the context in which they appear: if F[_] appears in the definition of a type parameter it is a type constructor; if it is used as the type of a variable or parameter it is an existential type.
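
The distinction in one place (a sketch; both names are made up):

def needsTC[F[_]](fa: F[Int]): F[Int] = fa // F[_] in a type parameter list: a type constructor
def countAll(seqs: Seq[_]*): Int = seqs.map(_.size).sum // Seq[_] as a parameter's type: an existential type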

There are a few things that would be difficult to implement without existential types (fixed-point types in recursion schemes, I guess?), but in our everyday life, we don’t use them that often. During his work on Dotty, Odersky found out that some of their applications were among the things making the language unsound, which is why the forSome notation was removed from Dotty.

When a mathematician says that some theory is sound, it means that every formula that can be logically derived from the axioms and rules of the theory is actually true.

In the case of a type system, it means that if our type system decides that some expression has a type T, then if you run it, the returned value will also be of type T.

It might be surprising to some, but (besides null, which breaks most type systems that allow it, as everybody is aware) there are a lot of cases in which a language can be broken and forced to return a value of an invalid type.

F-bound types

The last type-parameter-related types I want to talk about are F-bound types. The name originates from System F, the type system that first allowed their existence. But what are they and what problem do they solve? Let’s look at an example:

trait Number {
  
  def +(number: Number): Number
  def *(number: Number): Number
}

class Int extends Number {
  
  def +(number: Number): Number
  def *(number: Number): Number
}

class Double extends Number {
  
  def +(number: Number): Number
  def *(number: Number): Number
}
val a: Int
val b: Int
val c = a + b // Number!

In our example, if we defined our interface without any type parameters, we would end up with a situation where any value of a specific Number implementation loses the information about its specific type. Also, without any warning, it would allow adding together unrelated implementations:

val a: Char // Char extends Number
val b: Double
val c = a + b // Number, but what Number exactly?

We might make our type Number parametric so that we could control the type of parameters and returned values:

trait Number[N] {
  
  def +(number: N): N
  def *(number: N): N
}

class Int extends Number[Int] {
  
  def +(number: Int): Int
  def *(number: Int): Int
}

class Double extends Number[Double] {
  
  def +(number: Double): Double
  def *(number: Double): Double
}
val a: Int
val b: Int
val c = a + b // Int!

The issue with operations on unrelated implementations also disappears:

val a: Char
val b: Double
val c = a + b // error!

However, now we have a new issue:

class User extends Number[Int] // why is that a thing?

What we need here is some sort of restriction requiring that the type parameter be the very type that extends the parameterized trait. In Scala, we can do it in two ways:

  • using self-types:

    trait Number[N] { self: N =>
      // ...
    }
    

    (we’ll talk more about self-types later on),

  • using type constraints:

    trait Number[N <: Number[_]] {
      // ...
    }
    

    (Here Number[_] is an existential type, not a type constructor).

    This definition relies on a certain limitation of Scala (and Java) - we cannot implement the same parametrized type with different parameters in one class. Therefore:

    class Double extends Number[Double]
    class X extends Number[Double] // fails - different type params in Numbers
    class Y extends Number[String] // fails - String is not a Number
    

Both ways help us ensure that the implementation is type-recursive.
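
A common combined form puts the recursion directly in the bound and adds the self-type on top. A sketch (Complex is a made-up implementation, not from the text above):

trait Number[N <: Number[N]] { self: N =>
  def +(number: N): N
}

final case class Complex(re: Double, im: Double) extends Number[Complex] {
  def +(number: Complex): Complex = Complex(re + number.re, im + number.im)
}

class User extends Number[Complex] // rejected: User does not conform to Number[Complex]'s self-type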

This is quite a popular way of solving the initial problem in languages that don’t support (or encourage) type classes. In Java, it is the built-in way of implementing enums (enum X becomes abstract class X extends Enum<X>). C++ knows F-bound types under the name Curiously Recurring Template Pattern (CRTP).

This pattern is certainly less encouraged in languages that DO support type classes (and do not support classes and inheritance, which means half the FP languages).

Summary

In this post, we talked about type constructors, kinds, type constraints, variance, existential types, and F-bound types. As a matter of fact, at this point we have enough knowledge to model any program we like in a nice way.

However, there are some other concepts related to types in Scala that are worth mentioning. They aren’t improvements to concepts you might know from other mainstream languages, but they add a way of encoding some more information about the data in the type.

In the next post, we will talk about them to fill the gaps and make sure the type system won’t have much left to hide from us!