Kinds of types in Scala, part 3: embedding some more info in a type
In the previous post, we understood how parametric types work, which lets us cover most of the cases we’ll have in our everyday coding. However, there are some interesting concepts, which, while not so heavily used, can come in handy at times.
Structural and refined types
We know that in math a class is a collection of objects that fulfill some conditions, and in programming we borrowed the idea. But what if we wanted to specify these conditions without giving them a name? In JavaScript, you can define a type like:
type User = { name: string, surname: string }
By most OOP languages’ standards, it is not a class. But in JavaScript and Go (and in mathematics), this is an allowed method to create a collection of values by a predicate (has name property of type string and surname property of type string) - in other words a type.
Scala also allows us to define a type this way:
type User = {
val name: String
val surname: String
}
We don’t have to extend User to make a value conform to the type - it is enough to fulfill the requirements:
case class Somebody(name: String, surname: String)
def needUser(user: User): Unit
needUser(Somebody("test", "test")) // works!
We can call User a structural type. It is an interesting way of implementing a type, one where compiler checks if object has all the right methods and fields instead of scanning the class hierarchy. A problem with such an approach is that at runtime Scala has to use runtime reflection to access these fields, so it comes with a performance penalty.
Structural types are not restricted to definitions using only vals and defs. We can demand that it extend some class as well:
trait X { val x: String }
type Y = { val y: Int }
val z: X with Y = new X { val x = "test"; val y = 0 }
So a compound type containing a structural type is also a structural type.
The example above shows us another interesting thing. If we asked the REPL about the type of:
new X { val x = "test"; val y = 0 }
the answer would be:
AnyRef with X{val y: Int}
Scala automatically adds the information about new fields and methods to the type! Actually, any time we add something extra to the interface, if we don’t upcast the returned value, Scala will reflect these extra properties in its type. We call such types the refined types, but they are just a special case of structural types.
It appears refined types are not unique to Scala. When local type inference arrived in Java, it also started to support them, though in a half-baked way. You cannot declare a structural type, so such types can only be used via vars, which hold information about refinement. This incomplete implementation can lead some Java programmers to think that such a feature is dangerous.
Path-dependent types
Let’s say we want to model a card game:
case class Card(color: String, value: String)
case class Player(name: String)
class Game {
def getPlayers: Set[Player] = ???
def getCards: Set[Card] = ???
def playerPlayCard(player: Player, card: Card): Unit = ???
}
However, in our application there could be several games running at once, so we are at risk of messing up objects from different games:
val game1: Game
val game2: Game
game1.playerPlayCard(game2.getPlayers.head,
game2.getCards.head)
Could we make sure that we can only use objects from the right Game and do it at compile time (assertions throwing at runtime might be a little too late for our needs)?
Path-dependent types are a way of stating that the type of one object depends on another object. It is a limited version of the more generic idea of dependent types. Actually, Dotty derives its name from Dependent Object Types calculus that Odersky designed to make future versions of Scala sound.
So, how could we define such a type?
class Game {
case class Card(color: String, value: String)
case class Player(name: String)
def getPlayers: Set[this.Player] = ???
def getCards: Set[this.Card] = ???
def playerPlayCard(player: this.Player, card: this.Card): Unit = ???
}
Let’s check what types will have values returned from game1 and game2:
val game1 = new Game
val game2 = new Game
game1.getPlayers // Set[game1.Player]
game2.getPlayers // Set[game2.Player]
We get different types, so we can expect that we cannot substitute one for the other:
game1.playerPlayCard(game2.getPlayers.head,
game2.getCards.head) // fails!
To create a path-dependent type, all we need to do is declare a type inside another type! It doesn’t have to be a class:
class X {
type Y = String
val y: Y = "y"
}
val x1 = new X
val x2 = new X
However, if it’s just a type alias, Scala can prove that two types are the same, and so it will not help us much:
def y(x: X)(y: x.Y): Unit = ()
y(x1)(x2.y) // no complaints: x1.Y = String = x2.Y
That is, unless we make it a little less obvious that the types are equal:
trait X {
type Y = String
val y: Y = "y"
}
class X2 extends X
val x1 = new X2
val x2 = new X2
y(x1)(x2.y) // fails!
It is not obvious anymore because we could e.g. make the type Y abstract:
trait X {
type Y
val y: Y
}
val x1 = new X {
type Y = String
val y: Y = "y"
}
val x2 = new X {
type Y = Int
val y: Y = 1
}
OK, but what if we wanted to relax our requirements at some point?
def takeAnyPlayer(p: ???): Unit
We need a way of passing a path-dependent type without its context if needed. This is possible via #:
def takeAnyPlayer(p: Game#Player): Unit
At this point there is very little we know for certain about our p. Scala most likely won’t be able to tell what the exact type is even if it would be obvious to you from the code. If there were some type constraints about the type, that would be guaranteed, you can rely on it. But anything that comes with the specification of path-dependent type is lost:
trait X {
type Y <: AnyVal { def toLong: Long }
val y: Y
}
val x1 = new X {
override type Y = Int
override val y: Y = 1
}
val x2 = new X {
override type Y = Long
override val y: Y = 1L
}
Hmm, here our type Y is defined directly inside a X instance, so Scala should be able to guess its exact type. We can confirm that indeed it is:
x1.y * x1.y // 1: Int
x2.y * x2.y // 1: Long
Similarly for #:
scala> trait X { type Y }
scala> :kind (X { type Y = Int })#Y
// Int's kind is A
We can use path-dependent types as a method return type (path-dependent methods). However, we are not allowed to create a function that uses path-dependent types. It will be introduced in Dotty under the name dependent function types.
Kind projector/type lambda
At this point we have enough information to understand what was happening inside some weird construction we used during partial application in a type constructor with multiple type parameters:
scala> :kind ({ type T[A] = Either[String, A] })#T
scala.util.Either[String,?]'s kind is F[+A]
We start with Either[_, _] and we want to partially apply String as the first type parameter. In order to do so, we:
- create a type alias
T[A] = Either[String, A], which creates a parametric typeT[A]with one type parameter, - we put it inside a structural type to create an opportunity for creating a path-dependent type,
- finally we extract
Tas a path-dependent type, such that Scala can tell its specific type exactly, - since we didn’t apply type parameters, we end up with a single parameter type constructor,
- we achieved partial-application of a type parameters to a type constructor, aka type lambda or kind projection.
Quite often kind projectors use λ to denote the output type. Because the syntax is quite messy, you might prefer to use a dedicated compiler plugin that would generate it for you. With kind-projector compiler plugin you would need to write just:
Either[String, ?]
In Dotty, type lambdas are supported natively with the syntax:
[T] => Either[String, T]
Kind projectors become more useful the more you dive into hardcore functional programming as promoted by Cats and Scalaz.
Self-types
Mixins. Sometimes we want to add some functionality to an existing class and traits are a great way of achieving that:
class Producer {
def produce: Int = 1
}
trait DoubleProducer extends Producer {
override def produce: Int = super.produce * 2
}
trait IncreaseProducer extends Producer {
override def produce: Int = super.produce + 1
}
class MyProducer extends Producer with DoubleProducer with IncreaseProducer
(new MyProducer).produce // 3
Thing is, if we do it that way, we cannot make Producer a trait or abstract class. Perhaps we would like to use it only as an interface, and then decorate the implementation (one of many).
In Scala, sometimes we need to refer to the current instance. Usually, we might use this, but if we nest type definitions, this will only point to the one we are currently inside. It makes it hard to access the outer object. We can give a name to this to make it less ambiguous:
trait Outer { self =>
trait Inner {
self // Outer
this // self.Inner
}
}
However, we might use type ascription to put a constraint on this self value (or whatever we decide to name it):
trait Producer {
def produce: Int
}
trait DoubleProducer { self: Producer =>
override def produce: Int = super.produce * 2
}
As we can see, we no longer need to extend decorated type - type ascription already allows us to access the decorated type’s members. Additionally, this self-type acts as a type constraint on any type, that would like to mix-in the trait.
class OneProducer extends Producer {
def produce: Int = 0
}
class MyProducer extends OneProducer with DoubleProducer // OK
class Invalid extends Producer with DoubleProducer // error!
Self-types might be used to enforce ordering of low-priority implicits:
object MyType extends MyTypeImplicits
trait MyTypeImplicits extends MyTypeLowPriorityImplicits {
// implicits
}
trait MyTypeLowPriorityImplicits { self: MyTypeImplicits =>
// low-priority implicits
}
That is not the only use case though. Some people use self-types as a way of type-safe dependency injection:
trait FirstServiceComponent {
trait FirstService
val firstService: FirstService
}
trait SecondServiceComponent {
trait SecondService
val secondService: SecondService
}
trait FirstServiceComponentImpl extends FirstServiceComponent {
self: SecondServiceComponent =>
val firstService = new FirstService {
// use secondService
}
}
trait SecondServiceComponentImpl extends SecondServiceComponent {
self: FirstServiceComponent =>
val secondService = new SecondService {
// use firstService
}
}
object Application
extends FirstServiceComponentImpl
with SecondServiceComponentImpl
Each component of such a construct is called a layer. When we compose layers together, we are baking a cake. This metaphor is the origin of the name of this pattern - a cake pattern. Many programmers and I discovered that such a bastardized mixin causes more troubles than it solves. Problems with extensions, maintenance and initialization order I already described in another article.
Creating new types out of existing: tagged and value types
Sometimes we don’t want to create a completely new product with 2 or more fields, nor a sum type. We also don’t want to use type alias, which doesn’t create a distinct type. We might want something like Surname or Name which is like String, but with no risk of accidentally passing values in the wrong way (surname as name, name as surname).
One way of handling these issues was designed by Miles Sabin for shapeless; the other one was provided by the language authors.
The solution proposed by Miles Sabin works by tricking the compiler into believing that our value is of a type it is not:
object tag {
def apply[U] = new Tagger[U]
trait Tagged[U]
type @@[+T, U] = T with Tagged[U]
class Tagger[U] {
def apply[T](t : T) : T @@ U = t.asInstanceOf[T @@ U]
}
}
To understand this, let us see an example:
sealed trait Name
sealed trait Surname
val name = tag[Name]("John") // String @@ Name
val surname = tag[Surname]("Smith") // String @@ Surname
This @@ construct takes two types: our initial type (for which we provide a value) and another type called a tag, which will serve as a way of making a newly created type unique. We can always demand a type String with Tagged[Name] (or String @@ Name if you prefer) - the fact that String is final is checked only during class declaration and the mismatch between String with Tagged[Name] could only be found at runtime.
Meanwhile, the only methods we will have declared will be methods from String - which our value surely is. JVM will never have an opportunity to complain, so as long as compiler is OK about it, it will work. It appears, that .asInstanceOf[String @@ Name] is enough to silence it.
Tagged types require that we define a trait/class that will serve as a tag. They can quite often be sealed traits that you won’t use anywhere else. As long as you have the tags, you can easily lift any type to a tagged type. In runtime, they will be untagged though, which might be advantage or disadvantage depending on situation (runtime reflection!). In compile time they will create different types, so you have to take that into consideration if you are using any type-level programming - type classes provided for primitives won’t work with tagged types (there is no one-standard for tagging) and you will have to handle it yourself.
To provide a standard solution to the mentioned problem, Scala came up with value types. A value type is basically a class extending AnyVal with one val member.
// for convenience people prefer to use case classes for it
case class Name(value: String) extends AnyVal
// though it is not required
class Surname(val value: String) extends AnyVal
object Surname {
def apply(value: String): Surname = new Surname(value)
}
Since there are few restrictions imposed on this class, the compiler has a lot of opportunities to optimize the wrapper away. Since it is a standard solution, many libraries handle it out of the box. The disadvantage is that the optimization doesn’t work reliably, so e.g. mocking is almost completely broken (because sometimes the value is boxed and sometimes wrapper is optimized away).
As some sort of middle ground, a newtype library was designed.
@newtype case class Name(value: String)
It lets you declare your new type as a case class (which helps with IDE support), while a macro annotation rewrites it into:
type Name = Name.Type
object Name {
type Repr = String
type Base = Any { type Name$newtype }
trait Tag extends Any
type Type <: Base with Tag
def apply(x: String): Name = x.asInstanceOf[Name]
// here AnyVal is used as extension methods optimization
implicit final class Ops$newtype(val $this$: Type) extends AnyVal {
def value: String = $this$.asInstanceOf[String]
}
}
Phantom and literal types
sealed traits for each tag would pollute the JVM with unused interfaces. Could that be avoided? Well, it is planned.
There is a pattern for implementing something that is basically a type-level state machine.
sealed trait Switch
sealed trait On extends Switch
sealed trait Off extends Switch
class Bulb[S <: Switch] {
def turnOn(implicit ev: S =:!= On): Bulb[On] =
new Bulb[On]
def turnOff(implicit ev: S =:!= Off): Bulb[Off] =
new Bulb[Off]
}
val bulb = new Bulb[Off]
bulb.turnOff // error!
bulb.turnOn // Bulb[On]
bulb.turnOn.turnOn // error!
bulb.turnOn.turnOff // Bulb[Off]
Because Switch, On and Off are not used for anything other than indicating the state with a type, they are called phantom types. Currently, it is just a pattern, but Dotty will let us mark a type as an erased term, which makes it exist only in compile time.
Another type-level-specific thing that will arrive — this time with Scala 2.13 — is a literal type. With libraries like shapeless you can express e.g. case class as a HList. Thing is, it only stores information about types and their positions. There is no information about properties names, so generating things like JSON codecs is impossible. Or would be if not for LabelledGenerics, which return HList of tuples of Witness-types. Wait, what is a Witness? Let’s try to figure out from an example:
case class Test(a: String, b: Int)
def getGeneric[T, U](
t: T)(implicit gen: Generic.Aux[T, U]): U = gen.to(t)
def getLabelledGeneric[T, U](
t: T)(implicit gen: LabelledGeneric.Aux[T, U]): U = gen.to(t)
getGeneric(test) // String :: Int :: HNil = "xyz" :: 1 :: HNil
getLabelledGeneric(test)
// labelled.FieldType[Symbol @@ a, String] ::
// labelled.FieldType[Symbol @@ b, Int] ::
// ops.hlist.ZipWithKeys.hnilZipWithKeys.Out =
// "xyz" :: 1 :: HNil
Interestingly, the LabelledGeneric created an instance, which stores the name of the field it originated from! (FieldType works the same way as tagged types - no extra information in runtime, but extra data in compile time).
It appears that the compiler can create a singleton type of just one value. Since it is defined by its literal it is also known as a literal type. But even if compiler knows such types, it doesn’t let us create one… without some hacking.
Witness is a helper trait which implementation is provided by a shapeless macro:
trait Witness extends Serializable {
type T
val value: T {}
}
object Witness extends Dynamic {
type Aux[T0] = Witness { type T = T0 }
type Lt[Lub] = Witness { type T <: Lub }
implicit def apply[T]: Witness.Aux[T] =
macro SingletonTypeMacros.materializeImpl[T]
implicit def apply[T](t: T): Witness.Lt[T] =
macro SingletonTypeMacros.convertImpl
// ...
}
These macros let us currently access the literal types, which is so far accessible only via compiler internals. But with Scala 2.13 they will become available without costly hacks in the language.
Summary
In this article, I intended to talk a bit about types in Scala from the perspective of type and set theories, but also explain certain idiosyncrasies of Scala when it comes to implementing them.
We can see that Scala has quite a powerful type system which is only going to get better. Even on its own, it is enough to give Scala a competitive edge over many other languages. It has allowed the development of many features which would be otherwise hard to implement, fragile or impossible.
If someone is looking for a compiled version of posts 1–3, it can be found here.