IO monad: which, why and how

27 May 2019• series: Mathematics and Computer Science • tags: scala io monad cats scalaz zio effect eff free freer ttfi

In many discussions about FP - especially about purely functional programming - there are talks about IO monad, IO type and so on. Its users’ argument on how it is better than your standard imperative approach, how it helps reason, encapsulate side-effects and design more elegant programs. On the other hand, its opponents argue, that programs are side-effectful by nature, so you would end up with something that needs IO for like 80% of the codebase while learning curve for newcomers would go up, so what the benefit? Let’s try to figure out.

Imperative programming in Haskell

Haskell is a purely functional language. This means that you cannot just run StdIn.readLine and get a String from the user. Instead, the language will force you to use only pure values and function, so that no side effects happen. But we want side effects (sometimes), how can we query a database, read from or write to stream, fetch from online service, etc?

The answer functional programmers arrived at is: represent all side effects as data, sort of command object having all information necessary to perform side effect… which is not performed, at least not by you. It is done by the compiler when it runs the main function, so your code contains only pure values and functions working on pure values (which are also pure values). We can imagine it in Scala more or less like this:

sealed trait SideEffect[Result]
object SideEffect {
  final case class WriteLine(line: String)
      extends SideEffect[Unit]
  case object ReadLine
      extends SideEffect[String]
  final case class Curl(url: String,
                        method: String,
                        headers: List[String])
      extends SideEffect[String]
}

// inaccessible to us in any way
def run[A]: SideEffect[A] => A = {
  case WriteLine(line)            => ...
  case ReadLine                   => ...
  case Curl(url, method, headers) => ...
}

// taken by compiler to use `run` on it
// to perform side effects
def main(args: Array[String]) = ...

But how would the compiler figure out how it would work? Well, originally Haskell used List[Request] => List[Response] format. Request was basically ADT representing all possible side effects, and Response was ADT for results. And main was the only place which could know how to perform List[Request] => List[Response]. For a lot of reasons, this is quite bad (for instance, nothing stops you from skipping some computation, because these are just lists, which might break your code). So, at some point Imperative functional programming paper was released by Simon Payton Jones and Philip Wadler, which proposed a different solution.

Let’s say we have this SideEffect[A] type. We know, that compiler can turn it into actual side effect by some run function, that we cannot use directly ourselves. It compiler would run it directly on main’s result, then we would be limited to main in form of:

def main(args: Array[String]): SideEffect[A]

or rather

def main(args: Array[String]): SideEffect[Unit]

This is not composable at all, so we need another type which would allow us to compose things. Since it’s doing IO let’s call it IO[A]:

def main(args: Array[String]): IO[Unit]

Now, we need to figure out how this IO[A] would look like and how it could be interpreted using run.

sealed trait IO[A]
object IO {
  // ???
}

// inaccessible to us
def runIO[A]: IO[A] => A

// done by compiler
runIO(main(args))

The simplest case for a side effect is just using one SideEffect[A]. This side effect would be suspended by IO until it is interpreted by the compiler, so let’s call it just that:

sealed trait IO[A]
object IO {
  final case class Suspend[A](
    sideEffect: SideEffect[A]
  ) extends IO[A]
  
  def writeLine(line: String): IO[Unit] =
    Suspend(SideEffect.WriteLine(line))
  val readLine: IO[String] =
    Suspend(SideEffect.ReadLine)
  def curl(url: String,
           method: String,
           headers: List[String]): IO[Unit] =
    Suspend(SideEffect.Curl(line))
}

// not exposed to us
def runIO[A]: IO[A] => A = {
  case Suspend(sideEffect) => run(sideEffect)
}

// at this point SideEffect type don't have to
// be exposed as well - we only need to see IO!

def main(args: Array[List]): IO[Unit] =
  IO.writeLine(args.toString)

But we wanted IO to be composable. What’s the simplest thing, we can do with a value A in some F[_]? map it! So, let’s add the mapping ability to our structure. We are operating on values and we cannot just run the computation, so we will just store the previous IO[A] and f: A => B and deal with the actual mapping in runIO:

sealed trait IO[A] {
  def map[B](f: A => B): IO[B] =
    Mapped(this, f)
}
object IO {
  final case class Suspend[A](
    sideEffect: SideEffect[A]
  ) extends IO[A]
  final case class Mapped[A, B](
    ioa: IO[A],
    f: A => B
  ) extends IO[B]
  
  // ...
}

// still beyound mortal's realm
def runIO[A]: IO[A] => A = {
  case Suspend(sideEffect) => run(sideEffect)
  case Mapped(ioa, f)      => f(runIO(ioa))
}

def main(args: Array[String]): IO[Unit] =
  IO.readLine
    .map { input => s"I just read: $input" }
    .map { input =>
    // ignoring the input because we need Unit
    ()
  }

So, we can map, but if we wanted to use 2 side effects at once (e.g. readLine then writeLine), map is problematic.

IO.readLine
  .map(IO.writeLine): IO[IO[Unit]]

We don’t want to end up with IO[IO[Unit]] because what we would do with that? map using IO[Unit] => Unit by ourselves? Since we cannot use runIO the only way to conform to that, would be by ignoring the IO[Unit]. Which means it would not be interpreted (only the outer one is!). So we need a way of flattening these 2 IOs into 1.

IO.readLine
  .map(IO.writeLine)
  .flatten: IO[Unit]

Even better, we could just do map and flatten in one step - flatMap. Since it sounds more powerful than map, let’s replace map with flatMap for now.

sealed trait IO[A] {
  def flatMap[B](f: A => IO[B]): IO[B] =
    FlatMapped(this, f)
}
object IO {
  final case class Suspend[A](
    sideEffect: SideEffect[A]
  ) extends IO[A]
  final case class FlatMapped[A, B](
    ioa: IO[A],
    f: A => IO[B]
  ) extends IO[B]
  
  // ...
}

// still beyond mortal's realm
def runIO[A]: IO[A] => A = {
  case Suspend(sideEffect) => run(sideEffect)
  case FlatMapped(ioa, f)  => runIO(f(runIO(ioa)))
}

def main(args: Array[String]): IO[Unit] =
  IO.readLine
    .flatMap { input =>
      IO.writeLine(s"I just read: $input")
    }

We can combine 2 side effects. But we are missing the last piece. If we wanted to use some pure value. Maybe we would want to start from it and then flatMap. Or maybe we would like to flatMap and then use pure value inside (so, basically recover map functionality). Anyway, we need to be able to lift pure value to IO. So, let’s add that final puzzle:

sealed trait IO[A] {
  def flatMap[B](f: A => IO[B]): IO[B] =
    FlatMapped(this, f)
  def map[B](f: A => B): IO[B] =
    flatMap(f andThen IO.pure)
}
object IO {
  final case class Suspend[A](
    sideEffect: SideEffect[A]
  ) extends IO[A]
  final case class FlatMapped[A, B](
    ioa: IO[A],
    f: A => IO[B]
  ) extends IO[B]
  final case class Pure[A](
    value: A
  ) extends IO[A]
  
  def pure[A]: IO[A] = Pure[A]
  
  // ...
}

// in gods' realm
def runIO[A]: IO[A] => A = {
  case Suspend(sideEffect) => run(sideEffect)
  case FlatMapped(ioa, f)  => runIO(f(runIO(ioa)))
  case Pure(a)             => a
}

def main(args: Array[String]): IO[Unit] =
  IO.readLine
    .map(input => s"I just read: $input")
    .flatMap(IO.writeLine)

What we did was basically recreation of IO monad. More specifically, a free monad with an F algebra hardcoded to (invisible to the user) SideEffect type, which’s interpreter into Id[A] is the compiler’s secret.

As you can see by simply chaining functions using map and flatMap you can basically recreate the imperative style of programming, where you write a sequence of side-effecting instructions, even though you are virtually working with pure values only (thus the name of the paper).

IO monad(s) in Scala(?)

But that’s how you would do things in a purely functional language, where every function returns a value and referential transparency is a given. When Scala was designed these were non-goals. (Well, considering that Scala is alive and continuously improved, more like wasn’t a goal so far). Usability on JVM was/is a goal. More expressive (and safe) type system than Java’s was/is a goal. Support for functional programming but not opinionated - you can use pure FP, you can use mutability, you can avoid side-effects, you can use them - might also be considered a part of the design. So, an IO monad embedding all possible side effects was never absolutely needed.

def pointlessComputation: Int = {
  val array = new Array[Int](10)
  populate(array)  // mutable
  quicksort(array) // mutable
  array(0)
}

That is why for synchronous code nothing of sort was introduced into the core language. (Well, if what you want is catching/handling errors then perhaps Try or Either could be considered IO monads for synchronous code).

`Future`

On the other hand, there are also asynchronous computations. If you want to trigger some expensive computation, you probably don’t want to wait for it - your current thread can still have some things to do. However, you might want to do something once the first computation ended. And this is where Future comes into play:

val userF: Future[User] = fetchUserFromDB()
userF.map { user =>
  // do something with user once available
}

Since by expensive computation on web servers’ backend (which is the usual usage of Scala) we usually mean database calls and querying other services, for quite a lot of the community Future became synonymous with IO operations, and therefore side effects (which is obviously not true as e.g. reading initial configs and doing operations on mutable data almost never happens inside a Future).

However, because almost everything on backend is about a database or querying other services, the whole codebases build on top of Future were built. While (almost) nobody considered IO monad needed, the Scala started to run on top of de-facto IO monad in the form of Future.

Shortcomings of `Future`

Since all Scala tutorials discuss how to use Future, we will skip that part and go straight into the showing issues with Future.

Let’s compare this 2 examples:

// common part
def test(): Future[Int] = Future {
  println("running")
  1
}

// example A
val f = test()

for {
  a <- f
  b <- f
} yield a + b

// example B
for {
  a <- test()
  b <- test()
} yield a + b

Referential transparency - the value returned by the function can always be replaced by the function call and vice versa. Here, we see that test is not referentially transparent - the former code will print running once, while the later will print it twice. This is quite important during refactoring because you cannot brainlessly turn uglier code to a prettier once because you can accidentally break it.

// example C
val fa = test()
val f2 = test()

for {
  a <- fa 
  b <- fb
} yield a + b

This code has yet another behavior, though it is less visible. It will be if we will modify test to be:

def test(): Future[Int] = Future {
  sleep(1000)
  println("running")
  1
}

We will see, that the example A and example C take 1 second to print both messages, while example B needs 2 - it is because:

example A calculates result once, then remembers it, and both usages inside a for make use of that computed value,
example B starts calculating the second step once the first step is finished (sequentially)
and example C, while it also calculates the result 2, does it in parallel, so one doesn’t wait for the other.

In other words, Future is eagerly calculated (might start as soon as you created it - the future begins now) and memoizes (remembers/caches) results, so all side effects are run once.

While it isn’t a show stopper, it makes your life a bit harder - you have to remember, that where you created the Future affects how it is run, and if you calculate the results one time or more. If you wanted to treat computation as a value and be able to reason and refactor your code with that assumption, Future becomes an annoying exception to the rule.

Actually, Future is not even a lawful monad as you can break monad laws with this:

val f: Int => Future[String] = _ => throw new Exception

// for Future unit=successful

Future.successful(1) flatMap f // Failed Future
// =/=
(f flatMap Future.successful)(1) // thrown Exception

While people with the imperative background may (initially) have no issues with this, once you start thinking about how to make your code easier to reason about, this becomes a pain.

`Task`s and `IO`

As with many other inventions in Scala, the first proposal was implemented in Scalaz. scalaz.concurrent.Future was reimagination of scala.concurrent.Future where errors are not handled and computation is lazy, to make things more lawful. It also doesn’t require ExecutionContext on each map and flatMap. It has a few other properties, but what we are interested in this moment are 2 things:

it is easier to reason about than standard library’s Future,
it doesn’t handle errors.

So, if we added some error handling we would have something more useful than Future without extra issues. This error handling was provided by scalaz.concurrent.Task. Task is kind of a wrapper/builder for Scalaz’s Future, which used Throwable \/ A (read: as Throwable or A) as internal result type.

// Scalaz 7.1.1

def test(): Task[Int] = Task {
  sleep(1000)
  println("test")
  1
}

// example A

val f = test()

(for {
  a <- f
  b <- f
} yield a + b).run

// example B

(for {
  a <- test()
  b <- test()
} yield a + b).run

// example C

(for {
  a <- f
  b <- f2
} yield a + b).run

This code in all 3 examples (A, B and C) behaves exactly the same. And if test might fail:

def test(): Task[Int] = Task {
  throw new Exception
}

test().attemptRun(println)

we will be able to get the result with attemptRun: Throwable \/ A (synchronously) or runAsync(f: Throwable \/ A => Unit): Unit, or recover (.handle, .handleWith).

While I would like to point to some documentation, Scalaz notoriously considered Haskell resources, category theory books and white papers, as all documentation you need.

Nowadays, you can use Sam Halliday’s Functional Programming for Mortals as the only de facto Scalaz documentation. I wouldn’t expect it to have any other documentation in predictable future.

This design inspired Cats version: Monix’s Task. Just like Scalaz original, it is lazy, doesn’t memoize (by default, you can memoize if you need to). However, it doesn’t provide support for synchronous execution, and integrates better with the existing Scala ecosystem - it returns Scala’s Future on .runAsync and use ExecutionContext on steroids (Scheduler) to run them. If you have a lazily evaluated (by-name param) Future you could lift it to Monix’s Task as well. And if you want to memoize some result (that is, use the result of the same computation twice calculating it once), you can do it explicitly with .memoize.

Since, for various reasons, Cats ecosystem got a faster and bigger adoption (at least in media), soon Monix’s Task became a de facto IO monad for a majority of FP heavier codebases.

BTW. At this point I haven’t mention something that is IO monad for synchronous computations - Coeval. Just like Task(s) it is lazy. And just like Task is uses trampoline underneath to avoid stack overflows and handles Throwables. So we have:

eager lazy

sync A / Id[A] Coeval[A]

async Future[A] Task[A]

(You can also find it in Monix documentation).

	eager	lazy
sync	`A` / `Id[A]`	`Coeval[A]`
async	`Future[A]`	`Task[A]`

While Monix has a lot of functionality (read: tons of utilities and optimizations) some people wanted Cats to have something simpler, more primitive and wanted it to be a Cats library - Cats Effects. Of course, a lot of concepts could be simply moved from Monix to Cats Effect - and as far as I can tell that’s what happened. Cats Effect introduced some new type classes dedicated for handling concurrency better, provided Cats IO as the default implementation and Monix Task became another implementation. (You can read about the differences by the maintainer of both here).

Some of the type classes are:

type class	description	provides e.g.
`Bracket[F, E]`	`Monad[F]` which allows `E` as recoverable error (`MonadError[F, E]`) and allows you to acquire and release resources safely (think: as in loan pattern, RAII, try-with-resources)	`def bracket[A, B](acquire: F[A])(use: A => F[B])`
`Sync[F[_]]`	`Bracket[F, Throwable]` which adds the ability to make some `F[A]` or `A` calculation lazy (suspended) - if you think about it, this condition kind of forbids `Future` to be `Sync`	`suspend[F, A](thunk: => F[A]): F[A]` `delay[F, A](thunk: => A): F[A]`
`Async[F[_]]`	`Sync[F]` which can support both synchronous and asynchronous computation - among many operations you’ll have `memoize` for memoization or `shift` for changing the `ExecutionContext` on which the final computation will be run	`def memoize[F[_], A](f: F[A])(implicit F: Async[F]): F[F[A]]` `def shift[F[_]](ec: ExecutionContext)(implicit F: Async[F]): F[Unit]`
`Concurrent[F]`	`Async[F]` which provides some operations targeting concurrency specifically: make computation `cancellable`, run two computations and cancel the one that is slower (`race`, `racePair`), `timeout`	I’ll skip repeating description here

You might think that Concurrent is the type class which will allow you to e.g. run 3 computations in parallel and combine the result into a tuple. Not really, and I agree that it is misleading. Applicative functor already gives you ability to combine things using .map2 (for 2 things) and/or .mapN (for any number of things):
import cats.implicits._
import monix.eval._
import monix.execution._
import Scheduler.Implicits.global

val task: Task[Int] = Task { scala.util.Random.nextInt }
task.map2(task) { (a, b) => println(s"$a,$b") }.runAsync
// 945262287,-47359325
(task, task).mapN { (a, b) => println(s"$a,$b") }.runAsync
// 1171714378,-1109819564
Thing is, if your applicative functor is also a monad, then its method’s will be sequential as apparently for majority of people, that would be more intuitive. To make things truly parallel, you can find Parallel type class which will let you use .parMapN - this operation will work just like .mapN, except the computations of all values is guaranteed to run in parallel:
(task, task).parMapN { (a, b) => println(s"$a,$b") }.runAsync

With Cats IO you can use Haskell approach of defining your whole program as IO computation which would be interpreted in main. It provides IOApp trait just for this purpose:

import cats.effect._
import cats.implicits._

object MyApp extends IOApp {
  // run is abstract method
  // that we need to implement
  // example borrowed from the documentation
  def run(args: List[String]): IO[ExitCode] =
     args.headOption match {
       case Some(name) =>
         IO(println(s"Hello, \${name}.")).as(ExitCode.Success)
       case None =>
         IO(System.err.println("Usage: MyApp name")).as(ExitCode(2))
    }
}

But you don’t have to do it if you don’t feel comfortable with the concept yet - you can just use Task and interpret it into Future and later on migrate to this more functional approach as you will understand it better and as you will notice more and more issues with using less principled approaches.

Monad transformers

What we have so far - IO monad (well, several IO monads, pick your favorite!) - already allows is to build a solid functional program. If you just recently adventure with FP and just want to write a goddamn program with no nasty surprises then you already have the tools to cover a majority of your cases. (And digging into documentation and Gitter should help you figure out the rest).

But there are some issues, with current IO monads. And I am talking design, not the implementation. First example: your own error algebra:

final case class User(id: java.util.UUID)

sealed trait UserError
object UserError {
  final case class UserExist(user: User)
      extends UserError
  final case class UserNotFound(id: UUID)
      extends UserError
}

You would like to be able to use UserError to handle errors of User domain. Maybe you would like to use some MonadError[F, UserError]. However, Task and IO (and even Future) have error algebra fixed to Throwable. Error algebra is not a type parameter you can pick. So, if you wanted to use something like Either[E, A] inside your F, you would have to do it manually (doable only if you there aren’t many F[Either[E, A]] operations to combine) or monad transformers.

import cats.implicits._
import cats.data.EitherT
import monix.eval._, monix.execution._
import Scheduler.Implicits.global

def service: Task[Either[UserError, Int]] =
  Task.delay { 1.right[UserError] }

// manually
service
    .map { _.flatMap { i =>
      (i*2).right[UserError]
    } }

// with monad transformer just for composition
(for {
  a <- EitherT(service)
} yield a*2).value

// with monad transformers used as the result type
def service2: EitherT[Task, UserError, Int] =
  EitherT(Task.delay { 1.right[UserError] })

for {
  a <- service2
} yield a * 2

Another example - we want to inject something into the service, some sort of context. A config file, a started transaction, an opened connection, etc. Basically, you need a function A => F[B], which you could map, flatMap or combine some other way. In other words, ReaderT monad or Kleisli.

import cats.implicits._
import cats.data.EitherT
import monix.eval._, monix.execution._
import Scheduler.Implicits.global

final case class Config(version: String)

def service = Kleisli[Task, Config, Unit] { config =>
  Task { println(s"Current version is ${config.version}") }
}

for {
  _ <- service
} yield "done"
// Kleisli[Task, Config, String]

Things become even more fun when you need both things at once - inject something and have your own error algebra:

// from this moment if you test things in Ammonite
// you should import the kind projector

// import $plugin.$ivy.`org.typelevel::kind-projector:0.10.0`

type Result[C, E, A] = Kleisli[EitherT[Task, E, ?], C, A]
def service: Result[Config, UserError, String] =
  Kleisli { config =>
    EitherT(Task { config.verion.asRight[UserError] })
  }

While things are manageable (especially, if you throw at things a lot of extension methods wrapping and unwrapping things), such codebase might feel bolerplate-y and exhausting to work within a long run. It is still much better than codebase which doesn’t use any rules and principles to tame side-effects and concurrency, but it is exhausting nonetheless.

I feel that I have to stress this - plain IO monads from the previous section already have you covered if you simply want to control your side effects in a reasonable way! Issues described in this (and following) sections appear only after you get deeper into FP and feel the urge to compose more functionalities together - side-effect control (IO), injecting some context/dependencies (Kleisli/Reader), having custom error handling (EitherT), etc. It is hardly relevant if you are starting, but much more important if you want to understand where trends and recommendations for bigger code bases come from.

Typed Tagless Final Interpreter

Monad transformer’s boilerplate is one of the reasons people were eager to try out anything, that could make things easier - get rid of several layers of wrapping to create value and replace it with just a single call lifting your computation right into target F. Typed Tagless Final Interpreter seemed like a good candidate. (The other candidate was a free monad, but we’ll get to that later).

The way TTFI is used basically relies on defining type classes as algebras of allowed operations (no surprise), implementing them for your IO types, providing extension methods taking generic F, so that they could be used for any implementation (still just following a type class definition), and instead of hardcoding F to cats.effect.IO, monix.Task, scalaz.Task or scalaz.zio using F everywhere in your code.

def service1: Task[Int] = Task {
  println("hello")
  1
}
// hardcoded IO

def service2: Task[Int] = Task {
  println("hello!")
  10
}.flatTap { i =>
  Task { println(i) }
}
// while it used flatTap, which is an extension method
// relying on type class, it is still hardcoded

def service3[F[_]: Sync]: F[Int] = Sync[F].delay {
  println("hello!")
  10
}.flatTap { i =>
  Sync[F].delay { println(i) }
}

service3[Coeval]: Coeval[Int]
service3[Task]: Task[Int]

This approach has several advantages:

you can treat a type class as a type constraint - [F[_]: Sync] can be read as I want F which allows synchronous computations conforming to referential transparency rules,
you can apply the principle of least power - since it forces you to explicitly define your requirements regarding F you can rethink them and limit usage of certain features. E.g. (using Cats and Cats Effect nomenclature) Monad only would indicate, that you only want to compose existing computations, while Sync almost always indicates that you need it to perform side-effects yourself. Concurrent or Effect shows, that in this particular piece of code you need the tooling for optimizing concurrent operations, which sounds dangerous,
you can swap implementation any time - TTFI pretty much requires you to convert your whole codebase to it. Once you do it, the choice of underlying implementation is equal to passing implicit at the beginning of a program, which means you could even make it configurable if you wanted (e.g. to measure and compare performances of different implementations),
you can use one implementation for production and one for testing - since all implementations maintained by library authors are required to be lawful (fulfill contracts), if your code e.g. used at most Sync you could use Task for computing things asynchronously with scalability in mind, while testing using Coeval (because they only require .value to get the value instead of Awaiting Future).

You can meet two different conventions when it comes to using TTFI. The first one relies on using type class notation and declaring type classes as type constraints. Then if you need to get the instance of a type class, you either use implicitly directly or through a summoning method:

trait Default[F[_]] { def get[A]: F[A] }

object Default {
  // summoning method
  def apply[F[_]: Default]: Default[F] =
    implicitly[Default[F]]
  
  implicit val listDefault: Default[List] =
    new Default[List] { def get[A]: List[A] = Nil }
  implicit val optionDefault: Default[Option] =
    new Default[Option] { def get[A]: Option[A] = None }
}

def getDefaults[F[_]: Default, G[_]: Default]
    : (F[String], G[String]) = (
  // summoning Default[List] via summoning method
  Default[F].get[String],
  // summoning via implicitly
  implicitlt[Default[G]].get[String]
)

getDefaults[List, Option] // (Nil, None)

(summoning is just a fancy word for getting that implicit instance, and summoning method usually is an apply method in TC companion object used to summon TC instances like in the example).

The other way (more popular in cases you would need just one type class) is by adding implicit parameter list yourself, where type class would be named F (or whatever you named your type parameter):

trait Default[F[_]] { def get[A]: F[A] }

object Default {
  
  implicit val listDefault: Default[List] =
    new Default[List] { def get[A]: List[A] = Nil }
  implicit val optionDefault: Default[Option] =
    new Default[Option] { def get[A]: Option[A] = None }
}

def getDefaults[F[_], G[_]](
  implicit F: Default[F],
  G: Default[G]
): (F[String], G[String]) = (
  F.get[String],
  G.get[String]
)

getDefaults[List, Option] // (Nil, None)

As you can see, one spares time writing the function’s/method’s declaration and one save time writing the function’s/method’s body. It makes no difference in using extension methods. As far as I can tell, the difference is purely in style, so you can pick up whichever you prefer as long as you use one style consistently.

Since tagless final doesn’t rely on some particular implementation of IO monad which you actually use, it is style highly promoted in Cats ecosystem and its libraries: a library is not forcing you to use a particular implementation, meaning you can always use it without any changes to your codebase.

That said, TTFI has some disadvantages:

codebase pollution - if you wanted to rewrite your whole library to tagless final, theoretically, you could start with functions that do not depend on other functions returning hardcoded IO, convert them to TTFI, then continue to converting unblocked functions until the whole codebase is converted (which in such approach could be done incrementally), hardly ever you can see it in the wild. Usually, you either start with TTFI in a greenfield project or have some poor soul convert the whole codebase at once. Or just give up. The bigger the existing codebase, the less likely you are to do it,
people hardly ever introduce their own specialized type classes for particular side effects like e.g. logging, querying a database or fetching data, so usage of Sync can only tell you side effects are performed. What side effects? You need to look inside a function,
in Scala type classes are implemented using traits, inheritance, and implicits. But implicits resolves only if there is no ambiguity. Let’s say you wanted to use Sync but you found Throwable to be atrocious (Sync[F] implements MonadError[F, Throwable]) and you prefer you own Error algebra. So you create a lawful MonadError[F, Error]. And then this happens:
```
def operation[F: Sync: MonadError[?, Error]]: F[Int] = {
  Sync[F].delay {
    // count sth with side effect
    10
  }.flatMap { count: Int =>
    if (count < 0)
      MonadError[F, Error].raiseError(Error("failed"))
    else
      Sync[F].pure(count)
  }
}
```
Sync[F] works correctly, is unambiguous. Same with MonadError[F, Error]. But .flatMap extension method is broken - both implicits inherit Monad[F] and the compiler doesn’t know which to choose. You can mitigate this particular error e.g. using cats.mtl.FunctorRaise, which doesn’t inherit anything, or focus Throwable into Error using classy optics, but there is no generic solution to such cases. You have to use type classes in a way that don’t cause the collision/ambiguity: use one type class at once or use existing/create your own wrappers together with extension methods.

If you think about it, the ability to swap the IO implementation to something better with just one implicit is tempting, but hardly ever you would actually use it. The fact that IO implementations like monix.Task or cats.effect.IO force you to use asynchronous code isn’t that much of an issue. Actually, the purely functional programmers which used TTFI in production code, that I talked to, didn’t care about these either. All of them stressed, that the reason they stuck to TTFI was the principle of the least power and ability to consciously limit the complexity of their code. Whether or that is a good enough reason for converting production code, is up to you. But it is the best way of implementing a library, that should be able to work with different implementations of certain concept (including, but not limited to IO monad).

MTL

If you decide to go the TTFI way, you can quickly find out that you don’t always need just IO effects. If in your no-TTFI code you would combine e.g. Kleisli with IO or Either with State, you would have to go with monad transformers. And monad transformers, while clumsy, would allow you to access each of the layers and use its functionalities. How about TTFI?

final case class Config(name: String, version: String)

// monad transformer way
type F[X] = Kleisli[Task, Config, X]
val hello: F[A] =
  Kleisli[Task, Config, String] { c =>
    Task.pure(s"I am ${c.name}, ver.${c.version}")
  }

def hello[F[_]: Applicative]: F[String] = {
  // how to access config here?
  val result: String = ???
  result.pure[F]
}

Kleisli[T, A, B] is also known as ReaderT[T, A, B] monad. (Which, when T is set to Id, becomes Reader[A, B] monad or an abstraction over A => B function). If we wanted to invent a type class which would yield us A, it could look like this:

trait ApplicativeReader[F[_], A] { def read: F[A] }

object ApplicativeReader {
  
  def apply[F[_], A](
    implicit F: MonadReader[F, A]
  ): MonadReader[F, A] = F
  
  implicit def kleisli[T[_]: Applicative, A]
      : ApplicativeReader[Kleisli[T, A, ?], A] =
    new ApplicativeReader[Kleisli[T, A, ?], A] {
      def read: Kleisli[T, A, A] = Kleisli { a =>
        a.pure[T]
      }
    }
}

allowing you to do something like:

def hello[F[_]: Applicative
              : ApplicativeReader[?, Config]]: F[String] =
  ApplicativeReader[F, Config].read.map { config =>
    s"I am ${c.name}, ver.${c.version}"
  }

What we invented here, is a special type class which would allow us to access another layer’s functionality (here: Kleisli). This approach was invented in Haskell’s Monad Transformer Library, which started out as package providing monad transformers (thus the name), but later on, introduced the ability to access different layers functionalities using different type classes. This is why it’s called MTL-style.

Cats MTL provides some type classes, which would give us the functionalities to implement both hello example and Either + State case. For hello we could use ApplicativeAsk (which is similar to our ApplicativeReader):

def hello[F[_]: Applicative
              : ApplicativeAsk[?, Config]]: F[String] =
  ApplicativeAsk[F, Config].ask.map { config =>
    s"I am ${c.name}, ver.${c.version}"
  }
// or
def hello[F[_]: ApplicativeAsk[?, Config]]: F[String] =
  ApplicativeAsk[F, Config].reader { config =>
    s"I am ${c.name}, ver.${c.version}"
  }

As for Either + State we could rewrite this monad transformer:

type F[A] = StateT[EitherT[String, ?], List[Int], A]

// each parsed Int is added to List
// on first failure we end up with error message
def parseInt(str: String): F[Int] =
  StateT.liftF[Either[String, ?], List[Int], Int](
    Try(str.toInt).toEither.left.map(_.getMessage)
  ).flatMap { int =>
    StateT.modifyF[Either[String, ?], List[Int]] {
      ints => ints ++ int
    }
  }

into:

def parseInt[
  F[_]: MonadError[?, String]
      : MonadState[?, List[Int]]
](str: String): F[Int] =
  Try(str.toInt).toEither.leftMap(_.getMessage) match {
    case Right(i) =>
      MonadState[F, List[Int]].modify(_ ++ i)
    case Left(e) =>
      e.raiseError[F, Int]
  }

This example highlights the consequences of using TTFI and MTL-style in particular: while code becomes less coupled to a particular implementation and you no longer see the hierarchy of types, you are ending up with more byzantine type classes that you have to understand in order to be able to code quickly. This might be a great tool for more experienced teams, but a huge problem for newcomers and people not so familiar with type class hierarchy. While monad transformers are also troublesome, at least they allow you to explore possibilities through IDE’s intellisense. Here, you have to know beforehand which type class is needed for which extension methods.

Other attempts

While plain IO monads got a quite good adoption, monads transformers and TTFI a bit worse and MTL-style much worse, there were some proposals which were promising to solve a lot of issues with other approaches, but so far failed to gain any significance. It could be useful to take a look at them and understand why.

Free, Freer, Eff

` cats.effect.IO, monix.Task, scalaz.concurrent.Task, ... they all can be considered free monads with hardcoded side-effect algebra. Before tagless final got so much attention, free monads over your own algebras (so, just normal free monads...) interpreted into side effects was a recommended way of designing your application. Decoupling definitions and compositions of your operations from their executions assured referential transparency. The fact, that you defined your own algebras (and interpreters), meant that Free` could be basically used as a way of composing API/domain services calls. If you came from OOP background you could draw some parallel with how you implement your business logic using interfaces and then use DI to inject actual implementation - except here you could do it without threading things through tens of constructors nor using mocks for testing.

Since some concepts described here might get difficult and you most likely won’t be using them on production, treat it as an optional exercise and just jump right to their summary (Free vs Tagless Final/Why it failed?) if you felt like quitting.

Free

Let us remind how such Free looked like:

sealed abstract class Free[S[_], A] {
  
  def flatMap[B](f: A => Free[S, B]): Free[S, B] =
    FlatMapped(this, f)
  
  def map[B](f: A => B): Free[S, B] =
    flatMap(a => Free.unit(f(a)))
}

object Free {
  
  def unit[S[_], A](a: A): Free[S, A] = Pure(a)
  def lift[S[_], A](sa: S[A]): Free[S, A] = Suspend(sa)
  
  case class Pure[S[_], A](a: A) extends Free[S, A]
  case class Suspend[S[_], A](sa: S[A]) extends Free[S, A]
  case class FlatMapped[S[_], A, B](
      fsa: Free[S, A],
      f: A => Free[S, B]
  ) extends Free[S, B]
}

We could interpret it into G[_] using:

def foldMap[F[_], G[_]: Monad, A](
  free: Free[F, A]
)(
  fK: FunctionK[F, G]
): G[A] = free match {
  case Pure(a) =>
    a.pure[G]
  case Suspend(sa) =>
    fK(sa)
  case FlatMapped(fsa, f) =>
    foldMap(fsa)(fK).flatMap(a => foldMap(f(a))(fK))
}

// for semigroupal syntax, Option/String monoids, State monad...
import cats.implicits._

// notice how values stores in algebras becomes
// functions arguments and parameter type
// function's return type

sealed trait MsgQueue[T]
object MsgQueue {
  case class Enqueue(msg: String) extends MsgQueue[Unit]
  case class Dequeue() extends MsgQueue[Option[String]]
}

def enqueue(msg: String): Free[MsgQueue, Unit] =
  Free.lift(MsgQueue.Enqueue(msg): MsgQueue[Unit])
def dequeue: Free[MsgQueue, Option[String]] =
  Free.lift(MsgQueue.Dequeue(): MsgQueue[Option[String]])

val program: Free[MsgQueue, String] = for {
  _ <- enqueue("test 1")
  _ <- enqueue("test 2")
  a <- dequeue
  b <- dequeue
  c <- dequeue
} yield (a |+| b |+| c).getOrElse("")

// interprets into State[Queue[String], A]
// which would require us to pass initial Queue state
val interpreter: FunctionK[MsgQueue, State[Queue[String], ?]] =
  new FunctionK[MsgQueue, State[Queue[String], ?]] {
    def apply[A](fa: MsgQueue[A]): State[Queue[String], A] = {
      case Enqueue(msg) =>
        State.modify[Queue[String]](_.enqueue(msg))
      case Dequeue() =>
        State[Queue[String], Option[String]] { queue =>
          queue.dequeOption match {
            case Some(msg, newQueue) =>
              newQueue -> Some(msg)
            case None =>
              queue -> None
          }
        }
    }
  }

foldMap[MsgQueue, State[Queue[String], ?], String](
  program
)(
  interpreter
).run(Queue.empty[String])

We did a convenient implementation of foldMap, one which used Monad[G] to translate F[_] into G[_]. But in reality, in order to turn any F[_] into a monad and calculate the result, the only requirement is that F is a functor (has Functor[F]):

// alternative implementation of free

sealed abstract class Free[S[_], A]

object Free {
  
  def unit[S[_], A](a: A): Free[S, A] = Pure(a)
  def lift[S[_]: Functor, A](sa: S[A]): Free[S, A] =
    Impure(sa.map(unit))

  case class Pure[S[_], A](a: A) extends Free[S, A]
  case class Impure[S[_], A](sa: S[Free[S, A]]) extends Free[S, A]

  implicit def freeMonad[S[_]: Functor]: Monad[Free[S, ?]] =
    new Monad[Free[S, ?]] {
      def pure[A](a: A): Free[S, A] = Pure(a)

      def flatMap[A, B](fa: Free[S, A])
                       (f: A => Free[S, B]): Free[S, B] =
        fa match {
          case Pure(a)    => f(a)
          case Impure(sa) => Impure(sa.map(flatMap(_)(f))
        }

      // not important to understand this example
      def tailRecM[A, B](init: A)
                        (fn: A => Free[S, Either[A, B]]): Free[S, B] = ???
    }
}

def foldMap[F[_], G[_]: Monad, A](
  free: Free[F, A]
)(
  fK: FunctionK[F, G]
): G[A] =
  free match {
    case Pure(a)       => a.pure[G]
    case Impure(sa, f) => sa.flatMap(foldMap(f(_))(fK))
  }

If we are interested in the later, we might find out that the requirement of implementing Functor[S] is kind of limiting - we cannot simply create a generalized algebraic data type that would represent our domains services. (In case you wonder, previous implementation would also need Functor[S] for certain functionalities - consult Cats or Scalaz implementation). This GADT would also have something, that would handle any possible type because we have no guarantee what will we map into.

Freer

This limitation can be overcome using a simple trick when it comes to representing our computations:

sealed abstract class FFree[S[_], A]

object FFree {
  
  def unit[S[_], A](a: A): FFree[S, A] = Pure(a)
  // Functor[S] not needed here as well
  def lift[S[_], A](sa: S[A]): FFree[S, A] =
    Impure(sa, Pure(_))

  case class Pure[S[_], A](a: A) extends FFree[S, A]
  case class Impure[S[_], A, B](
    sa: S[A],
    f: A => FFree[S, B]
  ) extends FFree[S, B]

  // no longer needs Functor[S]!
  implicit def ffreeMonad[S[_]]: Monad[FFree[S, ?]] =
    new Monad[FFree[S, ?]] {
      def pure[A](a: A): FFree[S, A] = Pure(a)

      def flatMap[A, B](fa: FFree[S, A])
                       (f: A => FFree[S, B]): FFree[S, B] =
        fa match {
          case Pure(a)        => f(a)
          case Impure(sa, f0) => Impure(sa, a => flatMap(f0(a))(f) )
        }

      // not important to understand this example
      def tailRecM[A, B](init: A)
                        (fn: A => FFree[S, Either[A, B]]): FFree[S, B] =
        ???
    }
}

Because we are rewriting Impure and immediately compose functions used for flatMap, we no longer need any Functor[S] to transform intermediate results. (I used this implementation because it is more obvious that in the Pure-Suspend-Flatmapped version). This removes one constraint on S[_] which is why we fall it freer (free-er) monad. Name FFree comes from paper, that used this concept Freer Monads, More Extensible Effects.

The removal of a limitation on S[_] is not so useful, if you simply want to lift some S to FFree, compose it monadically, and foldMap into IO. However, it was an intermediate result required for something else: Free-like monad with an ability to compose several different S[_] together within one computation. Think Free[S1 | S2 | S3 | ..., A], where each SN[_] is a type constructor and | is a sum type but usable on type constructors.

Eff

In order to make such extensible freer (FEFree) possible we need to implement open union. I’d like to say we could use it out-of-the-box but sum types are coming with Dotty and in Scala 2 you need to implement them yourself using something like Either or shapeless.Coproduct.

// kind of EitherK but nested
sealed trait Union[A]
case class UNil[A]() extends Union[A]
sealed trait UCons[L[_], R[_], A] extends Union[A]
case class ULeft[L[_], R[_], A](l: L[A]) extends UCons[L, R, A]
case class URight[L[_], R[_], A](r: R[A]) extends UCons[L, R, A]

Since we are implementing it manually, we also need some way of injecting F[_] into Union[_] and checking is F[_] is a member of it:

// this is optics of sort
trait Member[F[_], U[_]] {
  def get[A](u: U[A]): Option[F[A]]
  def inj[A](f: F[A]): U[A]
}

object Member {
  def apply[F[_], U[_]](
    implicit member: Member[F, U]
  ): Member[F, U] = member
  
  // treat implementation below as pseudocode!!!
  
  implicit def unil[F[_]]: Member[F, UNil] =
    new Member[F, UNil] {
      def get[A](u: UNil[A]) = None
      def inj[A](f: F[A]) = ???
    }
  
  import shapeless._
  implicit def uright[F[_], U[X] <: Union[X], G[_]](
    implicit ev: F[Nothing] =:!= G[Nothing],
    mg: Member[F, U]
  ): Member[F, UCons[G, U, ?]] =
    new Member[F, UCons[G, U, ?]] {
      def get[A](u: UCons[G, U, A]) = u match {
        case ULeft(_)  => None
        case URight(r) => mg.get(r)
      }
      def inj[A](f: F[A]): UCons[G, U, A] =
        URight(mg.inj(f))
    }
  
  implicit def uleft[F[_], U[X] <: Union[X]]
     : Member[F, UCons[F, U, ?]] =
    new Member[F, UCons[F, U, ?]] {
      def get[A](u: UCons[F, U, A]) = u match {
        case ULeft(l)  => Some(l)
        case URight(r) => None
      }
      def inj[A](f: F[A]): UCons[F, U, A] = ULeft(f)
    }
  
  // I'll explain this below
  trait Ctx[F[_]] {
    
    implicit val unil: Member[F, UNil] =
      Member.unil[F]
    implicit def uright[U[X] <: Union[X], G[_]](
      implicit ev: F[Nothing] =:!= G[Nothing],
      mg: Member[F, U]
    ): Member[F, UCons[G, U, ?]] =
      Member.uright[F, U, G]
    implicit def uleft[F[_], U[X] <: Union[X]]
       : Member[F, UCons[F, U, ?]] =
      Member.uleft[F, U]
    
    def in[U[X] <: Union[X]](implicit member: Member[F, U]) = member
  }
}

Due to compilers limitations(?) if we try to use derivation we’ll get implicit not found errors if we’ll try to run this code:
Member[Option, UCons[List, UCons[Option, UNil, ?], ?]]
However, it runs pretty well in Scastie. I cannot explain this. Ctx helps us pin F[_], so that compiler would have a bit easier task.
object test {
  val option = new Member.Ctx[Option] {
    val m = in[UCons[List, UCons[Option, UNil, ?], ?]]
  }.m
  val list   = new Member.Ctx[List] {
    val m = in[UCons[List, UCons[Option, UNil, ?], ?]]
  }.m
  
  val injected = option.inj(Option(2))
  println(injected)
  val extracted = list.get(injected)
  println(extracted)
}
This worked for me in sbt build and in Scastie, but failed to compile in Ammonite and scalac REPL. It is just an example, so it is not a big issue, but for more mature project probably something like macro derivation would make sense.

We’ll also need a way of decomposing Union:

def decompose[F[_], U[X] <: Union[X], A]:
  UCons[F, U, A] => Either[F[A], U[A]] = {
    case ULeft(f)  => Left(f)
    case URight(u) => Right(u)
  }

Actually, the paper Freer Monads, More Extensible Effects uses less constrained version of decompose - one which takes some, U[_] union, its F[_], U0[_] union which is U with F label removed and interpret U[A] into Either[F[A], U0[A]]. However, our example is already complicated, so let’s just make a mental note to ourselves and implement a simpler version.

Now, when we have a way of lifting and extracting algebra from the open union we can implement extensible freer:

sealed trait FEFree[U[X] <: Union[X], A]

object FEFree {
  
  def unit[U[X] <: Union[X], A](a: A): FEFree[U, A] =
    Pure(a)
  def lift[U[X] <: Union[X], A](ua: U[A]): FEFree[U, A] =
    Impure(ua, Pure(_))

  case class Pure[U[X] <: Union[X], A](a: A) extends FEFree[U, A]
  case class Impure[U[X] <: Union[X], A, B](
    ua: U[A],
    f: A => FEFree[U, B]
  ) extends FEFree[U, B]

  implicit def fefreeMonad[U[X] <: Union[X]]: Monad[FEFree[U, ?]] =
    new Monad[FEFree[U, ?]] {
      def pure[A](a: A): FEFree[U, A] = Pure(a)

      def flatMap[A, B](fa: FEFree[U, A])
                       (f: A => FEFree[U, B]): FEFree[U, B] =
        fa match {
          case Pure(a)        => f(a)
          case Impure(sa, f0) => Impure(sa, a => flatMap(f0(a))(f) )
        }

      // not important to understand this example
      def tailRecM[A, B](init: A)
                        (fn: A => FEFree[S, Either[A, B]]): FEFree[S, B] =
        ???
    }
}

Notice, how much easier your life is thanks to not needing to U to be functor! If we still had our previous representation, we would have to take Functor[S] for each S that is a part of U and combine them into a bigger Functor[U]. The way we have things now, however, we only need to combine interpreters for each label within U (which we would still have to do if we used a normal Free representation).

Technically speaking, we could drop U[X] <: Union[X] requirement as long as we required Member instance to exists on interpretation. This way we would be able to provide our own implementation of U which could e.g. avoid destructuring the whole Union to get one value. However, it is easier to show how things work this way.

If all we wanted was the ability to combine different effects that would be enough. As authors of the Freer Monads, More Extensible Effects paper show, we already can combine e.g. Reader and Writer effect:

// think composition of I => A functions
sealed trait Reader[I, A]
// where we only need to know how to access I
// because pure values, flatMap, etc is handled by FEFree
object Reader {
  case class Get[I]() extends Reader[I, I]  
}

// think functional logger where you append
// to some I context (e.g. List)
sealed trait Writer[O, A]
// pure, flatMap, etc provided here be FEFree as well
object Writer {
  case class Put[O](o: O) extends Writer[O, Unit]
}
  
def ask[U[X] <: Union[X], I](
  implicit member: Member[Reader[I, ?], U]
): FEFree[U, I] =
  FEFree.lift(member.inj(Reader.Get()))

def log[U[X] <: Union[X], I](i: I)(
  implicit member: Member[Writer[I, ?], U]
): FEFree[U, I] =
  FEFree.lift(member.inj(Writer.Put(i)))

def getAndLogCtx[
  U[X] <: Union[X]
        : Member[Reader[String, ?], ?]
        : Member[Writer[String, ?], ?]
]: FEFree[U, String] = for {
  str <- ask[U, String]
  _   <- log[U, String](str)
} yield str

As we see types get a little bit wild, so we see why this can be a show-stopper when it comes to adoption. It could be easier if we hardcoded U to some globally available type, but then we would have less flexibility when it comes to adding and removing effects, especially if we shared some code across modules where different effects would be needed. And using FEFree the way it is intended to be used, raises the bar for people who would be entering the project.

Actually, there is another issue - performance. Free in general has an overhead for performance because you have to create an instance for your algebra, you need to wrap it into Free instance, then when you are interpreting you are repackaging objects before creating the final F[A]… FFree (freer) and FEFree (extensible freer) don’t solve that issue. If you think about the interpretation of things inside Union things are even worse! You have to pack and unpack a bunch of objects each time you try to interpret things, and then you also have to traverse to get to the right interpreter.

This is the reason FEFree was modified once again in the paper describing it, in order to make it faster. Eff monad is merely FEFree rewritten to do the same things but faster.

The optimization is done like this:

implement minimal version of fast type aligned constant-time queue (FTCQueue) described in Reflection without Remorse** paper:

// represents A => F[B], exactly like Kleisli
// but with slightly different composition
sealed trait FTCQueue[F[_], A, B] {
  def :+[C](f: A => F[C]): FTCQueue[F, A, C] =
    Node(this, FTCQueue(f))
  def ++[C](fbc: FTCQueue[F, B, C]): FTCQueue[F, A, C] =
    Node(this, fbc)
}
object FTCQueue {
    
  def apply[F[_], A, B](f: A => F[B]): FTCQueue[F, A, B] =
    Leaf(f)
    
  case class Leaf[F[_], A, B](f: A => F[B])
    extends FTCQueue[F, A, B]
  case class Node[F[_], A, B, C](
    fab: FTCQueue[F, A, B],
    fbc: FTCQueue[F, B, C]
  ) extends FTCQueue[F, A, C]
}

It is slightly different than Kleisli/ReaderT because would have to have flatMap[B, C](fa: FTCQueue[F, A, B])(f: B => FTCQueue[F, A, C]): FTCQueue[F, A, C], so that it would be a Monad[FTCQueue[F, A, ?]] instance. But we don’t want to do it! We want FTCQueue to be used internally to implement other structure’s Monadic functionality,

this queue can append or merge operations in $O(1)$ time. But we also need to be able to calculate the results (with no extra overhead on application):

// rewrite (view) of a FTCQueue which allows you to access
// first function in a queue in O(1) time (basically List
// with aligned types of output-input of functions inside)
sealed trait ViewL[F[_], A, B]
object ViewL {
  case class One[F[_], A, B](f: A => F[B])
    extends ViewL[F, A, B]
  case class Cons[F[_], A, B](
    f:   A => F[B],
    ftc: FTCQueue[F, B, C]
  ) extends ViewL[F, A, C]
}
  
// rewrites FTCQueue into ViewL in avg. O(1)
// (just the leftmost edge)
def tviewl[F[_], A, B]: FTCQueue[F, A, B] => ViewL[F, A, B] = {
  case FTCQueue.Leaf(f) =>
    ViewL.One(f)
  case FTCQueue.Node(fab, fbc) =>
    def go[X, Y, Z]: FTCQueue[F, X, Y] => FTCQueue[F, Y, Z] => {
      case FTCQueue.Leaf(f)      => n  => ViewL.Cons(f, n)
      case FTCQueue.Node(n1, n2) => n3 => go(n1)(Node(n2, n3))
    }
    go(fab)(fbc)
}

This view will let us pop one function from the queue with $O(1)$ on average.

having defined queue and a view, we can present our final form of extensible freer monad: the Eff monad:

sealed trait Eff[U[_], A]
object Eff {
    
  // function passed into Eff's flatMap
  type Arr[U[_], A, B] = A => Eff[U, B]
  // queue of functions to Eff's flatMap
  type Arrs[U[_], A, B] = FTCQueue[Eff[U, ?], A, B]
    
  case class Pure[U[_], A](a: A)
    extends Eff[U, A]
  case class Impure[U[_], A](
    ua:   U[A],
    arrs: Arrs[U, A, B]
  ) extends Eff[U, A]
    
  implicit def effMonad[U[_]]: Monad[Eff[U, ?]] =
    new Monad[Eff[U[_], ?]] {
      def pure[A](a: A): Eff[U, A] = Pure(a)
  
      def flatMap[A, B](fa: Eff[U, A])
                       (f: A => Eff[U, B]): Eff[U, B] =
        fa match {
          case Pure(a)         => f(a)
          case Impure(ua, ftc) => Impure(ua, ftc :+ f)
        }
  
      // not important to understand this example
      def tailRecM[A, B](init: A)
                        (fn: A => Eff[S, Either[A, B]]): Eff[S, B] =
        ???
    }
}

With such definition of Eff we should be able to achieve the same goals as with FEFree but with slightly less overhead on rebuilding the internal structure of Eff on each resume.

We could modify our FEFree program to this:

// only change is FEFree -> Eff

def ask[U[X] <: Union[X], I](
  implicit member: Member[Reader[I, ?], U]
): Eff[U, I] =
  Eff.lift(member.inj(Reader.Get()))

def log[U[X] <: Union[X], I](i: I)(
  implicit member: Member[Writer[I, ?], U]
): Eff[U, I] =
  Eff.lift(member.inj(Writer.Put(i)))

def getAndLogCtx[
  U[X] <: Union[X]
        : Member[Reader[String, ?], ?]
        : Member[Writer[String, ?], ?]
]: Eff[U, String] = for {
  str <- ask[U, String]
  _   <- log[U, String](str)
} yield str

We will need some utilities to create interpreter easily:

// we need forall B: (F[B], G[B]) => A
trait Function2K[F[_], G[_], A]  {
  def apply[B](fb: F[B], gb: G[B]): A
}

object Effs {
  import Eff._
  
  // runs leftmost flatMap into Eff[U, B]
  def runArrs[U[_], A, B](ftc: Arrs[U, A, B])
                         (a: A): Eff[U, B] = tviewl(ftc) match {
    case ViewL.One(f) =>
      f(a)
    case ViewL.Cons(f, ftc) =>
      f(a) match {
        case Pure(x) =>
          runArrs(ftc)(x)
        case Impure(ux, ftc2) =>
          Impure(ux, ftc ++ ftc2)
      }
  }
  
  // run leftmost flatMap eliminating F from UCons[F, U, ?]
  // by interpreting it and modifies the U's value type
  def interpretLeftmostF[F[_], U[_], A, B, C](
    arrs: Arrs[UCons[F, U, ?], A, B],
    f: Eff[UCons[F, U, ?], B] => Eff[U, C]
  ): Arr[U, A, C] =
    runArrs(arrs) andThen f
  
  // rewrites whole Eff to eliminate F from UCons[F, U, ?]
  // modifying U's value type
  def interpretF[F[_], U[_], A, B](
    arr: Arr[U, A, B],
    f:   Function2K[F, Arr[U, ?, B], Eff[U, B]]
  )(
    eff: Eff[UCons[F, U, ?], A]
  ): Eff[U, A] = eff match {
    case Pure(x)  => arr(x)
    case Impure(fux, ftc) =>
      val fInterpreted = interpretLeftmostF(ftc,
                                            interpretF(arr, f))
      // we get Left(F) or Right(U)
      decompose(fux) match {
        // F[X] case - interpret current F
        case Left(fx)  => f(fx, fInterpreted)
        // U[X] case - no F here, simply continue rewrite
        case Right(ux) => Impure(ux, FTCQueue(fInterpeted))
      }
  }
}

What we arrived at it Effs.interpretF (the original paper call it handle_relay). With that function we can rewrite Eff[UCons[F, U, ?], A1] into Eff[U, A2]. So not only we are removing F we might also change the value of result type! This is useful in case we e.g. wanted to interpret into a function (State monad) or append functionally to some log (Writer). For our getAndLogCtx program we need interpreters for Reader[String, ?] and Writer[String, ?]:

def runReader[U[_], I, A](i: I):
    Eff[UCons[Reader[I, ?], U, ?], A] => Eff[U, A] =
  EffInterp.interpretF(
    x: A => Eff.pure(x),
    new Function2K[Reader[I, ?],
                   Arr[U, ?, A],
                   Eff[U, ]] {
      def apply[X](reader: Reader[I, X],
                   arr: Arr[U, X, A]): Eff[U, A] =
        reader match {
          case Reader.Get() => Eff.pure(i)
        }
    }
  )

def runWriter[U[_], O, A]:
    Eff[UCons[Writer[O, ?], U, ?], A]) => Eff[U, (A, List[O])] = 
  EffInterp.interpretF(
    x: A => Eff.pure(x -> List.empty[O]),
    new Function2K[Writer[I, ?],
                   Arr[U, ?, (A, List[O])],
                   Eff[U, (A, List[O])]] {
      def apply[X](writer: Writer[O, X],
                   arr: Arr[U, X, (A, List[O])]): Eff[U, (A, List[O])] =
        writer match {
          case Writer.Put(o) => arr(()).flatMap {
            case (x, os) => Eff.pure(x, os :+ o)
          }
        }
    }
  )

Reader interpreter will put pure I value for each Get() occurrence. Notice, that we needed Function2K to provide interpretation to for all possible As… but because Reader is GADT we are actually providing implementation only for one case (Get[I](): Reader[I, I], where I is fixed), because no other can happen!

Similarly with Writer which can only be Put(i: I): Writer[I, Unit] (with fixed I). Writer interpreter, however, uses the possibility to change return value type from A to (A, List[I]) - this way after interpretation we will have our functional log returned to us.

We could combine interpreters and run them with something like this:

type Program[A] = UCons[Reader[String, ?],
                        UCons[Writer[String, ?],
                              UNil,
                              ?],
                        ?]
def runProgram[A](
  i: String
): Eff[Program, A] => Eff[UNil, (A, List[String])] =
  runReader[Program, String, A](i) andThen runWriter

runProgram("test")(getAndLogCtx[Program])

Do you remember, that our decompose handled only the special case of removing leftmost F? Only here it would make a difference: we would be able to use it in interpretF to not rely on order so that we could runWriter andThen runReader and achieve the same result.

Actually, we could also implement resume to interpret that final Eff[UNil, (A, List[String])] into (A, List[String]), but we’ll leave that as an exercise for the reader.

People, who actually would like to use Eff on production, could use existing eff implementation. Its authors put some effort to make it more approachable to Scala programmers: you have some build in utils for composing algebras together:

// from docs

import cats._, data._
import org.atnos.eff._

type ReaderString[A] = Reader[String, A]
type WriterString[A] = Writer[String, A]

// this combines algebras
type Stack = Fx.fx2[WriterString, ReaderString]

You don’t have to figure out how to implement Member or similar to inject/extract/remove F from the union:

import org.atnos.eff.all._
import org.atnos.eff.syntax.all._

type _readerString[R] = ReaderString |= R
type _writerString[R] = WriterString |= R

def getAndLogCtx[R :_readerString
                   :_writerString]: Eff[R, String] = for {
  // Reader is built-in Cats and eff cooperates with it
  // what is why this syntax is already provided
  str <- ask[R, String]

  // same with Writer
  _ <- tell(str)
} yield str

// run the action with all the interpreters
// each interpreter running one effect
program[Stack].runReader("test").runWriter.run

Our implementation quite closely followed its Haskell original from the paper. You could see it by how fast types got overcomplicated. Things that could be handled by simple declarations and type inference in Haskell here required more verbosity. I think Function2K was the best example - the original required ∀ v. t v → Arr r v w → Eff r w. But even after addressing that issue by aligning our approach with how Scala works there are still issues.

If you wonder if Dotty would do things better - it would improve things greatly as you can check in Sanshiro Yoshida’s presentation (you should be familiar with most concepts there from this post and Coyoneda can be found here), but it would not solve all the problems. It is because

this idea of free - functor + union + optimization is not simple. This code is not a small shift from the previous approach e.g. using plain Task and adding one more thing (Kleisli) on top of it. You cannot arrive here one step at a time. We had to put together a lot of powerful machinery in order to obtain the power of expression hardly anyone asked for, and which is extremely difficult to explain to a newcomer. This issue was described by Eric Torreborre in his article Freer doesn’t come for free - even though he liked the idea and promoted it internally at Zalando and externally, the people struggled (and still struggle) with the idea. If you want to be able to quickly train new people and show them that FP makes things simpler Eff might not help you.

Free vs Tagless Final

Let’s put Freer and Eff aside and get back to plain old Free. One we would run through foldMap instead of rewriting open union. If we compare it with tagless final, we can see that there is a lot of boilerplate involved:

sealed trait MsgQueue[T]
object MsgQueue {
  case class Enqueue(msg: String) extends MsgQueue[Unit]
  case class Dequeue() extends MsgQueue[Option[String]]
  
  def enqueue(msg: String): Free[MsgQueue, Unit] =
    Free.lift(Enqueue(msg): MsgQueue[Unit])
  def dequeue: Free[MsgQueue, Option[String]] =
    Free.lift(Dequeue(): MsgQueue[Option[String]])
  
  type Interpreter[F[_]] = FunctionK[MsgQueue, F]
}

Wait a minute, why do we even need to know the internals of interpreter type? We could hide them! Perhaps it would help us get rid of some of that boilerplate?

sealed trait MsgQueue[T]
object MsgQueue {
  private case class Enqueue(msg: String) extends MsgQueue[Unit]
  private case class Dequeue() extends MsgQueue[Option[String]]
  
  def enqueue(msg: String): Free[MsgQueue, Unit] =
    Free.lift(Enqueue(msg): MsgQueue[Unit])
  def dequeue: Free[MsgQueue, Option[String]] =
    Free.lift(Dequeue(): MsgQueue[Option[String]])
  
  type Interpreter[F[_]] extends FunctionK[MsgQueue, F] {
    def enqueue(msg: String): F[Unit]
    def dequeue(): F[Option[String]]
    
    def apply[A](fa: MsgQueue[A]): F[A] = fa match {
      case Enqueue(msg) => enqueue(msg)
      case Dequeue()    => dequeue()
    }
  }
}

Nice! Nobody outside needs to know how our algebra is defined internally. We only need to call MsgQueue.enqueue or MsgQueue.dequeue to get Free and interpret MsgQueue.Interpreter to be able to turn it into F[A]. Since knowledge about internals is irrelevant to us, we could pretty much generate it, e.g. using macro annotations. There are at least 3 libraries doing exactly that: Freestyle (its own free implementation), Freasy Monad (generates Cats/Scalaz implementation) and Freast (Freasy ported to Scalameta, Cats/Scalaz implementation). The later 2 you could use to describe MsgQueue as:

import cats._
import cats.free._
import freasymonad.cats.free // or freasymonad.scalaz.free

// MsgQueue is now the name of a type class of sort
@free trait MsgQueue {
  sealed trait GrammarADT[A] // our previous MsgQueue[A]
  type MsgQueueF[A] = Free[GrammarADT, A]
  
  // operations
  def enqueue(msg: String): MsgQueueF[Unit]
  def dequeue(): MsgQueueF[Option[String]]
}

// syntax generated by macro annotation
import MsgQueue.ops._

val program = for {
  _ <- enqueue("test 1")
  _ <- enqueue("test 2")
  a <- dequeue
  b <- dequeue
  c <- dequeue
} yield (a |+| b |+| c).getOrElse("")

// interpreter also generated by macro annotation
val interpreter = new MsgQueue.Interp[State[Queue[String], ?]] {
  def enqueue(msg: String): State[Queue[String], Unit] =
    State.modify[Queue[String]](_.enqueue(msg))
  def dequeue(msg: String): State[Queue[String], Option[T]]] =
    State[Queue[String], Option[String]] { queue =>
      queue.dequeOption match {
        case Some(msg, newQueue) =>
        newQueue -> Some(msg)
        case None =>
        queue -> None
      }
    }
}

interpreter.run(program).run(Queue.empty[String])

Doesn’t that remind us something? Yes, this type class with extension methods looks exactly like a tagless final! Which, by the way, also has a macro annotation library for generating boilerplate - simulacrum:

import simulacrum._

@typeclass trait MsgQueue[F[_]] {
  
  def enqueue(msg: String): F[Unit]
  def dequeue(): F[Option[String]]
}

def program[F[_]: Monad: MsgQueue] = for {
  _ <- MsgQueue[F].enqueue("test 1")
  _ <- MsgQueue[F].enqueue("test 2")
  a <- MsgQueue[F].dequeue
  b <- MsgQueue[F].dequeue
  c <- MsgQueue[F].dequeue
} yield (a |+| b |+| c).getOrElse("")

implicit val interpreter = new MsgQueue[State[Queue[String], ?]] {
  def enqueue(msg: String): State[Queue[String], Unit] =
    State.modify[Queue[String]](_.enqueue(msg))
  def dequeue(msg: String): State[Queue[String], Option[T]]] =
    State[Queue[String], Option[String]] { queue =>
      queue.dequeOption match {
        case Some(msg, newQueue) =>
        newQueue -> Some(msg)
        case None =>
        queue -> None
      }
    }
}

program[State[Queue[String], ?]].run(Queue.empty[String])

The differences in syntax are practically negligible! As a matter of the fact tagless final and free monad are perfectly interchangeable - you can always implement Free[S, ?] as your F[_] type in tagless final, and you can always derive Free interpreter from TypeClass[F]. So are there any practical differences?

Well, there are. With tagless final you are calling the interpreter the moment lift/flatMap is applied and tagless final doesn’t introduce intermediate layer, which would cause overhead. So, if we want to run computations as soon as you stop composing them, tagless final will almost always be better.

However, tagless final requires you to pass around this interpreter as an implicit instance. This is not always desirable, especially if your interpreter should be some internal secret, or could create a leaky abstraction. If you wanted to combine some computations in one place and pass them to the other place without worrying whether some interpreter exists, that you would have to pass around everywhere, then laziness of free will come handy.

You can see that perfectly in libraries like Doobie. Doobie uses free (hiding it behind aliases like ConnectionIO) to internally represent database operations that it can handle. But it doesn’t want you to have to worry about them, nor expose them by requiring you to pass some DBRunner[F] around. Only once you finish composing your SQLs, you use Transactor[F] to translate that free into your F. This way the library keeps API simple (no extra parameters everywhere) without forcing you to use some specific IO implementation. A similar approach is desirable for all kind of libraries, that you would not want to bound to a particular IO monad. If you share some common library, shared across several services, hardcoding you IO type might block you from a gradual migration to a better alternative when it becomes available.

Algebraic effects

What are the algebraic effects? So far we only talked about side effects - the by-product of our calculations which changes some state of the world. But there can also be indented effects: calculating the value of an expression, applying an argument to a function, returning value from a function, combining functions together… All these little things that are possible are called effects.

And their behavior has to be defined somewhere. If we want to add two As somewhere must live a definition of how they are added, and that addition, in general, is a thing. Maybe it is a part of the language and we can use them always for floats, doubles and integers, maybe we have to define it ourselves. If we wanted to treat everything as algebras, then we would make an addition a monoid for instance. Another algebra would be responsible for e.g. operating on strings, one more for logging, etc. Thing is - similarly to tagless final - we would split the definition of algebra from its execution. What algebraic effects propose is that:

you define some effect algebra $E$ , with some operations,
these operations can be normal functions $A \rightarrow B$ , but we add an extra information, that they use $E$ effect: $A \rightarrow E\ B$ ,
when you compose functions, their effects compose as well:
$f: A \rightarrow E_1\ B$ $g: B \rightarrow E_2\ C$ $(x \rightarrow g(f(x))) = g \circ f : (A \rightarrow E_1\ E_2\ C)$
The order of effect doesn’t matter, only their presence,
you have a $handler$ function, it will take the take each operation $op(A): E \ B$ defined for your effect $E$ and execute it into some $B'$ (because return type might change, it should also provide translation for pure values), and return a function $E\ E' B \rightarrow E'\ B'$ which will remove label $E$ from the list of effects and adjust the output,
of course function might not use any effect at all, then its effect list is empty.

It might be me, but this looks exactly like Eff monad we just described. Additionally, as we discussed in Free vs Tagless Final section, these representations can be translated into one another. So why mention this concept? Well, technically speaking algebraic effects is a wider concept which can just be implemented using Effs or TTFI. But it doesn’t have to. Some languages like Eff or Koka have these concepts build in. Also, there was an experimental library which tried yet another attempt to implement algebraic effects in Scala.

Effekt tries to implement algebraic effects using idioms from the papers discussing them, so it uses nomenclature like Handler instead of an interpreter, Result models internal trampoline and Control is IO/Eff. However, contrary to Eff it uses tagless-final-like approach to inject effects:

import effekt._

trait MsgQueue {
  
  def enqueue(msg: String): Control[Unit]
  def dequeue(): Control[Option[String]]
}

def program(msgQueue: MsgQueue): Control[String] = for {
  _ <- msgQueue.enqueue("test 1")
  _ <- msgQueue.enqueue("test 2")
  a <- msgQueue.dequeue
  b <- msgQueue.dequeue
  c <- msgQueue.dequeue
} yield (a |+| b |+| c).getOrElse("")

which are handled using continuation passing style directly (just as in papers that introduced algebraic effects):

def msgQueueHandler[R] =
  new Handler[Queue[String] => Control[R]] with MsgQueue {
    // use[A] takes (A => Control[Res]) => Control[Res]
    // and returns Control[A]
    
    // pure[A] takes A and returns Control[A]
    
    // we can use them to thread State with CPS
    
    def enqueue(msg: String): Control[Unit] =
      use { resume: =>
        // resume: A => Control[Res]
        // where
        //   A = Unit (enqueue is Control[Unit], so it has to be)
        //   Res = Queue[String] => Control[R] (handler signatue)
        //
        // qs: Queue[String]
        // qs2c: Res - function returning Control[R]
        //             so we only need to .apply(qs2)
        pure { qs =>
          val qs2 = qs.enqueue(msg)
          resume(()).flatMap(_(qs2))
        }
      }
    def dequeue(): Control[Option[String]] =
      use { resume =>
        // similar to previous one but:
        //   A = Option[String]
        pure { qs =>
          val (qs2, msgOpt) = queue.dequeOption match {
            case Some(msg, newQueue) =>
              newQueue -> Some(msg)
            case None =>
              queue -> None
          }
          resume(msgOpt).flatMap(_(qs2))
        }
      }
  }

def handleMsgQueue[R](
  init: Queue[String]
)(
  prog: MsgQueue => Control[R],
): Control[R] =
  msgQueueHandler[R].handle { msgQueue =>
    // program: MsgQueue => Control[R]
    // has to be adjusted to
    // MsgQueue => Control[Queue[String] => Control[R]]
    // to match handler
    program(msgQueue).map(r => qs => pure(r))
  }.flatMap(_(init))

handleMsgQueue(Queue.empty)(program)

Why it failed?

Direct control over continuations is a very powerful concept. Paper Algebraic Effects for Functional Programming shows, that virtually all effects that are interesting could be implemented this way, and library’s authors show in Effekt: Extensible Algebraic Effects in Scala how to achieve that with their library. There are examples where you could use handlers to e.g. change the order of some operations or introduce non-determinism.

Thing is, there are ways of achieving similar goals with much better language support and less mental burden. Compare our queue example for tagless final, free and effekt - while none of them is completely trivial, some are them puts more burden on the reader’s mind.

You can see why this style (direct implementation) didn’t catch on: it is simply a way to hard for your average programmer to become efficient in a reasonable time. Additionally certain handlers don’t commute, so by reversing the order of handlers you change the result (which breaks the promise of the carefree composition of effects). This is also true about Freer/Eff.

However, while Eff gained some adoption (not huge but still) in e.g. Zalando, Effekt is virtually dead.

Current trends - ZIO

One of the reasons Monix caught on was that people have issues with some of the design decisions with scalaz.concurrent. At some point even Scalaz community had an issue with it, so sooner or later someone would introduce a proposition of a replacement.

New Scalaz IO library - ZIO was introduced. Its milestones were used as a marketing opportunity by its author, so after each new improvement you could see a presentation about how awesome ZIO is. As far as I remember, initially it had a design inspired by Monix Task, then there was some marketing campaign when it caught up to/surpassed Monix with performance, one when ZIO became bifunctor (ZIO[E, A]), then when it became… trifunctor(?) (ZIO[R, E, A]) and recently because ZIO introduced software transactional memory (which triggered the same effort in Cats Effect ecosystem, e.g. Cats STM). I admit I haven’t been following actively its development as I find intrusive marketing repulsive, so I warmed up to this library only after I saw some actual code.

So, what is ZIO (currently, and limiting ourselves only the IO monad part)? Actually, the ZIO[R, E, A] is a way of composing functions R => Either[E, A] , where R is some context (runtime), E is your error type and A is your correct result type. Just like other IO monads it evaluates lazily (you have to provide R after all) and let you run code asynchronously. So what is the difference between this and Kleisli[EitherT[Task, E, ?], R, A]?

Well, surely the type signature is simpler. And you have only one abstraction, so lifting values and functions into ZIO is easier. And - and this makes a huge difference - ZIO is contravariant at R, which is not true about current Kleisli implementation in Cats (though this could get fixed soon). This lets us compose requirements with a cake pattern of a sort:

trait WithDatabase { val database: Database }
trait WithEventBus { val eventbus: EventBus }

val createUser: ZIO[WithDatabase, AppError, User]
def userCreated(user: User): ZIO[WithEventBus, AppError, Unit]

val program: ZIO[WithDatabase with WithEventBus, AppError, User] =
  for {
    user <- createUser
    _    <- userCreated(user)
  } yield user

val runtime = new WithDatabase with WithEventBus {
  val database: Database = ...
  val eventbus: EventBus = ...
}

// type IO[E, A] = ZIO[Any, E, A]
program.provide(runtime): IO[AppError, User]

Since ZIO has the ambition of becoming the ultimate IO monad for Scala it has a lot of utilities (queues, STM, support for asynchronous and synchronous execution), provides integration with other ecosystems (e.g. type classes for Cats and Cats Effect), while on its own doesn’t depend on other libraries. Additionally, since it flattened Task, EitherT and Kleisli there are less memory (re)allocations, so it is also faster than its manually composed alternative.

Actually, there are only 2 reasons I am hesitating to jump the hype train:

if you already wrote your codebase using Kleisli[EitherT[Task, E, ?], R< A], worked around its quirks and don’t need the features not yet implemented in Cats Effect and Monix, then probably you might not find it worth it to rewrite your whole codebase (arguments about performance are less convincing to me as the majority from the slowness in my code comes from querying outside world, and not overhead of running these queries and composing them),
the other reason is that so far every few months I heard that ZIO API changed, so I intend to wait a bit more to make sure it becomes stable.

If you have TTFI in your whole codebase you can easily swap the implementation to check if performance improved and swap back if things get worse, otherwise, I would wait a bit just to be sure, and let the early adopters test the library. However, once it became mature it might be one of the best choices possible.

So, which IO approach is the best?

Now, that we know what are our current options, we might actually analyze what is the best of us. I am sure, that the answer is obvious to every veteran Scala programmer, just as I am sure that the obvious answer is different for each programmer you ask. So, let’s take a look at the pros and cons:

Approach	Pros	Cons
`scala.concurrent.Future`	build into the standard library, primitive for async and IO for e.g. Akka ecosystem, can be used to communicate with other IO monads	not lawful, not referentially transparent, misses all the tools and utilities depending on RT, no support for other errors than `Throwable`
`cats.effect.IO`	lawful, referentially transparent, primitive for IO for Cats ecosystem	adds dependencies (Cats), no support for other errors than `Throwable`
`monix.eval.Task`	lawful, referentially transparent, supports Cats Effect type classes, tons of utilities absent in `cats.effect.IO`	adds dependencies (Cats and Cats Effect), no support for other errors than `Throwable`
`scalaz.zio.ZIO`	lawful, referentially transparent, supports Cats Effect type classes (only if you want because) doesn’t introduce dependencies on its own, tons of utilities absent in `cats.effect.IO`, supports your own error algebras and runtime	project is not yet providing guarantees about API stability
Monad Transformers (which contains one of IO monads from above)	enriches other IO monads with extra abilities (like custom error algebra or passing an argument)	increases complexity, increases the number of allocations, types get complex pretty fast mapping over just one layer is troublesome
Typed Tagless Final Interpreter (which passes one of IO monads from above)	lets you limit the operations to some subset using a type class, lets you defer the decision which IO monad you want to use	requires the user to understand how implicits and type classes work, requires user to know in which type class is required functionality
`Free`/Freer monad (interpreted to IO)	let you model your domain with algebras (though it is very similar to TTFI in the end), let you defer the interpretation into your target IO monad	you have to allocate the intermediate representation (`Free`) which always affects performance
`Eff` monad (interpreted to IO)	all advantages of Free, allows you storing and interpreting multiple algebras, potentially fewer allocations than normal Free	all disadvantages of Free, much harder to explain how it actually works, code might get complex at a time even if you understand what is going on
algebraic effects like `Effekt`	one of the most powerful abstractions, let you handle several different effects using the same principles	the amount of complexity overshadows virtually any benefits

However, if we intend to pick up the IO that would:

increase our confidence in code,
avoid introducing concepts that no average programmer can handle,
have reasonable performance

then this table can only rule out Future (unlawful), Eff and Effekt (too complex). For further analysis we need some benchmarks and use cases.

Performance

If we want to talk about performance, one thing is clear: monad transformers, TTFI and free monads at some point are interpreted to IO, so they have their performance characteristics + some overhead. So, it doesn’t make sense to put all of them in one bag. Instead, we can compare different plain IO monads together and then things interpreted to IO monads separately. Because:

benchmarking is hard,
I am lazy,
someone else already measured all these things

I will just present the result summary mentioning with a link to a presentation that described them.

Cats IO vs Monix Task vs ZIO

For cats.effect.IO vs monix.eval.Task vs scalaz.zio.ZIO it is easy to have relatively fresh benchmarks - ZIO maintains the benchmarks comparing all of these. Well, as long as we don’t mind that the complete benchmarks will run for several hours. (In my case more than 11). My full benchmark log is available to download, but here we will only present some part of if (mode: throughput, units ops/s)

Benchmark	(depth)	(size)	Cnt	Score		Error
`ArrayFillBenchmarks` `.catsArrayFill`	N/A	10000	25	5249.632	±	23.900
`ArrayFillBenchmarks` `.monixArrayFill`	N/A	10000	25	5290.395	±	33.737
`ArrayFillBenchmarks` `.scalazArrayFill`	N/A	10000	25	16040.763	±	251.910
`BubbleSortBenchmarks` `.catsBubbleSort`	N/A	1000	25	34.227	±	0.668
`BubbleSortBenchmarks` `.monixBubbleSort`	N/A	1000	25	34.875	±	0.207
`BubbleSortBenchmarks` `.scalazBubbleSort`	N/A	1000	25	41.44	±	5.153
`IODeepAttemptBenchmark` `.catsDeepAttempt`	1000	N/A	25	15151.476	±	70.303
`IODeepAttemptBenchmark` `.futureDeepAttempt`	1000	N/A	25	1810.702	±	37.795
`IODeepAttemptBenchmark` `.monixDeepAttempt`	1000	N/A	25	14857.142	±	62.516 ops/s
`IODeepAttemptBenchmark` `.scalazDeepAttempt`	1000	N/A	25	40451.95	±	236.906
`IODeepAttemptBenchmark` `.scalazDeepAttemptBaseline`	1000	N/A	25	13261.961	±	128.037
`IODeepFlatMapBenchmark` `.catsDeepFlatMap`	20	N/A	25	1628.582	±	30.889
`IODeepFlatMapBenchmark` `.futureDeepFlatMap`	20	N/A	25	9.761	±	0.248
`IODeepFlatMapBenchmark` `.monixDeepFlatMap`	20	N/A	25	1595.518	±	7.650
`IODeepFlatMapBenchmark` `.scalazDeepFlatMap`	20	N/A	25	1088.752	±	15.603
`IOLeftBindBenchmark` `.catsLeftBindBenchmark`	100	10000	25	3868.711	±	55.949
`IOLeftBindBenchmark` `.futureLeftBindBenchmark`	100	10000	25	21.344	±	0.275
`IOLeftBindBenchmark` `.monixLeftBindBenchmark`	100	10000	25	3784.674	±	38.673
`IOLeftBindBenchmark` `.scalazLeftBindBenchmark`	100	10000	25	4070.373	±	47.212
`IOMapBenchmark` `.catsMap`	500	N/A	25	57411.508	±	263.709
`IOMapBenchmark` `.futureMap`	500	N/A	25	2536.527	±	132.078
`IOMapBenchmark` `.monixMap`	500	N/A	25	56944.521	±	261.434
`IOMapBenchmark` `.scalazMap`	500	N/A	25	53415.624	±	470.859
`IONarrowFlatMapBenchmark` `.catsNarrowFlatMap`	N/A	10000	25	5074.909	±	32.706
`IONarrowFlatMapBenchmark` `.futureNarrowFlatMap`	N/A	10000	25	21.664	±	0.525
`IONarrowFlatMapBenchmark` `.monixNarrowFlatMap`	N/A	10000	25	5598.018	±	19.457
`IONarrowFlatMapBenchmark` `.scalazNarrowFlatMap`	N/A	10000	25	6852.629	±	22.147
`IOShallowAttemptBenchmark` `.catsShallowAttempt`	1000	N/A	25	1331.421	±	8.123
`IOShallowAttemptBenchmark` `.futureShallowAttempt`	1000	N/A	25	90.544	±	2.458
`IOShallowAttemptBenchmark` `.monixShallowAttempt`	1000	N/A	25	1354.343	±	6.144
`IOShallowAttemptBenchmark` `.scalazShallowAttempt`	1000	N/A	25	21818.063	±	1010.642
`IOShallowAttemptBenchmark` `.scalazShallowAttemptBaseline`	1000	N/A	25	1145.185	±	8.844
`QueueBackPressureBenchmark` `.fs2Queue`	N/A	N/A	30	607.866	±	14.685
`QueueBackPressureBenchmark` `.monixQueue`	N/A	N/A	30	3570.126	±	137.386
`QueueBackPressureBenchmark` `.zioQueue`	N/A	N/A	30	409.227	±	5.196
`QueueBackPressureBenchmark` `.zioTQueue`	N/A	N/A	30	210.597	±	5.453
`QueueParallelBenchmark` `.fs2Queue`	N/A	N/A	30	277.313	±	3.100
`QueueParallelBenchmark` `.monixQueue`	N/A	N/A	30	6538.669	±	217.168
`QueueParallelBenchmark` `.zioQueue`	N/A	N/A	30	2775.591	±	231.426
`QueueParallelBenchmark` `.zioTQueue`	N/A	N/A	30	123.046	±	6.266
`QueueSequentialBenchmark` `.fs2Queue`	N/A	N/A	30	407.886	±	2.673
`QueueSequentialBenchmark` `.monixQueue`	N/A	N/A	30	14882.24	±	79.581
`QueueSequentialBenchmark` `.zioQueue`	N/A	N/A	30	1491.382	±	11.233
`QueueSequentialBenchmark` `.zioTQueue`	N/A	N/A	30	341.394	±	1.136

We can see that Cats IO and Monix in many tests have comparable performance, though Monix is usually a bit better. In some cases the difference is huge - those are often the cases when Monix is better than ZIO, though quite a lot ZIO is better than Monix. In some tests, we see how plain old Future behaves and it virtually always slower than the lawful IO monads (because their laziness let them manage the thread pool better e.g. by posting many maps/``flatMap`s as one job, which avoids rescheduling).

We should remember, that this benchmark is just what I got on my laptop and it was written by ZIO authors to make sure they are faster than Monix (a testament of a friendly rivalry between the authors of both projects).

Personally, I don’t find the differences to be enough to prefer some particular implementation.

TTFI vs Free

The comparison of TTFI vs Free (vs Eff which we/I don’t care about; no monad transformers) was made by Cameron Joannidis in his presentation The Cost of Abstraction (benchmarks starts around 19:30).

As shown TTFI has the best performance - an order of magnitude better than either of Free. This can be attributed to the fact that type class usage can be often inlined by hot JVM as if you called the IO methods with no indirection. Meanwhile Free has to allocate itself and then needs to be interpreted which allocates again. Scalaz Free is about twice as fast as Cats (except for Scalaz 7 IO, where the difference is smaller, but it seems to be slower in general).

So in a long-running code, TTFI seems to not inflict that much of a performance penalty, which could be used as a counterargument for using it. Modeling domain using Free, however, might be a no-go zone for people looking for performance (but then they might avoid the whole FP in general).

But we should also notice that usage of IO monad made a much bigger difference than whether you use Free or TTFI. In this benchmark Cats IO and ZIO are (again) within the same order of magnitude when it comes to operations. Meanwhile, Scalaz 7 IO (the old one) and Future are 1-2 orders of magnitude slower! Again, I interpret it that either of these is fast enough, so you should make your choice between Cats IO/Monix/ZIO depending on your use cases.

TTFI vs Monad Transformer

We are missing one more benchmark - what is the cost of using monad transformers? This was measured by Jamie Pullar and presented in Cats MTL in action (starting at slide 32).

Here you can see, that usage of Id monad is 7 times slower than writing code using values directly in block (baseline), Either is slightly slower (8 times baseline) and 4 nested monads (ReaderT[StateT[WriterT[Either...]...]...]) is 85 times slower than baseline. As if each new layer made things about 1.5-2 times slower. We might expect, that if we use EitherT and Kleisli/ReaderT we are making our monad about ~3 times slower than simple Task…

Still, if we don’t stack together too many effects we should still have an acceptable performance (unless you work at HFT, but then why are you reading this blog?), better than Free.

Use cases

There are basically 3 use cases to consider. (At least when it comes to picking which IO monad you prefer because, in general, you have a lot more options):

a library code, which you would publish to Maven, Sonatype or other artifactory, no matter whether it is public or internal for your company,
a domain description,
glue code which combines domain services into the application logic.

In general, we want to stick to the principle of the least power. We also want to limit the assumption and requirements as much as possible. This is the most important in the library code - if you make a decision in a library, you are forcing it on all its users. Did you pick Monix? All your users have to use it. You prefer ZIO? Now your users have to prefer it. If your library is available to a wider audience it is a huge disadvantage, forgivable only if your goals would not be achievable otherwise. If the library is internal, it is a bit better, though you are still forcing every service using it to pick the same stack. Which can be a huge pain if you decide to move away from your current library and sometimes you have to.

That is why I see no reason why library which has to return some IO result should not use either Free or TTFI. Whichever you pick should be based on whether it would be more convenient to interpret immediately as you combine different parts (TTFI), which would require a user to make required implicits available, or if some performance penalty is allowed letting user compose operations first, and then interpret them somewhere else later. As said in Free vs Tagless Final Doobie is a good example of usage of both in one library.

On the other hand, in the application logic (glue code), where we would like to compose domain services, we should have full control over what is executed and when and in which order. Since a relatively little number of programmers know effect type classes well enough (and not all extra features are expressed as type classes) this is easier to achieve using your IO monad directly. But this decision is rather late, so you should be able to change it easier.

In my humble opinion, the most troublesome is the middle layer, where you define your services. On one hand, you want to avoid committing to some solution regarding e.g. how do you handle concurrency, events, etc. On the other hand, domain services should provide some guarantees and sometimes it might require usage of things like e.g. MVars or STM which are provided by the IO libraries, but which are too specific to become a part of some type class (unless you want to write it yourself, but then you can end up tightly coupled to the implementation anyway). I think the choice should depend on your knowledge about type classes and your domain, and if you don’t feel well versed enough to achieve your goals using TTFI you can use an IO monad directly. I guess this is not a decision you can correctly make without getting some experience with various approaches and knowing the abilities of your team.

The last question is: which IO to pick? Cats Effect, Monix or ZIO? I would say: as long as you don’t need things like STM or MVars or another library-specific feature, (which means that your choice is no longer only about IO monad), then… it doesn’t matter as much. And when you do need some of these features they kind of force the choice on you.

Summary

If you want to embrace FP and make your code more reliable just start by not using Future. For starters any IO monad (Cats/Monix/ZIO) should do. Then get more familiar with Tagless Final and Free to not bind (some parts of) your core code to a particular implementation. (Though if you feel the need to use your own error algebra and ReaderT, ZIO will get ahead in UX).

When it comes to the rest of the code and TTFI, then it mostly depends on your and your team experience. There are also solutions which give you more expressive power and let you combine effects freely - like Eff or Effekt - but they are too complex to consider them as reasonable solution for teams not made entirely of ~~helpless functional romantics~~ experts.

kubuszok.com

IO monad: which, why and how

Imperative programming in Haskell

IO monad(s) in Scala(?)

`Future`

Shortcomings of `Future`

`Task`s and `IO`

Monad transformers

Typed Tagless Final Interpreter

MTL

Other attempts

Free, Freer, Eff

Free

Freer

Eff

Free vs Tagless Final

Algebraic effects

Why it failed?

Current trends - ZIO

So, which IO approach is the best?

Performance

Cats IO vs Monix Task vs ZIO

TTFI vs Free

TTFI vs Monad Transformer

Use cases

Summary

Kinds of types in Scala, part 1: types, what are they?

Implicits, type classes, and extension methods, part 1: with type classes in mind

Different ways to understand a monad

IO monad: which, why and how

Imperative programming in Haskell

IO monad(s) in Scala(?)

Future

Shortcomings of Future

Tasks and IO

Monad transformers

Typed Tagless Final Interpreter

MTL

Other attempts

Free, Freer, Eff

Free

Freer

Eff

Free vs Tagless Final

Algebraic effects

Why it failed?

Current trends - ZIO

So, which IO approach is the best?

Performance

Cats IO vs Monix Task vs ZIO

TTFI vs Free

TTFI vs Monad Transformer

Use cases

Summary

Kinds of types in Scala, part 1: types, what are they?

Implicits, type classes, and extension methods, part 1: with type classes in mind

Different ways to understand a monad

`Future`

Shortcomings of `Future`

`Task`s and `IO`