Adventures with custom Predef

I first heard about custom Predef from Paweł Szulc. I don’t remember the exact circumstances, but I think it was soon after he started working at Slam Data on Quasar. Apparently, in all of their projects, they decided to use their own Predef instead of Scala’s built-in one. But what does that mean? Why would one consider it, and what would the consequences be?

scala.Predef

Let’s start by explaining what Predef is. When you open an editor and start typing code, you will find that some things are already available while others need to be imported. You can use String, Int, and other primitives. You can create a Map or a Set without importing scala.collection.immutable.{ Map, Set }. There are also utilities that allow the syntax a -> b instead of (a, b), let you call println without importing scala.Console.println, and tons of other stuff.

Most of that you can find under scala.Predef (primitives such as Int come from the scala package itself). Its declaration reads object Predef extends LowPriorityImplicits with DeprecatedPredef. LowPriorityImplicits are mostly implicit conversions, e.g., wrappers for String or Array that allow them to be treated like a Scala Seq. DeprecatedPredef contains utilities that were added to Predef but were at some point recognized as code smells and discouraged. (I encourage you to look around that file with your favorite IDE.)
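
To make this concrete, here is a small sketch of code that compiles without a single import. The object name PredefBasics and the sample values are made up for illustration:

```scala
object PredefBasics {
  // Map, Set and String are aliases defined in scala.Predef; Int comes from scala._
  val ages: Map[String, Int] = Map("Ann" -> 30, "Bob" -> 25)
  val tags: Set[String]      = Set("scala", "predef")

  // `->` is provided by Predef's ArrowAssoc wrapper and builds a Tuple2
  val pair: (String, Int) = "Ann" -> 30

  // println is forwarded by Predef to scala.Console.println
  def report(): Unit = println(s"${ages.size} people, ${tags.size} tags")
}
```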

So, what’s the issue?

At first, one might think that there are no issues. It takes time to get burnt by them.

One of the biggest fuckups is any2stringadd, which turns an expression like x + "test" into string concatenation for any x. It might look innocent, but on several occasions some of my colleagues wanted to append something to a collection, got the types wrong, and ended up with code that compiled but did something completely different from what they expected.
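
A minimal sketch of that trap, with made-up names. The conversion is called explicitly here so that the snippet behaves the same regardless of compiler version; on Scala 2.12 the plain typo xs + "4" is rewritten by the compiler into the same call:

```scala
object StringAddPitfall {
  val xs: List[Int] = List(1, 2, 3)

  // What the author meant: append an element to the list
  val intended: List[Int] = xs :+ 4

  // What the typo `xs + "4"` turns into: List has no `+` method, so the
  // implicit any2stringadd wrapper from Predef kicks in and the result
  // is a String, not a List
  val accidental: String = scala.Predef.any2stringadd(xs) + "4"
}
```

The second value is the string "List(1, 2, 3)4", and nothing warns you about it unless a later line happens to need a List.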

Next, scala.Predef is missing Seq. Still, you are able to use Seq. That is because the compiler by default also imports scala._, and (up to Scala 2.12) scala.Seq is an alias for scala.collection.Seq. What’s the issue? scala.collection.Seq is not guaranteed to be immutable. scala.collection.immutable.Seq is. Map, Set, and List are also taken from scala.collection.immutable, so the design is inconsistent, and one day you might accidentally receive a mutable Seq while being convinced that you are holding an immutable data structure (in theory, each time one wants to use a mutable data structure one must import scala.collection.mutable).
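
A short sketch of how a mutable collection can hide behind the general Seq type. The names are invented, and scala.collection.Seq is spelled out in full so the example does not depend on what the bare Seq alias points to in your Scala version:

```scala
object SeqPitfall {
  import scala.collection.mutable.ArrayBuffer

  val buffer: ArrayBuffer[Int] = ArrayBuffer(1, 2, 3)

  // Compiles without complaint: ArrayBuffer is a scala.collection.Seq
  val looksImmutable: scala.collection.Seq[Int] = buffer

  // An immutable copy, taken before the buffer changes
  val snapshot: scala.collection.immutable.Seq[Int] = buffer.toList

  // Someone else still holds the buffer and mutates it
  buffer += 4
}
```

After the mutation, looksImmutable has four elements while snapshot still has three.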

Additionally, some people are not happy that their namespace is polluted with tons of implicit conversions they don’t need.

On the other hand, programmers might use some features literally everywhere, and they have to add tons of copy-pasted imports. Having them in one place would simplify things a lot.

Custom Predef — how to?

First of all, we don’t want to have two Predefs at once. scalac has a special flag that prevents the automatic import of scala.Predef._. That flag is -Yno-predef.

For the brave and the bold, there is also -Yno-imports. This prevents the import of scala._ (which contains, e.g., primitives, Unit, and Seq) and java.lang._ (importing everything that is available in Java without explicit imports).
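
Assuming the project is built with sbt (the text above only names the scalac flags, so the build tool is my assumption), the flags could be switched on roughly like this:

```scala
// build.sbt (sketch)
scalacOptions ++= Seq(
  "-Yno-predef" // do not import scala.Predef._ automatically
  // , "-Yno-imports" // stricter: also drop scala._ and java.lang._
)
```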

Once we have reduced our scope to a tabula rasa, we can populate it with what we want. We will start by creating a Predef object (obviously), where we will put everything we are sure should be visible globally:

package my.domain

object Predef {
  // here we'll put definitions
}

A good starting point to study is Slam Data’s Predef definition — we might start with that, and then add or remove items as we find suitable.

But to be specific — what should we have there? From the original imports we might borrow some items:

  • if we have used -Yno-imports, we have removed all built-in types from the scope. So we should reimport them, e.g.
    type Int = scala.Int
    type String = java.lang.String
    
  • we surely will not have any immutable collections defined, so we need to add them back, e.g.
    // for type alias
    type Set[T] = scala.collection.immutable.Set[T]
    // for companion object
    val Set = scala.collection.immutable.Set
    
  • the $conforms implicit, as without it operations that ask for A <:< B evidence (e.g. toMap) break,

  • omnipresent types like Nothing, Product, Serializable, RuntimeException and Throwable are handy to have,

  • similarly tailrec, deprecated and SuppressWarnings annotations,

  • StringContext type must be known if we are using any String interpolation,

  • one should also consider whether they can live without the utilities that the original Predef always provides without an import, such as:

    • alternative syntax for tuple creation: a -> b,
    • array wrappers: Array(1, 2, 3).filter(predicate),
    • rich wrappers for primitives and Strings, e.g. 1 to 10 or "abc".reverse.
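
The items above could be sketched as follows. This is only an illustration under a few assumptions: the object is called MyPredef so that it can sit next to scala.Predef in a REPL (in a real project it would be my.domain.Predef), the selection of aliases is arbitrary, and the $conforms re-export is left as a comment because its signature differs between Scala versions:

```scala
object MyPredef {
  // Built-in types, needed again once -Yno-imports has removed them
  type Any              = scala.Any
  type AnyRef           = scala.AnyRef
  type Nothing          = scala.Nothing
  type Unit             = scala.Unit
  type Boolean          = scala.Boolean
  type Int              = scala.Int
  type Long             = scala.Long
  type Double           = scala.Double
  type String           = java.lang.String
  type Throwable        = java.lang.Throwable
  type RuntimeException = java.lang.RuntimeException

  // Immutable collections: a type alias plus a val for the companion object
  type Map[K, +V] = scala.collection.immutable.Map[K, V]
  val Map         = scala.collection.immutable.Map
  type Set[T]     = scala.collection.immutable.Set[T]
  val Set         = scala.collection.immutable.Set
  type List[+T]   = scala.collection.immutable.List[T]
  val List        = scala.collection.immutable.List

  // Annotations and string interpolation support
  type tailrec       = scala.annotation.tailrec
  type StringContext = scala.StringContext
  val StringContext  = scala.StringContext

  // On Scala 2.11/2.12 one would also re-export the evidence, e.g.:
  // implicit def $conforms[A]: A <:< A = scala.Predef.$conforms[A]
}

object MyPredefUsage {
  import MyPredef._

  // Plain tuples are used because `->` is not re-exported by this sketch
  val ports: Map[String, Int] = Map(("http", 80), ("https", 443))
  val schemes: Set[String]    = Set("http", "https")
}
```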

We might also put a few things there that could make our lives easier:

  • make Seq immutable by using scala.collection.immutable.Seq instead of scala.collection.Seq,
  • if one is using wartremover, then a custom Predef is the right place to put === and =/= to get rid of the type-unsafe == and !=.
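
Both additions might look roughly like this. The object and class names (PredefExtras, SafeEqOps) are hypothetical, and the operators simply delegate to == and != after forcing both sides to the same static type; libraries such as Scalaz or Cats ship their own type-class-based versions, which this sketch does not try to reproduce:

```scala
object PredefExtras {
  // Seq now always means the immutable one
  type Seq[+T] = scala.collection.immutable.Seq[T]
  val Seq      = scala.collection.immutable.Seq

  // Equality that only accepts an argument of the same type as the receiver
  implicit final class SafeEqOps[A](private val self: A) {
    def ===(other: A): Boolean = self == other
    def =/=(other: A): Boolean = self != other
  }
}

object PredefExtrasUsage {
  import PredefExtras._

  val xs: Seq[Int]       = Seq(1, 2, 3)
  val same: Boolean      = 1 === 1
  val different: Boolean = "a" =/= "b"
  // 1 === "one" // rejected by the compiler: a String is not an Int
}
```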

To make it all accessible, all we need to do is:

import my.domain.Predef._

Can I avoid importing it over and over again?

Well, I have my own Predef imported in nearly all my files, so I know the annoyance and I wish I could just tell the compiler that it should just use this import as the new default.

As a matter of fact, other people share the same feeling. For a year, there has been an ongoing pull request that would add new flags -Ysysdef and -Ypredef, which do exactly what we need. Meanwhile, we can use Typelevel Scala, as it has apparently already merged this PR and released a version with this feature.
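
For readers on Scala 2.13 or newer, the compiler also has a -Yimports flag that replaces the default root imports. A sketch, assuming an sbt build and the my.domain.Predef object from earlier; check the scalac documentation for your exact version before relying on it:

```scala
// build.sbt (sketch, Scala 2.13+ only)
// The default value is java.lang,scala,scala.Predef; here the last entry is swapped
scalacOptions += "-Yimports:java.lang,scala,my.domain.Predef"
```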

It sounds too good. Where are the catches?

The first catch is that you have to import your Predef everywhere.

The second catch is that tools like IntelliJ are rather unaware that some insane maniac could remove Predef’s content, so they won’t tell you when you forget to bring something back. You will learn about it from a compiler error.

Additionally, some code relies on particular Predef members being in scope (e.g. scala.Predef.$conforms), so if you forget about them you might be surprised when a snippet won’t compile even though it works perfectly in, e.g., Ammonite.

The last issue I have is not directly related to Predefs, but to some poor decisions in the standard collections and their consequences. Namely: almost no library accounts for scala.collection.immutable.Seq. I might be insane, it might have been one of my mistakes, it might have been old versions of the libraries, but Circe and Slick did not support it out of the box. I was able to serialize a normal Seq, but the immutable one failed to encode unless I provided an Encoder myself. With Slick I had to map all results with sequence.to[Seq] to achieve my goal. In theory, the immutable Seq inherits from the general one, but apparently the difference is enough to make them different beasts and break code in many places.
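
The mismatch can be reproduced without any library. In the sketch below, libraryResult stands in for a third-party call that returns the general Seq; all names are hypothetical. The conversion uses toList, which works across Scala versions, whereas to[Seq] is the 2.12 spelling:

```scala
object SeqInterop {
  import scala.collection.{immutable, mutable}

  // Stand-in for a library API that only promises the general Seq
  def libraryResult(): scala.collection.Seq[Int] = mutable.ArrayBuffer(1, 2, 3)

  // Our own code insists on the immutable Seq
  def needsImmutable(xs: immutable.Seq[Int]): Int = xs.sum

  // needsImmutable(libraryResult()) // type mismatch: general Seq is not an immutable.Seq

  // An explicit conversion is required at every such boundary
  val converted: immutable.Seq[Int] = libraryResult().toList
  val total: Int                    = needsImmutable(converted)
}
```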

So, is it worth it?

For a short-term project? Probably not. You won’t even notice any pain points.

But for a bigger project, one that will take years and rely on tons of decisions made with this exact domain in mind, having one place where all the needed tools are already present and the discouraged ones are absent sounds like a good idea. If you are considering something like a common package that all the other components would reuse, a custom Predef could be the central piece that helps organize and shape the practices used in your project.