Adventures with custom Predef

I first heard about custom Predef from Paweł Szulc. I don’t remember the exact circumstances, but I think it was soon after he started working at Slam Data on Quasar. Apparently, in all of their projects, they decided to use their own Predef instead of Scala’s built-in one. But what does that mean? Why would one consider it, and what would the consequences be?

`scala.Predef`

Let’s start by explaining what Predef is. When you open an editor and start typing code, you will find that some things are already available and others need to be imported. You can use String, Int, and other primitives. You can create a Map or a Set without importing scala.collection.immutable.{ Map, Set }. Predef also contains utilities that allow the syntax a -> b instead of (a, b), let you call println without importing scala.Console.println, and tons of other stuff.
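To make this concrete, here is a small sketch of code that compiles without a single import. The object and value names are made up for illustration; the comments note where each piece comes from.

```scala
object PredefInAction {
  // String is an alias that scala.Predef defines for java.lang.String.
  val name: String = "Predef"

  // `->` comes from an implicit class in scala.Predef and builds a Tuple2.
  val pair: (String, Int) = "answer" -> 42

  // Map and Set are Predef aliases for the immutable collections.
  val lookup: Map[String, Int] = Map(pair)
  val unique: Set[Int] = Set(1, 1, 2)

  // println is a Predef method that forwards to scala.Console.println.
  def show(): Unit = println(name + ": " + lookup)
}
```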

All of that you can find under scala.Predef. As one can see, it is defined as object Predef extends LowPriorityImplicits with DeprecatedPredef. LowPriorityImplicits are mostly implicit conversions, e.g. wrappers on String or Array that allow them to be treated like a Scala Seq. DeprecatedPredef contains utilities that were added to Predef but at some point were recognized as code smells and discouraged. (I encourage you to look around that file with your favorite IDE.)
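A minimal sketch of those wrappers at work (the names are illustrative). Neither String nor Array is a Seq, yet both can be assigned to one because an implicit conversion wraps them. The sketch spells out scala.collection.Seq so that it does not depend on what plain Seq happens to alias.

```scala
object ImplicitWrappersDemo {
  // A String is wrapped into a sequence of characters.
  val chars: scala.collection.Seq[Char] = "abc"

  // An Array[Int] is wrapped into a sequence of Ints.
  val raw: Array[Int] = Array(1, 2, 3)
  val numbers: scala.collection.Seq[Int] = raw

  // From here on, ordinary Seq operations are available.
  val reversed: String = chars.reverse.mkString
  val total: Int = numbers.sum
}
```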

So, what’s the issue?

In the beginning, one might think that there are no issues. It takes time to get burnt by them.

One of the biggest fuckups is any2stringadd, which turns an expression like x + "test" into string concatenation. It might look innocent, but on several occasions some of my colleagues wanted to append something to a collection, got the types wrong and ended up with something that compiles. But it compiles into something completely different than they expected.
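Here is a sketch of that mistake, assuming a Scala 2 compiler (newer releases warn about this conversion or reject it). The names are made up for illustration.

```scala
object Any2StringAddDemo {
  val names: List[String] = List("a", "b")

  // Intended: a new list with "c" appended.
  // List has no `+` method, so the compiler falls back to any2stringadd
  // and quietly produces a String instead of a List.
  val surprising = names + "c" // "List(a, b)c"

  // What was actually meant.
  val intended: List[String] = names :+ "c"
}
```

The bug only shows up later, when surprising is used somewhere that expects a List, or worse, somewhere that accepts Any.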

Next, scala.Predef is missing Seq. Still, you are able to access Seq. That is because the compiler by default also imports scala._, and scala.Seq is an alias for scala.collection.Seq. What’s the issue? scala.collection.Seq is not guaranteed to be immutable. scala.collection.immutable.Seq is. Map, Set and List are also taken from scala.collection.immutable, so the design is inconsistent, and one day you might accidentally use a mutable Seq while being convinced that you used an immutable data structure (in theory, each time one wants to use a mutable data structure one must import it from scala.collection.mutable).
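The problem can be reproduced in a few lines. This is a sketch with illustrative names; it spells out scala.collection.Seq explicitly, since that is what plain Seq resolves to in the setup described above.

```scala
import scala.collection.mutable.ArrayBuffer

object SeqAliasDemo {
  val buffer: ArrayBuffer[Int] = ArrayBuffer(1, 2, 3)

  // Nothing in this type says "mutable", but the value behind it is.
  val looksImmutable: scala.collection.Seq[Int] = buffer

  val sizeBefore: Int = looksImmutable.size

  // Whoever still holds the buffer can change the "Seq" under our feet.
  buffer += 4

  val sizeAfter: Int = looksImmutable.size
}
```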

Additionally, some people are not happy that their namespace is polluted with tons of implicit conversions they don’t need.

On the other hand, programmers might use some features literally everywhere, and then they have to add tons of copy-pasted imports to every file. Having them in one place would simplify things a lot.

Custom Predef - how to?

First of all, we don’t want to have 2 Predefs at once. scalac has a special flag that prevents importing scala.Predef._. That flag is -Yno-predef.

For the brave and bold, there is also -Yno-imports. This prevents the import of scala._ (which contains e.g. primitives, Unit and Seq) and java.lang._ (everything that is available in Java without explicit imports).
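As a sketch, assuming the project is built with sbt and a compiler version that accepts these flags, they could be switched on in build.sbt like this:

```scala
// build.sbt (fragment)
scalacOptions ++= Seq(
  "-Yno-predef" // do not import scala.Predef._ automatically
  // , "-Yno-imports" // the bolder variant: also drop scala._ and java.lang._
)
```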

Once we have reduced our scope to a tabula rasa, we can populate it with what we want. We will start by creating a Predef object (obviously), where we will put everything we are sure should be visible globally:

package my.domain

object Predef {
  // here we'll put definitions
}

A good starting point to study is Slam Data’s Predef definition - we might start with that, and then add or remove stuff as we see fit.

But to be specific: what should we have there? From the original imports we might borrow some stuff:

- if we used -Yno-imports, we removed all built-in types from the scope, so we should reimport them, e.g.
```
type String = java.lang.String
```

- we surely will not have any immutable collections defined, so we need to add them back, e.g.

// for type alias
type Set[T] = scala.collection.immutable.Set[T]
// for companion object
val Set = scala.collection.immutable.Set

- $conforms implicit, as without it some operations break,
- omnipresent types like Nothing, Product, Serializable, RuntimeException and Throwable are handy to have,
- similarly the tailrec, deprecated and SuppressWarnings annotations,
- the StringContext type must be known if we are using any String interpolation,
- one should also consider whether one can live without some utils that the original Predef always provides without imports, like:
  - alternative syntax for tuple creation: a -> b,
  - array wrappers: Array(1, 2, 3).filter(predicate),
  - rich operations.

We might also put a few things there that could make our life easier:

- tailrec and deprecated annotations,
- make Seq immutable by using scala.collection.immutable.Seq instead of scala.collection.Seq,
- if one is using wartremover, then the custom Predef is the right place to put === and =/= to get rid of the type-unsafe == and !=.

To make it all accessible, all we need to do is

import my.domain.Predef._
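Putting the pieces together, a trimmed-down sketch might look like the following. Everything here is illustrative: the object is called CustomPredef and has no package clause only to keep the snippet self-contained (in a real project it would be the Predef object in my.domain), and the SafeEquals helper is one naive way to get === and =/=, not the definition used by any particular library.

```scala
// Hypothetical sketch of a custom Predef, meant to be compiled with -Yno-predef.
object CustomPredef {
  // Built-in types, re-exported under their usual names.
  type Int     = scala.Int
  type Boolean = scala.Boolean
  type Unit    = scala.Unit
  type String  = java.lang.String

  // Immutable collections: a type alias plus the companion object for each.
  type Seq[+A] = scala.collection.immutable.Seq[A]
  val Seq = scala.collection.immutable.Seq
  type Set[A] = scala.collection.immutable.Set[A]
  val Set = scala.collection.immutable.Set

  // Naive type-safe equality: both sides must have the same type A.
  implicit class SafeEquals[A](self: A) {
    def ===(other: A): Boolean = self == other
    def =/=(other: A): Boolean = self != other
  }
}

object CustomPredefUsage {
  import CustomPredef._

  // Seq is now the immutable one, whatever scala.Seq points to.
  val numbers: Seq[Int] = Seq(1, 2, 3)

  val same: Boolean = numbers.size === 3
  val different: Boolean = "a" =/= "b"
  // numbers.size === "3" // should be rejected at compile time: a String is not an Int
}
```

The alias-plus-val pair is needed because a type and its companion object are separate names: the alias makes Seq[Int] work in type positions, the val makes Seq(1, 2, 3) work in expressions.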

Could I not import it all over again?

Well, I have my own Predef imported in nearly all my files, so I know the annoyance, and I wish I could just tell the compiler to use this import as the new default.

As a matter of fact, other people share the same feeling. For a year now there has been an open pull request that would add new flags, -Ysysdef and -Ypredef, which do exactly what we need. Meanwhile, we can use Typelevel Scala, as it has apparently already merged this PR and released a version with this feature.

It sounds too good, where are the catches?

The first catch is that you have to import your predef everywhere.

The second catch is that tools like IntelliJ are rather unaware that some insane maniac could remove Predef’s content, and they won’t tell you if you forget about it. You will learn about it from a compiler error.

Additionally, some code relies on certain imports (e.g. scala.Predef.$conforms), so if you forget about them you might be surprised when some snippet won’t compile even though it works perfectly in e.g. Ammonite.
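One example of such code is toMap, which asks for implicit evidence that the elements of the collection are pairs. In the Scala versions this post is about, that evidence is, as far as I know, supplied by scala.Predef.$conforms. The sketch below (with illustrative names) summons the evidence by hand to show what the compiler normally fills in behind the scenes.

```scala
object ConformsDemo {
  val pairs: List[(Int, String)] = List(1 -> "one", 2 -> "two")

  // The kind of evidence toMap asks for: proof that the element type is a pair.
  val evidence: (Int, String) <:< (Int, String) =
    implicitly[(Int, String) <:< (Int, String)]

  // Compiles only because such evidence can be found implicitly.
  val byNumber: Map[Int, String] = pairs.toMap
}
```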

The last issue I have is not directly related to predefs, but to some poor decisions in the standard collections and their consequences. Namely: almost no library remembers about scala.collection.immutable.Seq. I might be insane, it might have been one of my mistakes, it might have been old versions of the libraries, but Circe and Slick did not support the immutable Seq out of the box. I was able to serialize a normal Seq, but the immutable one failed to encode unless I provided an Encoder myself. With Slick I had to map all results with sequence.to[Seq] to achieve my goal. In theory the immutable Seq inherits from the general one, but apparently that is enough to make them different beasts and break code in many places.
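The one-way relationship between the two Seq types is easy to show with the standard library alone. This is a sketch with made-up names; it uses toVector for the conversion because a Vector is always an immutable Seq.

```scala
import scala.collection.mutable.ArrayBuffer

object SeqMismatchDemo {
  def takesGeneral(xs: scala.collection.Seq[Int]): Int = xs.sum
  def takesImmutable(xs: scala.collection.immutable.Seq[Int]): Int = xs.sum

  // Stands in for what a library typed against the general Seq hands back.
  val fromLibrary: scala.collection.Seq[Int] = ArrayBuffer(1, 2, 3)

  // An immutable Seq is accepted wherever the general one is expected...
  val upcast: Int = takesGeneral(Vector(1, 2, 3))

  // ...but not the other way round:
  // takesImmutable(fromLibrary) // does not compile

  // so an explicit conversion is needed at the boundary.
  val converted: Int = takesImmutable(fromLibrary.toVector)
}
```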

So, is it worth it?

For a short-lived project? Probably not. You won’t even notice any pain points.

But for a bigger project that will take years and rely on tons of decisions made with this exact domain in mind, one place where all the tools are already present and the discouraged tools are missing sounds like a good idea. If you have something like a common package that all the other components reuse, a custom Predef could be the central piece that helps organize and shape the practices used in your project.

kubuszok.com