This repository has been archived by the owner on Sep 1, 2020. It is now read-only.

Typelevel Scala Compatibility Guide

non edited this page Sep 12, 2014 · 7 revisions

Introduction

Since Miles Sabin's announcement of the Typelevel Scala fork there has been a flurry of activity in suggesting changes, opening issues, and submitting patches. This is exciting, and an indication of the promise that this fork has.

Many of the proposed changes may create compatibility challenges. The introductory blog post calls this a "conservative fork" (intended to be reminiscent of a conservative extension to a logical theory). The implication is that programs compiled by scalac must still work (with the same meaning) in tlc, but that new programs may be possible.

This post tries to lay out what that means in concrete terms. It is indebted to Daniel Spiewak's initial proposal for compatibility rules.

Versions

For now, tlc is expected to have versions corresponding to every scalac release. So when we say tlc must be compatible with scalac, we mean that tlc-2.x.y must be compatible with scalac-2.x.y.
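As a rough sketch of this pairing rule, the check amounts to comparing full version numbers. The `Version` type and the `mustBeCompatible` helper below are invented for illustration; they are not part of tlc or scalac.

```scala
object VersionPairing {
  // Hypothetical model of a release number such as 2.11.2.
  final case class Version(major: Int, minor: Int, patch: Int)

  // Parses "2.x.y"; anything that is not three integers yields None.
  def parse(s: String): Option[Version] =
    s.split('.') match {
      case Array(a, b, c) =>
        scala.util.Try(Version(a.toInt, b.toInt, c.toInt)).toOption
      case _ => None
    }

  // The compatibility requirement only applies when both releases
  // carry exactly the same version number.
  def mustBeCompatible(tlc: String, scalac: String): Boolean =
    (parse(tlc), parse(scalac)) match {
      case (Some(t), Some(s)) => t == s
      case _                  => false
    }
}
```

Under this reading, `VersionPairing.mustBeCompatible("2.11.2", "2.11.2")` holds, while pairing 2.11.2 with 2.11.1 is outside the scope of the requirement.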

Binary compatibility

The expectation is that tlc and scalac will be binary compatible. In other words, classfiles created by scalac should be usable with tlc, and classfiles created by tlc should be usable with scalac. For now this is a hard requirement: -Z flags will not be allowed to override it.

If tlc defines new types, annotations, or any other objects which must be available at runtime, they will need to be provided in some kind of TLC-specific JAR which could be used with either tlc or scalac. Thus, code using these features would maintain binary compatibility with scalac.
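To make the JAR idea concrete, here is a minimal sketch. The `tlcsupport` object and the `pure` annotation are hypothetical names chosen for illustration; no such artifact is described in this guide. The point is only that the annotation is ordinary Scala, so plain scalac can compile code that uses it as long as the support JAR is on the classpath.

```scala
// Stand-in for the contents of a hypothetical TLC-specific support JAR.
object tlcsupport {
  // Hypothetical annotation; an ordinary class that any Scala compiler can load.
  final class pure extends scala.annotation.StaticAnnotation
}

// User code: depends only on the support JAR, not on compiler internals.
object UsesSupportJar {
  import tlcsupport.pure

  // scalac would ignore the annotation's meaning but still compile this;
  // tlc could choose to act on it.
  @pure def double(x: Int): Int = x * 2
}
```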

The only possible exceptions to this binary compatibility story might involve modularization efforts, where packages could be moved from a core jar into an external compatibility jar. This should only be undertaken with caution and careful planning.

Source compatibility

In the standard configuration tlc should be able to compile any source code which scalac supports. The reverse is not true -- tlc is permitted to add support for syntax that is not available in scalac, as long as the resulting binaries are compatible. An early example of this is the new syntax for Byte and Short literals (#12).
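For context, this is how Byte and Short values are written in standard Scala today. The sketch deliberately does not show the new tlc literal syntax, since this page does not spell it out.

```scala
object NarrowLiterals {
  // With an expected type of Byte or Short, an Int literal that fits is narrowed.
  val b: Byte  = 127
  val s: Short = 1000

  // Without an expected type the literal stays an Int,
  // so an explicit conversion is needed to get a Byte.
  val bytes = List(1.toByte, 2.toByte) // List[Byte]
  val ints  = List(1, 2)               // List[Int]
}
```

A dedicated literal syntax would remove the need for `toByte` and `toShort` in the second case, and because it is purely additive, existing scalac programs are unaffected.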

However, in some cases developers may wish to make tlc more restrictive by rejecting some programs which scalac would accept. This is allowed as long as the restriction is placed behind a -Z flag. We can broadly categorize these types of changes based on their expected impact and complexity.

Impact

  • Very low impact: most authors and projects should be unaffected.
  • Low impact: some projects affected, but few modifications per-project expected.
  • Moderate impact: most projects may need modification; some may need significant changes.
  • Severe impact: almost all projects are expected to need major modifications.

Complexity

  • Low complexity: a mechanical (or nearly mechanical) rewrite is possible.
  • Moderate complexity: simple local code review should be able to correct all issues.
  • Severe complexity: major code restructuring may be required to work around changes.

These categories are not set in stone, but are provided to illustrate how -Z options which break source compatibility should be understood. The lower the impact and complexity of a change, the easier it is to justify its inclusion in tlc (and also, the easier it is to include it in -Zomnibus, a collection of flags expected to be useful to most projects and authors).
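One way to picture the two scales is as a pair of ordered enumerations. This is only a sketch: the numeric ranks, the `Proposal` type, and the idea of adding the ranks into a single `cost` are assumptions made for illustration, not a scoring scheme adopted by the project.

```scala
object FlagTriage {
  // Ranks mirror the Impact list above; lower is easier to justify.
  sealed abstract class Impact(val rank: Int)
  object Impact {
    case object VeryLow  extends Impact(0)
    case object Low      extends Impact(1)
    case object Moderate extends Impact(2)
    case object Severe   extends Impact(3)
  }

  // Ranks mirror the Complexity list above.
  sealed abstract class Complexity(val rank: Int)
  object Complexity {
    case object Low      extends Complexity(0)
    case object Moderate extends Complexity(1)
    case object Severe   extends Complexity(2)
  }

  // A proposed source-incompatible change; names are placeholders.
  final case class Proposal(name: String, impact: Impact, complexity: Complexity) {
    // Assumption: "easy to justify" is approximated by a low combined rank.
    def cost: Int = impact.rank + complexity.rank
  }

  // Orders proposals from easiest to hardest to justify.
  def easiestFirst(ps: List[Proposal]): List[Proposal] = ps.sortBy(_.cost)
}
```

For example, a proposal rated `Impact.VeryLow` and `Complexity.Low` sorts ahead of one rated `Impact.Moderate` and `Complexity.Severe`.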

Note that all of these categories assume that code which is successfully compiled by scalac and tlc will have the same semantics and performance characteristics. The next section addresses cases where that may not be true.

Semantic Compatibility

The same code compiled by scalac and by tlc is expected to produce the "same" program. In the absence of a precise definition, here is an evocative one:

For any program, semantic compatibility means one of:

  1. scalac and tlc should produce identical bytecode.
  2. As #1 but tlc may leave out instructions which have no effect.
  3. tlc may produce different bytecode which has the same effects and results for all inputs.
  4. As #3 but tlc may omit allocations which are not observable.

This definition tries to capture what one might mean by "semantic compatibility" while still being open to differences in bytecode.

Semantic incompatibilities are more serious than source incompatibilities, since a program moved from scalac to tlc might compile with no warnings, yet behave differently. However, some semantic changes are benign and useful.

Here are some broad categories of possible semantic incompatibilities:

  • Bug fixes: fix behavior widely acknowledged as erroneous (deadlock, crash, etc.)
  • Efficiency improvements: improve performance, but may reorder (or skip) side-effects.
  • Type changes: might change the result type/value of particular expressions.
  • Major changes: some combination of the previous three.
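The efficiency category is the easiest to underestimate, so here is a small sketch of how skipping a side effect becomes observable. The optimizer behavior shown is hypothetical; `current` and `optimized` are hand-written stand-ins for two possible compilations of the same source.

```scala
object SkippedEffects {
  // Observable state: anything that reads this log can tell whether the call ran.
  val log = scala.collection.mutable.ListBuffer.empty[String]

  def expensive(n: Int): Int = {
    log += s"computed $n"
    n * n
  }

  // The result is discarded, but the log entry is still written.
  def current(n: Int): Int = {
    val unused = expensive(n)
    n + 1
  }

  // What a hypothetical optimizer might produce if it treated `expensive` as pure.
  def optimized(n: Int): Int = n + 1
}
```

Both versions return the same value, yet only `current` appends to `log`. A program that depends on that entry would change behavior without any compiler warning.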

It's hard to characterize these more precisely, since the number of possible changes is very large. Examples of specific semantic incompatibilities include:

  • disabling some numeric coercions
  • changes to how partial functions are implemented/evaluated
  • changes to implementations of equals, hashCode, toString, etc.
  • using a different RNG in scala.util.Random
  • some potential inlining changes
  • changing compilation based on purity annotations/analysis
  • changing how implicit resolution works
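As a sketch of why items like these matter, the code below compiles cleanly today but quietly depends on two of the behaviors listed: the synthesized `toString` of a case class and the sequence produced by a seeded `scala.util.Random`. The object and method names are invented for illustration.

```scala
object FragileAssumptions {
  final case class Point(x: Int, y: Int)

  // Relies on the compiler-generated toString of a case class.
  def render(p: Point): String = "at " + p.toString

  // Relies on a fixed seed always yielding the same sequence.
  def sample(seed: Long): List[Int] = {
    val rng = new scala.util.Random(seed)
    List.fill(3)(rng.nextInt(100))
  }
}
```

Tests that compare `render` output against stored strings, or that hard-code the numbers returned by `sample`, would keep compiling after such a change and only fail (or silently diverge) at runtime.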

Like changes which break source compatibility, these changes must exist behind a -Z flag. Also, it's important to consider how obvious the implications of these changes will be to potential users.

Conclusion

It is important to the project that tlc be a drop-in replacement for scalac. It is easy to relax compatibility requirements but very difficult or impossible to tighten them.

We do have the option to use -Z flags to support interesting-but-incompatible changes. However, these flags have the effect of parameterizing the language (for N flags, there are potentially 2^N possible outcomes to consider). Thus, we'd like to limit the flags to those whose implications and interactions are obvious and limited in scope. Of course there will be exceptions (and users are free to create their own forks with more radical changes) but for now these are the guiding principles.
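The 2^N growth is easy to see by enumerating the configurations directly. The flag names below are placeholders, not real tlc options.

```scala
object FlagCombinations {
  // Placeholder names standing in for N independent -Z flags.
  val flags: Set[String] = Set("flagA", "flagB", "flagC")

  // Every subset of the flags is a distinct compiler configuration,
  // including the empty set (no flags enabled).
  val configurations: List[Set[String]] = flags.subsets().toList
}
```

With three flags there are already eight configurations to reason about, and each additional flag doubles that number.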