Skip to content
Pieter edited this page Oct 6, 2022 · 6 revisions

This page describes the structure of the AST ("abstract syntax tree") of COL ("common object language").

Definition

Let's start by looking at the anatomy of a node:

final case class FloorDiv[G](left: Expr[G], right: Expr[G])(val blame: Blame[DivByZero])(implicit val o: Origin) extends Expr[G] with FloorDivImpl[G]
  • final: It's not encouraged to further specialize the behaviour of a node, so we mark concrete nodes final.
  • case class: We maintain the invariant that equal nodes are semantically equivalent. For most nodes, this means that they should define equality structurally, which is implicit for case classes.
  • FloorDiv: Node names should be short but descriptive. Too long: FlooringDivision, too short: FDiv, okay: FlooringDiv.
  • [G]: A type parameter without concrete meaning: it should be propagated into all subnodes.
  • (left: Expr[G], right: Expr[G]): the properties and subnodes of the node. For case classes it is not necessary to specify val in the first parameter list.
  • (val blame: Blame[DivByZero]): nodes that are responsible for an error can optionally have an extra parameter list with one blame parameter.
  • (implicit val o: Origin): origins loosely describe the reason a node is in the tree. One simple reason is that it existed in the supplied files.
  • extends Expr[G]: denotes the node family the node belongs to, in this case the expressions.
  • with FloorDivImpl[G]: to keep the file shorter, the implementation of nodes is moved into a trait in a different file.

Hierarchy

All nodes are defined in one file, Node.scala. This is because we want Node to be a sealed type: this means the compiler knows the type is closed, and match expressions can be checked for completeness.

Node is at the top of the node hierarchy. Its direct descendants are:

  • NodeFamily: A marker trait. Its immediate descendants constitute the "node families".
  • Declaration: The declarations are all nodes that can be referred to. Its immediate descendants constitute the "declaration kinds".

Node families and declaration kinds act very similarly in rewriters. Rewriting a node family always requires that you return a node from the same node family. For example, an Expr node is always rewritten to an Expr node, a Statement node always to a Statement node, and so on. Similarly, declarations are always collected at the granularity of a declaration kind. A program collects GlobalDeclarations (but not ClassDeclarations), a class collects ClassDeclarations (but not Variables), etc.

Adding a node

TL;DR: Copy a similar node in Node.scala, edit CoercingRewriter, implement CoercionUtils.getCoercion and Types.leastCommonSuperType for types, edit ColDefs for new declaration kinds, edit Printer.

All nodes have to have their definition in Node.scala to inherit Node. Since helpers are generated based on the structure of this file, you have to take into account:

  • Only classes and case classes are considered concrete nodes, other declarations (traits, abstract classes) are only considered for the inheritance hierarchy.
  • Nodes must have one unbounded type parameter named G, and either two or three parameter lists:
    • The type parameter must be named G, must have no type bounds, and must be exactly propagated into all node types.
    • The first parameter list defines the syntax tree. For non-case classes, these parameters must be marked with val. The allowed types are:
      • One of the declaration kinds (see ColDefs.DECLARATION_KINDS)
      • One of the node families (direct descendants of NodeFamily in Node.scala). No other (category of) node is allowed.
      • A Ref to a declaration. Contrary to other positions, Ref's may be to a more narrow category of declaration than a kind.
      • Primitive types (Int, String, Boolean, BigInt, BigDecimal)
      • The special type ExpectedError
      • Seq[_], Set[_], Option[_], (_, _) and (_, _, _) of supported types.
    • The optional second parameter list is val blame: Blame[_], where _ is a descendant of VerificationFailure.
    • The last parameter list is the origin, and must be exactly (implicit val o: Origin). Types have the default value DiagnosticOrigin for types.
  • Nodes must be a member of exactly one NodeFamily, or one kind of declaration. Nodes may be a NodeFamily or declaration kind all by themselves, see e.g. ApplicableContract, Variable.

Furthermore, consider:

  • Nodes are not implemented in Node.scala, other that mutable fields for resolution. Instead:
    • For a node T, make a trait TImpl in an appropriate package.
    • The trait must read trait TImpl[G] { this: T[G] => /* body */ }
  • Make sure the node is typeset in Printer
  • Define the typing rule for the node in CoercingRewriter
  • Exprs must define a method def t: Type[G] in their implementation trait
  • Types must consider both Types.leastCommonSupertype and CoercionUtils.getCoercion
  • Coercions must have a default implementation in CoercingRewriter.applyCoercion
Clone this wiki locally