Anything new? #7

Open
ice1000 opened this issue Oct 28, 2019 · 45 comments

ice1000 commented Oct 28, 2019

What's the plan after poly-instantiation? I'd like to see:

  • Instance resolution (typeclasses with named instances)
  • Optimized reduction on naturals/booleans/whatever can be built into a language
    • I suppose Agda has a fastReduce that throws the term onto an abstract machine to do the computation, which sounds cool
  • Reflection? Tactics?

AndrasKovacs commented Oct 28, 2019

One important TODO I've considered is more pedagogical material and more docs for the numbered packages. I'm unfortunately a bit lazy about this, and more motivated to figure out new stuff.

My general idea about packages is that they should be the simplest possible implementations of features, which are still sensible in the sense that you could scale them up to production strength without conceptual change.

Anyway, new packages that I considered:

  • Switching to de Bruijn indices and levels (see the sketch after this list). This is pretty important in the long run: as we move to more complicated features, it becomes harder to handle shadowed names correctly.
  • Indexed inductive types + general recursion + dependent pattern matching. This is a pretty big update, but I have a pretty good idea of how to implement it. It would be similar to mini-tt, but not entirely the same. It also requires a top-level scope, with metas scoped at the top level as well. In principle one could have generic codes for inductive types, but IMO, for performance and ergonomics reasons, it's better to have primitive inductive definitions in distinguished scopes.
  • Advanced unification: pruning, postponing, unification with sigma types. I have some ideas (partially stemming from the lessons of poly-instantiation) about fast and elegant ways of eliminating sigma types from equations and meta types, but they are not yet fully worked out.
  • Instance resolution. Unfortunately, here I have no clear picture of the best design.
  • Modules. Same situation as with instance resolution.
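A minimal Haskell sketch of the index/level bookkeeping behind the first point, with illustrative names (this is not code from any of the packages), assuming the usual convention that syntax uses indices while values and environments use levels:

-- Indices count binders from the innermost one outwards (used in syntax);
-- levels count from the outermost one inwards (used in values).
newtype Ix  = Ix Int  deriving (Eq, Show)
newtype Lvl = Lvl Int deriving (Eq, Show)

-- Converting a level to an index when reading a variable back into syntax,
-- given the current size of the context.
lvl2Ix :: Lvl -> Lvl -> Ix
lvl2Ix (Lvl envSize) (Lvl x) = Ix (envSize - x - 1)

-- Environments are snoc lists of values; an index is just a position.
data Env a = Nil | Env a :> a

lookupIx :: Env a -> Ix -> a
lookupIx (_   :> v) (Ix 0) = v
lookupIx (env :> _) (Ix n) = lookupIx env (Ix (n - 1))
lookupIx Nil        _      = error "variable out of scope"

With this convention shadowing is a non-issue: a fresh variable during evaluation is just the next level, and quotation converts it back with lvl2Ix.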

I should note that I believe the glued evaluation from smalltt is critically important in a production-strength implementation. I would like to add glued elaboration here as a package as well. But I also want to implement, at some point, the "fully scoped" version of smalltt, with arbitrary local metas, because it seems to be conceptually the most elegant and also the most efficient elaborator, and it would also play well with instance constraints, postponed constraints, and let-generalization.

Agda has a fastReduce

The evaluator I have in all packages is already on par with Agda's fastReduce. It can be made a bit faster by switching to de Bruijn indices, and less space-leaky by adding environment trimming, but that's it. Further improvement would only come from switching to bytecode interpretation, possibly with a custom runtime system (as in Lean 4 or the Coq VM), or from compiling to machine code.

Optimized reduction on naturals/booleans/whatever

This is very easy with hardwired types, like in Agda. Would be harder to do in an extensible/generic way.

Reflection? Tactics?

I'm interested in this as well but have no strong opinions yet, and haven't deeply researched the existing approaches.

@atennapel

A de Bruijn version of Holes would be awesome :)


molikto commented Nov 21, 2019

@AndrasKovacs, in case you are interested: I just finished HOAS to JVM bytecode compilation https://github.com/molikto/mlang/blob/master/src-main/src/main/scala/mlang/compiler/PlatformEvaluator.scala#L903. It doesn't "full reduce" though; as in cubicaltt, there are some complications.


molikto commented Nov 21, 2019

Also, may I ask what exactly a "glued evaluator" is? My understanding is that when solving a meta, a whnf is needed, and a glued evaluator is able to trace back to a much simpler non-reduced term, so the meta solution is more readable.


AndrasKovacs commented Nov 21, 2019

HOAS to JVM bytecode

Nice. Do you have benchmark data? Could you perhaps benchmark this as well? I've been interested in HOAS JIT for a while, but haven't gotten around to testing it.

glued evaluator

See smalltt and the slides linked from there. The point is that whenever we attempt to solve a meta with a value, we want to

  • occurs/scope check efficiently
  • quote the solution candidate value to a small term, where definitions and metas are not necessarily unfolded

The two things are unfortunately at odds with each other: the first requires full unfolding (preferably with a call-by-need strategy), while the second requires avoiding unfolding as much as possible. A glued evaluator computes two (or more) differently represented values from the same syntactic term. I used it in smalltt to always have the two specialized values available when solving a meta. It also avoids work duplication compared to having two different evaluators instead of a single glued one.
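A toy Haskell sketch of the idea (illustrative names only; this is not smalltt's actual representation, which is more involved and doesn't use string-keyed maps). Applications headed by a top-level definition keep the folded head and spine around and additionally carry the lazily unfolded value, so one evaluation pass serves both purposes above; metas would be glued the same way as top-level definitions.

import qualified Data.Map as M

type Name = String

data Tm = Var Name | Top Name | Lam Name Tm | App Tm Tm

data Val
  = VLam Name (Val -> Val)
  | VNe  Name [Val]        -- neutral: local variable applied to a spine
  | VTop Name [Val] Val    -- top-level head + folded spine, plus the (lazy)
                           -- unfolded value of the same application

type TopEnv = M.Map Name Val   -- values of top-level definitions
type Env    = M.Map Name Val   -- values of local variables

eval :: TopEnv -> Env -> Tm -> Val
eval top env t = case t of
  Var x   -> env M.! x
  Top x   -> VTop x [] (top M.! x)        -- keep the head, glue the unfolding
  Lam x b -> VLam x (\v -> eval top (M.insert x v env) b)
  App f a -> vApp (eval top env f) (eval top env a)

vApp :: Val -> Val -> Val
vApp (VLam _ f)    a = f a
vApp (VNe  x sp)   a = VNe  x (sp ++ [a])
vApp (VTop x sp v) a = VTop x (sp ++ [a]) (vApp v a)   -- extend both views

-- Occurs/scope checking walks the third field of VTop (the full unfolding,
-- forced on demand); quoting a meta solution reads back the head and folded
-- spine instead, yielding small terms.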


molikto commented Nov 21, 2019

I just tried it:

https://github.com/molikto/mlang/blob/master/library/01_foundations/00_builtin.poor

forcing 10M does not seem to terminate in reasonable time

forcing 1M:
// by value (branch full-reduction): polymorphic version: 5s, non-polymorphic version: 79ms
// by need (branch master): 17s, 5s

conversion 1M, polymorphic, in both branches: ~5s
conversion 10M, polymorphic, in both branches: stackoverflow after ~5s

conversion checking the non-polymorphic version at 10M seems fast in both branches. (why?)


I don't know if my implementation is efficient though... let me debug and see what's happening...

I wonder why Coq native_compute failed to compile....


Tried on the JVM:

JVM (implemented in Scala): 10M in 8ms

Fixed a bug in the full-reduction branch; now forcing 10M takes 2 ms (even faster than Scala?!!!)

But conversion checking seems NOT affected... still 5s and overflow.

I guess full reduction is really good for computing normal forms of closed terms.


molikto commented Nov 21, 2019

Also, this is what I do for reify: https://github.com/molikto/mlang/blob/master/src-main/src/main/scala/mlang/compiler/semantic/10_value.scala#L122

Basically, it tries to see if the value is a whnf computed from another term; if so, that other term is used as the meta solution, since it should be simpler.


AndrasKovacs commented Nov 21, 2019

2 ms (even faster than Scala?!!!)

A bit suspicious to me; maybe you are not doing the same amount of work. For forcing, you could also use a toInt function which turns a Church numeral into a native Int; then it's clearer whether reduction is correct.
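For illustration, the same trick written directly in Haskell (toInt, ten and tenMillion are just illustrative definitions, not anything from mlang): converting the Church numeral to a machine Int forces the whole structure, so neither the result nor the timing can look right unless reduction actually happened.

ten :: (a -> a) -> a -> a
ten s z = s (s (s (s (s (s (s (s (s (s z)))))))))

-- Church multiplication is composition on the successor argument, so seven
-- nested tens give 10^7.
tenMillion :: (a -> a) -> a -> a
tenMillion s = ten (ten (ten (ten (ten (ten (ten s))))))

toInt :: ((Int -> Int) -> Int -> Int) -> Int
toInt n = n (+ 1) 0

main :: IO ()
main = print (toInt tenMillion)   -- prints 10000000, after doing all 10M steps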


molikto commented Nov 22, 2019

2 ms (even faster than Scala?!!!)

A bit suspicious to me; maybe you are not doing the same amount of work. For forcing, you could also use a toInt function which turns a Church numeral into a native Int; then it's clearer whether reduction is correct.

You are right!

In the full-evaluation mode, I measured the wrong thing!

So trivial forcing took 106ms; using count: 1716ms.

This doesn't include the time to generate the JVM bytecode and load it into the JVM; I only measured the time to evaluate the JVM expression.

[info] count simple construct 10000000

For conversion checking, I think it is mostly because the recursive function will stack-overflow on such a deep structure; it would not be longer than 5s if we changed it to a loop (if that's possible?).


Sorry for the wrong results. So the correct results are:

trivial forcing 10M:
master: OOM
full-reduction: 106ms

conversion 1M:
master: I got various results depending on whether I run from SBT (13973ms) or IntelliJ IDEA (1125ms); I think I have different memory settings
full-reduction: the same.

conversion 10M: stackoverflow on both


I wonder why my whnf implementation is so slow here. Maybe HOAS doesn't help with evaluation speed, only when you do cbv with HOAS (full reduction).

I wonder about a full-reduction cubical evaluator. Currently:

computing "problem" https://github.com/mortberg/cubicaltt/blob/pi4s3_nobug/examples/problem.ctt#L519 takes 40ms in cubicaltt and 200ms in mlang.

A full-reduction cubical evaluator will be computing open terms a lot of the time, because that is just how cubical theories work, and you need "renormalization", which might be a waste of time (because later you need to do it again after a "swap"). So I am not sure whether full reduction will help a cubical evaluator... but I am not doing it now; it is a lot of change to make full reduction work generally. What I have on the full-reduction branch is just a hack to make the Church numbers work.


AndrasKovacs commented Nov 25, 2019

@molikto my expectation is that JIT HOAS should always be faster, for cubical as well. Also, I did some Haskell benchmarking which showed that the difference between HOAS closed evaluation and HOAS open-term normalization is not big, maybe 10-20%. I think the difference should not be dramatic in other systems either.


molikto commented Nov 25, 2019

@AndrasKovacs it seems cubicaltt uses cbv and renormalizes after context restriction (although it is in Haskell, a lazy language; I don't know if this makes any difference?)

mlang instead always works with whnfs.

In cbv you need to renormalize, in cbn you don't, so this complicates things. But I don't actually know if this matters that much. Maybe I should try converting mlang to cbv... (but then it seems you still need whnf in typechecking, because you need a good small term for the meta solution?).

@AndrasKovacs

@molikto What do you mean by "renormalize"? Can you give a link to a line in cubicaltt?

Cubicaltt uses call-by-need AFAIK.


molikto commented Nov 25, 2019

https://github.com/mortberg/cubicaltt/blob/master/Eval.hs#L106 here: after a nominal operation (act, swap, etc.), neutral values need to be eliminated again (compare the VPair and Fst cases), because a context restriction can make a neutral value evaluate further.

In App r s -> app (eval rho r) (eval rho s), the argument is evaluated, but because of the laziness of Haskell, it is call-by-need? If the same code were in a strict language, it would be cbv?


AndrasKovacs commented Nov 25, 2019

OK, I see what you mean by renormalization. I think it is just inefficient design in cubicaltt. They seem to use nominal sets for interval variables, but normal NbE-style variables and evaluation for "normal" variables. IMO, nominal sets should just be removed entirely, and interval variables handled similarly to normal variables. That way, this instance of the "renormalization" issue disappears as well.

App r s -> app (eval rho r) (eval rho s), the argument is evaluated, but because lazyness of haskell, it is call by need? if same code in a strict language, it will be cbv?

Yes.


jozefg commented Nov 25, 2019

I would think that this is actually necessary in cubical type theories because of the odd behavior of dimension variables. The problem is that if you perform a renaming which sends previously distinct dimension variables to the same thing, you can eliminate a (now trivial) coercion or composition. In particular, coe^{r -> r}_A(M) = M, but when the two endpoints are distinct (r /= r'), coe^{r -> r'}_A(M) can be neutral if A is neutral. Similarly, if you have a composition with a face r = r, then you can reduce to that face.

The upshot of all of this is all the current implementations of cubical type theories I'm aware of have to be very careful to do seemingly redundant work.


molikto commented Nov 25, 2019

@jozefg hello, are you implementing XTT? :-)


jozefg commented Nov 25, 2019

@molikto Sadly, I think none of us are currently working on implementing XTT. In my case, it's due to time constraints and some other projects I'm more interested in at the moment, along with the fact that I'm unsure how to extend the theory of XTT with some additional features I would like.

@AndrasKovacs

@jozefg I'm aware of such reductions in CTTs. My observation is specifically about evaluation in cubicaltt, which seems to have rather major efficiency issues. I hadn't looked at the cubicaltt source closely before; I noticed this just now.

On this line, we see that path application uses the @@ operator, which in turn uses single substitution in the beta-redex case. This is already bad news. However, because everything is an NbE-style Val, single substitution needs to act on Val-s. On this line we see single substitution acting on closure environments, which is really bad. During operation of usual NbE, there is a huge amount of implicit sharing in closures. This can be observed by trying to pretty print closures and noting that their size blows up quickly. Now, single substitution on closures, just like printing, kills all of that sharing.

Unsurprisingly, the solution is to get rid of naive substitution. If we want to keep interval variables and expressions separate from the non-interval ones, we can just have dual or explicitly interleaved value environments, and handle interval substitutions the same way as non-interval ones. This setup is also compatible with the cubical reduction rules.
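A rough Haskell sketch of the "dual or explicitly interleaved value environments" idea, ignoring all the genuinely cubical operations (composition, restriction, etc.) and using made-up names; the only point is that interval variables get environment entries and closures just like ordinary variables, so path beta-reduction is closure instantiation rather than substitution:

newtype Ix  = Ix Int      -- de Bruijn index (syntax)
newtype Lvl = Lvl Int     -- de Bruijn level (values)

data Tm                   -- terms and interval expressions in one syntax
  = Var Ix | Lam Tm | App Tm Tm
  | I0 | I1 | IVarT Ix    -- interval endpoints and interval variables
  | PathLam Tm            -- binds an interval variable
  | PathApp Tm Tm         -- applies a path to an interval expression

data IVal = VI0 | VI1 | VIVar Lvl

data Val
  = VLam Closure
  | VPathLam IClosure
  | VNe Lvl [Elim]        -- neutral: local variable + eliminations

data Elim = EApp Val | EPathApp IVal

-- A single environment interleaving both kinds of entries. Binders extend
-- it; nothing ever substitutes into a closure.
data Entry = EV Val | EI IVal
type Env   = [Entry]

data Closure  = Closure  Env Tm
data IClosure = IClosure Env Tm

eval :: Env -> Tm -> Val
eval env t = case t of
  Var (Ix i)  -> case env !! i of EV v -> v; _ -> error "expected a value entry"
  Lam b       -> VLam (Closure env b)
  App f a     -> vApp (eval env f) (eval env a)
  PathLam b   -> VPathLam (IClosure env b)
  PathApp p r -> vPathApp (eval env p) (evalI env r)
  _           -> error "interval expression in term position"

evalI :: Env -> Tm -> IVal
evalI _   I0            = VI0
evalI _   I1            = VI1
evalI env (IVarT (Ix i)) = case env !! i of EI r -> r; _ -> error "expected an interval entry"
evalI _   _             = error "term in interval position"

vApp :: Val -> Val -> Val
vApp (VLam (Closure env b)) a = eval (EV a : env) b
vApp (VNe x sp)             a = VNe x (sp ++ [EApp a])
vApp _                      _ = error "vApp"

-- "Substituting" an interval value for a path binder is just instantiation:
vPathApp :: Val -> IVal -> Val
vPathApp (VPathLam (IClosure env b)) r = eval (EI r : env) b
vPathApp (VNe x sp)                  r = VNe x (sp ++ [EPathApp r])
vPathApp _                           _ = error "vPathApp"

Of course, this only covers the easy cases (path lambdas and applications); operations that genuinely need a substitution with no closure in hand are not addressed here.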

@jonsterling

@AndrasKovacs I am interested in what you are saying, but I don't see how it is possible to treat interval substitutions "the same" as we treat other substitutions. What I mean is the following:

In the Coquand-style nbe algorithms, as you know, we never do substitutions; instead, we have closures, and all the "substitutions" that need to happen correspond exactly to instantiation of closures.

It would be nice if cubical type theory's interval variables worked this way! Unfortunately, there are plenty of times where you apparently need to do a "substitution" which is non-local, in the sense that you do not have your hands on some closure that you can just instantiate.

Have you thought about these algorithms and whether there is a way around this problem? I have been wanting to think about it again. Unless we can fix this, I think cubical type theory may be kind of hopeless as far as efficient implementation is concerned.


jonsterling commented Jan 5, 2020

(Of course, I am aware that cubicaltt and redtt suffer from some very naive mistakes in implementation: but I am talking about the apparently more difficult problems which can't be chalked up to hurried implementation.)


AndrasKovacs commented Jan 6, 2020

Unfortunately, there are plenty of times where you apparently need to do a "substitution" which is non-local

I guess you mean the cases where we have to pattern match under abstraction, e.g. in the composition computation rules? I agree that these are not so straightforward. But (a) there should be several possible implementations which are not nearly as bad as recursively de-sharing closures, and (b) even if we handle the tricky cases naively, the simple cases, such as path lambdas and applications, can still benefit greatly from a closure-based implementation.

I should note that the issue of things not lining up neatly as closure instantiation is not unique to cubical TT. In fact, it pops up when elaborating plain old type theory as well. A classic example is inferring a type for an annotated lambda λ (x : A). t. This yields a type Γ, x : A ⊢ B, which is given as some Val Coquand-style; but we want to return a Pi with a closure for the codomain. So we need to turn a value in Γ, x : A into a closure.

This is similar to a (simplified) coercion computation case, where we have coe : Path A B → A → B, and the rule

coe (λ i. A → B) f ≡ λ a. coe (λ i. B) (f (coe (λ i. A[i ↦ 1-i]) a))

Here, the computation rule peeks under the closure. The obvious way to match on A -> B would be to evaluate the closure body with i instantiated to some neutral variable, but then we again need to turn A and B values in Γ, i to closures, in order to build the right-hand-side value (ignoring for the moment the additional 1-i complication).

I list some solutions for doing the value-closure conversion.

  1. Quoting to nf, then building the closure. This is super simple and I blithely use it whenever I don't care about performance. The downsides are obvious.
  2. If we already use "glued" evaluation, we can take the "local" part of the glued value, quote to local nf, then build the closure. I used this solution in smalltt. This is much more efficient than quoting to full nf, and is very simple to implement if we have glued evaluation. The downside is that we have to implement glued evaluation.
  3. This one is really a closure-closure conversion, and thus only works for the coe case: use typecasing projections to get A and B types, in a somewhat similar way to the XTT paper. Here, we first evaluate the closure body to check that it's actually A -> B. Then, we use the typecase projections on the unevaluated closure body, to get terms for A and B. So, the projections are purely implementation details, and we need not expose them in the surface language. I feel that this could be quite efficient.
  4. Do what cubicaltt does, and use an eager substitution operation on values. I don't know for sure, but I think that this is roughly as bad as option 1. In particular, it forces every call-by-need thunk and removes all sharing in values and closures, and repeated substitution has to copy everything all over. Option 1 also forces every thunk and value, but it doesn't necessarily force closures. The difference can be significant when closures are not trimmed/strengthened, as in cubicaltt, because in that case many closure entries would be otherwise unused, and forcing a closure forces a whole local scope. One might object that cubicaltt performs value substitution very lazily, so all of this forcing may not happen, but lazy substitution isn't good either; it leaks more space, it results in a thunk-fest, and when we actually want to do conversion checking the accumulated thunks cause an inefficient ping-pong control flow with lots of indirect jumps.
  5. Use smarter delayed substitution on values, with more general explicit substitutions. The issue here is that there are many choices with regards to pushing explicit substitutions around and composing them, and I don't know much about how it should be done. I'm confident something way better than Option 4 is possible. I've been meaning to try my hand at this and do some benchmarking. Another possible issue is that it's easy to prevent usage of native GHC call-by-need evaluation when we use general explicit substitutions (and IMO the call-by-need support is a key feature that makes GHC the top choice for fast type checking, unless someone wants to roll their own runtime system in C/Rust).

So there are definitely many things to try. I feel now a bit motivated to explore this and attempt some high-performance cubical TT implementation. There's a chance I'll start doing this a couple of weeks from now when the current paper deadline season is over.

@jonsterling

I would be very interested in continuing to discuss this, and perhaps collaborating; our research group is beginning to resume thinking about how to properly evaluate cubical type theory, so it would seem to be good timing.

I agree that in the case of implementing coe, some kind of closure/value munging can be done, but the problem is that there really do appear to be cases that do not fall under that category. These are deep in the algorithm of composition in glue (or in V + formal composite types, if you follow the AFH style); I tried a number of "smart" things of the style you are advocating, and it was unfortunately late in the process that I found they didn't scale up to implementing the harder parts of the evaluation semantics of cubical type theory.

Perhaps a careful analysis of composition in glue / V / fcom can provide some insight, but at this point I am quite cautious about any approach which doesn't allow me to just substitute a dimension variable non-locally (even though I am aware that is very inefficient).

In redtt, we had some delayed substitutions going on, but these were incorrectly implemented and very hard to debug.

@jonsterling

Regarding calling quote from eval, I have not considered doing this because I do not know how I would prove such an algorithm correct in principle (this approach was considered back in the original OTT paper, but later abandoned when they found an alternative method which was more conventional).

Of course, I believe that one must interleave quotation and evaluation in order to implement features like regularity; it is perhaps a blessing that we don't have regularity in most versions of cubical type theory ;-)


molikto commented Jan 7, 2020

@AndrasKovacs Mortberg suggested that I check out the pi4s3_nobug branch of cubicaltt when working on mlang. This is the branch with the most up-to-date computation rules, I think.


AndrasKovacs commented Jan 7, 2020

@jonsterling Thanks for the comments. I don't yet have a good grasp on gluing composition; I will hopefully be more qualified to comment after I've delved deeper into this.

I do not know how I would prove such an algorithm correct in principle

NbE with universes has to include quote in the model together with the correctness of everything, and I expect that this wouldn't change when moving to NbE for cubical TT. So it doesn't seem far-fetched to me.

@jonsterling

@AndrasKovacs I am not sure what you mean about nbe with universes; it is not the case that eval and quote are mutually recursive in nbe with universes.

@jonsterling

BTW, I have recently been studying your "glued evaluators" and it is really fascinating; I think I understand in principle what you are doing, but at the technical level I got a little confused... My problem was that I was trying to look at the code and find the moment at which you did something different on the local vs the global side (where the "rubber meets the road"); could you point me to some decisive line of code which would give me the insight?

Once I understand it a bit better, I'd like to explore the possibility of adopting this style in my future implementations of type theory.


ollef commented Jan 7, 2020

Inspired by your idea of glued evaluation, I've done the following in Sixty. I've found it to be quite easy to implement and it seems to allow many of the same optimisations. The gist of it is to add an extra domain constructor, which tracks a head and spine that isn't unfolded, as well as a lazy unfolded value.

Quotation uses the head and spine to get small terms. Unification first tries unifying the head and spine before diving into the unfolded value. Scope checking does something similar.

I'd be interested to hear your thoughts on this @AndrasKovacs. I'm pretty sure it isn't quite as efficient as your version, but I've found it quite pleasant to work with.
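Not Sixty's actual definitions, just a rough Haskell sketch of the shape being described, with illustrative names:

data Head  = HVar Int | HGlobal String | HMeta Int

data Value
  = Neutral Head [Value]          -- genuinely stuck head + spine
  | Glued   Head [Value] Value    -- non-unfolded head + spine, together with
                                  -- the lazily unfolded value it stands for
  | Lam String (Value -> Value)

-- Quotation reads back the Head/[Value] part of Glued to get small terms;
-- unification first tries to unify the heads and spines, and only forces the
-- third field when that fails; scope checking does something similar.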

@jonsterling

@ollef Looks pretty cool! I just copied your idea and implemented it in my "cool-tt" prototype.


AndrasKovacs commented Jan 8, 2020

it is not the case that eval and quote are mutually recursive in nbe with universes.

To my knowledge it is the case. Both here and here, quoting is included in the semantics, i.e. a type is interpreted as a dependent presheaf together with quote and unquote maps.

Now, if we're looking at the above NbE implementations as they are, quoting is not given mutually with the evaluation morphism into the model; in a hypothetical pattern-matching definition in Agda, quote wouldn't call eval. But when we are implementing things in OCaml/Haskell, we often use explicit first-order closures instead of implicit ones living in the metalanguage, and we do explicit eval in closure instantiation, so there quote and eval really are mutually recursive.
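A minimal untyped NbE sketch just to make this concrete (illustrative code, not from any particular package): closures are first-order, quote instantiates them with a fresh variable, and that instantiation calls eval.

data Tm  = Var Int | Lam Tm | App Tm Tm      -- de Bruijn indices
data Val = VLam Closure | VNe Int [Val]      -- de Bruijn levels
data Closure = Closure [Val] Tm

eval :: [Val] -> Tm -> Val
eval env (Var i)   = env !! i
eval env (Lam b)   = VLam (Closure env b)
eval env (App f a) = case eval env f of
  VLam (Closure env' b) -> eval (eval env a : env') b
  VNe x sp              -> VNe x (sp ++ [eval env a])

quote :: Int -> Val -> Tm
quote l (VNe x sp) = foldl (\t v -> App t (quote l v)) (Var (l - x - 1)) sp
quote l (VLam (Closure env b)) =
  -- quote calls eval right here, to go under the binder:
  Lam (quote (l + 1) (eval (VNe l [] : env) b))

nf :: Tm -> Tm
nf = quote 0 . eval []

In the meta-level-function (HOAS) presentation, the same step is just applying a metalanguage function, so the recursion is hidden in the metalanguage; that's the distinction I mean.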

Without universes, e.g. as in here, quote can be given by recursion on types post-hoc, and there even an explicit closure-based implementation isn't mutually recursive.

I'm being a bit hypocritical here, because in my TT implementations I never actually (operationally) use NbE given by modified glued models in presheaves over renamings. Instead I use purely syntax-directed NbE with Abel-style "liftable terms", which I understand way less on a formal level than normalization-by-gluing.

Anyway, I'm saying that if NbE extends from MLTT with universes to cubical TT and the basics don't change too much, then I'm always able to quote terms to nfs when giving the model, and with Russell-universes as in Thierry's version, I can also quote types to normal types. So implementing a regularity rule seems feasible, assuming that it's not blocked by other issues.


jonsterling commented Jan 8, 2020

@AndrasKovacs I think there may be some misunderstanding about how quotation works in your summary of the literature. In Thierry's paper that you link to, quotation is indeed bundled into the computability data of each type, but it is not "mutually recursive" with evaluation. In fact, evaluation isn't even defined until after the entire model is constructed; evaluation is obtained from the initiality of syntax.

With all this said, there is quite a bit of asymmetry between the role of evaluation in operational accounts of nbe (e.g. Abel-style nbe, which we implement) and semantic accounts of nbe (Altenkirch-Kaposi 2016[?], Coquand 2019, etc.).

It is quite easy to see however that the algorithm for evaluation never calls quote or readback, if you "extract" it from the semantic constructions (e.g. those of Thierry's paper). That's all I meant when I said that there is no mutual recursion.

Anyway, I'm open to using a mutual recursion if it is totally necessary, but I see it as a "code smell", a real indication that something about the setup isn't yet understood properly.


AndrasKovacs commented Jan 8, 2020

evaluation is obtained from the initiality of syntax.

I was referring to this point when talking about using closures vs. meta-level functions for implementation.

It is quite easy to see however that the algorithm for evaluation never calls quote or readback, if you "extract" it from the semantic constructions (e.g. those of Thierry's paper)

Thierry's paper clearly calls quote in the interpretation of brec (Bool induction) on page 7; any extracted evaluation function for terms would need to call quote, as far as I see.

Perhaps it's better to avoid the term "mutual recursion", which is a bit fuzzy, and just talk about (dependent) models.

To phrase my point differently: in Thierry's model, if I wanted to, I could quote things to nf and decide conversion/strengthening. E.g. I could add a definitional equality to the syntax which says that the Bool -> Bool -> Bool equality testing function shortcuts to true whenever the arguments are definitionally equal, and Thierry's model would still work.

@jonsterling
I will have to take a look at Thierry's paper again when I have more time, but let me just say that it is totally unnecessary to call quote in eval of bool induction. I have implemented that dozens of times.


AndrasKovacs commented Jan 8, 2020

@jonsterling It's necessary if we want nfs to be eta-long for function types and negative types in general. In that case there is no inclusion of arbitrary neutral terms into nfs, only those which have neutral types or positive types. I agree that this is something that we totally ignore in practical implementation, because we use a separate conversion checking algorithm that can handle etas. But if the goal is to get decidable conversion with function eta out of normalization-by-gluing, then this is essential.

@jonsterling

@AndrasKovacs I do not believe that is true. It is very easy to achieve eta long normal forms for negative types without calling quote from eval; or if I am missing something, can you please supply a counterexample?

@jonsterling

@AndrasKovacs I spent my walk to the office thinking about your comments -- I still haven't had time to check Thierry's paper again, but now I understand the point about bool-ind in semantic NBE. The computability data for the motive must of course bundle a family of quotation functions (really, we should call these "reification" functions, not quotation! Quotation is a totally different thing).

However, as you know, when you defunctionalize the naive semantic algorithm, this ceases to be necessary. But based on this discussion, I think I see that these semantic versions of nbe may actually provide a path to justifying these very non-traditional algorithms which call quote from eval.

Regarding the eta-long negative types, I am still puzzled by the comment. This is always achieved by implementing quote in a type-directed manner, and eta-long-ness is not affected by anything you do in eval.


AndrasKovacs commented Jan 8, 2020

Regarding the eta-long negative types, I am still puzzled by the comment.

I can attempt an explanation why (eta-long negative types)+(positive types) requires calling quote from eval. Let's work with STLC with functions and Bool, and we have the recursor

brec : Tm Γ Bool → Tm Γ A → Tm Γ A → Tm Γ A

We have normals and neutrals as:

data Nf where
  lam   : Nf (Γ, A) B → Nf Γ (A ⇒ B)
  ne    : Nf Γ Bool → Nf Γ Bool
  true  : Nf Γ Bool
  false : Nf Γ Bool

data Ne where
  ...
  brec : Ne Γ Bool → Nf Γ A → Nf Γ A → Ne Γ A

Let's do a simple presheaf model over renamings; every type is a presheaf. We interpret Bool as follows:

⟦Bool⟧  = Nf _ Bool

So ⟦Bool⟧ Γ is either true or false or neutral. It's a bit special that we can simply take Nf here; for coproducts we would have

⟦A + B⟧ = ⟦A⟧ + ⟦B⟧ + Ne _ (A + B)   

Now for brec

⟦brec b t f⟧ : ⟦Γ⟧ ⇒ ⟦A⟧
⟦brec b t f⟧ γ = case ⟦b⟧ γ of
    true  -> ⟦t⟧ γ
    false -> ⟦f⟧ γ
    ne b  -> ?

We need to somehow turn a stuck brec into a semantic value in ⟦A⟧, but we can't really. What works is to include in the model reify : ⟦A⟧ ⇒ Nf _ A and reflect : Ne _ A ⇒ ⟦A⟧ for each A type. If we do this, we have a displayed model instead of a plain one. Now we can do

...
ne b -> reflect (brec b (reify (⟦t⟧ γ)) (reify (⟦f⟧ γ)))

Which is what Thierry did in the note. I don't have a super pithy or intuitive take on this issue unfortunately.

There is more discussion in Abel & Sattler. They have an alternative, more sophisticated implementation for STLC which avoids quoting from eval, and supports some eta laws for coproducts as well. I don't know if the same would work in dependent TT.

What if we don't want eta-long functions, but want positive types? Then any neutral term is normal, and we can define a plain presheaf model, where neutral terms can be included in any semantic type, so that every A is interpreted as (Ne _ A) + ..., and everything works. I believe this is close to what we do in our practical implementations.

I also remember that having a universe + eta-long negative types also requires quoting from eval or some other trickery. Unfortunately, I don't recall details at all; maybe there's something with the interpretation of type codes.


jonsterling commented Jan 8, 2020

@AndrasKovacs I see what you mean about needing to call reify and reflect, but please note that in Abel's defunctionalized nbe, these are just generators of the semantic domain, so there is no actual recursive call happening.

This is why, as I have been trying to convey for a while, there is no call to quote or even any eta expansion at all happening during evaluation, and it is extremely misleading to think of things that way. And yes, in Abel's algorithm, you do get eta long normal forms. The way it works is that quote does the heavy lifting: when it hits a "reify" or "reflect" node in the semantic values, it does the right thing.

I agree with your general point about semantic nbe, but the difference here is very important: calling an actual quotation function during evaluation is kind of nutty because it results in walking the entire term, and it has been a fact for many years that we can totally avoid such calls in the implementation of standard MLTT (yes, with eta-long normal forms at negative type, and weak positive types); this is possible through Abel's defunctionalization of nbe.

On the other hand, it is not clear that more ambitious use of actual quotation during evaluation can be accounted for in the same sensible way. Let me list below some applications of quote-during-eval which do not appear to fit in Abel's defunctionalized paradigm, and currently seem to require an actual recursive call in the algorithm:

  1. non-linear pattern matching (as I think you mentioned at one point)
  2. regularity in cubical type theory
  3. boundary separation in XTT
  4. the puzzling extent operation of Cavallo and Harper's parametric type theory

This is why I am making a distinction between genuine recursive calls, and those which may be easily defunctionalized. I think I agree with you now that there may not be a problem mathematically justifying a mutual recursion, but it remains that this is a kind of bizarre and expensive thing to do, indicating that we do not yet fully understand the operational aspects of the type theories which seem to require (1-4) above.

@AndrasKovacs

Can you give a reference for "Abel's defunctionalized nbe"? It's not clear to me what that means.

@jonsterling

I mean the style of nbe where you use closures instead of "higher order", described in his habilitation thesis.


jonsterling commented Jan 8, 2020

The main contribution was to kill the 'higher order' aspect of nbe algorithms by adding inert constructors to the semantic domain for reification and reflection, which are then interpreted appropriately by the quote operation, sometimes called readback.

Old-school algorithms for nbe on an untyped domain required a semantic domain which is not a set; domain theory was used in the 90s to solve the recursive domain equation. In 2020, the situation is more clear:

  1. If you want an untyped semantic domain (which you usually have in an implementation), then you may avoid domain theoretic justifications by defunctionalizing a la Abel. Then, you just have a totally syntactic / inductive representation of the semantic domain, using closures and inert reify/reflect nodes. Here, the objects are easily shown to exist in ordinary set theory, and likewise, the correctness of the algorithm is shown (laboriously) in ordinary set theory as well.

  2. On the semantic side, one may simply abandon the untyped domain and use a typed domain, recovering the higher-order aspect of 90s-style nbe. This typed approach is represented by the following authors: Altenkirch+Hofmann+Streicher 1998(?); Fiore 2002; Altenkirch + Kaposi 2016 (?); Coquand 2019.

I like the typed approach best, but I don't yet have a programming language which can represent it ;-) I need QIITs.
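To make point 1 concrete, here is a small self-contained Haskell sketch (illustrative code, not Abel's; STLC with Bool rather than full MLTT): reflection is just the inert VNe/Ne constructors, eval never reads anything back, and the type-directed quote does all the eta-expansion, so readback is eta-long at function type. Neutral spines are annotated with the types needed for later readback, which is how the stored brec branch values end up being quoted type-directedly.

data Ty = TBool | Ty :=> Ty deriving Show

data Tm                                 -- de Bruijn indices
  = Var Int | Lam Ty Tm | App Tm Tm
  | TT | FF
  | BRec Ty Tm Tm Tm                    -- motive, scrutinee, true/false branches
  deriving Show

data Val = VLam Ty Closure | VTT | VFF | VNe Ty Ne
data Ne  = NVar Int | NApp Ne Val Ty | NBRec Ty Ne Val Val   -- de Bruijn levels
data Closure = Closure [Val] Tm

-- Evaluation: a stuck brec just becomes an inert NBRec node; no quoting.
eval :: [Val] -> Tm -> Val
eval env (Var i)        = env !! i
eval env (Lam a b)      = VLam a (Closure env b)
eval env (App f u)      = vApp (eval env f) (eval env u)
eval _   TT             = VTT
eval _   FF             = VFF
eval env (BRec a b t f) = case eval env b of
  VTT     -> eval env t
  VFF     -> eval env f
  VNe _ n -> VNe a (NBRec a n (eval env t) (eval env f))
  _       -> error "ill-typed scrutinee"

vApp :: Val -> Val -> Val
vApp (VLam _ (Closure env b)) u = eval (u : env) b
vApp (VNe (a :=> b) n)        u = VNe b (NApp n u a)
vApp _                        _ = error "ill-typed application"

-- Type-directed readback: all eta-expansion happens here, so normal forms
-- are eta-long at function type and neutrals are only embedded at Bool.
quote :: Int -> Ty -> Val -> Tm
quote l (a :=> b) v = Lam a (quote (l + 1) b (vApp v (VNe a (NVar l))))
quote l TBool     v = case v of
  VTT     -> TT
  VFF     -> FF
  VNe _ n -> quoteNe l n
  _       -> error "ill-typed value"

quoteNe :: Int -> Ne -> Tm
quoteNe l (NVar x)        = Var (l - x - 1)
quoteNe l (NApp n u a)    = App (quoteNe l n) (quote l a u)
quoteNe l (NBRec a n t f) = BRec a (quoteNe l n) (quote l a t) (quote l a f)

-- λ (b : Bool). brec b false true  normalizes to itself (it is already eta-long):
main :: IO ()
main = print (quote 0 (TBool :=> TBool)
                (eval [] (Lam TBool (BRec TBool (Var 0) FF TT))))

The only place eval touches anything readback-related is building the inert NBRec node; the branch values are stored as values and only read back later by quoteNe, which is the "no actual recursive call" point above.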


AndrasKovacs commented Jan 8, 2020

Is the "reifly/reflect inert node" essentially the same as this?

I haven't made the effort to look into algorithms which quote to eta-long forms efficiently, because I find them practically inferior to implementing conversion checking directly in the semantic domain. These algorithms are between between the Most Practical and the Formally Nicest; they are inefficient and still unwieldy formally.

In my mind, the Practical and Formal approaches are, perhaps unfortunately, rather separate. Hence, when justifying non-linear patterns or regularity, I'm approaching this purely from a semantic nbe standpoint, and I'm content to have any inefficient solution.

I admit this is not a good situation, but I have no idea how to make formally nice nbe fast, or fast nbe nicely formal.

@jonsterling

Yes, that's the idea; Danny, Lars and I also spent some time trying to explain Abel's work in our recent paper: https://dl.acm.org/doi/10.1145/3341711

BTW, I strongly agree with you that in practice there is no need to quote to eta-long normal forms because you should definitely implement conversion directly in the semantic domain.


molikto commented Jan 10, 2020

@jonsterling @AndrasKovacs will solving the non-local substitution problem make cubical type theory efficient enough to compute Brunerie's number, though? I am not sure... I think the computation rules for transpGlue produce really big intermediate terms when computing something like Brunerie's number.

@jonsterling

@jonsterling @AndrasKovacs will solving the non-local substitution problem make cubical type theory efficient enough to compute Brunerie's number, though? I am not sure... I think the computation rules for transpGlue produce really big intermediate terms when computing something like Brunerie's number.

I don’t know for sure, but I think it is a prerequisite for making more progress on the problem :) I have some new ideas this week on avoiding naive dimension substitutions which I will try to experiment on soon.
