Embedding by Normalisation

My understanding of "object language" itself is merely (?) visual, so in both shallow as well as deep embedding the object language, as the thing that visually appears as words in user programs is defined by an "interface formed by a set of functions (or values) in the host". So whether that is the shallow

eps :: [Char]
chr :: Char -> [Char]
con :: [Char] -> [Char] -> [Char]

or the deep

eps :: CharsD
chr :: CharD -> CharsD
con :: CharsD -> CharsD -> CharsD

the EDSL programs (e.g. (con eps (con chr 'a' eps))) will look almost identical in both cases, just that the latter requires an an additional evaluation step to reach a value of [Char].

So saying in 3.3.3 that

syntactic domain == object language (by the correspondence) == datatype

was a little counter-intuitive to me.

In Gibbon's 2014 paper we can sort of see this set of language words given as function names and the act of "encoding" refers to what these functions are mapped to:

As a deeply embedded DSL, the two operations lit and add are encoded directly as constructors of an algebraic datatype:

[...]
lit n = Lit n
add x y = Add x y

I understand that the EBN correspondence in you paper does not use the term "encoding" in the same way, which may confuse readers.

What do you think of stating the correspondence like this instead:

             NBE <-> EBN

Syntactic Domain <-> Encoding of Object Language
Semantic Domain  <-> Host Language (a subset of)
Evaluation       <-> Evaluation
Reification      <-> Code Extraction

Evaluation in NBE simply corresponds with Evaluation in EBN. I believe the rest of Section 3.3 may remain unchanged in that regard:

  • In shallow embedding evaluation is "the overall evaluation of the interface ..." (as 3.3.1 says)
  • In final tagless embedding, evaluation corresponds with the instantiation of a type-class for a term of the object language (as 3.3.2 essentially says) and potential postprocessing
  • In deep embedding evaluation is the evaluation/processing of the constructed data type (as 3.3.3 says)
  • Analogue for quoted

Keep in mind that in final tagless embedding, unlike shallow embedding or deep embedding, the evaluation is not prescribed by the object-language encoding (syntactic domain). So I am hesitant with its description as "a specific form of shallow embedding". It is too flexible for that. A type-class instance may directly evaluate an EDSL program to a [Char] value, or it may yield a CharsD value, which in turn would need further evaluation... so in essence tagless final is (or rather does not have to be) neither shallow nor deep. It depends on the concrete interpretation (type-class-instance) of a given term.

So for the (changed) Syntactic Domain correspondence we have:

  • In shallow embedding the encoding of the object language is a (fixed) interface that maps to the Semantic Domain
  • In final tagless embedding the encoding of the object language is a flexible interface formed by a type class in Haskell (or signature in ML, trait in Scala, etc.) whose mapping depends on concrete instances; performing a direct evaluation as in shallow embedding is one option of many
  • In deep embedding the encoding of the object language is a (fixed) interface that maps to a datatype (representing the syntax)
  • Analogue for quoted

P.S. My own work on EDSLs is much less formally motivated or of (theoretical) interest. Thus, I have mostly relied on a more fuzzy, less formal categorization of approaches (to my dissatisfaction). I am very happy that someone (you) is trying to tackle this formally.

/r/haskell Thread Parent Link - arxiv.org