The Synthesizer Generator:
Specifying an Editing Environment


Overview

The Synthesizer Generator creates editing environments based on specifications written in the Synthesizer Specification Language (SSL).

SSL is illustrated by a running example in which a simple desk calculator is specified. Picture, if you will, a full-screen desk calculator that allows creation and modification of an integer expression, during which time the expression's current value is computed and displayed.

    Desk Calculator 1
    The expression is editable on the first line; its value is shown on the second line.

    Desk Calculator 2
    After editing ``10'' into ``20'', the value is immediately updated.

    Desk Calculator 3
    After editing ``5'' into ``((2*3)-6'', an error is immediately detected.

Additional functionality of the desk calculator will be explained later.

An editing environment specification consists of the following:

With the exception of the Scheme scripts, editor specifications are declarative.

Abstract Syntax

The core of the specification is the definition of the language's abstract syntax, given as a set of grammar rules:


        exp: NullExp() | Sum(exp exp) | Diff(exp exp) | Prod(exp exp) | Quot(exp exp) | Const(INT) ;

The example declares six alternatives for exp. Each alternative is identified by an operator, e.g., NullExp. As you will see shortly, operators play a key role in the SSL notation.

Operators have 0 or more arguments listed (without commas) within parentheses. The arguments of an operator indicate substructure, e.g., each Sum has two exp constituents. Operator Const has one constituent, an INT, which is an integer constant. The first declared operator of a nonterminal (e.g., NullExp) is termed the completing operator, and is used by default to represent unexpanded occurrences of that nonterminal.

Each buffer in a running editor contains an abstract syntax tree with respect to the specified abstract syntax.

Attribute Declarations

Abstract syntax trees in editor buffers are decorated with attributes, i.e., values that describe the given tree. Each nonterminal is associated with 0 or more attributes. Each attribute has a declared type and name:

        exp { synthesized INT v; };
The sample declaration states that each instance of exp in an attributed abstract syntax tree is to be decorated with an INT attribute named v. The notation synthesized is explained below.

It is our intention that the value of attribute v be the integer value of the subexpression it decorates. Every instance of the same nonterminal has the same attributes, albeit possibly with different values.

Attribute Equations

Attributes are defined equationally in terms of other attributes and constants. This is much like a spreadsheet, in which (some) cells are defined equationally in terms of other cells and constants.

Here are the equations associated with the fragment of abstract syntax given above:

        exp: NullExp{ exp.v = 0; } | Sum{ exp$1.v = exp$2.v + exp$3.v; } | Diff{ exp$1.v = exp$2.v - exp$3.v; } | Prod{ exp$1.v = exp$2.v * exp$3.v; } | Quot{ exp$1.v = (exp$3.v==0) ? exp$2.v : exp$2.v / exp$3.v; local STR error; error = (exp$3.v==0) ? "<-division by 0->" : ""; } | Const{ exp.v = INT; } ;

The equation associated with the operator NullExp defines the value of an unexpanded expression to be 0, an arbitrary choice.

The equations associated with operators Sum, Diff, and Prod define the attribute v much as you would expect. Because a given nonterminal may occur more than once in an alternaitve, it is sometimes necessary to qualify which occurrence of the nonterminal is intended. The equation for Sum may be paraphrased as ``the value of an exp is computed by adding the values of its two arguments.'' Note that this has nothing to do with the name Sum, and everything to do with the semantics of the builtin arithmetic operation +.

The definition of attribute v for operator Quot illustrates SSL's use of C-style conditional expressions. We define the quotient to be 0 when the divisor is 0, an arbitrary choice. The expression on the right side of each attribute equation must be defined in every possible situation. If we had not used a conditional epxression, the editor would crash when given a 0 divisor.

The equations for attribute v, in effect, pass information up the abstract syntax tree. Such attributes are said to be synthesized attributes. Every synthesized attribute of a nonterminal on the left side of an abstract syntax rule must have an equation in every alternative.

The equations for Quot illustrates the use of local attributes. Local attributes allow one to define a computation in one operator without imposing the requirement that definitions for that attribute be provided in every production, as would be the case if the error attribute were a synthesized. Local attribute error is either the null string or an appropriate error message.

Prettyprinting Rules

Each editor buffer contains an attributed abstract syntax tree; each window contains the textual presentation of a buffer.

The textual presentation of an abstract syntax tree is defined by prettyprinting rules, called unparsing rules -- one or more schemes for each production. Each unparsing scheme consists of strings, names of attributes, and names of right-hand side nonterminals. (For pedagogical purposes, we are temporarily making a slight technical simplification below.)

        exp: NullExp[ "<exp>" ] | Sum[ "(" exp$2 " + " exp$3 ")" ] | Diff[ "(" exp$2 " - " exp$3 ")" ] | Prod[ "(" exp$2 " * " exp$3 ")" ] | Quot[ "(" exp$2 " / " error exp$3 ")" ] | Const[ INT ] ;

The display is generated by a left-to-right traversal of the abstract syntax tree that interprets these unparsing schemes. For example, the rule for operator Quot can be paraphrased as:

    A Quot is presented as a left parenthesis,
    followed by the display of its left operand,
    followed by a slash (surrounded by blanks),
    followed by the value of attribute error,
    followed by the display of its right operand,
    followed by a right parenthesis.

In the example, we wish to display the value of the entire expression on the second line. To do so, we extend the abstract syntax with a root nonterminal:

        root calc; calc: Top( exp );
and give an unparsing scheme for Top that incorporates the expression's value in the presentation:
        calc: Top[ exp "%nVALUE = " exp.v ];
The ability to incorporate attribute values in the presentation (or just use them just to control the presentation) lets editor designers easily implement powerful context-sensitive display effects.

For completeness, the unparsing rules (simplified above for pedagogical purposes) are given here exactly as they would appear in SSL. However, we omit any further explanation of the notation.

        calc: Top[ @ "%nVALUE = " exp.v ] ; exp: NullExp[ "<exp>" ] | Sum[ @ ::= "(" @ " + " @ ")" ] | Diff[ @ ::= "(" @ " - " @ ")" ] | Prod[ @ ::= "(" @ " * " @ ")" ] | Quot[ @ ::= "(" @ " / " error @ ")" ] | Const[ @ ::= ^ ] ;

Styles

Regions of the presentation can appear in different styles, i.e., different fonts, font sizes, slants, weight, colors, etc. In editor specifications, styles are just an uninterpreted names. Styles are late bound to different style properties, permitting local customization by end users.

The prettyprinting rules given above do not specify styles. The replacement prettyprinting specification below uses two styles, Placeholder and Error:

        style Placeholder, Error; exp: NullExp[ @ ::= "&lt;%S(Placeholder:exp%S)&gt;&quot; ]<br> | Sum[ @ ::= &quot;(&quot; @ &quot; + &quot; @ &quot;)&quot; ]<br> | Diff[ @ ::= &quot;(&quot; @ &quot; - &quot; @ &quot;)&quot; ]<br> | Prod[ @ ::= &quot;(&quot; @ &quot; * &quot; @ &quot;)&quot; ]<br> | Quot [ @ ::= &quot;(&quot;@&quot;%S(Error:&quot;error&quot;%S)&quot;@&quot;)&quot;]<br> | Const[ @ ::= ^ ];
The unparsing string for NullExp means:
    Display a left angle bracket,
    then switch to the Placeholder style,
    then display the text exp,
    then revert to the previous style,
    then display a right angle bracket.

The following fragment from an X Resource Database specification binds the styles to presentation properties:

      Placeholder: +italic, &lt;green, *, *, green&gt;<br> Error: +bold, &lt;red, *, *, red&gt;

The effect is illustrated in the following screen shot:

    Desk Calculator 4
    An example illustrating styles.

Sparse Views

An editor buffer can be displayed in multiple windows. Different windows may display the same buffer in different views. A view is just a different collection of prettyprinting rules for the same abstract syntax.

Sparse views display only a projection of the editor buffer, where membership of a node in the projection is determined by evaluation of a Boolean expression. The buffer still only has one structural selection, so it is possible to navigate in a sparse view and have the full view scroll in sync.

The following specification declares sparse view MESSAGES and defines the view to have one line per ``division by 0'' message:

        sparse view MESSAGES; exp: Quot[MESSAGES error "%n"] { in MESSAGES on error != ""; };

For the desk calculator, this is clearly overkill. However, when a substantial file appears in one window, it is very nice to have just the error messages in another window.

Subtree Replacement

Each editor buffer contains an attributed abstract syntax tree whose textual presentation is displayed in a window. Both structural and textual editing operations are possible, and will be illustrated shortly. In either case, however, every editing operation results in the replacement of some (possibly interior) subtree U of the object with some other subtree U'.

As a result of the replacement, attributes local to the replacement point(s) in the tree are made inconsistent.

    Tree 1

Optimal Updating

The job of the Synthesizer Generator's incremental attribute evaluator is to reestablish correct attribute values throughout the object, and to do so efficiently. It does this by a change-propagation process: starting at the replacement point(s) and following dependences, attributes are reevaluated until no further changes occur. The Synthesizer Generator's optimal incremental attribute evaluator does work proportional to AFFECTED, the set of attributes that actually change value.

    Tree 1

Transformation Rules

Two sorts of structural edit operations are supported: cut-and-paste and transforms.

Structural cut-and-paste between the selection (some subtree of an object) and a cut buffer are built in and have a clear interpretation as subtree replacement. Cut-and-paste operations are type checked upon invocation to prevent the introduction of syntactic errors.

A transform at a given nonterminal (e.g., exp) consists of a label (e.g., "+"), a pattern (e.g., NullExp()), and a replacement expression (e.g., Sum(NullExp(), NullExp())). For example:

        transform exp on "+" NullExp(): Sum(NullExp(), NullExp()), on "-" NullExp(): Diff(NullExp(), NullExp()) ;

The transform labeled "+" can be paraphrased as: replace a placeholder for exp by a template for a Sum expression containing placeholders for the two operands. such a transofrm is termed a template.

Transforms with matching patterns are listed in the Transform menu, and in the context pane of each window (if present). For example:

    Desk Calculator 5
    Template insertion: before.

Because transformations are type checked at editor construction time, they cannot introduce syntactic errors.

    Desk Calculator 6
    Template insertion: after.

The next example illustrates more general enabling conditions for transforms:

        transform exp on "factor-left" Sum(Prod(a,b), Prod(a,c)): Prod(a,Sum(b,c)), on "absorb" Prod(a,b) when (a.v==1): b ;

Transform factor-left is enabled whenever the selection is a sum of products, and the left operands of the two products are equal.

As mentioned before, the Synthesizer Generator guarantees that transformations preserve syntactic correctness (e.g., factor-left maps exp's to exp's). However, the Synthesizer Generator cannot guarantee that a transform preserves semantic correctness (e.g., factor-left is justified by the Distributive Law of arithmetic, which the Synthesizer Generator knows nothing about).

    Desk Calculator 7
    Source-to-source transformation: before.

    Desk Calculator 8
    Source-to-source transformation: after.

An arbitrary Boolean expression can also be used to test whether a transformation is to be enabled; such Boolean expressions have access to both the syntax (the subtree) and the semantics (the attributes) of the selection. Transform absorb can be paraphrased as: a product whose left operand has value 1 can be replaced by its right operand.

Concrete Parsing Syntax

The Synthesizer Generator provides a very general mechanism for defining textual user interfaces. Using this mechanism, one can define command-driven and parser-driven interfaces in whatever balance is desired.

Lexical definitions (a la lex or flex) and parsing rules (a la yacc or bison) are defined in a slightly sugared form that matches the other declarations of SSL:

        INTEGER: < [0-9]+ > ; Calc::= ( Exp ) ; Exp::= ( Exp '+' Exp ) | ( Exp '-' Exp ) | ( Exp '*' Exp ) | ( Exp '/' Exp ) | ( INTEGER ) ;
Although the upper case and lower case nonterminals (e.g., exp and Exp) bear an obvious relation to one another, as far as SSL is concerned, they are unrelated.

The translation from concrete to abstract syntax is performed by attribution. For each nonterminal in the concrete syntax, we declare an attribute (say abs) to represent the corresponding abstract syntax tree:

        Calc { synthesized calc abs; }; Exp { synthesized exp abs; };

Note that the nonterminals of the abstract syntax are serving as the types of the attributes.

A correspondence is established between the abstract and concrete nonterminals:

        calc ~ Calc.abs; exp ~ Exp.abs;
The tilde-rule specifies the correspondence between abstract and concrete nonterminals. For example, the first rule shown says that when the cursor is positioned at nonterminal calc in the abstract syntax tree, textual input is to be parsed as a Calc, and attribute abs from the root of that tree is to replace the current selection.

The full concrete syntax with attribution rules (replacing the concrete syntax given above) is:

        Calc::= ( Exp ) { Calc.abs = Top(Exp.abs); } ; Exp::= ( Exp '+' Exp ) { Exp$1.abs = Sum(Exp$2.abs, Exp$3.abs) ; } | ( Exp '-' Exp ) { Exp$1.abs = Diff((Exp$2.abs, Exp$3.abs); } | ( Exp '*' Exp ) { Exp$1.abs = Prod((Exp$2.abs, Exp$3.abs); } | ( Exp '/' Exp ) { Exp$1.abs = Quot((Exp$2.abs, Exp$3.abs); } | ( INTEGER ) { Exp$1.abs = Const(STRtoINT(INTEGER)); } ;

Note that the operators (e.g. Sum and Diff) are used as constructors to build values.

Although not shown here, such translations can be defined that are sensitive to the context of the editing cursor in that we allow the translation equations to refer to the attributes of the node at which the cursor is positioned.

Modularity in Specifications.

Modularity of editor specifications is supported by notation that permits the specification of separate aspects of an editor to be placed in separate portions of a specification. Such modularity enhances the comprehensibility for attribute-grammar specifications, of course, but at the same time , it facilitates the generation of a collection of related editors that offer different degrees of static-semantic analysis for the same language.

Three concepts permit specifications to be factored in this way:

  • adding attributes to an existing nonterminal,
  • adding equations to an existing production,
  • adding productions and their semantics to an existing nonterminal.

Example: Let-clauses

For example, suppose we wanted to extend the desk calculator with let-clauses so you could bind sub-calculations to names.

The appropriate abstract-syntax definitions and evaluations equations may be factored into a module separate from the rules for the basic 4 functions.

An upwardly compatible editor could then be generated by using the new module in conjunction with the original one.

The slide shows:

  • adding a new inherited attribute env for nonterminal exp,
  • adding the new syntax for let-clauses Let and use of names in expressions Use.
  • adding attribute definitions for Let and Use in such a way that the scoping of names is "block-structured".

The slide does not show:

  • new syntax for names,
  • definition of a new data type ENV with functions Insert and Lockup,
  • equations for attribute env in other operators.
        let <name> = <exp> in <exp> exp { inherited ENV env; }; exp: Let (name exp exp ) { exp$2.env = exp$1.env; exp$3.env = Insert(name, exp$2.v, exp$1.env); exp$1.v = exp$3.v; } | Use (name) { exp.v = Lookup(name, exp.env); } ;

User-defined Data Types

We can't hope to have built in all types that may be useful in attribute calculations, so we allow one to define new data types out of primitive ones.

To define new datatypes we use the same grammatical mechanism already illustrated for defining abstract syntax--a grammar is just a type definition mechanism.

The slide shows the correspondence between the terminology of types and grammars.

    Type Definitions Grammars
    type nonterminal
    primitive types atoms
    record definitions operators
    variant records alternatives

We saw in the let-clause example the need for user-defined data types(e.g. Env).

This slide gives the definition of ENV.

Paraphraphased: An Env is a list of 0 or more BINDING's, where a BINDING is an ID/INT pair, where an ID is string that begins with a letter and is followed by 0 or more letters or digits.

Note that the rules used to define ENV are exactly the same sort of grammar rules used to define the abstract syntax of exp's.

        ENV: NullEnv () | EnvConcat ( BINDING ENV ) ; BINDING: Binding( ID INT) ; ID: < [a-zA-Z]{a-zA-Z0-9]* > ;

This slide illustrates use of the user-defined type ENV.

Operator names are used as constructors and as selectors in pattern matching case expressions, called with-expressions.

Recursive functions from n-tuples of terms to terms are supported.

Recursive functions can be cached (i.e. memoized).

    • Constructors, e.g. EnvConcat(Binding("a",3),NullEnv())
    • Destructors, e.g. with ( env )( NullEnv(): Binding("",0), EnvConcat(b,e): b)
    • Recursive Functions, e.g. ENV Insert(ID name, INT v, ENV env) { <some env-valued expression>};

Summary

  • Editor features (abstract syntax, context-sensitive computations, prettyprinting rules, concrete parsing syntax, transformations) are expressed in a high level language.
  • SSL's two concepts (term and attributes) have a wide range of application.
  • Static inferences are specified declaratively.
  • Type checking and other consistency checks eliminate errors during editing.
  • SSL's modularity permits
    • components to be used in many editors
    • generating families of related editors
    • factoring specifications for comprehensibility
  • A common user interface is shared by all generated editors.

Free Trial | Products | Customers | Support | News | Jobs | About Us         © 2007-2008, GrammaTech, Inc. All rights reserved.