EBNF Lab: a web-based editor for EBNF grammars

As part of my TA work for the "Introduction to Programming" course at ETH Zurich, I created EBNF Lab, a tool for checking whether words conform to a set of EBNF grammar rules and generating valid words based on that grammar.

Posted on Thu Sep 07 2023
7 min read

This semester, I'll be teaching the course "Einführung in die Programmierung" (Introduction to Programming) at ETH Zürich. One idea that has been on my mind for quite some time is to create an EBNF verifier, along with an editor that provides basic EBNF language support. Given that I've already been working with it a few years ago, I decided to use Monaco editor, Microsoft's web-based editor, which also powers Visual Studio Code.

You may find the editor at gassmann.dev/ebnf.

Note: Much of the grammar simplification and the CYK algorithm have been adapted from github.com/JM4ier/parsley.

Syntax

The EBNF parser mostly follows the EBNF syntax as discussed in the "Einführung in die Programmierung" course taught at ETH Zürich. Some things to be aware of include:

  • The input is evaluated against the last rule
  • Each EBNF rule occupies exactly one line
  • Comments can be created using #
  • Empty lines are skipped
  • Sequences bind stronger than selections, i.e. <r> <= [+|-] 0|2|4|6|8 is interpreted as <r> <= ([+|-] 0)|2|4|6|8 and not as <r> <= [+|-] (0|2|4|6|8). This also coincides with the rules of the lecture, see slide 10 of this document. Always set parentheses to avoid confusion.
  • You won't be able to use ε to match an empty sequence, use "" instead. The following rule matches the empty string:
    <empty> <= ""
    

The tool consists of an EBNF verifier and an EBNF producer. The verifier checks if a given input conforms to the defined EBNF rules, while the producer generates words based on the rules, incrementally increasing their length.

Verify

The verifier provides a simple editor where you can input text to validate against the EBNF rules. Each line in the lower half is evaluated against the EBNF ruleset separately. If a line matches the EBNF ruleset, the line is highlighted green, if a line does not match the EBNF ruleset, the line is highlighted red.

You can try the EBNF verifier down below with the EBNF rules from the HS22 exam:

EBNF Lab

Loading...
Loading...

Please make sure you have some valid rules first.

Please make sure you have some valid rules first.

TODO: You should be able to enter a word and see a tree detailing why this word is matched here. I did not have time to implement this yet :(

Note that this EBNF verifier is whitespace-sensitive. For instance, the EBNF ruleset above matches expressions like *(a + b) but not *(a+b). To faciliate things, you might consider adding a rule akin to <whitespace> <= {" "}.

With an additional whitespace rule as suggested above, we'd have a ruleset similar to the one below (which now also matches *(a+b)):

EBNF Lab

Loading...
Loading...

Please make sure you have some valid rules first.

Please make sure you have some valid rules first.

TODO: You should be able to enter a word and see a tree detailing why this word is matched here. I did not have time to implement this yet :(

Produce

The EBNF producer generates words that follow the defined rules. Words are sorted by length and then lexicographically. Words are produced on the UI thread, so performance may vary with complex rules. In the future, I may move this to web workers for better performance.

You can experiment with the producer here:

EBNF Lab

Loading...
Loading...

Please make sure you have some valid rules first.

Please make sure you have some valid rules first.

TODO: You should be able to enter a word and see a tree detailing why this word is matched here. I did not have time to implement this yet :(

Keep in mind that depending on your EBNF ruleset, the number of possible words may grow exponentially as a function of word length.

Compare

In the compare tab, you can compare two EBNF grammars for equality. After having two valid EBNF grammars on both the left and the right side, you can check for counterexamples with increasing word length.

EBNF Lab

Loading...
Loading...
Loading...

Please make sure you have some valid rules first.

Please make sure you have some valid rules first.

TODO: You should be able to enter a word and see a tree detailing why this word is matched here. I did not have time to implement this yet :(

Again, the number of produced words can grow exponentially.

EBNF language support

Language support extends beyond syntax errors and also encompasses some semantic errors, such as referencing a non-existing rule, as illustrated in the example below:

EBNF Lab

Loading...
Loading...

Please make sure you have some valid rules first.

Please make sure you have some valid rules first.

TODO: You should be able to enter a word and see a tree detailing why this word is matched here. I did not have time to implement this yet :(

All errors have squiggly lines with a description upon hovering over the offending token.

The EBNF verifier/producer view below the rule editor automatically update once the ruleset changes, i.e. the verifier reevaluates whether the input matches the EBNF rule and the producer resets itself.

The producer also detects whether a given ruleset can only produce the empty word. There is also a check in place to detect whether no more words can be produced from the given EBNF ruleset.

EBNF Lab

Loading...
Loading...

Please make sure you have some valid rules first.

Please make sure you have some valid rules first.

TODO: You should be able to enter a word and see a tree detailing why this word is matched here. I did not have time to implement this yet :(

Next steps

You can also find the EBNF parser here outside the blog-post-context (or here for fullpage). There are still some things the editor could be extended with. This includes:

  • Debounce input for rules, especially for larger rules
  • Don't load monaco from an external CDN and use the webpack plugin
  • Produce words in a web worker instead of doing it on the UI thread
    • Maybe don't produce words incrementally by word length but rather by some other useful incremental unit?
    • Also show produced words with virtual scroll
  • Autocomplete using monaco's auto-complete API
  • Add an escape character to e.g. allow matching quotation marks (")
  • serve as a PWA

I do not guarantee the correctness of any results. In case, you spot any errors, feel free to send me an email.

Updates