Does anyone have a favorite strategy for mismatched bracket error recovery in recursive descent parsers?

@zwarich My favorite strategy is technically *before* the parser because I find it really hard to do well there:

Detect during lexing, and then post-process the token stream, ideally using indent to guide fixes.

Some of this is implemented, but not the indentation bit:
github.com/carbon-language/car

There is a TODO, and we have the indent data; we just need someone to write the code to peek at the indent and select good fixes until we run out, and then run the greedy algorithm to fix anything left.

GitHub: carbon-lang/toolchain/lex/lex.cpp at trunk · carbon-language/carbon-lang
Carbon Language's main repository: documents, design, implementation, and related tools. (NOTE: Carbon Language is experimental; see README)
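
For illustration, the greedy fallback mentioned above could look roughly like the following Python sketch. This is not Carbon's actual code (the toolchain is C++); the `balance` function, its diagnostic strings, and the choice to drop stray closers are all assumptions made for the example, and the indentation-guided step is left out entirely.

```python
# Hypothetical sketch: greedy bracket recovery on a token stream, run after
# lexing and before parsing. Tokens are plain strings for simplicity.
PAIRS = {"(": ")", "[": "]", "{": "}"}
CLOSERS = set(PAIRS.values())


def balance(tokens):
    """Return (balanced_tokens, diagnostics) for a list of token strings."""
    out = []
    diagnostics = []
    stack = []  # closers expected for the brackets that are currently open

    for i, tok in enumerate(tokens):
        if tok in PAIRS:
            stack.append(PAIRS[tok])
            out.append(tok)
        elif tok in CLOSERS:
            if tok in stack:
                # Some enclosing bracket expects this closer. Synthesize the
                # closers for any inner brackets that were left open.
                while stack[-1] != tok:
                    missing = stack.pop()
                    out.append(missing)
                    diagnostics.append(f"inserted '{missing}' before token {i}")
                stack.pop()
                out.append(tok)
            else:
                # Nothing open can match this closer, so drop it.
                diagnostics.append(f"dropped stray '{tok}' at token {i}")
        else:
            out.append(tok)

    # Close whatever is still open at the end of the input.
    while stack:
        missing = stack.pop()
        out.append(missing)
        diagnostics.append("inserted '" + missing + "' at end of input")

    return out, diagnostics
```

With this sketch, `balance(["f", "(", "a", "[", "b", ")", ";"])` inserts the missing `]` just before the `)`, so the parser only ever sees a balanced stream.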

@chandlerc I came up with something similar, but was curious whether in practice it's really enough to do this without the grammatical context that you would have from a parser. Curious to see how it works out for Carbon in practice.

@zwarich yeah, definitely interesting to see how it pans out.

Generally, my feeling is that recovering balanced delimiters somewhat in isolation is likely to result in less surprising recovery for users than using more contextual cues.

But the big win IMO is the simplicity (and speed) of the parser due to not needing to try to do any of this stuff.
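
A rough Python sketch of why the parser gets simpler: once the token stream is guaranteed to be balanced, each bracket's partner can be looked up, and a recursive descent routine can recover from an error by jumping straight to the matching closer. The names `match_brackets` and `parse_group`, and the toy "comma-separated names" grammar, are hypothetical and only serve the example.

```python
# Hypothetical sketch: a parser routine that relies on pre-balanced brackets.
PAIRS = {"(": ")", "[": "]", "{": "}"}
CLOSERS = set(PAIRS.values())


def match_brackets(tokens):
    """Map each bracket's index to its partner's index (input must be balanced)."""
    partner = {}
    stack = []
    for i, tok in enumerate(tokens):
        if tok in PAIRS:
            stack.append(i)
        elif tok in CLOSERS:
            j = stack.pop()
            partner[i] = j
            partner[j] = i
    return partner


def parse_group(tokens, open_index, partner, errors):
    """Parse `( name, name, ... )`; return (names, index after the closer)."""
    close_index = partner[open_index]
    names = []
    i = open_index + 1
    while i < close_index:
        if not tokens[i].isidentifier():
            errors.append(f"expected name at token {i}")
            break  # recovery: give up on this group and resume after its closer
        names.append(tokens[i])
        i += 1
        if i < close_index:
            if tokens[i] != ",":
                errors.append(f"expected ',' at token {i}")
                break
            i += 1
    # No scanning or bracket counting is needed to find the resume point.
    return names, close_index + 1
```

The recovery path is a single index lookup, which is the kind of simplicity the post describes.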

@chandlerc Yeah, I don't really think there's a great way to do this otherwise in a recursive descent parser. All of the other comparable solutions I know of rely on modifying either a shift-reduce parser or a general CF parser.

Another interesting question is whether you disable all other errors within a recovery pair.

@zwarich Yeah, that's a question I really wonder about, but we don't have any real experience playing with options here.

If we start using indentation, I would expect us to be able to make a decent guess at "absurd" recoveries and disable errors within those. But it's definitely an interesting area to explore what actually works best for users...
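
One way the idea of disabling errors inside a suspicious recovery pair might be sketched in Python. Everything here is an assumption for illustration: the `Diagnostic` and `RecoveredPair` types, and especially the `looks_absurd` heuristic, which simply treats a recovered pair as suspect when its closer sits on a line indented less than the opener's line.

```python
# Hypothetical sketch: suppress diagnostics inside suspect recovery pairs.
from dataclasses import dataclass


@dataclass
class Diagnostic:
    token_index: int
    message: str


@dataclass
class RecoveredPair:
    open_index: int    # token index of the opening bracket
    close_index: int   # token index of the recovered closing bracket
    open_indent: int   # indentation of the line holding the opener
    close_indent: int  # indentation of the line holding the closer


def looks_absurd(pair):
    """Assumed heuristic: the closer landed on a shallower line than the opener."""
    return pair.close_indent < pair.open_indent


def filter_diagnostics(diagnostics, pairs):
    """Drop diagnostics that fall strictly inside a suspect recovered pair."""
    suspect = [(p.open_index, p.close_index) for p in pairs if looks_absurd(p)]
    return [
        d for d in diagnostics
        if not any(lo < d.token_index < hi for lo, hi in suspect)
    ]
```

Whether suppressing everything inside such a pair actually helps users is exactly the open question in the thread; the sketch only shows the mechanics.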