zwarich @zwarich@hachyderm.io

Does anyone have a favorite strategy for mismatched bracket error recovery in recursive descent parsers?

Oct 01, 2024, 03:18 AM··Web


**Chandler Carruth** @chandlerc · Oct 1, 2024

@zwarich My favorite strategy is technically *before* the parser because I find it really hard to do well there:

Detect during lexing, and then post-process the token stream, ideally using indent to guide fixes.

Some of this is implemented, but not the indentation bit:
https://github.com/carbon-language/carbon-lang/blob/trunk/toolchain/lex/lex.cpp#L1464

There is a TODO, and we have the indent data; we just need someone to write the code to peek at the indent and select good fixes until we run out, and then run the greedy algorithm to fix anything left.
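To make the "greedy algorithm" step concrete, here is a minimal Python sketch of balancing brackets in a token stream after lexing and before parsing. It is not Carbon's implementation (that lives in the linked `lex.cpp`); the `Token` class, its `synthetic` and `error` flags, and the `balance_greedy` function are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass

OPEN_TO_CLOSE = {"(": ")", "[": "]", "{": "}"}
CLOSERS = set(OPEN_TO_CLOSE.values())


@dataclass
class Token:
    text: str
    line: int = 0
    synthetic: bool = False  # inserted by recovery, not present in the source
    error: bool = False      # unmatched closer that the parser should skip


def balance_greedy(tokens):
    """Return a new token list in which every bracket is balanced.

    Greedy policy (an assumption of this sketch): a closer matches the
    nearest enclosing opener of the same kind, and any inner groups still
    open at that point get synthetic closers. A closer with no matching
    opener is kept but flagged as an error token.
    """
    out = []
    stack = []  # closers expected for the currently open brackets
    for tok in tokens:
        if tok.text in OPEN_TO_CLOSE:
            stack.append(OPEN_TO_CLOSE[tok.text])
            out.append(tok)
        elif tok.text in CLOSERS:
            if tok.text in stack:
                # Close inner groups that were left open.
                while stack[-1] != tok.text:
                    out.append(Token(stack.pop(), tok.line, synthetic=True))
                stack.pop()
                out.append(tok)
            else:
                out.append(Token(tok.text, tok.line, error=True))
        else:
            out.append(tok)
    # Close whatever is still open at end of file.
    last_line = tokens[-1].line if tokens else 0
    while stack:
        out.append(Token(stack.pop(), last_line, synthetic=True))
    return out
```

With this in place the parser only ever sees balanced groups, so it can treat a bracket pair as a unit and needs no recovery logic of its own for mismatches.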


**Per Vognsen** @pervognsen@mastodon.social · Oct 1, 2024

@chandlerc @zwarich IMHO for a new language like Carbon, even if you have braces rather than indentation-defined block structure, it makes sense to enforce indentation consistency with braces (and other grouping tokens) in the language definition itself rather than leaving it as a degree of freedom in individual lexer/parser implementations.

**Chandler Carruth** @chandlerc · Oct 1, 2024

@pervognsen @zwarich Mostly? We definitely thought about that but didn't *quite* go that direction.

Specifically, it didn't seem worth *forcing* the indent check and producing compile error messages when it is wrong. For example, copy/pasting code with too much or too little indentation -- it seems more useful to the user to compile that than to reject it.

We then expect the formatter and linter to enforce consistent indentation. And then design error recovery entirely around consistent indentation.
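As a rough illustration of how consistent indentation could guide recovery, the Python sketch below guesses where an unclosed block was meant to end. This is only an assumption about one possible heuristic, not something Carbon implements; `indent_of` and `guess_block_end` are hypothetical helpers, and the sketch assumes space-only indentation.

```python
def indent_of(line):
    """Number of leading spaces on a line (tabs are not handled here)."""
    return len(line) - len(line.lstrip(" "))


def guess_block_end(lines, open_line):
    """Guess where the block opened on `open_line` ends.

    Returns the index of the first non-blank line after `open_line` that
    is indented no deeper than the opening line, or len(lines) if there
    is none. A missing closer would be synthesized just before that line.
    """
    base = indent_of(lines[open_line])
    for i in range(open_line + 1, len(lines)):
        if lines[i].strip() and indent_of(lines[i]) <= base:
            return i
    return len(lines)


src = [
    "fn f() {",
    "  if (x) {",
    "    g();",
    "  h();",  # the closing brace of the `if` is missing before this line
    "}",
]
inner_end = guess_block_end(src, 1)  # 3: close the `if` before `h();`
outer_end = guess_block_end(src, 0)  # 4: the final `}` belongs to `fn f`
```

A purely greedy matcher would pair the final `}` with the inner `{` and report the missing brace at end of file. The indentation guess instead places the fix where the author most likely intended it, which only works if the formatter and linter keep indentation consistent.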

**zwarich** @zwarich · Oct 1, 2024 *

@chandlerc I came up with something similar, but was curious whether in practice it's really enough to do this without the grammatical context that you would have from a parser. Curious to see how it works out for Carbon in practice.

**Chandler Carruth** @chandlerc · Oct 1, 2024

@zwarich yeah, definitely interesting to see how it pans out.

Generally, my feeling is that recovering balanced delimiters somewhat in isolation is likely to result in less surprising recovery for users than using more contextual cues.

But the big win IMO is the simplicity (and speed) of the parser due to not needing to try to do any of this stuff.

**zwarich** @zwarich · Oct 1, 2024

@chandlerc Yeah, I don't really think there's a great way to do this otherwise in a recursive descent parser. All of the other comparable solutions I know of rely on modifying either a shift-reduce parser or a general CF parser.

Another interesting question is whether you disable all other errors within a recovery pair.

**Chandler Carruth** @chandlerc · Oct 1, 2024

@zwarich Yeah, that's a question I really wonder about, but we don't have any real experience playing with options here.

If we start using indentation, I would expect us to be able to make a decent guess at "absurd" recoveries and disable errors within those. But definitely an interesting area to explore what actually works best for users...
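One way to picture "disabling errors within a recovery pair" is a filter over diagnostics, sketched below in Python. Nothing here reflects an existing design: `Diagnostic`, `suppress_in_recovered`, and the idea of representing a recovery pair as a half-open `(start, end)` token-position span are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Diagnostic:
    pos: int       # token position the diagnostic points at
    message: str


def suppress_in_recovered(diags, recovered_spans):
    """Drop diagnostics located inside any recovered bracket span.

    `recovered_spans` is a list of half-open (start, end) position pairs,
    one per bracket pair that recovery had to synthesize or repair.
    """
    return [
        d
        for d in diags
        if not any(start <= d.pos < end for start, end in recovered_spans)
    ]
```

Whether to suppress everything inside such a span, or only inside recoveries judged "absurd", is exactly the open question in the thread; this filter just shows the all-or-nothing variant.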
