Add rule parser and tokenizer#2064

Draft
ProfDoof wants to merge 12 commits into talonhub:main from ProfDoof:new_help_engine

@ProfDoof

I'm currently building a new help engine that can handle a variety of cases, including resolution of captures and lists. To do so, I need to be able to extract pieces of the rules, so I built a rule tokenizer and parser.
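For readers who have not opened the diff, here is a minimal sketch of what tokenizing a rule can look like. It is not the code in this PR: the `Token` type, the token kind names, and the `tokenize` function are hypothetical, and the pattern only assumes the familiar rule syntax of plain words, `<captures>`, `{lists}`, and the grouping, optional, alternation, repetition, and anchor symbols.

```python
import re
from typing import NamedTuple


class Token(NamedTuple):
    kind: str  # hypothetical kind names: capture, list, word, symbol
    text: str


# Assumed rule syntax; one named group per token kind.
TOKEN_PATTERN = re.compile(
    r"""
    (?P<capture><[\w.]+>)        # e.g. <user.text>
  | (?P<list>\{[\w.]+\})         # e.g. {user.letter}
  | (?P<word>[\w']+)             # plain spoken word
  | (?P<symbol>[()\[\]|*+^$])    # grouping, optional, alternation, repetition, anchors
  | (?P<space>\s+)               # whitespace, skipped below
    """,
    re.VERBOSE,
)


def tokenize(rule: str) -> list[Token]:
    """Split a rule string into tokens, raising on anything unrecognized."""
    tokens = []
    pos = 0
    while pos < len(rule):
        match = TOKEN_PATTERN.match(rule, pos)
        if match is None:
            raise ValueError(f"unexpected character {rule[pos]!r} at offset {pos}")
        if match.lastgroup != "space":
            tokens.append(Token(match.lastgroup, match.group()))
        pos = match.end()
    return tokens


kinds = [token.kind for token in tokenize("say <user.text> [over]")]
print(kinds)  # ['word', 'capture', 'symbol', 'word', 'symbol']
```

Failing loudly on unknown characters is a deliberate choice in this sketch: if the rule syntax ever grows a new symbol, the tokenizer reports it instead of silently dropping it.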

I wanted to go ahead and get this out in draft format so people could discuss whether or not this was something we wanted to include since I already built it out and it only took a few hours.

@nriley
Collaborator

nriley commented Nov 29, 2025

From the community backlog session — improving help seems like an important goal. Your parser looks relatively lightweight and easy to read. We would want to see some kind of tests in community to make sure that with any changes to TalonScript it does not introduce any regressions.

Thank you for doing this!

@ProfDoof
Author

So, I started on this functionality a while back and got stuck, so I want to share what I'm stuck on, as I've been thinking about this for a while and have gotten nowhere.

I wanted to work on this feature because the current help lacks some capabilities that I think are very important to overall discoverability in the system. I have been asked to avoid any kind of dependence on Talon's internal code. I would like to do that; however, it means I would basically need to re-implement all of the logic that Talon already implements to read and parse all of the Talon files in the Talon files folder. This would be an inherently brittle solution: the moment anything about the format changes, it would break. It also wouldn't be able to handle captures, which are one of the most interesting potential applications of this search functionality.

For the record, what I want to implement is an improved search that runs over an acyclic graph representation of each individual rule. Something along the lines of

rule <capture1> {list1}:
  ...

would be expanded into a graph that looks approximately like

rule -> capture1.rule (capture one rule) -> list1[0]
                                         -> list1[1]
                                         -> ...
                                         -> list1[-1]

This would allow us to match against complex rules. From an optimization perspective, we can memoize much of this search (search each list only once, each rule only once, etc.), which should keep the search tractable, without an exponential blowup. However, captures are impossible to handle without resolution code provided by Talon (captures are defined at runtime, and introspection is very brittle in this case). Lists are possible to handle but, as noted above, very brittle.
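To make the memoization idea concrete, here is a rough sketch under heavy assumptions. The `LISTS` and `CAPTURES` dictionaries are hand-written stand-ins for what Talon would have to resolve at runtime, and all function names are hypothetical. It also simplifies the graph above: instead of keeping word order, it collapses each rule into the set of words reachable through its captures and lists, so it can only answer "does this word appear somewhere in the rule's expansion".

```python
import re
from functools import lru_cache

# Hypothetical data; in practice this would have to come from Talon itself.
LISTS = {
    "user.letter": ("air", "bat", "cap"),
    "user.number_key": ("one", "two"),
}
CAPTURES = {
    "user.any_key": "{user.letter} | {user.number_key}",
}

# Matches a capture reference, a list reference, or a plain word.
ELEMENT = re.compile(r"<([\w.]+)>|\{([\w.]+)\}|([\w']+)")


@lru_cache(maxsize=None)
def list_words(name: str) -> frozenset[str]:
    """Words contributed by one list; cached so each list is scanned once."""
    phrases = LISTS.get(name, ())
    return frozenset(word for phrase in phrases for word in phrase.split())


@lru_cache(maxsize=None)
def rule_words(rule: str) -> frozenset[str]:
    """Every word reachable from a rule by following captures and lists.

    Cached per rule string. Assumes the reference graph is acyclic;
    a capture that refers to itself would recurse forever.
    """
    words = set()
    for capture, list_name, literal in ELEMENT.findall(rule):
        if capture:
            words |= rule_words(CAPTURES.get(capture, ""))
        elif list_name:
            words |= list_words(list_name)
        else:
            words.add(literal)
    return frozenset(words)


def search(rules: list[str], term: str) -> list[str]:
    """Return the rules whose expansion contains the given word."""
    return [rule for rule in rules if term in rule_words(rule)]


rules = ["press <user.any_key>", "go {user.number_key}"]
print(search(rules, "bat"))  # ['press <user.any_key>']
```

Because both helpers are cached, a list or capture shared by many rules is expanded once and reused, which is the "search each list only once, each rule only once" property. Matching ordered phrases would need the actual graph rather than a flat word set.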

With all that being said, I'm not opposed to putting in more work to get something working, but I'm just not sure what is viable here. I'm also not sure whether this functionality would truly be beneficial, as it wouldn't be able to handle semantic similarities, and those are probably the most valuable comparisons that could be done.

TL;DR: Is this "improved search functionality" worth building, and if so, what approaches might be viable for completing it without relying on unstable internal code?

@nriley
Collaborator

nriley commented Apr 18, 2026

From the community backlog session:

> I have been asked to avoid any kind of dependence on the talon internal code.

To the best of our understanding, use of the Talon registry to provide a help system is acceptable (community does this already). Use of it for any other purposes is not. So you may be able to use some Talon internals.

Otherwise we feel that it is very helpful for people who are not comfortable using something like andreas-talon to navigate definitions in an editor or IDE to be able to understand how voice commands can be spoken with lists and captures.

One of the major challenges in visualizing this will be the current imgui, but there is work in progress to bind a richer user interface framework.

We also discussed that, depending on your specific aim in expanding community's help system, you may not need to completely parse .talon files. For example, if you just want to identify lists and captures, you can do so with regular expressions.
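As an illustration of the regular-expression route, here is a small sketch. The function name is hypothetical, and it assumes the usual .talon layout: rules start at the beginning of a line and end at the first colon, command bodies are indented, and comments start with `#`. It does not try to understand the context header or the command bodies.

```python
import re
from typing import Iterator

CAPTURE_RE = re.compile(r"<([\w.]+)>")   # e.g. <user.text>
LIST_RE = re.compile(r"\{([\w.]+)\}")    # e.g. {user.letter}


def rule_references(talon_source: str) -> Iterator[tuple[str, list[str], list[str]]]:
    """Yield (rule, captures, lists) for each rule that references either."""
    for line in talon_source.splitlines():
        # Skip blank lines, indented command bodies, and comments.
        if not line or line[0].isspace() or line.startswith("#"):
            continue
        # Assumption: everything before the first colon is the rule.
        rule = line.split(":", 1)[0].strip()
        captures = CAPTURE_RE.findall(rule)
        lists = LIST_RE.findall(rule)
        if captures or lists:
            yield rule, captures, lists


source = """app: vscode
-
say <user.text> [over]: insert(text)
press {user.letter}:
    key(letter)
"""
for rule, captures, lists in rule_references(source):
    print(rule, captures, lists)
```

Header lines such as `app: vscode` pass through the same code path but contain no `<...>` or `{...}`, so they produce nothing. This finds which lists and captures a rule mentions; it does not resolve what they expand to.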

I hope I am capturing our discussion reasonably well here, but if you have any questions it may make sense to ask on Slack or consider joining the community backlog session if this works for your time zone.

3 participants