Skip to content

Build spy fmt formatter#496

Open
kanin-kearpimy wants to merge 12 commits into
spylang:mainfrom
kanin-kearpimy:457-add-subcommand-spy_fmt
Open

Build spy fmt formatter#496
kanin-kearpimy wants to merge 12 commits into
spylang:mainfrom
kanin-kearpimy:457-add-subcommand-spy_fmt

Conversation

@kanin-kearpimy

@kanin-kearpimy kanin-kearpimy commented Apr 29, 2026

Copy link
Copy Markdown
Contributor

Issue: #457

Description

Hello, this PR introduce SPy formatter via spy fmt command. SPy actually extends Python syntax with extra language grammar. Technically, we can use Python formatter. However, we need to handle SPy-specific const and var declaration.

spy format is calling:

  1. preprocess .spy code to `python.
  2. in preprocess, we mark SPy grammar (e.g. var x, const y with MIDDLE DOT like var·x and const·y.
  3. Ruff formatting
  4. reintroduce SPy grammar by converting var·x and const·y to var x and const y

Note

This PR cover only single file formatter. Adding multiple files or directory are not that hard. To concentrate only on SPy formatter, I propose we leave those features for another issue (I'll create it later). Such issue is also good for first time contributor.

DEPRECATED DO NOT READ

spy fmt is calling:

  1. Read .spy file content and tokenize.
  2. Perform scoping tracker SPy-specific declaration const and var << (Please see below for more detail)
  3. Cleaning SPy-specific const and var out.
  4. Parse (3) to Python tokens.
  5. perform ruff formatter.
  6. Reintroduce SPy-specific declaration back.
  7. change file content.

SPy-specific declaration: Scoping Tracker

image

We scope source code by function, class, and some control-flow branch such as if, else, while, for. Tracker push/pop those scope into stack. These scopes are not affected by ruff formatter. So, we can track nth-variable by each scope.

In picture above, we have two scope: module_def-main and module_def-main_if-1. So, dictionary tracker has

  • module_def-main__x-1
  • module_def-main_if-1__y-1

If we found SPy-specific like const, var, we marked them in dictionary. When we perform step (4) to reintroduce SPy-specific, we perform scope tracker against step (1) dictionary.

Please note we also track occurrence of variables and keywords (if, else, while, for) against its scope.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new spy fmt CLI subcommand intended to format .spy sources by preprocessing SPy-specific syntax (var/const) into Python-compatible code, running ruff format, and reinserting SPy-specific tokens afterward.

Changes:

  • Introduces spy fmt command wiring and a new SPyFormatter implementation that shells out to ruff format.
  • Extends magic_py_parse.py with token-tracking helpers to reinsert var/const after formatting.
  • Adds tests for token tracking/reinsertion and a CLI-level formatting test.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 10 comments.

Show a summary per file
File Description
spy/tests/test_fmt.py New tests for SPy-specific token tracking and reinsertion logic.
spy/tests/test_cli.py Adds an integration test for spy fmt formatting a file.
spy/magic_py_parse.py Adds scope/occurrence-based tracking + reinsertion of var/const tokens.
spy/cli/commands/fmt.py New fmt subcommand implementation.
spy/cli/cli.py Registers the new fmt subcommand.
spy/analyze/fmt.py Implements Ruff-based formatting and SPy token reinsertion pipeline.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread spy/magic_py_parse.py Outdated
Comment thread spy/magic_py_parse.py Outdated
Comment thread spy/cli/commands/format.py
Comment thread spy/magic_py_parse.py Outdated
Comment thread spy/magic_py_parse.py Outdated
Comment thread spy/magic_py_parse.py Outdated
Comment thread spy/magic_py_parse.py Outdated
Comment thread spy/analyze/fmt.py Outdated
Comment thread spy/analyze/fmt.py Outdated
Comment thread spy/tests/test_fmt.py Outdated
@kanin-kearpimy kanin-kearpimy marked this pull request as ready for review April 30, 2026 15:54

@JeffersGlass JeffersGlass left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HI @kanin-kearpimy! This is really cool! A few little comments dropped in.

My biggest though is that I think this feature should be called format, not fmt, and just have fmt be an alias in the CLI. I think that will make it more clear across the code what the intent is. So, fmt.py -> format.py, test_fmt.py -> test_format.py, etc.

Comment thread spy/cli/cli.py Outdated
Comment thread spy/magic_py_parse.py Outdated
Comment thread spy/magic_py_parse.py Outdated
Comment thread spy/tool/spyformatter.py Outdated
@kanin-kearpimy kanin-kearpimy force-pushed the 457-add-subcommand-spy_fmt branch from ddf1b3d to 0d5679d Compare May 21, 2026 14:44

@antocuni antocuni left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kanin-kearpimy sorry for taking so long to review this, the last month has been crazy.
So, first of all: congratulations for having invented this way to solve the problem. It's very clever and I didn't think about it.

That said, I think it's too much clever and I am a bit worried about the extra complexity added to magic_py_parse.

While reviewing this, I got a different idea which would make spy fmt much easier to implement. I prototyped this idea in #540.

Basically, instead of "stripping" var/const and keeping track of varkind_locs in a separate table, we can replace var x with a single identifier var·x, using the special char MIDDLE DOT.

If we have that, spy fmt would become as easy as:

  1. use magic_py_parse to produce the "pythonized" source
  2. use a regex to replace var·····x with var·x
  3. run ruff fmt on it
  4. use a regex to replace var·x back to var x
    The nice part is that len('var x') == len('var·x'), so ruff fmt should automatically do the right thing w.r.t. e.g. long lines.

What do you think?

@antocuni antocuni mentioned this pull request Jun 1, 2026
@kanin-kearpimy kanin-kearpimy force-pushed the 457-add-subcommand-spy_fmt branch from 8257407 to 464c9d1 Compare June 4, 2026 15:50
@kanin-kearpimy kanin-kearpimy requested a review from antocuni June 4, 2026 16:26
@kanin-kearpimy

Copy link
Copy Markdown
Contributor Author

Hi, I refactored code to use new preprocess.

@antocuni

@JeffersGlass JeffersGlass left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be a nit, but I would love to see the filenames and function names all changed from fmt to format. (fmt.py -> format.py, test_fmt.py -> test_format.py, etc.). There's nothing really gained by the abbreviation here and it would be nice to be consistent with the rest of the command filenames and functions.

Other than that, LGTM!

@antocuni antocuni left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @kanin-kearpimy .
This is much easier and simpler than the previous approach, good job :).

I left some comments: many are style issues and nitpicks, but two are real:

  1. spy format should operate on filenames, not modnames
  2. it should raise SPyError for messages which are meant to be shown to the user, not RuntimeError.

Comment thread spy/tool/spyformatter.py Outdated
Comment thread spy/tool/spyformatter.py
Comment thread spy/tool/spyformatter.py Outdated
Comment thread spy/tool/spyformatter.py Outdated
check=True,
capture_output=True,
text=True,
cwd=repo_root,

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do you need cwd=repo_root here?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's for config in pyproject.toml. However, I adjusted it a bit to explicitly adding config file on command.

            cmd = ["ruff", "format", tmp_path]
            if os.path.exists(config_path):
                cmd.extend(["--config", config_path])
            result = subprocess.run(
                cmd,
                check=True,
                capture_output=True,
                text=True,
            )

Comment thread spy/tool/spyformatter.py Outdated

async def fmt(args: Fmt_Args) -> None:
"""Format SPy file"""
# TODO: implement directory formatting.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would be nice to have support for

  1. directoris, e.g. spy format /tmp/somedir)
  2. multiple files, e.g. spy format /tmp/*.spy)

But I think it's fine to do this in a follow-up PR :)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course. Let me share new PR with you soon.

Comment thread spy/cli/commands/format.py
Comment thread spy/tests/test_cli.py Outdated
Comment thread spy/tests/test_cli.py
Comment thread spy/magic_py_parse.py Outdated
@kanin-kearpimy

kanin-kearpimy commented Jun 9, 2026

Copy link
Copy Markdown
Contributor Author
it seems that this improvement has been lost during the refactoring?
Re-opening it.

What do you mean here sorry? @antocuni

@antocuni

antocuni commented Jun 9, 2026

Copy link
Copy Markdown
Member
it seems that this improvement has been lost during the refactoring?
Re-opening it.

What do you mean here sorry? @antocuni

the original comment asked to return [str, str], it was marked as resolved but it still returns str.

@paugier

paugier commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

Side remark: it would be very convenient if spy format could format result of spy redshift. I'm wondering if a trick like var·x could be used for that. Redshift output contains names like "_print::println[str]", which could be turned into equivalent with other special uncommon characters. I will test this idea with a dedicated Python script.

@antocuni

antocuni commented Jun 9, 2026

Copy link
Copy Markdown
Member

Side remark: it would be very convenient if spy format could format result of spy redshift. I'm wondering if a trick like var·x could be used for that. Redshift output contains names like "_print::println[str]", which could be turned into equivalent with other special uncommon characters. I will test this idea with a dedicated Python script.

that's an interesting idea.

Not 100% it would make the code more readable though, because redshifted code tends to have very long identifiers. Maybe we can make it working by using a longer line length than usual?

@kanin-kearpimy kanin-kearpimy force-pushed the 457-add-subcommand-spy_fmt branch from d509e33 to 2c82502 Compare June 11, 2026 13:56
@github-actions

github-actions Bot commented Jun 11, 2026

Copy link
Copy Markdown
Documentation Preview
Preview documentation for PR #496.
Last Updated: 2026-06-11 15:01 UTC

@kanin-kearpimy kanin-kearpimy requested a review from antocuni June 11, 2026 15:30
@kanin-kearpimy

kanin-kearpimy commented Jun 12, 2026

Copy link
Copy Markdown
Contributor Author

Hello @antocuni ,

most of comments has been resolved. For multiple files / dictionary, New PR is coming.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants