Skip to content

perf: use beam metadata for dependencies#670

Open
doorgan wants to merge 2 commits into
mainfrom
doorgan/perf-deps-beams
Open

perf: use beam metadata for dependencies#670
doorgan wants to merge 2 commits into
mainfrom
doorgan/perf-deps-beams

Conversation

@doorgan
Copy link
Copy Markdown
Contributor

@doorgan doorgan commented May 7, 2026

BEAM metadata provide all the information we need to index definitions, but not references. We only index deps definitions, so it makes sense to index from BEAM files for them.

This reduces a lot of work by skipping most of the work Extractors do for source files, only requiring to scan a few lines of code from the source file to refine the definition range(this is a bit hacky, but it seems to work well for most code, metaprogramming would be where we need to be more careful).

It also adds a new Manifest for incremental indexing. We were previously relying on the Entry.path to get the file path to compare mtime against, but .beam file paths are not stored in the Entry structs(nor should they be) which required me to rethink how we do incremental indexing. I added a Manifest(and a ManifestStore module to handle its lifecycle) which holds a mapping of file -> source files + file stats and all the metadata needed to determine if a file should be reindexed, and which entries are stale and need to be removed.

Expert indexes in ~1.8 seconds for me without this change, and goes down to ~800ms with this change.

Another reason this change is interesting is that if we decide to use compiler tracers we will need a beam extractor(since compiler tracers provide us with the compile code's debug_info), and this sets the foundation for that.

This does not replace indexing of source files for the user project though, for two reasons:

  • I didn't see a significant boost in performance, mostly because we still need to properly parse the original files to index references, that is where the bulk of the work in the index seems to be spent. I think using this for the project beams would be better implemented in a PR that actually makes use of compiler tracers, or that attempts to improve the indexer accuracy on generated code
  • We still require parsing source files for .exs files like configuration and test, those don't produce .beam artifacts and therefore can't be indexed with this approach

@doorgan doorgan force-pushed the doorgan/perf-indexer-paths branch from 1f4f5da to 453246a Compare May 7, 2026 19:42
@doorgan doorgan force-pushed the doorgan/perf-deps-beams branch 3 times, most recently from f25e9af to 7826847 Compare May 8, 2026 19:35
@doorgan doorgan force-pushed the doorgan/perf-indexer-paths branch from 453246a to 14f0eaa Compare May 11, 2026 19:10
@katafrakt katafrakt force-pushed the doorgan/perf-indexer-paths branch from 14f0eaa to 9b2019a Compare May 12, 2026 09:22
Base automatically changed from doorgan/perf-indexer-paths to main May 12, 2026 09:26
@doorgan doorgan force-pushed the doorgan/perf-deps-beams branch 6 times, most recently from a820e27 to 49ae40b Compare May 21, 2026 17:27
@doorgan doorgan marked this pull request as ready for review May 21, 2026 17:36
@doorgan doorgan force-pushed the doorgan/perf-deps-beams branch from 49ae40b to eb1fdcd Compare May 21, 2026 18:50
Copy link
Copy Markdown
Member

@katafrakt katafrakt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mostly tested it from the usage site. Seems nice for our large codebase. Good one. I left two comments of things I noticed.

Additional worry is that this is quite large change that would make backporting potentailly hard (unless we backport this too?)

defp dependency_app_names(%Project{} = project) do
project
|> mix_dependency_app_names()
|> MapSet.union(configured_dependency_app_names(project))
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't that effectively eliminate dependencies with app: false? I'm not sure how big of an edge case this is, but feels like a regression from previous approach.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, fixed

with {:ok, entries} <- Search.Indexer.create_index(project) do
Search.Store.replace(entries)
end
Search.Store.refresh_index(project)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this actually force reindex when nothing is changed in the filesystem?

Copy link
Copy Markdown
Contributor Author

@doorgan doorgan May 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It did not, fixed!

@doorgan
Copy link
Copy Markdown
Contributor Author

doorgan commented May 22, 2026

Additional worry is that this is quite large change that would make backporting potentailly hard (unless we backport this too?)

This is true, though at the same time I'm not sure this belongs to 0.1, there's also the SQLite backend PR and upcoming compiler tracer PR that also significantly change the indexer and those don't belong to 0.1 either but to 0.2.
I'm not entirely sure about how we'll want to handle this

@doorgan doorgan force-pushed the doorgan/perf-deps-beams branch from eb1fdcd to 4f0b12a Compare May 24, 2026 22:44
@doorgan doorgan force-pushed the doorgan/perf-deps-beams branch from 4f0b12a to de934ef Compare May 24, 2026 22:44
@katafrakt
Copy link
Copy Markdown
Member

I agree this does not belong in 0.1. One solution would be to try to get through with all the large changes you mentioned (+ perhaps Hex intelligence) and try to cut out 0.2 RC soon. However, I don't know if this would meet the goal of NextLS feature-completeness.

@doorgan
Copy link
Copy Markdown
Contributor Author

doorgan commented May 27, 2026

@katafrakt it would meet the main goals of 0.2, the rest of the milestone are bugfixes, so that strikes me as a good solution

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants