fix: allow multiple index writers with SQLite#683
Draft
doorgan wants to merge 2 commits into
Draft
Conversation
affec9d to
6dded43
Compare
4f0b12a to
de934ef
Compare
6dded43 to
d667579
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #353
Replaces the indexer's ETS backend with SQLite. The ETS backend did not have good concurrent writes support, and we lost that completely after we moved to EPMDless deployments because project nodes could no longer discover each other to assign a leader writer.
This PR replaces the ETS backend with an SQLite one which solves this issue. It is not a faster backend, the benchmarks I ran(included in this PR) range from 1.5x to 3x slower, however when I try out the features it doesn't seem like that slowdown manifests noticeably during normal use, or even while indexing. YMMV very if you have a large codebase(which I don't have available to test against). This also sits on top of several optimizations I've been working on to be able to afford this switch, so overall this should be faster than v0.1 anyways.
I had tried to implement this several months ago to no avail due to namespacing: we don't have a clean way to namespace NIFs which causes crashes at runtime, and at the time the solution seemed to be to fork Exqlite and have our own CI tooling to release our own namespaced Exqlite, which I ultimately decided against due to the sheer complexity of that approach.
This PR instead makes two changes, based on how NextLS works: instead of having the engine nodes own their indexer storage, we move the storage layer to the manager node(where we can exclude apps from namespacing) and make the indexer in the engine nodes send the data back and forth with the manager to store or retrieve indexer information. Then we can swap the ETS backend to SQLite with relatively minimal effort.
This PR does not delete the ETS backend code even if it becomes unused. I think it's worth keeping around for a little while to be able to easily compare how they perform and to be able to benchmark them side by side. The ETS code can be deleted in a later PR when we're sure we are not going back. I personally dislike deleting "dead code" and having to revert commits to test older approaches during a transition period.