Skip to content

CASSANDRA-21139 - Feature/guardrail for misprepare statements#4596

Open
omniCoder77 wants to merge 17 commits intoapache:trunkfrom
omniCoder77:feature/guardrail-for-misprepare-statements
Open

CASSANDRA-21139 - Feature/guardrail for misprepare statements#4596
omniCoder77 wants to merge 17 commits intoapache:trunkfrom
omniCoder77:feature/guardrail-for-misprepare-statements

Conversation

@omniCoder77
Copy link
Contributor

Validation is posted on Jira Ticket

Feature suggestion:
Guardrail for miss-prepared statements

Description:

We have hundreds of application teams and several dozen of them miss-prepare statements by using literals instead of bind markers.

I.e.,

// wrong 
session.prepare("select * from users where ID = 996");
session.prepare("select * from users where ID = 997");
session.prepare("select * from users where ID = 998");
session.prepare("select * from users where ID = 999");

// correct
session.prepare("select * from users where ID = ?");

The problem causes the prepared statement cache to constantly overflow, and will print a prepared statements discarded WARN message in the Cassandra log. At present, we use a wack-a-mole approach to discuss the problem with each development team individually, and hope they fix it and train the entire team on how to prepare statements correctly.

Also, finding the root cause of the issue today requires having the knowledge and access to look at the system.prepared_statements table.

Guardrails would seem a good approach here, where the guard could WARN or REJECT when a statement was prepared using a WHERE clause and no bind markers.

Note, this should not prevent users from creating prepared statements without a WHERE clause or with one or more literal values so long as there was at least one bind marker. Thus, the following would remain valid:

session.prepare("select * from users");
session.prepare("select * from users where TYPE=5 and ID = ?");

Approach:
Introduced a boolean flag use_misprepare_statements_enabled (which can be configured from cassandra.yaml) whose default value is true (backward compatibility) and added functions to StorageServiceMBean to enable dynamic runtime configuration.
Added test cases to validate changes in parseAndPrepare function.

@smiklosovic smiklosovic self-requested a review February 2, 2026 01:08
Copy link
Contributor

@smiklosovic smiklosovic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

more comments

@omniCoder77
Copy link
Contributor Author

@smiklosovic
I have addressed above comments and

Added new test cases verifying warning behaviour (warn only when misprepared_statements_enabled is true) and to bypass misprepare guardrail for system keyspaces.

Centralized onMisprepare logic to Guardrails.java

Refined test case to extract repeating statements to a function, increase readability.

@omniCoder77
Copy link
Contributor Author

omniCoder77 commented Feb 4, 2026

In my test cases I have tested

  • System keyspaces, internal calls and super users are exempted from this guardrail check
  • Confirmed literals in the SELECT clause are allowed as long as the WHERE clause is prepared.
  • Verified use is warned if and only if misprepared_statement_use is set to true and statement is misprepared
  • Verified no warning when guardrail is violated, instead GuardrailViolationException
  • Verified blocks for literals in WHERE clauses, IN clauses, LWT IF conditions, and JSON inserts.
  • Confirmed that blocked queries do not enter the statement cache
  • Exempts super user and internal call from this guardrail

@omniCoder77 omniCoder77 force-pushed the feature/guardrail-for-misprepare-statements branch from 311028f to 44bba46 Compare February 4, 2026 11:10
@omniCoder77 omniCoder77 force-pushed the feature/guardrail-for-misprepare-statements branch from 44bba46 to 586af83 Compare February 5, 2026 07:19
restrictions.hasNonPrimaryKeyRestrictions();
if (getBindVariables().isEmpty() && hasRestriction)
{
Guardrails.onMisprepared(state);
Copy link
Contributor

@smiklosovic smiklosovic Feb 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rather just copy the content of that mehod here instead of having completely arbitrary method in Guardrails. There is no precedent to that.

You might also move it to superclass of these Statement classes so both can call it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

public static void onMisprepared(ClientState state)
    {
        mispreparedStatementsEnabled.ensureEnabled(state);
        mispreparedStatementsEnabled.warn(MISPREPARED_STATEMENT_WARN_MESSAGE);
    }

here warn is a protected function, can't write it inside statement classes or interface, not sure where else to put it.

@smiklosovic
Copy link
Contributor

smiklosovic commented Feb 6, 2026

@omniCoder77 thanks for the perseverance! I am just checking overall code / design etc. Havent deeply checked the actual logic around the very guardrail application.

@omniCoder77 omniCoder77 force-pushed the feature/guardrail-for-misprepare-statements branch from 21adc18 to 6dbbc90 Compare February 6, 2026 06:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants