Skip to content

Pczajka/critical user journeys#709

Closed
sfc-gh-pczajka wants to merge 3 commits intomainfrom
pczajka/critical-user-journeys
Closed

Pczajka/critical user journeys#709
sfc-gh-pczajka wants to merge 3 commits intomainfrom
pczajka/critical-user-journeys

Conversation

@sfc-gh-pczajka
Copy link
Copy Markdown
Contributor

@sfc-gh-pczajka sfc-gh-pczajka commented Mar 24, 2026

Tests scenarios representing critical user journeys uncovered by existing e2e tests. Scenarios are effect of LLM analysis of 1st party clients: Snowflake CLI, Snowfort, Snowpark, SQLAlchemy and existing driver test cases

The idea is to test scenarios used by Python early adopters in other drivers as well

sfc-gh-pczajka and others added 2 commits March 24, 2026 10:23
Add 12 critical user journey test scenarios covering table lifecycle,
session management, stored procedures, UDFs/UDTFs, cursor reuse,
MERGE INTO, COPY INTO, CREATE VIEW/CTAS, object discovery,
metadata/catalog operations, transaction control, and error handling.

Each journey includes a language-agnostic Gherkin feature definition
under tests/definitions/shared/ and a Python E2E implementation
under python/tests/e2e/.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings March 24, 2026 13:16
@sfc-gh-pczajka sfc-gh-pczajka requested a review from a team as a code owner March 24, 2026 13:16
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new set of “critical user journey” scenarios (Gherkin) and corresponding Python e2e pytest implementations to cover common Snowflake workflows (DDL/DML lifecycle, object discovery/metadata, session management, COPY/PUT, MERGE, UDF/UDTF, stored procedures).

Changes:

  • Introduces new shared Gherkin feature specs under tests/definitions/shared/critical_user_journeys/.
  • Adds Python e2e test modules under python/tests/e2e/critical_user_journeys/ implementing those scenarios.
  • Establishes a Python package marker for the new test area via __init__.py.

Reviewed changes

Copilot reviewed 19 out of 19 changed files in this pull request and generated 8 comments.

Show a summary per file
File Description
tests/definitions/shared/critical_user_journeys/udf_udtf_lifecycle.feature Defines UDF/UDTF lifecycle scenarios for Python e2e coverage.
tests/definitions/shared/critical_user_journeys/table_lifecycle.feature Defines table DDL + filtering scenarios.
tests/definitions/shared/critical_user_journeys/stored_procedure_lifecycle.feature Defines stored procedure create/call/drop scenario.
tests/definitions/shared/critical_user_journeys/session_management.feature Defines role switching and restoration scenario.
tests/definitions/shared/critical_user_journeys/object_discovery.feature Defines SHOW/DESCRIBE/DROP discovery scenarios.
tests/definitions/shared/critical_user_journeys/metadata_catalog.feature Defines SHOW TABLES / DESCRIBE TABLE metadata scenarios.
tests/definitions/shared/critical_user_journeys/merge_into.feature Defines MERGE INTO upsert scenario.
tests/definitions/shared/critical_user_journeys/create_view_ctas.feature Defines view creation and CTAS scenarios.
tests/definitions/shared/critical_user_journeys/copy_into.feature Defines PUT + COPY INTO bulk load scenario.
python/tests/e2e/critical_user_journeys/test_udf_udtf_lifecycle.py Implements UDF/UDTF lifecycle e2e tests.
python/tests/e2e/critical_user_journeys/test_table_lifecycle.py Implements table create/describe and WHERE filtering e2e tests.
python/tests/e2e/critical_user_journeys/test_stored_procedure_lifecycle.py Implements stored procedure lifecycle e2e test.
python/tests/e2e/critical_user_journeys/test_session_management.py Implements role switch/restore e2e test.
python/tests/e2e/critical_user_journeys/test_object_discovery.py Implements object discovery and drop verification e2e tests.
python/tests/e2e/critical_user_journeys/test_metadata_catalog.py Implements catalog inspection e2e tests.
python/tests/e2e/critical_user_journeys/test_merge_into.py Implements MERGE INTO upsert e2e test.
python/tests/e2e/critical_user_journeys/test_create_view_ctas.py Implements CREATE VIEW and CTAS e2e tests.
python/tests/e2e/critical_user_journeys/test_copy_into.py Implements PUT + COPY INTO bulk load e2e test.
python/tests/e2e/critical_user_journeys/__init__.py Marks the new critical journey e2e tests package.

Comment on lines +22 to +29
cursor.execute(
f"CREATE OR REPLACE TABLE {table_name} ("
f"id INT NOT NULL, "
f"name VARCHAR(100), "
f"active BOOLEAN, "
f"score FLOAT, "
f"amount NUMBER(10,2))"
)
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The scenario/step text calls this a temporary table, but the test creates a non-temporary table. In e2e, tmp_schema is shared across the session and the repo guidance recommends CREATE OR REPLACE TEMPORARY TABLE to avoid name conflicts and cleanup overhead (see python/tests/e2e/conftest.py).

Copilot uses AI. Check for mistakes.
Comment on lines +59 to +66
cursor.execute(
f"CREATE OR REPLACE TABLE {table_name} ("
f"id INT NOT NULL, "
f"name VARCHAR(100), "
f"active BOOLEAN, "
f"score FLOAT, "
f"amount NUMBER(10,2))"
)
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issue as above: the step says a temporary table exists, but this creates a non-temporary table in the session-scoped tmp_schema. Prefer CREATE OR REPLACE TEMPORARY TABLE here to match the feature and to avoid persistent objects across the e2e session (per python/tests/e2e/conftest.py).

Copilot uses AI. Check for mistakes.
Comment on lines +22 to +25
cursor.execute(f"CREATE STAGE {stage_name}")

# And A temporary table "copy_test" with columns (id INT, name VARCHAR, val FLOAT) exists
cursor.execute(f"CREATE TABLE {table_name} (id INT, name VARCHAR, val FLOAT)")
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This creates a non-temporary stage/table with fixed names and without OR REPLACE / IF NOT EXISTS, which can make reruns flaky within the session-scoped tmp_schema. Prefer a temporary stage/table (or unique names) and use CREATE OR REPLACE / IF NOT EXISTS; there are existing helpers in python/tests/e2e/put_get/put_get_helper.py (e.g., create_temporary_stage).

Suggested change
cursor.execute(f"CREATE STAGE {stage_name}")
# And A temporary table "copy_test" with columns (id INT, name VARCHAR, val FLOAT) exists
cursor.execute(f"CREATE TABLE {table_name} (id INT, name VARCHAR, val FLOAT)")
cursor.execute(f"CREATE OR REPLACE TEMPORARY STAGE {stage_name}")
# And A temporary table "copy_test" with columns (id INT, name VARCHAR, val FLOAT) exists
cursor.execute(
f"CREATE OR REPLACE TEMPORARY TABLE {table_name} (id INT, name VARCHAR, val FLOAT)"
)

Copilot uses AI. Check for mistakes.
csv_path.write_text("1,Alice,3.14\n2,Bob,2.72\n3,Charlie,1.41\n")

# When The CSV file is PUT to the stage
cursor.execute(f"PUT 'file://{csv_path}' @{stage_name} AUTO_COMPRESS=FALSE OVERWRITE=TRUE")
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Building the PUT 'file://...' URI from a raw Path may break on Windows due to backslashes. Use the existing as_file_uri(Path) helper (or csv_path.as_posix()) when formatting the PUT command to keep the test cross-platform.

Suggested change
cursor.execute(f"PUT 'file://{csv_path}' @{stage_name} AUTO_COMPRESS=FALSE OVERWRITE=TRUE")
cursor.execute(
f"PUT 'file://{csv_path.as_posix()}' @{stage_name} AUTO_COMPRESS=FALSE OVERWRITE=TRUE"
)

Copilot uses AI. Check for mistakes.
Comment on lines +24 to +29
cursor.execute(f"CREATE TABLE {target_table} (id INT, name VARCHAR, amount INT)")
cursor.execute(f"INSERT INTO {target_table} VALUES (1, 'original_1', 100), (2, 'original_2', 200)")

# And A source table with rows (2, 'updated_2', 250) and (3, 'new_3', 300) exists
cursor.execute(f"CREATE TABLE {source_table} (id INT, name VARCHAR, amount INT)")
cursor.execute(f"INSERT INTO {source_table} VALUES (2, 'updated_2', 250), (3, 'new_3', 300)")
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tables are created without OR REPLACE / IF NOT EXISTS and are non-temporary in the session-scoped tmp_schema, which can cause flakiness on reruns. Prefer CREATE OR REPLACE TEMPORARY TABLE (or at least CREATE OR REPLACE TABLE) for both target/source to match the e2e fixture guidance (python/tests/e2e/conftest.py).

Suggested change
cursor.execute(f"CREATE TABLE {target_table} (id INT, name VARCHAR, amount INT)")
cursor.execute(f"INSERT INTO {target_table} VALUES (1, 'original_1', 100), (2, 'original_2', 200)")
# And A source table with rows (2, 'updated_2', 250) and (3, 'new_3', 300) exists
cursor.execute(f"CREATE TABLE {source_table} (id INT, name VARCHAR, amount INT)")
cursor.execute(f"INSERT INTO {source_table} VALUES (2, 'updated_2', 250), (3, 'new_3', 300)")
cursor.execute(
f"CREATE OR REPLACE TEMPORARY TABLE {target_table} (id INT, name VARCHAR, amount INT)"
)
cursor.execute(
f"INSERT INTO {target_table} VALUES (1, 'original_1', 100), (2, 'original_2', 200)"
)
# And A source table with rows (2, 'updated_2', 250) and (3, 'new_3', 300) exists
cursor.execute(
f"CREATE OR REPLACE TEMPORARY TABLE {source_table} (id INT, name VARCHAR, amount INT)"
)
cursor.execute(
f"INSERT INTO {source_table} VALUES (2, 'updated_2', 250), (3, 'new_3', 300)"
)

Copilot uses AI. Check for mistakes.
cursor.execute(f"INSERT INTO {source_table} VALUES (1, 'first', 1.5), (2, 'second', 2.5), (3, 'third', 3.5)")

# When CREATE TABLE AS SELECT is executed filtering val > 2.0
cursor.execute(f"CREATE TABLE {new_table} AS SELECT * FROM {source_table} WHERE val > 2.0")
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CREATE TABLE ... AS SELECT will fail on reruns if the table already exists in the session-scoped tmp_schema. Prefer CREATE OR REPLACE TABLE (or explicitly DROP TABLE IF EXISTS first) to keep the e2e test idempotent.

Suggested change
cursor.execute(f"CREATE TABLE {new_table} AS SELECT * FROM {source_table} WHERE val > 2.0")
cursor.execute(f"CREATE OR REPLACE TABLE {new_table} AS SELECT * FROM {source_table} WHERE val > 2.0")

Copilot uses AI. Check for mistakes.
And A table with columns (id INT NOT NULL, name VARCHAR(100), val NUMBER(10,2)) exists
When SHOW TABLES LIKE 'disc_show_describe' is executed
Then 1 row should be returned
When DESCRIBE TABLE e2e_discovery_test is executed
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The scenario creates a table named disc_show_describe, but the step text says DESCRIBE TABLE e2e_discovery_test which implies a different object. Update the step wording (and matching test comment) to reference the table that is actually created so the feature remains an accurate executable spec.

Suggested change
When DESCRIBE TABLE e2e_discovery_test is executed
When DESCRIBE TABLE disc_show_describe is executed

Copilot uses AI. Check for mistakes.
Comment on lines +30 to +31
Then SHOW TABLES should return 0 rows
And SHOW VIEWS should return 0 rows
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SHOW TABLES / SHOW VIEWS returning 0 rows is not deterministic in a shared schema (other tests may create objects). Consider scoping these steps to a specific object name (e.g., SHOW TABLES LIKE ... IN SCHEMA ...) to make the scenario reliably true and aligned with the Python implementation.

Suggested change
Then SHOW TABLES should return 0 rows
And SHOW VIEWS should return 0 rows
Then SHOW TABLES LIKE 'disc_show_describe' should return 0 rows
And SHOW VIEWS LIKE 'disc_view_show' should return 0 rows

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor

@sfc-gh-turbaszek sfc-gh-turbaszek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Imo we need to properly define the aim of those tests. Currently majority of them is testing Snowflake behavior not driver. In my opinion we should focus on scenarios where driver does anything with the result. For example the MERGE statements where rowcount / rows affected are crucial for processing. We should look for similar scenarios where numDmlDuplicates is used.

Comment on lines +28 to +29
csv_path = tmp_path / "data.csv"
csv_path.write_text("1,Alice,3.14\n2,Bob,2.72\n3,Charlie,1.41\n")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we create test file? I think this is a pattern we use for other tests?

# Then LS @copy_test_stage should show 1 file
cursor.execute(f"LS @{stage_name}")
ls_results = cursor.fetchall()
assert len(ls_results) == 1
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's assert this is the expected file

Comment on lines +49 to +57
assert rows[0][0] == 1
assert rows[0][1] == "Alice"
assert rows[0][2] == pytest.approx(3.14)
assert rows[1][0] == 2
assert rows[1][1] == "Bob"
assert rows[1][2] == pytest.approx(2.72)
assert rows[2][0] == 3
assert rows[2][1] == "Charlie"
assert rows[2][2] == pytest.approx(1.41)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would opt for assert rows == [...] to see what changes in whole output.

def test_should_create_view_and_query_filtered_data(self, execute_query, cursor, tmp_schema):
"""Test CREATE VIEW and query filtered data."""
# Given Snowflake client is logged in
assert_connection_is_open(execute_query)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it necccessary? I think @sfc-gh-fpawlowski was getting rid of it

cursor.execute(f"INSERT INTO {source_table} VALUES (1, 'first', 1.0), (2, 'second', 2.0), (3, 'third', 3.0)")

# When A view is created that filters rows where id > 1
cursor.execute(f"CREATE OR REPLACE VIEW {view_name} AS SELECT * FROM {source_table} WHERE id > 1")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is special about view? What CUJ we check here?

)

# When SHOW TABLES LIKE 'meta_show' is executed
cursor.execute(f"SHOW TABLES LIKE 'meta_show' IN SCHEMA {tmp_schema}")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assert len(results) == 1, f"Expected 1 table, got {len(results)}"
# The table name appears in the result (typically in column index 1)
row = results[0]
assert "META_SHOW" in str(row).upper(), f"Table name not found in result: {row}"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we assert proper schema of the result?

class TestObjectDiscovery:
"""Tests for object discovery operations."""

def test_should_discover_table_via_show_and_describe(self, execute_query, cursor, tmp_schema):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how this differs from TestMetadataCatalog?

# Then 1 row should be returned
assert len(show_results) == 1, f"Expected 1 view, got {len(show_results)}"

def test_should_verify_objects_are_gone_after_drop(self, execute_query, cursor, tmp_schema):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is testing Snowflake not driver.

Comment on lines +28 to +31
# Then SELECT CURRENT_ROLE() should return "PUBLIC"
cursor.execute("SELECT CURRENT_ROLE()")
result = cursor.fetchone()
assert result[0] == "PUBLIC"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do we know the role has really changed?

@sfc-gh-pczajka
Copy link
Copy Markdown
Contributor Author

Closing as outdated

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants