Open

test #21

12 changes: 11 additions & 1 deletion .github/workflows/all_test.yml
@@ -65,10 +65,19 @@ jobs:
 # Scene selection:
 # - ci_top_attention_doc_page_build validates doc build through the prebuilt Docker image.
 # - ci_top_attention_bin_kvtest keeps the Rust kv_test entry under the testbed scene contract.
+# - ci_top_attention_config_* and ci_top_attention_ctrl_c_* reuse the same ops testbed/test-stack CI chain.
 suite["scenes"] = {
     key: value
     for key, value in suite["scenes"].items()
-    if key in ("ci_top_attention_doc_page_build", "ci_top_attention_bin_kvtest")
+    if key in (
+        "ci_top_attention_doc_page_build",
+        "ci_top_attention_bin_kvtest",
+        "ci_top_attention_config_kv",
+        "ci_top_attention_config_fs",
+        "ci_top_attention_config_mq",
+        "ci_top_attention_ctrl_c_kv",
+        "ci_top_attention_ctrl_c_mq",
+    )
 }

 # Profile selection:
@@ -91,6 +100,7 @@ jobs:
 # - Keep the original per-scene scales from ci_test_list.yaml.
 # - ci_top_attention_doc_page_build stays on n1_kvowner_dram_3gib.
 # - ci_top_attention_bin_kvtest stays on n1_kvowner_dram_20gib.
+# - Config/Ctrl-C wrappers stay on n1_kvowner_dram_3gib.

 out_path.write_text(
     yaml.safe_dump(suite, sort_keys=False, allow_unicode=False),
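
The scene filter in this workflow is a plain allow-list over `suite["scenes"]`. As a minimal standalone sketch of the same idea: `filter_scenes` and `CI_SCENE_ALLOW_LIST` are hypothetical names that do not appear in the PR, and the missing-id check is an added assumption, since the workflow as shown silently skips ids that are absent from the suite.

```python
from typing import Any, Dict, Iterable

# Scene ids copied from the allow-list in the workflow hunk.
CI_SCENE_ALLOW_LIST = (
    "ci_top_attention_doc_page_build",
    "ci_top_attention_bin_kvtest",
    "ci_top_attention_config_kv",
    "ci_top_attention_config_fs",
    "ci_top_attention_config_mq",
    "ci_top_attention_ctrl_c_kv",
    "ci_top_attention_ctrl_c_mq",
)


def filter_scenes(suite: Dict[str, Any], allowed: Iterable[str]) -> Dict[str, Any]:
    """Hypothetical helper: keep only allow-listed scenes, preserving suite order."""
    allowed_ids = tuple(allowed)
    scenes = suite["scenes"]
    # Assumption (not in the workflow): fail early when an allow-listed id is absent.
    missing = [scene_id for scene_id in allowed_ids if scene_id not in scenes]
    if missing:
        raise KeyError(f"allow-listed scenes missing from suite: {missing}")
    filtered = dict(suite)  # shallow copy so the caller's suite is left untouched
    filtered["scenes"] = {k: v for k, v in scenes.items() if k in allowed_ids}
    return filtered
```

Failing on a missing id would catch a typo in the allow-list or a scene renamed in `ci_test_list.yaml`; whether that strictness is wanted in this workflow is a judgment call for the author.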
23 changes: 14 additions & 9 deletions fluxon_doc_cn/design/teststack_1_当前架构与CI测试流程.md
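@@ -247,7 +247,7 @@ There are two broad categories of scenes in the suite:
 - This is the first-level boundary of the case space.
 2. `scene kind`
 - Decides whether the current scene takes the `CI` or the `TEST_STACK` branch;
-- Also decides which fields must exist downstream, for example `scene.ci.runtime_contract` or `scene.test_stack.mode`.
+- Also decides which fields must exist downstream, for example `scene.ci.requirements` or `scene.test_stack.mode`.
 3. `scale`
 - Provides the "scale and layout" constraints such as topology, targets, owner, and benchmark;
 - These constraints do not pick a runtime template themselves, but they limit which runtime templates can materialize successfully.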
@@ -292,7 +292,7 @@ There are two broad categories of scenes in the suite:

 - `CI`
 - The profile selected by the scene must have `runtime.ci`;
-- `scene.ci.runtime_contract` must be found in that profile's `runtime.ci.runtime_contracts`;
+- Every requirement declared in `scene.ci.requirements` must be found in that profile's `runtime.ci.requirements`;
 - The selected scale must at least satisfy the basic topology fields that `CI` needs.
 - `TEST_STACK`
 - The profile selected by the scene must have `runtime.test_stack`;
@@ -308,7 +308,7 @@ There are two broad categories of scenes in the suite:
 - `scene.select.profiles` references a `profile` that does not exist.
 - `profile.artifact_set` points to an `artifact_set` that does not exist.
 - A `CI` scene selects a profile without `runtime.ci`.
-- `scene.ci.runtime_contract` does not exist in the selected profile's `runtime.ci.runtime_contracts`
+- A requirement in `scene.ci.requirements` is not defined in the selected profile's `runtime.ci.requirements`
 - A `TEST_STACK` scene selects a scale without a benchmark block.
 - A target derived from the `TEST_STACK` role plan is not in the profile's `deploy.target_ip_map`.
 - `run.selectors.profile_ids` or `case_ids` select nothing from the compiled case set.
@@ -348,20 +348,25 @@ deploy.instances is not hard-coded in the suite. The Runner combines scale, profile, and
 A `CI` scene adds these fields on top of the generic compilation model:

 - `scene.ci.subject`
-- `scene.ci.runtime_contract`
+- `scene.ci.requirements`
 - `scene.ci.prepare`
 - `scene.ci.commands`

 Where:

 - `prepare` is the environment preparation that runs before a CI case;
 - `commands` is a compiled-output field of the resolved case, not a suite input field; `test_runner.py` generates it through a finite set of `scene_id` branches for `ci_runner` to execute in order;
-- `runtime_contract` decides which runtime template is picked from the profile
+- `requirements` explicitly declares which base services, case runtime instances, and extra runner behaviors this scene needs

-Two runtime contracts already exist
+Requirements currently form a finite enumerated set, for example

-- `cluster_kv_owner`
-- `rust_self_managed`
+- `testbed_etcd`
+- `testbed_greptime`
+- `master`
+- `owner_0`
+- `ci_runner`
+- `owner_shared_bundle`
+- `fluxon_kv_readiness_probe`

 `_compile_ci_case()` works from:

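@@ -374,7 +379,7 @@ deploy.instances is not hard-coded in the suite. The Runner combines scale, profile, and
 Stable facts specific to `CI`:

 - `CI_CASE_RUNTIME_INSTANCE_IDS = ("master", "owner_0", "ci_runner")`
-- Whether these three instances are ultimately included depends on whether the runtime contract template declares them
+- Whether these instances are ultimately included depends on whether the scene requirements declare them explicitly and the profile requirement configs provide them
 - `resolved_case` additionally pins CI metadata such as `command_id` and `test_id`;
 - The generation order is stable, and later phase planning depends on it.
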
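
The updated doc states the new validation rule in prose: every requirement a `CI` scene declares must be found in the selected profile's `runtime.ci.requirements`. A minimal sketch of that rule as written follows. `missing_requirements` is a hypothetical name, the dictionaries are made-up inputs rather than the real suite, and the actual check in `test_runner.py` is not shown in this PR and may differ (for example in how base services such as `testbed_etcd` are resolved).

```python
from typing import Any, Dict, List


def missing_requirements(scene: Dict[str, Any], profile: Dict[str, Any]) -> List[str]:
    """Hypothetical check: list scene requirements the profile does not define."""
    ci_runtime = profile.get("runtime", {}).get("ci")
    if not isinstance(ci_runtime, dict):
        # Mirrors the documented failure: a CI scene selected a profile without runtime.ci.
        raise ValueError("CI scene selected a profile without runtime.ci")
    declared = scene["ci"]["requirements"]
    provided = ci_runtime.get("requirements", {})
    # Keep declaration order so error messages are stable.
    return [name for name in declared if name not in provided]


# Illustrative inputs only; these are not copied from ci_test_list.yaml.
example_scene = {"ci": {"requirements": ["ci_runner", "master", "testbed_etcd"]}}
example_profile = {"runtime": {"ci": {"requirements": {"ci_runner": {}, "master": {}}}}}
unresolved = missing_requirements(example_scene, example_profile)
```

With these inputs `unresolved` holds the single name `testbed_etcd`, which under the documented rule would make the case fail to compile.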
36 changes: 8 additions & 28 deletions fluxon_test_stack/ci_2_virt_node.py
@@ -39,8 +39,6 @@
 LOCAL_SECONDARY_NODE_SUFFIX = "b"
 TEST_STACK_START_TEST_BED_CONFIG_ENV = "FLUXON_TEST_STACK_START_TEST_BED_CONFIG"
 PLACEHOLDER_WHEEL_NAME = "fluxon-0.0.0-ci-placeholder-cp38-abi3-manylinux_2_28_x86_64.whl"
-SAME_HOST_LOCAL_MULTI_NODE_ETCD_CLIENT_PORT_OFFSET = 100
-SAME_HOST_LOCAL_MULTI_NODE_GREPTIME_PORT_OFFSET = 110


 def _parse_args() -> argparse.Namespace:
@@ -404,10 +402,6 @@ def _rewrite_suite_for_local_dual_nodes(
     runtime = generated_profile.get("runtime")
     if not isinstance(runtime, dict):
         raise ValueError("generated public profile runtime must be a mapping")
-    ci_base_runtime_host_ports = {
-        "etcd": int(controller_port) + SAME_HOST_LOCAL_MULTI_NODE_ETCD_CLIENT_PORT_OFFSET,
-        "greptime": int(controller_port) + SAME_HOST_LOCAL_MULTI_NODE_GREPTIME_PORT_OFFSET,
-    }
     ci_runtime = runtime.get("ci")
     if not isinstance(ci_runtime, dict):
         raise ValueError("generated public profile must define runtime.ci")
@@ -435,28 +429,14 @@ def _rewrite_suite_for_local_dual_nodes(
             secondary_node_name: host_ip,
         }
         if runtime_key == "ci":
-            runtime_contracts = runtime_block.get("runtime_contracts")
-            if not isinstance(runtime_contracts, dict):
-                raise ValueError("generated public profile runtime.ci.runtime_contracts must be a mapping")
-            for contract in runtime_contracts.values():
-                if not isinstance(contract, dict):
-                    continue
-                base_runtime = contract.get("base_runtime")
-                if isinstance(base_runtime, dict):
-                    for svc_name in ("etcd", "greptime"):
-                        svc_cfg = base_runtime.get(svc_name)
-                        if isinstance(svc_cfg, dict):
-                            svc_cfg["target"] = primary_node_name
-                            endpoint_cfg = svc_cfg.get("endpoint")
-                            if isinstance(endpoint_cfg, dict):
-                                endpoint_cfg["host_port"] = int(ci_base_runtime_host_ports[svc_name])
-                case_runtime = contract.get("case_runtime")
-                if isinstance(case_runtime, dict):
-                    master_cfg = case_runtime.get("master")
-                    if isinstance(master_cfg, dict):
-                        deployer_cfg = master_cfg.get("deployer")
-                        if isinstance(deployer_cfg, dict):
-                            deployer_cfg["target"] = primary_node_name
+            requirement_configs = runtime_block.get("requirements")
+            if not isinstance(requirement_configs, dict):
+                raise ValueError("generated public profile runtime.ci.requirements must be a mapping")
+            master_cfg = requirement_configs.get("master")
+            if isinstance(master_cfg, dict):
+                deployer_cfg = master_cfg.get("deployer")
+                if isinstance(deployer_cfg, dict):
+                    deployer_cfg["target"] = primary_node_name
         if runtime_key == "test_stack":
             deploy_templates = runtime_block.get("deploy_templates")
             if isinstance(deploy_templates, dict):
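
After this change the dual-node rewrite only retargets the `master` deployer; the etcd/greptime target and `host_port` rewrites are gone along with the port-offset constants. A standalone sketch of the new branch, useful for checking that behavior in isolation, is below. `retarget_master` is a hypothetical wrapper name; the body follows the added lines of the hunk, and the sample dictionary is illustrative only.

```python
from typing import Any, Dict


def retarget_master(ci_runtime: Dict[str, Any], primary_node_name: str) -> None:
    """Hypothetical wrapper around the new `runtime_key == "ci"` branch."""
    requirement_configs = ci_runtime.get("requirements")
    if not isinstance(requirement_configs, dict):
        raise ValueError("generated public profile runtime.ci.requirements must be a mapping")
    master_cfg = requirement_configs.get("master")
    if isinstance(master_cfg, dict):
        deployer_cfg = master_cfg.get("deployer")
        if isinstance(deployer_cfg, dict):
            # Only the master deployer is pinned to the primary node.
            deployer_cfg["target"] = primary_node_name


# Illustrative input shaped like the `requirements` block in ci_test_list.yaml.
sample_ci_runtime = {
    "requirements": {
        "master": {"deployer": {"target": "infra44-ThinkStation-PX"}},
        "owner_0": {"deployer": {"target": "__TARGET__"}},
    }
}
retarget_master(sample_ci_runtime, "node_a")
```

In this sketch `owner_0` keeps its `__TARGET__` placeholder, which matches the hunk: nothing other than `master` is touched in the `ci` branch.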
166 changes: 95 additions & 71 deletions fluxon_test_stack/ci_test_list.yaml
@@ -1,4 +1,4 @@
-schema_version: 9
+schema_version: 10

 run:
 mode: full_once
@@ -11,8 +11,7 @@ run:
 scenes:
 ci_top_attention_doc_page_build:
 ci:
-subject: doc_page
-runtime_contract: rust_self_managed
+requirements: [ci_runner]
 prepare:
 - kind: online_docker_image
 image_ref: hanbaoaaa/fluxon-doc-site-builder:quartz-v5.0.0-node-v24.16.0
@@ -23,12 +22,46 @@ scenes:

 ci_top_attention_bin_kvtest:
 ci:
-subject: rust
-runtime_contract: rust_self_managed
+requirements: [ci_runner, testbed_etcd, testbed_greptime]
 select:
 scales: [n1_kvowner_dram_20gib]
 profiles: [fluxon_tcp]

+ci_top_attention_config_kv:
+ci:
+requirements: [ci_runner]
+select:
+scales: [n1_kvowner_dram_3gib]
+profiles: [fluxon_tcp]
+
+ci_top_attention_config_fs:
+ci:
+requirements: [ci_runner]
+select:
+scales: [n1_kvowner_dram_3gib]
+profiles: [fluxon_tcp]
+
+ci_top_attention_config_mq:
+ci:
+requirements: [ci_runner, master, owner_0, testbed_etcd, testbed_greptime]
+select:
+scales: [n1_kvowner_dram_3gib]
+profiles: [fluxon_tcp]
+
+ci_top_attention_ctrl_c_kv:
+ci:
+requirements: [ci_runner]
+select:
+scales: [n1_kvowner_dram_3gib]
+profiles: [fluxon_tcp]
+
+ci_top_attention_ctrl_c_mq:
+ci:
+requirements: [ci_runner, testbed_etcd, testbed_greptime]
+select:
+scales: [n1_kvowner_dram_3gib]
+profiles: [fluxon_tcp]
+
 kv_read_heavy_zipf:
 test_stack:
 mode: KVSTORE
@@ -315,72 +348,48 @@ profiles:
 doc_site_base_url: example.com
 ci_top_attention_bin_kvtest:
 kv_test_rounds: all
-runtime_contracts:
-cluster_kv_owner: &cluster_kv_owner_runtime
-base_runtime:
-etcd:
-target: infra44-ThinkStation-PX
-endpoint:
-scheme: HTTP
-host_port: 32579
-greptime:
-target: infra44-ThinkStation-PX
-endpoint:
-scheme: HTTP
-host_port: 34030
-case_runtime:
-master:
-lifecycle: service
-k8s_ref: deployment/master
-deployer:
-target: infra44-ThinkStation-PX
-command: [/bin/bash, -lc]
-args:
-- |
-set -euo pipefail
-cd __RUN_DIR__/src
-exec __RUN_DIR__/venv/bin/python3 -m fluxon_py.runtime.start_master \
--c __RUN_DIR__/configs/ci_master.yaml \
--w __RUN_DIR__/services/master
-owner_0:
-lifecycle: service
-k8s_ref: deployment/owner_0
-deployer:
-target: __TARGET__
-command: [/bin/bash, -lc]
-args:
-- |
-set -euo pipefail
-cd __RUN_DIR__/src
-exec __RUN_DIR__/venv/bin/python3 -m fluxon_py.runtime.start_owner_kvclient \
--c __RUN_DIR__/configs/ci_owner_0.yaml \
--w __RUN_DIR__/services/owner_0
-ci_runner: &common_ci_runner_instance
-lifecycle: job
-k8s_ref: deployment/ci_runner
-deployer:
-target: __TARGET__
-command: [/bin/bash, -lc]
-args:
-- |
-set -uo pipefail
-exec bash __RUN_DIR__/ci_runner.sh
-rust_self_managed:
-base_runtime:
-etcd:
-target: infra44-ThinkStation-PX
-endpoint:
-scheme: HTTP
-host_port: 32579
-greptime:
-target: infra44-ThinkStation-PX
-endpoint:
-scheme: HTTP
-host_port: 34030
-case_runtime:
-ci_runner:
-<<: *common_ci_runner_instance
-
+ci_top_attention_config_kv: {}
+ci_top_attention_config_fs: {}
+ci_top_attention_config_mq: {}
+ci_top_attention_ctrl_c_kv: {}
+ci_top_attention_ctrl_c_mq: {}
+requirements:
+master:
+lifecycle: service
+k8s_ref: deployment/master
+deployer:
+target: infra44-ThinkStation-PX
+command: [/bin/bash, -lc]
+args:
+- |
+set -euo pipefail
+cd __RUN_DIR__/src
+exec __RUN_DIR__/venv/bin/python3 -m fluxon_py.runtime.start_master \
+-c __RUN_DIR__/configs/ci_master.yaml \
+-w __RUN_DIR__/services/master
+owner_0:
+lifecycle: service
+k8s_ref: deployment/owner_0
+deployer:
+target: __TARGET__
+command: [/bin/bash, -lc]
+args:
+- |
+set -euo pipefail
+cd __RUN_DIR__/src
+exec __RUN_DIR__/venv/bin/python3 -m fluxon_py.runtime.start_owner_kvclient \
+-c __RUN_DIR__/configs/ci_owner_0.yaml \
+-w __RUN_DIR__/services/owner_0
+ci_runner: &common_ci_runner_instance
+lifecycle: job
+k8s_ref: deployment/ci_runner
+deployer:
+target: __TARGET__
+command: [/bin/bash, -lc]
+args:
+- |
+set -uo pipefail
+exec bash __RUN_DIR__/ci_runner.sh
 test_stack: &common_test_stack_runtime
 kind: FLUXON
 # English note:
@@ -460,6 +469,11 @@ profiles:
 doc_site_base_url: example.com
 ci_top_attention_bin_kvtest:
 kv_test_rounds: all
+ci_top_attention_config_kv: {}
+ci_top_attention_config_fs: {}
+ci_top_attention_config_mq: {}
+ci_top_attention_ctrl_c_kv: {}
+ci_top_attention_ctrl_c_mq: {}
 test_stack:
 <<: *common_test_stack_runtime
 fluxon_sockudo_ws:
@@ -472,6 +486,11 @@ profiles:
 doc_site_base_url: example.com
 ci_top_attention_bin_kvtest:
 kv_test_rounds: all
+ci_top_attention_config_kv: {}
+ci_top_attention_config_fs: {}
+ci_top_attention_config_mq: {}
+ci_top_attention_ctrl_c_kv: {}
+ci_top_attention_ctrl_c_mq: {}
 test_stack:
 <<: *common_test_stack_runtime
 fluxon_tcp:
@@ -484,6 +503,11 @@ profiles:
 doc_site_base_url: example.com
 ci_top_attention_bin_kvtest:
 kv_test_rounds: all
+ci_top_attention_config_kv: {}
+ci_top_attention_config_fs: {}
+ci_top_attention_config_mq: {}
+ci_top_attention_ctrl_c_kv: {}
+ci_top_attention_ctrl_c_mq: {}
 test_stack:
 <<: *common_test_stack_runtime
 redis_sharded: