Pipe: Retry history LoadTsFile while waiting for schema #18031

Merged
jt2594838 merged 3 commits into apache:master from Caideyipi:fix/pipe-load-tsfile-schema-autocreate
Jun 26, 2026

@Caideyipi Caideyipi commented Jun 25, 2026

Collaborator

Problem

When a full pipe is auto-split, historical data TsFiles can be delivered by the history/data side before the schema metadata side finishes replaying the current schema. If the receiver has enable_auto_create_schema=false, LoadTsFile schema verification may see a missing device or measurement and fail the history TsFile with LOAD_FILE_ERROR.

That failure is too strong for Pipe: in inclusion=all, the missing schema can still arrive from the schema stream. Treating the file as a permanent load failure can drop the history TsFile and produce partial data loss on the receiver.

Behavior after this change

This patch does not bypass the receiver's enable_auto_create_schema=false setting and does not force Pipe-generated LoadTsFile to auto-create database or timeseries schema from the TsFile.

Instead, for Pipe-generated LoadTsFile only, the receiver now classifies the failure as temporary when all of these are true:

  • schema verification is enabled;
  • receiver auto schema creation is disabled;
  • the failure chain indicates a missing device or measurement that cannot be auto-created.

In that case LoadTsFile returns LOAD_TEMPORARY_UNAVAILABLE_EXCEPTION. The Pipe statement status visitor maps it to PIPE_RECEIVER_TEMPORARY_UNAVAILABLE_EXCEPTION, so the sender keeps retrying instead of treating the history TsFile as a permanent failed file.
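
The three conditions above can be sketched as a small classifier. This is only an illustration under stated assumptions: `LoadFailureClassifierSketch`, `MissingSchemaException`, `classify`, and the boolean parameters are hypothetical names invented for this example and are not taken from the IoTDB code base; only the two status names come from this description.

```java
/** Hypothetical sketch; class, method, and parameter names are illustrative, not IoTDB's. */
final class LoadFailureClassifierSketch {

  /** Only the two outcomes discussed in this PR description. */
  enum LoadStatus {
    LOAD_FILE_ERROR,
    LOAD_TEMPORARY_UNAVAILABLE_EXCEPTION
  }

  /** Stand-in for a dedicated "missing device or measurement" exception type. */
  static final class MissingSchemaException extends RuntimeException {
    MissingSchemaException(String message) {
      super(message);
    }
  }

  /** Walks the cause chain (bounded, so a cyclic chain cannot loop forever). */
  static boolean hasMissingSchemaCause(Throwable failure) {
    Throwable current = failure;
    for (int depth = 0; current != null && depth < 32; depth++) {
      if (current instanceof MissingSchemaException) {
        return true;
      }
      current = current.getCause();
    }
    return false;
  }

  /** Temporary only for Pipe loads when all three listed conditions hold. */
  static LoadStatus classify(
      boolean generatedByPipe,
      boolean schemaVerificationEnabled,
      boolean autoCreateSchemaEnabled,
      Throwable failure) {
    boolean waitingForSchema =
        generatedByPipe
            && schemaVerificationEnabled
            && !autoCreateSchemaEnabled
            && hasMissingSchemaCause(failure);
    return waitingForSchema
        ? LoadStatus.LOAD_TEMPORARY_UNAVAILABLE_EXCEPTION
        : LoadStatus.LOAD_FILE_ERROR;
  }
}
```

In this sketch a non-Pipe load, or a failure whose cause chain carries no missing-schema marker, falls through to `LOAD_FILE_ERROR`, which mirrors the "existing permanent failure handling" described in this PR.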

Other LoadTsFile failures, such as corrupted files, type conflicts, non-Pipe loads, and errors without the missing-schema signal, still follow the existing permanent failure handling.

Why this is better for Pipe

This preserves the semantic contract of inclusion=all: schema should be transferred by the metadata stream, even if it arrives after a historical data TsFile. The data side waits for that schema instead of creating receiver schema from historical TsFile metadata.

It also avoids resurrecting schema that was deleted before pipe creation. Historical TsFiles may still contain old metadata, while the schema snapshot represents the current source schema. Retrying until the schema side catches up is safer than overriding the receiver's auto-create setting and potentially recreating deleted timeseries.

For async active LoadTsFile, this also works with the existing active-load retry policy because ActiveLoadFailedMessageHandler already retries LOAD_TEMPORARY_UNAVAILABLE_EXCEPTION.
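
To illustrate why a temporary status is enough for the data side to wait, here is a minimal retry sketch. It is not the actual `ActiveLoadFailedMessageHandler` or Pipe sender logic: `RetryWhileSchemaPendingSketch`, `Status`, and `loadWithRetry` are hypothetical names, and a real implementation would presumably back off between attempts rather than retry immediately.

```java
import java.util.function.Supplier;

/** Hypothetical sketch of "retry while the receiver reports a temporary failure". */
final class RetryWhileSchemaPendingSketch {

  enum Status {
    SUCCESS,
    TEMPORARY_UNAVAILABLE,
    PERMANENT_FAILURE
  }

  /**
   * Calls the load attempt until it returns something other than TEMPORARY_UNAVAILABLE
   * or until maxAttempts is exhausted. Returns the last status observed.
   */
  static Status loadWithRetry(Supplier<Status> attempt, int maxAttempts) {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be at least 1");
    }
    Status last = Status.TEMPORARY_UNAVAILABLE;
    for (int i = 0; i < maxAttempts; i++) {
      last = attempt.get();
      if (last != Status.TEMPORARY_UNAVAILABLE) {
        return last; // success or permanent failure: stop retrying
      }
      // A real sender would sleep or reschedule here; omitted in this sketch.
    }
    return last;
  }
}
```

With a simulated receiver that reports `TEMPORARY_UNAVAILABLE` until the schema stream has caught up, the file is loaded on a later attempt instead of being dropped, while a `PERMANENT_FAILURE` is returned after the first attempt.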

Tests

  • Added unit coverage for Pipe LoadTsFile missing-schema temporary classification.
  • Added unit coverage for mapping LOAD_TEMPORARY_UNAVAILABLE_EXCEPTION to Pipe receiver temporary unavailable status.
  • Added an integration test for auto-split full Pipe history TsFile loading with receiver auto schema creation disabled.

Local checks run:

  • mvn -Ddevelocity.off=true -pl iotdb-core/datanode spotless:check
  • mvn spotless:apply -pl integration-test -P with-integration-tests
  • git diff --check

A targeted unit test run was attempted, but this local workspace currently fails while compiling the datanode main sources, before any tests run. The compilation errors are unrelated to this change: missing generated/parser symbols and unrelated Pipe constants.

@Caideyipi Caideyipi force-pushed the fix/pipe-load-tsfile-schema-autocreate branch from 09eff22 to 7a56499 Compare June 25, 2026 09:05
@Caideyipi Caideyipi changed the title from "Fix pipe history LoadTsFile schema creation" to "Pipe: Retry history LoadTsFile while waiting for schema" Jun 25, 2026
@Caideyipi Caideyipi force-pushed the fix/pipe-load-tsfile-schema-autocreate branch from 7a56499 to fdc6b14 Compare June 25, 2026 09:08
@Caideyipi Caideyipi force-pushed the fix/pipe-load-tsfile-schema-autocreate branch from fdc6b14 to dca8085 Compare June 25, 2026 09:20
Comment on lines 77 to 80
|| status.getCode() == TSStatusCode.LOAD_TEMPORARY_UNAVAILABLE_EXCEPTION.getStatusCode()
|| status.getCode() == TSStatusCode.LOAD_FILE_ERROR.getStatusCode()
&& status.getMessage() != null
&& status.getMessage().contains("memory")) {

Contributor

Will this message be internationalized (i18n) at some point?
If so, matching on it with contains() may become fragile.

Collaborator Author

Updated in 2b62bd9: LoadTsFile schema-wait detection now uses a dedicated LoadAnalyzeMissingSchemaException and LOAD_TEMPORARY_UNAVAILABLE_EXCEPTION, and the pipe status visitor no longer classifies by message text such as contains("memory").
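
A rough sketch of classifying by status code instead of message text, so that translated messages cannot change the outcome. The names `StatusMappingSketch`, `LoadStatusCode`, `PipeReceiverStatus`, and `PERMANENT_FAILURE` are hypothetical placeholders for this example; only `LOAD_FILE_ERROR`, `LOAD_TEMPORARY_UNAVAILABLE_EXCEPTION`, and `PIPE_RECEIVER_TEMPORARY_UNAVAILABLE_EXCEPTION` appear in this PR. Treating every `LOAD_FILE_ERROR` as permanent is an assumption of the sketch, not a statement about the merged code.

```java
/** Hypothetical sketch: the mapping depends on the status code only, never on the message. */
final class StatusMappingSketch {

  enum LoadStatusCode {
    SUCCESS,
    LOAD_FILE_ERROR,
    LOAD_TEMPORARY_UNAVAILABLE_EXCEPTION
  }

  enum PipeReceiverStatus {
    SUCCESS,
    PIPE_RECEIVER_TEMPORARY_UNAVAILABLE_EXCEPTION,
    PERMANENT_FAILURE // placeholder name for "existing permanent failure handling"
  }

  /** The message is accepted for logging purposes but deliberately not inspected. */
  static PipeReceiverStatus map(LoadStatusCode code, String message) {
    switch (code) {
      case SUCCESS:
        return PipeReceiverStatus.SUCCESS;
      case LOAD_TEMPORARY_UNAVAILABLE_EXCEPTION:
        return PipeReceiverStatus.PIPE_RECEIVER_TEMPORARY_UNAVAILABLE_EXCEPTION;
      default:
        return PipeReceiverStatus.PERMANENT_FAILURE;
    }
  }
}
```

Because `map` never reads `message`, an English, Chinese, or null message yields the same result for a given code.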

Comment on lines +86 to +89
private static final String MISSING_SCHEMA_MESSAGE =
"does not exist in IoTDB and can not be created";
private static final String AUTO_CREATE_SCHEMA_HINT_MESSAGE =
"Please check weather auto-create-schema is enabled";

Contributor

i18n

Collaborator Author

Updated in 2b62bd9: moved the missing device/measurement schema messages and the pipe schema-wait message into DataNodeQueryMessages for en/zh i18n.

@jt2594838 jt2594838 merged commit 1811e13 into apache:master Jun 26, 2026
40 checks passed
@jt2594838 jt2594838 deleted the fix/pipe-load-tsfile-schema-autocreate branch June 26, 2026 06:14
MileaRobertStefan pushed a commit to MileaRobertStefan/iotdb that referenced this pull request Jun 26, 2026
* Fix pipe history LoadTsFile schema retry

* Address load tsfile schema retry review

* Fix load tsfile tests with real temp files