Save all warnings encountered in a warnings file#4642
LecrisUT wants to merge 15 commits into teemtee:main
Conversation
Code Review
This pull request introduces a feature to save all encountered warnings into a warnings.yaml file. However, it introduces several critical security and stability risks. A critical initialization order flaw in the Run class and a missing import in tmt/log.py will cause the application to crash immediately. Furthermore, an unsafe assert statement in the logging path poses a Denial of Service risk, and the lack of synchronization when writing to the warnings file introduces a race condition that could lead to data loss. The current implementation also suffers from inefficiency due to frequent file I/O.
My two cents no one asked for: have you considered using a logging handler? The handler could be attached to the run logger, like a logfile is attached to the root logger.
Ah, you mean to reuse that class as the file handler? Yeah, sounds reasonable. Right now I am at the stage of checking whether the run logger is propagated consistently.

For this part I need more eyes on the code to figure out what you mean here.
Well, not really reuse the class we have, but adding a new one, as the format of the file would be different, plus the filtering is needed. But following the same concept of using a logging handler for this job, yes. The separation of "what to log" and "where it should go". We may decide to send all warnings and errors to a Sentry instance; for that we would add another handler, with filtering and a Sentry API client, and attach it to the root logger, rather than touching the existing handlers.
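The "what to log" vs "where it should go" separation can be sketched with stock `logging` primitives. A sketch only: the class names and logger name below are illustrative, not tmt's actual API.

```python
import logging

class WarningsOnly(logging.Filter):
    """'What to log': accept only WARNING-and-above records."""
    def filter(self, record: logging.LogRecord) -> bool:
        return record.levelno >= logging.WARNING

class Collect(logging.Handler):
    """'Where it should go': here just an in-memory list, but it could be
    a warnings file or a Sentry client instead."""
    def __init__(self) -> None:
        super().__init__()
        self.records: list[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.records.append(record.getMessage())

handler = Collect()
handler.addFilter(WarningsOnly())

logger = logging.getLogger("demo")
logger.setLevel(logging.DEBUG)
logger.addHandler(handler)

logger.info("ignored")    # filtered out by WarningsOnly
logger.warning("kept")    # passes the filter
print(handler.records)    # ['kept']
```

Swapping the destination (file, Sentry, console) then only means attaching a different handler; the filtering policy stays an independent piece.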
Nothing exceptional there, just loggers sending their records upward to their parent, usually with the `propagate` flag enabled. In your scenario you would introduce a new handler; I am just not sure whether it should be attached to the root logger or to the run-level one. Probably the run one, but it may miss some warnings emitted before the workdir is created, for example. On the other hand, where would such a warning be saved by the handler, when there's no workdir yet to host the warnings file?
Is there a reason we cannot move the run workdir creation as early as needed?
For the case where we know it should not have a workdir?
Isn't this something we do not have right now? A logger with multiple sub-loggers/drivers? The idea is sound, but how do we do the separation inside the `Logger` class?
Here you lost me again, because we do not have a design that bubbles the logging up, right? I would need a sketch to figure out the refactoring idea there.
Just the complexity and technical debt. I have some ideas about how workdirs should be handled in the code, but it's just so painful to start something, as it takes ages to get it in, and touching complex topics like this spells a hell of a rebase along the road. Some commands have no need for it at all.
Ideally once in a run context, we should have a workdir and a place to write warnings into.
```python
# in Logger.descend():
actual_logger = self._normalize_logger(self._logger.getChild(logger_name))
```

Newly created loggers are normalized:

```python
@staticmethod
def _normalize_logger(logger: logging.Logger) -> logging.Logger:
    """
    Reset properties of a given :py:class:`logging.Logger` instance
    """
    logger.propagate = True
    logger.level = logging.DEBUG
    logger.handlers = []
    return logger
```

Every logger is set to propagate its records to its parent, have the lowest loglevel possible, and have no handlers; the root one gets handlers attached later.
We do :)
There is a big complication with the order in which these are created.
This thing is puzzling me though: if we are nuking the handlers, how do the records still reach the logfile and console?
Attaching a warning-file handler to the run-level logger should work like this: all records travel to the root logger, and they must pass through the run-level logger, which would send them to its warning-file handler as well as to its parent, ending up in the logfile and on the console.
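That propagation path can be demonstrated with plain `logging`; the logger names below are illustrative, not tmt's actual hierarchy:

```python
import logging

class Collect(logging.Handler):
    """Collect record messages in memory (stand-in for a warning-file handler)."""
    def __init__(self) -> None:
        super().__init__()
        self.messages: list[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(record.getMessage())

root_handler = Collect()   # stand-in for logfile/console handlers
run_handler = Collect()    # stand-in for the warning-file handler

logging.getLogger().addHandler(root_handler)
logging.getLogger("run").addHandler(run_handler)

# A record from a deeply nested child passes through "run" on its way to
# the root, so BOTH handlers see it.
logging.getLogger("run.provision.guest").warning("boom")

print(run_handler.messages)   # ['boom']
print(root_handler.messages)  # ['boom']
```

This works because `Logger.callHandlers` walks up the logger hierarchy, invoking the handlers of every ancestor while `propagate` stays true, so a mid-level handler does not steal records from the root.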
Ok, that explains things. There are 2 options here then:
I believe the first option is the right way to go: once the workdir is created, the handler can be attached.
But that would miss the warnings emitted between the click code and the creation of the workdir. The only other thing is that it breaks the current state where handlers live only at the top level, and it could open the gate for having some plugin-specific handlers. This would not be a negative IMO, just a new thing to be aware of.
Correct, there would be this gap, a gap before the "run context" begins. For the first version I, for one, would find it acceptable. To cover this gap, we could think about commands where no run context exists at all.
Yes. So far we had no use for them; the "save warnings that happened during run" use case calls for the change. I think it's the right approach, and if we discover there's a use case for plugin-specific handlers, we can support them as well.
Force-pushed from 10a4187 to 4c7e678
Force-pushed from 4c7e678 to 0a17aa8
skycastlelily left a comment:

LGTM in general, you may want to extend the test :)
@LecrisUT Can you update the MR description? Even the issue is so brief that it is hard to understand the motivation behind this change.
Force-pushed from 0db1b9a to fa7b74f
thrix left a comment:
Well-designed implementation. The append-only YAML approach is clever and avoids race conditions. The stacklevel threading through the logging chain is correctly implemented (verified by tracing the math). Previously flagged issues (source/reason swap, is vs ==, workdir init timing) are all resolved.
Two actionable findings:

- `to_dict()` → `to_minimal_dict()`: avoids noisy `source: null` / `reason: null` in every warning entry (verified: `to_minimal_dict()` omits `None` values)
- Empty warnings file fails schema validation: `touch()` creates an empty file, which parses as `None`, not `[]`, violating the `type: array` schema (verified experimentally with ruamel.yaml)
Try to provide a more reasonable `stacklevel` to the inner `logging.Logger._log`, skipping any functions that are wrappers around the loggers. The references in the logger's record should point to the most relevant function that requested the log command, or any such indirect commands like the `guest.run`.

Found that this would be useful in #4642 for getting a more accurate source of the caller, and I split it out from there to simplify the review. The stack can be further improved as it goes, as shown in the second commit of this PR, but that can be gradually addressed.

Signed-off-by: Cristian Le <git@lecris.dev>
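The `stacklevel` idea can be sketched with the standard library alone; `warn` and `do_work` below are hypothetical names, not tmt's API:

```python
import logging

logging.basicConfig(format="%(funcName)s: %(message)s", level=logging.INFO)
_logger = logging.getLogger(__name__)

def warn(message: str) -> None:
    # stacklevel=2 skips this wrapper frame, so %(funcName)s (and pathname,
    # lineno) in the record resolve to the function that called warn(),
    # not warn() itself.
    _logger.warning(message, stacklevel=2)

def do_work() -> None:
    warn("something looks off")

do_work()  # record's funcName is "do_work", not "warn"
```

Each additional wrapper layer in the call chain needs the `stacklevel` bumped by one, which is why the value has to be threaded through the whole logging chain.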
Force-pushed from 212c9bc to 2dca253
I am stumped with the failure in the CI run.
@LecrisUT please set the "Size" of this PR.
thrix left a comment:
All review items addressed:

- `to_minimal_dict()` adopted for cleaner YAML output
- `delay=True` adopted to avoid the empty-file schema issue (212c9bc)
- All previously flagged issues (source/reason swap, `is` vs `==`, workdir init) resolved

Clean implementation with correct `stacklevel` threading.
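The `delay=True` fix relies on documented `logging.FileHandler` behavior: with `delay=True` the file is not opened (and therefore not created) until the first record is emitted, so no empty file is left behind when nothing was logged. A minimal sketch with an illustrative path:

```python
import logging
import tempfile
from pathlib import Path

workdir = Path(tempfile.mkdtemp())          # stand-in for the run workdir
warnings_path = workdir / "warnings.yaml"

handler = logging.FileHandler(warnings_path, delay=True)
logger = logging.getLogger("run")
logger.addHandler(handler)

print(warnings_path.exists())  # False: delay=True defers file creation

logger.warning("something happened")
handler.flush()

print(warnings_path.exists())  # True: created on first emitted record
```

If no warning is ever emitted, the file never exists, so the `type: array` schema is never confronted with an empty document.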
Force-pushed from a6ac43f to 67a9210
Added a new `RunWarningsHandler` that is populated as soon as the run workdir is available. All warnings are then saved under the `warnings.yaml` file. Extended the `logger.warning` interface to be able to specify a `reason` and a `source` that would be added to this warnings file.

Pull Request Checklist

- extend the test coverage: deferring for now
- mention the version: it should be part of the story schema

Depends-on #4656
Fixes #4618