[Feature] Strip ANSI codes from run logs and store them as plain text instead of bytes #2876
[Feature] Strip ANSI codes from run logs and store them as plain text instead of bytes #2876peterschmidt85 merged 9 commits intomasterfrom
Conversation
… instead of bytes #2841
| "time" | ||
|
|
||
| "github.com/creack/pty" | ||
| "github.com/dstackai/ansistrip" |
There was a problem hiding this comment.
Any benefit having this in a separate repo vs adding to dstack?
There was a problem hiding this comment.
ansistrip docs should mention what it does with non-ascii, non-utf-8 sequences.
runner/internal/executor/executor.go
Outdated
| select { | ||
| case <-ctx.Done(): | ||
| log.Error(ctx, "Job canceled") | ||
| stripper.Close() |
There was a problem hiding this comment.
Not seeing any benefit calling stripper.Close() from multiple places. Just close it with defer like other cleanups?
runner/internal/executor/executor.go
Outdated
| }() | ||
|
|
||
| logger := io.MultiWriter(runnerLogFile, os.Stdout, ex.runnerLogs) | ||
| stripper := ansistrip.NewWriter(ex.runnerLogs, AnsiStripFlushInterval, AnsiStripMaxDelay) |
There was a problem hiding this comment.
BTW, do we actually need stripper for runner logs?
There was a problem hiding this comment.
If we don't use utf-8, we would have to use base64. That's why it's much easier to reuse the same code for both job and runner logs.
| return { | ||
| "timestamp": runner_log_event.timestamp, | ||
| "message": b64encode_raw_message(runner_log_event.message), | ||
| "message": runner_log_event.message.decode(), |
There was a problem hiding this comment.
.decode() will raise an exception if bytes are invalid utf-8. If it's handled by the runner, please leave a comment since it's very non-obvious when the runner itself returns base64 encoded bytes instead of strings.
| yield base64.b64decode(log.message) | ||
| yield log.message.encode() |
There was a problem hiding this comment.
No need to return bytes anymore – the HTTP API change is backward incompatible anyway since it no longer returns base64-encoded bytes, so we may break the Python API just as well.
There was a problem hiding this comment.
I'd suggest that we change this in a separate PR where I want to refactor the method and have two separate methods for attached and detached logs.
|
@r4victor Just to confirm, this PR breaks backward compatibility. After upgrading, old logs won't be accessible. |
What happens to the old logs exactly? I don't see the code filtering them out, so users will see old logs in base64 in the UI / CLI? |
Fixes #2841
Out of the scope:
--startin UI, etc^ This will be done in a separate PR