storage: new sync option for overlay/vfs#622
storage: new sync option for overlay/vfs#622giuseppe wants to merge 6 commits intocontainers:mainfrom
Conversation
|
@mrunalp PTAL |
storage/pkg/config/config.go
Outdated
| doptions = append(doptions, fmt.Sprintf("%s.use_composefs=%s", driverName, options.Overlay.UseComposefs)) | ||
| } | ||
| if options.Overlay.SyncPolicy != "" { | ||
| doptions = append(doptions, fmt.Sprintf("%s.sync_policy=%s", driverName, options.Overlay.SyncPolicy)) |
There was a problem hiding this comment.
I seriously dislike this kind of turning of structured data back into strings.
I realize that’s not the fault of this PR, but can we stop doing that some time? There will never be a convenient time.
I suppose OptionsConfig, or something, could be stored in StoreOptions, and perhaps added to drivers.Options.
There was a problem hiding this comment.
Also it would be nice to parse the value into an enum already here.
|
👍 to adding this option. |
|
@nalind thoughts? |
Signed-off-by: Ayato Tokubi <atokubi@redhat.com>
Luap99
left a comment
There was a problem hiding this comment.
not really fully review just some drive by comments I might be missing the context of the origin of this.
| **sync**="" | ||
| Filesystem synchronization mode for layer creation. (default: "") |
There was a problem hiding this comment.
default: "" does not say what the default actually is, it should say none or filesystem.
I think it default to none? Is that the right default? I mean practically speaking should we not do the safe thing as default. I guess the concern would be performance? But the people needing that could opt. Without sync we need to worry about more store corruption which seems worse for us?
| logrus.Debugf("Error Syncfs(%s) - %v", mountpoint, err) | ||
| } | ||
| unix.Close(fd) | ||
| if err := system.Syncfs(mountpoint); err != nil { |
There was a problem hiding this comment.
is this correct to not use the syncmode setting here and always sync?
There was a problem hiding this comment.
yes, this was already done before, I've only reused the new function now.
If you kill a FUSE file system, all the data in memory will be lost, even if the system didn't crash. This is a fallback path taken when doing it correctly through fusermount failed.
The syncfs here prevents that when fusermount failed (or it is not present), we do not lose data on fuse-overlayfs when we unmount it.
|
Of reference too composefs/composefs-rs#185 (comment) Basically ostree keeps track of the boot id optionally - another option here is to have a periodic "walk layer dirs and fsync()" which is syscall expensive (without io-uring) but avoids flushing unrelated I/O. This of course relates to "containers-storage on top of composefs(rs)" in that this option would today always be on by default... |
747478c to
2bc826e
Compare
|
Can we keep it open for a while until we get any feedback? |
|
we haven't got any feedback from the customer, but they'd like us not to wait for them to get it released. |
|
@containers/storage-maintainers PTAL |
|
@mtrmac @kolyshkin @Luap99 PTAL |
63ba009 to
31330a9
Compare
storage/pkg/config/config_test.go
Outdated
| t.Fatalf("Expected to find overlay sync option in %v", doptions) | ||
| } | ||
|
|
||
| // Make sure no sync option when not configured |
There was a problem hiding this comment.
These tests already start with “empty should mean 0 options”, this is unnecessary. Same with VFS.
storage/layers.go
Outdated
| modifiedLocations |= layer.location | ||
| } | ||
| if err := r.saveLayers(modifiedLocations); err != nil { | ||
| if err := r.saveLayers(modifiedLocations, true); err != nil { |
There was a problem hiding this comment.
This one is after deleting layers and BigData… I am really unsure whether we need to sync.
I suppose any future operation will sync that?!
At least, it is not “metadata only” so we might need a more precise naming of the option/situation.
storage/layers.go
Outdated
|
|
||
| // saveFor saves the contents of the store relevant for modifiedLayer to disk. | ||
| // If metadataOnly is true, only the layer metadata (db) is being changed (e.g. setting a flag, | ||
| // changing names), and the sync of the storage filesystem is skipped. |
There was a problem hiding this comment.
(After the API settles) please document this also in the other save… methods.
storage/layers.go
Outdated
| } | ||
| } | ||
|
|
||
| // Write the tar-split file before UpdateLayerIDMap. |
There was a problem hiding this comment.
Why? This hints at the order being important but it doesn’t say what to pay attention to.
There was a problem hiding this comment.
this is unnecessary now as the sync happens later.
On the other hand, the AtomicWriteFile is not needed here. I think we need to drop it
There was a problem hiding this comment.
I think the general principle has been “writes of our metadata always atomic, writes of layer data are atomic only with the new opt-in”. In that sense, tar-split probably counts as “metadata” — OTOH, if either the layer data or the tar-split is corrupted, the outcome is ~the same, so I don’t feel strongly about this.
There was a problem hiding this comment.
yes, I agree with the "always write medata atomically". I am only proposing tar-split as an exception because it still needs the data to be meaningful
storage/layers.go
Outdated
| } | ||
| // If the underlying storage driver is using sync, make sure we sync everything before | ||
| // saving the layer data, this ensures that all files/directories are properly created and written. | ||
| if r.driver.SyncMode() != drivers.SyncModeNone { |
There was a problem hiding this comment.
ACK. For safety, I think I’d prefer a private enum over just a bool. (And I didn’t now carefully review every call site, assuming that it would be easier with the enum.)
there is no reason to sync the tarsplit data if the rest of the layer was not sync'ed. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
save the layer info only after the tarsplit file and big data was written. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
it is a preparatory commit. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Add a new driver-specific `sync` configuration option that controls filesystem synchronization during layer operations. When set to "filesystem", the driver ensures all pending writes are flushed to the file system before marking a layer as complete. This helps prevent data corruption in scenarios where the system crashes or loses power before the filesystem has finished writing layer data. Only the vfs and overlay backends support the new flag. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
|
I've left a bool but renamed it to |
|
@mtrmac is there anything more left or can we move forward with this PR? |
Add a new driver-specific
syncconfiguration option that controls filesystem synchronization during layer operations. When set to "filesystem", the driver ensures all pending writes are flushed to the file system before marking a layer as complete.This helps prevent data corruption in scenarios where the system crashes or loses power before the filesystem has finished writing layer data.