Add back ability to disable backup of snapshot to secondary #3122

Merged
GabrielBrascher merged 2 commits into apache:4.11 from myENA:bug/disablesnapshotbackup
Feb 4, 2019

@nathanejohnson
Member

@nathanejohnson nathanejohnson commented Jan 7, 2019

Description

The snapshot.backup.rightafter configuration variable was removed by commit 6bb0ca2.

This PR adds it back under a new name, snapshot.backup.to.secondary.

When this global parameter is set to false, snapshots are no longer backed up
to secondary storage automatically; they are copied there only when actually needed.
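
A minimal sketch of how such a flag could gate the backup step. It assumes, based on the diff fragment quoted later in this review, that an unset value is treated the same as true. The class and method names are hypothetical and are not part of CloudStack.

```java
final class SnapshotBackupDecision {
    /**
     * Hypothetical helper. configuredValue stands in for the value of
     * snapshot.backup.to.secondary; null models "not configured".
     */
    static boolean shouldBackupToSecondary(Boolean configuredValue) {
        // Assumption: an unset value keeps the default behavior (back up).
        return configuredValue == null || configuredValue;
    }

    public static void main(String[] args) {
        System.out.println(shouldBackupToSecondary(null));          // prints true
        System.out.println(shouldBackupToSecondary(Boolean.TRUE));  // prints true
        System.out.println(shouldBackupToSecondary(Boolean.FALSE)); // prints false
    }
}
```

Treating null as true means that deployments which never set the parameter keep backing up snapshots as before.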

Fixes #3096

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)

How Has This Been Tested?

Tested this in our 4.11 lab environment.

@nathanejohnson changed the title from "The snapshot.backup.rightafter configuration variable was removed by:" to "Add back ability to disable backup of snapshot to secondary" on Jan 7, 2019

if (backupedSnapshot != null) {
snapshotStrategy.postSnapshotCreation(snapshotOnPrimary);
if(s_logger.isDebugEnabled()) {
s_logger.debug("skipping backup of snapshot to secondary due to configuration");
Contributor

Is it clear from this log message which snapshot we are talking about?

s_logger.debug("skipping backup of snapshot to secondary due to configuration");
}
if (!snapshotOnPrimary.markBackedUp()) {
throw new CloudRuntimeException("Can't mark snapshot as backed up!");
Contributor

Same applies here. Maybe add the snapshot name, volume ID, etc?
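
As a sketch of what the reviewer is asking for, both messages could carry enough identifiers to locate the snapshot in the logs. The helper below is illustrative only: the class, method, and parameter names are assumptions, not the accessors of CloudStack's SnapshotInfo.

```java
final class SnapshotLogMessages {
    // Hypothetical helper: builds a log message that identifies the snapshot and its volume.
    static String skippedBackup(String snapshotName, long snapshotId, long volumeId) {
        return "Skipping backup of snapshot [name=" + snapshotName + ", id=" + snapshotId
                + "] of volume [id=" + volumeId + "] to secondary storage due to configuration";
    }

    // Same idea for the failure path, so the exception message is traceable too.
    static String markBackedUpFailed(String snapshotName, long snapshotId, long volumeId) {
        return "Can't mark snapshot [name=" + snapshotName + ", id=" + snapshotId
                + "] of volume [id=" + volumeId + "] as backed up";
    }

    public static void main(String[] args) {
        // Sample values are made up for the demonstration.
        System.out.println(skippedBackup("daily-1", 42L, 7L));
        System.out.println(markBackedUpFailed("daily-1", 42L, 7L));
    }
}
```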

}

@Override
public boolean markBackedUp() {
Member

Why return a boolean here?

I mean, you catch a checked exception here and return false. Then the method that calls markBackedUp checks whether the return value is false and throws a runtime exception. Why not catch the checked exception and throw the runtime exception here?

Also, what about unit test cases?

Member Author

Mainly I just couldn't throw a NoTransitionException because it would create a circular dependency. CloudRuntimeException would work though, since as you say that's ultimately what ends up happening in the one place this gets called (for now). I have no problem making this change.

Member

+1 on @rafaelweingartner's idea. Throwing a CloudRuntimeException would be better than the boolean approach.
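
The change discussed in this thread can be sketched as follows. This is not the PR's code: the two exception classes are hypothetical stand-ins for CloudStack's NoTransitionException and CloudRuntimeException, and processEvent is a simplified stub for the state machine call.

```java
final class MarkBackedUpSketch {
    // Stand-in for the checked NoTransitionException (assumption).
    static class NoTransitionSketchException extends Exception {
        NoTransitionSketchException(String message) { super(message); }
    }

    // Stand-in for the unchecked CloudRuntimeException (assumption).
    static class RuntimeSketchException extends RuntimeException {
        RuntimeSketchException(String message, Throwable cause) { super(message, cause); }
    }

    // Simplified stub: fails when the state transition is not allowed.
    static void processEvent(boolean transitionAllowed) throws NoTransitionSketchException {
        if (!transitionAllowed) {
            throw new NoTransitionSketchException("no transition for event");
        }
    }

    // void instead of boolean: the failure is reported once, where it happens,
    // so callers no longer need to check a return value and throw themselves.
    static void markBackedUp(boolean transitionAllowed) {
        try {
            processEvent(transitionAllowed);
        } catch (NoTransitionSketchException ex) {
            throw new RuntimeSketchException("Error marking snapshot backed up", ex);
        }
    }

    public static void main(String[] args) {
        markBackedUp(true); // succeeds, nothing to check
        try {
            markBackedUp(false);
        } catch (RuntimeSketchException ex) {
            System.out.println("caught: " + ex.getMessage());
        }
    }
}
```

With this shape the caller's `if (!snapshotOnPrimary.markBackedUp())` check becomes a plain call.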

SnapshotInfo snapshotOnPrimary = snapshotStrategy.takeSnapshot(snapshot);
if (payload.getAsyncBackup()) {
backupSnapshotExecutor.schedule(new BackupSnapshotTask(snapshotOnPrimary, snapshotBackupRetries - 1, snapshotStrategy), 0, TimeUnit.SECONDS);
boolean backupFlag = BackupSnapshotAfterTakingSnapshot.value() == null ||
Member

Could you rename this variable to something more descriptive?

Member Author

Done.

BackupSnapshotAfterTakingSnapshot.value();

if (backupFlag) {
if (payload.getAsyncBackup()) {
Member

Can you extract lines 1130-1136 into a method? That would also let you document it and then unit test it.

Member Author

This isn't my code, and I'm not entirely sure how to even test things in backupSnapshotExecutor. It is initialized by the config() method, which isn't called from the existing unit tests. Any thoughts?

@nathanejohnson nathanejohnson force-pushed the bug/disablesnapshotbackup branch from 816a356 to ca27bbc Compare January 16, 2019 19:01
@nathanejohnson
Member Author

@wido I hope I've addressed your concerns. @rafaelweingartner see above regarding the unit test suggestion.

@nathanejohnson nathanejohnson force-pushed the bug/disablesnapshotbackup branch from ca27bbc to 4372205 Compare January 16, 2019 19:31
Contributor

@wido wido left a comment

Code LGTM

processEvent(Event.OperationNotPerformed);
} catch (NoTransitionException ex) {
s_logger.error("no transition error: ", ex);
throw new CloudRuntimeException("Error marking snapshot backed up: " +
Member

Can you preserve the stack trace of the original exception when it is re-thrown here?
It is just a matter of using throw new CloudRuntimeException(<message>, <exception>).

Member

@nathanejohnson I see this point as critical. Doing what @rafaelweingartner proposed would put the full stack trace in the log, which helps with debugging.
Could you please address this point? Thanks! Overall LGTM.
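
To show what is lost when only the message is copied, the sketch below wraps the same exception with and without its cause and compares the printed stack traces. It uses plain java.lang exceptions as stand-ins, since CloudRuntimeException is not available here; the class and method names are hypothetical.

```java
final class ExceptionCauseSketch {
    // Renders a throwable's stack trace to a string, as a logger would print it.
    static String stackTraceOf(Throwable t) {
        java.io.StringWriter out = new java.io.StringWriter();
        java.io.PrintWriter writer = new java.io.PrintWriter(out);
        t.printStackTrace(writer);
        writer.flush();
        return out.toString();
    }

    // Wraps the original exception, optionally passing it along as the cause.
    static RuntimeException wrap(Exception original, boolean keepCause) {
        String message = "Error marking snapshot backed up: " + original.getMessage();
        return keepCause
                ? new RuntimeException(message, original)
                : new RuntimeException(message);
    }

    public static void main(String[] args) {
        Exception original = new Exception("no transition");
        // Only the variant that keeps the cause prints a "Caused by:" section.
        System.out.println(stackTraceOf(wrap(original, true)).contains("Caused by:"));  // prints true
        System.out.println(stackTraceOf(wrap(original, false)).contains("Caused by:")); // prints false
    }
}
```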

SnapshotInfo snapshotOnPrimary = snapshotStrategy.takeSnapshot(snapshot);
if (payload.getAsyncBackup()) {
backupSnapshotExecutor.schedule(new BackupSnapshotTask(snapshotOnPrimary, snapshotBackupRetries - 1, snapshotStrategy), 0, TimeUnit.SECONDS);
boolean backupSnapToSecondary = BackupSnapshotAfterTakingSnapshot.value() == null ||

return snapshot;
}

protected void backupSnapshotToSecondary(boolean asyncBackup, SnapshotStrategy snapshotStrategy, SnapshotInfo snapshotOnPrimary) {
Member

Would you mind documenting this method, and then creating unit test cases for it?
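
One possible shape for a documented, testable version of such a method is sketched below. It is not the PR's implementation: it assumes the executor can be injected through the constructor, uses java.util.concurrent.Executor rather than the scheduled executor seen in the diff, and replaces SnapshotStrategy and SnapshotInfo with a hypothetical BackupAction interface and a plain string id.

```java
final class BackupDispatchSketch {
    /** Hypothetical stand-in for the strategy's backup call. */
    interface BackupAction {
        void backup(String snapshotId);
    }

    private final java.util.concurrent.Executor asyncExecutor;

    // Injecting the executor lets a test pass in a fake instead of relying on config().
    BackupDispatchSketch(java.util.concurrent.Executor asyncExecutor) {
        this.asyncExecutor = asyncExecutor;
    }

    /**
     * Backs up a snapshot to secondary storage.
     *
     * @param asyncBackup when true, the backup is handed to the executor and this
     *                    method returns immediately; when false, it runs on the calling thread
     * @param action      the operation that performs the backup
     * @param snapshotId  identifier of the snapshot on primary storage
     */
    void backupSnapshotToSecondary(boolean asyncBackup, BackupAction action, String snapshotId) {
        if (asyncBackup) {
            asyncExecutor.execute(() -> action.backup(snapshotId));
        } else {
            action.backup(snapshotId);
        }
    }

    public static void main(String[] args) {
        java.util.List<String> backedUp = new java.util.ArrayList<>();
        java.util.List<Runnable> queued = new java.util.ArrayList<>();
        // Fake executor: records tasks instead of running them.
        BackupDispatchSketch sketch = new BackupDispatchSketch(queued::add);

        sketch.backupSnapshotToSecondary(true, backedUp::add, "snap-1");
        System.out.println(queued.size() + " queued, " + backedUp.size() + " done"); // prints 1 queued, 0 done

        queued.forEach(Runnable::run);
        System.out.println(backedUp); // prints [snap-1]
    }
}
```

A unit test can then assert on the recorded tasks without starting any threads.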

@GabrielBrascher
Member

@blueorangutan package

@blueorangutan

@GabrielBrascher a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan

Packaging result: ✔centos6 ✔centos7 ✔debian. JID-2554

@GabrielBrascher
Member

@blueorangutan test

@blueorangutan

@GabrielBrascher a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests

@GabrielBrascher
Member

This PR is almost good to go, thanks @nathanejohnson!
I see that we have some minor changes requested by @rafaelweingartner. Can you please address them @nathanejohnson?

@blueorangutan

Trillian test result (tid-3337)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 19575 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr3122-t3337-kvm-centos7.zip
Smoke tests completed. 68 look OK, 0 have error(s)
Only failed tests results shown below:

Test Result Time (s) Test File

@GabrielBrascher GabrielBrascher merged commit bf805d1 into apache:4.11 Feb 4, 2019
winterhazel pushed a commit that referenced this pull request Jan 28, 2026
Fix the VM state when creating a _backup_ with the host `Down` or `Disconnected`

Closes #3122

See merge request scclouds/scclouds!1314