-
Notifications
You must be signed in to change notification settings - Fork 48
recovery from key - store the parties being migrated in the local db for recovery in case of failure #3201
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Nicu Reut <[email protected]> [ci] Store parties being migrated during key recovery This ensures the process is resilient to a restart after we unhosted the parties from the old node Signed-off-by: Nicu Reut <[email protected]>
Signed-off-by: Nicu Reut <[email protected]>
Signed-off-by: Nicu Reut <[email protected]>
.../test/scala/org/lfdecentralizedtrust/splice/validator/store/ValidatorInternalStoreTest.scala
Show resolved
Hide resolved
|
@pasindutennage-da also requests a review from you as I touched on some of the code you added, mostly tried to keep it as is just try to integrate it with the validator app |
Signed-off-by: Nicu Reut <[email protected]>
Signed-off-by: Nicu Reut <[email protected]>
...main/scala/org/lfdecentralizedtrust/splice/validator/store/db/DbValidatorInternalStore.scala
Outdated
Show resolved
Hide resolved
Signed-off-by: Nicu Reut <[email protected]>
Hmm thinking about options...
I'm on the fence of whether it is worth it to add something like this. I'd definitely feel better to have tested this somehow at least once... So I think I'd also be happy with some form of manual test? (For example on a cluster: deploy, rig things so the validator crashes on the right place after the migration, confirm it resumes fine once we unrig things again?) |
| )(implicit ec: ExecutionContext): OptionT[Future, A] = { | ||
| OptionT(futureUnlessShutdownToFuture(f.value)) | ||
| } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very important to have newlines! 😛
| import org.apache.pekko.stream.scaladsl.{Sink, Source} | ||
| import org.apache.pekko.stream.testkit.StreamSpec | ||
| import org.lfdecentralizedtrust.splice.admin.api.client.commands.HttpCommandException | ||
| import org.lfdecentralizedtrust.splice.util.{SpliceRateLimitMetrics, SpliceRateLimiter} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just going to pretend I understand how this passed static checks previously...
| for { | ||
| _ <- importAcs(partiesToMigrateFinal, getAcsSnapshot) | ||
| _ <- importAcs(partiesToMigrateFinal.parties, getAcsSnapshot) | ||
| _ <- configProvider.clearPartiesToMigrate() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, makes sense... so I guess one of my testing ideas doesn't work then...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well we can't test in integration tests because we sys.exit if the app fails to start. I will do a manual test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we sys.exit if the app fails to start
Ugh right, we don't have info to "assert-fails-to-start". Meh.
I will do a manual test
Thanks! The code looks perfect to me but it seems like the right thing to do.
| } | ||
|
|
||
| class ValidatorConfigProvider(config: ValidatorInternalStore) { | ||
| class ValidatorConfigProvider(config: ValidatorInternalStore, val loggerFactory: NamedLoggerFactory) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| class ValidatorConfigProvider(config: ValidatorInternalStore, val loggerFactory: NamedLoggerFactory) | |
| class ValidatorDynamicConfigProvider(config: ValidatorInternalStore, val loggerFactory: NamedLoggerFactory) |
Quite off-topic to this PR, more like a general "let's consider refactoring this" comment; to a casual reader it is hard to distinguish otherwise that this is not related to the (static) config from the config files.
(cc @pasindutennage-da )
.../test/scala/org/lfdecentralizedtrust/splice/validator/store/ValidatorInternalStoreTest.scala
Show resolved
Hide resolved
martinflorian-da
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work, thanks! I like how ValidatorInternalStore is helping us do smart things.
Signed-off-by: Nicu Reut <[email protected]>
Signed-off-by: Nicu Reut <[email protected]>
Signed-off-by: Nicu Reut <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@martinflorian-da can you take another look? Added a test for the new functionality. In theory there's a slight chance that it will be flaky but I am confident enough that the startup is not fast enough for it to flake.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ufff IDK, it does sound quite a bit shaky flaky to me... Things change, we'll forget what we did here, then we'll reorder app init order or change something else... and then go debug this.
I guess you run start where you do so it sets up the DB for you? Can't we just do that in the test, before running start?
Yes it seems a bit over the top for this particular PR, but the capability of initialing this store with arbitrary data before we start an app sounds like a handy thing to have also for other tests.
I am confident enough
Not going to veto it but if it was my PR I think I'd lean towards not adding that test, even if the alternative was "I only tested it manually once".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess you run start where you do so it sets up the DB for you? Can't we just do that in the test, before running start?
We could but it's not trivial
Not going to veto it but if it was my PR I think I'd lean towards not adding that test, even if the alternative was "I only tested it manually once".
honestly SGTM, will just remove the test, I was on the fence about it but then realized that our startup is real slow but then again that can always change and we shouldn't count on it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
...it's also a dangerous precedent... Not everyone's gut feeling is as well calibrated as yours, but they might still end up copying this approach...
Signed-off-by: Nicu Reut <[email protected]>
# Conflicts: # apps/validator/src/main/scala/org/lfdecentralizedtrust/splice/validator/ValidatorApp.scala
Signed-off-by: Nicu Reut <[email protected]>
Signed-off-by: Nicu Reut <[email protected]>
This ensures the process is resilient to a restart after we unhosted the parties from the old node
fixes #2694
Tried to add a unit test but mocking the participant admin connection is a full on terror
testing in the integration tests that we fail at the right time it's also a pain
Pull Request Checklist
Cluster Testing
/cluster_teston this PR to request it, and ping someone with access to the DA-internal system to approve it./hdm_teston this PR to request it, and ping someone with access to the DA-internal system to approve it.PR Guidelines
Fixes #n, and mention issues worked on using#nMerge Guidelines