-
Notifications
You must be signed in to change notification settings - Fork 2k
peers: Use peer cache to store information for RPC routing. #26985
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Nomad 1.11 introduced a new peer cache to store information about Serf peers in order to perform server version checking. This sub-system provides a good place to store all cached Serf peer information which is also stored in the server object. This change therefore moves the peer and local peer information from the server object into the peer cache backend. It means we have a single place to query this information as well as a single place to reflect Serf membership updates.
tgross
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a heads up that you might want to look at the ENT code base for consumers of these fields, in particular the Autopilot ENT features.
| // If we reached this point then it's a new member, so append it to the | ||
| // exiting array. | ||
| p.peers[parts.Region] = append(existing, parts) | ||
| // Add ot the list if not known |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to :)
|
|
||
| for _, m := range event.Members { | ||
| if ok, parts := IsNomadServer(m); ok { | ||
| p.peerDeleteLocked(p.allPeers, parts) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to delete from alivePeers or localPeers here as well?
| // PeerSet adds or updates the given parts in the cache. This should be called | ||
| // when a new peer is detected or an existing peer changes is status. | ||
| func (p *PeerCache) PeerSet(parts *Parts) { | ||
| func (p *PeerCache) PeerSet(parts *Parts, localRegion string) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we call this UpdatePeerSet or SetPeers or something like that to make it clear this is a write? When seeing it at the call site (ex. serf.go) it was a little unexpected.
| local1 := len(s1.peersCache.LocalPeers()) | ||
| if n1 != 2 { | ||
| return false, fmt.Errorf("bad: %#v", n1) | ||
| return false, fmt.Errorf("bad1: %#v", n1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Along the lines of your edit here, I don't love these old "bad: %#v" test errors because they don't tell someone who's less familiar with the code why this is bad. They're unfortunately really common in the older parts of the code. Any chance we can improve these messages as long we're touching them?
| Datacenter: v.Datacenter, | ||
| }) | ||
| } | ||
| reply.Servers = n.srv.peersCache.LocalPeersServerInfo() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not having to hold the lock for the rest of this function is a nice improvement 👍
Note: new improvement, do not merge until after 1.11 GA.
Nomad 1.11 introduced a new peer cache to store information about Serf peers in order to perform server version checking. This sub-system provides a good place to store all cached Serf peer information which is also stored in the server object.
This change therefore moves the peer and local peer information from the server object into the peer cache backend. It means we have a single place to query this information as well as a single place to reflect Serf membership updates.
Links
Contributor Checklist
changelog entry using the
make clcommand.ensure regressions will be caught.
and job configuration, please update the Nomad website documentation to reflect this. Refer to
the website README for docs guidelines. Please also consider whether the
change requires notes within the upgrade guide.
Reviewer Checklist
backporting document.
in the majority of situations. The main exceptions are long-lived feature branches or merges where
history should be preserved.
within the public repository.