core/muxing: Use total number of alive inbound streams for backpressure #2878
thomaseizinger wants to merge 21 commits into `master`
Conversation
mxinden
left a comment
Thanks for writing this POC!
From what I can see here, I am in favor of this proposal.
I would like to only merge here once we have at least one user of this API. I would guess that the best user for this would be libp2p-metrics.
While we could emit a new SwarmEvent on each new stream, I wonder how we could inform libp2p-metrics once a stream closed.
Next to the number of substreams, it would be wonderful to expose the protocol that is being negotiated on the substream as well. That would enable us to expose the following metric:
```
libp2p_swarm_stream { direction = "inbound", protocol = "/libp2p/kad/1.0.0" } 10
```
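As a rough illustration of the bookkeeping such a metric needs (not tied to `libp2p-metrics` or any particular metrics crate; all names below are made up for the sketch), we would increment a labelled gauge when a stream is opened and decrement it when the stream is dropped:

```rust
use std::collections::HashMap;

/// Hypothetical label set matching the metric sketched above.
#[derive(Clone, Hash, PartialEq, Eq)]
struct StreamLabels {
    direction: &'static str, // "inbound" or "outbound"
    protocol: String,        // e.g. "/libp2p/kad/1.0.0"
}

/// Minimal stand-in for a labelled gauge of currently alive streams.
#[derive(Default)]
struct StreamGauge {
    alive: HashMap<StreamLabels, u64>,
}

impl StreamGauge {
    /// Called once a new stream has negotiated its protocol.
    fn on_stream_opened(&mut self, labels: StreamLabels) {
        *self.alive.entry(labels).or_insert(0) += 1;
    }

    /// Called when a stream is dropped; this is the event the comment above
    /// notes we currently have no way of reporting to `libp2p-metrics`.
    fn on_stream_closed(&mut self, labels: StreamLabels) {
        if let Some(count) = self.alive.get_mut(&labels) {
            *count = count.saturating_sub(1);
        }
    }
}
```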
Yes, makes sense. I see the first user in
I did something like this in the past. If we make
swarm/CHANGELOG.md
Outdated
> - Use the total number of alive inbound streams for back-pressure. This can have a BIG impact on your application
>   depending on how it uses `libp2p`. Previously, the limit for inbound streams per connection only applied to the
>   _upgrade_ phase, i.e. for the time `InboundUpgrade` was running. Any stream being returned from `InboundUpgrade` and
>   given to the `ConnectionHandler` did not count towards that limit, essentially undermining the back-pressure mechanism.
>   With this release, substreams count towards that limit until they are dropped and thus we actually enforce how many
>   inbound streams can be active at one time _per connection_. `libp2p` will not accept any more incoming streams once
>   that limit is hit. If you experience stalls or unaccepted streams in your application, consider upping the limit via
>   `SwarmBuilder::max_negotiating_inbound_streams`. See [PR 2878].
Not sure if I am giving this too much attention but it feels like a really important change to me.
I have to give this more thought. Previously I was under the impression that this would be a feature for reporting only, not one that actually enforces a limit.
As of today I am undecided whether the number of established inbound streams should have a global per-connection maximum, or whether it should only have a maximum per `ConnectionHandler` implementation, potentially coordinated with the `NetworkBehaviour` and thus a maximum per `NetworkBehaviour` implementation. I am not sure the global application has enough knowledge to set a suitable limit for all protocols on a single connection, nor do I think a static limit per connection is a good idea in the first place.
Ideally I would like to enforce a limit for connections only. With #2828 a connection limit could e.g. be dynamic based on the current memory utilization. Protocols, e.g. Kademlia, would enforce a streams-per-connection limit in their `ConnectionHandler` implementation. That limit would be a value of maximum expected parallelization, e.g. 16 (as in "we don't expect an implementation to handle more than 16 requests in parallel").
What do you think @thomaseizinger?
If we want the limit to be dynamic, all we need to do is add a function to `ConnectionHandler` that allows us to query the max number of allowed streams:

```rust
trait ConnectionHandler {
    fn max_inbound_streams(&self) -> usize {
        128 // This is today's default.
    }
}
```

We can then use this limit on every iteration of `Connection::poll` to check if we should poll for more inbound streams.
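A minimal sketch of that gating, assuming a simplified `Connection` that merely tracks how many inbound streams are alive (the type and field names here are illustrative, not the actual rust-libp2p internals):

```rust
/// Simplified stand-in for the real trait; only the part relevant to the sketch.
trait ConnectionHandler {
    fn max_inbound_streams(&self) -> usize {
        128 // Today's default.
    }
}

/// Illustrative connection state: `alive_inbound_streams` would be incremented
/// when the muxer yields a new inbound stream and decremented when one is dropped.
struct Connection<H> {
    handler: H,
    alive_inbound_streams: usize,
}

impl<H: ConnectionHandler> Connection<H> {
    /// Checked on every poll iteration: only ask the muxer for more inbound
    /// streams while we are below the handler-provided limit.
    fn may_poll_inbound(&self) -> bool {
        self.alive_inbound_streams < self.handler.max_inbound_streams()
    }
}
```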
If we move forward with #2863 then the notion of upgrades will be gone and this limit does not mean anything anymore. Thus, I am inclined to say we should remove this from the Swarm and Pool altogether and only have the ConnectionHandler decide.
> If we move forward with #2863 then the notion of upgrades will be gone and this limit does not mean anything anymore. Thus, I am inclined to say we should remove this from the `Swarm` and `Pool` altogether and only have the `ConnectionHandler` decide.
That would be my preference.
This effectively deprecates `SwarmBuilder::max_negotiating_inbound_streams`.
This will result in peers never being able to open a substream and we currently depend on this behaviour, despite the upgrade never being successful.
These will eventually go away so we don't bother replacing the links.
@mxinden Updated the PR description with an open question.
@mxinden In #2957, a use case for tracking the number of substreams in metrics came up. I can't think of a way of recording this though if we don't want to take on a dependency on
Co-authored-by: Ryan Plauche <ryan@littlebearlabs.io>
Force-pushed from f7b2221 to a6bd2e4.
@AgeManning I'd be curious what you think of this PR in regards to what it changes in gossipsub. The gist is that we can now enforce how many substreams to accept from the remote. Given that GossipSub should only ever have one inbound stream, I set this to one.
Yeah, the changes look fine to me. We only have one inbound substream as you've pointed out, so I don't see any issue here.
I can't reproduce this error on my machine. Anyone else got any luck? Also, can you have another look at this @mxinden? I implemented a per-connection configurable limit now, as requested in #2878 (comment) :)
mxinden
left a comment
I think long term this is something we should be doing. I am not yet sure whether the current implementation allows collaborative usage of these limits across `ConnectionHandler` implementations.
In my eyes, before we do this, we should tackle #2863 first, thus simplifying the sets (pending-multistream, pending-upgrade, negotiated) of inbound streams.
Another thought would be to redesign `max_inbound_streams` to be closer to `Sink::poll_ready`:

```rust
fn poll_new_inbound_stream_ready(&self, cx: &mut Context<'_>) -> Poll<()>;
```

A `ConnectionHandler` could thus signal in an async way whether it is willing to accept new streams.
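A rough sketch of how such a readiness-based API could behave (the trait and field names are invented for illustration; a real implementation would also register the waker so the connection task is woken once a stream closes):

```rust
use std::task::{Context, Poll};

/// Invented trait standing in for the proposed addition to `ConnectionHandler`.
trait InboundStreamReadiness {
    fn poll_new_inbound_stream_ready(&mut self, cx: &mut Context<'_>) -> Poll<()>;
}

struct ExampleHandler {
    alive_inbound_streams: usize,
    max_inbound_streams: usize,
}

impl InboundStreamReadiness for ExampleHandler {
    fn poll_new_inbound_stream_ready(&mut self, _cx: &mut Context<'_>) -> Poll<()> {
        if self.alive_inbound_streams < self.max_inbound_streams {
            Poll::Ready(())
        } else {
            // At capacity: a real handler would store `_cx.waker().clone()` and
            // wake it when a stream is dropped, so the connection polls again.
            Poll::Pending
        }
    }
}
```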
Also adding this to the agenda for the next rust-libp2p community call.
```rust
/// The maximum number of inbound substreams allowed on the underlying connection.
///
/// Once this limit is hit, we will stop accepting new inbound streams from the remote.
fn max_inbound_streams(&self) -> usize {
    DEFAULT_MAX_INBOUND_STREAMS
}
```
In case an implementation of `ConnectionHandler` does not implement this method but accepts `DEFAULT_MAX_INBOUND_STREAMS` streams, it would starve all other implementations of `ConnectionHandler` from receiving more inbound streams, correct?
From the connection task PoV, there is only one ConnectionHandler. I think if we get the combinators right, this shouldn't be an issue?
```rust
self.handlers
    .values()
    .map(|h| h.max_inbound_streams())
    .max()
```
Shouldn't this be the sum? I.e. the sum of all limits equals the total limit.
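To make the difference concrete, here is a toy comparison of the two aggregations over some made-up per-handler limits:

```rust
fn main() {
    // Made-up per-handler values of `max_inbound_streams()`.
    let per_handler_limits = [32_usize, 16, 128];

    // Taking the maximum means one generous handler determines the whole
    // connection's budget, which all handlers then compete for.
    let max = per_handler_limits.iter().copied().max().unwrap_or(0);

    // Summing instead reserves each handler's budget within the total.
    let sum: usize = per_handler_limits.iter().copied().sum();

    assert_eq!(max, 128);
    assert_eq!(sum, 176);
}
```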
```rust
    .values()
    .map(|h| h.max_inbound_streams())
    .max()
    .unwrap_or(0) // No handlers? No substreams.
```
Do you think the proposed implementation is an improvement? I think it is but I agree with you that it can be even better.
Funny, I see this as a step towards that solution. With this PR, a substream counts towards a limit as soon as it exists and not just during the upgrade phase. The next idea (but it could be done in parallel) for #2863 is to make less use of upgrades in other protocols but build a
This is an interesting idea! Should a
👍
Setting to draft until we reach a decision.
Closing because stale.
Description
A PoC implementation for what is described in #2865.
Links to any relevant issues
- `ConnectionHandler` trait by removing as many associated types as possible #2863
Open tasks
- `StreamMuxer` #2861
- `SubstreamBox::new` from public API of `libp2p-core`
- `libp2p-core` to 0.37.0
- `libp2p-swarm` to 0.40.0
Open Questions
Should

```rust
const MAX_NUM_INBOUND_SUBSTREAMS: usize = 32;
```

be refactored? Should `libp2p-kad` make use of the new limit? (See `rust-libp2p/protocols/kad/src/handler.rs`, line 42 in cce296e.)
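One possible answer, sketched with a simplified trait (this is not the real `ConnectionHandler` trait, just an illustration of routing the existing constant through the new per-connection limit):

```rust
/// Kademlia's existing hard-coded cap on parallel inbound substreams.
const MAX_NUM_INBOUND_SUBSTREAMS: usize = 32;

/// Simplified stand-in for the trait method introduced in this PR.
trait ConnectionHandler {
    fn max_inbound_streams(&self) -> usize;
}

struct KademliaHandler;

impl ConnectionHandler for KademliaHandler {
    /// The handler's existing constant becomes the per-connection answer, so the
    /// connection itself stops accepting new inbound streams once it is reached.
    fn max_inbound_streams(&self) -> usize {
        MAX_NUM_INBOUND_SUBSTREAMS
    }
}
```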
Change checklist