fix(plugin-meetings): add subnet details in reachability, getAllClust…#4662
fix(plugin-meetings): add subnet details in reachability, getAllClust…#4662venky-mediboina wants to merge 21 commits intowebex:nextfrom
Conversation
|
This pull request is automatically being deployed by Amplify Hosting (learn more). |
| if (firstReachableResult) { | ||
| return { | ||
| result: 'reachable', | ||
| latencyInMilliseconds: firstReachableResult.latencyInMilliseconds, |
There was a problem hiding this comment.
not sure if this is correct. firstReachableResult is just the first found result in the array, it doesn't necessarily mean that it was actually the one that generated the "reachable" result as first in time, right? so I think this should be set to a minimum value of latencyInMilliseconds from all peer connections, shouldn't it?
There was a problem hiding this comment.
yeah, firstReachableResult was just the first in array order (creation order), not the fastest. Fixed it with renaming to bestResult and now picks the one with the minimum latencyInMilliseconds across all per-URL peer connections.
| ...result, | ||
| } | ||
| ); | ||
| // Note: resultReady event is emitted when ICE gathering completes in emitResultsForAllProtocols() |
There was a problem hiding this comment.
this is a big change - before only "reachable" results were ever emitted, now "unreachable" ones are also emitted. This will require changes in places where this event is handled in reachability/index.ts and in clusterReachability.ts and I don't see them being done in this PR. Can you explain a bit more why do we need this change?
There was a problem hiding this comment.
resultReady still emits both reachable and unreachable (needed for per-URL UDP aggregation in ClusterReachability, but the firstResultAvailable guard in index.ts now filters with && result === 'reachable' so unreachable results never trigger early calls."
…ICE gathering not finished
…t as it was already
|
@claude review |
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 62fdb70c63
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| const allInstancesComplete = | ||
| this.udpResultsReceived === this.reachabilityPeerConnectionsForUdp.length; | ||
|
|
||
| if (!this.udpResultEmitted && (data.result === 'reachable' || allInstancesComplete)) { |
There was a problem hiding this comment.
Wait for all per-URL UDP checks before emitting aggregate
In per-URL mode this emits the cluster-level UDP resultReady as soon as the first URL is reachable, then permanently suppresses later UDP aggregate emissions. Because Reachability.performReachabilityChecks() only persists details when it receives resultReady, any slower UDP URLs that later become reachable will keep stale answeredTx/lostTx values in storage (especially in UDP-only clusters where no later protocol event triggers another write), so backend subnet details can be incorrect.
Useful? React with 👍 / 👎.
| if (isPerUrlMode || this.result.udp.result !== 'reachable') { | ||
| this.saveResult('udp', latencyInMilliseconds, e.candidate.address, serverIp, serverPort); | ||
| } else { | ||
| this.addPublicIp('udp', e.candidate.address); |
There was a problem hiding this comment.
Continue saving srflx results after UDP becomes reachable
This branch stops calling saveResult for additional srflx candidates once UDP is marked reachable (non per-URL mode with multiple UDP URLs), but saveResult is the only place that emits reachedSubnets. As a result, later reachable UDP subnets are dropped from cluster.reachedSubnets, and Reachability.isSubnetReachable() can return false negatives for those subnets even though ICE reached them.
Useful? React with 👍 / 👎.
… saving,complete udp results
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1a076d8dda
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| // Store results with pre-populated details. | ||
| await this.storeResults(results); |
There was a problem hiding this comment.
Persist prepopulated details before writing initial results
results is still only the locally initialized structure at this point, and it is never hydrated from clusterReachability.getResult() before being stored. In paths where no resultReady event is emitted (for example when ReachabilityPeerConnection.start() exits early because RTCPeerConnection could not be created), this write persists protocol status without the prepopulated details, so subnet entries that should be reported as unreachable are silently dropped from storage.
Useful? React with 👍 / 👎.
| urls.forEach((url) => { | ||
| const {host, port} = parseIceServerUrl(url); | ||
| if (host && port) { | ||
| const hostPort = `${host}:${port}`; |
There was a problem hiding this comment.
Keep IPv6 brackets in getClusterUrls host:port output
parseIceServerUrl() strips IPv6 brackets, and this concatenation emits ${host}:${port} directly; for IPv6 servers that yields values like 2402:2500::1:5004, which is not an unambiguous host:port form. Any consumer of getAllClustersInfo() that parses these strings cannot reliably separate address and port, so IPv6 cluster URLs are returned in a malformed format.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ced58b78bd
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
|
|
||
| urls.forEach((url) => { | ||
| const {host, port, isIp} = parseIceServerUrl(url); | ||
| if (host && port) { |
There was a problem hiding this comment.
Include host-only URLs when prepopulating subnet details
prepopulateSubnetDetails() currently drops any ICE URL that omits an explicit :port because of the if (host && port) guard, so those servers never get an initial unreachable detail entry. This matters when clusters contain host-only STUN URLs (which are still valid and used by the reachability checks), because a failed check for that endpoint will be missing from details entirely and backend reporting loses the expected unreachable subnet record.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
We won't get any port-less domain URLs. The guard if (host && port) will skip port-less URLs, so a server like stun:server.example.com would get no unreachable detail entry. However, Orpheus always returns URLs with explicit ports (e.g., :5004, :443), so this path is never hit in real usage. And since SubnetDetail.port is a required number field, there's no safe value to assign for a missing port anyway. So don't need to change anything.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5a365acb0e
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| // Emit results for all protocols before closing (important for timeout scenarios) | ||
| this.emitResultsForAllProtocols(); | ||
| this.closePeerConnection(); |
There was a problem hiding this comment.
Avoid emitting resultReady during abort cancellation
Calling emitResultsForAllProtocols() inside abort() turns cancellation into normal completion events, but Reachability.performReachabilityChecks() treats every resultReady as a live run update (increments counters, persists results, and can clear timers/resolve). In the fallback path (gatherReachabilityFallback() aborts the previous run before starting a new one), these stale abort-generated events can race with the new run and overwrite storage or prematurely drive reachability:done/timer cleanup for the wrong run, which makes fallback behavior nondeterministic under timeout conditions.
Useful? React with 👍 / 👎.
| result.clientMediaIPs = [publicIp]; | ||
| } | ||
|
|
||
| this.emit( |
There was a problem hiding this comment.
the resultReady event was emitted immediately and now it is only emitted at the end. As a result the "reachability:firstResultAvailable" event will only be emitted at the end and not immediately when the first good result is found - the web app relies on this event being emitted immediately, so we can't have that change of behaviour
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 9a5e86b68d
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| e.candidate.address, | ||
| e.candidate.port |
There was a problem hiding this comment.
Match relay details against TURN URL port
Do not pass e.candidate.port as the server port for relay candidates here, because that value is the relay allocation port and can differ from the TURN server listening port configured in the cluster URL. When they differ, updateSubnetDetail() cannot match the prepopulated host:port detail, so the original subnet stays marked unreachable and an extra synthetic detail row is added, which corrupts TCP/xTLS subnet reporting sent to backend.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a279061892
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| details: enablePerUdpUrlReachability | ||
| ? prepopulateSubnetDetails(clusterInfo.tcp) |
There was a problem hiding this comment.
Include hostname URLs in TCP/TLS detail prepopulation
When enablePerUdpUrlReachability is enabled, TCP/TLS detail arrays are initialized with prepopulateSubnetDetails() using its default includeDomains=false, so hostname-based URLs (the common stun:...webex.com:port form) are dropped before checks start. If those TCP/TLS checks fail and no relay candidate is gathered, no detail row is ever added, and the stored/backend result omits unreachable TCP/TLS subnets entirely. This makes failure reporting incomplete for domain-based clusters.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
We are not tracking hostname-based URLs for TCP and xTLS. So its fine for now.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4368676f21
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| if (result.result === 'untested' || this.emittedProtocols.has(protocol)) { | ||
| return; |
There was a problem hiding this comment.
Emit final result for protocols after ICE gathering completes
Because protocols are marked as already emitted (emittedProtocols) after the first reachable candidate, emitResultsForAllProtocols() skips them at ICE completion. In TCP/xTLS flows with multiple relay candidates, saveResult() can append additional subnet details after that first success, but those late updates usually carry publicIp=null and therefore do not trigger clientMediaIpsUpdated, so performReachabilityChecks() may persist only partial details and finish before the full set is written.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
The concern is that early emission might cause partial TCP/xTLS details to be stored. But this doesn't happen because: areAllResultsReady requires all clusters/protocols to report before the final store, and the final re-read at that point calls getResult() which picks up any details added after the early emission. So, no fix needed.
|
I've asked AI to check if the existing flows (with enablePerUdpUrlReachability = false in config) are affected by this PR. This is what it came back with:
I think we can ignore points 4, 6 7 and 8, but please check the other points. |
|
Fixed abouve issue: |
| ); | ||
| this.sendMetric(); | ||
| // Guard: abort may have triggered resultReady events that already emitted reachability:done | ||
| if (!this.reachabilityDoneEmitted) { |
There was a problem hiding this comment.
since we have to make sure reachabilityDoneEmitted is set to true whenever we emit that event and we already have 3 places where we do this, it would be good to have a method like emitReachabilityDoneEvent() that would set the flag and emit the event. This avoids any risk of someone adding another place where they emit the event and not set the flag.
Also, shouldn't we be checking that reachabilityDoneEmitted flag is false in all the places where we emit the event? it's only checked here. That check should also just be in the new method emitReachabilityDoneEvent()
There was a problem hiding this comment.
yeah, we can use a common emitReachabilityDoneEvent() fn. thanks for the suggestion, used same.
| results[clusterKey].xtls.details = finalResult.xtls.details; | ||
| } | ||
| }); | ||
| await this.storeResults(results); |
There was a problem hiding this comment.
nitpick: storeResults() is already called in line 1044, the code could be written in a way that we call it only once instead of twice for the areAllResultsReady case
There was a problem hiding this comment.
consolidated to call storeResults() at only one place.
| ? parseIceServerUrl(candidateWithUrl.url) | ||
| : {host: undefined, port: undefined}; | ||
|
|
||
| const isPerUrlMode = this.numUdpUrls === 1; |
There was a problem hiding this comment.
this is not safe, we might have just a single udp url in web app and still not want do have the "separate peer connection per udp url" functionality. Better to use this.enablePerUdpUrlReachability
There was a problem hiding this comment.
Right, used enablePerUdpUrlReachability, removed isPerUrlMode var.
| // Only pass serverPort when enablePerUdpUrlReachability is true — | ||
| // in standard mode there are no subnet details to update. | ||
| const effectiveServerPort = this.enablePerUdpUrlReachability ? serverPort : undefined; | ||
| if (isPerUrlMode || this.result.udp.result !== 'reachable') { |
There was a problem hiding this comment.
I don't understand this if condition, previously we always called saveResult. Can you explain the purpose of this condtion?
There was a problem hiding this comment.
The condition tried to avoid calling saveResult redundantly after UDP was already reachable, but it was unnecessary — saveResult already handles this internally, since its own guard only marks reachable on the first call and routes all subsequent calls to addPublicIp automatically. So the this was removed and saveResult is now called unconditionally.
There was a problem hiding this comment.
issue: this may update the resultsCount for unreachable results, right? and the resultsCount is only supposed to count reachable results
There was a problem hiding this comment.
Earlier thought for each connection i.e even its reachable or not reachable, counting it. Changed with condition.
…,fix detail matching&test case
…na/webex-js-sdk into subnet_details_pr3
| } else if (e.candidate.type === CANDIDATE_TYPES.RELAY) { | ||
| // Use candidate.url (Chrome) for protocol detection; fall back to ephemeral e.candidate.port in Firefox (best-effort). | ||
| const relayCandidateWithUrl = e.candidate as RTCIceCandidate & {url?: string}; | ||
| const configuredPort = relayCandidateWithUrl.url |
There was a problem hiding this comment.
I don't understand what is the benefit of getting the port from relayCandidateWithUrl.url. The existing code that uses e.candidate.port to detect TLS vs TCP has been working fine in all browsers and is simpler. Is there any edge case where network checker needs to use the port from the url because e.candidate.port doesn't work? If not, then I'd rather keep existing code and minimize the changes in this PR.
|
|
||
| // Initialize this.result as saying that nothing is reachable. | ||
| // It will get updated as we go along and successfully gather ICE candidates. | ||
| // Preserve existing details (pre-populated or undefined based on enableSubnetDetails flag). |
There was a problem hiding this comment.
I don't see any "enableSubnetDetails" flag anywhere in the code, is this comment valid?
| return {isIp: false}; | ||
| } | ||
|
|
||
| return port ? {host: ipv6, port, isIp: true} : {host: ipv6, isIp: true}; |
There was a problem hiding this comment.
wouldn't a simpler line like this do the same?
return {host: ipv6, port, isIp: true}
same in line 39
| * @param {string} url - The STUN/TURN URL to parse (e.g., 'stun:server.com:3478', 'stun:[2402:2500::1]:5004') | ||
| * @returns {object} Object containing host, port, and isIp flag | ||
| */ | ||
| export function parseIceServerUrl(url: string): { |
There was a problem hiding this comment.
a new exported util function like this should have its own unit tests
| * @param {boolean} [includeDomains=false] - Whether to include domain names in details | ||
| * @returns {SubnetDetail[]} subnet details marked as unreachable | ||
| */ | ||
| export function prepopulateSubnetDetails(urls: string[], includeDomains = false): SubnetDetail[] { |
There was a problem hiding this comment.
a new exported util function like this should have its own unit tests
| const udpRpc2 = (perUdpClusterReachability as any).reachabilityPeerConnectionsForUdp[1]; | ||
|
|
||
| // Stub getResult so aggregateUdpResults() sees the correct state | ||
| sinon.stub(udpRpc1, 'getResult').returns({ |
There was a problem hiding this comment.
why do we need this? aggregateUdpResults() is not supposed to be called in default mode and these tests use the defaultmode, right?
|
|
||
| assert.equal(udpEvents.length, 1); | ||
| assert.equal(udpEvents[0].latencyInMilliseconds, 50); | ||
| assert.equal(udpEvents[0].protocol, 'udp'); |
There was a problem hiding this comment.
why the existing check has been replaced?
| // check the right event was emitted | ||
| // resultReady is emitted immediately for the first reachable protocol (UDP) | ||
| assert.equal(emittedEvents[Events.resultReady].length, 1); | ||
| assert.deepEqual(emittedEvents[Events.resultReady][0], { |
There was a problem hiding this comment.
why this check has been removed?
| // check events emitted: there should be a resultReady and no clientMediaIpsUpdated | ||
| // resultReady is emitted immediately for the first reachable protocol (UDP) | ||
| assert.equal(emittedEvents[Events.resultReady].length, 1); | ||
| assert.deepEqual(emittedEvents[Events.resultReady][0], { |
| // no new event was emitted | ||
| assert.equal(emittedEvents[Events.resultReady].length, 0); | ||
| // no new event was emitted (duplicate IP) | ||
| assert.equal(emittedEvents[Events.resultReady].length, 1); |
There was a problem hiding this comment.
why this change? we should minimize any changes to existing tests so that we can easily see that changes in this PR are not breaking them
COMPLETES SPARK-610518
Changes flow documented in this wiki page.
1st PR link.
2nd PR link.
This pull request addresses
Old combined PR: #4542. As changes are somewhat big, we split into 3 PRs.
Description for 3PRs:
This is the last and 3rd PR of 3 PRs series to add subnet details which consists of each subnet IP in the details section including whether its reachable or not.
As part of this PR-3, added subnet details to reachability results and a function to return all URLs.
by making the following changes
Added subnet details to reachability results based on this flag and a function to return all URLs.
Change Type
The following scenarios were tested
MANUAL Tests performed with unit test cases
The GAI Coding Policy And Copyright Annotation Best Practices
I certified that
Make sure to have followed the contributing guidelines before submitting.