
MasterAgent.findForSubAgent: defaultSubAgent fallback is gated on NoSecurity, breaking v3 default-context routing #46

@ran0121

Summary

MasterAgent.findForSubAgent only falls back to defaultSubAgent when SecurityConfig.NoSecurity is true. As a consequence, SNMPv3 GETs against the default context (ContextName="") never reach the registered SubAgent's MIB unless the operator explicitly inserts "" into that SubAgent's CommunityIDs. The route miss produces a syntactically valid response with PDU error-status = noAccess and every varbind retained as Type=Null — most client SDKs surface only the varbinds, so the failure looks like an empty MIB.

Reproduced against current main (18e6381) and the latest tagged release v0.5.2.

Observed behavior

Wire-level (captured with tcpdump on the agent host):

Field             | Value
Response PDU type | GetResponse
PDU error-status  | noAccess
Each varbind type | Null (the request varbinds, unmodified)
USM auth/priv     | ✓ both validated, response is encrypted authPriv

Client-side (net-snmp v5.9, the underlying error is visible):

$ snmpget -v3 -l authPriv -u u1 -a SHA-256 -A authpassphrase \
          -x AES -X privpassphrase 127.0.0.1:1161 1.3.6.1.4.1.99.1.1.0
SNMPv2-SMI::enterprises.99.1.1.0 = No more variables left in this MIB View (It is past the end of the MIB tree)

Client-side (gosnmp 1.43, the failure is silent):

result, _ := g.Get([]string{"1.3.6.1.4.1.99.1.1.0"})
result.Variables[0].Type    // gosnmp.Null
result.Variables[0].Value   // nil
// result.Error is set to NoAccess but many consumers do not check it.
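A defensive check on the consumer side could look like the following. This is a minimal sketch that uses local stand-in types (response, variable, checkResponse are all hypothetical names) instead of the real gosnmp structs, so it runs without dependencies. With gosnmp, the corresponding fields would be result.Error and result.Variables[i].Type. The numeric values are the standard SNMP ones: error-status 6 is noAccess and BER tag 0x05 is Null.

```go
package main

import (
	"errors"
	"fmt"
)

// Stand-ins for the response fields discussed in this issue.
// They are NOT the gosnmp types; only the fields needed here are modeled.
type variable struct {
	Name string
	Type byte // ASN.1 BER tag; 0x05 is Null, 0x02 is Integer
}

type response struct {
	ErrorStatus uint8 // PDU error-status; 0 = noError, 6 = noAccess
	Variables   []variable
}

const (
	noError = 0
	tagNull = 0x05
)

// checkResponse returns an error when the PDU carries a non-zero
// error-status, or when any varbind came back typed as Null.
func checkResponse(r response) error {
	if r.ErrorStatus != noError {
		return fmt.Errorf("agent returned error-status %d", r.ErrorStatus)
	}
	for _, v := range r.Variables {
		if v.Type == tagNull {
			return errors.New("varbind " + v.Name + " is Null")
		}
	}
	return nil
}

func main() {
	oid := "1.3.6.1.4.1.99.1.1.0"
	ok := response{Variables: []variable{{Name: oid, Type: 0x02}}}
	miss := response{ErrorStatus: 6, Variables: []variable{{Name: oid, Type: tagNull}}}
	fmt.Println("ok:  ", checkResponse(ok))
	fmt.Println("miss:", checkResponse(miss))
}
```

Checking the error-status first matters here because the Null varbinds alone are indistinguishable from a legitimately Null-typed object.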

Minimal repro

package main

import (
    "log"

    "github.com/gosnmp/gosnmp"
    "github.com/slayercat/GoSNMPServer"
)

func main() {
    master := GoSNMPServer.MasterAgent{
        Logger: GoSNMPServer.NewDefaultLogger(),
        SecurityConfig: GoSNMPServer.SecurityConfig{
            AuthoritativeEngineBoots: 1,
            Users: []gosnmp.UsmSecurityParameters{{
                UserName:                 "u1",
                AuthenticationProtocol:   gosnmp.SHA256,
                AuthenticationPassphrase: "authpassphrase",
                PrivacyProtocol:          gosnmp.AES,
                PrivacyPassphrase:        "privpassphrase",
            }},
        },
        SubAgents: []*GoSNMPServer.SubAgent{{
            CommunityIDs: []string{"public"}, // v2c key only
            OIDs: []*GoSNMPServer.PDUValueControlItem{{
                OID:  "1.3.6.1.4.1.99.1.1.0",
                Type: gosnmp.Integer,
                OnGet: func() (interface{}, error) { return 42, nil },
            }},
        }},
    }
    s := GoSNMPServer.NewSNMPServer(master)
    if err := s.ListenUDP("udp", ":1161"); err != nil {
        log.Fatal(err)
    }
    s.ServeForever()
}

# v2c: works
$ snmpget -v2c -c public 127.0.0.1:1161 1.3.6.1.4.1.99.1.1.0
SNMPv2-SMI::enterprises.99.1.1.0 = INTEGER: 42

# v3: returns Null instead of 42
$ snmpget -v3 -l authPriv -u u1 -a SHA-256 -A authpassphrase \
          -x AES -X privpassphrase 127.0.0.1:1161 1.3.6.1.4.1.99.1.1.0
SNMPv2-SMI::enterprises.99.1.1.0 = No more variables left in this MIB View

Root cause

agentcontrol.go SyncConfig (around L288–316) accepts two ways for a SubAgent to register as the default:

if len(current.CommunityIDs) == 0 || t.SecurityConfig.NoSecurity {
    if t.priv.defaultSubAgent != nil {
        return errors.Errorf("SyncConfig: Config Error: duplicate default agent")
    }
    t.priv.defaultSubAgent = current
    continue
}
for _, val := range current.CommunityIDs {
    t.priv.communityToSubAgent[val] = current
}

But findForSubAgent (L318–327) only consults defaultSubAgent when NoSecurity is true:

func (t *MasterAgent) findForSubAgent(community string) *SubAgent {
    if val, ok := t.priv.communityToSubAgent[community]; ok {
        return val
    } else {
        if t.SecurityConfig.NoSecurity {
            return t.priv.defaultSubAgent
        }
        return nil
    }
}

A SubAgent registered with CommunityIDs: [] correctly populates defaultSubAgent, but it is unreachable in any deployment using USM v3 (where NoSecurity=false). The registration and dispatch paths are asymmetric.
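The asymmetry can be reproduced without the network stack. The sketch below is a simplified model, not the library code: subAgent and masterAgent are stand-in types that keep only the routing fields, and the two methods mirror the snippets quoted above. It shows a SubAgent with empty CommunityIDs being registered as the default and then missed by the lookup for the v3 default context.

```go
package main

import (
	"errors"
	"fmt"
)

// Simplified stand-ins for the library types; only routing fields are kept.
type subAgent struct {
	Name         string
	CommunityIDs []string
}

type masterAgent struct {
	NoSecurity          bool
	SubAgents           []*subAgent
	communityToSubAgent map[string]*subAgent
	defaultSubAgent     *subAgent
}

// syncConfig mirrors the registration logic quoted from SyncConfig.
func (m *masterAgent) syncConfig() error {
	m.communityToSubAgent = map[string]*subAgent{}
	m.defaultSubAgent = nil
	for _, cur := range m.SubAgents {
		if len(cur.CommunityIDs) == 0 || m.NoSecurity {
			if m.defaultSubAgent != nil {
				return errors.New("duplicate default agent")
			}
			m.defaultSubAgent = cur
			continue
		}
		for _, id := range cur.CommunityIDs {
			m.communityToSubAgent[id] = cur
		}
	}
	return nil
}

// findForSubAgent mirrors the current dispatch logic, including the
// NoSecurity gate on the fallback.
func (m *masterAgent) findForSubAgent(key string) *subAgent {
	if sa, ok := m.communityToSubAgent[key]; ok {
		return sa
	}
	if m.NoSecurity {
		return m.defaultSubAgent
	}
	return nil
}

func main() {
	// USM v3 deployment: NoSecurity is false, one SubAgent with no CommunityIDs.
	m := &masterAgent{SubAgents: []*subAgent{{Name: "default"}}}
	if err := m.syncConfig(); err != nil {
		panic(err)
	}
	fmt.Println("registered as default:", m.defaultSubAgent != nil)
	fmt.Println("reachable for key \"\":", m.findForSubAgent("") != nil)
}
```

In this model the first line reports true and the second false, which is the registration/dispatch mismatch described above.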

The routing key for v3 comes from getPktContextOrCommunity, which returns packet.ContextName. RFC 3411 §3.3.1 specifies the SNMPv3 default context is the empty string, so every v3 query against the default context lands with community="".
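To make the key selection concrete, here is a small sketch of the idea. It is an assumption about the shape of the logic, not a copy of getPktContextOrCommunity: the packet type and routingKey function are hypothetical, and the only behavior taken from the library is that v3 packets are keyed by ContextName.

```go
package main

import "fmt"

// packet is a stand-in holding just the fields relevant to routing.
type packet struct {
	Version     int // 1, 2 (v2c) or 3
	Community   string
	ContextName string
}

// routingKey picks the lookup key: the context name for v3,
// the community string otherwise (assumed for v1/v2c).
func routingKey(p packet) string {
	if p.Version == 3 {
		return p.ContextName
	}
	return p.Community
}

func main() {
	fmt.Printf("v2c key: %q\n", routingKey(packet{Version: 2, Community: "public"}))
	// A v3 request against the default context carries an empty ContextName.
	fmt.Printf("v3 key:  %q\n", routingKey(packet{Version: 3}))
}
```

The empty key is what then has to be found in communityToSubAgent, which is why a SubAgent registered only under "public" is missed.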

Proposed fix (PR follow-up)

Drop the NoSecurity guard on the fallback. The function becomes:

func (t *MasterAgent) findForSubAgent(community string) *SubAgent {
    if val, ok := t.priv.communityToSubAgent[community]; ok {
        return val
    }
    return t.priv.defaultSubAgent // nil-safe; previously gated on NoSecurity
}

This change is strictly additive:

  • If no defaultSubAgent was registered → return value is nil (same as before).
  • If NoSecurity=true → behavior unchanged.
  • If NoSecurity=false and defaultSubAgent registered → new: v3 default-context queries dispatch to the registered SubAgent instead of returning nil.
  • v2c path untouched: when community matches an entry in communityToSubAgent, the early-return on line 1 of the function fires before the new line is ever reached.
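The four cases above can be checked side by side. The following is a minimal sketch with stand-in types (router, findOld, findNew are hypothetical names): findOld mirrors the current function and findNew the proposed one, and the loop prints what each returns per case.

```go
package main

import "fmt"

type subAgent struct{ Name string }

// router models only the state findForSubAgent reads.
type router struct {
	noSecurity bool
	byKey      map[string]*subAgent
	def        *subAgent
}

// findOld mirrors the current lookup (fallback gated on NoSecurity).
func (r router) findOld(key string) *subAgent {
	if sa, ok := r.byKey[key]; ok {
		return sa
	}
	if r.noSecurity {
		return r.def
	}
	return nil
}

// findNew mirrors the proposed lookup (unconditional fallback).
func (r router) findNew(key string) *subAgent {
	if sa, ok := r.byKey[key]; ok {
		return sa
	}
	return r.def
}

func name(sa *subAgent) string {
	if sa == nil {
		return "nil"
	}
	return sa.Name
}

func main() {
	pub := &subAgent{Name: "public-agent"}
	def := &subAgent{Name: "default-agent"}
	byPublic := map[string]*subAgent{"public": pub}

	cases := []struct {
		label string
		r     router
		key   string
	}{
		{"no default registered", router{byKey: byPublic}, ""},
		{"NoSecurity=true, default set", router{noSecurity: true, def: def}, ""},
		{"NoSecurity=false, default set", router{def: def}, ""},
		{"v2c community in map", router{byKey: byPublic, def: def}, "public"},
	}
	for _, c := range cases {
		fmt.Printf("%-30s old=%-14s new=%s\n",
			c.label, name(c.r.findOld(c.key)), name(c.r.findNew(c.key)))
	}
}
```

Only the third case differs between the two functions, which is the behavior change this issue asks for.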

I'm happy to follow up with a PR adding this change and a test (TestMasterAgent_V3DefaultContextDispatchesToDefaultSubAgent), plus a TestMasterAgent_V2cRoutingUnchanged to cover the regression surface explicitly.

Compatibility / regression risk

I'm aware of the #23 → PR #27 → #28 history — PR #27 was reverted because it broke SNMPv2c (ErrNoPermission). I want to be explicit that this proposal is on a different surface:

  • PR #27 ("fix: snmp v3 auth") added enforcement in SubAgent.Serve / ResponseForBuffer that rejected v2c at the per-request level. The unintended side effect was that v2c queries that had previously dispatched correctly started returning ErrNoPermission.
  • This proposal does not touch the per-request enforcement path. It only changes findForSubAgent's fallback when the community map misses. Any v2c request whose community is in communityToSubAgent returns the same SubAgent it did before. A v2c request whose community is NOT in the map still returns nil unless a defaultSubAgent was registered; in that case it now dispatches to the default SubAgent, and registering one (a SubAgent with empty CommunityIDs) is an explicit opt-in by the library user.

In other words: the new line is only reachable in scenarios that previously returned nil + ErrNoSNMPInstance. Any path that previously returned a non-nil SubAgent reaches the early return on line 1 and is bit-for-bit identical.

Workaround currently in use

Production users (including an internal NTCIP 1209 traffic-sensor adapter we ship) add "" to CommunityIDs explicitly:

sub := &GoSNMPServer.SubAgent{
    CommunityIDs: []string{"", "public"}, // "" = v3 default context
    OIDs:         buildMIB(),
}

This works because SyncConfig registers each community ID as a map key, so findForSubAgent("") hits the map without ever entering the NoSecurity fallback path. But it is non-obvious — we lost a multi-day debugging cycle to it before locating the issue in the library source.

Related (separate issue, will file separately)

The dispatch-miss path returns noAccess at PDU level but leaves the request varbinds as Type=Null. RFC 3416 §4.2.1 recommends noSuchObject / noSuchInstance at the varbind level for "this name is not in this view." Setting both signals together would give clients that ignore PDU-level error-status (which many popular SDKs do) a clearer indication.
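As a rough illustration of the varbind-level signal, the sketch below rewrites the request varbinds with the noSuchObject exception tag instead of leaving them as Null. The varbind type and markNoSuchObject function are hypothetical and not part of the library; the tag values (Null = 0x05, noSuchObject = 0x80) are the standard BER ones. Whether noSuchObject or noSuchInstance is the better fit for a dispatch miss is left open here.

```go
package main

import "fmt"

const (
	tagNull         = 0x05 // ASN.1 Null
	tagNoSuchObject = 0x80 // SNMPv2 exception: noSuchObject
)

// varbind is a stand-in for a response variable binding.
type varbind struct {
	Name  string
	Type  byte
	Value interface{}
}

// markNoSuchObject returns a copy of the request varbinds in which every
// entry carries the noSuchObject exception and no value.
func markNoSuchObject(req []varbind) []varbind {
	out := make([]varbind, len(req))
	for i, vb := range req {
		out[i] = varbind{Name: vb.Name, Type: tagNoSuchObject}
	}
	return out
}

func main() {
	req := []varbind{{Name: "1.3.6.1.4.1.99.1.1.0", Type: tagNull}}
	for _, vb := range markNoSuchObject(req) {
		fmt.Printf("%s type=0x%02x\n", vb.Name, vb.Type)
	}
}
```

A client that only inspects varbind types would then see an explicit exception rather than a Null it cannot interpret.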

Environment

  • GoSNMPServer: v0.5.2 and current main (18e6381)
  • gosnmp client: 1.43.0
  • Server: linux/arm64 (NVIDIA Jetson Orin NX, JetPack 6), Go 1.22

Happy to provide a tcpdump capture or join a PR review thread.
