Skip to content

fix: improve parser robustness for real-world tech output#5

Open
lbruton wants to merge 1 commit intomainfrom
feat/trmc-12-parser-robustness
Open

fix: improve parser robustness for real-world tech output#5
lbruton wants to merge 1 commit intomainfrom
feat/trmc-12-parser-robustness

Conversation

@lbruton
Copy link
Copy Markdown
Owner

@lbruton lbruton commented Apr 6, 2026

Summary

  • Abbreviated command supportPROMPT_RE matches sh as well as show; new alias map resolves sh cdp nei, int stat, mac ad, etc.
  • New parsersparseInterfaceStatus() for show interfaces status and parseLldpNeighbors() for show lldp neighbors, both backward-compatible with existing enrichment consumers
  • CDP multi-line fix — handles long device IDs that wrap to the next line
  • Better error messaging — missing MAC table now lists which commands were detected instead of a generic error
  • Noise reduction — skips VLAN "All" CPU entries, smarter fallback avoids 1500+ false errors in multi-command mode

Closes TRMC-12

Test plan

  • Production before file: detects hostname, 17 interfaces (with descriptions + VLANs from show interfaces status), 1 CDP neighbor, 15 VLANs — 0 false MAC errors
  • Production after file: 253 MAC entries, 2 errors (down from 23), CDP + LLDP + interface descriptions + VLANs all parsed
  • Existing sample data (before-ie3000.txt, after-ie3300.txt) parses identically — backward compatible
  • Visual smoke test in browser with production data

🤖 Generated with Claude Code

First production test revealed the parser couldn't handle typical field
tech terminal captures. Three categories of fixes:

1. Abbreviated commands — PROMPT_RE now matches `sh` as well as `show`,
   and a new COMMAND_ALIASES map resolves `sh cdp nei`, `int stat`,
   `mac ad`, etc. to their canonical forms.

2. New parsers — `parseInterfaceStatus()` handles `show interfaces status`
   (different columns than `show interfaces description`), and
   `parseLldpNeighbors()` handles `show lldp neighbors`. Both populate
   existing enrichment fields for backward compatibility.

3. CDP multi-line — `parseCdpNeighbors()` now handles long device IDs
   that wrap to the next line (common in production).

Also: skip VLAN "All" CPU entries, smarter fallback in multi-command mode
(no more 1500+ false errors), and informative missing-MAC-table messages
that list which commands were actually detected.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 6, 2026 19:04
@codacy-production
Copy link
Copy Markdown

Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues

View in Codacy

🟢 Metrics 57 complexity · 0 duplication

Metric Results
Complexity 57 (≤ 500 complexity)
Duplication 0 (≤ 4 duplication)

View in Codacy

AI Reviewer: run a review on demand. To trigger the first review automatically, go to your organization or repository integration settings. AI can make mistakes. Always validate suggestions.

Run reviewer

TIP This summary will be updated as you push new changes. Give us feedback

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request significantly enhances the Cisco command parsing logic by introducing support for command abbreviations, 'show interfaces status', and 'show lldp neighbors'. It also improves the CDP neighbor parser to handle multi-line output for long hostnames and adds a detailed error reporting mechanism for missing MAC address-table data. Review feedback highlights an opportunity to prevent silent data loss for unrecognized commands in the splitter and suggests refactoring duplicated regex patterns into module-level constants to improve maintainability.

Comment on lines +236 to +237
const canonical = resolveCommand(showArgs);
if (!canonical) continue;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The resolveCommand function returns null if the command is not found, but the caller does not handle the case where a command is missing from the recognized list but still present in the input. This could lead to silent data loss for commands that are not in the RECOGNIZED_COMMANDS list but are still valid show commands.

if (!text) return result;

const lines = text.split('\n');
const PORT_PATTERN = /(?:Gig(?:abitEthernet)?|Fas(?:tEthernet)?|Ten(?:GigabitEthernet)?|Eth(?:ernet)?|Twe(?:ntyFiveGigE)?|Po(?:rt-channel)?|Gi|Te|Fa)\s*[\d/]+/i;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The PORT_PATTERN regex is duplicated across multiple functions (parseCdpNeighbors and parseLldpNeighbors). It should be defined as a constant at the module level to improve maintainability and ensure consistency.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Improves the terminal-output parser to better handle real-world Cisco CLI pastes (abbreviated commands, multi-line CDP entries, and new interface/LLDP parsers) while reducing noise and improving UI error messaging.

Changes:

  • Added abbreviated sh ... prompt support and a command-alias resolver for multi-command splitting.
  • Introduced parseInterfaceStatus() and parseLldpNeighbors() and wired them into parseTerminalOutput() with backward-compat fallbacks.
  • Improved MAC parsing noise reduction and UI messaging when MAC-table data is missing.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 9 comments.

File Description
samples/production-before.txt Adds a large real-world capture used to validate parsing robustness (but currently contains sensitive data).
samples/production-after.txt Adds a real-world capture focused on MAC/interface/VLAN/CDP/LLDP outputs for parser validation.
js/parser.js Extends command splitting, adds new parsers (interfaces status + LLDP), improves CDP wrap handling, and reduces MAC parsing noise/fallback errors.
js/app.js Improves missing-MAC-table UI errors by listing detected commands/hostname.
Comments suppressed due to low confidence (1)

js/parser.js:1

  • commandsFound is currently derived from the presence of parsed data objects, but some of those objects are now synthesized for backward compatibility (e.g., interfaceDesc derived from interfaces status, cdpNeighbors derived from lldp neighbors). This can cause the UI to report commands as “detected” even when they weren’t present in the paste; track “seen commands” separately from “derived compatibility data” (e.g., sawInterfacesDescription, sawInterfacesStatus, sawCdp, sawLldp) and build commandsFound from the saw-flags only.
// Cisco MAC address table parser

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +425 to +427
enable secret 9 $9$Ul64Po8JxM16t0$w6E3BBfxG60Ju2uyQ7l.FCFUFhOpBuScSJmC8vznkMQ
!
username svc_prime privilege 15 secret 9 $9$YvhRekuD07VDGU$6YyzIyKySN6ltDorL8VbnLQN96Ft7qZStCW5Ug0E2zg
Copy link

Copilot AI Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The repository is committing production configuration secrets/credentials (enable secret, user secrets, SNMP community strings, TACACS keys, etc.). Even if some are hashed/obfuscated, they are still sensitive and should not be stored in-source; please remove this sample or heavily sanitize it (redact secrets/communities/keys/usernames, consider replacing hostnames/IPs too) and add guidance for generating sanitized fixtures if needed.

Copilot uses AI. Check for mistakes.
!
snmp-server group DNACGROUPAuthPriv v3 priv read DNAC-ACCESS write DNAC-ACCESS
snmp-server view DNAC-ACCESS iso included
snmp-server community aOR!75Nm7b RO 11
Copy link

Copilot AI Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The repository is committing production configuration secrets/credentials (enable secret, user secrets, SNMP community strings, TACACS keys, etc.). Even if some are hashed/obfuscated, they are still sensitive and should not be stored in-source; please remove this sample or heavily sanitize it (redact secrets/communities/keys/usernames, consider replacing hostnames/IPs too) and add guidance for generating sanitized fixtures if needed.

Copilot uses AI. Check for mistakes.
Comment on lines +1107 to +1119
tacacs server tacacs-tulpaipn06
address ipv4 172.25.84.55
key 7 144F3E5D464E2C79092022052A50570137
timeout 5
single-connection
tacacs server tacacs-edcpaipn06
address ipv4 172.30.84.55
key 7 135D3B4441460278062C39033D61431022
timeout 5
single-connection
tacacs server tacacs-drrpaipn05
address ipv4 172.20.204.54
key 7 135D3B4441460278062C39033D61431022
Copy link

Copilot AI Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The repository is committing production configuration secrets/credentials (enable secret, user secrets, SNMP community strings, TACACS keys, etc.). Even if some are hashed/obfuscated, they are still sensitive and should not be stored in-source; please remove this sample or heavily sanitize it (redact secrets/communities/keys/usernames, consider replacing hostnames/IPs too) and add guidance for generating sanitized fixtures if needed.

Suggested change
tacacs server tacacs-tulpaipn06
address ipv4 172.25.84.55
key 7 144F3E5D464E2C79092022052A50570137
timeout 5
single-connection
tacacs server tacacs-edcpaipn06
address ipv4 172.30.84.55
key 7 135D3B4441460278062C39033D61431022
timeout 5
single-connection
tacacs server tacacs-drrpaipn05
address ipv4 172.20.204.54
key 7 135D3B4441460278062C39033D61431022
tacacs server tacacs-server-01
address ipv4 192.0.2.10
key 7 REDACTED_SANITIZED_SAMPLE
timeout 5
single-connection
tacacs server tacacs-server-02
address ipv4 192.0.2.11
key 7 REDACTED_SANITIZED_SAMPLE
timeout 5
single-connection
tacacs server tacacs-server-03
address ipv4 192.0.2.12
key 7 REDACTED_SANITIZED_SAMPLE

Copilot uses AI. Check for mistakes.
Comment on lines +512 to +515
// Last token is remote port ID
const remotePort = parts.length > 0 ? parts[parts.length - 1] : '';

result.set(localPort, { deviceId, platform, remotePort });
result.set(localPort, { deviceId, platform: 'LLDP', remotePort });
Copy link

Copilot AI Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parseLldpNeighbors() returns remotePort un-normalized, while parseCdpNeighbors() normalizes remote port names via normalizePort(). This breaks the “same shape / interchangeability” guarantee and can lead to mismatches in downstream enrichment/comparison; normalize the LLDP remotePort the same way (while leaving non-interface IDs like port-001 unchanged).

Copilot uses AI. Check for mistakes.
Comment on lines +694 to +707
} else if (cmd.includes('interfaces status')) {
result.interfaceStatus = parseInterfaceStatus(block.text);
// Also populate interfaceDesc from status data for backward compatibility
if (!result.interfaceDesc) {
result.interfaceDesc = result.interfaceStatus.descriptions;
}
} else if (cmd.includes('cdp neighbors')) {
result.cdpNeighbors = parseCdpNeighbors(block.text);
} else if (cmd.includes('lldp neighbors')) {
result.lldpNeighbors = parseLldpNeighbors(block.text);
// Also populate cdpNeighbors from LLDP for backward compatibility
if (!result.cdpNeighbors) {
result.cdpNeighbors = result.lldpNeighbors;
}
Copy link

Copilot AI Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

commandsFound is currently derived from the presence of parsed data objects, but some of those objects are now synthesized for backward compatibility (e.g., interfaceDesc derived from interfaces status, cdpNeighbors derived from lldp neighbors). This can cause the UI to report commands as “detected” even when they weren’t present in the paste; track “seen commands” separately from “derived compatibility data” (e.g., sawInterfacesDescription, sawInterfacesStatus, sawCdp, sawLldp) and build commandsFound from the saw-flags only.

Copilot uses AI. Check for mistakes.
*
* @param {string} text - Raw command output
* @returns {{ descriptions: Map<string, { status: string, protocol: string, description: string }>,
* portVlans: Map<string, number|'trunk'> }}
Copy link

Copilot AI Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The JSDoc return type for portVlans doesn’t match behavior: the implementation can set 'routed', but the doc says number|'trunk'. Update the doc (and any consuming type assumptions) to include 'routed' or adjust the implementation to match the declared return shape.

Suggested change
* portVlans: Map<string, number|'trunk'> }}
* portVlans: Map<string, number|'trunk'|'routed'> }}

Copilot uses AI. Check for mistakes.
Comment on lines +365 to +368
if (vlanField.toLowerCase() === 'trunk') {
portVlans.set(port, 'trunk');
} else if (vlanField.toLowerCase() === 'routed') {
portVlans.set(port, 'routed');
Copy link

Copilot AI Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The JSDoc return type for portVlans doesn’t match behavior: the implementation can set 'routed', but the doc says number|'trunk'. Update the doc (and any consuming type assumptions) to include 'routed' or adjust the implementation to match the declared return shape.

Copilot uses AI. Check for mistakes.
Comment on lines +162 to +163
* Abbreviation map: maps common abbreviated command forms to their canonical form.
* Checked before RECOGNIZED_COMMANDS matching. Order: most specific first.
Copy link

Copilot AI Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment states aliases are “checked before RECOGNIZED_COMMANDS matching,” but resolveCommand() has no fallback to match RECOGNIZED_COMMANDS. Either implement the fallback (so full canonical commands don’t need to be duplicated in COMMAND_ALIASES) or update the comment/structure to reflect that aliases are the sole recognition mechanism.

Suggested change
* Abbreviation map: maps common abbreviated command forms to their canonical form.
* Checked before RECOGNIZED_COMMANDS matching. Order: most specific first.
* Abbreviation map: maps abbreviated and canonical command forms to their canonical form.
* Entries are matched in order, so keep the most specific forms first.
* Canonical commands must also appear here if they should be recognized directly.

Copilot uses AI. Check for mistakes.
Comment on lines +194 to +200
function resolveCommand(showArgs) {
const lower = showArgs.toLowerCase();
for (const [abbrev, canonical] of COMMAND_ALIASES) {
if (lower.startsWith(abbrev)) return canonical;
}
return null;
}
Copy link

Copilot AI Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment states aliases are “checked before RECOGNIZED_COMMANDS matching,” but resolveCommand() has no fallback to match RECOGNIZED_COMMANDS. Either implement the fallback (so full canonical commands don’t need to be duplicated in COMMAND_ALIASES) or update the comment/structure to reflect that aliases are the sole recognition mechanism.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants