fix: improve parser robustness for real-world tech output#5
fix: improve parser robustness for real-world tech output#5
Conversation
First production test revealed the parser couldn't handle typical field tech terminal captures. Three categories of fixes: 1. Abbreviated commands — PROMPT_RE now matches `sh` as well as `show`, and a new COMMAND_ALIASES map resolves `sh cdp nei`, `int stat`, `mac ad`, etc. to their canonical forms. 2. New parsers — `parseInterfaceStatus()` handles `show interfaces status` (different columns than `show interfaces description`), and `parseLldpNeighbors()` handles `show lldp neighbors`. Both populate existing enrichment fields for backward compatibility. 3. CDP multi-line — `parseCdpNeighbors()` now handles long device IDs that wrap to the next line (common in production). Also: skip VLAN "All" CPU entries, smarter fallback in multi-command mode (no more 1500+ false errors), and informative missing-MAC-table messages that list which commands were actually detected. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Up to standards ✅🟢 Issues
|
| Metric | Results |
|---|---|
| Complexity | ✅ 57 (≤ 500 complexity) |
| Duplication | ✅ 0 (≤ 4 duplication) |
AI Reviewer: run a review on demand. To trigger the first review automatically, go to your organization or repository integration settings. AI can make mistakes. Always validate suggestions.
TIP This summary will be updated as you push new changes. Give us feedback
There was a problem hiding this comment.
Code Review
This pull request significantly enhances the Cisco command parsing logic by introducing support for command abbreviations, 'show interfaces status', and 'show lldp neighbors'. It also improves the CDP neighbor parser to handle multi-line output for long hostnames and adds a detailed error reporting mechanism for missing MAC address-table data. Review feedback highlights an opportunity to prevent silent data loss for unrecognized commands in the splitter and suggests refactoring duplicated regex patterns into module-level constants to improve maintainability.
| const canonical = resolveCommand(showArgs); | ||
| if (!canonical) continue; |
There was a problem hiding this comment.
The resolveCommand function returns null if the command is not found, but the caller does not handle the case where a command is missing from the recognized list but still present in the input. This could lead to silent data loss for commands that are not in the RECOGNIZED_COMMANDS list but are still valid show commands.
| if (!text) return result; | ||
|
|
||
| const lines = text.split('\n'); | ||
| const PORT_PATTERN = /(?:Gig(?:abitEthernet)?|Fas(?:tEthernet)?|Ten(?:GigabitEthernet)?|Eth(?:ernet)?|Twe(?:ntyFiveGigE)?|Po(?:rt-channel)?|Gi|Te|Fa)\s*[\d/]+/i; |
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Improves the terminal-output parser to better handle real-world Cisco CLI pastes (abbreviated commands, multi-line CDP entries, and new interface/LLDP parsers) while reducing noise and improving UI error messaging.
Changes:
- Added abbreviated
sh ...prompt support and a command-alias resolver for multi-command splitting. - Introduced
parseInterfaceStatus()andparseLldpNeighbors()and wired them intoparseTerminalOutput()with backward-compat fallbacks. - Improved MAC parsing noise reduction and UI messaging when MAC-table data is missing.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| samples/production-before.txt | Adds a large real-world capture used to validate parsing robustness (but currently contains sensitive data). |
| samples/production-after.txt | Adds a real-world capture focused on MAC/interface/VLAN/CDP/LLDP outputs for parser validation. |
| js/parser.js | Extends command splitting, adds new parsers (interfaces status + LLDP), improves CDP wrap handling, and reduces MAC parsing noise/fallback errors. |
| js/app.js | Improves missing-MAC-table UI errors by listing detected commands/hostname. |
Comments suppressed due to low confidence (1)
js/parser.js:1
commandsFoundis currently derived from the presence of parsed data objects, but some of those objects are now synthesized for backward compatibility (e.g.,interfaceDescderived frominterfaces status,cdpNeighborsderived fromlldp neighbors). This can cause the UI to report commands as “detected” even when they weren’t present in the paste; track “seen commands” separately from “derived compatibility data” (e.g.,sawInterfacesDescription,sawInterfacesStatus,sawCdp,sawLldp) and buildcommandsFoundfrom the saw-flags only.
// Cisco MAC address table parser
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| enable secret 9 $9$Ul64Po8JxM16t0$w6E3BBfxG60Ju2uyQ7l.FCFUFhOpBuScSJmC8vznkMQ | ||
| ! | ||
| username svc_prime privilege 15 secret 9 $9$YvhRekuD07VDGU$6YyzIyKySN6ltDorL8VbnLQN96Ft7qZStCW5Ug0E2zg |
There was a problem hiding this comment.
The repository is committing production configuration secrets/credentials (enable secret, user secrets, SNMP community strings, TACACS keys, etc.). Even if some are hashed/obfuscated, they are still sensitive and should not be stored in-source; please remove this sample or heavily sanitize it (redact secrets/communities/keys/usernames, consider replacing hostnames/IPs too) and add guidance for generating sanitized fixtures if needed.
| ! | ||
| snmp-server group DNACGROUPAuthPriv v3 priv read DNAC-ACCESS write DNAC-ACCESS | ||
| snmp-server view DNAC-ACCESS iso included | ||
| snmp-server community aOR!75Nm7b RO 11 |
There was a problem hiding this comment.
The repository is committing production configuration secrets/credentials (enable secret, user secrets, SNMP community strings, TACACS keys, etc.). Even if some are hashed/obfuscated, they are still sensitive and should not be stored in-source; please remove this sample or heavily sanitize it (redact secrets/communities/keys/usernames, consider replacing hostnames/IPs too) and add guidance for generating sanitized fixtures if needed.
| tacacs server tacacs-tulpaipn06 | ||
| address ipv4 172.25.84.55 | ||
| key 7 144F3E5D464E2C79092022052A50570137 | ||
| timeout 5 | ||
| single-connection | ||
| tacacs server tacacs-edcpaipn06 | ||
| address ipv4 172.30.84.55 | ||
| key 7 135D3B4441460278062C39033D61431022 | ||
| timeout 5 | ||
| single-connection | ||
| tacacs server tacacs-drrpaipn05 | ||
| address ipv4 172.20.204.54 | ||
| key 7 135D3B4441460278062C39033D61431022 |
There was a problem hiding this comment.
The repository is committing production configuration secrets/credentials (enable secret, user secrets, SNMP community strings, TACACS keys, etc.). Even if some are hashed/obfuscated, they are still sensitive and should not be stored in-source; please remove this sample or heavily sanitize it (redact secrets/communities/keys/usernames, consider replacing hostnames/IPs too) and add guidance for generating sanitized fixtures if needed.
| tacacs server tacacs-tulpaipn06 | |
| address ipv4 172.25.84.55 | |
| key 7 144F3E5D464E2C79092022052A50570137 | |
| timeout 5 | |
| single-connection | |
| tacacs server tacacs-edcpaipn06 | |
| address ipv4 172.30.84.55 | |
| key 7 135D3B4441460278062C39033D61431022 | |
| timeout 5 | |
| single-connection | |
| tacacs server tacacs-drrpaipn05 | |
| address ipv4 172.20.204.54 | |
| key 7 135D3B4441460278062C39033D61431022 | |
| tacacs server tacacs-server-01 | |
| address ipv4 192.0.2.10 | |
| key 7 REDACTED_SANITIZED_SAMPLE | |
| timeout 5 | |
| single-connection | |
| tacacs server tacacs-server-02 | |
| address ipv4 192.0.2.11 | |
| key 7 REDACTED_SANITIZED_SAMPLE | |
| timeout 5 | |
| single-connection | |
| tacacs server tacacs-server-03 | |
| address ipv4 192.0.2.12 | |
| key 7 REDACTED_SANITIZED_SAMPLE |
| // Last token is remote port ID | ||
| const remotePort = parts.length > 0 ? parts[parts.length - 1] : ''; | ||
|
|
||
| result.set(localPort, { deviceId, platform, remotePort }); | ||
| result.set(localPort, { deviceId, platform: 'LLDP', remotePort }); |
There was a problem hiding this comment.
parseLldpNeighbors() returns remotePort un-normalized, while parseCdpNeighbors() normalizes remote port names via normalizePort(). This breaks the “same shape / interchangeability” guarantee and can lead to mismatches in downstream enrichment/comparison; normalize the LLDP remotePort the same way (while leaving non-interface IDs like port-001 unchanged).
| } else if (cmd.includes('interfaces status')) { | ||
| result.interfaceStatus = parseInterfaceStatus(block.text); | ||
| // Also populate interfaceDesc from status data for backward compatibility | ||
| if (!result.interfaceDesc) { | ||
| result.interfaceDesc = result.interfaceStatus.descriptions; | ||
| } | ||
| } else if (cmd.includes('cdp neighbors')) { | ||
| result.cdpNeighbors = parseCdpNeighbors(block.text); | ||
| } else if (cmd.includes('lldp neighbors')) { | ||
| result.lldpNeighbors = parseLldpNeighbors(block.text); | ||
| // Also populate cdpNeighbors from LLDP for backward compatibility | ||
| if (!result.cdpNeighbors) { | ||
| result.cdpNeighbors = result.lldpNeighbors; | ||
| } |
There was a problem hiding this comment.
commandsFound is currently derived from the presence of parsed data objects, but some of those objects are now synthesized for backward compatibility (e.g., interfaceDesc derived from interfaces status, cdpNeighbors derived from lldp neighbors). This can cause the UI to report commands as “detected” even when they weren’t present in the paste; track “seen commands” separately from “derived compatibility data” (e.g., sawInterfacesDescription, sawInterfacesStatus, sawCdp, sawLldp) and build commandsFound from the saw-flags only.
| * | ||
| * @param {string} text - Raw command output | ||
| * @returns {{ descriptions: Map<string, { status: string, protocol: string, description: string }>, | ||
| * portVlans: Map<string, number|'trunk'> }} |
There was a problem hiding this comment.
The JSDoc return type for portVlans doesn’t match behavior: the implementation can set 'routed', but the doc says number|'trunk'. Update the doc (and any consuming type assumptions) to include 'routed' or adjust the implementation to match the declared return shape.
| * portVlans: Map<string, number|'trunk'> }} | |
| * portVlans: Map<string, number|'trunk'|'routed'> }} |
| if (vlanField.toLowerCase() === 'trunk') { | ||
| portVlans.set(port, 'trunk'); | ||
| } else if (vlanField.toLowerCase() === 'routed') { | ||
| portVlans.set(port, 'routed'); |
There was a problem hiding this comment.
The JSDoc return type for portVlans doesn’t match behavior: the implementation can set 'routed', but the doc says number|'trunk'. Update the doc (and any consuming type assumptions) to include 'routed' or adjust the implementation to match the declared return shape.
| * Abbreviation map: maps common abbreviated command forms to their canonical form. | ||
| * Checked before RECOGNIZED_COMMANDS matching. Order: most specific first. |
There was a problem hiding this comment.
The comment states aliases are “checked before RECOGNIZED_COMMANDS matching,” but resolveCommand() has no fallback to match RECOGNIZED_COMMANDS. Either implement the fallback (so full canonical commands don’t need to be duplicated in COMMAND_ALIASES) or update the comment/structure to reflect that aliases are the sole recognition mechanism.
| * Abbreviation map: maps common abbreviated command forms to their canonical form. | |
| * Checked before RECOGNIZED_COMMANDS matching. Order: most specific first. | |
| * Abbreviation map: maps abbreviated and canonical command forms to their canonical form. | |
| * Entries are matched in order, so keep the most specific forms first. | |
| * Canonical commands must also appear here if they should be recognized directly. |
| function resolveCommand(showArgs) { | ||
| const lower = showArgs.toLowerCase(); | ||
| for (const [abbrev, canonical] of COMMAND_ALIASES) { | ||
| if (lower.startsWith(abbrev)) return canonical; | ||
| } | ||
| return null; | ||
| } |
There was a problem hiding this comment.
The comment states aliases are “checked before RECOGNIZED_COMMANDS matching,” but resolveCommand() has no fallback to match RECOGNIZED_COMMANDS. Either implement the fallback (so full canonical commands don’t need to be duplicated in COMMAND_ALIASES) or update the comment/structure to reflect that aliases are the sole recognition mechanism.
Summary
PROMPT_REmatchesshas well asshow; new alias map resolvessh cdp nei,int stat,mac ad, etc.parseInterfaceStatus()forshow interfaces statusandparseLldpNeighbors()forshow lldp neighbors, both backward-compatible with existing enrichment consumersCloses TRMC-12
Test plan
show interfaces status), 1 CDP neighbor, 15 VLANs — 0 false MAC errors🤖 Generated with Claude Code