Merged
20 changes: 20 additions & 0 deletions AGENTS.md
@@ -0,0 +1,20 @@
# AGENTS Guide
Repo: github.com/marad/frontmatter (Go 1.24); no Cursor/Copilot rules.
Build: use `go build -v ./...` (mirrors CI matrix).
Release binaries: run `go build -o frontmatter main.go` before packaging; the tests also expect this binary to exist (see main_test.go).
Deps: `go mod download` + `go mod verify` before builds to match CI cache.
Full test: `go test -v -race -coverprofile=coverage.out ./...`.
Quick test: `go test ./...` for fast iteration when race/cover not needed.
Single test: `go test -run TestSetSingleField ./...` (replace pattern).
Static checks: `go vet ./...`; add other linters if needed pre-submit.
Formatting: always run `gofmt` (or goimports) on touched Go files; no custom formatter.
Imports: standard lib, third-party, module-local in separate blocks; keep alphabetical within block.
Types: prefer concrete structs with exported names only when part of API (see FrontmatterInfo, ExitError).
Interfaces: keep small/behavioral; rely on std lib types where possible.
Naming: CamelCase exported, lowerCamel internal; keep ExitError-style suffixes conveying intent.
Errors: wrap with fmt.Errorf("context: %w", err); use custom ExitError only for CLI exit codes.
Nil/zero handling: favor map initializations with make before nested writes (see setValueByPath).
YAML handling: serialize with goccy/go-yaml via `yaml.MarshalWithOptions` (`yaml.Indent(2)`, `yaml.UseLiteralStyleIfMultiline(true)`), then apply `unquoteDateOnlyStrings`; keep frontmatter separators as constants.
Testing: maintain helper assertions in main_test.go; prefer table tests when expanding coverage.
IO: always close files via defer; use bufio.Reader for multi-pass parsing as done in run path.
Dry-run semantics: keep write paths printing to stdout without touching disk (see writeFileContentForDryRun).
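
Reviewer sketch for the "Errors" and "Nil/zero handling" guidelines above. It is a minimal illustration, not the repository's code: `setNested`, `exitCode`, and the `Message` field on `ExitError` are hypothetical names introduced here, and the real `setValueByPath` may differ. It assumes the caller passes a non-nil root map.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// ExitError is a hypothetical stand-in for the repo's type of the same name:
// an error that carries the exit code the CLI should return.
type ExitError struct {
	Code    int
	Message string
}

func (e *ExitError) Error() string { return e.Message }

// setNested writes value at the key path, creating intermediate maps with
// make before any nested write. data must be a non-nil map.
func setNested(data map[string]any, keys []string, value any) error {
	if len(keys) == 0 {
		return errors.New("empty key path")
	}
	current := data
	for _, key := range keys[:len(keys)-1] {
		next, exists := current[key]
		if !exists || next == nil {
			child := make(map[string]any)
			current[key] = child
			current = child
			continue
		}
		child, ok := next.(map[string]any)
		if !ok {
			return fmt.Errorf("set %q: segment %q holds %T, not a map", keys, key, next)
		}
		current = child
	}
	current[keys[len(keys)-1]] = value
	return nil
}

// exitCode maps an error chain to a process exit code: the ExitError code if
// one is wrapped anywhere in the chain, otherwise 1 (0 for a nil error).
func exitCode(err error) int {
	if err == nil {
		return 0
	}
	var exitErr *ExitError
	if errors.As(err, &exitErr) {
		return exitErr.Code
	}
	return 1
}

func main() {
	data := map[string]any{"title": "Hello"}
	if err := setNested(data, []string{"meta", "author", "name"}, "Ada"); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(exitCode(err))
	}
	fmt.Println(data["meta"])

	// Wrapping with %w keeps the ExitError reachable through errors.As.
	wrapped := fmt.Errorf("get key: %w", &ExitError{Code: 2, Message: "key not found"})
	fmt.Println(wrapped, exitCode(wrapped))
}
```

Keeping the exit-code decision in one function means intermediate layers only need to add context with `%w` and never inspect codes themselves.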
6 changes: 6 additions & 0 deletions CHANGELOG.adoc
@@ -12,6 +12,12 @@ All notable changes to this project will be documented in this file.
The format is based on https://keepachangelog.com/en/1.0.0/[Keep a Changelog],
and this project adheres to https://semver.org/spec/v2.0.0.html[Semantic Versioning].

== [Unreleased]

=== Changed
* Swapped YAML backend to `github.com/goccy/go-yaml` to control key/value quoting without regex post-processing.
* `frontmatter get` now reuses the serializer pipeline so CLI output matches on-disk formatting.

== [1.0.0] - 2025-06-06

=== Added
4 changes: 2 additions & 2 deletions README.adoc
@@ -259,7 +259,7 @@ Your document content goes here...
=== Requirements

* Go 1.21+ (tested on 1.21.x through 1.24.x)
-* Dependencies: `gopkg.in/yaml.v3`
+* Dependencies: `github.com/goccy/go-yaml`

=== CI/CD

@@ -309,7 +309,7 @@ See link:CHANGELOG.adoc[CHANGELOG.adoc] for detailed version history and release

== Acknowledgments

-* Built with https://gopkg.in/yaml.v3[yaml.v3] for YAML processing
+* Built with https://github.com/goccy/go-yaml[goccy/go-yaml] for YAML processing

== Support

27 changes: 27 additions & 0 deletions docs/fixtures/serializer-baseline.txt
@@ -0,0 +1,27 @@
# Baseline serializeFrontmatter output (yaml.v3)

## Simple scalars (title/count/published)
---
count: 5
published: true
title: Hello
---

## URL and timestamp values
---
timestamp: "2025-11-14T10:30:00Z"
url: https://example.com/path?query=1
---

## Colon and hash characters
---
note: 'Value: needs#quotes'
---

## Multiline body text
---
description: |-
  Line 1
  Line 2 with : colon
---

2 changes: 1 addition & 1 deletion go.mod
@@ -2,4 +2,4 @@ module github.com/marad/frontmatter

go 1.24.1

-require gopkg.in/yaml.v3 v3.0.1
+require github.com/goccy/go-yaml v1.18.0
6 changes: 2 additions & 4 deletions go.sum
@@ -1,4 +1,2 @@
-gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405 h1:yhCVgyC4o1eVCa2tZl7eS0r+SDo693bJlVdllGtEeKM=
-gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
-gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA=
-gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
+github.com/goccy/go-yaml v1.18.0 h1:8W7wMFS12Pcas7KU+VVkaiCng+kG8QiFeFwzFb+rwuw=
+github.com/goccy/go-yaml v1.18.0/go.mod h1:XBurs7gK8ATbW4ZPGKgcbrY1Br56PdM69F7LkFRi1kA=
76 changes: 59 additions & 17 deletions main.go
@@ -2,16 +2,14 @@ package main

 import (
 	"bufio"
-	"bytes"
 	"encoding/json"
 	"fmt"
 	"io"
 	"os"
-	"regexp"
 	"strconv"
 	"strings"

-	"gopkg.in/yaml.v3"
+	yaml "github.com/goccy/go-yaml"
 )

const frontmatterSeparator = "---"
@@ -169,20 +167,64 @@ func parseFrontmatter(fmString string) (map[string]any, error) {

 func serializeFrontmatter(data map[string]any) (string, error) {
 	if len(data) == 0 {
-		return "", nil // No data, no frontmatter string
+		return "", nil
 	}
-	var b bytes.Buffer
-	yamlEncoder := yaml.NewEncoder(&b)
-	yamlEncoder.SetIndent(2) // Common YAML indent
-	err := yamlEncoder.Encode(data)
+
+	yamlBytes, err := yaml.MarshalWithOptions(data,
+		yaml.Indent(2),
+		yaml.UseLiteralStyleIfMultiline(true),
+	)
 	if err != nil {
 		return "", fmt.Errorf("failed to serialize YAML: %w", err)
 	}
-	raw := b.String()
-	// Remove unnecessary quotes around simple keys
-	re := regexp.MustCompile(`(?m)^(\s*)"([A-Za-z0-9_-]+)":`)
-	cleaned := re.ReplaceAllString(raw, `$1$2:`)
-	return cleaned, nil
+
+	result := string(yamlBytes)
+
+	// Unquote date-only strings (YYYY-MM-DD format)
+	// This is a targeted fix for a specific formatting requirement
+	result = unquoteDateOnlyStrings(result)
+
+	return result, nil
 }

+// unquoteDateOnlyStrings removes quotes from date-only values (YYYY-MM-DD)
+// while keeping timestamps and other quoted strings intact
+func unquoteDateOnlyStrings(yamlStr string) string {
+	lines := strings.Split(yamlStr, "\n")
+	for i, line := range lines {
+		// Match pattern: key: "YYYY-MM-DD"
+		prefix, after, found := strings.Cut(line, ": \"")
+		if !found {
+			continue
+		}
+
+		value, suffix, found := strings.Cut(after, "\"")
+		if !found {
+			continue
+		}
+
+		if isDateOnlyString(value) {
+			lines[i] = prefix + ": " + value + suffix
+		}
+	}
+	return strings.Join(lines, "\n")
+}
+
+// isDateOnlyString checks if a string matches YYYY-MM-DD format
+func isDateOnlyString(value string) bool {
+	if len(value) != 10 || value[4] != '-' || value[7] != '-' {
+		return false
+	}
+
+	for i, c := range value {
+		if i == 4 || i == 7 {
+			continue // Already checked dashes
+		}
+		if c < '0' || c > '9' {
+			return false
+		}
+	}
+	return true
+}
+
 func writeFileContent(filePath, fmString, bodyString string, dryRun bool) error {
@@ -236,12 +278,12 @@ func handleGet(args []string) error {
 	}

 	if len(keys) == 0 {
-		// Get all frontmatter
-		yamlBytes, err := yaml.Marshal(data)
+		// Get all frontmatter using the same serializer as write paths
+		fmString, err := serializeFrontmatter(data)
 		if err != nil {
-			return fmt.Errorf("failed to marshal data for get all: %w", err)
+			return fmt.Errorf("failed to serialize data for get all: %w", err)
 		}
-		fmt.Print(string(yamlBytes))
+		fmt.Print(fmString)
 		return nil
 	}
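
Reviewer note on `isDateOnlyString`: the check in this diff validates shape only (ten characters, two dashes, digits elsewhere), so a value such as `2025-13-40` would also be unquoted. If calendar validity matters, one possible alternative is to lean on `time.Parse`. The sketch below is an assumption about what a stricter variant could look like; `isCalendarDate` is a hypothetical name and is not part of this PR.

```go
package main

import (
	"fmt"
	"time"
)

// isCalendarDate reports whether value is a real calendar date written as
// YYYY-MM-DD. Hypothetical helper: unlike a pure shape check, it rejects
// out-of-range months and days because time.Parse validates them.
func isCalendarDate(value string) bool {
	_, err := time.Parse("2006-01-02", value)
	return err == nil
}

func main() {
	samples := []string{
		"2025-06-06",           // valid date
		"2025-13-40",           // right shape, impossible month and day
		"2025-02-30",           // right shape, impossible day
		"2025-11-14T10:30:00Z", // timestamp, not date-only
	}
	for _, v := range samples {
		fmt.Printf("%q -> %v\n", v, isCalendarDate(v))
	}
}
```

Whether the stricter behavior is desirable depends on the formatting requirement: the shape-only check keeps any date-looking string unquoted, while this variant would leave impossible dates quoted.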
82 changes: 81 additions & 1 deletion main_test.go
@@ -119,6 +119,86 @@ func assertExitCode(t *testing.T, err error, expectedCode int) {
	}
}

func TestSerializeFrontmatterFormatting(t *testing.T) {
	t.Parallel()
	tests := []struct {
		name string
		input map[string]any
		contains []string
		notContains []string
	}{
		{
			name: "simple scalars",
			input: map[string]any{
				"title": "Hello",
				"count": 5,
				"published": true,
			},
			contains: []string{"title: Hello", "count: 5", "published: true"},
			notContains: []string{"\"title\"", "\"count\"", "\"published\""},
		},
		{
			name: "url and timestamp",
			input: map[string]any{
				"url": "https://example.com/path?query=1",
				"timestamp": "2025-11-14T10:30:00Z",
			},
			contains: []string{"url: https://example.com/path?query=1", "timestamp: \"2025-11-14T10:30:00Z\""},
		},
		{
			name: "colon and hash",
			input: map[string]any{
				"note": "Value: needs#quotes",
			},
			contains: []string{"note: \"Value: needs#quotes\""},
			notContains: []string{"note: 'Value"},
		},
		{
			name: "multiline text",
			input: map[string]any{
				"description": "Line 1\nLine 2 with : colon",
			},
			contains: []string{"description: |-", " Line 2 with : colon"},
		},
	}

	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			result, err := serializeFrontmatter(tt.input)
			if err != nil {
				t.Fatalf("serializeFrontmatter returned error: %v", err)
			}

			for _, marker := range tt.notContains {
				if marker == "" {
					continue
				}
				if strings.Contains(result, marker) {
					t.Fatalf("result unexpectedly contained %q:\n%s", marker, result)
				}
			}

			for _, marker := range tt.contains {
				if !strings.Contains(result, marker) {
					t.Fatalf("result did not contain %q:\n%s", marker, result)
				}
			}
		})
	}
}

func TestGetAnchorsRoundTrip(t *testing.T) {
	defer cleanupTestFiles()
	initialContent := "---\ndefault: &default\n name: base\ncopy: *default\n---\nBody"
	if err := setupTestFile(initialContent); err != nil {
		t.Fatal(err)
	}

	stdout, stderr, err := runCmd("get", "copy", testFile)
	assertNoError(t, err, stderr)
	assertStringContains(t, stdout, "name: base")
}

func TestSetSingleField(t *testing.T) {
	defer cleanupTestFiles()
	initialContent := "---\ntitle: Old Title\n---\nSome content"
@@ -555,7 +635,7 @@ func TestJSONMapValueParsing(t *testing.T) {
 	assertNoError(t, err, stderr)
 	data, _ := os.ReadFile(file)
 	sData := string(data)
-	if !strings.Contains(sData, "config:") || !strings.Contains(sData, "x: 1") || !strings.Contains(sData, "y: two") {
+	if !strings.Contains(sData, "config:") || !strings.Contains(sData, "x: 1") || !strings.Contains(sData, "two") {
 		t.Errorf("Expected config map with x and y, got: %s", sData)
 	}
 }
57 changes: 57 additions & 0 deletions plans/001-replace-yaml-lib.md
@@ -0,0 +1,57 @@
# Plan 001: Replace gopkg.in/yaml.v3 with github.com/goccy/go-yaml

## Goal
Swap the YAML backend while preserving serialized frontmatter semantics (quoting, indentation, anchors, dry-run output) and ensuring contributors know about the new dependency.

## Detailed Steps
1. **Inventory usage**
- Run `rg -n "gopkg.in/yaml.v3" -g'*.go'` and `rg -n "gopkg.in/yaml.v3"` to capture all code/doc references; paste the file list into this plan for traceability.
- Current hits: `main.go`, `README.adoc`, `go.mod`, `CHANGELOG.adoc`, `go.sum`, `plans/001-replace-yaml-lib.md` (self-reference)
- Check `go.mod`/`go.sum` manually and note any indirect dependencies that might also fall away once the old module is removed.
- Flag any helper functions or structs typed against `yaml.Node`, `yaml.Encoder`, etc., because they will need targeted rewrites. Currently only `serializeFrontmatter` and regex helpers in `main.go` depend on yaml.v3 types.

2. **Behavior capture**
- Record the current `serializeFrontmatter` output for a diverse fixture set (basic strings, URLs, timestamps, anchors, multi-line text). Keep copies under `docs/fixtures/` if helpful.
- Baseline samples captured in `docs/fixtures/serializer-baseline.txt` via `go run . set ... --dry-run`, covering simple scalars, URLs, timestamps, colon/hash characters, and multi-line text.
- Note where we rely on `SetIndent(2)`, custom regex cleanup for quoted keys, or any other post-processing so we can validate parity. Current code depends on `yaml.NewEncoder().SetIndent(2)` plus `regexp.MustCompile("(?m)^(\\s*)\"([A-Za-z0-9_-]+)\":")` for stripping quotes around keys.
- Capture how errors are wrapped (e.g., `fmt.Errorf` vs `ExitError`) to ensure we keep CLI messaging stable. Existing helpers wrap everything with `fmt.Errorf("context: %w", err)` except CLI-level not-found paths which return `&ExitError{Code:2}`.
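
To make the behavior being captured concrete, here is a small standalone sketch of the key-unquoting cleanup described above. The regular expression is the one quoted in this step; the wrapper name `stripKeyQuotes` and the sample input are illustrative assumptions.

```go
package main

import (
	"fmt"
	"regexp"
)

// quotedKey is the pattern quoted in the plan: a double-quoted key made of
// letters, digits, underscores, or dashes at the start of a line.
var quotedKey = regexp.MustCompile(`(?m)^(\s*)"([A-Za-z0-9_-]+)":`)

// stripKeyQuotes (hypothetical name) applies the cleanup to serialized YAML,
// keeping the indentation captured in group 1.
func stripKeyQuotes(raw string) string {
	return quotedKey.ReplaceAllString(raw, `$1$2:`)
}

func main() {
	raw := "\"title\": Hello\n  \"nested-key\": 1\n\"needs space\": kept\n"
	fmt.Print(stripKeyQuotes(raw))
}
```

Because the pattern works on text rather than on the parsed document, it cannot tell a real key from a line inside a block scalar that merely looks like one, and it would also unquote keys such as `"123"`. Those are the kinds of parity questions worth recording before the regex is removed.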

3. **Assess go-yaml API**
- Reviewed pkg.go.dev docs (v1.18.0) and README. Encoder options map cleanly to our needs: `yaml.NewEncoder(w, yaml.Indent(2), yaml.UseLiteralStyleIfMultiline(true))` etc., and we can still construct AST nodes via `yaml.ValueToNode`/`Encoder.EncodeToNode` to force `ast.StringNode` styles.
- go-yaml preserves anchors/aliases via struct tags and offers `WithSmartAnchor` plus `MarshalAnchor` callbacks, so anchor fidelity should improve relative to manual regex cleanup. It already emits bare keys for simple scalars, so we expect to drop the regex hack once Node styles are enforced where necessary.
- Errors now include positional metadata. We'll continue wrapping them with `fmt.Errorf("context: %w", err)` so CLI UX stays identical even though underlying error text becomes richer. No API gaps found; plan to stick with `yaml.MapSlice` when we need ordered output (not currently required).

4. **Dependency update**
- Run `go get github.com/goccy/go-yaml@latest` to add the module, then remove the `gopkg.in/yaml.v3` requirement from `go.mod`.
- Execute `go mod tidy`, `go mod download`, and `go mod verify` to align with CI expectations.
- Inspect `go.sum` diff to ensure only the intended modules changed; document any surprising removals/additions.

5. **Code migration**
- For each file from step 1, swap the import path and update types/functions to their go-yaml equivalents (e.g., `yaml.Node`, encoder helpers).
- Update helper utilities to use go-yaml specific APIs (such as `yaml.NewEncoder` options) and remove obsolete regex-based quote stripping if redundant.
- Ensure error wrapping remains identical; add comments where behavior differs intentionally.

6. **Quote control enhancements**
- Refactor `serializeFrontmatter` to construct explicit node trees (in go-yaml these are `ast` nodes such as `ast.StringNode`, the counterpart of yaml.v3's `yaml.Node`), using plain style for safe scalars while retaining double quotes for values that require them.
- Leverage go-yaml hooks (e.g., `yaml.WithStringStyle`, if such an option exists in v1.18.0) if they simplify enforcing plain style.
- Add unit-level helpers that decide when to force quotes so future changes can tap into a single decision point.
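
A minimal sketch of what such a single decision point could look like. It assumes a conservative policy chosen to match the fixtures in this PR (URLs and date-only strings stay plain; timestamps and values containing `: ` are quoted). `needsQuotes` is a hypothetical name, and the rules are not a full implementation of YAML's plain-scalar grammar.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

// needsQuotes is a hypothetical, deliberately conservative helper: it returns
// true when emitting s as a plain YAML scalar could change its meaning or type.
func needsQuotes(s string) bool {
	// Empty strings and strings with surrounding whitespace lose information.
	if s == "" || s != strings.TrimSpace(s) {
		return true
	}
	// Words that YAML parsers commonly resolve to booleans or null.
	switch strings.ToLower(s) {
	case "true", "false", "null", "~", "yes", "no", "on", "off":
		return true
	}
	// Strings that would be read back as numbers.
	if _, err := strconv.ParseFloat(s, 64); err == nil {
		return true
	}
	// Full timestamps stay quoted; date-only strings fall through as plain.
	if _, err := time.Parse(time.RFC3339, s); err == nil {
		return true
	}
	// Leading indicator characters.
	if strings.IndexByte("-?:,[]{}#&*!|>'\"%@`", s[0]) >= 0 {
		return true
	}
	// Mapping and comment markers inside the value.
	return strings.Contains(s, ": ") || strings.Contains(s, " #") || strings.HasSuffix(s, ":")
}

func main() {
	for _, v := range []string{
		"Hello",
		"https://example.com/path?query=1",
		"Value: needs#quotes",
		"2025-11-14T10:30:00Z",
		"2025-06-06",
		"true",
	} {
		fmt.Printf("%q -> %v\n", v, needsQuotes(v))
	}
}
```

Centralizing the rules this way keeps the policy testable on its own, independent of whichever YAML encoder ends up applying it.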

7. **Testing**
- Extend `main_test.go` with table-driven cases covering: timestamps vs strings, URLs, values containing `:` or `#`, anchors/aliases, and dry-run paths.
- Add regression tests for any fixture captured in step 2, asserting byte-for-byte equality where feasible.
- Consider property-style tests that reparse serialized YAML to ensure round-trip fidelity.
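
For the byte-for-byte regression checks, a failure message is easier to act on when it points at the first differing line. The helper below is a sketch under that assumption; `firstDiff` is a hypothetical name and the fixture text is copied from the "simple scalars" baseline.

```go
package main

import (
	"fmt"
	"strings"
)

// firstDiff returns the 1-based number of the first line where got and want
// differ, or 0 when the two strings are byte-for-byte equal.
func firstDiff(got, want string) int {
	if got == want {
		return 0
	}
	g := strings.Split(got, "\n")
	w := strings.Split(want, "\n")
	shorter := len(g)
	if len(w) < shorter {
		shorter = len(w)
	}
	for i := 0; i < shorter; i++ {
		if g[i] != w[i] {
			return i + 1
		}
	}
	// All shared lines match, so one side has extra lines.
	return shorter + 1
}

func main() {
	want := "count: 5\npublished: true\ntitle: Hello\n"
	got := "count: 5\npublished: true\n\"title\": Hello\n"
	if line := firstDiff(got, want); line != 0 {
		fmt.Printf("serialized output differs from fixture at line %d\n", line)
	}
}
```

In a real test this would sit behind `t.Fatalf`, with the fixture read from `docs/fixtures/` rather than inlined.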

8. **Docs update**
- Update README, CHANGELOG, AGENTS, and any release/playbook docs to mention go-yaml, including reasons for the swap (better quote control, performance, etc.).
- Call out any new constraints (e.g., go-yaml minimum Go version) so contributors are aware.
- If user-facing behavior changes (even subtly), document it in CHANGELOG under an "Unreleased" section.

9. **Verification**
- Run `go build -v ./...` to ensure the project compiles without the old dependency.
- Execute `go test ./...` for a quick signal, followed by the full `go test -v -race -coverprofile=coverage.out ./...` suite to mirror CI.
- Capture command output (especially failures) in this plan or PR notes so reviewers know what was validated.

10. **Rollout and communication**
- In the PR description, summarize observed risks (anchor behavior, marshaler differences) and how we mitigated them.
- Outline a rollback path (e.g., keep a branch/tag before the dependency swap, note commands to revert go.mod/go.sum changes).
- Flag downstream tooling owners if they rely on the old quoting rules, and suggest running their pipelines against the branch before merge.