IMAgentLabs/agent-readable-pages-source-notice

Agent Readable Pages and Source Notice

A lightweight WordPress plugin for making selected public posts and pages easier for AI agents, LLMs, crawlers, and automation workflows to discover, read, and cite.

The experiment behind the plugin is simple: if AI systems are built to satisfy users and answer their questions, site owners should be able to clearly and politely tell those systems which public content is available, how it should be cited, and where the clean machine-readable version lives.

Built by Benjamin Hübner for IMAgentLabs.

What it does

Agent Readable Pages and Source Notice adds opt-in AI-readable views to WordPress content without changing the normal browsing experience for human visitors.

It can expose:

  • a site-wide /agent.txt index for quick agent discovery;
  • a site-wide /llms.txt index for LLM-friendly site guidance;
  • JSON endpoints for selected posts and pages;
  • clean Markdown endpoints for selected posts and pages;
  • plain-text per-content agent.txt endpoints;
  • optional visible “Agent version” links on public pages;
  • optional source/citation notices for enabled content;
  • per-post and per-page overrides in the WordPress editor;
  • a safe AI bot policy helper for previewing robots.txt snippets.

The plugin is intentionally conservative: only published content that you enable is exposed through the public endpoints. Drafts, private posts, and disabled content stay hidden.

Key features

  • Opt-in AI-readable content: enable all posts, all pages, or individual content items.
  • Clean JSON endpoint: title, summary, canonical URL, author, publisher, timestamps, language, license, body text, source list, and citation guidance.
  • Clean Markdown endpoint: a human- and LLM-readable version of enabled content.
  • Fast /agent.txt discovery: a concise site-level text file that links agents to enabled public content.
  • /llms.txt support: a curated Markdown index for AI agents and LLM workflows.
  • Visible Agent version link: optionally show a small public link before or under the content, or keep it hidden and expose it through discovery tags only.
  • Source attribution notice: optionally append a citation/source notice to enabled content.
  • Per-content controls: enable, disable, or override settings per post/page.
  • AI Bot Policy Helper: preview safe robots.txt snippets for AI search/discovery vs. model-training crawlers without overwriting physical files.
  • WordPress-native admin UI: top-level Agent Readable menu, Settings submenu, plugin action link, Settings API sanitization, nonce checks, and capability checks.

Why this exists

AI systems increasingly summarize, cite, search, and transform web content. Traditional web pages are optimized for human reading and visual presentation, not necessarily for agentic workflows.

This plugin gives site owners a clean way to say:

  • this public content is intentionally available to agents;
  • this is the canonical source URL;
  • this is the preferred citation guidance;
  • this is the license/source context;
  • this is the clean text/Markdown version to use instead of scraping the full theme markup.

It does not promise AI ranking improvements. /llms.txt, /agent.txt, and agent-readable endpoints are discoverability and clarity tools, not guaranteed search or AI Overview ranking signals.

Endpoints

Site-level files

GET /agent.txt
GET /llms.txt

Per-content REST endpoints

GET /wp-json/agent-readable/v1/page/<post-id>
GET /wp-json/agent-readable/v1/page/<post-id>/markdown
GET /wp-json/agent-readable/v1/page/<post-id>/agent.txt

These endpoints return 404 unless the content is published and enabled for agent-readable output. Draft/private content is blocked.
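
As a rough client-side illustration, the three per-content URLs can be derived from a site URL and a post ID. This is a minimal sketch, not part of the plugin: the helper name `agent_endpoints` is hypothetical, and it assumes the site uses pretty permalinks with the default `/wp-json/` REST prefix.

```python
from urllib.parse import urljoin

# Route taken from the endpoint list above; the /wp-json/ prefix is assumed
# to be the WordPress default and may differ on customized installs.
REST_PREFIX = "wp-json/agent-readable/v1/page"


def agent_endpoints(site_url: str, post_id: int) -> dict:
    """Return the JSON, Markdown, and agent.txt URLs for one post or page."""
    base = site_url.rstrip("/") + "/"  # keep any sub-directory in the site URL
    page = urljoin(base, f"{REST_PREFIX}/{post_id}")
    return {
        "json": page,
        "markdown": f"{page}/markdown",
        "agent_txt": f"{page}/agent.txt",
    }


if __name__ == "__main__":
    for name, url in agent_endpoints("https://example.com", 123).items():
        print(name, url)
```

A client should still expect a 404 from any of these URLs when the item is not published or not enabled.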

Example JSON response

{
  "page_id": 123,
  "post_type": "post",
  "title": "Example title",
  "summary": "Short summary of the page.",
  "canonical_url": "https://example.com/example-title/",
  "agent_txt_url": "https://example.com/wp-json/agent-readable/v1/page/123/agent.txt",
  "markdown_url": "https://example.com/wp-json/agent-readable/v1/page/123/markdown",
  "author_name": "Example Author",
  "publisher_name": "Example Publisher",
  "published_at": "2026-05-05T03:07:00+00:00",
  "updated_at": "2026-05-05T03:07:00+00:00",
  "language": "en-US",
  "license": "All rights reserved",
  "citation_preferred_text": "Please cite the canonical URL when referencing this page.",
  "source_notice": "If you reference this page, cite the canonical URL as the source.",
  "body_text": "Plain text body content goes here.",
  "source_list": [
    {
      "label": "Official documentation",
      "url": "https://example.com/docs"
    }
  ]
}

See examples/page-output.json for a fuller sample.
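
One way an agent might consume this response is to pull out the fields it needs for attribution. The sketch below assumes the field names shown in the example above; the function name `summarize_for_citation` and the shape of its return value are illustrative only.

```python
import json

# Trimmed copy of the example response above, used as offline test input.
SAMPLE = json.dumps({
    "page_id": 123,
    "title": "Example title",
    "canonical_url": "https://example.com/example-title/",
    "citation_preferred_text": "Please cite the canonical URL when referencing this page.",
    "source_list": [
        {"label": "Official documentation", "url": "https://example.com/docs"}
    ],
})


def summarize_for_citation(raw: str) -> dict:
    """Extract the attribution-related fields from a JSON endpoint response."""
    data = json.loads(raw)
    return {
        "title": data["title"],
        "cite_url": data["canonical_url"],
        # Optional fields fall back to empty values if they are absent.
        "guidance": data.get("citation_preferred_text", ""),
        "sources": [entry["url"] for entry in data.get("source_list", [])],
    }


if __name__ == "__main__":
    print(summarize_for_citation(SAMPLE))
```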

Installation

  1. Download or clone this repository.

  2. Copy the plugin folder to your WordPress install:

    wp-content/plugins/agent-readable-pages-source-notice

  3. Activate Agent Readable Pages and Source Notice in the WordPress admin.

  4. Open Agent Readable in the main WordPress admin sidebar.

  5. Enable the content types or individual posts/pages you want to expose.

  6. If /agent.txt or /llms.txt returns a 404 after activation, visit Settings → Permalinks once or flush rewrite rules.

WP-CLI example:

wp plugin activate agent-readable-pages-source-notice
wp rewrite flush

Settings

After activation, open Agent Readable in the main WordPress admin sidebar. The same settings page is also available under Settings → Agent Readable Pages and from the Settings action link in the plugin's row on the Plugins screen.

Important settings:

  • Enable for all posts: expose every published post through the agent-readable endpoints.
  • Enable for all pages: expose every published page through the agent-readable endpoints.
  • LLM citation instruction: default citation guidance included in outputs.
  • Visible source notice: optional source notice shown to human visitors.
  • Show visible notice: controls whether the visible notice is appended to enabled content.
  • Default license: default license text included in outputs.
  • Enable llms.txt: serve a curated site index at /llms.txt.
  • Enable agent.txt: serve /agent.txt and per-content agent.txt text files.
  • Prefer agent.txt links: make plain-text agent links primary where possible.
  • Visible agent link position: hidden, before content, or under content.
  • Visible agent link label: default label is Agent version.
  • AI bot policy preset: generate a previewable robots policy block.
  • Append to WordPress virtual robots.txt: opt in to append the generated block to WordPress’ virtual robots.txt; physical files are never overwritten.

Text fields support placeholders:

{title}
{url}
{site_name}
{post_type}

Default citation instruction:

When using or summarizing this content with an AI system, cite the original source: {title} — {url}
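
To make the placeholder behavior concrete, here is a minimal Python sketch of how such a template could be expanded. It is not the plugin's implementation (the plugin is a WordPress plugin, and its handling of unknown or missing placeholders is not documented here); the function name `expand_placeholders` and the choice to leave unresolved placeholders untouched are assumptions.

```python
import re

# The four placeholders documented above.
PLACEHOLDERS = ("title", "url", "site_name", "post_type")
_PATTERN = re.compile(r"\{(" + "|".join(PLACEHOLDERS) + r")\}")

DEFAULT_CITATION = (
    "When using or summarizing this content with an AI system, "
    "cite the original source: {title} — {url}"
)


def expand_placeholders(template: str, values: dict) -> str:
    """Replace known placeholders; leave unknown or missing ones as written."""
    def substitute(match):
        # Assumption: a placeholder without a value stays in the text unchanged.
        return values.get(match.group(1), match.group(0))

    return _PATTERN.sub(substitute, template)


if __name__ == "__main__":
    print(expand_placeholders(
        DEFAULT_CITATION,
        {"title": "Example title", "url": "https://example.com/example-title/"},
    ))
```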

Per-post and per-page settings

Every post and page editor includes an Agent Readable Settings panel.

Per-content controls:

  • Use global setting / not individually enabled: inherit global post/page settings.
  • Enable agent-readable JSON for this content: expose this item even when global activation is off.
  • Disable for this content: hide this item even when its post type is globally enabled.
  • Summary override: custom summary for this content.
  • LLM citation instruction override: custom citation guidance.
  • Visible source notice override: custom visible notice.
  • License override: custom license.
  • Source list JSON override: optional JSON array of source links.

For automation, the same per-content values can also be set as post meta with WP-CLI:

wp post meta update <post-id> _agent_readable_enabled 1
wp post meta update <post-id> _agent_readable_summary "A concise page summary."
wp post meta update <post-id> _agent_readable_citation_preferred_text "Please cite this page using the canonical URL."
wp post meta update <post-id> _agent_readable_source_notice "If you reference this page, cite the canonical URL as the source."
wp post meta update <post-id> _agent_readable_license "All rights reserved"
wp post meta update <post-id> _agent_readable_source_list '[{"label":"Official docs","url":"https://example.com/docs"}]'
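
Because the source list is passed as a raw JSON string, it can help to check it before running the last command. The following is a small pre-flight sketch under stated assumptions: entries are objects with `label` and `url` keys, as in the example above, and only `http`/`https` URLs are accepted. Those rules are this sketch's own, not the plugin's documented validation, and `validate_source_list` is a hypothetical helper.

```python
import json
from urllib.parse import urlparse


def validate_source_list(raw: str) -> list:
    """Parse a source-list JSON string and return cleaned entries, or raise ValueError."""
    try:
        items = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(items, list):
        raise ValueError("source list must be a JSON array")

    cleaned = []
    for index, item in enumerate(items):
        if not isinstance(item, dict):
            raise ValueError(f"entry {index} is not a JSON object")
        label, url = item.get("label"), item.get("url")
        if not isinstance(label, str) or not label.strip():
            raise ValueError(f"entry {index} needs a non-empty label")
        if not isinstance(url, str):
            raise ValueError(f"entry {index} needs a url string")
        parsed = urlparse(url)
        # Assumption: only absolute http(s) links make sense as sources.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"entry {index} has an invalid url: {url!r}")
        cleaned.append({"label": label.strip(), "url": url})
    return cleaned


if __name__ == "__main__":
    print(validate_source_list('[{"label":"Official docs","url":"https://example.com/docs"}]'))
```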

/agent.txt

When enabled, /agent.txt provides a concise plain-text entry point for agents. It points to:

  • the site URL;
  • /llms.txt when enabled;
  • the REST namespace;
  • enabled public posts/pages;
  • each item’s direct per-content agent text URL.

The plugin also emits <link rel="alternate" type="text/plain"> discovery tags site-wide and on enabled singular content.
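
An agent that starts from a normal HTML page could locate the plain-text version through these tags. The sketch below uses only the Python standard library; the `href` values in the sample are assumptions for illustration, since the exact URLs the plugin emits are not listed here.

```python
from html.parser import HTMLParser


class AlternateLinkFinder(HTMLParser):
    """Collect href values of <link rel="alternate" type="text/plain"> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = {name: (value or "") for name, value in attrs}
        rels = attr.get("rel", "").lower().split()
        is_plain_text = attr.get("type", "").lower() == "text/plain"
        if "alternate" in rels and is_plain_text and attr.get("href"):
            self.hrefs.append(attr["href"])


def find_agent_text_links(html: str) -> list:
    """Return all plain-text alternate links found in an HTML document."""
    finder = AlternateLinkFinder()
    finder.feed(html)
    return finder.hrefs


if __name__ == "__main__":
    # Hypothetical page head; the href is an assumed example value.
    sample = '<head><link rel="alternate" type="text/plain" href="https://example.com/agent.txt"></head>'
    print(find_agent_text_links(sample))
```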

/llms.txt

When enabled, /llms.txt provides a curated Markdown index for LLMs and AI workflows. It can include:

  • site name and URL;
  • optional site description;
  • AI usage and citation guidance;
  • default license;
  • enabled public posts/pages;
  • optional JSON links;
  • Markdown and text endpoint links.

/llms.txt lists only published content that is enabled for agent-readable output and excludes content that is force-disabled per post/page.

AI Bot Policy Helper

The settings page includes an AI Bot Policy Helper that generates robots.txt snippets you can review before applying them.

Presets include:

  • Allow AI search/discovery, block model training;
  • Allow listed AI bots;
  • Block listed AI bots;
  • Disabled / preview only.

The helper is designed around a safe default:

  • it does not overwrite physical robots.txt files;
  • it does not write to disk;
  • when a physical robots.txt exists and is readable, it imports that file into a read-only merged preview;
  • it only appends to WordPress’ virtual robots.txt response if you explicitly enable that option.

Generated blocks are wrapped in stable markers:

# BEGIN Agent Readable AI Bot Policy
...
# END Agent Readable AI Bot Policy
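
Stable markers make the generated block easy to find and replace without touching the rest of a robots.txt. The sketch below shows that idea on in-memory strings only; it does not read or write any file, the function names are hypothetical, and `ExampleBot` is a made-up user agent rather than one of the plugin's listed bots.

```python
import re
from typing import Optional

BEGIN = "# BEGIN Agent Readable AI Bot Policy"
END = "# END Agent Readable AI Bot Policy"
_BLOCK = re.compile(re.escape(BEGIN) + r".*?" + re.escape(END), re.DOTALL)


def extract_policy_block(robots_txt: str) -> Optional[str]:
    """Return the marked block, markers included, or None if it is absent."""
    match = _BLOCK.search(robots_txt)
    return match.group(0) if match else None


def upsert_policy_block(robots_txt: str, rules: str) -> str:
    """Replace an existing marked block, or append a new one at the end."""
    block = f"{BEGIN}\n{rules.strip()}\n{END}"
    if _BLOCK.search(robots_txt):
        # A callable replacement avoids backslash handling in the rules text.
        return _BLOCK.sub(lambda _match: block, robots_txt, count=1)
    existing = robots_txt.rstrip("\n")
    prefix = existing + "\n\n" if existing else ""
    return prefix + block + "\n"


if __name__ == "__main__":
    current = "User-agent: *\nDisallow: /wp-admin/\n"
    print(upsert_policy_block(current, "User-agent: ExampleBot\nDisallow: /"))
```

Applying `upsert_policy_block` repeatedly keeps a single marked block, which is the property the markers are meant to provide.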

Behavior and access rules

  • Published + enabled content returns 200.
  • Published but not enabled content returns 404.
  • Draft/private content is blocked.
  • Missing content returns 404.
  • Per-content Disable for this content overrides global activation.
  • Visible notices render only when enabled and configured.
  • Discovery tags render only for enabled singular content or enabled site-wide files.
  • Physical robots.txt files are never overwritten.
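
The access rules above can be read as a small decision function. The following is a minimal model of that logic, not the plugin's code: the `Item` type, its field names, and the function names are hypothetical, and `"publish"` is WordPress's status value for published content.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Item:
    status: str                     # WordPress post status, e.g. "publish", "draft", "private"
    post_type: str                  # "post" or "page"
    override: Optional[bool] = None  # True = enabled per item, False = disabled, None = inherit


def is_exposed(item: Item, enabled_types: set) -> bool:
    """Apply the rules: published only, per-item override first, then global setting."""
    if item.status != "publish":
        return False
    if item.override is not None:
        return item.override
    return item.post_type in enabled_types


def status_code(item: Optional[Item], enabled_types: set) -> int:
    """Return 200 for exposed content and 404 for everything else, including missing items."""
    if item is None:
        return 404
    return 200 if is_exposed(item, enabled_types) else 404


if __name__ == "__main__":
    globally_enabled = {"post"}
    print(status_code(Item("publish", "post"), globally_enabled))
    print(status_code(Item("publish", "post", override=False), globally_enabled))
```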

Development and validation

This repository includes lightweight static checks:

python3 tools/static-checks.py
git diff --check

If PHP is available, also run:

php -l agent-readable-pages-source-notice.php

Additional documentation:

  • docs/mvp-spec.md
  • docs/implementation-outline.md
  • docs/manual-test-checklist.md
  • docs/ai-visibility-roadmap.md

Status

This is an MVP/sandbox-friendly plugin for testing AI-readable WordPress content workflows. It is suitable for experimentation and iteration, but you should review settings, endpoint exposure, and robots policy choices before using it on a production site.

Author

Created by Benjamin Hübner / IMAgentLabs.

About

WordPress plugin for opt-in agent-readable content, agent.txt, llms.txt, citation guidance, and AI discovery.
