This document explains how PHPantom resolves PHP symbols — classes, interfaces, traits, enums, and functions — across files and from the PHP standard library.
PHPantom is a language server for PHP projects. It provides completion, go-to-definition, go-to-implementation, find references, hover, signature help, and diagnostics. It works by:
- Parsing PHP files into lightweight
ClassInfo/FunctionInfostructures (not a full AST — just the information needed for IDE features). - Caching parsed results in an in-memory
ast_mapkeyed by file URI. - Building a precomputed symbol map (
symbol_maps) during parsing for O(log n) go-to-definition lookups and call-site detection for signature help. - Resolving symbols on demand through a multi-phase lookup chain.
- Merging inherited members (from parent classes, traits, interfaces, and mixins) at resolution time.
Most features (completion, go-to-definition, hover, signature help, diagnostics) use maps or known file names and never walk directories. Only two features do multi-file scanning:
- Go-to-implementation scans for concrete classes that implement an interface or extend an abstract class (see
find_implementors). Walks PSR-4 source directories only. - Find References scans for all occurrences of a symbol across the project (see
ensure_workspace_indexed). Walks the entire workspace root.
Both features follow the same principle for vendor code: the Composer classmap is the source of truth. Vendor directories are never walked. User PSR-4 roots from composer.json are walked because user files change between dump-autoload runs.
The two walkers differ in scope because GTI only needs class declarations (which live in PSR-4 roots), while Find References needs any usage of a symbol, which could be in a standalone script, config file, or index.php at the project root.
PHPantom is built in layers. Each layer is independently useful and independently testable.
- Layer 0: Stubs. Embedded PHP standard library types. Available immediately, no project context needed.
- Layer 1: Single file. Parse the open file, extract classes/functions/symbols. Completion, hover, and go-to-definition work within the file with no cross-file resolution at all.
- Layer 2: On-demand resolution. When a symbol references a class in another file, resolve it through the classmap or PSR-4 and parse that file. Only the files actually needed are touched.
- Layer 3: Classmap. A name-to-path index covering the whole project. Enables class name completion and O(1) cross-file lookup. Built from Composer's output or self-generated via a fast byte-level scan.
- Layer 4: Full index (opt-in). Background-parse every file in the classmap. Enables workspace symbols, fast find-references, and rich completion item detail.
Each layer builds on the one below it. A bug in classmap generation doesn't break single-file completion. A slow full index doesn't block on-demand resolution. New features can be developed and tested against the lower layers without waiting for a full project scan. This is also why PHPantom starts fast: Layer 0-2 are ready in milliseconds, Layer 3 takes seconds, and Layer 4 (when enabled) fills in over the following minute.
src/
├── lib.rs # Backend struct, state, constructors
├── main.rs # Entry point (stdin/stdout LSP transport)
├── server.rs # LSP protocol handlers (initialize, didOpen, completion, …)
├── types.rs # Data structures (ClassInfo, MethodInfo, SubjectExpr, ResolvedCallableTarget, …)
├── composer.rs # composer.json / PSR-4 autoload parsing
├── stubs.rs # Embedded phpstorm-stubs (build-time generated index)
├── resolution.rs # Multi-phase class/function lookup and name resolution
├── inheritance.rs # Base class inheritance merging (traits, parent chain)
├── symbol_map/
│ ├── mod.rs # Data structures (SymbolSpan, SymbolKind, VarDefSite, CallSite, SymbolMap) and impl
│ ├── docblock.rs # Docblock symbol extraction (type span emission, @template/@method tag scanning, navigability filter)
│ └── extraction.rs # AST walk that builds a SymbolMap (extract_symbol_map and all extract_from_* helpers)
├── virtual_members/
│ ├── mod.rs # VirtualMemberProvider trait, VirtualMembers struct, merge logic
│ ├── laravel.rs # LaravelModelProvider (relationships, scopes, casts, accessors)
│ └── phpdoc.rs # PHPDocProvider (@method, @property, @property-read, @property-write, @mixin)
├── subject_extraction.rs # Shared helpers for extracting subjects before ->, ?->, ::
├── util.rs # Position conversion, class lookup, logging, directory walkers (collect_php_files, collect_php_files_gitignore)
├── parser/
│ ├── mod.rs # Top-level parse entry points (parse_php, parse_functions, …)
│ ├── classes.rs # Class, interface, trait, enum, and anonymous class extraction
│ ├── functions.rs # Standalone function and define() constant extraction
│ ├── use_statements.rs # use statement and namespace extraction
│ └── ast_update.rs # update_ast orchestrator and name resolution helpers
├── docblock/
│ ├── mod.rs # Re-exports from submodules
│ ├── tags.rs # PHPDoc tag extraction (@return, @var, @mixin, @deprecated, …)
│ ├── templates.rs # Template/generics/type-alias tags (@template, @extends, …)
│ ├── virtual_members.rs # Virtual member tags (@property, @method)
│ ├── conditional.rs # PHPStan conditional return type parsing
│ └── types.rs # Type cleaning utilities (clean_type, strip_nullable, …)
├── completion/
│ ├── mod.rs # Submodule declarations + backward-compatible re-exports
│ ├── handler.rs # Top-level completion request orchestration
│ ├── target.rs # Extract what the user is completing (subject + access kind)
│ ├── resolver.rs # Resolve subject → ClassInfo (type resolution engine), shared resolve_callable_target
│ ├── call_resolution.rs # Call expression and callable target resolution
│ ├── builder.rs # Build LSP CompletionItems from resolved ClassInfo
│ ├── array_shape.rs # Array shape key completion and raw variable type resolution
│ ├── named_args.rs # Named argument completion inside function/method call parens
│ ├── use_edit.rs # Use-statement insertion helpers
│ ├── variable/ # Variable resolution
│ │ ├── resolution.rs # Variable type resolution via assignment scanning
│ │ ├── completion.rs # Variable name completions and scope collection
│ │ ├── rhs_resolution.rs # Right-hand-side expression resolution
│ │ ├── class_string_resolution.rs # Class-string variable resolution ($cls = User::class)
│ │ ├── raw_type_inference.rs # Raw type inference (array shapes, array functions, generators)
│ │ ├── foreach_resolution.rs # Foreach value/key and array destructuring type resolution
│ │ └── closure_resolution.rs # Closure and arrow-function parameter resolution
│ ├── types/ # Type resolution
│ │ ├── resolution.rs # Type-hint string → ClassInfo mapping (unions, generics, aliases)
│ │ ├── narrowing.rs # instanceof / assert / custom type guard narrowing
│ │ └── conditional.rs # PHPStan conditional return type resolution at call sites
│ ├── context/ # Context-specific completion
│ │ ├── catch_completion.rs # Smart exception type completion inside catch() clauses
│ │ ├── class_completion.rs # Class name completions (class, interface, trait, enum)
│ │ ├── constant_completion.rs # Global constant name completions
│ │ ├── function_completion.rs # Standalone function name completions
│ │ ├── namespace_completion.rs # Namespace declaration completions
│ │ └── type_hint_completion.rs # Type completion in parameter lists, return types, properties
│ ├── phpdoc/ # PHPDoc completion
│ │ ├── mod.rs # PHPDoc tag completion inside /** … */ blocks
│ │ └── context.rs # PHPDoc context detection and symbol info extraction
│ └── source/ # Source analysis
│ ├── comment_position.rs # Comment and docblock position detection
│ ├── helpers.rs # Source-text scanning helpers (closure/callable return types, new-expression parsing)
│ └── throws_analysis.rs # Shared throw-statement scanning and @throws tag lookup
├── signature_help.rs # Signature help: parameter hints inside function/method call parens
├── hover/
│ └── mod.rs # Hover handler: symbol-map dispatch, type/signature/docblock formatting
├── definition/
│ ├── mod.rs # Submodule declarations
│ ├── resolve.rs # Core go-to-definition: symbol-map dispatch + text-based fallback
│ ├── member.rs # Member-access resolution (->method, ::$prop, ::CONST) with stored offsets
│ ├── variable/
│ │ ├── mod.rs # VarDefSearchResult enum, Backend methods, tests
│ │ ├── var_definition.rs # AST walk finding variable definition sites
│ │ └── type_hint.rs # AST walk extracting type hints at definition sites
│ └── implementation.rs # Go-to-implementation (interface/abstract → concrete classes)
├── references/
│ ├── mod.rs # Find References handler: same-file and cross-file symbol scanning
│ └── tests.rs # Unit tests for find-references
├── highlight/
│ └── mod.rs # Document highlighting: same-file symbol occurrence highlighting
├── diagnostics/
│ ├── mod.rs # Diagnostic collection and publishing (skips vendor files)
│ ├── deprecated.rs # @deprecated usage diagnostics (strikethrough)
│ └── unused_imports.rs # Unused use-statement dimming
build.rs # Parses PhpStormStubsMap.php, generates stub index
stubs/ # Composer vendor dir for jetbrains/phpstorm-stubs
tests/
├── common/mod.rs # Shared test helpers and minimal PHP stubs
├── completion_*.rs # Completion integration tests (by feature area)
├── definition_*.rs # Go-to-definition integration tests
├── hover.rs # Hover integration tests
├── signature_help.rs # Signature help integration tests
├── implementation.rs # Go-to-implementation integration tests
├── document_highlight.rs # Document highlighting integration tests
├── docblock_*.rs # Docblock parsing and type tests
├── parser.rs # PHP parser / AST extraction tests
├── composer.rs # Composer integration tests
└── …
PHPantom uses three crates from the Mago project for PHP parsing and type representation:
| Crate | Purpose |
|---|---|
mago-docblock |
Structured PHPDoc parsing (replaces manual scanning) |
mago-names |
Use-statement and namespace resolution |
mago-type-syntax |
PHPStan/Psalm type expression parsing into PhpType |
PhpType (src/php_type.rs) is an owned enum that wraps the borrowed
mago_type_syntax::ast::Type AST. Every type-carrying field in the
data model (type_hint, return_type, native_type_hint,
native_return_type, asserted_type, template_param_bounds values,
generics type arguments, ResolvedType::type_string) uses PhpType.
The substitution engine in inheritance.rs operates on
HashMap<String, PhpType> so that template parameter replacement
never re-parses type strings.
Several other Mago crates were evaluated and ruled out:
| Crate | Reason |
|---|---|
mago-codex |
Replaces ClassInfo model with one that cannot carry LaravelMetadata. |
mago-semantics |
12K false positives on Laravel; no way to inject our type context. |
mago-linter |
Same problem; Integration::Laravel is surface-level only. |
mago-fingerprint |
Requires mago-names for limited value; signature_eq already works. |
All Mago crates are pinned to the same release. When upgrading, update all Mago crates in a single commit and run the test suite.
The following functions in type_strings.rs operate on raw docblock
text (tokenizing tag descriptions, separating type tokens from
parameter names) rather than on structured type fields, and have no
PhpType equivalent:
split_type_token— extracts a type token from a tag descriptionclean_type— strips?, leading\, trailing punctuationsplit_union_depth0— splits raw union text at depth 0split_generic_args— splits raw generic arguments at depth 0PHPDOC_TYPE_KEYWORDS— completion candidate list for PHPDoc types
The definition subsystem uses a three-tier approach, from fastest to slowest:
During update_ast, every navigable symbol occurrence in a file is recorded as a SymbolSpan in a sorted vec stored in symbol_maps (keyed by file URI). Each span records the byte range and a SymbolKind variant:
SymbolKind |
What it captures |
|---|---|
ClassReference |
Class/interface/trait/enum names in type contexts (new Foo, extends, implements, type hints, catch, docblock types) |
ClassDeclaration |
Class name at its declaration site (cursor already at definition) |
MemberAccess |
->, ?->, :: member names with the subject text and static/method-call flags |
Variable |
$variable tokens (both usage and definition sites) |
FunctionCall |
Standalone function call names |
SelfStaticParent |
self, static, parent keywords in navigable contexts |
ConstantReference |
Constant names (define() name, class constant access, standalone constant reference) |
MemberDeclaration |
Method, property, or constant name at its declaration site (not navigable for go-to-definition, but needed for find-references) |
When a go-to-definition request arrives, resolve_definition converts the cursor position to a byte offset and does a binary search on the symbol map. If a SymbolSpan is found, it dispatches directly to the appropriate resolution path — no text scanning needed. If the offset falls in a gap (whitespace, string interior, comment interior, etc.), the request is instantly rejected.
Docblock type references (@param, @return, @var, @template, @method, etc.) are extracted by a dedicated string scanner during the AST walk, since docblocks are trivia in the mago_syntax AST and produce no expression/statement nodes. These use the same ClassReference kind and resolution path as code type hints.
The symbol map also stores:
- Variable definition sites (
var_defs): records every assignment, parameter, foreach binding, catch variable, static/global declaration, and destructuring site with its byte offset, scope boundary, andeffective_fromoffset (for assignments, this is the end of the statement so the RHS sees the previous definition). Go-to-definition for$varfinds the most recent definition before the cursor within the enclosing scope via binary search. - Scope boundaries (
scopes): function, method, closure, and arrow function body ranges. Used byfind_enclosing_scopeto determine which scope the cursor is in. - Template parameter definitions (
template_defs):@templatetag locations so that template parameter names (e.g.TKey,TModel) that appear in docblock types can be resolved to their declaration site.
MethodInfo, PropertyInfo, ConstantInfo, and FunctionInfo each carry a name_offset: u32 field recording the byte offset of the member's name token in the source file. ClassInfo carries keyword_offset: u32 for the class/interface/trait/enum keyword. These are populated during parsing.
When a cross-file jump lands on a class or member, find_member_position and find_definition_position convert the stored offset directly to an LSP Position via offset_to_position — no line-by-line text scanning needed.
A value of 0 means "not available" (stubs parsed before offsets were stored, synthetic/virtual members). In that case, the system falls back to Tier 3.
The only remaining line-by-line text scanner is in find_member_position (definition/member/file_lookup.rs). When a member has no stored AST byte offset (name_offset is None or 0), the function scans docblock comments for @property / @property-read / @property-write and @method tags that declare the virtual member. This is inherent to virtual members: they exist only as docblock annotations and have no AST node to store an offset for.
All cursor context detection (word extraction, member access detection, type-hint resolution) now relies exclusively on the precomputed SymbolMap and AST walks. The former text-based helpers extract_word_at_position, extract_member_access_context, and resolve_type_hint_at_variable_text have been removed.
Variable go-to-definition ($var → jump to definition) uses two layers:
- Symbol map (
var_defs): the primary path. Finds the most recentVarDefSitebefore the cursor within the enclosing scope. When the cursor is physically on a definition token (parameter, foreach binding, catch variable), it returnsNoneso the caller can fall through to type-hint resolution. - AST-based search (
resolve_variable_definition_ast): parses the file and walks the enclosing scope to find the definition site. Handles destructuring ([$a, $b] = ...,list($a, $b) = ...) and nested scopes correctly. Used as a fallback when the symbol map doesn't have a match. ReturnsNonewhen the AST parse fails.
"Go to Type Definition" (textDocument/typeDefinition) jumps from a symbol to the class declaration of its resolved type, rather than to the definition site. For example, if $user is typed as User, go-to-definition jumps to the $user = ... assignment, while go-to-type-definition jumps to class User { … }.
The handler (definition/type_definition.rs) reuses the same symbol-map lookup as go-to-definition and the same type-resolution pipelines as hover and completion:
-
Symbol map lookup. Convert the cursor position to a byte offset and binary-search the symbol map (same as go-to-definition, including the offset-1 retry for end-of-token cursors).
-
Type resolution by symbol kind:
SymbolKindResolution path Variableresolve_variable_type(type-string path) with fallback toresolve_variable_types(ClassInfo path).$thisresolves to the enclosing class.MemberAccessresolve_target_classesfinds the subject's class, then the method'sreturn_typeor the property'stype_hintis extracted.self/static/$thisin return types are replaced with the owning class name.SelfStaticParentself/staticresolve to the enclosing class;parentresolves to the parent class.ClassReferenceThe type is the class itself. FunctionCallThe function's return_typeis extracted.ClassDeclaration,MemberDeclaration,ConstantReferenceNo type definition target; returns None. -
Type string to class names.
extract_class_names_from_type_stringsplits union types at depth-0|separators, strips?(nullable), generic parameters (<…>), array shapes ({…}), and trailing[]. Scalar types (int,string,array,void,mixed,bool,float,null,false,true,never,callable,iterable,resource,object) are excluded since they have no user-navigable declaration. -
Class name to location. Each surviving class name is resolved via
resolve_class_reference(the same function go-to-definition uses for class references), which tries same-file AST lookup, cross-fileclass_index+ast_map, Composer classmap, PSR-4, and template parameter fallback. Duplicate locations are deduplicated.
For union types, the handler returns multiple locations (one per non-scalar class in the union), which editors display as a peek list.
Signature help shows parameter hints when the cursor is inside the parentheses of a function or method call. Detection uses a two-tier strategy:
During update_ast, every call expression in the file is recorded as a
CallSite in the SymbolMap. Each entry stores:
| Field | Purpose |
|---|---|
args_start |
Byte offset immediately after the opening ( |
args_end |
Byte offset of the closing ) |
call_expression |
The call target in resolve_callable format (see below) |
comma_offsets |
Byte offsets of each top-level comma separator |
Call sites are emitted for all five AST call kinds: Call::Function,
Call::Method, Call::NullSafeMethod, Call::StaticMethod, and
Expression::Instantiation. The call_expression string is built by
expr_to_subject_text, which recursively converts the AST expression into
the text format the resolver expects:
"functionName"for standalone function calls"$subject->method"for instance and null-safe method calls"ClassName::method"for static method calls"new ClassName"for constructor calls
When a signature help request arrives, detect_call_site_from_map converts
the LSP position to a byte offset and searches the call_sites vec in
reverse for the innermost entry whose argument range contains the cursor.
The active parameter index is computed by counting how many precomputed
comma offsets fall before the cursor. This handles nesting, strings
containing commas/parens, and arbitrary chain depth correctly because the
offsets come from the parser's token stream.
When the symbol map has no matching call site (typically because the parser
could not recover the call node from incomplete code, e.g. an unclosed (
while the user is typing), the text-based scanner
detect_call_site_text_fallback fires. It walks backward from the cursor
to find an unmatched (, extracts the call expression with simple
character-level scanning, and counts top-level commas. This path handles
simple calls reliably but cannot resolve property chains, method return
chains, or expressions containing balanced parentheses.
If the text fallback also fails to resolve the callable (e.g. because the
class context is missing from the broken AST), a content-patching strategy
inserts ); at the cursor position and re-parses the file to recover
class context for resolution.
resolve_callable dispatches on the call expression format:
new ClassName— loads the class viaclass_loader, finds__construct. Returns an empty parameter list when no constructor is defined.$subject->method— splits at the last->, resolves the subject viaresolve_target_classes(which handles$this, variables, property chains, call return chains, array access, etc.), then looks up the method on the resolved class.ClassName::method— resolvesself/static/parentkeywords and bare class names viaclass_loader, with a fallback toresolve_target_classesfor class-string variables (e.g.$cls = Pen::class; $cls::make()).functionName— resolves viaresolve_function_name(use map, namespace, stubs). Falls back to first-class callable resolution: if the expression is a$variable, scans backward for$var = target(...)and recursively resolves the underlying target.
When the LSP needs to resolve a class name (e.g. during completion on Iterator:: or when following a type hint), it calls find_or_load_class. This method tries four phases in order, returning as soon as one succeeds:
find_or_load_class("Iterator")
│
├── Phase 1: ast_map scan
│ Searches all already-parsed files by short class name + namespace.
│ This is where cached results from previous phases are found on
│ subsequent lookups — a classmap file parsed in Phase 1.5 or a
│ stub parsed in Phase 3 is cached here and found in Phase 1
│ next time.
│ ↓ miss
│
├── Phase 1.5: Composer classmap (FQN → file path)
│ Direct hash lookup in the classmap parsed from
│ vendor/composer/autoload_classmap.php. More targeted than PSR-4
│ and covers classes that don't follow PSR-4 conventions. When the
│ classmap is complete, all classes (including vendor) are resolved
│ here without further searching.
│ ↓ miss
│
├── Phase 2: PSR-4 resolution
│ Uses PSR-4 mappings from composer.json to locate the file on disk.
│ These mappings only cover user code (vendor PSR-4 is not loaded).
│ Example: "App\Models\User" → workspace/src/Models/User.php
│ Reads, parses, resolves names, caches in ast_map.
│ ↓ miss
│
├── Phase 3: Embedded PHP stubs
│ Looks up the class name in the compiled-in stub index
│ (from phpstorm-stubs). Parses the stub PHP source, caches
│ in ast_map under a phpantom-stub:// URI.
│ ↓ miss
│
└── None
Every phase that successfully parses a file caches the result in ast_map. This means:
- Phase 1.5 (classmap) files are parsed once, then found via Phase 1.
- Phase 2 (PSR-4) files are parsed once, then found via Phase 1.
- Phase 3 (stubs) are parsed once, then found via Phase 1.
- Files opened in the editor are parsed on
didOpen/didChangeand always in Phase 1.
PHPantom bundles the JetBrains phpstorm-stubs directly into the binary. This provides type information for ~1,450 built-in classes/interfaces, ~5,000 built-in functions, and ~2,000 built-in constants without requiring any external files at runtime.
The build.rs script:
- If
stubs/jetbrains/phpstorm-stubs/does not exist, fetches the latest release tarball from the JetBrains/phpstorm-stubs GitHub repository and extracts it. This makescargo installandcargo buildwork without any prior setup. - Reads
stubs/jetbrains/phpstorm-stubs/PhpStormStubsMap.php— a generated index mapping symbol names to file paths. - Emits
stub_map_generated.rscontaining:STUB_FILES: an array ofinclude_str!(...)calls embedding every referenced PHP file (~502 files, ~8.5MB of source).STUB_CLASS_MAP: maps class/interface/trait names → index intoSTUB_FILES.STUB_FUNCTION_MAP: maps function names → index intoSTUB_FILES.STUB_CONSTANT_MAP: maps constant names → index intoSTUB_FILES.
The stubs are cached in the stubs/ directory (gitignored). To update to the latest version, delete stubs/ and rebuild.
At Backend construction, stubs.rs converts the static arrays into HashMaps for O(1) lookup. When find_or_load_class reaches Phase 3:
- Look up the class short name in
stub_index→ get the PHP source string. - Parse it with the same parser used for user code.
- Run name resolution (for parent classes, trait uses, etc.).
- Cache in
ast_mapunderphpantom-stub://ClassName. - Return the
ClassInfo.
Because stubs are cached after first access, repeated lookups (e.g. every enum needing UnitEnum) hit Phase 1 and skip parsing entirely.
Built-in PHP functions (e.g. array_map, date_create, str_contains) are resolved through find_or_load_function, which mirrors the multi-phase pattern used for classes:
find_or_load_function(["str_contains", "App\\str_contains"])
│
├── Phase 1: global_functions (user code + cached stubs)
│ Checks all candidate names against the global_functions map.
│ Keys are always FQNs: "array_map" for global functions,
│ "Illuminate\Support\enum_value" for namespaced ones.
│ No short-name fallback entries are stored.
│ ↓ miss
│
├── Phase 1.5: autoload_function_index (byte-level scan)
│ Checks candidate names against autoload_function_index
│ (function FQN → file path on disk). Populated by the
│ find_symbols byte-level scan for both Composer projects
│ (autoload_files.php) and non-Composer projects (workspace
│ full-scan).
│ When found:
│ 1. Reads the file from disk.
│ 2. Calls update_ast to get full FunctionInfo + ClassInfo.
│ 3. Results are cached in global_functions (so future
│ lookups hit Phase 1).
│ 4. Returns the matching FunctionInfo.
│ ↓ miss
│
├── Phase 1.75: Last-resort autoload file parse
│ The byte-level scanner misses functions inside conditional
│ blocks (e.g. `if (! function_exists(...))` guards). As a
│ safety net, lazily parses each known autoload file path
│ (stored in autoload_file_paths) via update_ast until the
│ function is found. Skips files already in ast_map. Each
│ file is parsed at most once; subsequent lookups hit Phase 1.
│ ↓ miss
│
├── Phase 2: Embedded PHP stubs
│ Looks up each candidate name in stub_function_index.
│ When found:
│ 1. Parses the entire stub file (extracting all FunctionInfo).
│ 2. Caches ALL functions from that file into global_functions
│ under phpantom-stub-fn:// URIs (FQN keys only).
│ 3. Also caches any classes defined in the same stub file into
│ ast_map (so return type references can be resolved).
│ 4. Returns the matching FunctionInfo.
│ ↓ miss
│
└── None
The function_loader closures in both server.rs (completion) and definition/resolve.rs (go-to-definition) build a list of candidate names — the bare name, the use-map resolved name, and the namespace-qualified name — then delegate to find_or_load_function. Because global_functions is keyed by FQN only, bare calls like enum_value() in namespace App succeed via the namespace-qualified candidate App\enum_value. This means built-in function return types are available for:
- Variable type resolution:
$dt = date_create();→$dtisDateTime - Call chain completion:
date_create()->offersformat(),modify(), etc. - Nested call resolution:
simplexml_load_string(...)->xpath(...)works
User-defined functions in global_functions always take precedence over stubs because Phase 1 is checked first — stubs use entry().or_insert() to avoid overwriting existing entries.
build_function_completions handles two modes:
-
use functioncontext (for_use_import = true): The insert text is the FQN (e.g.Illuminate\Support\enum_value) so the resulting statement readsuse function Illuminate\Support\enum_value;. The label is the FQN for namespaced functions and the signature for global ones. -
Inline context (
for_use_import = false): The insert text is a snippet using the short name (e.g.enum_value($0)). For namespaced functions, anadditional_text_editsentry insertsuse function FQN;at the alphabetically correct position, mirroring how class auto-import works. The detail shows the namespace (e.g.function (Illuminate\Support)). Functions in the same namespace as the current file do not receive an auto-import.
Deduplication uses the map key (FQN), so two functions with the same short name in different namespaces both appear as separate completion items.
The stub_constant_index (HashMap<&'static str, &'static str>) is built at construction time from STUB_CONSTANT_MAP, mapping constant names like PHP_EOL, PHP_INT_MAX, SORT_ASC to their stub file source. It is consulted by the constant-value loader (Backend::constant_loader) during variable type resolution: when a variable is assigned from a global constant (e.g. $eol = PHP_EOL), the loader extracts the constant's initializer value from the stub source and the type is inferred from the literal (string, int, float, bool, null, or array). The same lookup chain also serves hover, completion of standalone constant names, and go-to-definition.
phpstorm-stubs annotate functions, methods, and parameters with #[PhpStormStubsElementAvailable] attributes to indicate which PHP versions they apply to. For example, array_map has two variants of its second parameter: array $array (from PHP 8.0) and an untyped $arrays (PHP 5.3 to 7.4). Without filtering, both appear in signatures.
PHPantom detects the target PHP version during initialized:
- Read
config.platform.phpfromcomposer.json(explicit platform override). - Fall back to
require.phpfromcomposer.json(e.g."^8.4"→ 8.4). - Default to PHP 8.5 when neither is available.
The detected version is stored on the Backend as a PhpVersion(major, minor).
When parsing stubs (both class stubs via find_or_load_class and function stubs via find_or_load_function), the PHP version is passed through DocblockCtx.php_version to the extraction functions. Three filtering points apply:
- Function-level:
extract_functions_from_statementschecks the function's#[PhpStormStubsElementAvailable]attribute. If the version range excludes the target, the entire function is skipped. This handles duplicate function definitions (e.g.array_combinehas separate signatures for PHP ≤7.4 and ≥8.0). - Method-level:
extract_class_like_membersapplies the same check to methods. For example,SplFixedArray::__serialize(from PHP 8.2) is excluded when targeting PHP 8.1. - Parameter-level:
extract_parametersfilters individual parameters. For example,array_map's untyped$arraysparameter (PHP 5.3–7.4) is excluded when targeting PHP 8.0+, leaving only the typedarray $array.
The attribute supports named arguments (from: '8.0', to: '7.4') and a positional argument ('8.1' treated as from). Both bounds are inclusive. A missing bound means unbounded in that direction.
User code is never filtered. The php_version field in DocblockCtx is None for files parsed via update_ast, so the filtering logic is a no-op for non-stub code.
If the stubs can't be fetched (e.g. no network access during the build), build.rs generates empty arrays and the build succeeds. The LSP just won't know about built-in PHP symbols.
When building completion items or resolving definitions, PHPantom merges members from the full inheritance chain. Resolution proceeds in two phases:
-
Base resolution (
resolve_class_with_inheritanceininheritance.rs): merges own members, trait members, and parent chain members with generic type substitution. -
Virtual member providers (
resolve_class_fullyinvirtual_members/mod.rs): queries registered providers in priority order and merges their contributions. Virtual members never overwrite real declared members or contributions from higher-priority providers.
All completion and go-to-definition call sites use resolve_class_fully, which is the primary entry point for full resolution.
ClassInfo (own members)
│
├── 1. Merge used traits (via `use TraitName;`)
│ └── Recursively follows trait composition and parent_class chains
│
└── 2. Walk the extends chain (parent_class)
└── For each parent: merge its traits, then its public/protected members
After base resolution, resolve_class_fully applies registered virtual member providers. These are implementations of the VirtualMemberProvider trait (defined in src/virtual_members/mod.rs) that synthesize methods and properties not present in the PHP source code.
Each provider implements two methods:
applies_to(class, class_loader) -> bool: cheap pre-check to skip providers early.provide(class, class_loader) -> VirtualMembers: produce the virtual methods and properties.
Providers receive the base-resolved class (own + traits + parents) but without other providers' contributions. This prevents circular loading.
resolve_class_fully(class)
│
├── 1. resolve_class_with_inheritance(class)
│ └── Returns base-resolved ClassInfo
│
├── 2. For each provider (in priority order):
│ └── if applies_to(class): merge provide(class) into result
│ └── Skips members that already exist (no overwrites)
│
├── 3. Merge members from implemented interfaces
│
└── 4. apply_laravel_patches(class, fqn)
└── Post-resolution fixups for Laravel classes (see below)
Provider priority order (highest first):
- Framework provider (e.g. Laravel): richest type info
- PHPDoc provider:
@method,@property,@property-read,@property-write,@mixin
Two providers are currently registered in default_providers():
LaravelModelProvider(virtual_members/laravel.rs): synthesizes virtual members for classes extendingIlluminate\Database\Eloquent\Model. Produces relationship properties (methods returningHasMany,HasOne,BelongsTo, etc. generate a virtual property typed from the relationship's generic parameters), scope methods (both thescopeActivenaming convention and the#[Scope]attribute from Laravel 11+ are supported; either style becomesactive()as both static and instance), Builder-as-static forwarding (User::where()->get()resolves end-to-end), accessors (legacygetXAttribute()and modernAttributecasts), and cast properties ($castsarray orcasts()method entries are mapped to PHP types likedatetimeto\Carbon\Carbon,booleantobool, custom cast classes to theirget()return type). Highest priority among virtual member providers. Scope methods are also injected ontoBuilder<Model>instances via a post-generic-substitution hook intype_hint_to_classes_depth(see "Scope Methods on Builder Instances" below).PHPDocProvider(virtual_members/phpdoc.rs): parses@method,@property,@property-read,@property-write, and@mixintags from the class-level docblock stored inClassInfo.class_docblock. Explicit@method/@propertytags are not parsed eagerly during AST extraction; instead, the raw docblock string is preserved and parsed lazily whenprovideis called. For@mixintags, the provider loads the referenced classes and merges their public members. Within the provider, explicit tags take precedence over mixin members. Recurses into mixin-of-mixin chains up toMAX_MIXIN_DEPTH.
After virtual members and interface members are merged, resolve_class_fully_inner calls apply_laravel_patches(class, fqn) from virtual_members/laravel/patches.rs. Unlike virtual member providers (which add new members), patches modify existing members' type information to fix framework-specific type inaccuracies that break chain resolution.
The patch system is centralized in a single module with one entry point. Dispatching is based on the fully-qualified class name and trait usage. Current patches:
-
Eloquent\Builder::__call/__callStaticreturn type. Laravel'sBuilder::__call()is declared as returningmixed, but in practice always returns$this(scope dispatch, macro dispatch, Query\Builder forwarding). The patch overrides the return type tostaticso that method chains through unknown calls preserve the Builder type. -
Conditionable::when()/unless()return type. The trait declares@return $this|TWhenReturnTypebut the unresolved method-level template parameterTWhenReturnTypepreventsis_self_like_typefrom recognizing the return as self-referential, breaking method chain resolution on Builder and Collection. The patch replaces the return type with$this. Applied to theConditionabletrait itself, toEloquent\Builder(which uses the trait), and to any class whoseused_traitsincludesConditionable. -
Bare
Builderreturn types on scope methods. Handled separately inscopes.rs(is_bare_builder_type) because it runs at scope-injection time (post-generic-substitution), not duringresolve_class_fully_inner. Documented in the patch module as part of the inventory but not dispatched from it.
- Class own members always win.
- Trait members override inherited members but not own members.
- Parent members fill in anything not already present.
- Virtual members from providers sit below all real declared members (own, trait, parent). Higher-priority providers shadow lower-priority ones. Within the PHPDoc provider,
@method/@propertytags beat@mixinmembers. - Private members are never inherited from parents (but trait private members are copied, matching PHP semantics).
Scope methods (both scopeX convention and #[Scope] attribute) are synthesized on the Model class by LaravelModelProvider, but they also need to be available on Builder<Model> instances (e.g. Brand::where('id', $id)->isActive()). This cannot be handled by a virtual member provider alone because providers run before generic argument substitution, so the provider would not know the concrete model type.
Instead, scope injection happens in three places:
-
Post-generic-substitution hook (
completion/resolver.rs, insidetype_hint_to_classes_depth): afterresolve_class_fully+apply_generic_argsproduces aBuilder<User>class, the resolver detects that the result is an Eloquent Builder with a concrete model generic argument. It callsbuild_scope_methods_for_builder(model_name, class_loader)which loads the model, fully resolves it, extracts scope methods, and returns them as instance methods withstaticin return types substituted to the concrete model name. These are merged onto the Builder's method list, giving$q->active()after$q = User::where(...). -
Scope body Builder enrichment (
completion/variable_resolution.rs,enrich_builder_type_in_scope): inside a scope method body, the$queryparameter is typically typed asBuilderwithout generic arguments. The enrichment function detects when the enclosing method is a scope (either thescopeXnaming convention or the#[Scope]attribute) on a class that extends Eloquent Model, and the parameter type isBuilderwithout generics. It rewrites the type toBuilder<EnclosingModel>, which then flows through the generic-args path and triggers the post-substitution hook above. This makes$query->otherScope()resolve inside scope bodies. -
Go-to-definition fallback (
definition/member.rs,find_scope_on_builder_model): when go-to-definition resolves a member on a Builder class and the normal lookup chain (own members, traits, parents, mixins, builder forwarding) fails, this fallback checks whether the member is a scope method injected from the model. It confirms the resolved candidate (with injected scopes) has the method, extracts the model name from the scope method's return type generic argument, loads the model, and finds the declaration through the model's inheritance chain. ForscopeX-style scopes it looks forscopeXxx; for#[Scope]-attributed methods it falls back to the original method name. This bridges the gap between completion (which works on the merged ClassInfo) and GTD (which traces back to the declaring source file).
The merge_traits_into function also walks the parent_class chain of each trait/interface it loads. This is critical for enums: a backed enum's used_traits contains BackedEnum, and BackedEnum extends UnitEnum. The parent chain walk ensures UnitEnum's members (cases(), $name) are merged alongside BackedEnum's own members (from(), tryFrom(), $value).
PHP enums implicitly implement UnitEnum (for unit enums) or BackedEnum (for backed enums). The parser detects this and adds the appropriate interface to the enum's used_traits:
enum Color { ... } → used_traits: ["\UnitEnum"]
enum Status: int { ... } → used_traits: ["\BackedEnum"]
The leading backslash marks the name as fully-qualified so that namespace resolution doesn't incorrectly prefix it (e.g. an enum in namespace App\Enums won't resolve to App\Enums\UnitEnum).
At resolution time, merge_traits_into loads the UnitEnum or BackedEnum stub from the embedded phpstorm-stubs, and the interface inheritance chain provides all the standard enum methods and properties.
composer.rs parses:
composer.json→autoload.psr-4andautoload-dev.psr-4mappings
PSR-4 mappings come exclusively from the project's own composer.json. Vendor PSR-4 (vendor/composer/autoload_psr4.php) is not loaded. The Composer classmap is the sole source of truth for vendor code.
Design principle: vendor PSR-4 mappings are not loaded separately. The Composer classmap (augmented by the self-scanner, see below) is the sole source of truth for vendor code. User PSR-4 roots are walked by Go-to-implementation (Phase 5) and Find References because user files change between dump-autoload runs.
When Composer's autoload_classmap.php is missing or incomplete, PHPantom builds a classmap itself using a fast byte-level PHP scanner (classmap_scanner.rs). The scanner is a single-pass state machine that extracts namespace\ClassName pairs without a full AST parse, handling comments, strings, heredocs, and property accesses ($node->class). memchr provides SIMD-accelerated keyword pre-screening so files without class-like keywords are rejected immediately. Inside matched files, memchr also accelerates skipping of line comments (jump to newline), block comments (jump to *), single-quoted strings (jump to ' or \), double-quoted strings (jump to " or \), and heredocs/nowdocs (jump to newline) instead of scanning byte-by-byte.
Scanning is parallelised using a two-phase approach: directory walks collect file paths first (single-threaded, via the ignore crate), then files are read and scanned in parallel batches using std::thread::scope. Three helpers cover the scan modes: scan_files_parallel_classes (plain classmap), scan_files_parallel_psr4 (PSR-4 with FQN filtering), and scan_files_parallel_full (classes + functions + constants). The thread count is capped at std::thread::available_parallelism(). Small batches (≤ 4 files) skip threading overhead and run sequentially.
The indexing strategy is configurable via [indexing] strategy in .phpantom.toml:
"composer"(default) — merged classmap + self-scan. Load Composer's classmap (if it exists) as a skip set, then self-scan all PSR-4 and vendor directories for anything the classmap missed. Whatever the classmap already covers is a free performance win; whatever it's missing, we find ourselves. No completeness heuristic needed."self"— always self-scan, ignoring Composer's classmap entirely. Equivalent to the merged approach with an empty skip set."full"— same as"self"for now; reserved for future background indexing."none"— no proactive scanning; uses Composer's classmap if present but never self-scans to fill gaps.
The merged pipeline works in three steps: (1) load autoload_classmap.php into a HashMap<String, PathBuf>, (2) collect the classmap's file paths into a HashSet<PathBuf> skip set, (3) self-scan all PSR-4 and vendor directories, skipping files already in the skip set. The result is a merged index: classmap entries for everything Composer already knew about, plus self-scanned entries for everything it missed. When the classmap is complete (the common case), the self-scanner walks directories but skips every file, finishing almost instantly. When the classmap is empty or absent, it falls back to a full self-scan. When the classmap is partial (e.g. vendor classes only), vendor files are skipped and only user code is scanned. Every state of the classmap helps.
When self-scanning with a composer.json present, the scanner reads autoload.psr-4, autoload-dev.psr-4, autoload.classmap, and autoload-dev.classmap to determine which directories to walk. PSR-4 directories are filtered: only classes whose FQN matches the namespace prefix plus the relative file path are included. Vendor packages are discovered from vendor/composer/installed.json (both Composer 1 and 2 formats); the JSON packages array is borrowed rather than cloned to avoid allocating a copy of the entire vendor manifest. All directory walkers (full-scan, PSR-4 scanner, vendor package scanner, and go-to-implementation file collector) use the ignore crate for gitignore-aware traversal. Hidden directories are skipped automatically, and .gitignore rules are respected at every level. When no composer.json exists at all, the scanner falls back to walking all .php files under the workspace root.
The result is a HashMap<String, PathBuf> in the same format as the existing Backend.classmap. Everything downstream (resolution, diagnostics, go-to-definition) works unchanged.
Redundant I/O elimination: init_single_project parses composer.json once and passes the pre-parsed serde_json::Value to build_self_scan_composer. Previously each function re-read and re-parsed the file independently.
Vendor dir detection: the config.vendor-dir setting is read from composer.json once during initialized (via parse_composer_json, which returns both the PSR-4 mappings and the vendor dir name). The absolute vendor directory path is cached on Backend.vendor_dir_paths and a file:// URI prefix is stored in Backend.vendor_uri_prefixes for fast vendor-file detection at runtime. Both fields are Vecs to support monorepo workspaces with multiple subprojects (see below). For single-project workspaces, each collection has exactly one entry.
vendor/composer/autoload_files.php lists files containing global function definitions and define() constants. During initialized(), these files are scanned with the lightweight find_symbols byte-level pass (not a full AST parse). This populates autoload_function_index (function FQN → file path), autoload_constant_index (constant name → file path), and class_index (class FQN → file URI). Full parsing is deferred to the moment a symbol is first accessed via find_or_load_function (Phase 1.5), resolve_constant_definition (Phase 1.5), or find_or_load_class (through class_index).
The byte-level scanner only discovers top-level declarations. Functions wrapped in if (! function_exists(...)) guards (common in Laravel helpers) are at brace depth > 0 and are missed. As a safety net, all visited autoload file paths are stored in autoload_file_paths. When a function or constant is not found in any index or stubs, find_or_load_function and resolve_constant_definition lazily parse each known autoload file via update_ast as a last resort (Phase 1.75). Each file is parsed at most once.
In projects without composer.json, there is no autoload_files.php to consult. Instead, the full-scan (find_symbols) runs on all workspace files during initialization, populating three indices in a single pass: classmap (classes), autoload_function_index (functions), and autoload_constant_index (constants). When a function or constant is first accessed, find_or_load_function or resolve_constant_definition lazily calls update_ast on the file, caching the result for subsequent lookups.
When the workspace root has no composer.json but contains subdirectories that are independent Composer projects, PHPantom discovers and processes each subproject automatically. This is a best-effort mitigation for workspaces where the editor opens a monorepo root as a single folder. The monorepo path only activates when there is no root composer.json.
Discovery. discover_subproject_roots in composer.rs walks the workspace using the ignore crate (WalkBuilder), respecting .gitignore at every level. Any directory containing a composer.json is recorded as a subproject root. Nested subprojects (a composer.json inside an already-accepted root) are filtered out. Each discovered composer.json is read to extract the config.vendor-dir setting.
Per-subproject processing. For each discovered (subproject_root, vendor_dir), the same Composer pipeline runs: PSR-4 mappings (with base paths resolved to absolute paths so resolve_class_path works regardless of workspace root), classmap parsing, vendor package scanning from installed.json, and autoload file indexing. Results are merged into the shared backend state with first-subproject-wins semantics for duplicate FQNs.
Loose file discovery. After all subprojects are processed, the gitignore-aware full-scan walker (scan_workspace_fallback_full) runs on the workspace root with a skip set containing all subproject root paths. This picks up PHP files outside any subproject tree (e.g. top-level scripts, shared utilities) without double-scanning subproject content.
Vendor tracking. Each subproject's vendor directory is registered in both vendor_uri_prefixes (for URI-level vendor detection in diagnostics, find references, and rename) and vendor_dir_paths (for filesystem-level skip logic in go-to-implementation and workspace indexing).
Trade-offs. Conflicting class versions across subprojects result in first-wins resolution. A single PHP version is used for the entire workspace. Per-subproject .phpantom.toml is not supported. Multi-root LSP workspaces (separate workspaceFolders) are a separate feature.
When resolving a standalone function call (e.g. app(), date_create()), the lookup order is:
- User code (
global_functionsfrom opened/changed files and lazily parsed autoload index hits) - Autoload function index (
autoload_function_index, populated byfind_symbolsfor both Composer and non-Composer projects, triggers lazyupdate_aston first access) - Known autoload files (
autoload_file_paths, last-resort lazy parse for functions missed by the byte-level scanner, e.g. those insidefunction_exists()guards) - Embedded stubs (
stub_function_indexfrom phpstorm-stubs, parsed lazily)
This ensures that user-defined overrides or polyfills always win over stubs.
When the user invokes go-to-implementation on an interface or abstract class, PHPantom scans for concrete classes that implement or extend it. When invoked on a method definition in a concrete class, it performs a reverse jump to the interface or abstract method that declares the prototype. The forward scan runs five phases, each progressively wider:
find_implementors("Cacheable", "App\\Contracts\\Cacheable")
│
├── Phase 1: ast_map (already-parsed classes)
│ Iterates every ClassInfo in every file already in memory.
│ Checks interfaces list and parent_class chain against the target.
│ ↓ continue
│
├── Phase 2: class_index (FQN → URI entries not yet covered)
│ Loads classes via class_loader for entries not seen in Phase 1.
│ ↓ continue
│
├── Phase 3: classmap files (string pre-filter → parse)
│ Collects unique file paths from the Composer classmap.
│ Skips files already in ast_map.
│ Reads each file's raw source and checks contains(target_short).
│ Only matching files are parsed via parse_and_cache_file.
│ Every class in a parsed file is checked (not just the classmap FQN).
│ ↓ continue
│
├── Phase 4: embedded stubs (string pre-filter → lazy parse)
│ Checks each stub's static source string for contains(target_short).
│ Matching stubs are loaded via class_loader (parsed and cached).
│ ↓ continue
│
├── Phase 5: PSR-4 directory walk (user code only)
│ Recursively collects all .php files under every PSR-4 root.
│ Skips files already covered by the classmap (Phase 3) or ast_map.
│ Reads raw source, applies the same string pre-filter.
│ Matching files are parsed via parse_and_cache_file.
│ Discovers classes missing from the classmap.
│ ↓ done
│
└── Vec<ClassInfo> (concrete implementors only)
Phase 5 walks PSR-4 roots from composer.json (autoload and autoload-dev). Since PSR-4 mappings are sourced exclusively from the project's own composer.json (vendor PSR-4 is not loaded), Phase 5 inherently only discovers classes in the user's own source directories (e.g. src/, app/, tests/). Vendor dependencies are fully covered by the classmap (Phase 3).
Phase 5 exists to catch newly-created or not-yet-indexed user classes that are missing from the classmap.
Note: collect_php_files still receives the vendor dir name because a fallback mapping like "" => "." resolves to the workspace root, where the walk must skip the vendor directory (and hidden directories like .git).
Phases 3–5 avoid expensive parsing by first reading the raw file content and checking whether it contains the target class's short name. A file that doesn't mention "Cacheable" anywhere in its source can't possibly implement the Cacheable interface, so it's skipped without parsing. This keeps the scan fast even for large projects with thousands of files.
parse_and_cache_file follows the same pattern as find_or_load_class: it parses the PHP file, resolves parent/interface names via resolve_parent_class_names, and stores the results in ast_map, use_map, and namespace_map. This means files discovered during a go-to-implementation scan are immediately available for subsequent completion, definition, and implementation lookups without re-parsing.
When the cursor is on a method call (e.g. $repo->find()), resolve_member_implementations first resolves the subject to candidate classes. If any candidate is an interface or abstract class, find_implementors is called and each implementor is checked for the specific method. Only classes that directly define (override) the method are returned — inherited-but-not-overridden methods are excluded.
When go-to-implementation is invoked on a MemberDeclaration symbol (a method name at its definition site), resolve_reverse_implementation checks whether the enclosing class implements any interface or extends an abstract class that declares a method with the same name. If found, it returns the location of the prototype method in the interface or abstract class. This covers:
- Direct interfaces (
class Foo implements Bar) - Interfaces inherited from parent classes (
class Foo extends BasewhereBase implements Bar) - Interface-extends chains (
interface A extends B) - Abstract parent methods (
abstract class Base { abstract function foo(); })
For interface and abstract class method declarations, the same handler works in the forward direction: it calls resolve_interface_member_implementations to find all concrete classes that define the method.
All comparisons in class_implements_or_extends and deduplication in find_implementors use fully-qualified names (built from ClassInfo.name + ClassInfo.file_namespace). The short-name fallback is only used when the target has no namespace information, preventing false positives when two interfaces in different namespaces share the same short name (e.g. App\Logger vs Vendor\Logger).
When the user invokes "Find All References", PHPantom scans all user files for occurrences of the symbol. Vendor files are excluded (matching PhpStorm's behaviour).
Before scanning, ensure_workspace_indexed ensures all user files have symbol maps:
- Phase 1: class_index files (user only) — files already known from
update_astcalls. Vendor and stub URIs are skipped. - Phase 2:
.gitignore-aware workspace walk — uses theignorecrate'sWalkBuilderto recursively discover PHP files under the workspace root, respecting.gitignorerules (including nested and global gitignore files). This automatically skips generated/cached directories likestorage/framework/views/(Laravel blade cache),var/cache/(Symfony), andnode_modules/. The vendor directory is always skipped regardless of.gitignorecontent. Hidden directories are skipped by default.
Both phases parse files in parallel using std::thread::scope. The work is split into chunks (one per CPU core) and each thread reads a file from disk and calls update_ast, which acquires write locks briefly to store results while the expensive parsing step runs without any locks held. Batches of 2 or fewer files skip threading overhead.
Parsed files stay cached in ast_map, symbol_maps, use_map, and namespace_map after the scan completes. There is no post-scan eviction; keeping the entries means subsequent operations (a second find-references call, go-to-definition on a cross-file symbol) benefit from the work already done.
The user_file_symbol_maps() helper snapshots all symbol maps whose URI does not fall under the vendor directory or the internal stub scheme. With Arc<SymbolMap>, the snapshot is a vector of cheap reference-count increments rather than deep clones. Four scanners use this snapshot:
find_class_references— matchesClassReferencespans by resolved FQNfind_member_references— matchesMemberAccessandMemberDeclarationspans by member name, filtered by class hierarchyfind_function_references— matchesFunctionCallspans by resolved FQNfind_constant_references— matchesConstantReferencespans by name
Variable references ($this, local variables) are scoped to the enclosing function/class in the current file only, and do not use the cross-file scan.
When the user triggers "Find References" on a method, property, or constant, the handler resolves the target class and builds a hierarchy set containing all related class FQNs. Only references whose subject resolves to a class in this set are included. This prevents $user->save() from matching $order->save() when User and Order are unrelated.
The hierarchy set is built in two passes:
- Ancestors — walk the parent chain, interfaces, traits, and mixins upward from the target class, collecting every FQN encountered.
- Descendants — scan all classes in
ast_mapandclass_indexfor classes that extend, implement, or use anything already in the set. This repeats until no new FQNs are added (transitive closure), bounded byMAX_INHERITANCE_DEPTH.
For each candidate MemberAccess span, the subject text is resolved to class FQNs using a lightweight path:
self/static/$this→ enclosing class FQN viafind_class_at_offsetparent→ enclosing class's parent FQN- Bare class name (static access) → resolved via the file's use-map and namespace
$variable→ full variable type resolution viaresolve_variable_types
When the subject resolves to one or more FQNs, the reference is included only if at least one FQN is in the hierarchy set. When resolution fails entirely (untyped variable, complex expression), the reference is included conservatively to avoid false negatives.
For MemberDeclaration spans, the enclosing class at the span offset is checked against the hierarchy set. Property declaration sites (which use Variable spans for GTD compatibility) are filtered by checking the declaring class's FQN.
When a variable can hold one of several types (from match arms, ternary branches, null-coalescing, or conditional return types), the completion list shows the union of all members across all possible types, not just the intersection of shared members.
This is a deliberate choice that matches PHPStorm and Intelephense behaviour. The rationale:
- The developer may not have isolated branches yet. When working through a match or ternary, the code often starts with a shared variable before the developer splits behaviour per type. Hiding branch-specific members would block progress during that phase.
- Missing completions are worse than extra completions. Restricting to the intersection would hide useful members whenever a variable has more than one possible type.
- Type safety belongs in diagnostics. Calling a method that only exists on one branch is a potential bug, but the right place to flag it is a diagnostic/static-analysis pass, not the completion list.
This is distinct from narrowing via early return, unset(), or
instanceof guards. Those reflect deliberate developer intent to
eliminate types, so the narrowed-out members are correctly hidden. Union
completion is about types the developer has not yet separated.
Members that are only available on a subset of the union already show the
originating class in the detail field (e.g. "Class: AdminUser"), which
gives the developer a visual hint. A future enhancement could sort
intersection members above branch-only members or add an explicit marker.
When the user types a class name outside of a member-access chain (->, ::), the LSP needs to decide which symbols to offer. PHP reuses many of the same keywords in positions that accept very different kinds of symbols. new only makes sense with concrete classes. implements only accepts interfaces. use inside a class body only accepts traits. Offering the wrong kind of symbol is noisy at best and confusing at worst. This section explains how the LSP detects the context, selects which symbols to show, and handles the many edge cases that arise from incomplete information.
detect_class_name_context() in class_completion.rs walks backward from the cursor through whitespace and comma-separated identifier lists to find the keyword that governs the completion position. It recognises twelve contexts:
| Context | Trigger | What is shown |
|---|---|---|
Any |
Bare identifier (e.g. $x = DateT|) |
All class-likes, constants, and functions |
New |
new | |
Concrete (non-abstract) classes only |
ExtendsClass |
class A extends | |
Non-final classes (abstract is OK) |
ExtendsInterface |
interface B extends | |
Interfaces only |
Implements |
class C implements | |
Interfaces only |
TraitUse |
class D { use | |
Traits only |
Instanceof |
$x instanceof | |
Classes, interfaces, enums (not traits) |
TypeHint |
function f(|), ): |, public | |
Classes, interfaces, enums (not traits) |
UseImport |
use | (top level) |
All class-likes + function/const keyword hints |
UseFunction |
use function | |
Functions only |
UseConst |
use const | |
Constants only |
NamespaceDeclaration |
namespace | |
Namespace names only |
TypeHint is not detected by detect_class_name_context (which only handles keyword-triggered contexts). Instead, the type-hint completion handler (complete_type_hint) and the PHPDoc type completion handler (complete_docblock_type_or_variable) pass ClassNameContext::TypeHint directly to build_class_name_completions. This excludes traits from parameter types, return types, property types, and PHPDoc @param/@return/@var type positions. @throws uses a separate Throwable-filtered path instead.
Comma-separated lists are handled by walking past Identifier, sequences so that implements Foo, Bar, \| still resolves to Implements. Multi-line declarations work the same way because the backward walk skips all whitespace including newlines.
TraitUse vs UseImport is distinguished by brace depth: use at brace depth >= 1 is inside a class body (trait use), while brace depth 0 is a top-level import.
try_class_constant_function_completion() in handler.rs is the entry point. It extracts the partial identifier, detects the context, and branches:
-
UseFunctionandUseConstshort-circuit to dedicated builders (build_function_completions,build_constant_completions). These bypass class-name logic entirely. Items from the current file are filtered out (importing a function from the file you are editing is pointless). A semicolon is appended to the insert text. ForUseFunction, namespaced functions use their FQN as insert text so the resulting statement readsuse function Illuminate\Support\enum_value;. -
NamespaceDeclarationshort-circuits tobuild_namespace_completions, which produces namespace-path items from PSR-4 prefixes and known class FQNs. -
UseImportcallsbuild_class_name_completionswith an empty use-map (the file's own use-map contains the half-typeduseline, which the parser interprets as a real import and would appear as a bogus completion item). The results are post-processed: classes defined in the current file are removed, a semicolon is appended, andfunction/constkeyword hints are injected when the partial matches. -
All other class-only contexts (
New,ExtendsClass,ExtendsInterface,Implements,TraitUse,Instanceof) callbuild_class_name_completionsnormally. Constants and functions are suppressed. -
Anycalls all three builders (classes, constants, functions) and merges the results.
build_class_name_completions collects candidates from five sources. Deduplication is by FQN (seen_fqns set). All candidates are ranked by a five-dimension sort_text key:
{match_quality}{source_tier}{affinity}{demote}{gap}_{short_name_lower}
Match quality (highest weight). Compares the typed prefix against the short name: a = exact match, b = starts-with (or empty prefix), c = substring only. An exact match always ranks above every prefix match, regardless of source or affinity.
Source tier. Three tiers based on how the class relates to the current file:
| Tier | Source | Meaning |
|---|---|---|
0 |
Use-imported classes | Developer explicitly imported this name |
1 |
Same-namespace classes (from ast_map) |
PHP auto-resolves without a use statement |
2 |
Everything else (class index, classmap, stubs) | Requires a use import |
Import affinity. A four-digit inverted score derived from the file's namespace declaration and existing use imports. Namespaces that appear frequently in the file's imports score higher, pushing related classes above unrelated ones within the same tier. The score sums occurrence counts across all ancestor prefixes of the candidate's namespace.
Heuristic demotion. 0 = normal, 1 = demoted by naming-convention heuristics (see below). Demotion sits after affinity but before gap, so a demoted exact match still ranks above every non-demoted prefix match (match quality is higher weight), and a non-demoted name always beats a demoted name within the same affinity group regardless of name length.
Name-length gap. A three-digit distance between the short name length and the typed prefix length (short_name.len() - prefix.len()). Smaller gaps sort first. This is the lowest-weight dimension, breaking ties within the same affinity and demotion group. The effect: typing "Pro" ranks Product (gap 4) above ProductFilterTerm (gap 14) when both live in the same namespace. This smooths the visual transition as a prefix match narrows toward an exact match. Without it, there is a visible "snap" when a candidate suddenly jumps to the top upon becoming an exact match.
The result set is capped at 300 items. When truncated, is_incomplete is set to true so the editor re-requests completions as the user types more characters. Items are sorted by sort_text before truncation so higher-priority items survive.
Each context defines which class-like kinds are valid through matches_kind_flags(kind, is_abstract, is_final). For example, TraitUse only matches ClassLikeKind::Trait, and New only matches non-abstract classes. This filter is applied at every source, but the amount of information available varies by source, which creates a layered filtering strategy.
Loaded classes (in ast_map). The full ClassInfo is available, so the filter is exact. An interface will never appear in TraitUse completions if it has been parsed.
Unloaded stubs (in stub_index but not yet parsed into ast_map). The raw PHP source is available. detect_stub_class_kind scans it for the declaration keyword (class, interface, trait, enum) and modifiers (abstract, final, readonly). This is fast (string search, no tree-sitter parse) and gives accurate kind information for filtering.
Unloaded project classes (in class_index or classmap but not parsed). No kind information is available. These are allowed through the filter with benefit of the doubt. A naming-convention heuristic (likely_mismatch) demotes but does not remove suspicious names (e.g. FooInterface is demoted in ExtendsClass context).
The file's use map is the most problematic source because it can contain entries that are not class-likes at all:
- Namespace aliases (
use App\Models as M;putsM -> App\Modelsin the map, butApp\Modelsis a namespace, not a class). - Non-existent imports (the user may have typed
use Vendor\Something;for a class the LSP has never seen). - Half-typed lines (the parser interprets
use cas importing"c").
Three filters run on use-map entries, in order:
-
is_likely_namespace_not_class. Checks all four class sources (ast_map, class_index, classmap, stubs). If the FQN is found in any of them, it is a real class and passes. If not found, the function looks for positive namespace evidence: is the FQN declared as a namespace innamespace_map, or do known classes exist under it as a prefix (e.g.Cassandra\Exception\AlreadyExistsExceptionprovesCassandra\Exceptionis a namespace)? If positive evidence is found, the entry is rejected. If there is no evidence either way, the entry passes (benefit of the doubt). -
matches_context_or_unloaded. If the class is loaded inast_map, applies exact kind filtering. If not loaded but present instub_index, scans the stub source for the declaration keyword. If truly unknown, allows through. This catches stub interfaces appearing inTraitUseand similar mismatches. -
is_known_class_like(narrow contexts only). For contexts that expect a very specific kind (TraitUse,Implements,ExtendsInterface), an additional check rejects use-map entries whose FQN cannot be found in any class source at all. The reasoning: if we are looking specifically for traits and we have never seen this FQN anywhere, it is almost certainly not a trait. This is deliberately not applied to broader contexts likeNeworInstanceof, where an imported-but-not-yet-indexed class should still appear.
The half-typed-line problem is solved at a higher level: the UseImport handler passes an empty use-map to build_class_name_completions, so the parser's misinterpretation never enters the candidate set.
The UseImport context (use | at top level) has several differences from other contexts:
-
FQN insertion. The insert text is always the full qualified name, not a short name. Writing
use User;is invalid PHP even ifUseris in the current namespace. Theis_fqn_prefixflag is forced totrueandeffective_namespaceis set toNoneto prevent namespace-relative simplification. -
No auto-import text edits. In other contexts, selecting a class from a different namespace generates an
additional_text_editsentry that inserts ausestatement. InUseImportcontext this would be circular (you are already writing theusestatement), so no text edit is generated. -
Semicolons. All class items get a
;appended to the insert text so accepting a completion producesuse App\Models\User;. -
Keyword hints. When the partial matches,
functionandconstkeyword items are injected with a trailing space (not a semicolon) so the user can continue typing the function or constant name. -
Current-file exclusion. Classes defined in the cursor file are filtered out. The same applies to
UseFunction(functions from the current file) andUseConst(constants from the current file).
Many stubs are never parsed into ast_map during a session. Rather than parse thousands of stub files eagerly, the LSP scans the raw PHP source on demand. detect_stub_class_kind finds the short name in the source, checks that it is preceded by a declaration keyword (class, interface, trait, enum), and extracts abstract/final modifiers. This gives accurate filtering without the cost of a full parse.
The scan handles PHP 8.2 readonly classes (final readonly class Foo) by stripping readonly before checking for abstract/final. Multi-class stub files (e.g. V8Js which declares both V8Js and V8JsException) are handled by searching for each name independently.
When a class is not loaded and not a stub (typically a classmap or class_index entry whose file has not been opened), the LSP has no kind information. Rather than guess, it uses naming conventions to adjust sort order via the demotion flag in sort_text:
- Names ending in
Interfaceor starting withI[A-Z]are demoted inExtendsClassandNew. - Names starting with
AbstractorBase[A-Z]are demoted inImplements,ExtendsInterface,TraitUse, andNew.
Demotion sets the fourth dimension of the sort key to 1, pushing the item below non-demoted items within the same match quality, tier, and affinity group. The item is never removed, because naming conventions are not reliable enough to exclude candidates entirely. A demoted exact match still ranks above every non-demoted prefix match, since match quality is the primary dimension.
The symbol_maps store is keyed by file URI, matching ast_map. A typical PHP file has 100–400 navigable symbol tokens. Each SymbolSpan is ~50–100 bytes (two u32 fields plus the SymbolKind enum with a String or two), totalling ~20–40 KB per open file. Variable definition sites add ~1–3 KB. For comparison, open_files already stores the full file content (often 50–200 KB per file), so the symbol map is a small fraction of existing memory use.
Files that are not open (vendor files loaded via PSR-4 on demand) do not get a symbol map — they use the stored byte offsets from Tier 2 (which live on ClassInfo / MethodInfo / etc. in ast_map).
PHPantom earns trust through two promises:
- Provide suggestions for everything the developer expects. Completion, hover, go-to-definition, and every other LSP feature should work on every symbol in the project with no dead spots.
- Stay silent on everything the developer knows works. Every diagnostic PHPantom reports must be something that is genuinely wrong, not something that is merely imprecise or unconventional. If the code runs without errors, PHPantom should not complain about it.
Most PHP developers have learned to ignore IDE errors. They have seen tools produce walls of false positives from bugs in the inference engine, mistakes in stubs, or overly strict rules applied to code that works perfectly well in production. The result is that developers treat red squiggles as noise and test in production to see if a problem is real. PHPantom exists to break that cycle. When a diagnostic appears, it must mean something. A single false positive costs more trust than ten missed true positives, because it teaches the developer to stop reading.
The core diagnostic categories assert type coverage: can the LSP resolve every class, member, and function call in the project?
| Diagnostic | What it asserts |
|---|---|
| Syntax errors | The file parses. |
| Unused imports | The use-map is clean. |
| Unknown classes | Every class reference resolves to a real class. |
| Unknown members | Every member access resolves to a declared member. |
| Unknown functions | Every function call resolves to a declared function. |
| Argument count | Call sites match the target's parameter count. |
| Argument type mismatch | Every argument's type is definitely compatible with the parameter's declared type. |
| Implementation errors | Concrete classes satisfy their interface/abstract contracts. |
| Undefined variables | Every variable read has a prior definition in scope. |
| Deprecated usage | The developer is aware of deprecation. |
A project that passes all of these has 100% type coverage from the LSP's perspective: every symbol is resolvable, every completion trigger produces results, and every hover shows a type.
This also serves as a deliberate division of labour with PHPStan. PHPStan only complains about missing types at levels 6, 9, and 10. PHPantom fills those gaps cheaply and immediately so PHPStan can focus on logic errors rather than fighting incomplete type information. PHPantom rejects structural problems fast (unknown class, unknown member, wrong argument count, unimplemented interface method), and PHPStan hunts for semantic bugs with full type context available.
Argument type mismatch diagnostics (type_error.argument) go beyond coverage into type-compatibility territory. Here the risk of false positives is highest, because PHP is a dynamically typed language with pervasive implicit coercion, and real-world codebases are full of patterns that are technically imprecise but functionally correct.
Every type check is classified into one of three buckets:
- Yes (definitely compatible).
Catpassed toAnimalwhereCat extends Animal. No diagnostic. - No (definitely incompatible).
arraypassed toint. Diagnostic reported. - Maybe (might work, might not).
?Carbonpassed toCarbon, barearraypassed toarray<string>,HtmlStringpassed tostring. No diagnostic.
Only "no" produces a diagnostic. The "maybe" bucket is deliberately wide. If there is any reasonable interpretation under which the code could work at runtime, PHPantom stays silent. Examples of "maybe":
- Nullable to non-nullable (
?CarbontoCarbon). The developer may have null-checked before the call. We cannot see the guard clause from the call site alone. - Bare array to typed array (
arraytoarray<string>). A barearrayis untyped. It might contain strings. We cannot prove otherwise. - Stringable objects to string (
HtmlStringtostring). PHP calls__toString()at runtime. We follow PHP's runtime behaviour. - Int to string. PHP coerces integers to strings in many contexts and we cannot know the
strict_typessetting. list<X>andarray<int, X>. Used interchangeably in practice. array<int, X> might have sequential-keys in actuality.- Interface to concrete subtype (
CarbonInterfacetoCarbon). The interface is broader, but in practice the value is almost always the expected concrete type.
This conservatism is the foundation for trust. Once developers learn that PHPantom's red squiggles are never false alarms, they start paying attention to them. At that point, a stricter mode can be offered as an opt-in for teams that want more help. But the default must be: if we are not certain, we stay silent.
The analyze subcommand (phpantom_lsp analyze --project-root <DIR>) runs all native diagnostics across a project in headless batch mode. It reuses the same Backend initialization pipeline as the LSP server, so results match exactly what a user sees in their editor.
The command exists so users can measure and maintain type coverage without opening files one by one. A clean run (exit code 0) means the LSP can resolve every symbol in the project. Any reported diagnostic is a spot where completion, hover, or go-to-definition will have a gap.
The batch pipeline runs in two parallel phases:
-
Parse phase. Read every user file (discovered via PSR-4 source directories) and call
update_astto populatefqn_index,ast_map,symbol_maps,use_map,namespace_map, andclass_index. After this phase, all user classes are infqn_indexand cross-file references resolve via O(1) hash lookup. -
Diagnose phase. Run all seven diagnostic collectors on every file in parallel (
std::thread::scopewith one chunk per CPU core). Because Phase 1 already indexed all user classes, the diagnostic phase never triggers lazyparse_and_cache_filefor other user files, eliminating write-lock contention.
Output uses PHPStan's table format (with progress bar and coloured output) so the results are familiar to teams already using PHPStan. Severity filtering (--severity error|warning|all) and path scoping let users focus on specific areas.
Diagnostics run in three independent background tokio::spawn tasks so they never block completion, hover, or signature help:
- Native diagnostic worker — collects fast (syntax errors, unused imports, deprecated usage) and slow (unknown classes/members/functions, argument count, implementation errors) diagnostics. Debounces at 500 ms.
- PHPStan worker — runs PHPStan in editor mode (
--tmp-file/--instead-of). Debounces at 2 000 ms. - PHPCS worker — runs PHP_CodeSniffer via
phpcs --report=jsonwith stdin piping. Debounces at 2 000 ms.
Each worker is created during initialized via clone_for_diagnostic_worker, which builds a shallow clone of the Backend. All Arc-wrapped fields (maps, caches, the notify/pending slots) are shared by Arc::clone, so every worker sees all mutations the main Backend makes. The PHPStan and PHPCS workers each have their own notify handle, pending-URI slot, and diagnostic cache so they run independently of each other and of the native diagnostic worker.
Non-Arc fields are snapshotted at spawn time: php_version, vendor_uri_prefixes, vendor_dir_paths, and config. These fields are only written during initialized (before the workers are spawned) and never change afterwards. If a future feature adds hot-reloading of .phpantom.toml or runtime PHP version changes, the workers would need to be notified or re-cloned. This invariant ("init-time fields are write-once") should be verified before adding any post-init mutation to these fields.
PHP class names go through resolution at parse time (resolve_parent_class_names):
- Fully-qualified (
\Foo\Bar) → strip leading\, use as-is. - In use map (
Barwithuse Foo\Bar;) → expand toFoo\Bar. - Qualified (
Sub\Barwithuse Foo\Sub;) → expand first segment. - Unqualified, not in use map → prepend current namespace.
- No namespace → keep as-is.
This runs on parent_class, used_traits, and mixins for every ClassInfo extracted from a file.