This document is Linux-first. Windows notes are left in a few places for reference, but the main target is a Linux environment running offline behind a local proxy.
The goal is:
- Serve cached URLs from the local database created by `url-vault`.
- Record cache misses into a YAML file.
- Copy that YAML file back to a connected environment.
- Re-run `url-vault` there so the missing URLs are fetched into the same cache layout.
You will need:

- a cache tree populated by `url-vault`
- `mitmproxy` or `mitmdump`
- the `mitm_local_cache.py` addon from this repo
- a shell environment where clients can be pointed at an HTTP(S) proxy
You can run the addon either:
- from the repository root, so it can import `url_vault`
- from any environment where `url-vault` is installed into Python
url-vault stores cache entries under:
```
<destination_dir>/<scheme>/<host>/<path>
```
Examples:
- `https://github.com/folke/lazy.nvim.git` -> `<destination_dir>/https/github.com/folke/lazy.nvim.git`
- `https://github.com/folke/lazy.nvim.git/info/refs?service=git-upload-pack` -> `<destination_dir>/https/github.com/folke/lazy.nvim.git/info/refs/__query__/service%3Dgit-upload-pack`
- `git@github.com:folke/lazy.nvim.git` -> `<destination_dir>/ssh/github.com/folke/lazy.nvim.git`
For ordinary URLs with a query string, the addon first checks the exact query-derived path and then falls back to the same path without the query component.
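That lookup order can be sketched in Python. This is illustrative only: `candidate_cache_paths` is a made-up name, not the addon's real API, and the layout is inferred from the examples above.

```python
# Sketch of the cache lookup order: exact query-derived path first,
# then the same path without the query component.
from pathlib import PurePosixPath
from urllib.parse import quote, urlsplit

def candidate_cache_paths(url: str, destination_dir: str) -> list[str]:
    """Return cache paths to try, most specific first."""
    parts = urlsplit(url)
    base = PurePosixPath(destination_dir, parts.scheme, parts.netloc,
                         parts.path.lstrip("/"))
    candidates = []
    if parts.query:
        # e.g. .../info/refs/__query__/service%3Dgit-upload-pack
        candidates.append(str(base / "__query__" / quote(parts.query, safe="")))
    # Fallback: the path without the query component
    candidates.append(str(base))
    return candidates
```

For the `info/refs?service=git-upload-pack` example above, this yields the `__query__/service%3Dgit-upload-pack` path first, then the plain `info/refs` path.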
```sh
python -m pip install mitmproxy
```

Or:

```sh
uv tool install mitmproxy
```

Run mitmproxy one time:

```sh
mitmdump --listen-host 127.0.0.1 --listen-port 8080
```

This creates CA files under `~/.mitmproxy/`.
The file most CLI tools need is:
```
~/.mitmproxy/mitmproxy-ca-cert.pem
```
This repo includes `mitm_local_cache.py`.
It does three things:
- looks up requested URLs in the local cache tree
- serves the file immediately on cache hit
- appends cache misses into a YAML file on cache miss
Current miss-log behavior:
- only `GET` and `HEAD` are recorded
- duplicate URLs are merged
- `count`, `first_seen`, `last_seen`, and `last_method` are tracked
The miss log schema is compatible with `request_files` in `config.yaml`.
Repository-root example:
```sh
export MIRROR_ROOT="$HOME/repo_mirrors"
export MISS_LOG="$PWD/requests/offline-misses.yaml"

mitmdump \
  --mode regular \
  --listen-host 127.0.0.1 \
  --listen-port 8080 \
  -s ./mitm_local_cache.py \
  --set cache_root="$MIRROR_ROOT" \
  --set miss_log_path="$MISS_LOG" \
  --set offline_only=true
```

Installed-package example:
```sh
export MIRROR_ROOT="$HOME/repo_mirrors"
export MISS_LOG="$HOME/url-vault/requests/offline-misses.yaml"

mitmdump \
  --mode regular \
  --listen-host 127.0.0.1 \
  --listen-port 8080 \
  -s /path/to/mitm_local_cache.py \
  --set cache_root="$MIRROR_ROOT" \
  --set miss_log_path="$MISS_LOG" \
  --set offline_only=true
```

Notes:
- `offline_only=true` makes cache misses fail fast with `404`.
- If you later want online fallback, set `offline_only=false`.
- Cache hits return `X-Url-Vault-Cache: hit`.
Use both upper and lower case variables for compatibility:
```sh
export HTTP_PROXY="http://127.0.0.1:8080"
export HTTPS_PROXY="http://127.0.0.1:8080"
export ALL_PROXY="http://127.0.0.1:8080"
export NO_PROXY="localhost,127.0.0.1,::1"
export http_proxy="$HTTP_PROXY"
export https_proxy="$HTTPS_PROXY"
export all_proxy="$ALL_PROXY"
export no_proxy="$NO_PROXY"
```

This typically covers:
- `curl`
- `wget`
- Git
- Python `requests`
- tools layered on top of Git or libcurl
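For Python specifically, the standard library reads these variables through `urllib.request.getproxies()`, and `requests` reuses that same helper, which is why exporting the variables is usually all that Python tooling needs:

```python
# Demonstrate that Python picks the proxy settings up from the environment.
import os
import urllib.request

os.environ["http_proxy"] = "http://127.0.0.1:8080"
os.environ["https_proxy"] = "http://127.0.0.1:8080"
os.environ["no_proxy"] = "localhost,127.0.0.1,::1"

proxies = urllib.request.getproxies()
print(proxies["http"], proxies["https"])
```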
Point common tools at the mitmproxy CA:
```sh
export MITM_CA="$HOME/.mitmproxy/mitmproxy-ca-cert.pem"
export SSL_CERT_FILE="$MITM_CA"
export CURL_CA_BUNDLE="$MITM_CA"
export REQUESTS_CA_BUNDLE="$MITM_CA"
export GIT_SSL_CAINFO="$MITM_CA"
```

wget example:

```sh
wget --ca-certificate="$MITM_CA" -S -O - https://example.com/file.txt
```

If the CA file is missing or the client does not trust it, Git, curl, and other HTTPS clients will usually fail with TLS or certificate errors before cache behavior is even relevant.
Git mirrors are stored as bare repositories. That is enough to hold the objects locally, but HTTP clients are still expecting a Git-over-HTTP view of those files.
url-vault already runs:
```sh
git --git-dir <mirror> update-server-info
```

after each Git sync. That helps dumb-HTTP clients by generating `info/refs` metadata inside the mirror.
This is still not the same thing as a full smart Git HTTP server. If you later need better compatibility, the next step is probably `git-http-backend` in front of the same cache tree.
In practice, offline Git cloning works best when clients are forced onto dumb HTTP:
```sh
export GIT_SMART_HTTP=0
```

Without that, Git may try smart-HTTP behavior that a static cache tree cannot satisfy reliably.
In the offline Linux environment:
- Run the proxy with `miss_log_path` pointing at a writable YAML file.
- Let users run Neovim, Git, `curl`, `wget`, or other tools through the proxy.
- When the cache misses, the addon writes or updates entries in `requests/offline-misses.yaml`.
Example:
```yaml
kind: url
requests:
  - url: https://github.com/folke/lazy.nvim.git/info/refs?service=git-upload-pack
    count: 2
    first_seen: 2026-03-15T18:00:00Z
    last_seen: 2026-03-15T18:05:00Z
    last_method: GET
```

Then:
- Copy that YAML file to the connected environment.
- Keep it listed under `request_files` in `config.yaml`.
- Run `url-vault --once`.
- Copy the refreshed cache tree back to the offline environment.
The request file is deduplicated by URL and keeps hit counts plus timestamps.
Manual miss capture is only part of the loop. Known high-value sets should be prefetched in advance.
This repo already uses that pattern:
- `prefetch.d/neovim-plugins.yaml` keeps the Neovim plugin prefetch list in the generic `kind + entries` format
- `config.yaml` points at that file through `prefetch_files`
That lets you warm the cache for likely requests before an offline user asks for them.
With the proxy running and env vars set:
```sh
curl -I https://github.com/folke/lazy.nvim.git/info/refs
wget --ca-certificate="$MITM_CA" -S -O - https://github.com/folke/lazy.nvim.git/info/refs
```

If the file exists in the cache, the response should come directly from mitmproxy.
If it does not exist, the addon should:
- record the miss in `requests/offline-misses.yaml`
- return `404` when `offline_only=true`
For Git URLs, do not assume that a successful offline clone means zero 404s or zero miss-log entries. Dumb-HTTP Git clients may probe a few optional paths first, including:
- `objects/info/http-alternates`
- `objects/info/alternates`
- individual loose-object paths under `objects/<xx>/...`
Those probes can miss even when the mirror is healthy and Git later falls back to packed objects successfully. Treat clone success plus pack-file responses as the stronger signal. Unexpected misses outside those probe patterns are more likely to indicate a real cache gap.
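A rough triage of the miss log along those lines can be sketched as a pattern check. The patterns and the `is_expected_probe` name are illustrative, not part of the addon; tune them to your own mirrors.

```python
# Classify a miss as a harmless Git dumb-HTTP probe vs. a likely cache gap.
import re

BENIGN_PROBES = [
    re.compile(r"/objects/info/(http-)?alternates$"),
    re.compile(r"/objects/[0-9a-f]{2}/[0-9a-f]{38,62}$"),  # loose objects (SHA-1 or SHA-256)
]

def is_expected_probe(url: str) -> bool:
    """True if a miss matches a probe that can fail on a healthy mirror."""
    path = url.split("?", 1)[0]
    return any(p.search(path) for p in BENIGN_PROBES)
```

Misses that fall outside these patterns are the ones worth investigating as real cache gaps.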
If you ever want to test from Windows, the same proxy variables exist in PowerShell:
```powershell
$env:HTTP_PROXY = "http://127.0.0.1:8080"
$env:HTTPS_PROXY = "http://127.0.0.1:8080"
$env:ALL_PROXY = "http://127.0.0.1:8080"
$env:NO_PROXY = "localhost,127.0.0.1,::1"
$env:http_proxy = $env:HTTP_PROXY
$env:https_proxy = $env:HTTPS_PROXY
$env:all_proxy = $env:ALL_PROXY
$env:no_proxy = $env:NO_PROXY
```