feat(cli): add gpu safe address mining #15

Open
mmv08 wants to merge 1 commit into nlordell:main from mmv08:gpu


mmv08 commented May 3, 2026

Add Opt-In GPU Mining for Safe Vanity Addresses

Summary

This branch adds an optional GPU mining mode to deadbeef.

The normal CPU miner is still available and remains the source of truth. When GPU mining is enabled, the GPU quickly scans possible Safe salt nonce values and returns a possible match. The CPU then checks that candidate again before the CLI prints the final address and transaction data.

The GPU work was inspired by create2crunch.

What This Unlocks

The practical win is that longer vanity prefixes become realistic on consumer-grade GPU hardware.

Using the RTX 5090 benchmark in the README, the GPU miner checks about 2.2 billion candidate addresses per second. The same machine has a 12-core CPU. If we take the README's single-thread CPU benchmark and assume perfect scaling across all 12 physical cores, the GPU is still about 107x faster.

At that speed, a 5-byte prefix such as 0x5afe000000, 0xdeadbeef00, or 0x0000000000 takes about 8 minutes on average on the GPU. With the optimistic 12-core CPU estimate above, the same search would take about 15 hours.

A 6-byte prefix such as 0x5afe00000000 is still much harder. On the same RTX 5090 benchmark, it is closer to a day and a half on average. With the same optimistic 12-core CPU estimate, it would be closer to 5 months. So this change does not make every vanity address cheap, but it moves useful 5-byte prefixes into the "run it over lunch" range and makes some longer searches possible without a dedicated mining setup.
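
The estimates above follow from simple arithmetic: an n-byte prefix matches a uniformly distributed address with probability 1/256^n, so the expected number of candidates to test is 256^n. A minimal Rust sketch of that calculation (the function names are illustrative and not part of deadbeef):

```rust
/// Expected number of candidates before one matches an n-byte prefix.
/// Each candidate matches with probability 1 / 256^n, so the mean of the
/// resulting geometric distribution is 256^n.
fn expected_attempts(prefix_bytes: u32) -> f64 {
    256f64.powi(prefix_bytes as i32)
}

/// Expected search time in seconds at `rate` candidates per second.
fn expected_seconds(prefix_bytes: u32, rate: f64) -> f64 {
    expected_attempts(prefix_bytes) / rate
}

/// CPU rate implied by the description: the GPU rate divided by the
/// quoted speed-up over the optimistic 12-core estimate.
fn implied_cpu_rate(gpu_rate: f64, speedup: f64) -> f64 {
    gpu_rate / speedup
}
```

With the figures quoted above, `expected_seconds(5, 2.2e9)` is roughly 500 seconds (about 8 minutes) and `expected_seconds(6, 2.2e9)` is roughly 128,000 seconds (about a day and a half). Using `implied_cpu_rate(2.2e9, 107.0)` as the rate gives about 15 hours and about 5 months respectively. These are averages; an individual search can finish much sooner or take several times longer.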

Important Safety Note

The GPU kernel was written with the assistance of AI.

The Keccak hashing function inside the GPU shader has not been cryptographically verified. Because of that, we do not treat the GPU result as trusted on its own.

Instead, the CPU implementation is always treated as the known-good implementation. The GPU only proposes a candidate. Before returning a result, the CLI uses the CPU code to recompute the Safe address and confirm that it really matches the requested prefix.

If the GPU ever returns a wrong candidate, the CLI reports an error instead of printing unsafe deployment data.

What Changed

  • Added --gpu to mine Safe vanity addresses with a GPU.
  • Added --list-gpus to show the GPU adapters available through wgpu.
  • Added --gpu-backend so users can choose the backend, including metal, vulkan, dx12, and gl.
  • Added --gpu-adapter so users can select a specific GPU from the adapter list.
  • Added --gpu-batch-size so users can tune how many candidate addresses are checked in one GPU dispatch.
  • Added a hidden --allow-software-gpu flag for tests and diagnostics.
  • Kept CPU mining as the default behaviour.
  • Kept --quiet compatible with GPU mining by suppressing GPU progress output.
  • Documented GPU usage, backend selection, batch-size tuning, and reference benchmark results in the README.

How GPU Mining Works

The CPU still builds the Safe configuration, initialiser, factory address, and proxy init code hash.

For GPU mining, the CPU sends the fixed search inputs to the GPU:

  • the Safe initialiser hash
  • the Safe proxy factory address
  • the Safe proxy init code hash
  • the requested address prefix
  • a random salt nonce prefix
  • a counter range to scan
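
To show how these fixed inputs combine with a salt nonce, here is a hedged Rust sketch of the address derivation. It assumes the usual Safe proxy factory scheme, where the CREATE2 salt is the Keccak-256 hash of the initialiser hash followed by the salt nonce. `FixedInputs` and `safe_address` are hypothetical names, not the actual `SearchContext` API, and the hash function is passed in as a parameter because Keccak-256 is not in the Rust standard library.

```rust
type Hash = [u8; 32];

/// Hypothetical container mirroring the fixed inputs listed above.
struct FixedInputs {
    initialiser_hash: Hash,
    factory: [u8; 20],
    init_code_hash: Hash,
}

/// Derive the Safe address for one salt nonce.
/// `keccak256` must be a real Keccak-256 implementation in practice.
fn safe_address(
    inputs: &FixedInputs,
    salt_nonce: &[u8; 32],
    keccak256: impl Fn(&[u8]) -> Hash,
) -> [u8; 20] {
    // salt = keccak256(initialiser_hash ++ salt_nonce)
    let mut salt_preimage = [0u8; 64];
    salt_preimage[..32].copy_from_slice(&inputs.initialiser_hash);
    salt_preimage[32..].copy_from_slice(salt_nonce);
    let salt = keccak256(&salt_preimage);

    // CREATE2: keccak256(0xff ++ factory ++ salt ++ init_code_hash)[12..]
    let mut preimage = [0u8; 85];
    preimage[0] = 0xff;
    preimage[1..21].copy_from_slice(&inputs.factory);
    preimage[21..53].copy_from_slice(&salt);
    preimage[53..].copy_from_slice(&inputs.init_code_hash);
    let digest = keccak256(&preimage);

    // The address is the last 20 bytes of the digest.
    let mut address = [0u8; 20];
    address.copy_from_slice(&digest[12..]);
    address
}
```

Only the salt nonce changes between candidates, which is why everything else can be computed once on the CPU and shared with the shader.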

The GPU checks many counter values in parallel. Each counter becomes the last 8 bytes of the Safe salt nonce. If the shader finds a value that appears to create an address with the requested prefix, it writes that nonce into a small result buffer.
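
A small Rust sketch of the nonce layout described above. The 24-byte random prefix and 8-byte counter follow from the description. The big-endian byte order and the two-word counter representation are assumptions for illustration (core WGSL has no 64-bit integer type, so a shader would typically carry between two `u32` words). The function names are hypothetical.

```rust
/// Build a 32-byte salt nonce from a 24-byte random prefix and a counter.
/// Assumption: the counter is stored big-endian in the last 8 bytes.
fn salt_nonce(prefix: [u8; 24], counter: u64) -> [u8; 32] {
    let mut nonce = [0u8; 32];
    nonce[..24].copy_from_slice(&prefix);
    nonce[24..].copy_from_slice(&counter.to_be_bytes());
    nonce
}

/// Add a per-invocation offset to a 64-bit counter held as two u32 words,
/// the way a shader without native u64 support might do it.
/// Returns the new (hi, lo) pair.
fn add_with_carry(hi: u32, lo: u32, offset: u32) -> (u32, u32) {
    let (new_lo, carried) = lo.overflowing_add(offset);
    // Propagate the carry from the low word into the high word.
    (hi.wrapping_add(carried as u32), new_lo)
}
```

Getting the byte order or the carry wrong would make the GPU test different nonces from the ones it reports, which is one reason the branch tests nonce derivation explicitly.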

The CPU reads the candidate nonce, recomputes the Safe address using the existing Rust implementation, and accepts the result only if the CPU-computed address matches the requested prefix.
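
The verify-before-trust step can be sketched as follows. This is an illustration of the pattern, not the code in this branch: `verify_candidate` and `VerifyError` are hypothetical names, and `cpu_address` stands in for the existing Rust address computation.

```rust
/// 20-byte Ethereum address.
type Address = [u8; 20];
/// 32-byte Safe salt nonce.
type SaltNonce = [u8; 32];

#[derive(Debug, PartialEq)]
enum VerifyError {
    /// The CPU-computed address does not start with the requested prefix.
    PrefixMismatch { computed: Address },
}

/// Accept a GPU candidate only if the trusted CPU derivation agrees.
fn verify_candidate(
    candidate: SaltNonce,
    prefix: &[u8],
    cpu_address: impl Fn(&SaltNonce) -> Address,
) -> Result<Address, VerifyError> {
    // Recompute on the CPU; the GPU's own result is never used directly.
    let computed = cpu_address(&candidate);
    if computed.starts_with(prefix) {
        Ok(computed)
    } else {
        Err(VerifyError::PrefixMismatch { computed })
    }
}
```

Because the check runs once per reported candidate rather than once per scanned nonce, its cost is negligible next to the search itself.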

Implementation Details

  • Added a new cli/src/gpu module split into adapter selection, dispatch layout, host/GPU input packing, mining, and tests.
  • Added cli/src/gpu.wgsl, which contains the compute shader used by the GPU miner.
  • Added SearchContext in deadbeef-core so the CLI can share fixed Safe address search inputs with the GPU path without moving full Safe setup logic into the shader.
  • Added Safe::search_context() and Safe::update_salt_nonce() helpers so both CPU and GPU mining paths use the same final Safe transaction construction.
  • Added wgpu, pollster, bytemuck, and rand dependencies to the CLI crate.
  • Added adapter filtering so software-rendered GPU adapters are not used by default.
  • Added a GL backend guard for the current fallback path, where the CLI exits without tearing down the GL miner after a successful run.

Testing and Validation

This branch adds tests for:

  • GPU CLI argument parsing.
  • --list-gpus behaviour without requiring Safe owner or prefix arguments.
  • prefix length validation.
  • GPU dispatch-size rounding.
  • host-side input and result memory layout.
  • shader workgroup size matching the Rust host code.
  • GPU nonce derivation for counter byte order and carry behaviour.
  • GPU matching across every Ethereum address prefix length from 1 to 20 bytes.
  • CPU recomputation of Safe addresses from SearchContext.

The GPU tests are written so they skip when no GPU adapter is available. A separate, ignored benchmark test can be run manually to measure GPU throughput.
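
As an illustration of the dispatch-size rounding covered by the tests, a batch size can be rounded up to whole workgroups like this. The constant and function names are hypothetical, and 256 is an assumed workgroup size; the real value has to match the `@workgroup_size` declared in the shader.

```rust
/// Assumed workgroup size for this sketch only.
const WORKGROUP_SIZE: u32 = 256;

/// Number of workgroups needed to cover `batch_size` candidates, rounding up.
fn workgroup_count(batch_size: u32) -> u32 {
    batch_size.div_ceil(WORKGROUP_SIZE)
}

/// Number of candidates actually scanned once the batch is rounded up
/// to a whole number of workgroups. Widened to u64 to avoid overflow.
fn rounded_batch_size(batch_size: u32) -> u64 {
    u64::from(workgroup_count(batch_size)) * u64::from(WORKGROUP_SIZE)
}
```

For example, a requested batch of 257 candidates needs 2 workgroups under this assumption and therefore scans 512 candidates.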

mmv08 force-pushed the gpu branch 5 times, most recently from 867e5c5 to db291dd on May 8, 2026 19:44
Add an opt-in wgpu compute miner with adapter listing, backend selection, and GPU batch sizing for Metal, Vulkan, and DX12 backends.

Expose Safe search context APIs so CPU and GPU derivation share the same Safe-specific CREATE2 inputs, and keep CPU verification as the source of truth for GPU hits.

Document the host/device buffer layout, nonce scanning strategy, shader helpers, and reference benchmark environment.