feat: inference optimization — new API, perf improvements, bug fixes (#1)
## New `megalodon.inference` package

Public API (`from megalodon.inference import ...`):

- `validate_smiles(smiles)` — validates SMILES before featurization; rejects salts, unsupported elements (LoQI vocab: 17 atoms), and radicals
- `generate_conformers(smiles_list, model, cfg, n_confs, batch_size=48, max_atoms_per_batch)` — batched conformer generation returning a structured `ConformerGenerationResult` with per-SMILES conformer lists, error records, and `.to_sdf()` serialization; default `batch_size=48` (sweep-validated optimum on L40S: 8.7 conf/s at 83% SM utilization, 1.2 GB peak)
- `ffd_pack_indices` / `pack_batches` — First-Fit-Decreasing atom-count bin-packing to minimise padding waste on heterogeneous molecule sets
- `ConformerGenerationResult` / `MoleculeProcessingError` — typed result objects

## Performance improvements (`src/megalodon/models/` and `dynamics/`)

- Pre-build time tensors before the diffusion loop (eliminates per-step allocation)
- Register `freqs` as a buffer in `TimestepEmbedder` (per-call CPU→GPU transfer gone)
- Cache time embeddings per discrete timestep in `MegaFNV3Conf` (24/25 MLP calls eliminated per sample)
- Precompute the attention mask once before the diffusion loop (25 recomputations eliminated)
- Pre-encode null-variable one-hots before the sample loop (25 redundant `F.one_hot` calls eliminated)
- Skip softmax for `discrete_null` pass-through logits in `separate_discrete_variables`
- Convert `attn_mask` to a float additive bias (0.0 / -inf), enabling efficient Flash Attention dispatch
- Skip ETKDG 3D embedding in the app inference path (coordinates are overwritten by the diffusion prior anyway)

## Bug fixes

- `batch_preprocessor` argument typo in sample_conformers.py
- Duplicate `lin_edge1` layer definition in fn_model.py
- Stray `torch.max` expressions with discarded results in fn_model.py
- `ModuleDict[key] is None` check (was `.get()` / `not in`, both wrong for an `nn.ModuleDict` holding `None` values)
- `DataLoader` import moved from deprecated `torch_geometric.data` to `torch_geometric.loader`
- `copy(base_data)` → `base_data.clone()` in featurization (the shallow copy shared tensor storage, causing in-place mutation bugs)
- Inline `_ATOM_ENCODER` to avoid an 8 s pytorch-lightning transitive import
- `Chem.SetUseLegacyStereoPerception(True)` in the package `__init__.py` to match training-time stereo assignment
- Restore `--skip_eval` CLI arg as a no-op for backward compat
- Preserve `_Name` from SDF mol inputs in pickle output IDs
- Add a warning comment on the Z-branch float-mask incompatibility in fn_model.py

## Scripts / tooling

- `scripts/sample_conformers.py` refactored to use the `generate_conformers()` API
- `scripts/benchmark_inference.py` — timing + accuracy benchmark (20 curated drug-like molecules, batch sweep, FFD vs fixed, SMILES round-trip check)
- `scripts/sustained_perf_test.py` — large-scale sustained-load test using real ChEMBL3D test-set SMILES with stratified size sampling
- `scripts/batch_size_sweep.py` — batch-size sweep with live nvidia-smi GPU SM% / memory-BW% sampling; identifies the throughput knee and efficiency optimum
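The First-Fit-Decreasing packing exposed by `ffd_pack_indices` / `pack_batches` can be illustrated with a small self-contained sketch. The `ffd_pack` function below is a hypothetical re-implementation for illustration, not the package's actual code; its atom budget mirrors the `max_atoms_per_batch` parameter:

```python
# Hedged sketch: First-Fit-Decreasing bin-packing of molecules by atom count.
# `ffd_pack` is illustrative only; the real megalodon.inference helpers may
# differ in signature and behaviour.

def ffd_pack(atom_counts, max_atoms_per_batch):
    """Return batches (lists of molecule indices) respecting the atom budget."""
    # Sort indices by descending atom count (the "Decreasing" in FFD).
    order = sorted(range(len(atom_counts)), key=lambda i: -atom_counts[i])
    bins = []  # each bin is [remaining_capacity, [molecule indices]]
    for i in order:
        n = atom_counts[i]
        if n > max_atoms_per_batch:
            raise ValueError(f"molecule {i} exceeds the atom budget")
        # "First Fit": place into the first bin with enough remaining room.
        for b in bins:
            if b[0] >= n:
                b[0] -= n
                b[1].append(i)
                break
        else:
            bins.append([max_atoms_per_batch - n, [i]])
    return [indices for _, indices in bins]

batches = ffd_pack([30, 12, 25, 8, 40, 5], max_atoms_per_batch=50)
# Every molecule appears in exactly one batch and no batch exceeds 50 atoms;
# large molecules seed the bins, small ones fill the leftover padding slack.
```

Packing by descending atom count first is what keeps per-batch atom totals close to the budget, which is exactly why it reduces padding waste on heterogeneous molecule sets.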
## Summary
- `megalodon.inference` package: public `generate_conformers()` API accepting a list of SMILES and returning a typed `ConformerGenerationResult`; includes SMILES validation, identity-preserving featurization, FFD atom-count bin-packing, and variable `n_confs` per molecule
- Bug fixes: `ModuleDict` null-interpolant check, shallow `copy()` → `.clone()` for PyG Data, deprecated DataLoader import, 8 s pytorch-lightning transitive import, backward-compat CLI fixes, `_Name` preservation from SDF inputs

## Changes
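The shallow `copy()` → `.clone()` fix mentioned above guards against shared mutable storage. A minimal stdlib-only sketch of the same pitfall, using a hypothetical `FakeData` class as a stand-in for a PyG `Data` object (names and structure are illustrative, not the project's code):

```python
import copy

# Illustrative stand-in for a PyG Data object: attributes reference
# mutable arrays, much like tensors sharing storage.
class FakeData:
    def __init__(self, pos):
        self.pos = pos  # mutable "tensor" storage

    def clone(self):
        # Deep-copies the underlying storage, analogous to Tensor.clone().
        return FakeData(copy.deepcopy(self.pos))

base = FakeData(pos=[0.0, 0.0, 0.0])

shallow = copy.copy(base)    # copies the wrapper object, NOT the storage
shallow.pos[0] = 1.0         # mutates base.pos through the shared list
print(base.pos[0])           # → 1.0 (the in-place mutation bug)

base.pos[0] = 0.0            # reset
cloned = base.clone()
cloned.pos[0] = 1.0          # independent storage after clone()
print(base.pos[0])           # → 0.0 (the fix)
```

In the real featurization path the same reasoning applies to tensor attributes on `torch_geometric.data.Data`: `copy.copy` duplicates the container but leaves tensor storage shared, so later in-place edits leak across samples.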
- `src/megalodon/inference/` — new package (`__init__`, validation, featurization, generation, result, batching)
- `src/megalodon/models/module.py` — inference-path performance improvements
- `src/megalodon/dynamics/fn_model.py` — `freqs` buffer; remove duplicate layer; Z-branch warning comment
- `scripts/sample_conformers.py` — refactored to the `generate_conformers()` API
- `scripts/benchmark_inference.py` — timing + accuracy benchmark

## Benchmark (NVIDIA L40S, 20 drug-like molecules)
Timing comparison: `batch_size=4` vs `batch_size=16`, `n_confs=5` × 20 mols (table values omitted).

Accuracy on 100 generated conformers: 100/100 have valid 3D coordinates, correct atom count, and a SMILES round-trip match.
## Test Plan
- `python scripts/benchmark_inference.py --config ... --ckpt ... --dataset_root ...` (full suite)
- `python scripts/sample_conformers.py --input "c1ccccc1" --config ... --ckpt ... --output out.sdf --n_confs 5`
- `streamlit run app/app.py` and verify the app still functions