Automated discovery and curation of Auxiliary Metabolic Genes (AMGs), Auxiliary Regulatory Genes (AReGs), and Auxiliary Physiology Genes (APGs) encoded by viral genomes
⚠️ This tool is in active development and has not yet been peer-reviewed.
CheckAMG is a pipeline for high-confidence identification and curation of auxiliary genes (AMGs, AReGs, APGs) in viral genomes. It leverages functional annotations, genomic context, and manually curated lists of auxiliary viral gene (AVG) annotations. Its prediction approach reflects years of community-defined standards for identifying auxiliary genes, validating that they are virus-encoded, and filtering common misannotations.
CheckAMG supports:
- Nucleotide or protein input
- Single-contig viral genomes or vMAGs (multi-contig)
- Running on viral genomes or viromes/metagenomes directly
See pyproject.toml for all dependencies. Major packages:
- python >=3.11, <3.13
- lightgbm>=4.5.0
- metapyrodigal>=1.4.1
- polars-u64-idx>=1.30.0
- pyfastatools==2.5.0
- pyhmmer==0.11.1
- snakemake==8.23.2
Step 1: Create a conda environment and install CheckAMG using pip
conda create -n CheckAMG python=3.11 pip
conda activate CheckAMG
pip install checkamg
Step 2: Download the databases required by CheckAMG
The current CheckAMG database is v1 (compatible with CheckAMG versions 0.7.0 and higher). It can be downloaded from Zenodo and set up automatically with checkamg download.
About 40 GB of free disk space will be required to download the databases. This can be reduced to about 21 GB after downloading finishes if the human-readable HMM files are removed by providing the --rm-hmm argument.
checkamg download -d /path/to/db/destination --rm-hmm
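Because the download needs roughly 40 GB of free space, it can be worth checking the destination filesystem first. The following is a minimal sketch using only the Python standard library; `has_enough_space` is a hypothetical helper for illustration and is not part of CheckAMG.

```python
import shutil

# Approximate space needed during download, as stated above (assumption: decimal GB).
REQUIRED_GB = 40


def has_enough_space(path: str, required_gb: float = REQUIRED_GB) -> bool:
    """Return True if the filesystem holding `path` has at least `required_gb` GB free.

    `path` must already exist (for example, the parent of the database destination).
    """
    free_gb = shutil.disk_usage(path).free / 1e9
    return free_gb >= required_gb
```

For example, calling `has_enough_space("/path/to/db")` before running `checkamg download` gives an early warning instead of a failure partway through the download.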
Example data to test your installation of CheckAMG are provided in the examples/example_data folder of this repository.
checkamg download -d /path/to/db/destination
checkamg annotate \
-d /path/to/db/destination \
-i examples/example_data/single_contig_viruses.fasta \
-I examples/example_data/multi_contig_vMAGs \
-o CheckAMG_example_out
CheckAMG has multiple modules. The main modules that will be used for AVG prediction are annotate, de-novo, and end-to-end. Currently, only the annotate module and the associated download module (which fetches the required databases) have been implemented.
Run checkamg -h for full options and module descriptions:
usage: checkamg [-h] [-v] {download,annotate,de-novo,aggregate,end-to-end} ...
CheckAMG: Automated discovery and curation of Auxiliary Metabolic Genes (AMGs),
Auxiliary Regulatory Genes (AReGs), and Auxiliary Physiology Genes
(APGs) encoded in viral genomes.
options:
-h, --help show this help message and exit
-v, --version show program's version number and exit
modules:
{download,annotate,de-novo,aggregate,end-to-end}
download Download the databases required by CheckAMG.
annotate Predict and curate auxiliary genes using functional
annotations and genomic context.
de-novo (Not yet implemented) Predict auxiliary genes with an
annotation-independent method.
aggregate (Not yet implemented) Aggregate results into a final
report.
end-to-end (Not yet implemented) Run annotate, de-novo, and
aggregate in tandem.
The annotate module is for the automated prediction and curation of auxiliary genes in viral genomes based on functional annotations and genomic context.
Basic usage:
checkamg annotate -i <genomes.fna> -d <db_dir> -o <output_dir>
Basic arguments:
- -i, --input-contigs: Path to viral genomes/nucleotide sequences in a single FASTA file
- -I, --input-bins: Path to a folder containing multi-contig vMAGs/bins
- -p, --input-proteins: Path to amino acid sequences from translated contigs in a single FASTA file
- -P, --input-bin-proteins: Path to a folder containing amino acid sequences from translated vMAGs/bins
- -d, --db-dir: Path to the CheckAMG database downloaded with checkamg download
- -o, --output: Path to the CheckAMG output folder to be written
Notes:
- At least one of --input-contigs or --input-bins, or one of --input-proteins or --input-bin-proteins, must be provided
- Nucleotide and protein input types cannot be mixed
- Providing single-contigs versus bins only affects the labeling and organization of results, and does not affect AVG predictions
- Protein headers must be in prodigal format (e.g. >Contig1_1 # 144 # 635 # 1 or >Contig1_2 # 1535 # 635 # -1)
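If you supply your own protein FASTA files, it may help to check the headers before starting a run. The sketch below parses the Prodigal-style header layout described above (`>[CONTIG]_[CDS] # START # END # FRAME # ...`). The names `ProdigalHeader` and `parse_prodigal_header` are hypothetical and this is not CheckAMG's own parser; it only illustrates the expected format.

```python
import re
from typing import NamedTuple


class ProdigalHeader(NamedTuple):
    contig: str
    cds_index: int
    start: int
    end: int
    strand: int


# >[CONTIG]_[CDS] # START # END # FRAME, optionally followed by more "# ..." fields.
# The greedy contig group means the CDS index is taken after the LAST underscore.
_HEADER_RE = re.compile(
    r"^>?(?P<contig>\S+)_(?P<cds>\d+)\s+#\s+(?P<start>\d+)\s+#\s+(?P<end>\d+)"
    r"\s+#\s+(?P<strand>-?1)(?:\s+#.*)?$"
)


def parse_prodigal_header(line: str) -> ProdigalHeader:
    """Parse one FASTA header line; raise ValueError if it is not Prodigal-style."""
    match = _HEADER_RE.match(line.strip())
    if match is None:
        raise ValueError(f"Not a Prodigal-style header: {line!r}")
    return ProdigalHeader(
        contig=match["contig"],
        cds_index=int(match["cds"]),
        start=int(match["start"]),
        end=int(match["end"]),
        strand=int(match["strand"]),
    )
```

Running this over every header line (those starting with `>`) of a protein FASTA and collecting the `ValueError`s gives a quick list of records that would need reformatting.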
Full usage:
usage: checkamg annotate [-h] -d DB_DIR -o OUTPUT [-i INPUT_CONTIGS]
[-I INPUT_BINS] [-p INPUT_PROTEINS]
[-P INPUT_BIN_PROTEINS] [--input-type {nucl,prot}]
[-l MIN_LEN] [-f MIN_ORF] [-a MIN_ANNOT]
[-c COV_FRACTION] [-e EVALUE] [-b BIT_SCORE]
[-bf BITSCORE_FRACTION_HEURISTIC] [-w WINDOW_SIZE]
[-v MIN_FLANK_VSCORE] [-vl MIN_WINDOW_AVG_VL_SCORE]
[-ha | --use-hallmark | --no-use-hallmark]
[--filter-ambig-regions | --no-filter-ambig-regions]
[--filter-avg-arrays | --no-filter-avg-arrays]
[--avg-array-limit AVG_ARRAY_LIMIT]
[--filter-presets FILTER_PRESETS] [-kf] [-pq]
[-t THREADS] [-m MEM] [--debug | --no-debug]
Predict and curate auxiliary genes in viral genomes based on functional
annotations and genomic context.
options:
-h, --help show this help message and exit
required arguments:
-d DB_DIR, --db-dir DB_DIR
Path to CheckAMG database files. (default: None)
-o OUTPUT, --output OUTPUT
Output directory for all generated files and folders.
(default: None)
input arguments:
-i INPUT_CONTIGS, --input-contigs INPUT_CONTIGS
Input nucleotide contigs FASTA (.fna/.fasta; gzipped
allowed). (default: None)
-I INPUT_BINS, --input-bins INPUT_BINS
Folder of binned contig FASTAs (e.g. vMAGs with
multiple contigs). Expects one .fna/.fasta (gzipped
allowed) per bin. (default: None)
-p INPUT_PROTEINS, --input-proteins INPUT_PROTEINS
Input amino-acid FASTA from translated contigs
(.faa/.fasta; gzipped allowed). Expected Prodigal
headers: >[CONTIG]_[CDS] # START # END # FRAME # ...
(default: None)
-P INPUT_BIN_PROTEINS, --input-bin-proteins INPUT_BIN_PROTEINS
Folder of amino-acid FASTAs from translated binned
contigs (.faa/.fasta; gzipped allowed). Expects one
file per bin, each containing proteins from multiple
contigs. (default: None)
--input-type {nucl,prot}
Input type: 'nucl' for nucleotide sequences or 'prot'
for translated amino-acid sequences. Providing
proteins instead of nucleotide sequences skips
pyrodigal-gv, and annotations/contextual analyses are
performed using the provided proteins. So ensure all
proteins from contigs/bins are included and that
headers are formatted as expected (see
--input-proteins). (default: nucl)
thresholds and HMMsearch settings:
-l MIN_LEN, --min-len MIN_LEN
Minimum length (bp) of input contigs for them to be
considered for analysis. (default: 5000)
-f MIN_ORF, --min-orf MIN_ORF
Minimum number of ORFs/proteins per contig for it to
be considered for analysis. (default: 4)
-a MIN_ANNOT, --min-annot MIN_ANNOT
Minimum fraction (0.0-1.0) of genes per contig that
must receive an annotation to be considered for
contextual analysis. (default: 0.2)
-c COV_FRACTION, --cov-fraction COV_FRACTION
Minimum covered fraction (0.0-1.0) of HMM profiles
required to report hits. (default: 0.3)
-e EVALUE, --evalue EVALUE
Maximum fallback E-value for HMM hits when database-
provided cutoffs are unavailable. (default: 1e-05)
-b BIT_SCORE, --bitscore BIT_SCORE
Minimum fallback bit score for HMM hits when database-
provided cutoffs are unavailable. (default: 30)
-bf BITSCORE_FRACTION_HEURISTIC, --bitscore-fraction-heuristic
BITSCORE_FRACTION_HEURISTIC
Retain HMM hits scoring at least this fraction
(0.0-1.0) of its database-provided threshold during
heuristic filtering. (default: 0.5)
genomic context settings:
-w WINDOW_SIZE, --window-size WINDOW_SIZE
Window size (bp) for local average VL-score
calculation. (default: 5000)
-v MIN_FLANK_VSCORE, --min-flank-vscore MIN_FLANK_VSCORE
Minimum V-score (0.0-10.0) required in flanking
regions to verify viral origin and reduce host-
contamination artifacts (higher = more viral-like).
(default: 10.0)
-vl MIN_WINDOW_AVG_VL_SCORE, --min-window-avg-vlscore
MIN_WINDOW_AVG_VL_SCORE
Minimum average VL-score within the specified window
size around a gene to be considered a viral region
(higher = more viral-like). (default: 3.0)
-ha, --use-hallmark, --no-use-hallmark
Use viral hallmark genes instead of V-scores when
evaluating flanks. Enable to be extra conservative.
(default: False)
filtering settings:
--filter-ambig-regions, --no-filter-ambig-regions
Exclude predictions that fall outside strict viral
regions (inside ambiguous regions). Strict viral
regions are identified from window-average VL-scores
and then refined using per-gene V-scores (see
--min-window-avg-vlscore and --min-flank-vscore) or
viral hallmark genes if --use-hallmark is enabled
(stricter, lower recall). When enabled, any
prediction not overlapping a strict viral region is
filtered out. Disabled by default because it can be
too strict when annotation rate is low but other
viral origin signals are strong. Enable to be extra
conservative. (default: False)
--filter-avg-arrays, --no-filter-avg-arrays
Exclude AVG predictions that occur in contiguous runs
(arrays), which suggests non-auxiliary function.
(default: True)
--avg-array-limit AVG_ARRAY_LIMIT
If --filter-avg-arrays is enabled, exclude runs of
AVGs of this length or more. (default: 3)
--filter-presets FILTER_PRESETS
Comma-separated preset(s) controlling functional
annotation filtering. Valid presets:
* default (recommended)
* allow_glycosyl (keep glycosyltransferase, glycoside-
hydrolase, and related annotations)
* allow_nucleotide (keep nucleotide metabolism
annotations)
* allow_methyl (keep methylase/methyltransferase
annotations)
* allow_lipid (keep lipopolysaccharide and phospho-
lipid-related annotations)
* no_filter (disable all filtering, not recommended).
Example: --filter-presets allow_glycosyl,allow_
nucleotide. (default: default)
output files:
-kf, --keep-full-hmm-results
Write all HMM search results for every hit in each
database. By default, only the top hit per protein
per database is written to reduce file size. Not
recommended for large inputs unless --save-as-parquet
is used. (default: False)
-pq, --save-to-parquet
Write intermediate and final tables as parquet files
instead of TSV. Tables will be smaller files but not
human readable without external tools. Recommended
for large datasets. (default: False)
resources:
-t THREADS, --threads THREADS
Maximum number of threads allowed. Default is 25% of
available. (default: 64)
-m MEM, --mem MEM Max memory allowed (GB). Default is 80% of available.
(default: 1431)
--debug, --no-debug Enable debug-level logging. (default: False)
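As an illustration of the --filter-avg-arrays and --avg-array-limit options in the help text above, the following sketch flags AVG candidates that sit in contiguous runs at or above the limit. It assumes genes are given in their order along a single contig and that a "run" simply means consecutive AVG candidates; the function name `filter_avg_arrays` and this definition of contiguity are assumptions, and CheckAMG's actual implementation may differ.

```python
from itertools import groupby


def filter_avg_arrays(is_avg: list[bool], array_limit: int = 3) -> list[bool]:
    """Return, per gene, whether it is still an AVG candidate after array filtering.

    `is_avg` lists genes in contig order. Candidates in a consecutive run of
    `array_limit` or more are dropped; shorter runs are kept.
    """
    kept: list[bool] = []
    for flag, run in groupby(is_avg):
        run_length = len(list(run))
        keep = flag and run_length < array_limit
        kept.extend([keep] * run_length)
    return kept
```

With the default limit of 3, an isolated candidate or a pair of adjacent candidates is retained, while three or more in a row are all excluded.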
Outputs:
The CheckAMG annotate output folder will have the following structure:
CheckAMG_annotate_output
├── CheckAMG_annotate.log
├── config_annotate.yaml
├── results/
│   ├── faa_metabolic/
│   │   ├── AMGs_all.faa
│   │   ├── AMGs_high_confidence.faa
│   │   ├── AMGs_low_confidence.faa
│   │   └── AMGs_medium_confidence.faa
│   ├── faa_physiology/
│   │   ├── APGs_all.faa
│   │   ├── APGs_high_confidence.faa
│   │   ├── APGs_low_confidence.faa
│   │   └── APGs_medium_confidence.faa
│   ├── faa_regulatory/
│   │   ├── AReGs_all.faa
│   │   ├── AReGs_high_confidence.faa
│   │   ├── AReGs_low_confidence.faa
│   │   └── AReGs_medium_confidence.faa
│   ├── final_results.tsv
│   ├── gene_annotations.tsv
│   ├── genes_genomic_context.tsv
│   ├── metabolic_genes_curated.tsv
│   ├── physiology_genes_curated.tsv
│   └── regulation_genes_curated.tsv
├── snakemake/
└── wdir/
- CheckAMG_annotate.log: Log file for the CheckAMG annotate run
- config_annotate.yaml: Snakemake pipeline configuration
- results/: Main results directory
  - faa_metabolic/, faa_physiology/, faa_regulatory/: Predicted AVGs by type and confidence
  - final_results.tsv: Summary table of AVG predictions
    - Note that this table contains information on all genes that made it past the length/CDS filtering steps, including metabolic, physiological, regulatory, and unclassified (not AVG) genes. The "Protein Classification" column can be used to filter by classification.
  - gene_annotations.tsv: All gene annotations
  - genes_genomic_context.tsv: Gene-level genomic context for confidence assignment
  - *_genes_curated.tsv: Curated lists of metabolic, physiological, and regulatory genes after filtering false positives
- snakemake/: Snakemake .done files
- wdir/: Intermediate files
Examples of these output files are provided in the examples/example_outputs folder of this repository.
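Since final_results.tsv contains every gene that passed the length/CDS filters, a first step in downstream analysis is often to subset it by the "Protein Classification" column. The sketch below uses only the Python standard library and assumes a tab-separated file with a header row. The function names are hypothetical, and the exact classification labels should be read from your own output rather than assumed.

```python
import csv
from collections import Counter

CLASS_COLUMN = "Protein Classification"  # column name given in the output description


def count_classifications(tsv_path: str, column: str = CLASS_COLUMN) -> Counter:
    """Count how many rows carry each value of the classification column."""
    with open(tsv_path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        if column not in (reader.fieldnames or []):
            raise KeyError(f"Column {column!r} not found in {tsv_path}")
        return Counter(row[column] for row in reader)


def rows_with_classification(
    tsv_path: str, wanted: set[str], column: str = CLASS_COLUMN
) -> list[dict]:
    """Return the rows whose classification is one of `wanted`."""
    with open(tsv_path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        return [row for row in reader if row.get(column) in wanted]
```

Calling `count_classifications` first shows which labels actually occur in the table, and those labels can then be passed to `rows_with_classification`. If the run used --save-to-parquet, the tables are parquet files and would need a parquet reader instead.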
Coming soon.
Coming soon.
An AVG is an Auxiliary Viral Gene, a virus-encoded gene that is non-essential for viral replication but augments host metabolism (AMGs), physiology (APGs), or regulation (AReGs). Historically, many auxiliary genes were referred to broadly as AMGs, but recently the term AVG has been adopted to include broader host-modulating functions, not just metabolism (see Martin et al. (2025) Nat Microbiol).
Examples:
- A virus-encoded psbA or soxY would be an AMG because they encode proteins with functions in host photosynthesis and sulfide oxidation
- A virus-encoded VasG type VI secretion system protein or HicA toxin would be an APG because they are involved in host physiology
- A LuxR transcriptional regulator or an AsiA anti-sigma factor protein would be an AReG because they are likely involved in the regulation of host gene expression
Despite the name "CheckAMG", this tool also predicts APGs and AReGs using the same pipeline, differing only by functional annotation criteria.
CheckAMG applies a two-stage filtering process:
- First, a curated list of profile HMMs representing metabolic, physiological, and regulatory genes is used to identify initial AVG candidates
- Then, a second curated list of keywords/substrings is used to filter out unlikely AMGs, APGs, and AReGs
Unclassified genes are those with annotations that don't meet thresholds for confident AVG classification, not necessarily unannotated.
Users can control how CheckAMG applies keyword-based filters using the --filter-presets argument. The currently available options are:
- default: Standard annotation filtering behavior (recommended)
- allow_glycosyl: Disables filtering for glycosyltransferase, glycoside-hydrolase, and related annotations
- allow_nucleotide: Disables filtering for nucleotide metabolism annotations
- allow_methyl: Disables filtering for methyltransferase and related annotations
- allow_lipid: Disables filtering for lipopolysaccharide and phospholipid-related annotations
- no_filter: Disables all keyword-based filtering (not recommended)
We generally do not recommend changing --filter-presets from default for most use cases. However, there are scenarios where it may be appropriate to add exceptions to CheckAMG's filtering logic. For example:
- If virus-encoded glycosyltransferases/glycoside-hydrolases, methyltransferases, nucleotide metabolism genes, or lipopolysaccharide/phospholipid metabolism genes are specifically of interest, consider applying the relevant filter presets to include those exceptions
- If you have environment-specific knowledge that makes certain gene functions highly relevant to your study system, you can use the appropriate --filter-presets to retain those annotations if they were originally included among the CheckAMG filters
  - For example, setting --filter-presets allow_glycosyl may include additional potential AMGs involved in carbohydrate degradation when these functions are likely to be enriched in the environmental context of your viral genomes
- If you have other evidence to suggest that annotations flagged by certain keywords are more likely involved in auxiliary metabolic, physiological, or regulatory pathways in the host, rather than essential/core viral functions like genome replication, capsid assembly, cell entry, or lysis
Note: If any non-default values for --filter-presets are used, additional manual curation of functional annotations is still necessary to avoid misclassification of a gene as an AMG, APG, or AReG.
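To make the relationship between presets and keyword filters concrete, here is a minimal sketch of how a comma-separated --filter-presets value could switch filter categories off. The preset names come from the list above; the keyword lists, the `core` category, and all function names are hypothetical placeholders, and CheckAMG's real filter tables and matching logic may be organized differently.

```python
VALID_PRESETS = {
    "default", "allow_glycosyl", "allow_nucleotide",
    "allow_methyl", "allow_lipid", "no_filter",
}

# Hypothetical keyword lists, for illustration only.
FILTER_KEYWORDS = {
    "glycosyl": ["glycosyltransferase", "glycoside hydrolase"],
    "nucleotide": ["nucleotide"],
    "methyl": ["methyltransferase", "methylase"],
    "lipid": ["lipopolysaccharide", "phospholipid"],
    "core": ["capsid", "terminase"],  # stands in for filters no preset can disable
}


def parse_presets(arg: str) -> set[str]:
    """Split a comma-separated preset string and reject unknown names."""
    presets = {part.strip() for part in arg.split(",") if part.strip()}
    unknown = presets - VALID_PRESETS
    if unknown:
        raise ValueError(f"Unknown preset(s): {sorted(unknown)}")
    return presets or {"default"}


def active_keywords(presets: set[str]) -> list[str]:
    """Return the keywords still used for filtering under the given presets."""
    if "no_filter" in presets:
        return []
    allowed = {p.removeprefix("allow_") for p in presets if p.startswith("allow_")}
    return [
        keyword
        for category, keywords in FILTER_KEYWORDS.items()
        if category not in allowed
        for keyword in keywords
    ]


def is_filtered(annotation: str, presets: set[str]) -> bool:
    """True if the annotation text matches any active filter keyword."""
    text = annotation.lower()
    return any(keyword in text for keyword in active_keywords(presets))
```

Under this sketch, `allow_glycosyl,allow_nucleotide` would retain a glycosyltransferase annotation while still filtering annotations in the other categories, which mirrors the behaviour described for the presets above.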
TL;DR It reflects the likelihood that a gene is virus-encoded (vs host/MGE)
AVGs often resemble host genes and can result from contamination. CheckAMG uses local genome context to assign high, medium, or low viral origin confidence based on:
- Proximity to virus-like or viral hallmark genes
- Proximity to transposases or other non-viral mobilization genes
- Local viral gene content, determined using V- and VL-scores (Zhou et al., 2025)
A LightGBM model, trained on real and simulated viral/non-viral data, makes these assignments. Confidence levels refer to the viral origin, not the functional annotation.
TL;DR When in doubt, use high, but medium can be included if your input is virus enriched.
The precision and recall of each confidence level for predicting true viral proteins depend on the input dataset. Whether you should use high, medium, and/or low-confidence AVGs will depend on your knowledge of your input data.
- High-confidence
- CheckAMG assigns confidence levels such that high-confidence predictions can almost always be trusted (false-discovery rate < 0.05 in most cases)
- To maintain the integrity of high-confidence predictions even in cases where viral proteins are relatively rare in the input, high-confidence predictions are conservative
- We recommend using just high-confidence AVGs when viral proteins are relatively rare in the input data (such as mixed-community metagenomes) or when the composition of the input data is unknown
- Medium-confidence
- Using medium-confidence predictions can significantly increase the recovery of truly viral proteins, but they may not always be best to use
- Medium-confidence predictions maintain false-discovery rates < 0.1 in datasets with at least 33% viral proteins, but as input sequences become increasingly non-viral in their protein composition, FDRs begin to surpass 0.1 (see the figure and table, below)
- We recommend using both high- and medium-confidence AVGs if you know that roughly one-third or more of your input sequences are viral, such as outputs from most virus prediction tools or viromes
- Low-confidence
- Low-confidence predictions are not filtered at all, so we only recommend using them when you are certain that all of your input sequences are free of non-viral sequence contamination (complete or high-quality viral genomes), or for testing
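The guidance above can be summarized as a small decision helper. This is only a sketch of the recommendations in this section: the function name and arguments are hypothetical, the 0.33 cutoff is taken from the "at least 33% viral proteins" statement, and the viral fraction is something you would need to estimate yourself.

```python
from typing import Optional

MEDIUM_MIN_VIRAL_FRACTION = 0.33  # from the "at least 33% viral proteins" guidance


def recommended_confidence_levels(
    viral_fraction: Optional[float] = None,
    contamination_free: bool = False,
) -> list[str]:
    """Suggest which viral-origin confidence levels to keep.

    viral_fraction: estimated fraction (0.0-1.0) of input proteins that are viral,
        or None when the composition of the input is unknown.
    contamination_free: True only if all inputs are known to be free of non-viral
        sequence (e.g. complete or high-quality viral genomes).
    """
    if viral_fraction is not None and not 0.0 <= viral_fraction <= 1.0:
        raise ValueError("viral_fraction must be between 0.0 and 1.0")
    if contamination_free:
        return ["high", "medium", "low"]
    if viral_fraction is not None and viral_fraction >= MEDIUM_MIN_VIRAL_FRACTION:
        return ["high", "medium"]
    # Unknown or mostly non-viral input: stay conservative.
    return ["high"]
```

For a mixed-community metagenome of unknown composition this returns only `["high"]`, while output from a virus prediction tool with an estimated viral fraction of 0.7 returns `["high", "medium"]`.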
Below are preliminary results for benchmarking our viral origin confidence predictions against test datasets with varying sequence composition (% of proteins, see the table below for composition):
| Dataset | % Viral Proteins | % MGE Proteins | % Host Proteins |
|---|---|---|---|
| Near all virus | 90.0% | 4.1% | 5.9% |
| Virus enriched | 72.0% | 12.5% | 15.5% |
| Half viral/host | 50.0% | 4.8% | 45.2% |
| Equal viral/nonviral | 50.0% | 20.4% | 29.6% |
| Training distribution | 40.6% | 13.5% | 45.9% |
| Equal viral/MGE/host | 33.3% | 30.0% | 36.7% |
| Integrated proviruses | 38.3% | 7.7% | 53.9% |
| Host enriched | 14.7% | 12.5% | 72.8% |
| Near all host | 5.0% | 5.0% | 90.0% |
| MGE enriched | 8.1% | 75.0% | 16.9% |
TL;DR Profile HMM searches with adaptive adjustment of database-provided thresholds
If you're curious about the internal mechanics of how CheckAMG annotates proteins for function, this section explains the behavior. These settings are designed to balance sensitivity (not missing true hits) and specificity (excluding weak/ambiguous matches), with additional database-specific optimizations for functional reliability.
- Homology Searching Method
  - CheckAMG uses pyhmmer for fast and reproducible HMM searches of user proteins against profile HMMs
- Profile HMM Databases
  - CheckAMG relies on the following databases:
    - KEGG Orthology (KO) (Kanehisa et al., 2016)
    - Functional Ontology Assignments for Metagenomes (FOAM) database (Prestat et al., 2014)
    - Pfam-A (Mistry et al., 2021)
    - Prokaryotic Virus Remote Homologous Groups database (PHROGs) (Terzian et al., 2021)
    - dbCAN CAZyme domain HMM database (Zheng et al., 2023)
    - The METABOLIC HMM database (Zhou et al., 2022)
    - Curated Annotations for Microbial (Poly)phenol Enzymes and Reactions (CAMPER) (McGivern et al., 2024)
  - These databases can be downloaded and processed using the checkamg download module
- E-value Threshold
  - An initial, permissive E-value cutoff of 0.01 is applied during hmmsearch to minimize missed hits due to chunking or memory differences when parallelizing, which can affect search reproducibility
- Coverage Filter
  - After hits are collected, CheckAMG enforces a minimum HMM alignment coverage filter (default 0.30, configurable via --cov-fraction)
  - This is applied during downstream hit filtering so that functional inferences are not drawn from tiny partial alignments
- Database-Specific Thresholds
  - CheckAMG applies specialized rules depending on the HMM source:
    - Pfam: Applies sequence-level gathering threshold (GA); hits below GA are excluded
    - FOAM, KEGG, & CAMPER: Use database-defined bit score thresholds, but apply a relaxed fallback heuristic (see below)
    - METABOLIC: Uses GA cutoffs derived from its underlying Pfam/TIGRFAM sources, where available
- Fallback Heuristic (FOAM, KEGG, & CAMPER)
  - KEGG (and consequently, FOAM and CAMPER, since these databases were largely derived from KEGG KOfams) thresholds can sometimes be overly strict, especially for environmental viruses, filtering out hits that are biologically valid
  - To recover these valid hits, CheckAMG applies a relaxed fallback heuristic inspired by the Anvi'o anvi-run-kegg-kofams strategy:
    - If a hit falls below the database-provided trusted threshold (e.g., KEGG TC), it is still retained if all three conditions below are met:
      - The bit score is at least 50% of the threshold value
      - The E-value is below 1e-5
      - The coverage of the HMM profile aligned to the sequence hit is at least 0.30
    - These values are configurable by the user using the --bitscore-fraction-heuristic, --evalue, and --cov-fraction arguments if desired, but we do not recommend changing them
  - A similar heuristic improves annotation recovery without compromising too much on precision (Kananen et al., 2025)
- Fallback Filtering for Other Databases
  - If the HMM source doesn't have defined cutoffs, such as dbCAN, PHROGs, and some profiles in the METABOLIC database, CheckAMG enforces:
    - A minimum coverage of 0.30 of the HMM profile by the aligned sequence
    - A minimum bit score of 30
    - A maximum E-value of 1e-5
    - These cutoffs are configurable by the user if desired with --cov-fraction, --bitscore, and --evalue
- Result Consolidation and Best-Hit Reporting
  - Each input protein is searched against each HMM database (KEGG, FOAM, Pfam, PHROG, dbCAN, METABOLIC, and CAMPER)
  - All hits are filtered using the criteria above (including the minimum coverage filter)
  - Then, CheckAMG reports (1) per-database best hits and (2) a single cross-database top-hit summary:
    - Per database: only the single best hit per protein is retained and reported for that database
      - Preference is given to the hit with the lowest E-value
      - If E-values are equal, the hit with the higher bit score is selected
    - Across databases: CheckAMG also reports a single best-supported annotation per protein (top_hit_hmm_id, top_hit_description, top_hit_db) by selecting the database whose retained per-database best hit has the largest bit score among databases with a non-null hit
  - Full, unfiltered hmmsearch output can optionally be written for inspection with --keep-full-hmm-results. This output includes all hits per protein per database, including hits that fail the configured thresholds (e.g., bitscore, E-value, and coverage), rather than only the retained best hit. Because these files can be very large, we strongly recommend enabling --save-to-parquet alongside --keep-full-hmm-results to reduce disk usage.
These defaults provide a balance between accuracy and recall, and are based on benchmarking and community best practices. Users may modify thresholds using the --bitscore, --bitscore-fraction-heuristic, --evalue, and --cov-fraction arguments.
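The filtering and best-hit rules described in this section can be sketched in a few lines of Python. This is an illustration under stated assumptions, not CheckAMG's implementation: the `Hit` record, its field names, and the helper functions are hypothetical, the defaults mirror the documented values for --cov-fraction, --evalue, --bitscore, and --bitscore-fraction-heuristic, and treating an E-value exactly at the cutoff as passing is an assumption.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Hit:
    db: str                 # e.g. "KEGG", "Pfam", "PHROG"
    hmm_id: str
    bitscore: float
    evalue: float
    hmm_coverage: float     # fraction of the HMM profile covered (0.0-1.0)
    db_threshold: Optional[float] = None  # database bit score cutoff, if any


# Databases that get the relaxed fallback heuristic.
HEURISTIC_DBS = {"KEGG", "FOAM", "CAMPER"}


def keep_hit(hit: Hit, cov_fraction: float = 0.30, evalue: float = 1e-5,
             bitscore: float = 30.0, bitscore_fraction: float = 0.5) -> bool:
    """Apply the coverage filter, then database-specific or fallback cutoffs."""
    if hit.hmm_coverage < cov_fraction:
        return False
    if hit.db_threshold is None:
        # No database-defined cutoff: use fallback bit score and E-value.
        return hit.bitscore >= bitscore and hit.evalue <= evalue
    if hit.bitscore >= hit.db_threshold:
        return True
    if hit.db in HEURISTIC_DBS:
        # Relaxed heuristic: a fraction of the threshold plus an E-value check.
        return (hit.bitscore >= bitscore_fraction * hit.db_threshold
                and hit.evalue <= evalue)
    return False


def best_hit_per_db(hits: Iterable[Hit]) -> dict[str, Hit]:
    """Per database, keep the retained hit with the lowest E-value
    (ties broken by higher bit score)."""
    best: dict[str, Hit] = {}
    for hit in hits:
        if not keep_hit(hit):
            continue
        current = best.get(hit.db)
        if current is None or (hit.evalue, -hit.bitscore) < (current.evalue, -current.bitscore):
            best[hit.db] = hit
    return best


def top_hit(hits: Iterable[Hit]) -> Optional[Hit]:
    """Across databases, pick the per-database best hit with the largest bit score."""
    return max(best_hit_per_db(hits).values(), key=lambda h: h.bitscore, default=None)
```

Here `hits` would be all hits for one protein. A KEGG hit scoring 60 against a threshold of 100 passes the relaxed heuristic (60 is at least 50% of 100) provided its E-value and coverage also pass, whereas a Pfam hit below its GA cutoff is dropped outright.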
CheckAMG modules are executed as Snakemake pipelines. If a run is interrupted, it can resume from the last complete step as long as intermediate files exist.
CheckAMG is packaged with several curated reference tables under CheckAMG/files/ that define (i) the functional label mappings used for reporting, (ii) the curated AMG/APG/AReG HMMs, and (iii) the AVG filtering tables (including exception categories). These tables are what the pipeline reads and parses when curating annotations.
To make these resources transparent and reproducible, this repository includes a notebook (see make_checkamg_required_tables.ipynb) that was used to build the required tables from upstream sources, including:
- hmm_id_to_name.tsv (cross-database HMM id to name/description mapping)
- FOAM.tsv and vscores.tsv
- AMGs.tsv, APGs.tsv, AReGs.tsv
- AMG_filters.tsv, APG_filters.tsv, AReG_filters.tsv
- viral_hallmark_genes.tsv and mobile_genes.tsv
CheckAMG's required HMM database (downloaded via checkamg download) is formatted and packaged to ensure consistency and standardization across versions. The notebook used to download, format, and build this database, including documentation of the associated source versions, is available at build_checkamg_db.ipynb.
These notebooks are not required to run CheckAMG, but they are provided so others can inspect, regenerate, and update the curated assets when upstream databases change.
To report bugs or request features, please use the GitHub Issues page.
Coming soon.
Authors:
- James C. Kosmopoulos (kosmopoulos [at] wisc [dot] edu)
- Cody Martin
- Karthik Anantharaman (karthik [at] bact [dot] wisc [dot] edu)
