15 changes: 15 additions & 0 deletions README.md
@@ -40,6 +40,21 @@ To test the HCASE method, you can run the `embed_uspto.py` script located in the

 This script will execute the embedding process and provide the results.

+## Interactive Heatmap Plotting and Features
+
+The HCASE repository includes an interactive heatmap visualization tool for exploring the distribution of embedded chemical structures. This tool is implemented in `hcase/plot.py` and provides the following features:
+
+- **Automatic Deployment**: The Dash app for the heatmap visualization is automatically launched when running the USPTO embedding workflow (e.g., `examples/EmbedUSPTO/embed_uspto.py`). No separate launch step is required.
+- **Frequency Heatmap**: Visualizes the frequency of structures in each region of the embedded space, using log-scaled bins for better dynamic range.
+- **Color Scheme**: The heatmap uses a discrete, inverted color scale: light colors represent low frequency counts, and dark colors represent high frequency counts.
+- **Plot Area Border**: The plot area (data region) is outlined with a clear border, making the data region visually distinct. Axis ticks and labels are positioned just outside this border.
+- **Interactive Search**: Users can search for specific reference or target structures by SMILES or InChIKey, and matching points are highlighted on the heatmap.
+- **Structure Display**: Clicking on a heatmap cell displays the reference structure and a grid of corresponding target structures, with options to highlight substructure overlaps.
+- **Frequency Range Slider**: A slider allows users to filter the heatmap by frequency range, using log-scale bins.
+- **Legend**: A color legend explains the mapping between frequency bins and colors.
+
+To use the heatmap tool, simply run the USPTO embedding workflow as described above. The Dash app will be deployed automatically, enabling detailed exploration of the chemical embedding space and helping users identify regions of interest, structure clusters, and outliers.
+
 ## For Contributors

 We welcome contributions to the HCASE method! Here's how you can contribute:
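The README text in this hunk mentions log-scaled frequency bins and a frequency range slider, but `hcase/plot.py` itself is not part of the lines shown here. As a minimal sketch of the idea only, assuming square grid cells and base-10 bins (the helper names `cell_counts`, `log_bin`, and `filter_by_bin_range` are hypothetical and not taken from the repository), the binning might look like this:

```python
import math
from collections import Counter


def cell_counts(points, cell_size=1.0):
    """Count how many (x, y) points fall into each square grid cell."""
    return Counter(
        (math.floor(x / cell_size), math.floor(y / cell_size)) for x, y in points
    )


def log_bin(count, base=10):
    """Map a positive integer count to a log-scale bin: 1-9 -> 0, 10-99 -> 1, ..."""
    if count < 1:
        raise ValueError("count must be >= 1")
    bin_index = 0
    # Integer division avoids floating-point rounding at exact powers of the base.
    while count >= base:
        count //= base
        bin_index += 1
    return bin_index


def filter_by_bin_range(counts, low_bin, high_bin):
    """Keep only cells whose log-scale bin lies in [low_bin, high_bin]."""
    return {
        cell: n for cell, n in counts.items() if low_bin <= log_bin(n) <= high_bin
    }
```

A range slider would then translate its two handle positions into `low_bin` and `high_bin` and redraw only the cells returned by `filter_by_bin_range`. How the actual tool maps bins to its discrete color scale is not shown in this diff.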
50 changes: 44 additions & 6 deletions docker/Dockerfile.hcase
@@ -1,9 +1,47 @@
-FROM python:3.12-slim-bookworm
+# Use a base image with CUDA and Python (adjust CUDA version as needed)
+FROM nvidia/cuda:12.2.0-devel-ubuntu22.04

-RUN apt-get update && apt-get install -y gcc
+# Set environment variables to prevent some interactive prompts
+ENV DEBIAN_FRONTEND=noninteractive

-COPY ./requirements.txt requirements.txt
-RUN pip install -r requirements.txt
+# Install dependencies
+RUN apt-get update && apt-get install -y \
+    wget \
+    git \
+    curl \
+    ca-certificates \
+    build-essential \
+    python3 \
+    python3-pip \
+    python3-venv \
+    && rm -rf /var/lib/apt/lists/*

-RUN mkdir /app
-WORKDIR /app
+# Symlink python3 and pip3
+RUN ln -s /usr/bin/python3 /usr/bin/python && ln -s /usr/bin/pip3 /usr/bin/pip
+
+# Upgrade pip and install conda (Miniconda)
+ENV CONDA_DIR=/opt/conda
+RUN wget --quiet https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh && \
+    bash miniconda.sh -b -p $CONDA_DIR && \
+    rm miniconda.sh
+ENV PATH=$CONDA_DIR/bin:$PATH
+
+# IMPORTANT: Build context must be the repo root, not the docker/ folder!
+# Example: docker build -f docker/Dockerfile.hcase .
+# This will copy the entire repo into /workspace in the container.
+COPY . /workspace
+
+# Create and activate environment
+RUN conda env create -f /workspace/environment.yml && conda clean -a
+
+# Activate the environment by default
+SHELL ["conda", "run", "-n", "base", "/bin/bash", "-c"]
+
+# Install cupy using pip or conda as needed (adjust CUDA version)
+RUN pip install cupy-cuda12x
+
+# Set working directory inside container
+WORKDIR /workspace
+
+# Default command
+CMD ["python"]
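The Dockerfile comment above requires the build context to be the repository root, because `COPY . /workspace` and the later `conda env create -f /workspace/environment.yml` both depend on it. A small pre-flight check can catch a wrong context before a long CUDA image build starts. The following is only a sketch: the helper name `docker_build_command` and the image tag `hcase` are assumptions for illustration and do not come from the repository.

```python
from pathlib import Path


def docker_build_command(repo_root, dockerfile="docker/Dockerfile.hcase", tag="hcase"):
    """Return the `docker build` argv after checking that repo_root is a usable context.

    The tag "hcase" is a hypothetical default chosen for this example.
    """
    root = Path(repo_root)
    # Both files must be reachable from the context root for the image to build.
    for required in (dockerfile, "environment.yml"):
        if not (root / required).is_file():
            raise FileNotFoundError(
                f"{required} not found under {root}; is this the repo root?"
            )
    # "." is the build context, so the command must be run with cwd=repo_root.
    return ["docker", "build", "-f", dockerfile, "-t", tag, "."]
```

The returned list can be passed to `subprocess.run(cmd, cwd=repo_root, check=True)`, which keeps the context and the `-f` path relative to the same directory.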
61 changes: 61 additions & 0 deletions docker/Singularity.def
@@ -0,0 +1,61 @@
+Bootstrap: docker
+From: nvidia/cuda:12.2.0-devel-ubuntu22.04
+
+%environment
+    export PATH=/opt/conda/bin:$PATH
+    export DEBIAN_FRONTEND=noninteractive
+    export WORKDIR=/workspace
+
+%post
+    apt-get update && apt-get install -y \
+        wget \
+        git \
+        curl \
+        ca-certificates \
+        build-essential \
+        python3 \
+        python3-pip \
+        python3-venv \
+        && rm -rf /var/lib/apt/lists/*
+
+    # Symlink python3 and pip3
+    ln -s /usr/bin/python3 /usr/bin/python || true
+    ln -s /usr/bin/pip3 /usr/bin/pip || true
+
+    # Install Miniconda
+    wget --quiet https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh
+    bash miniconda.sh -b -p /opt/conda
+    rm miniconda.sh
+
+    # Create workspace directory
+    mkdir -p /workspace
+
+    # Create conda environment
+    /opt/conda/bin/conda env create -f /workspace/environment.yml
+    /opt/conda/bin/conda clean -a
+
+    # Install cupy (adjust CUDA version if needed)
+    /opt/conda/bin/pip install cupy-cuda12x
+
+%files
+    aux_code /workspace/aux_code
+    data /workspace/data
+    examples /workspace/examples
+    hcase /workspace/hcase
+    log /workspace/log
+    plots /workspace/plots
+    tests /workspace/tests
+    workflows /workspace/workflows
+    environment.yml /workspace/environment.yml
+    requirements.txt /workspace/requirements.txt
+    README.md /workspace/README.md
+    LICENSE /workspace/LICENSE
+    Makefile /workspace/Makefile
+    pyproject.toml /workspace/pyproject.toml
+    setup.py /workspace/setup.py
+    DISCLAIMER /workspace/DISCLAIMER
+    NOTES /workspace/NOTES
+
+%runscript
+    cd /workspace
+    exec /opt/conda/bin/python "$@"
21 changes: 12 additions & 9 deletions environment.yml
@@ -3,14 +3,17 @@ channels:
   - conda-forge
   - anaconda
 dependencies:
-  - python==3.13.2
-  - cupy==13.4.0
-  - rdkit==2024.09.6
+  - python==3.11
+  - rdkit==2025.03.3
+  - cupy==13.4.1
   - pandarallel==1.6.5
-  - ipywidgets==8.1.5
-  - pip==25.0.1
-  - numpy==2.2.3
-  - tqdm==4.65.0
-  - pytest
+  - ipywidgets==8.1.7
+  - pip==25.1.1
+  - numpy==2.3.0
+  - tqdm==4.67.1
+  - pytest==8.4.0
   - pip:
-    - -e .
+    - -e .
+    - molplotly==1.1.7
+    - dash==2.11.1
+    - jupyter-dash==0.4.2
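Since this change pins most dependencies to exact versions, it can help to confirm that an activated environment actually matches the pins. The sketch below uses only the standard library's `importlib.metadata`. The helper names `parse_pin` and `check_pins` are hypothetical, and the comparison is a plain string match, so it does not reproduce conda's own version-matching rules. Entries that are not Python distributions (for example `python` itself) will be reported as not installed.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_pin(line):
    """Split a 'name==version' entry into (name, version); return None otherwise."""
    # Drop the YAML list dash and surrounding whitespace, e.g. "  - numpy==2.3.0".
    text = line.strip().lstrip("-").strip()
    name, sep, pinned = text.partition("==")
    if not sep or not name or not pinned:
        return None
    return name.strip(), pinned.strip()


def check_pins(lines):
    """Return {name: (pinned, installed_or_None)} for every pin that is not met."""
    mismatches = {}
    for line in lines:
        pin = parse_pin(line)
        if pin is None:
            continue  # e.g. "- pip:" or "- -e ." carry no exact pin
        name, pinned = pin
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches
```

Calling `check_pins(open("environment.yml"))` inside the environment returns an empty dictionary only when every pinned distribution is installed at exactly the listed version.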
19 changes: 11 additions & 8 deletions environment_cpu.yml
@@ -3,13 +3,16 @@ channels:
   - conda-forge
   - anaconda
 dependencies:
-  - python==3.13.2
-  - rdkit==2024.09.6
+  - python==3.11
+  - rdkit==2025.03.3
   - pandarallel==1.6.5
-  - ipywidgets==8.1.5
-  - pip==25.0.1
-  - numpy==2.2.3
-  - tqdm==4.65.0
-  - pytest
+  - ipywidgets==8.1.7
+  - pip==25.1.1
+  - numpy==2.3.0
+  - tqdm==4.67.1
+  - pytest==8.4.0
   - pip:
-    - -e .
+    - -e .
+    - molplotly==1.1.7
+    - dash==2.11.1
+    - jupyter-dash==0.4.2
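The CPU environment differs from the GPU one mainly by omitting `cupy`, so code that should run under both needs some way to pick an array backend at runtime. Whether HCASE already does this internally is not visible in this diff. The following is a generic sketch with a hypothetical helper name, `pick_array_backend`, that only checks whether a module is installed, not whether a GPU is actually usable.

```python
from importlib.util import find_spec


def pick_array_backend(candidates=("cupy", "numpy")):
    """Return the name of the first installed module in candidates, or None."""
    for name in candidates:
        # find_spec returns None for a top-level module that is not installed.
        if find_spec(name) is not None:
            return name
    return None
```

A caller could then load the module with `importlib.import_module(pick_array_backend())` and use the shared subset of the NumPy and CuPy APIs through that handle.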