62 changes: 62 additions & 0 deletions .github/workflows/docker-build.yml
@@ -0,0 +1,62 @@
name: Docker Build

on:
  push:
    branches: [ bilingual-docs ]
    paths:
      - 'Dockerfile'
      - 'Dockerfile.ci'
      - '.github/workflows/docker-build.yml'
  pull_request:
    branches: [ bilingual-docs ]
    paths:
      - 'Dockerfile'
      - 'Dockerfile.ci'
  workflow_dispatch:

jobs:
  build-ci:
    name: CI Optimized Build
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v3
        with:
          submodules: false

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2

      - name: Build CI-optimized Docker image
        uses: docker/build-push-action@v4
        with:
          context: .
          file: ./Dockerfile.ci
          push: false
          tags: instag:ci
          cache-from: type=gha,scope=ci
          cache-to: type=gha,mode=max,scope=ci

  build-full:
    name: Full Production Build
    runs-on: ubuntu-latest
    needs: build-ci
    if: ${{ github.event_name == 'workflow_dispatch' }}
    steps:
      - name: Checkout code
        uses: actions/checkout@v3
        with:
          submodules: true

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2

      - name: Build full Docker image
        uses: docker/build-push-action@v4
        with:
          context: .
          file: ./Dockerfile
          push: false
          tags: instag:latest
          cache-from: type=gha,scope=full
          cache-to: type=gha,mode=max,scope=full
26 changes: 26 additions & 0 deletions CLAUDE.md
@@ -0,0 +1,26 @@
# InsTaG Framework Commands and Guidelines

## Common Commands
- **Build Environment**: `conda env create --file environment.yml`
- **Process Video**: `python data_utils/process.py data/<ID>/<ID>.mp4`
- **Generate Teeth Mask**: `python data_utils/easyportrait/create_teeth_mask.py ./data/<ID>`
- **Extract Audio Features**: `python data_utils/deepspeech_features/extract_ds_features.py --input data/<n>.wav`
- **Pre-training**: `bash scripts/pretrain_con.sh data/pretrain output/<project_name> <GPU_ID>`
- **Fine-tuning**: `bash scripts/train_xx_few.sh data/<ID> output/<project_name> <GPU_ID>`
- **Synthesis**: `python synthesize_fuse.py -S data/<ID> -M output/<project_name> --audio <path> --audio_extractor <type>`
- **Docker Commands**: Use `./docker-run.sh` with various subcommands (see README_docker.md)
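
The commands above can be chained into one adaptation run. The following is a minimal sketch, assuming the order process, teeth mask, fine-tune, synthesize; it only builds and prints the command lines and does not execute them. The function name `build_pipeline` and the example values (`demo`, `demo_run`, `demo.wav`, and the extractor string `deepspeech`) are illustrative assumptions, not names taken from the repository. Audio feature extraction is left out because its input path convention is not spelled out here.

```python
import shlex


def build_pipeline(subject_id, project_name, gpu_id, audio_path, audio_extractor):
    """Return the documented commands, in order, as argument lists."""
    data_dir = f"data/{subject_id}"
    out_dir = f"output/{project_name}"
    return [
        # Process Video
        ["python", "data_utils/process.py", f"{data_dir}/{subject_id}.mp4"],
        # Generate Teeth Mask
        ["python", "data_utils/easyportrait/create_teeth_mask.py", f"./{data_dir}"],
        # Fine-tuning
        ["bash", "scripts/train_xx_few.sh", data_dir, out_dir, str(gpu_id)],
        # Synthesis
        ["python", "synthesize_fuse.py", "-S", data_dir, "-M", out_dir,
         "--audio", audio_path, "--audio_extractor", audio_extractor],
    ]


if __name__ == "__main__":
    # All argument values below are placeholders (assumptions).
    for command in build_pipeline("demo", "demo_run", 0, "demo.wav", "deepspeech"):
        print(shlex.join(command))
```

Keeping each command as an argument list means the same lists could later be handed to `subprocess.run` without re-parsing a shell string.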

## Code Style Guidelines
- **Python Version**: 3.9 for main code, 3.10 for Sapiens
- **Formatting**: Follow existing style in files (indentation, line breaks)
- **Imports**: Group standard library, third-party, and local imports
- **Naming**: Use snake_case for variables/functions, CamelCase for classes
- **Error Handling**: Use try/except blocks for file operations and external calls
- **Documentation**: Add docstrings for new functions and classes
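
As an illustration of these guidelines, here is a small sketch that groups imports, uses snake_case and CamelCase naming, wraps a file operation in try/except, and documents itself with docstrings. The class `MetadataLoader` and the file name `metadata.json` are hypothetical and do not exist in the repository.

```python
# Standard library imports come first.
import json
from pathlib import Path

# Third-party imports (for example numpy) would follow here,
# and local project imports would come last.


class MetadataLoader:
    """Load a JSON metadata file from a subject directory (hypothetical helper)."""

    def __init__(self, data_dir):
        self.data_dir = Path(data_dir)

    def load(self, file_name="metadata.json"):
        """Return the parsed JSON content, or None if the file cannot be read."""
        json_path = self.data_dir / file_name
        try:
            with open(json_path, "r", encoding="utf-8") as handle:
                return json.load(handle)
        except (OSError, json.JSONDecodeError) as error:
            # File operations are guarded, as the guideline asks.
            print(f"Could not read {json_path}: {error}")
            return None
```

Catching only `OSError` and `json.JSONDecodeError` keeps unrelated bugs visible instead of hiding them behind a bare `except`.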

## Project Structure
- `/data`: Input videos and processed data
- `/output`: Generated models and results
- `/data_utils`: Processing utilities for various modalities
- `/scene`: Core rendering and modeling code
- `/utils`: Helper functions for audio, image, and graphics processing
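
A quick way to confirm that a checkout matches this layout is to test for each directory. The sketch below assumes it is run from the repository root; the function name `missing_directories` is an illustrative choice, not part of the project.

```python
from pathlib import Path

# Top-level directories listed in the project structure above.
EXPECTED_DIRS = ("data", "output", "data_utils", "scene", "utils")


def missing_directories(repo_root):
    """Return the expected directories that do not exist under repo_root."""
    root = Path(repo_root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_directories(".")
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("Project layout looks complete.")
```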
67 changes: 67 additions & 0 deletions DOCUMENTATION_CN.md
@@ -0,0 +1,67 @@
# Docker Setup for InsTaG Training Framework

## English

This pull request provides a complete Docker-based environment for the InsTaG training framework. It addresses several setup challenges documented in the issues by providing a consistent, containerized environment.

### Key Features:

1. **Dual Container Architecture:**
- Main container (CUDA 11.7, Python 3.9) for training and inference
- Separate Sapiens container (CUDA 12.1, Python 3.10) for geometry priors

2. **Helper Scripts:**
- `docker-run.sh` - Simplifies common operations
- `setup-docker.sh` - Automates initial setup and dependency installation

3. **Comprehensive Documentation:**
- Complete workflow examples
- Detailed troubleshooting guidance
- Support for different audio feature extractors (DeepSpeech, Wav2Vec, AVE, HuBERT)

4. **Automated Setup:**
- OpenFace integration for facial AU extraction
- EasyPortrait model download
- Sapiens model download

5. **Workflow Improvements:**
- No environment conflicts to resolve by hand
- Simplified audio feature extraction
- Streamlined teeth mask generation
- Container-based geometry prior generation

The documentation includes examples for both short-video adaptation (with geometry priors) and long-video training, making it easier to use the framework in various scenarios.
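
To make the split between the two environments concrete, the following sketch maps each task to an environment and wraps a command in `conda run -n <env>`. The environment names `instag` and `sapiens_lite` are the ones created in this pull request's Dockerfile; the task keys, the helper `wrap_in_env`, and the assumption that the Sapiens script should run inside `sapiens_lite` are illustrative only.

```python
import shlex

# Task keys are hypothetical labels; environment names come from the Dockerfile.
ENV_FOR_TASK = {
    "process": "instag",
    "teeth_mask": "instag",
    "train": "instag",
    "geometry_priors": "sapiens_lite",
}


def wrap_in_env(task, command):
    """Prefix `command` so it runs inside the conda environment for `task`."""
    env_name = ENV_FOR_TASK[task]
    return ["conda", "run", "-n", env_name] + list(command)


if __name__ == "__main__":
    # "./data/demo" is a placeholder subject directory.
    sapiens = wrap_in_env("geometry_priors", ["bash", "data_utils/sapiens/run.sh", "./data/demo"])
    print(shlex.join(sapiens))
```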

---

## 中文

此 Pull Request 为 InsTaG 训练框架提供了完整的基于 Docker 的环境。它通过提供一致的容器化环境解决了 issues 中记录的几个设置挑战。

### 主要特点:

1. **双容器架构:**
- 主容器(CUDA 11.7,Python 3.9)用于训练和推理
- 单独的 Sapiens 容器(CUDA 12.1,Python 3.10)用于几何先验生成

2. **辅助脚本:**
- `docker-run.sh` - 简化常见操作
- `setup-docker.sh` - 自动化初始设置和依赖安装

3. **全面的文档:**
- 完整的工作流示例
- 详细的故障排除指南
- 支持不同的音频特征提取器(DeepSpeech、Wav2Vec、AVE、HuBERT)

4. **自动化设置:**
- OpenFace 集成用于面部 AU 提取
- EasyPortrait 模型下载
- Sapiens 模型下载

5. **工作流改进:**
- 没有手动环境冲突
- 简化的音频特征提取
- 简化的牙齿遮罩生成
- 基于容器的几何先验生成

文档包括短视频适应(带几何先验)和长视频训练的示例,使框架在各种场景中更易于使用。
158 changes: 158 additions & 0 deletions Dockerfile
@@ -0,0 +1,158 @@
# Version: 1.3.0 (Production Ready)
ARG BASE_IMAGE=nvcr.io/nvidia/cuda:11.7.1-cudnn8-devel-ubuntu20.04
FROM $BASE_IMAGE

VOLUME [ "/instag" ]

# Install system dependencies
RUN apt-get update -yq --fix-missing \
&& DEBIAN_FRONTEND=noninteractive apt-get install -yq --no-install-recommends \
git \
wget \
cmake \
build-essential \
libboost-all-dev \
libopenblas-dev \
liblapack-dev \
libx11-dev \
libopencv-dev \
libgtk-3-dev \
pkg-config \
libavcodec-dev \
libavformat-dev \
libswscale-dev \
ffmpeg \
libsm6 \
libxext6 \
libgl1-mesa-glx \
libglib2.0-0 \
libsndfile1 \
portaudio19-dev \
ninja-build \
git-lfs \
vim \
curl \
libopenexr-dev \
openexr \
python3-dev \
libffi-dev \
libeigen3-dev \
&& rm -rf /var/lib/apt/lists/*

# Install Miniconda
RUN wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh \
&& bash Miniconda3-latest-Linux-x86_64.sh -b -p /opt/conda \
&& rm Miniconda3-latest-Linux-x86_64.sh

# Add conda to PATH
ENV PATH="/opt/conda/bin:${PATH}"

# Initialize conda in bash
RUN conda init bash

# Clone InsTaG repository
RUN git lfs install \
&& git clone https://github.com/Fictionarry/InsTaG.git /instag \
&& cd /instag \
&& git submodule update --init --recursive

# Set up conda environment for InsTaG
WORKDIR /instag
RUN conda config --append channels conda-forge \
&& conda config --append channels nvidia \
&& conda create -n instag python=3.9 cudatoolkit=11.7 pytorch=1.13.1 torchvision=0.14.1 torchaudio -c pytorch -c nvidia -y \
&& echo "conda activate instag" >> ~/.bashrc

# Print debug information (no GPU is visible during `docker build`, so "CUDA available" is expected to be False here)
RUN conda run -n instag python -c "import torch; print('PyTorch version:', torch.__version__); print('CUDA available:', torch.cuda.is_available()); print('CUDA build version:', torch.version.cuda)"

# Install dependencies for InsTaG
RUN conda run -n instag pip install -r requirements.txt

# Install MMCV with specific CUDA version
RUN conda run -n instag pip install mmcv-full==1.7.1 -f https://download.openmmlab.com/mmcv/dist/cu117/torch1.13.0/index.html

# Install CUDA submodules
RUN conda run -n instag bash -c "cd /instag/submodules/diff-gaussian-rasterization && FORCE_CUDA=1 pip install -e ."
RUN conda run -n instag bash -c "cd /instag/submodules/simple-knn && FORCE_CUDA=1 pip install -e ."
RUN conda run -n instag bash -c "cd /instag/gridencoder && pip install -e ."
RUN conda run -n instag bash -c "cd /instag/shencoder && pip install -e ."

# Install PyTorch3D dependencies
RUN conda run -n instag pip install "fvcore>=0.1.5" "iopath>=0.1.7" "nvidiacub-dev"

# Install PyTorch3D with maximum compatibility
RUN conda run -n instag bash -c "\
pip install --no-cache-dir pytorch3d==0.7.4 || \
pip install --no-cache-dir 'git+https://github.com/facebookresearch/pytorch3d.git@stable' || \
echo 'PyTorch3D installation failed, but continuing. You can install it manually later.'"

# Install TensorFlow
RUN conda run -n instag pip install tensorflow-gpu==2.10.0

# Install OpenFace (critical for training)
# Split into multiple steps to avoid timeout issues
RUN mkdir -p /instag/OpenFace \
&& git clone https://github.com/TadasBaltrusaitis/OpenFace.git /tmp/OpenFace

# Download models
RUN cd /tmp/OpenFace && bash ./download_models.sh

# Build OpenFace with all cores for speed
RUN cd /tmp/OpenFace \
&& mkdir -p build \
&& cd build \
&& cmake -D CMAKE_BUILD_TYPE=RELEASE .. \
&& make -j$(nproc) \
&& make install

# Copy binaries and libraries to our OpenFace directory
RUN cp -r /tmp/OpenFace/build/bin /instag/OpenFace/ \
&& cp -r /tmp/OpenFace/lib /instag/OpenFace/ \
&& cp -r /tmp/OpenFace/build/lib /instag/OpenFace/ \
&& rm -rf /tmp/OpenFace

# Download EasyPortrait model
RUN mkdir -p /instag/data_utils/easyportrait \
&& conda run -n instag wget -O /instag/data_utils/easyportrait/fpn-fp-512.pth \
https://rndml-team-cv.obs.ru-moscow-1.hc.sbercloud.ru/datasets/easyportrait/experiments/models/fpn-fp-512.pth

# Run prepare script to download required models (critical for training)
RUN cd /instag && bash scripts/prepare.sh

# Create the Sapiens lite environment
# PyTorch 2.2.1 conda builds target CUDA 11.8 or 12.1; 12.1 matches the documented Sapiens setup
RUN conda create -n sapiens_lite python=3.10 -y \
&& conda install -n sapiens_lite pytorch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 pytorch-cuda=12.1 -c pytorch -c nvidia -y \
&& conda run -n sapiens_lite pip install opencv-python tqdm json-tricks

# Create directories for data and outputs
RUN mkdir -p /instag/data /instag/output /instag/jobs

# Set up environment paths
ENV PATH="/opt/conda/bin:/instag/OpenFace/bin:${PATH}"

# Create startup script to activate environment
RUN echo '#!/bin/bash' > /instag/startup.sh \
&& echo 'echo "Welcome to InsTaG on RunPod!"' >> /instag/startup.sh \
&& echo 'echo ""' >> /instag/startup.sh \
&& echo 'echo "Available environment commands:"' >> /instag/startup.sh \
&& echo 'echo "conda activate instag - Activate the main InsTaG environment"' >> /instag/startup.sh \
&& echo 'echo "conda activate sapiens_lite - Activate the Sapiens environment for geometry priors"' >> /instag/startup.sh \
&& echo 'echo ""' >> /instag/startup.sh \
&& echo 'echo "Common workflows:"' >> /instag/startup.sh \
&& echo 'echo "1. Process a video: python data_utils/process.py data/<ID>/<ID>.mp4"' >> /instag/startup.sh \
&& echo 'echo "2. Generate teeth masks: python data_utils/easyportrait/create_teeth_mask.py ./data/<ID>"' >> /instag/startup.sh \
&& echo 'echo "3. Run Sapiens (optional): bash data_utils/sapiens/run.sh ./data/<ID>"' >> /instag/startup.sh \
&& echo 'echo "4. Fine-tune the model: bash scripts/train_xx_few.sh data/<ID> output/<project_name> <GPU_ID>"' >> /instag/startup.sh \
&& echo 'echo "5. Synthesize: python synthesize_fuse.py -S data/<ID> -M output/<project_name> --audio <path> --audio_extractor <type>"' >> /instag/startup.sh \
&& echo 'echo ""' >> /instag/startup.sh \
&& echo 'source /opt/conda/etc/profile.d/conda.sh' >> /instag/startup.sh \
&& echo 'conda activate instag' >> /instag/startup.sh \
&& echo 'exec bash' >> /instag/startup.sh \
&& chmod +x /instag/startup.sh

# Set working directory
WORKDIR /instag

# Default command
CMD ["/instag/startup.sh"]
49 changes: 49 additions & 0 deletions Dockerfile.ci
@@ -0,0 +1,49 @@
# Version: 1.0.0 (CI Optimized)
# This is a CI-optimized Dockerfile for GitHub Actions validation
# It skips time-consuming steps while still verifying build correctness
FROM nvcr.io/nvidia/cuda:11.7.1-cudnn8-devel-ubuntu20.04

# Install system dependencies (minimal set)
RUN apt-get update && \
DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
git wget cmake build-essential \
libopencv-dev ffmpeg libsm6 libxext6 libgl1-mesa-glx \
libsndfile1 portaudio19-dev \
&& rm -rf /var/lib/apt/lists/*

# Install Miniconda
RUN wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O /tmp/miniconda.sh && \
bash /tmp/miniconda.sh -b -p /opt/conda && \
rm /tmp/miniconda.sh

# Add conda to PATH
ENV PATH="/opt/conda/bin:${PATH}"

# Initialize conda in bash
RUN conda init bash

# Clone InsTaG repository (shallow clone to speed up)
RUN git clone --depth 1 https://github.com/Fictionarry/InsTaG.git /instag

# Set up conda environment with PyTorch
WORKDIR /instag
RUN conda config --append channels conda-forge && \
conda config --append channels nvidia && \
conda create -n instag python=3.9 cudatoolkit=11.7 pytorch=1.13.1 torchvision=0.14.1 torchaudio -c pytorch -c nvidia -y && \
echo "conda activate instag" >> ~/.bashrc

# Install only core dependencies
RUN conda run -n instag pip install numpy==1.24.3 pillow==9.5.0 scipy opencv-python tqdm && \
conda run -n instag pip install -r requirements.txt

# Create mock directories and files for validating scripts
RUN mkdir -p /instag/data /instag/output && \
mkdir -p /instag/OpenFace/bin && \
printf '#!/bin/bash\necho "OpenFace mock for CI"\n' > /instag/OpenFace/bin/FeatureExtraction && \
chmod +x /instag/OpenFace/bin/FeatureExtraction

# Set up environment paths
ENV PATH="/opt/conda/bin:/instag/OpenFace/bin:${PATH}"

# Validation test command that will run in CI
CMD ["conda", "run", "-n", "instag", "python", "-c", "import torch; print(f'PyTorch {torch.__version__} with CUDA {torch.version.cuda if torch.cuda.is_available() else \"N/A\"}'); import numpy; import cv2; print('Core imports successful')"]