diff --git a/README.md b/README.md index 5bd22a4..b46eda4 100644 --- a/README.md +++ b/README.md @@ -7,84 +7,153 @@ Play around with it and raise Github issues if anything fails # Setting up -1. Install `conda` - - You can use either [Miniconda](https://docs.anaconda.com/miniconda/install/#quick-command-line-install) or [Miniforge](https://github.com/conda-forge/miniforge) -2. Clone repo +## 1. Install `bnd` + +### Option A — pipx (recommended) + +[pipx](https://pipx.pypa.io) installs `bnd` in an isolated environment and makes the CLI available system-wide. + +1. Install pipx if you don't have it: ```shell - git clone git@github.com:BeNeuroLab/bnd.git - cd ./bnd + # Windows (requires Python ≥ 3.10) + pip install pipx + pipx ensurepath # restart your terminal after this + + # Linux + sudo apt install pipx + pipx ensurepath ``` -3. Open either Miniconda prompt or Miniforge promt and run the following command. This - may take some time: + +2. Install `bnd`: ```shell - conda env create --file=env.yml + # Lightweight (upload, download, config only — fast install): + pipx install "bnd @ git+https://github.com/BeNeuroLab/bnd.git" + + # Full install with processing dependencies (NWB, kilosort, pyaldata): + pipx install "bnd[processing] @ git+https://github.com/BeNeuroLab/bnd.git" ``` - or if you want the processing depedencies: + To install a specific branch (e.g. for testing): ```shell - conda env create --file=processing_env.yml + pipx install "bnd[processing] @ git+https://github.com/BeNeuroLab/bnd.git@seperate-ks-env" ``` - For kilosorting you will also need: - 1. Install kilosort and the GUI, run `python -m pip install kilosort[gui]`. If you're on a zsh server, you may need to use `python -m pip install "kilosort[gui]"` - 2. You can also just install the minimal version of kilosort with python -m pip install kilosort. - 3. Next, if the CPU version of pytorch was installed (will happen on Windows), remove it with `pip uninstall torch` - 4. 
Then install the GPU version of pytorch `conda install pytorch pytorch-cuda=11.8 -c pytorch -c nvidia` +3. Verify: + ```shell + bnd --help + ``` - If you installed the base environment and want to update later on: +To **update** to the latest commits: +```shell +pipx install --force "bnd[processing] @ git+https://github.com/BeNeuroLab/bnd.git" +``` + +### Option B — conda + +1. Install [Miniconda](https://docs.anaconda.com/miniconda/install/#quick-command-line-install) or [Miniforge](https://github.com/conda-forge/miniforge). +2. Clone the repo and create the environment: + ```shell + git clone git@github.com:BeNeuroLab/bnd.git + cd ./bnd + conda env create --file=processing_env.yml # includes scientific dependencies + conda activate bnd + pip install -e . + ``` + To update later: ```shell conda env update --file=processing_env.yml ``` - And then do the kilosort step -4. Create your configuration file: + +## 2. Set up Kilosort (separate conda env) + +Kilosort runs in its own conda environment — `bnd` invokes it via `conda run -n kilosort ...`. + +1. Create and activate the env: + ```shell + conda create -n kilosort python=3.10 pip + conda activate kilosort + ``` +2. Install Kilosort following the [official instructions](https://github.com/MouseLand/Kilosort): + ```shell + python -m pip install "kilosort[gui]" + ``` + Or minimal (no GUI): ```shell - bnd init # Provide the path to local and remote data storage - bnd --help # Start reading about the functions! + python -m pip install kilosort + ``` +3. Install GPU-enabled PyTorch (example for CUDA 11.8): + ```shell + conda install pytorch pytorch-cuda=11.8 -c pytorch -c nvidia + ``` + +> **Note:** If your env is not named `kilosort`, set the environment variable `BND_KILOSORT_ENV` to +> the env name before running `bnd`. + +## 3. Configure `bnd` + +```shell +bnd init # Provide the path to local and remote data storage +bnd --help # Start reading about the functions! 
+``` # Example usage + Complete your experimental session on animal M099. Then: + ```shell bnd up M099 ``` Now, you want to process your data into a pyaldata format. It's a good idea to do this on one of the lab workstations: + ```shell bnd dl M099_2025_01_01_10_00 -v # Downloads everything bnd to-pyal M099_2025_01_01_10_00 # Run kilosort, nwb conversion, and pyaldata conversion bnd up M099_2025_01_01_10_00 # Uploads new files to server ``` -If you want specific things during your pipeline (e.g., dont run kilosort, use a custom channel map) read the API below. +If you want specific things during your pipeline (e.g., don't run kilosort, use a custom channel map), read the API below. # API ## Config + ### `bnd init` + Create a .env file (if there isn't one) to store the paths to the local and remote data storage. ### `bnd show-config` + Show the contents of the config file. ## Updating + ### `bnd check-updates` + Check if there are any new commits on the repo's main branch. ### `bnd self-update` -Update the bnd tool by pulling the latest commits from the repo's main branch. +Update the bnd tool by pulling the latest commits from the repo's main branch. ## Data Transfer + ### `bnd up ` + Upload data from a session or animal name to the server. If the file exists on the server, it won't be replaced. Every file in the session folder will get uploaded. Example usage to upload everything of a given session: + ```shell bnd up M017_2024_03_12_18_45 bnd up M017 ``` + ### `bnd dl ` + Download experimental data from a given session from the remote server.
Example usage to download everything: + ```shell bnd dl M017_2024_03_12_18_45 -v # will download everything, including videos bnd dl M017_2024_03_12_18_45 # will download everything, except videos @@ -92,7 +161,9 @@ bnd dl M017_2024_03_12_18_45 --max-size=50 # will download files smaller than 5 ``` ## Pipeline + ### `bnd to-pyal ` + Converts session data into a pyaldata dataframe and saves it as a .mat file. If no .nwb file is present it will automatically generate one, and if an .nwb file is present it will skip it. If you want to generate a new one, run `bnd to-nwb` @@ -100,6 +171,7 @@ If no kilosorted data is available it will not kilosort by default. If you want to kilosort, add the `-k` flag. Example usage: + ```shell bnd to-pyal M037_2024_01_01_10_00 # Kilosorts data, runs nwb and converts to pyaldata bnd to-pyal M037_2024_01_01_10_00 -K # converts to pyaldata without kilosorting (if no .nwb file is present) @@ -107,11 +179,13 @@ bnd to-pyal M037_2024_01_01_10_00 -c # Use custom mapping during nwb conversion ``` ### `bnd to-nwb ` + Converts session data into an NWB file and saves it as a .nwb. If no kilosorted data is available it will not kilosort by default.
If you want to kilosort add the flag `-k` Example usage: + ```shell bnd to-nwb M037_2024_01_01_10_00 # Kilosorts data and run nwb bnd to-nwb M037_2024_01_01_10_00 -K # converts to nwb without kilosorting (if no .nwb file is present) @@ -119,13 +193,16 @@ bnd to-nwb M037_2024_01_01_10_00 -c # Use custom mapping during conversion if c ``` ### `bnd ksort ` + Kilosorts data from a single session on all available probes and recordings Example usage: + ```shell bnd ksort M037_2024_01_01_10_00 ``` # TODOs: + - Add `AniposeInterface` in nwb conversion - Implement Npx2.0 functionality diff --git a/bnd/cli.py b/bnd/cli.py index 7f591d6..be64b77 100644 --- a/bnd/cli.py +++ b/bnd/cli.py @@ -8,7 +8,6 @@ from rich import print from .config import ( - _check_is_git_track, _check_root, _check_session_directory, _get_env_path, @@ -311,8 +310,6 @@ def init(): else: print("\nConfig file doesn't exist. Let's create one.") - repo_path = _get_package_path() - _check_is_git_track(repo_path) local_path = Path( typer.prompt("Enter the absolute path to the root of the local data storage") @@ -325,7 +322,6 @@ def init(): _check_root(remote_path) with open(env_path, "w") as f: - f.write(f"REPO_PATH = {repo_path}\n") f.write(f"LOCAL_PATH = {local_path}\n") f.write(f"REMOTE_PATH = {remote_path}\n") diff --git a/bnd/config.py b/bnd/config.py index 7914958..ae9a34a 100644 --- a/bnd/config.py +++ b/bnd/config.py @@ -14,12 +14,32 @@ def _get_package_path() -> Path: return Path(__file__).absolute().parent.parent +def _get_config_dir() -> Path: + """ + Returns the path to the bnd configuration directory (~/.bnd/). + Creates it if it doesn't exist. + """ + config_dir = Path.home() / ".bnd" + config_dir.mkdir(parents=True, exist_ok=True) + return config_dir + + def _get_env_path() -> Path: """ Returns the path to the .env file containing the configuration settings. + Checks ~/.bnd/.env first, falls back to legacy location (next to package) for migration. 
""" - package_path = _get_package_path() - return package_path / ".env" + new_path = _get_config_dir() / ".env" + if new_path.exists(): + return new_path + + # Legacy: config stored next to the package source (editable / conda installs) + legacy_path = _get_package_path() / ".env" + if legacy_path.exists(): + return legacy_path + + # Default to the new location for fresh installs + return new_path def _check_session_directory(session_path): @@ -48,7 +68,6 @@ class Config: def __init__(self, env_path=_get_env_path()): self.REMOTE_PATH = None self.LOCAL_PATH = None - self.REPO_PATH = None # Load the actual environment PATHs self.load_env(env_path) self.datetime_pattern = "%Y_%m_%d_%H_%M" diff --git a/bnd/pipeline/__init__.py b/bnd/pipeline/__init__.py index fbc79a3..56940c4 100644 --- a/bnd/pipeline/__init__.py +++ b/bnd/pipeline/__init__.py @@ -8,9 +8,12 @@ def _check_processing_dependencies(): from .kilosort import run_kilosort_on_session from .nwb import run_nwb_conversion from .pyaldata import run_pyaldata_conversion - except Exception as e: + except ImportError as e: raise ImportError( - f"Could not import processing dependencies: {e}. 
Update your environment " - "with `conda env update -n bnd --file=processing_env.yml`" - ) + f"Missing processing dependencies: {e}.\n" + "Install them with:\n" + ' pipx install --force "bnd[processing] @ git+https://github.com/BeNeuroLab/bnd.git"\n' + "or:\n" + ' pip install "bnd[processing]"' + ) from e return diff --git a/bnd/pipeline/kilosort.py b/bnd/pipeline/kilosort.py index b291779..df4b05d 100644 --- a/bnd/pipeline/kilosort.py +++ b/bnd/pipeline/kilosort.py @@ -1,11 +1,12 @@ import os +import json +import shutil +import subprocess +import tempfile +import textwrap from configparser import ConfigParser from pathlib import Path -import torch -from kilosort import run_kilosort -from kilosort.utils import PROBE_DIR, download_probes - from ..logger import set_logging from ..config import Config, _load_config from ..config import find_file @@ -13,6 +14,176 @@ logger = set_logging(__name__) +_KILOSORT_RUNNER_CODE = textwrap.dedent( + """ + import json + import sys + from pathlib import Path + + params_path = Path(sys.argv[1]) + params = json.loads(params_path.read_text()) + + from kilosort import run_kilosort + from kilosort.utils import PROBE_DIR, download_probes + + probe_name = params["probe_name"] + + if not PROBE_DIR.exists(): + download_probes() + + if not any(PROBE_DIR.glob(probe_name)): + download_probes() + + run_kilosort( + settings=params["settings"], + probe_name=probe_name, + data_dir=params["data_dir"], + results_dir=params["results_dir"], + save_preprocessed_copy=params.get("save_preprocessed_copy", False), + ) +""" +).strip() + + +def _get_kilosort_env_name() -> str: + return ( + os.environ.get("BND_KILOSORT_ENV") + or os.environ.get("KILOSORT_CONDA_ENV") + or "kilosort" + ) + + +def _find_conda_runner() -> str: + conda_exe = os.environ.get("CONDA_EXE") + if conda_exe and Path(conda_exe).exists(): + return conda_exe + + for candidate in ("conda", "mamba", "micromamba"): + resolved = shutil.which(candidate) + if resolved: + return resolved + + 
raise FileNotFoundError( + "Could not find a conda runner executable (tried CONDA_EXE, conda, mamba, micromamba)." + ) + + +def _run_in_conda_env( + env_name: str, args: list[str], *, capture_output: bool = False +) -> subprocess.CompletedProcess: + runner = _find_conda_runner() + cmd = [runner, "run", "-n", env_name, *args] + + # Workaround for WSL/DrvFs temp-file oddities (e.g., ftruncate -> ENOENT) when + # TEMP/TMP point to `/mnt/c/...`. Force a sane temp dir for subprocesses. + env = os.environ.copy() + if Path("/tmp").exists(): + env["TMPDIR"] = "/tmp" + env["TEMP"] = "/tmp" + env["TMP"] = "/tmp" + try: + return subprocess.run( + cmd, + check=True, + capture_output=capture_output, + text=capture_output, + env=env, + ) + except subprocess.CalledProcessError as e: + raise RuntimeError( + f"Command failed in conda env '{env_name}': {cmd} (exit code {e.returncode})." + ) from e + + +def _check_kilosort_cuda(env_name: str) -> tuple[bool, str | None]: + code = textwrap.dedent( + """ + import torch + + if torch.cuda.is_available(): + print("CUDA_AVAILABLE") + print(torch.cuda.get_device_name(0)) + else: + print("CUDA_NOT_AVAILABLE") + """ + ).strip() + + script_path: Path | None = None + try: + with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f: + f.write(code) + script_path = Path(f.name) + + proc = _run_in_conda_env( + env_name, ["python", str(script_path)], capture_output=True + ) + except Exception: + logger.debug("Could not check CUDA in env '%s'", env_name, exc_info=True) + return False, None + finally: + if script_path: + try: + script_path.unlink(missing_ok=True) + except Exception: + pass + + lines = [line.strip() for line in (proc.stdout or "").splitlines() if line.strip()] + if not lines: + return False, None + + if lines[0] == "CUDA_AVAILABLE": + return True, lines[1] if len(lines) > 1 else None + + return False, None + + +def _run_kilosort_in_env( + *, + env_name: str, + settings: dict, + probe_name: str, + data_dir: Path, + 
results_dir: Path, +) -> None: + payload = dict( + settings=settings, + probe_name=probe_name, + data_dir=str(data_dir), + results_dir=str(results_dir), + save_preprocessed_copy=False, + ) + + tmp_path: Path | None = None + runner_path: Path | None = None + try: + with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as tmp: + json.dump(payload, tmp) + tmp_path = Path(tmp.name) + + with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as runner: + runner.write(_KILOSORT_RUNNER_CODE) + runner_path = Path(runner.name) + + _run_in_conda_env( + env_name, ["python", str(runner_path), str(tmp_path)] + ) + + except Exception as e: + raise RuntimeError( + f"Failed to run Kilosort in the separate conda env '{env_name}'. " + "Make sure it exists and has the `kilosort` package installed. " + "You can override the env name via BND_KILOSORT_ENV." + ) from e + + finally: + for p in (tmp_path, runner_path): + if p: + try: + p.unlink(missing_ok=True) + except Exception: + pass + + def read_metadata(filepath: Path) -> dict: """Parse a section-less INI file (eg NPx metadata file) and return a dictionary of key-value pairs.""" with open(filepath, "r") as f: @@ -120,27 +291,19 @@ def run_kilosort_on_stream( ) ksort_output_path.mkdir(parents=True, exist_ok=True) - if not PROBE_DIR.exists(): - logger.info("Probe directory not found, downloading probes") - download_probes() - - if any(PROBE_DIR.glob(f"{probe_name}")): - # Sometimes the gateway can throw an error so just double check. - download_probes() - # Check if the metadata file is complete # when SpikeGLX crashes, metadata misses some values. 
_fix_session_ap_metadata(meta_file_path) # Find out which probe type we have probe_name = _read_probe_type(meta_file_path) - _ = run_kilosort( + env_name = _get_kilosort_env_name() + _run_kilosort_in_env( + env_name=env_name, settings=sorter_params, probe_name=probe_name, data_dir=probe_folder_path, results_dir=ksort_output_path, - save_preprocessed_copy=False, - verbose_console=False, ) return @@ -210,11 +373,17 @@ def run_kilosort_on_session(session_path: Path) -> None: else: ephys_recording_folders = config.get_subdirectories_from_pattern(session_path, "*_g?") - # Check kilosort is installed in environment - if torch.cuda.is_available(): - logger.info(f"CUDA is available. GPU device: {torch.cuda.get_device_name(0)}") + env_name = _get_kilosort_env_name() + cuda_available, device_name = _check_kilosort_cuda(env_name) + if cuda_available: + if device_name: + logger.info(f"CUDA is available in '{env_name}'. GPU device: {device_name}") + else: + logger.info(f"CUDA is available in '{env_name}'.") else: - logger.warning("CUDA is not available. GPU computations will not be enabled.") + logger.warning( + f"CUDA is not available in '{env_name}'. GPU computations will not be enabled." + ) if len(ephys_recording_folders) > 1: raise ValueError( "It seems you are trying to run kilosort without a GPU. Look at the README for instructions on how to do this.
" diff --git a/bnd/update_bnd.py b/bnd/update_bnd.py index 8ceef12..ca4aec3 100644 --- a/bnd/update_bnd.py +++ b/bnd/update_bnd.py @@ -1,13 +1,26 @@ import platform +import shutil import subprocess import warnings from pathlib import Path from .logger import set_logging -from .config import _load_config +from .config import _get_package_path logger = set_logging(__name__) +_REPO_URL = "https://github.com/BeNeuroLab/bnd.git" + + +def _find_repo_path() -> Path | None: + """Return the git repo root if bnd was installed from a local clone, else None.""" + pkg = _get_package_path() + # Walk up looking for .git (editable installs live inside the repo) + for parent in (pkg, *pkg.parents): + if (parent / ".git").is_dir(): + return parent + return None + def _run_git_command(repo_path: Path, command: list[str]) -> str: """ @@ -73,10 +86,17 @@ def check_for_updates() -> bool: Returns True if new commits are found, False otherwise. """ - config = _load_config() - package_path = config.REPO_PATH + repo_path = _find_repo_path() - new_commits = _get_new_commits(package_path) + if repo_path is None: + print( + "bnd is not installed from a local git clone.\n" + "To update, run:\n" + f' pipx install --force "bnd @ git+{_REPO_URL}"' + ) + return False + + new_commits = _get_new_commits(repo_path) if len(new_commits) > 0: print("New commits found, run `bnd self-update` to update the package.") @@ -86,25 +106,34 @@ def check_for_updates() -> bool: return True print("No new commits found, package is up to date.") + return False def update_bnd(print_new_commits: bool = True) -> None: """ - Update bnd if it was installed with conda + Update bnd. Uses git pull for editable installs, or pipx reinstall otherwise. Parameters ---------- print_new_commits """ - config = _load_config() + repo_path = _find_repo_path() + + if repo_path is None: + # pipx / pip install — can't reinstall ourselves while running + print( + "bnd is installed via pipx. 
To update, run this in your terminal:\n\n" + f' pipx install --force "bnd @ git+{_REPO_URL}"\n' + ) + return - new_commits = _get_new_commits(config.REPO_PATH) + new_commits = _get_new_commits(repo_path) if len(new_commits) > 0: print("New commits found, pulling changes...") - _run_git_command(config.REPO_PATH, ["pull", "origin", "main"]) + _run_git_command(repo_path, ["pull", "origin", "main"]) print(1 * "\n") print("Package updated successfully.") diff --git a/pyproject.toml b/pyproject.toml new file mode 100644 index 0000000..7475c19 --- /dev/null +++ b/pyproject.toml @@ -0,0 +1,59 @@ +[build-system] +requires = ["hatchling"] +build-backend = "hatchling.build" + +[project] +name = "bnd" +version = "0.1.0" +description = "BeNeuro data pipeline CLI" +readme = "README.md" +license = "MIT" +requires-python = ">=3.10" +classifiers = [ + "Programming Language :: Python :: 3", + "License :: OSI Approved :: MIT License", + "Operating System :: OS Independent", +] + +dependencies = [ + # -- CLI (lightweight base) ---------------------------------------------- + "typer>=0.9", + "rich", +] + +[project.optional-dependencies] +processing = [ + # -- data I/O & NWB ------------------------------------------------------ + "pynwb>=2.5", + "ndx-pose", + "neuroconv~=0.6.0", + "h5py", + + # -- electrophysiology --------------------------------------------------- + "spikeinterface", + "probeinterface", + + # -- scientific stack ---------------------------------------------------- + "numpy", + "pandas", + "scipy", + + # -- misc ---------------------------------------------------------------- + "pydantic", + "python-dateutil", + "pytz", +] +dev = [ + "pytest", + "bnd[processing]", +] + +[project.scripts] +bnd = "bnd.cli:app" + +[project.urls] +Homepage = "https://github.com/BeNeuroLab/bnd" +Repository = "https://github.com/BeNeuroLab/bnd" + +[tool.hatch.build.targets.wheel] +packages = ["bnd"] diff --git a/setup.py b/setup.py deleted file mode 100644 index dc184b3..0000000 --- 
a/setup.py +++ /dev/null @@ -1,22 +0,0 @@ -from setuptools import find_packages, setup - -setup( - name="bnd", - version="0.1.0", - packages=find_packages(), - install_requires=[ - "typer", - ], - entry_points={ - "console_scripts": [ - "bnd=bnd.cli:app", - ], - }, - python_requires=">=3.10", - include_package_data=True, - classifiers=[ - "Programming Language :: Python :: 3", - "License :: OSI Approved :: MIT License", - "Operating System :: OS Independent", - ], -)