
[TGRS 2025] Cloud-Adapter

Adapting Vision Foundation Models for Robust Cloud Segmentation in Remote Sensing Images

arXiv Paper · Project Page · Hugging Face Models · Hugging Face Datasets · DOI

(Figure: framework overview)


📢 Latest Updates

  • 8/2025: Our paper was accepted by IEEE Transactions on Geoscience and Remote Sensing (TGRS).

Installation

  1. Clone the Repository
git clone https://github.com/XavierJiezou/Cloud-Adapter.git
cd Cloud-Adapter  
  2. Install Dependencies

You can either set up the environment manually or use our pre-configured environment for convenience:

  • Option 1: Manual Installation

Ensure you are using Python 3.8 or higher, then install the required dependencies:

pip install -r requirements.txt  
  • Option 2: Use Pre-configured Environment

We also provide a pre-configured environment (envs) hosted on Hugging Face. Download it and follow the instructions on its page to set up and activate the environment.

Prepare Data

We have open-sourced all datasets used in the paper, which are hosted on Hugging Face Datasets. Please follow the instructions on the dataset page to download the data.

After downloading, organize the dataset as follows:

Cloud-Adapter
β”œβ”€β”€ ...
β”œβ”€β”€ data
β”‚   β”œβ”€β”€ cloudsen12_high_l1c
β”‚   β”‚   β”œβ”€β”€ ann_dir
β”‚   β”‚   β”‚   β”œβ”€β”€ train
β”‚   β”‚   β”‚   β”œβ”€β”€ val
β”‚   β”‚   β”‚   β”œβ”€β”€ test
β”‚   β”‚   β”œβ”€β”€ img_dir
β”‚   β”‚   β”‚   β”œβ”€β”€ train
β”‚   β”‚   β”‚   β”œβ”€β”€ val
β”‚   β”‚   β”‚   β”œβ”€β”€ test
β”‚   β”œβ”€β”€ cloudsen12_high_l2a
β”‚   β”‚   β”œβ”€β”€ ann_dir
β”‚   β”‚   β”‚   β”œβ”€β”€ train
β”‚   β”‚   β”‚   β”œβ”€β”€ val
β”‚   β”‚   β”‚   β”œβ”€β”€ test
β”‚   β”‚   β”œβ”€β”€ img_dir
β”‚   β”‚   β”‚   β”œβ”€β”€ train
β”‚   β”‚   β”‚   β”œβ”€β”€ val
β”‚   β”‚   β”‚   β”œβ”€β”€ test
β”‚   β”œβ”€β”€ gf12ms_whu_gf1
β”‚   β”‚   β”œβ”€β”€ ann_dir
β”‚   β”‚   β”‚   β”œβ”€β”€ train
β”‚   β”‚   β”‚   β”œβ”€β”€ val
β”‚   β”‚   β”‚   β”œβ”€β”€ test
β”‚   β”‚   β”œβ”€β”€ img_dir
β”‚   β”‚   β”‚   β”œβ”€β”€ train
β”‚   β”‚   β”‚   β”œβ”€β”€ val
β”‚   β”‚   β”‚   β”œβ”€β”€ test
β”‚   β”œβ”€β”€ gf12ms_whu_gf2
β”‚   β”‚   β”œβ”€β”€ ann_dir
β”‚   β”‚   β”‚   β”œβ”€β”€ train
β”‚   β”‚   β”‚   β”œβ”€β”€ val
β”‚   β”‚   β”‚   β”œβ”€β”€ test
β”‚   β”‚   β”œβ”€β”€ img_dir
β”‚   β”‚   β”‚   β”œβ”€β”€ train
β”‚   β”‚   β”‚   β”œβ”€β”€ val
β”‚   β”‚   β”‚   β”œβ”€β”€ test
β”‚   β”œβ”€β”€ hrc_whu
β”‚   β”‚   β”œβ”€β”€ ann_dir
β”‚   β”‚   β”‚   β”œβ”€β”€ train
β”‚   β”‚   β”‚   β”œβ”€β”€ val
β”‚   β”‚   β”‚   β”œβ”€β”€ test
β”‚   β”‚   β”œβ”€β”€ img_dir
β”‚   β”‚   β”‚   β”œβ”€β”€ train
β”‚   β”‚   β”‚   β”œβ”€β”€ val
β”‚   β”‚   β”‚   β”œβ”€β”€ test
β”œβ”€β”€ ...
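After downloading, a quick sanity check can confirm that all expected directories exist. A minimal sketch, with the dataset and split names taken from the tree above:

```python
import os

DATASETS = [
    "cloudsen12_high_l1c", "cloudsen12_high_l2a",
    "gf12ms_whu_gf1", "gf12ms_whu_gf2", "hrc_whu",
]
SPLITS = ("train", "val", "test")

def missing_dirs(root="data"):
    """Return every expected annotation/image split directory that is absent."""
    missing = []
    for ds in DATASETS:
        for kind in ("ann_dir", "img_dir"):
            for split in SPLITS:
                path = os.path.join(root, ds, kind, split)
                if not os.path.isdir(path):
                    missing.append(path)
    return missing
```

Running `missing_dirs()` from the repository root should return an empty list once the data is organized correctly.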

Training

Step 1: Download and Convert Weights

  1. Download pretrained weights of vision foundation models

You can download the pretrained weights from the DINOv2 official repository.

Once downloaded, convert the weights using the following command:

python tools/convert_models/convert_dinov2.py weight_path save_path --height image_height --width image_width

This command allows you to specify the desired image height and width for your use case.

You can also download the pretrained weights from the SAM official repository.

After downloading, use the following command to convert the weights:

python tools/convert_models/convert_sam.py weight_path save_path --height image_height --width image_width
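The height/width flags matter because ViT checkpoints store a fixed-size positional-embedding table that must match the input resolution. A rough illustration; the patch sizes (14 for DINOv2, 16 for SAM) are my assumption from the upstream repositories, not from this repo's converter scripts:

```python
def num_patch_tokens(height, width, patch_size):
    """Patch-token count for a ViT with square patches; the converted
    checkpoint's positional embedding must cover this many positions
    (plus any class token)."""
    if height % patch_size or width % patch_size:
        raise ValueError("image size must be a multiple of the patch size")
    return (height // patch_size) * (width // patch_size)

# DINOv2 (14-px patches): a 518x518 input gives a 37x37 = 1369-token grid.
# SAM (16-px patches): a 1024x1024 input gives a 64x64 = 4096-token grid.
```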

Step 2: Modify the Configuration File

After converting the backbone network weights, make sure the checkpoint path is set correctly in your configuration file.

For example:

# configs/_base_/models/cloud_adapter_dinov2.py
model = dict(
    backbone=dict(
        type="CloudAdapterDinoVisionTransformer",
        init_cfg=dict(
            type="Pretrained",
            checkpoint="checkpoints/dinov2_converted.pth",  # set the converted weight path here
        ),
    ),
)

Update the configs directory with your training configuration, or use one of the provided example configurations. You can customize the backbone, dataset paths, and hyperparameters in the configuration file (e.g., configs/adapter/cloud_adapter_pmaa_convnext_lora_16_adapter_all.py).

Step 3: Start Training

Use the following command to begin training:

CUDA_VISIBLE_DEVICES=0 python tools/train.py configs/adapter/cloud_adapter_pmaa_convnext_lora_16_adapter_all.py

Step 4: Resume or Fine-tune

To resume training from a checkpoint or fine-tune using pretrained weights, run:

python tools/train.py configs/adapter/cloud_adapter_pmaa_convnext_lora_16_adapter_all.py --resume-from path/to/checkpoint.pth  

Step 5: Generate Complete Weights

To optimize disk usage and accelerate training, the saved checkpoints include only the adapter and head components. To synthesize the full weights, use the following command:

python tools/generate_full_weights.py --segmentor_save_path full_weight_path --backbone backbone_path --head adapter_and_head_weight_path

Make sure to provide the appropriate paths for the backbone and the adapter/head weights.
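Conceptually, the script splices two partial state dicts back into one full checkpoint, along the lines of this sketch (plain dicts stand in for torch state dicts; the real logic lives in tools/generate_full_weights.py):

```python
def merge_state_dicts(backbone_state, head_state):
    """Combine a backbone-only and an adapter/head-only state dict into a
    single full segmentor state dict, refusing ambiguous key overlaps."""
    overlap = set(backbone_state) & set(head_state)
    if overlap:
        raise ValueError(f"conflicting keys: {sorted(overlap)}")
    full = dict(backbone_state)
    full.update(head_state)
    return full
```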

Evaluation

All model weights used in the paper have been open-sourced and are available on Hugging Face Models.

Use the following command to evaluate the trained model:

CUDA_VISIBLE_DEVICES=0 python tools/test.py configs/adapter/cloud_adapter_pmaa_convnext_lora_16_adapter_all.py path/to/checkpoint.pth  

Special Evaluation: L8_Biome Dataset

If you want to evaluate the model’s performance on different scenes of the L8_Biome dataset, you can run the following script:

python tools/eval_l8_scene.py --config configs/to/path.py --checkpoint path/to/checkpoint.pth --img_dir data/l8_biome

This will automatically evaluate the model across various scenes of the L8_Biome dataset, providing detailed performance metrics for each scene.
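The per-scene report boils down to grouping per-image scores by scene and averaging them, roughly like this (a sketch of the aggregation, not the script's actual code; scene names and IoU values are examples):

```python
from collections import defaultdict

def mean_iou_per_scene(records):
    """records: iterable of (scene_name, iou) pairs -> {scene: mean IoU}."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for scene, iou in records:
        totals[scene] += iou
        counts[scene] += 1
    return {scene: totals[scene] / counts[scene] for scene in totals}
```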

Reproducing Paper Comparisons

If you would like to reproduce the other models and comparisons presented in the paper, please refer to our other repository: CloudSeg. This repository contains the implementation and weights of the other models used for comparison in the study.

Visualization

We have published the pre-trained models' visualization results on the various datasets to Hugging Face. If you prefer not to run the code, you can download the visualization results directly from that repository.

Gradio Demo

We have created a Gradio demo to showcase the model's functionality. If you'd like to try it out, follow these steps:

  1. Navigate to the hugging_face directory:
cd hugging_face  
  2. Run the demo:
python app.py  

This will start the Gradio interface, where you can upload remote sensing images and visualize the model's segmentation results in real-time.

Troubleshooting

  • If you encounter a file not found error, it is likely that the model weights have not been downloaded. Please visit Hugging Face Models to download the pretrained model weights.

  • GPU Requirements: To run the model on a GPU, you will need at least 16GB of GPU memory.

  • Running on CPU: If you prefer to run the demo on CPU instead of GPU, set the following environment variable before running the demo:

export CUDA_VISIBLE_DEVICES=-1  

Citation

If you use our code or models in your research, please cite:

@ARTICLE{cloud-adapter,
  author={Zou, Xuechao and Zhang, Shun and Li, Kai and Wang, Shiying and Xing, Junliang and Jin, Lei and Lang, Congyan and Tao, Pin},
  journal={IEEE Transactions on Geoscience and Remote Sensing}, 
  title={Adapting Vision Foundation Models for Robust Cloud Segmentation in Remote Sensing Images}, 
  year={2025},
  volume={63},
  number={},
  pages={1-14},
  keywords={Cloud computing;Feature extraction;Image segmentation;Remote sensing;Adaptation models;Foundation models;Accuracy;Visualization;Transformers;Transfer learning;Cloud segmentation;domain adaptation;fine-tuning;remote sensing image processing},
  doi={10.1109/TGRS.2025.3597410}
}

Acknowledgments
