---
description: Post-train your models using reinforcement learning and supervised fine-tuning
mode: wide
---

In public preview, W&B Training offers serverless post-training for large language models (LLMs), including both reinforcement learning (RL) and supervised fine-tuning (SFT).

* **[Serverless RL](/training/serverless-rl)**: Improve model reliability for multi-turn, agentic tasks while increasing speed and reducing costs. RL is a training technique where models learn to improve their behavior through feedback on their outputs.
* **[Serverless SFT](/training/sft-training)**: Fine-tune models using curated datasets for distillation, teaching output style and format, or warming up before RL.
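The feedback loop behind RL post-training can be sketched in miniature. This is a toy illustration only, not the W&B Training or ART API: `generate` stands in for sampling from an LLM, and `reward` stands in for a verifier (the role RULER plays in the real workflow). Rollouts that score well are kept as the training signal for the next update.

```python
import random

random.seed(0)

def generate(prompt: str) -> str:
    # Hypothetical stand-in for sampling a response from an LLM;
    # a real workflow would call a model endpoint here.
    templates = [
        "Sorry, I can't help.",
        f"Answer: {prompt.upper()}",
        f"answer: {prompt}",
    ]
    return random.choice(templates)

def reward(response: str) -> float:
    # Hypothetical stand-in for a verifier: score well-formatted
    # answers highly, everything else zero.
    return 1.0 if response.startswith("Answer:") else 0.0

# Collect rollouts, then keep only the high-reward ones -- this
# reward signal is the "feedback on outputs" that drives RL-style
# post-training (or seeds an SFT dataset when used for warm-up).
prompts = ["hello", "world"] * 10
rollouts = [(p, generate(p)) for p in prompts]
accepted = [(p, r) for p, r in rollouts if reward(r) > 0.5]

print(f"kept {len(accepted)}/{len(rollouts)} rollouts for the next update")
```

In the managed service this loop runs server-side against your deployed model; the sketch only shows the shape of the signal, not the infrastructure.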

W&B Training integrates with the following:

* [ART](https://art.openpipe.ai/getting-started/about), a flexible fine-tuning framework.
* [RULER](https://openpipe.ai/blog/ruler), a universal verifier.
* A fully managed backend on [CoreWeave Cloud](https://docs.coreweave.com/docs/platform).

To get started, complete the [prerequisites](/training/prerequisites), then see the [Serverless RL quickstart](https://art.openpipe.ai/getting-started/quick-start) or the [Serverless SFT docs](https://art.openpipe.ai/fundamentals/sft-training) to learn how to post-train your models.

Explore a [public demo workspace](https://wandb.ai/wandb/demo-project-qwen-email-agent-with-art-weave-models/workspace?nw=nwuserjuliarose) that includes the following examples:

* Train a Qwen model with OpenPipe RULER and [Weave Scorers](https://docs.wandb.ai/weave/guides/evaluation/scorers#create-your-own-scorers).
* Track training progress and [create custom plots](https://docs.wandb.ai/models/app/features/custom-charts) with W&B Models.
* Evaluate the final results on a [Weave leaderboard](https://docs.wandb.ai/weave/cookbooks/leaderboard_quickstart#leaderboard-quickstart).