30 Mar 19:44

ehsu3

dc43062

Release 1.3.24 Latest


New Features

Training Job Email Notifications (SMTJ, SMHP)

Added support for job notifications via email for training on the SMTJ and SMHP platforms via the NotificationManager class.
Provided CloudFormation notification templates to allow users to deploy a stack of resources for receiving status updates.

SMHP Cluster Instance Group Scaling

Added the scale_cluster() and get_instance_groups() functions to the SMHPRuntimeManager to allow users to scale their SMHP cluster restricted instance groups (RIGs) from the SDK.

RFT Lambda Deploy and Validation via RuntimeManager

Added deploy_lambda() to RuntimeManager to package a local Python file into a zip archive, create or update a Lambda function, and store the resulting ARN on rft_lambda_arn.
Added the validate_lambda() method to invoke a deployed Lambda with sample data from S3 and validate correctness before training.
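
The packaging step described above can be illustrated with the standard library alone. This is a minimal sketch, not the SDK's implementation: the helper name package_python_file and the archive entry name lambda_function.py are assumptions made for illustration, and the create/update and ARN-storage steps are left out.

```python
import io
import zipfile
from pathlib import Path


def package_python_file(source_path: str, arcname: str = "lambda_function.py") -> bytes:
    """Return the bytes of a zip archive containing a single Python file.

    Hypothetical helper: the entry name inside the archive is an assumption,
    not something the release notes specify.
    """
    source = Path(source_path)
    buffer = io.BytesIO()
    # Build the archive in memory so no temporary zip file is left on disk.
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.write(source, arcname=arcname)
    return buffer.getvalue()
```

The returned bytes are in the form that Lambda's create and update calls accept as an inline zip payload, which is why the sketch keeps the archive in memory.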

Enhancements

Simplified ECS infrastructure for RFT Multi-Turn and improved reliability
- Removed Docker/ECR image push requirements
- Switched ECS tasks to a public Amazon Linux 2023 base image
- Improved CF stack state and virtual environment validation
- Note: Users may need to update their IAM policies to remove ECR permissions and add new SSM/EC2/CloudFormation permissions for RFT multiturn.
Updated troubleshooting documentation to provide guidance on Bedrock deployment with restricted permissions.
Improved CloudWatchLogMonitor usability by allowing job start time resolution from platform APIs when not explicitly provided.

Bug Fixes

Fixed evaluate() passing the incorrect training method to the RecipeBuilder; it now passes TrainingMethod.EVALUATION.
Added an optional “recipe” flag to the “validation_config” to bypass recipe validation when needed.


19 Mar 21:33

balajiaru

v1.3.18

6ed1f59

Release 1.3.18

Bug Fixes

Fixed incorrect input data upload format for RFT SageMaker Training Jobs


19 Mar 19:02

balajiaru

v1.3.17

0c80693

Release 1.3.17

Bug Fixes

Fixed incorrect input data upload format for Nova 1.0 SageMaker Training Jobs


18 Mar 15:01

amazeAmazing

v1.3.16

dd90fff

Release 1.3.16

Bug Fixes

Removed requirements.txt to address an installation issue on macOS


17 Mar 19:36

amazeAmazing

v1.3.14

2506a29

Release 1.3.14

The package has been renamed to Nova Forge SDK.
All internal imports are now referenced as amzn_nova_forge.

Upgrade Instructions

To use the latest version of Nova Forge SDK, run:

# Remove the old SDK
pip uninstall amzn-nova-customization-sdk

# Install the new SDK
pip install amzn-nova-forge

New Features

Bedrock Fine‑tuning

Added support for Supervised Fine‑Tuning (SFT) and Reinforcement Fine‑Tuning (RFT) with Low‑Rank Adaptation (LoRA) on Bedrock.
Introduced a new runtime manager, platform support, and extensive infrastructure to integrate Bedrock as a training platform.
Implemented job creation, status tracking, and cleanup for Bedrock jobs.

Limitations & Constraints

Limitation	Details
Evaluation & batch inference	Not supported on Bedrock
Supported methods	Only SFT LoRA and RFT LoRA
Validation datasets	Not supported for Nova Lite 2 models
Monitoring	MLFlow monitoring is not available for Bedrock jobs

Serverless SageMaker Training Jobs

Enabled serverless SageMaker Training Jobs for SFT and Direct Preference Optimization (DPO) with Full‑Rank and LoRA.

Plot Training Metrics

Enabled plotting of:
- Training loss curve for SFT and Continued Pre‑Training (CPT) jobs.
- Reward score curve for RFT jobs.

Enhancements

Data Prep operations refactored
These changes are backward compatible via deprecation warnings:
- transform()/validate() → method parameter renamed to training_method; the new parameter selects the operation type (default: SCHEMA).
- column_mappings moved from loader constructor to transform() kwarg.
- save_data() renamed to save().
- split_data() renamed to split().
- New guide: docs/data_prep.md.
- Updated README and QuickStart notebook.
SageMaker SDK V3 Upgrade
Security documentation
- SECURITY.md contents moved to README.md.
Added Image URI override validation support.

Bug Fixes

Added missing IAM permissions for the RFT multiturn guidance.
Updated replicas override to use the customer‑provided instance_count instead of the recipe template value.


03 Mar 22:15

cmahima

v1.1.2

6a15c05

Release 1.1.2

New Features

Reinforcement Fine-Tuning (RFT) Multiturn

Added RFT Multiturn data transformer and validator.
Added support for session restoration using dump() and load().
Enabled custom starter kit path via starter_kit_path to use custom environments.
Deprecated start_training_environment() and start_evaluation_environment() in favor of start_environment().
Added a feature to detect duplicate environments running on the same stack and infrastructure.

Reinforcement Fine-Tuning (RFT) Singleturn

Introduced RFT Lambda verification via the validation_config parameter in NovaModelCustomizer.

Job Caching

Added enable_job_caching to NovaModelCustomizer to save job results to disk and reuse them on subsequent calls with matching parameters.
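
The idea of reusing results for matching parameters can be sketched as a small disk cache keyed by a hash of the job parameters. This is an illustration under stated assumptions, not how NovaModelCustomizer implements enable_job_caching: the names cache_key and run_with_cache, the JSON file layout, and the choice of SHA-256 are all hypothetical.

```python
import hashlib
import json
from pathlib import Path
from typing import Any, Callable, Dict


def cache_key(params: Dict[str, Any]) -> str:
    """Derive a stable key from job parameters (hypothetical scheme)."""
    # sort_keys makes the key independent of dict insertion order.
    canonical = json.dumps(params, sort_keys=True, default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def run_with_cache(
    params: Dict[str, Any],
    run_job: Callable[[Dict[str, Any]], Dict[str, Any]],
    cache_dir: str,
) -> Dict[str, Any]:
    """Return a cached result when the same parameters were seen before."""
    cache_path = Path(cache_dir) / f"{cache_key(params)}.json"
    if cache_path.exists():
        # Cache hit: skip the job and reuse the stored result.
        return json.loads(cache_path.read_text(encoding="utf-8"))
    result = run_job(params)
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    cache_path.write_text(json.dumps(result), encoding="utf-8")
    return result
```

Hashing a canonical serialization means two calls with the same parameters in a different order map to the same cache entry, while any changed value produces a new entry.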

Enhancements

Validation for SageMaker Inference (SMI) Configs

Added validation for context length and concurrency settings based on model and instance type when deploying to SMI endpoints.

Bug Fixes

Fixed validation logic for save_steps for RFT multiturn to accept integer values.
Removed pinned version constraint for numpy (numpy<=2.2.6) to resolve dependency conflicts.


20 Feb 16:11

amazeAmazing

v1.0.97

3dcffe0

Release 1.0.97

Bug Fixes

Addressed the bug reported in #31
Updated the error message with the correct GitHub repo link


16 Feb 20:44

amazeAmazing

v1.0.96

23d2705

Release 1.0.96

New Features

SageMaker Model Deployment

Introduced SageMaker as a deployment platform option alongside Bedrock On-Demand and Provisioned Throughput
Introduced invoke_inference() method for real-time inference supporting both SageMaker and Bedrock platforms

Reinforcement Fine-Tuning (RFT) Multiturn

Enabled Reinforcement Fine-Tuning (RFT) multiturn training and evaluation for Nova 2.0 models (Forge-subscribed feature only)
Introduced CustomEnvironment and RFTMultiturnInfrastructure classes for setting up custom Reinforcement Learning (RL) environments on local, EC2, or ECS infrastructure for training and evaluation

Inference to SageMaker and Bedrock Endpoints

Added support for streaming and non-streaming inference requests for SageMaker text models
Added support for non-streaming inference requests for Bedrock text models

MLflow Monitoring

Added get_presigned_url() function to MLflowMonitor for generating presigned URLs to access the MLflow tracking server UI

Enhancements

Region Support

Added support for us-west-2

Model deployment interface (NovaModelCustomizer.deploy)

Renamed pt_units parameter to unit_count for broader applicability across deployment platforms
Replaced bedrock_execution_role_name parameter with execution_role_name for flexible role configuration across platforms
Added sagemaker_instance_type parameter with default value ml.g5.4xlarge for SageMaker deployments
Added sagemaker_environment_variables parameter for SageMaker environment configuration

Bug Fixes

Fixed recipe overrides by creating deep copies to avoid shared references, especially for overrides that have the same values.
Fixed validation logic for the "temperature" parameter to accept both int and float types.
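
The shared-reference problem behind the first fix is a general Python pitfall, and a short sketch may make it concrete. The dictionary layout and the helper expand_overrides below are hypothetical and do not reflect the SDK's recipe structure; they only show why a deep copy per section prevents one override from leaking into another.

```python
import copy
from typing import Any, Dict, List


def expand_overrides(defaults: Dict[str, Any], sections: List[str]) -> Dict[str, Dict[str, Any]]:
    """Give every section its own independent copy of the default overrides."""
    return {name: copy.deepcopy(defaults) for name in sections}


defaults = {"optimizer": {"lr": 1e-5}, "max_steps": 100}

# Fixed pattern: each section owns a deep copy, so edits stay local.
independent = expand_overrides(defaults, ["training", "evaluation"])
independent["training"]["optimizer"]["lr"] = 5e-5
# independent["evaluation"] and defaults still hold lr = 1e-5 here.

# Buggy pattern: both sections reference the same dict object.
shared = {name: defaults for name in ("training", "evaluation")}
shared["training"]["max_steps"] = 500
# shared["evaluation"]["max_steps"] is now 500 as well, because
# "training" and "evaluation" are two names for one object.
```

A shallow copy would not be enough in this layout, since the nested "optimizer" dict would still be shared between sections.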


30 Jan 18:01

amazeAmazing

v1.0.83

ac24ef0

Release 1.0.83

New Features

Training & Model Support

Adds Direct Preference Optimization (DPO) support (LoRA and Full) for SMTJ and SMHP on Nova 1.0

Memory Management

Refactors dataset loading to use iterators and lazy loading, allowing us to load large datasets with bounded host-memory utilization
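
The iterator-based approach can be sketched with a generator that reads one record at a time. This is a minimal illustration assuming a JSON Lines input file; the function names iter_jsonl and iter_batches are hypothetical and are not the SDK's loader API.

```python
import json
from itertools import islice
from typing import Any, Dict, Iterable, Iterator, List


def iter_jsonl(path: str) -> Iterator[Dict[str, Any]]:
    """Yield one parsed record per non-empty line without loading the whole file."""
    with open(path, "r", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def iter_batches(records: Iterable[Dict[str, Any]], batch_size: int) -> Iterator[List[Dict[str, Any]]]:
    """Group a stream of records into lists of at most batch_size items."""
    iterator = iter(records)
    while True:
        # islice pulls only the next batch_size records from the stream.
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch
```

With this shape, peak memory is bounded by the batch size rather than the dataset size, since only one batch of parsed records is held at a time.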

Enhancements

Documentation

Improves documentation clarity by reducing bulk in README.md
Moves allowed instance types information from README.md to its own document
Expands SECURITY.md with security best practices and code examples
Adds CPT/DPO examples to JumpStart notebook
Updates spec.md with latest SDK changes and additional AWS documentation links

Installation & Setup

Improves guidance for Forge set-up
Improves notebook markdown formatting

Logging

Adds warning for checkpoint resolution failures with base model fallback in evaluate and batch inference

Bug Fixes

Fixed DatasetValidator for Multi-modal data with Nova models


23 Jan 18:49

swapneils

v1.0.72

99e6489

Release 1.0.72

Hey Nova builders! We have a bunch of features and Quality-of-Life enhancements for this release, and also improved some edge-case behaviors from the initial release.

New Features

MLflow Integration

Track training experiments with Amazon SageMaker MLflow tracking servers
Auto-discover DefaultMLFlowApp in your AWS account or specify a custom tracking URI
Log metrics, hyperparameters, and model artifacts automatically during training

Continued Pre-Training (CPT) Support

Pre-train Nova models on your own datasets

Data Mixing for SFT and CPT (Nova Forge customers only)

Blend your custom training data with Nova's high-quality curated datasets

OpenAI Messages Format Conversion

Transform datasets from OpenAI chat format to the Converse API format for use with Nova models
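
As a rough illustration of what such a conversion involves, the sketch below maps plain-text OpenAI chat messages onto the Converse message shape, where system prompts move to a top-level "system" list and each message's content becomes a list of text blocks. The function name openai_to_converse is hypothetical, this is not the SDK's transformer, and the SDK's output may carry additional fields; tool calls and multimodal content are deliberately out of scope.

```python
from typing import Any, Dict, List


def openai_to_converse(messages: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Convert plain-text OpenAI chat messages to a Converse-style record."""
    system_blocks: List[Dict[str, str]] = []
    converse_messages: List[Dict[str, Any]] = []
    for message in messages:
        role = message.get("role")
        content = message.get("content")
        if not isinstance(content, str):
            # Assumption of this sketch: only string content is handled.
            raise ValueError("this sketch only handles plain-string content")
        if role == "system":
            system_blocks.append({"text": content})
        elif role in ("user", "assistant"):
            converse_messages.append({"role": role, "content": [{"text": content}]})
        else:
            raise ValueError(f"unsupported role: {role!r}")
    result: Dict[str, Any] = {"messages": converse_messages}
    if system_blocks:
        result["system"] = system_blocks
    return result
```

For example, a system prompt followed by one user turn yields a record with a one-element "system" list and a one-element "messages" list.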

Multimodal SFT Dataset Validation

Validate image content in SFT datasets (PNG, JPEG, GIF formats)
Validate video content in SFT datasets (MOV, MKV, MP4 formats)
Validate document content for Nova 2.0 (PDF format)

Enhancements

Dataset Validation

Enhanced data-format validation for SFT, RFT, CPT, and Eval jobs, including message role ordering, content types, and tool specifications / uses.

IAM and Security

We now have more granular validations of whether the IAM calling role has the required permissions for a job. Where possible, we also validate the SMTJ execution role.
- IAM validation can be disabled via the validation_config parameter of NovaModelCustomizer
New create_bedrock_execution_role() utility function generates scoped-down IAM roles for Bedrock model deployment with minimal required permissions
New VPC configuration parameters (subnets, security_group_ids) in SMTJRuntimeManager allow training jobs to run within your VPC for network isolation
New kms_key_id parameter encrypts training job output artifacts and inter-container traffic with your KMS key

Recipe Management

Recipes are now automatically pulled from SageMaker JumpStart, ensuring you always use the latest supported configurations
When starting evaluation, batch inference, or model deployment, we now automatically extract the model checkpoint from your most recent training job output unless explicitly specified.
- You can also provide a TrainingResult object to extract the model checkpoint from it.

Documentation

Added SECURITY.md for vulnerability reporting

Bug Fixes

Fixed an edge case where evaluation jobs would fail if data_s3_path was set on the customizer but not needed for the evaluation task
Improved validation error messages with specific field locations
Improved README setup documentation and fixed some errors in the examples


Releases: aws/nova-forge-sdk
