Today, we’re announcing a suite of customization capabilities for Amazon Nova in Amazon SageMaker AI. Customers can now customize Nova Micro, Nova Lite, and Nova Pro across the model training lifecycle, including pre-training, supervised fine-tuning, and alignment. These techniques are available as ready-to-use Amazon SageMaker recipes with seamless deployment to Amazon Bedrock, supporting both on-demand and provisioned throughput inference.
Amazon Nova foundation models power diverse generative AI use cases across industries. As customers scale deployments, they need models that reflect proprietary knowledge, workflows, and brand requirements. Prompt optimization and retrieval-augmented generation (RAG) work well for integrating general-purpose foundation models into applications; however, business-critical workflows require model customization to meet specific accuracy, cost, and latency requirements.
Choosing the right customization technique
Amazon Nova models support a range of customization techniques, including supervised fine-tuning, alignment, continued pre-training, and knowledge distillation. The optimal choice depends on your goals, use case complexity, and the availability of data and compute resources. You can also combine multiple techniques to achieve your desired outcomes with the preferred mix of performance, cost, and flexibility.
Supervised fine-tuning (SFT) customizes model parameters using a training dataset of input-output pairs specific to your target tasks and domains. Choose from the following two implementation approaches based on data volume and cost considerations:
- Parameter-efficient fine-tuning (PEFT) — updates only a subset of model parameters through lightweight adapter layers such as LoRA (Low-Rank Adaptation). It offers faster training and lower compute costs compared to full fine-tuning. PEFT-adapted Nova models are imported to Amazon Bedrock and invoked using on-demand inference.
- Full fine-tuning (FFT) — updates all the parameters of the model and is ideal for scenarios when you have extensive training datasets (tens of thousands of records). Nova models customized through FFT can also be imported to Amazon Bedrock and invoked for inference with provisioned throughput.
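The cost difference between the two approaches comes down to how many parameters are trained. The following back-of-envelope sketch compares full fine-tuning of a single weight matrix against a LoRA adapter of the kind PEFT uses; the dimensions are made up for illustration and do not reflect Nova's actual architecture:

```python
# Back-of-envelope comparison of trainable parameters for full fine-tuning
# versus a LoRA adapter on one weight matrix. Dimensions are illustrative only.

def full_ft_params(d: int, k: int) -> int:
    # Full fine-tuning updates every entry of the d x k weight matrix.
    return d * k

def lora_params(d: int, k: int, r: int) -> int:
    # LoRA freezes the original matrix and trains a low-rank update B @ A,
    # with B of shape (d, r) and A of shape (r, k).
    return r * (d + k)

d, k, r = 4096, 4096, 16
print(full_ft_params(d, k))  # 16777216 weights updated
print(lora_params(d, k, r))  # 131072 weights updated, under 1% of the above
```

Because the adapter touches so few weights, PEFT jobs finish faster and fit on less compute, which is why it suits smaller datasets and tighter budgets.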
Alignment steers the model output towards desired preferences for product-specific needs and behavior, such as company brand and customer experience requirements. These preferences may be encoded in multiple ways, including empirical examples and policies. Nova models support two preference alignment techniques:
- Direct preference optimization (DPO) — offers a straightforward way to tune model outputs using preferred/not preferred response pairs. DPO learns from comparative preferences to optimize outputs for subjective requirements such as tone and style. DPO offers both a parameter-efficient version and a full-model update version. The parameter-efficient version supports on-demand inference.
- Proximal policy optimization (PPO) — uses reinforcement learning to enhance model behavior by optimizing for desired rewards such as helpfulness, safety, or engagement. A reward model guides optimization by scoring outputs, helping the model learn effective behaviors while maintaining previously learned capabilities.
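For DPO, the training data is a set of preference pairs: one prompt with a preferred and a non-preferred response. The sketch below writes such pairs as JSONL; the field names (`prompt`, `chosen`, `rejected`) are generic placeholders, so consult the Nova user guide for the exact schema your recipe expects:

```python
import json

# Illustrative DPO preference records. Field names are placeholders; the
# actual schema is defined in the Amazon Nova customization documentation.
preferences = [
    {
        "prompt": "Write a greeting for our support chat.",
        "chosen": "Hi! Thanks for reaching out. How can I help you today?",
        "rejected": "What do you want?",
    },
]

with open("dpo_train.jsonl", "w") as f:
    for record in preferences:
        f.write(json.dumps(record) + "\n")
```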
Continued pre-training (CPT) expands foundational model knowledge through self-supervised learning on large quantities of unlabeled proprietary data, including internal documents, transcripts, and business-specific content. CPT followed by SFT and alignment through DPO or PPO provides a comprehensive way to customize Nova models for your applications.
Knowledge distillation transfers knowledge from a larger “teacher” model to a smaller, faster, and more cost-efficient “student” model. Distillation is useful in scenarios where customers do not have adequate reference input-output samples and can use a more powerful model to augment the training data. This process creates a customized model that approaches teacher-level accuracy on specific use cases while retaining the student’s cost-effectiveness and speed.
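Conceptually, the data-augmentation step works like this minimal sketch: a teacher model labels unlabeled prompts to produce input-output pairs for training the student. The `teacher` callable here is a stub standing in for a real invocation of a larger model through your inference API:

```python
from typing import Callable

# Sketch of data augmentation for distillation: a stronger "teacher" labels
# unlabeled prompts to build a training set for the smaller "student" model.
def build_distillation_set(
    prompts: list[str], teacher: Callable[[str], str]
) -> list[dict]:
    return [{"prompt": p, "completion": teacher(p)} for p in prompts]

# Stub teacher for illustration only; replace with a real model call.
dataset = build_distillation_set(
    ["Classify the sentiment of: 'Great battery life.'"],
    teacher=lambda p: "positive",
)
```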
Here is a table summarizing the available customization techniques across different modalities and deployment options. Each technique offers specific training and inference capabilities depending on your implementation requirements.
| Recipe | Modality | Training: Amazon Bedrock | Training: Amazon SageMaker | Inference: Amazon Bedrock on-demand | Inference: Amazon Bedrock provisioned throughput |
|---|---|---|---|---|---|
| Supervised fine-tuning | Text, image, video | | | | |
| — Parameter-efficient fine-tuning (PEFT) | | ✅ | ✅ | ✅ | ✅ |
| — Full fine-tuning | | | ✅ | | ✅ |
| Direct preference optimization (DPO) | Text, image | | | | |
| — Parameter-efficient DPO | | | ✅ | ✅ | ✅ |
| — Full model DPO | | | ✅ | | ✅ |
| Proximal policy optimization (PPO) | Text-only | | ✅ | | ✅ |
| Continued pre-training (CPT) | Text-only | | ✅ | | ✅ |
| Distillation | Text-only | ✅ | ✅ | ✅ | ✅ |
Early access customers, including Cosine AI, Massachusetts Institute of Technology (MIT) Computer Science and Artificial Intelligence Laboratory (CSAIL), Volkswagen, Amazon Customer Service, and Amazon Catalog Systems Service, are already successfully using Amazon Nova customization capabilities.
Customizing Nova models in action
The following walks you through an example of customizing the Nova Micro model using direct preference optimization on an existing preference dataset. To do this, you can use Amazon SageMaker Studio.
Launch your SageMaker Studio in the Amazon SageMaker AI console and choose JumpStart, a machine learning (ML) hub with foundation models, built-in algorithms, and pre-built ML solutions that you can deploy with a few clicks.
Then, choose Nova Micro, a text-only model that delivers the lowest latency responses at the lowest cost per inference among the Nova model family, and then choose Train.
Next, choose a fine-tuning recipe to train the model with labeled data to enhance performance on specific tasks and align with desired behaviors. Choosing Direct Preference Optimization gives you a straightforward way to tune model outputs to your preferences.
When you choose Open sample notebook, you have two environment options for running the recipe: on SageMaker training jobs or on SageMaker HyperPod:
Choose Run recipe on SageMaker training jobs when you don’t need to create a persistent cluster; select your JupyterLab space and train the model with the sample notebook.
Alternatively, if you want a persistent cluster environment optimized for iterative training processes, choose Run recipe on SageMaker HyperPod. You can choose a HyperPod EKS cluster with at least one restricted instance group (RIG), which provides the specialized isolated environment required for Nova model training. Then, choose your JupyterLab space and Open sample notebook.
This notebook provides an end-to-end walkthrough for creating a SageMaker HyperPod job using a SageMaker Nova model with a recipe and deploying it for inference. With the help of a SageMaker HyperPod recipe, you can streamline complex configurations and seamlessly integrate datasets for optimized training jobs.
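The launch itself is driven by a recipe plus a small set of overrides. The sketch below shows the general shape of that configuration; the recipe name, override keys, instance type, and S3 path are placeholders, and the sample notebook opened from SageMaker Studio supplies the exact values for your account:

```python
# Hypothetical shape of a recipe launch configuration. All keys, names, and
# paths below are placeholders taken as assumptions, not the exact schema.
recipe_overrides = {
    "run": {"name": "nova-micro-dpo"},       # job name (placeholder)
    "training_config": {"max_epochs": 1},    # illustrative hyperparameter override
}

# With the SageMaker Python SDK installed and AWS credentials configured,
# the launch looks roughly like the following (not executed here):
#
#   from sagemaker.pytorch import PyTorch
#   estimator = PyTorch(
#       role="<your SageMaker execution role ARN>",
#       instance_count=1,
#       instance_type="ml.p5.48xlarge",               # placeholder
#       training_recipe="<recipe path from the sample notebook>",
#       recipe_overrides=recipe_overrides,
#   )
#   estimator.fit(inputs={"train": "s3://<your-bucket>/dpo_train.jsonl"})

print(recipe_overrides["run"]["name"])
```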
In SageMaker Studio, you can see that your SageMaker HyperPod job has been successfully created and you can monitor it for further progress.
After your job completes, you can use a benchmark recipe to evaluate whether the customized model performs better on agentic tasks.
For comprehensive documentation and additional example implementations, visit the SageMaker HyperPod recipes repository on GitHub. We continue to expand the recipes based on customer feedback and emerging ML trends, ensuring you have the tools needed for successful AI model customization.
Availability and getting started
Recipes for Amazon Nova on Amazon SageMaker AI are available in US East (N. Virginia). Learn more about this feature by visiting the Amazon Nova customization webpage and Amazon Nova user guide and get started in the Amazon SageMaker AI console.
–Betty
Updated on July 16, 2025 – Revised the table data and console screenshot.