A guide on Bedrock pricing
A guide to pricing for Amazon Bedrock, which offers high-performing foundation models (FMs) through an API for building generative AI applications

Pricing overview
Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) through a single API, along with a broad set of capabilities you need to build generative AI applications with security, privacy, and responsible AI.
With Amazon Bedrock, you will be charged for:
- Model inference
- Model customization
You have a choice of two pricing plans for inference:
1. On-Demand and Batch: Pay-as-you-go pricing without time-based commitments.
2. Provisioned Throughput: Provision throughput to meet performance requirements in exchange for a time-based commitment.
Pricing Models
On-Demand and Batch
- On-Demand Mode: Pay only for what you use with no term commitments.
  - Text-Generation Models: Charged per input and output token.
  - Embeddings Models: Charged per input token.
  - Image-Generation Models: Charged per image generated.
  - Cross-Region Inference: Supports using compute across different AWS Regions to manage traffic bursts, with no extra charge.
- Batch Mode: Submit prompts as a single input file and receive responses in an output file.
  - Responses are stored in an Amazon S3 bucket for future access.
  - Batch inference pricing is 50% lower than on-demand pricing for select models from providers like Anthropic, Meta, Mistral AI, and Amazon.
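To make the token-based billing concrete, here is a minimal sketch of how an on-demand or batch charge for a text-generation model could be estimated. The per-1,000-token prices and the function name `text_inference_cost` are hypothetical placeholders, not real Bedrock rates; actual prices vary by model and AWS Region, and the 50% batch discount applies only to select models.

```python
from decimal import Decimal

# Hypothetical per-1,000-token prices in USD (assumptions, not real Bedrock rates).
INPUT_PRICE_PER_1K = Decimal("0.003")
OUTPUT_PRICE_PER_1K = Decimal("0.015")

# Batch inference is 50% lower than on-demand for select models.
BATCH_DISCOUNT = Decimal("0.50")


def text_inference_cost(input_tokens: int, output_tokens: int, batch: bool = False) -> Decimal:
    """Estimate the charge for a text-generation workload.

    Input and output tokens are billed separately; batch mode applies
    the discount to the combined total.
    """
    input_cost = Decimal(input_tokens) / 1000 * INPUT_PRICE_PER_1K
    output_cost = Decimal(output_tokens) / 1000 * OUTPUT_PRICE_PER_1K
    total = input_cost + output_cost
    if batch:
        total *= 1 - BATCH_DISCOUNT
    return total


if __name__ == "__main__":
    # 1M input tokens and 200K output tokens, on-demand vs. batch.
    print("on-demand:", text_inference_cost(1_000_000, 200_000))
    print("batch:    ", text_inference_cost(1_000_000, 200_000, batch=True))
```

An embeddings model could be estimated with the same function by passing `output_tokens=0`, since only input tokens are charged.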
Provisioned Throughput
- Provisioned Throughput Mode: Purchase model units for a specific base or custom model.
  - Designed for large, consistent inference workloads needing guaranteed throughput.
  - Custom models are only available with this mode.
- Model Unit: Provides a defined throughput (tokens processed per minute).
- Pricing: Charged by the hour with a choice of 1-month or 6-month commitment terms.
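Because Provisioned Throughput is billed by the hour for the whole commitment term, the total commitment can be estimated with simple arithmetic. The sketch below assumes a hypothetical hourly price per model unit and treats a month as 30 days; both are illustrative assumptions, and the function name `provisioned_cost` is not part of any AWS API.

```python
from decimal import Decimal

HOURS_PER_DAY = 24
DAYS_PER_MONTH = 30  # assumption: a commitment month is approximated as 30 days


def provisioned_cost(model_units: int, hourly_price_per_unit: Decimal, months: int) -> Decimal:
    """Estimate the total charge for a Provisioned Throughput commitment.

    Every purchased model unit is billed for every hour of the term,
    regardless of how much traffic it actually serves.
    """
    hours = HOURS_PER_DAY * DAYS_PER_MONTH * months
    return Decimal(model_units) * hourly_price_per_unit * hours


if __name__ == "__main__":
    # Hypothetical rate of $20.00 per model unit per hour.
    rate = Decimal("20.00")
    print("1-month, 2 units:", provisioned_cost(2, rate, months=1))
    print("6-month, 2 units:", provisioned_cost(2, rate, months=6))
```

In practice the hourly rate usually differs between the 1-month and 6-month terms, so each term should be evaluated with its own rate.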
Custom Model Import
- Custom Model Import: Import your customized models into Amazon Bedrock to use them like other hosted models.
- No Charge: Importing a custom model to Bedrock is free.
- On-Demand Serving: Imported models are available on-demand with no control plane actions required.
- Inference Pricing: Charged based on the number of model copies needed for inference and their active duration (billed in 5-minute increments).
- Model Copy Cost: Pricing depends on factors like architecture, context length, AWS Region, compute version, and model size tier.
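The 5-minute billing increment for imported models can be illustrated with a short calculation. This is a sketch under stated assumptions: the per-copy, per-minute price is hypothetical, a single price is used for all copies, and partial 5-minute windows are assumed to round up to a full window, which the text above does not state explicitly.

```python
import math
from decimal import Decimal

BILLING_WINDOW_MINUTES = 5  # imported models are billed in 5-minute increments


def import_inference_cost(model_copies: int, active_minutes: int,
                          price_per_copy_per_minute: Decimal) -> Decimal:
    """Estimate the inference charge for an imported custom model.

    Assumption: a partially used 5-minute window is billed as a full window.
    """
    windows = math.ceil(active_minutes / BILLING_WINDOW_MINUTES)
    billed_minutes = windows * BILLING_WINDOW_MINUTES
    return model_copies * billed_minutes * price_per_copy_per_minute


if __name__ == "__main__":
    # Hypothetical price of $0.05 per model copy per minute.
    # 12 active minutes round up to three 5-minute windows (15 billed minutes).
    print(import_inference_cost(2, 12, Decimal("0.05")))
```

The real per-copy price depends on architecture, context length, AWS Region, compute version, and model size tier, so the single rate here stands in for whatever rate applies to a given model.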