A guide on Bedrock pricing

A guide on pricing for Amazon Bedrock, which offers high-performing foundation models (FMs) through an API for building generative AI applications

Ammar Mohanna

AI Engineering Lead, EDT&Partners

Benito Castellanos

VP of Technology, EDT&Partners

Categories
Innovation & Technology
Topics
AI in Education
Cloud & Infrastructure

Pricing overview

Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) through a single API, along with a broad set of capabilities you need to build generative AI applications with security, privacy, and responsible AI.

With Amazon Bedrock, you will be charged for:

  • Model inference
  • Model customization

You have a choice of two pricing plans for inference:

  1. On-Demand and Batch: Pay-as-you-go pricing without time-based commitments.
  2. Provisioned Throughput: Provision throughput to meet performance requirements in exchange for a time-based commitment.

Pricing Models

On-Demand and Batch

  • On-Demand Mode: Pay only for what you use with no term commitments.
    • Text-Generation Models: Charged per input and output token.
    • Embeddings Models: Charged per input token.
    • Image-Generation Models: Charged per image generated.
    • Cross-Region Inference: Supports using compute across different AWS Regions to manage traffic bursts, with no extra charge.
  • Batch Mode: Submit prompts as a single input file and receive responses in an output file.
    • Responses are stored in an Amazon S3 bucket for future access.
    • Batch inference pricing is 50% lower than on-demand pricing for select models from providers like Anthropic, Meta, Mistral AI, and Amazon.
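
To make the token-based arithmetic above concrete, here is a minimal Python sketch that estimates an on-demand bill for a text-generation model and the matching batch bill. The per-1,000-token prices are made-up placeholders rather than real Bedrock rates, and the function names are hypothetical. The 50% batch discount applies only to select models, so treat the result as an estimate.

```python
from decimal import Decimal

# Hypothetical prices in USD per 1,000 tokens; NOT real Bedrock rates.
INPUT_PRICE_PER_1K = Decimal("0.003")
OUTPUT_PRICE_PER_1K = Decimal("0.015")

# Batch inference is 50% lower than on-demand for select models.
BATCH_DISCOUNT = Decimal("0.5")


def on_demand_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate a text-generation bill charged per input and output token."""
    input_cost = Decimal(input_tokens) / 1000 * INPUT_PRICE_PER_1K
    output_cost = Decimal(output_tokens) / 1000 * OUTPUT_PRICE_PER_1K
    return input_cost + output_cost


def batch_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the same workload in batch mode, assuming the model is eligible."""
    return on_demand_cost(input_tokens, output_tokens) * (1 - BATCH_DISCOUNT)
```

With these placeholder prices, 1,000,000 input tokens and 200,000 output tokens come to 6 USD on demand and 3 USD in batch mode. Embeddings models would use only the input term, and image-generation models would multiply a per-image price by the number of images.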

Provisioned Throughput

  • Provisioned Throughput Mode: Purchase model units for a specific base or custom model.
    • Designed for large, consistent inference workloads needing guaranteed throughput.
    • Custom models can only be accessed through Provisioned Throughput.
    • Model Unit: Provides a defined throughput, measured as the maximum number of input or output tokens processed per minute.
    • Pricing: Charged by the hour with a choice of 1-month or 6-month commitment terms.
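
Because Provisioned Throughput is billed by the hour for every model unit, the cost of a commitment can be estimated with simple multiplication. The sketch below assumes a hypothetical hourly price per model unit and treats a month as 30 days. Both are illustrative assumptions, not published Bedrock figures, and the function name is made up for this example.

```python
from decimal import Decimal

HOURS_PER_DAY = 24
DAYS_PER_MONTH = 30  # simplifying assumption for a 1-month term


def provisioned_cost(hourly_price_per_unit: Decimal, model_units: int, months: int) -> Decimal:
    """Estimate the total charge for a commitment term.

    Every purchased model unit is charged for every hour of the term,
    whether or not it is serving traffic.
    """
    hours = HOURS_PER_DAY * DAYS_PER_MONTH * months
    return hourly_price_per_unit * model_units * hours


# Example with a hypothetical price of 20 USD per model unit per hour.
one_month = provisioned_cost(Decimal("20"), model_units=2, months=1)
six_months = provisioned_cost(Decimal("20"), model_units=2, months=6)
```

In practice the hourly price usually differs between the 1-month and 6-month terms, so pass the rate that matches the term being estimated.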

Custom Model Import

  • Custom Model Import: Import your customized models into Amazon Bedrock to use them like other hosted models.
    • No Charge: Importing a custom model to Bedrock is free.
    • On-Demand Serving: Imported models are available on-demand with no control plane actions required.
    • Inference Pricing: Charged based on the number of model copies needed for inference and their active duration (billed in 5-minute increments).
    • Model Copy Cost: Pricing depends on factors like architecture, context length, AWS Region, compute version, and model size tier.
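
The inference charge for an imported model can be sketched the same way: number of model copies, multiplied by billed time, multiplied by a price per copy. In the Python sketch below, the price per copy per minute is a hypothetical placeholder, and rounding active time up to the next 5-minute increment is an assumption about how the increments are applied.

```python
import math
from decimal import Decimal

BILLING_INCREMENT_MINUTES = 5


def billed_minutes(active_minutes: float) -> int:
    """Round active time up to the next 5-minute increment (assumed behavior)."""
    increments = math.ceil(active_minutes / BILLING_INCREMENT_MINUTES)
    return increments * BILLING_INCREMENT_MINUTES


def import_inference_cost(
    model_copies: int,
    active_minutes: float,
    price_per_copy_minute: Decimal,  # hypothetical rate, not a real Bedrock price
) -> Decimal:
    """Estimate the charge for serving an imported custom model."""
    return price_per_copy_minute * model_copies * billed_minutes(active_minutes)
```

For example, 2 model copies active for 12 minutes would be billed as 15 minutes each under this assumption. The real price per copy depends on the factors listed above, such as architecture, context length, and AWS Region.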