AWS Blogs · November 6, 16:07
Amazon Nova introduces a multimodal embeddings model, enabling crossmodal retrieval

Amazon has launched Amazon Nova Multimodal Embeddings in Bedrock, a state-of-the-art multimodal embedding model that supports text, documents, images, video, and audio, enabling crossmodal retrieval through a single model. The model converts different types of data into numerical representations, powering AI systems for semantic search and retrieval-augmented generation (RAG). Unlike earlier models, Nova supports a unified semantic space, handles content that mixes text and images, and offers a context length of up to 8K tokens with support for 200 languages. It also supports segmenting long-form content and uses Matryoshka Representation Learning to optimize retrieval efficiency and accuracy, providing a powerful tool for working with large volumes of unstructured data.

✨ **Unified multimodal embedding capability:** Amazon Nova Multimodal Embeddings is the first unified embedding model to support text, documents, images, video, and audio, enabling crossmodal retrieval through a single model. Users can process every content type with one model, which greatly simplifies multimodal application development, and the model handles mixed text-and-image content effectively, capturing richer semantic information.

🚀 **Leading retrieval and RAG performance:** The model is designed for agentic retrieval-augmented generation (RAG) and semantic search applications. It converts inputs from different modalities into numerical representations (embeddings) that capture the semantic meaning of the input, so AI systems can compare, search, and analyze them efficiently. This is essential for extracting insights from the growing volume of unstructured data spread across text, images, documents, video, and audio.

💡 **Flexible and efficient features:** Nova Multimodal Embeddings supports a context length of up to 8K tokens, covers 200 languages, and accepts input through synchronous and asynchronous APIs. Its segmentation (chunking) capability splits long text, video, or audio content and generates an embedding for each segment, making large volumes of content manageable and searchable. Trained with Matryoshka Representation Learning (MRL), the model offers four output embedding dimensions that keep latency low with minimal accuracy loss, and it supports batch inference for higher processing efficiency.

📊 **Strong benchmark performance:** Amazon Nova Multimodal Embeddings delivers leading accuracy across a broad range of benchmarks, providing excellent out-of-the-box performance. By mapping different types of data into a unified semantic space, the model keeps crossmodal search accurate and relevant, for example when searching with a reference image or retrieving documents that contain both visual and textual information.

Today, we’re introducing Amazon Nova Multimodal Embeddings, a state-of-the-art multimodal embedding model for agentic retrieval-augmented generation (RAG) and semantic search applications, available in Amazon Bedrock. It is the first unified embedding model that supports text, documents, images, video, and audio through a single model to enable crossmodal retrieval with leading accuracy.

Embedding models convert textual, visual, and audio inputs into numerical representations called embeddings. These embeddings capture the meaning of the input in a way that AI systems can compare, search, and analyze, powering use cases such as semantic search and RAG.
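To make "compare" concrete, here is a minimal sketch of scoring two embeddings with cosine similarity; the short vectors below are placeholders rather than real model output, which has far more dimensions.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Score how semantically close two embeddings are (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Placeholder vectors standing in for real embeddings returned by the model
query_embedding = [0.12, -0.05, 0.33, 0.08]
document_embedding = [0.10, -0.02, 0.31, 0.11]

print(f"Similarity: {cosine_similarity(query_embedding, document_embedding):.4f}")
```

Higher scores mean the inputs are semantically closer, which is the signal that semantic search and RAG pipelines rank on.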

Organizations are increasingly seeking solutions to unlock insights from the growing volume of unstructured data that is spread across text, image, document, video, and audio content. For example, an organization might have product images, brochures that contain infographics and text, and user-uploaded video clips. Embedding models can unlock value from unstructured data; however, traditional models are typically specialized to handle one content type. This limitation drives customers to either build complex crossmodal embedding solutions or restrict themselves to use cases focused on a single content type. The problem also applies to mixed-modality content types, such as documents with interleaved text and images or video with visual, audio, and textual elements, where existing models struggle to capture crossmodal relationships effectively.

Nova Multimodal Embeddings supports a unified semantic space for text, documents, images, video, and audio for use cases such as crossmodal search across mixed-modality content, searching with a reference image, and retrieving visual documents.

Evaluating Amazon Nova Multimodal Embeddings performance
We evaluated the model on a broad range of benchmarks, and it delivers leading accuracy out of the box.

Nova Multimodal Embeddings supports a context length of up to 8K tokens and text in up to 200 languages, and it accepts inputs via synchronous and asynchronous APIs. Additionally, it supports segmentation (also known as "chunking") to partition long-form text, video, or audio content into manageable segments, generating embeddings for each portion. Lastly, the model offers four output embedding dimensions, trained using Matryoshka Representation Learning (MRL), which enables low-latency end-to-end retrieval with minimal accuracy loss.

Nova Multimodal Embeddings supports batch inference, allowing users to convert large volumes of content into embeddings more efficiently. Instead of sending an individual request for each item, users can send multiple items in a single request, reducing API overhead.
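The walkthrough below uses the synchronous and asynchronous APIs rather than batch. As a rough, hedged sketch of what a batch workflow could look like, the example below follows the general Amazon Bedrock batch inference pattern: requests are written as JSONL records to Amazon S3 and processed by a model invocation job. The bucket names, IAM role ARN, and the exact record schema for this model are assumptions; check the Amazon Nova User Guide for the authoritative format.

```python
import json
import boto3

REGION = "us-east-1"
MODEL_ID = "amazon.nova-2-multimodal-embeddings-v1:0"

s3 = boto3.client("s3", region_name=REGION)
bedrock = boto3.client("bedrock", region_name=REGION)  # control-plane client for batch jobs

# Hypothetical JSONL records: one embedding request per line, mirroring the
# invoke_model request body used elsewhere in this post (record schema assumed)
records = [
    {
        "recordId": f"item-{i}",
        "modelInput": {
            "taskType": "SINGLE_EMBEDDING",
            "singleEmbeddingParams": {
                "embeddingPurpose": "GENERIC_INDEX",
                "embeddingDimension": 3072,
                "text": {"truncationMode": "END", "value": text},
            },
        },
    }
    for i, text in enumerate(["First document to embed", "Second document to embed"])
]

# Upload the JSONL input file (bucket and key are placeholders)
s3.put_object(
    Bucket="my-batch-bucket",
    Key="embeddings/input.jsonl",
    Body="\n".join(json.dumps(r) for r in records).encode("utf-8"),
)

# Start the batch job; the role must allow Bedrock to read the input and write the output
response = bedrock.create_model_invocation_job(
    jobName="nova-embeddings-batch-demo",
    modelId=MODEL_ID,
    roleArn="arn:aws:iam::123456789012:role/BedrockBatchInferenceRole",
    inputDataConfig={"s3InputDataConfig": {"s3Uri": "s3://my-batch-bucket/embeddings/input.jsonl"}},
    outputDataConfig={"s3OutputDataConfig": {"s3Uri": "s3://my-batch-bucket/embeddings/output/"}},
)
print(f"Batch job ARN: {response['jobArn']}")
```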

Let’s see how the new model can be used in practice.

Using Amazon Nova Multimodal Embeddings
Getting started with Nova Multimodal Embeddings follows the same pattern as other models in Amazon Bedrock. The model accepts text, documents, images, video, or audio as input and returns numerical embeddings that you can use for semantic search, similarity comparison, or RAG.

Here’s a practical example using the AWS SDK for Python (Boto3) that shows how to create embeddings from different content types and store them for later retrieval. For simplicity, I’ll use Amazon S3 Vectors, cost-optimized storage with native support for storing and querying vectors at any scale, to store and search the embeddings.

Let’s start with the fundamentals: converting text into embeddings. This example shows how to transform a simple text description into a numerical representation that captures its semantic meaning. These embeddings can later be compared with embeddings from documents, images, videos, or audio to find related content.

To make the code easy to follow, I’ll show a section of the script at a time. The full script is included at the end of this walkthrough.

```python
import json
import base64
import time

import boto3

MODEL_ID = "amazon.nova-2-multimodal-embeddings-v1:0"
EMBEDDING_DIMENSION = 3072

# Initialize Amazon Bedrock Runtime client
bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

print(f"Generating text embedding with {MODEL_ID} ...")

# Text to embed
text = "Amazon Nova is a multimodal foundation model"

# Create embedding
request_body = {
    "taskType": "SINGLE_EMBEDDING",
    "singleEmbeddingParams": {
        "embeddingPurpose": "GENERIC_INDEX",
        "embeddingDimension": EMBEDDING_DIMENSION,
        "text": {"truncationMode": "END", "value": text},
    },
}

response = bedrock_runtime.invoke_model(
    body=json.dumps(request_body),
    modelId=MODEL_ID,
    contentType="application/json",
)

# Extract embedding
response_body = json.loads(response["body"].read())
embedding = response_body["embeddings"][0]["embedding"]

print(f"Generated embedding with {len(embedding)} dimensions")
```

Now we’ll process visual content in the same embedding space, using a photo.jpg file in the same folder as the script. This demonstrates the power of multimodality: Nova Multimodal Embeddings captures both textual and visual context in a single embedding that provides an enhanced understanding of the document.

Nova Multimodal Embeddings can generate embeddings that are optimized for how they are being used. When indexing for a search or retrieval use case, embeddingPurpose can be set to GENERIC_INDEX. For the query step, embeddingPurpose can be set depending on the type of item to be retrieved. For example, when retrieving documents, embeddingPurpose can be set to DOCUMENT_RETRIEVAL.
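Before the image example, here is a minimal sketch of what the query side could look like with embeddingPurpose set to DOCUMENT_RETRIEVAL; it reuses MODEL_ID, EMBEDDING_DIMENSION, and the bedrock_runtime client defined earlier, and the query string is purely illustrative.

```python
# Hypothetical query-side request: embed a question for document retrieval
query_body = {
    "taskType": "SINGLE_EMBEDDING",
    "singleEmbeddingParams": {
        "embeddingPurpose": "DOCUMENT_RETRIEVAL",  # retrieval-side purpose, per the paragraph above
        "embeddingDimension": EMBEDDING_DIMENSION,
        "text": {"truncationMode": "END", "value": "Which slide shows the quarterly revenue chart?"},
    },
}

response = bedrock_runtime.invoke_model(
    body=json.dumps(query_body),
    modelId=MODEL_ID,
    contentType="application/json",
)
query_embedding = json.loads(response["body"].read())["embeddings"][0]["embedding"]
```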

```python
# Read and encode image
print(f"Generating image embedding with {MODEL_ID} ...")

with open("photo.jpg", "rb") as f:
    image_bytes = base64.b64encode(f.read()).decode("utf-8")

# Create embedding
request_body = {
    "taskType": "SINGLE_EMBEDDING",
    "singleEmbeddingParams": {
        "embeddingPurpose": "GENERIC_INDEX",
        "embeddingDimension": EMBEDDING_DIMENSION,
        "image": {
            "format": "jpeg",
            "source": {"bytes": image_bytes}
        },
    },
}

response = bedrock_runtime.invoke_model(
    body=json.dumps(request_body),
    modelId=MODEL_ID,
    contentType="application/json",
)

# Extract embedding
response_body = json.loads(response["body"].read())
embedding = response_body["embeddings"][0]["embedding"]

print(f"Generated embedding with {len(embedding)} dimensions")
```

To process video content, I use the asynchronous API. That’s a requirement for videos that are larger than 25MB when encoded as Base64. First, I upload a local video to an S3 bucket in the same AWS Region.

```bash
aws s3 cp presentation.mp4 s3://my-video-bucket/videos/
```

This example shows how to extract embeddings from both visual and audio components of a video file. The segmentation feature breaks longer videos into manageable chunks, making it practical to search through hours of content efficiently.

```python
# Initialize Amazon S3 client
s3 = boto3.client("s3", region_name="us-east-1")

print(f"Generating video embedding with {MODEL_ID} ...")

# Amazon S3 URIs
S3_VIDEO_URI = "s3://my-video-bucket/videos/presentation.mp4"
S3_EMBEDDING_DESTINATION_URI = "s3://my-embedding-destination-bucket/embeddings-output/"

# Create async embedding job for video with audio
model_input = {
    "taskType": "SEGMENTED_EMBEDDING",
    "segmentedEmbeddingParams": {
        "embeddingPurpose": "GENERIC_INDEX",
        "embeddingDimension": EMBEDDING_DIMENSION,
        "video": {
            "format": "mp4",
            "embeddingMode": "AUDIO_VIDEO_COMBINED",
            "source": {
                "s3Location": {"uri": S3_VIDEO_URI}
            },
            "segmentationConfig": {
                "durationSeconds": 15  # Segment into 15-second chunks
            },
        },
    },
}

response = bedrock_runtime.start_async_invoke(
    modelId=MODEL_ID,
    modelInput=model_input,
    outputDataConfig={
        "s3OutputDataConfig": {
            "s3Uri": S3_EMBEDDING_DESTINATION_URI
        }
    },
)

invocation_arn = response["invocationArn"]
print(f"Async job started: {invocation_arn}")

# Poll until job completes
print("\nPolling for job completion...")
while True:
    job = bedrock_runtime.get_async_invoke(invocationArn=invocation_arn)
    status = job["status"]
    print(f"Status: {status}")
    if status != "InProgress":
        break
    time.sleep(15)

# Check if job completed successfully
if status == "Completed":
    output_s3_uri = job["outputDataConfig"]["s3OutputDataConfig"]["s3Uri"]
    print(f"\nSuccess! Embeddings at: {output_s3_uri}")

    # Parse S3 URI to get bucket and prefix
    s3_uri_parts = output_s3_uri[5:].split("/", 1)  # Remove "s3://" prefix
    bucket = s3_uri_parts[0]
    prefix = s3_uri_parts[1] if len(s3_uri_parts) > 1 else ""

    # AUDIO_VIDEO_COMBINED mode outputs to embedding-audio-video.jsonl
    # The output_s3_uri already includes the job ID, so just append the filename
    embeddings_key = f"{prefix}/embedding-audio-video.jsonl".lstrip("/")
    print(f"Reading embeddings from: s3://{bucket}/{embeddings_key}")

    # Read and parse JSONL file
    response = s3.get_object(Bucket=bucket, Key=embeddings_key)
    content = response['Body'].read().decode('utf-8')

    embeddings = []
    for line in content.strip().split('\n'):
        if line:
            embeddings.append(json.loads(line))

    print(f"\nFound {len(embeddings)} video segments:")
    for i, segment in enumerate(embeddings):
        print(f"  Segment {i}: {segment.get('startTime', 0):.1f}s - {segment.get('endTime', 0):.1f}s")
        print(f"    Embedding dimension: {len(segment.get('embedding', []))}")
else:
    print(f"\nJob failed: {job.get('failureMessage', 'Unknown error')}")
```

With our embeddings generated, we need a place to store and search them efficiently. This example demonstrates setting up a vector store using Amazon S3 Vectors, which provides the infrastructure needed for similarity search at scale. Think of this as creating a searchable index where semantically similar content naturally clusters together. When adding an embedding to the index, I use the metadata to specify the original format and the content being indexed.

```python
# Initialize Amazon S3 Vectors client
s3vectors = boto3.client("s3vectors", region_name="us-east-1")

# Configuration
VECTOR_BUCKET = "my-vector-store"
INDEX_NAME = "embeddings"

# Create vector bucket and index (if they don't exist)
try:
    s3vectors.get_vector_bucket(vectorBucketName=VECTOR_BUCKET)
    print(f"Vector bucket {VECTOR_BUCKET} already exists")
except s3vectors.exceptions.NotFoundException:
    s3vectors.create_vector_bucket(vectorBucketName=VECTOR_BUCKET)
    print(f"Created vector bucket: {VECTOR_BUCKET}")

try:
    s3vectors.get_index(vectorBucketName=VECTOR_BUCKET, indexName=INDEX_NAME)
    print(f"Vector index {INDEX_NAME} already exists")
except s3vectors.exceptions.NotFoundException:
    s3vectors.create_index(
        vectorBucketName=VECTOR_BUCKET,
        indexName=INDEX_NAME,
        dimension=EMBEDDING_DIMENSION,
        dataType="float32",
        distanceMetric="cosine"
    )
    print(f"Created index: {INDEX_NAME}")

texts = [
    "Machine learning on AWS",
    "Amazon Bedrock provides foundation models",
    "S3 Vectors enables semantic search"
]

print(f"\nGenerating embeddings for {len(texts)} texts...")

# Generate embeddings using Amazon Nova for each text
vectors = []
for text in texts:
    response = bedrock_runtime.invoke_model(
        body=json.dumps({
            "taskType": "SINGLE_EMBEDDING",
            "singleEmbeddingParams": {
                "embeddingDimension": EMBEDDING_DIMENSION,
                "text": {"truncationMode": "END", "value": text}
            }
        }),
        modelId=MODEL_ID,
        accept="application/json",
        contentType="application/json"
    )
    response_body = json.loads(response["body"].read())
    embedding = response_body["embeddings"][0]["embedding"]

    vectors.append({
        "key": f"text:{text[:50]}",  # Unique identifier
        "data": {"float32": embedding},
        "metadata": {"type": "text", "content": text}
    })
    print(f"  ✓ Generated embedding for: {text}")

# Add all vectors to store in a single call
s3vectors.put_vectors(
    vectorBucketName=VECTOR_BUCKET,
    indexName=INDEX_NAME,
    vectors=vectors
)

print(f"\nSuccessfully added {len(vectors)} vectors to the store in one put_vectors call!")
```

This final example demonstrates the capability of searching across different content types with a single query, finding the most similar content regardless of whether it originated from text, images, videos, or audio. The distance scores help you understand how closely related the results are to your original query.

```python
# Text to query
query_text = "foundation models"

print(f"\nGenerating embeddings for query '{query_text}' ...")

# Generate embeddings
response = bedrock_runtime.invoke_model(
    body=json.dumps({
        "taskType": "SINGLE_EMBEDDING",
        "singleEmbeddingParams": {
            "embeddingPurpose": "GENERIC_RETRIEVAL",
            "embeddingDimension": EMBEDDING_DIMENSION,
            "text": {"truncationMode": "END", "value": query_text}
        }
    }),
    modelId=MODEL_ID,
    accept="application/json",
    contentType="application/json"
)

response_body = json.loads(response["body"].read())
query_embedding = response_body["embeddings"][0]["embedding"]

print(f"Searching for similar embeddings...\n")

# Search for top 5 most similar vectors
response = s3vectors.query_vectors(
    vectorBucketName=VECTOR_BUCKET,
    indexName=INDEX_NAME,
    queryVector={"float32": query_embedding},
    topK=5,
    returnDistance=True,
    returnMetadata=True
)

# Display results
print(f"Found {len(response['vectors'])} results:\n")
for i, result in enumerate(response["vectors"], 1):
    print(f"{i}. {result['key']}")
    print(f"   Distance: {result['distance']:.4f}")
    if result.get("metadata"):
        print(f"   Metadata: {result['metadata']}")
    print()
```

Crossmodal search is one of the key advantages of multimodal embeddings. With crossmodal search, you can query with text and find relevant images. You can also search for videos using text descriptions, find audio clips that match certain topics, or discover documents based on their visual and textual content. For your reference, the full script with all previous examples merged together is here:

```python
import json
import base64
import time

import boto3

MODEL_ID = "amazon.nova-2-multimodal-embeddings-v1:0"
EMBEDDING_DIMENSION = 3072

# Initialize Amazon Bedrock Runtime client
bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

print(f"Generating text embedding with {MODEL_ID} ...")

# Text to embed
text = "Amazon Nova is a multimodal foundation model"

# Create embedding
request_body = {
    "taskType": "SINGLE_EMBEDDING",
    "singleEmbeddingParams": {
        "embeddingPurpose": "GENERIC_INDEX",
        "embeddingDimension": EMBEDDING_DIMENSION,
        "text": {"truncationMode": "END", "value": text},
    },
}

response = bedrock_runtime.invoke_model(
    body=json.dumps(request_body),
    modelId=MODEL_ID,
    contentType="application/json",
)

# Extract embedding
response_body = json.loads(response["body"].read())
embedding = response_body["embeddings"][0]["embedding"]

print(f"Generated embedding with {len(embedding)} dimensions")

# Read and encode image
print(f"Generating image embedding with {MODEL_ID} ...")

with open("photo.jpg", "rb") as f:
    image_bytes = base64.b64encode(f.read()).decode("utf-8")

# Create embedding
request_body = {
    "taskType": "SINGLE_EMBEDDING",
    "singleEmbeddingParams": {
        "embeddingPurpose": "GENERIC_INDEX",
        "embeddingDimension": EMBEDDING_DIMENSION,
        "image": {
            "format": "jpeg",
            "source": {"bytes": image_bytes}
        },
    },
}

response = bedrock_runtime.invoke_model(
    body=json.dumps(request_body),
    modelId=MODEL_ID,
    contentType="application/json",
)

# Extract embedding
response_body = json.loads(response["body"].read())
embedding = response_body["embeddings"][0]["embedding"]

print(f"Generated embedding with {len(embedding)} dimensions")

# Initialize Amazon S3 client
s3 = boto3.client("s3", region_name="us-east-1")

print(f"Generating video embedding with {MODEL_ID} ...")

# Amazon S3 URIs
S3_VIDEO_URI = "s3://my-video-bucket/videos/presentation.mp4"

# Amazon S3 output bucket and location
S3_EMBEDDING_DESTINATION_URI = "s3://my-video-bucket/embeddings-output/"

# Create async embedding job for video with audio
model_input = {
    "taskType": "SEGMENTED_EMBEDDING",
    "segmentedEmbeddingParams": {
        "embeddingPurpose": "GENERIC_INDEX",
        "embeddingDimension": EMBEDDING_DIMENSION,
        "video": {
            "format": "mp4",
            "embeddingMode": "AUDIO_VIDEO_COMBINED",
            "source": {
                "s3Location": {"uri": S3_VIDEO_URI}
            },
            "segmentationConfig": {
                "durationSeconds": 15  # Segment into 15-second chunks
            },
        },
    },
}

response = bedrock_runtime.start_async_invoke(
    modelId=MODEL_ID,
    modelInput=model_input,
    outputDataConfig={
        "s3OutputDataConfig": {
            "s3Uri": S3_EMBEDDING_DESTINATION_URI
        }
    },
)

invocation_arn = response["invocationArn"]
print(f"Async job started: {invocation_arn}")

# Poll until job completes
print("\nPolling for job completion...")
while True:
    job = bedrock_runtime.get_async_invoke(invocationArn=invocation_arn)
    status = job["status"]
    print(f"Status: {status}")
    if status != "InProgress":
        break
    time.sleep(15)

# Check if job completed successfully
if status == "Completed":
    output_s3_uri = job["outputDataConfig"]["s3OutputDataConfig"]["s3Uri"]
    print(f"\nSuccess! Embeddings at: {output_s3_uri}")

    # Parse S3 URI to get bucket and prefix
    s3_uri_parts = output_s3_uri[5:].split("/", 1)  # Remove "s3://" prefix
    bucket = s3_uri_parts[0]
    prefix = s3_uri_parts[1] if len(s3_uri_parts) > 1 else ""

    # AUDIO_VIDEO_COMBINED mode outputs to embedding-audio-video.jsonl
    # The output_s3_uri already includes the job ID, so just append the filename
    embeddings_key = f"{prefix}/embedding-audio-video.jsonl".lstrip("/")
    print(f"Reading embeddings from: s3://{bucket}/{embeddings_key}")

    # Read and parse JSONL file
    response = s3.get_object(Bucket=bucket, Key=embeddings_key)
    content = response['Body'].read().decode('utf-8')

    embeddings = []
    for line in content.strip().split('\n'):
        if line:
            embeddings.append(json.loads(line))

    print(f"\nFound {len(embeddings)} video segments:")
    for i, segment in enumerate(embeddings):
        print(f"  Segment {i}: {segment.get('startTime', 0):.1f}s - {segment.get('endTime', 0):.1f}s")
        print(f"    Embedding dimension: {len(segment.get('embedding', []))}")
else:
    print(f"\nJob failed: {job.get('failureMessage', 'Unknown error')}")

# Initialize Amazon S3 Vectors client
s3vectors = boto3.client("s3vectors", region_name="us-east-1")

# Configuration
VECTOR_BUCKET = "my-vector-store"
INDEX_NAME = "embeddings"

# Create vector bucket and index (if they don't exist)
try:
    s3vectors.get_vector_bucket(vectorBucketName=VECTOR_BUCKET)
    print(f"Vector bucket {VECTOR_BUCKET} already exists")
except s3vectors.exceptions.NotFoundException:
    s3vectors.create_vector_bucket(vectorBucketName=VECTOR_BUCKET)
    print(f"Created vector bucket: {VECTOR_BUCKET}")

try:
    s3vectors.get_index(vectorBucketName=VECTOR_BUCKET, indexName=INDEX_NAME)
    print(f"Vector index {INDEX_NAME} already exists")
except s3vectors.exceptions.NotFoundException:
    s3vectors.create_index(
        vectorBucketName=VECTOR_BUCKET,
        indexName=INDEX_NAME,
        dimension=EMBEDDING_DIMENSION,
        dataType="float32",
        distanceMetric="cosine"
    )
    print(f"Created index: {INDEX_NAME}")

texts = [
    "Machine learning on AWS",
    "Amazon Bedrock provides foundation models",
    "S3 Vectors enables semantic search"
]

print(f"\nGenerating embeddings for {len(texts)} texts...")

# Generate embeddings using Amazon Nova for each text
vectors = []
for text in texts:
    response = bedrock_runtime.invoke_model(
        body=json.dumps({
            "taskType": "SINGLE_EMBEDDING",
            "singleEmbeddingParams": {
                "embeddingPurpose": "GENERIC_INDEX",
                "embeddingDimension": EMBEDDING_DIMENSION,
                "text": {"truncationMode": "END", "value": text}
            }
        }),
        modelId=MODEL_ID,
        accept="application/json",
        contentType="application/json"
    )
    response_body = json.loads(response["body"].read())
    embedding = response_body["embeddings"][0]["embedding"]

    vectors.append({
        "key": f"text:{text[:50]}",  # Unique identifier
        "data": {"float32": embedding},
        "metadata": {"type": "text", "content": text}
    })
    print(f"  ✓ Generated embedding for: {text}")

# Add all vectors to store in a single call
s3vectors.put_vectors(
    vectorBucketName=VECTOR_BUCKET,
    indexName=INDEX_NAME,
    vectors=vectors
)

print(f"\nSuccessfully added {len(vectors)} vectors to the store in one put_vectors call!")

# Text to query
query_text = "foundation models"

print(f"\nGenerating embeddings for query '{query_text}' ...")

# Generate embeddings
response = bedrock_runtime.invoke_model(
    body=json.dumps({
        "taskType": "SINGLE_EMBEDDING",
        "singleEmbeddingParams": {
            "embeddingPurpose": "GENERIC_RETRIEVAL",
            "embeddingDimension": EMBEDDING_DIMENSION,
            "text": {"truncationMode": "END", "value": query_text}
        }
    }),
    modelId=MODEL_ID,
    accept="application/json",
    contentType="application/json"
)

response_body = json.loads(response["body"].read())
query_embedding = response_body["embeddings"][0]["embedding"]

print(f"Searching for similar embeddings...\n")

# Search for top 5 most similar vectors
response = s3vectors.query_vectors(
    vectorBucketName=VECTOR_BUCKET,
    indexName=INDEX_NAME,
    queryVector={"float32": query_embedding},
    topK=5,
    returnDistance=True,
    returnMetadata=True
)

# Display results
print(f"Found {len(response['vectors'])} results:\n")
for i, result in enumerate(response["vectors"], 1):
    print(f"{i}. {result['key']}")
    print(f"   Distance: {result['distance']:.4f}")
    if result.get("metadata"):
        print(f"   Metadata: {result['metadata']}")
    print()
```
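To illustrate the crossmodal search described above, here is a minimal sketch that uses a reference image as the query against the same S3 Vectors index. It reuses the clients and constants from the full script, the query-photo.jpg file is a hypothetical input, and using GENERIC_RETRIEVAL as the embeddingPurpose for an image query is an assumption.

```python
# Hypothetical crossmodal query: use a reference image to search the same index
with open("query-photo.jpg", "rb") as f:
    query_image = base64.b64encode(f.read()).decode("utf-8")

response = bedrock_runtime.invoke_model(
    body=json.dumps({
        "taskType": "SINGLE_EMBEDDING",
        "singleEmbeddingParams": {
            "embeddingPurpose": "GENERIC_RETRIEVAL",  # assumed to apply to image queries as well
            "embeddingDimension": EMBEDDING_DIMENSION,
            "image": {"format": "jpeg", "source": {"bytes": query_image}},
        },
    }),
    modelId=MODEL_ID,
    contentType="application/json",
)
image_query_embedding = json.loads(response["body"].read())["embeddings"][0]["embedding"]

# Because text, images, video, and audio share one semantic space,
# this image embedding can be compared against the text vectors stored earlier
results = s3vectors.query_vectors(
    vectorBucketName=VECTOR_BUCKET,
    indexName=INDEX_NAME,
    queryVector={"float32": image_query_embedding},
    topK=5,
    returnDistance=True,
    returnMetadata=True,
)
for match in results["vectors"]:
    print(match["key"], match["distance"])
```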

For production applications, embeddings can be stored in any vector database. Amazon OpenSearch Service offers native integration with Nova Multimodal Embeddings at launch, making it straightforward to build scalable search applications. As shown in the preceding examples, Amazon S3 Vectors provides a simple way to store and query embeddings with your application data.

Things to know
Nova Multimodal Embeddings offers four output dimension options: 3,072, 1,024, 384, and 256. Larger dimensions provide more detailed representations but require more storage and computation. Smaller dimensions offer a practical balance between retrieval performance and resource efficiency. This flexibility helps you optimize for your specific application and cost requirements.
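As a minimal sketch of picking a smaller dimension, the request below asks for 384-dimensional embeddings instead of the 3,072 used in the walkthrough; the setup mirrors the earlier examples.

```python
import json
import boto3

bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

# Request a compact 384-dimensional embedding to reduce storage and speed up retrieval
response = bedrock_runtime.invoke_model(
    body=json.dumps({
        "taskType": "SINGLE_EMBEDDING",
        "singleEmbeddingParams": {
            "embeddingPurpose": "GENERIC_INDEX",
            "embeddingDimension": 384,
            "text": {"truncationMode": "END", "value": "Compact embeddings for large-scale indexes"},
        },
    }),
    modelId="amazon.nova-2-multimodal-embeddings-v1:0",
    contentType="application/json",
)
embedding = json.loads(response["body"].read())["embeddings"][0]["embedding"]
print(len(embedding))  # 384
```

Keep in mind that the vector index you store these embeddings in must be created with a matching dimension.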

The model handles substantial context lengths. For text inputs, it can process up to 8,192 tokens at once. Video and audio inputs support segments of up to 30 seconds, and the model can segment longer files. This segmentation capability is particularly useful when working with large media files—the model splits them into manageable pieces and creates embeddings for each segment.

The model includes responsible AI features built into Amazon Bedrock. Content submitted for embedding goes through Amazon Bedrock content safety filters, and the model includes fairness measures to reduce bias.

As described in the code examples, the model can be invoked through both synchronous and asynchronous APIs. The synchronous API works well for real-time applications where you need immediate responses, such as processing user queries in a search interface. The asynchronous API handles latency-insensitive workloads more efficiently, making it suitable for processing large content such as videos.

Availability and pricing
Amazon Nova Multimodal Embeddings is available today in Amazon Bedrock in the US East (N. Virginia) AWS Region. For detailed pricing information, visit the Amazon Bedrock pricing page.

To learn more, see the Amazon Nova User Guide for comprehensive documentation and the Amazon Nova model cookbook on GitHub for practical code examples.

If you’re using an AI-powered assistant for software development, such as Amazon Q Developer or Kiro, you can set up the AWS API MCP Server to help the assistant interact with AWS services and resources, and the AWS Knowledge MCP Server to provide up-to-date documentation, code samples, and knowledge about the regional availability of AWS APIs and CloudFormation resources.

Start building multimodal AI-powered applications with Nova Multimodal Embeddings today, and share your feedback through AWS re:Post for Amazon Bedrock or your usual AWS Support contacts.

Editor's note (11/5/2025): Support for batch inference added.

Danilo
