Object detection with Amazon Nova 2 Lite

Traditional computer vision solutions can require large upfront investments. Setting up data pipelines, model training infrastructure, compute resources, and a dedicated data science team is often a prohibitive task for smaller companies and teams. Amazon Nova 2 Lite, available from Amazon Bedrock, offers an attractive alternative solution. This multimodal foundational model detects objects through natural language prompts without the need for training. When you specify “vehicle”, “person”, or “dent”, Nova returns the exact bounding box coordinates in structured JSON format.

This post describes implementing object detection using Amazon Nova 2 Lite. Learn how to deploy an object detection application using Amazon Bedrock, AWS Lambda, and Amazon API Gateway. You’ll also learn how to create effective prompts, process structured JSON output, and visualize the results. We are exploring practical applications across manufacturing, agriculture and logistics.

Solution overview

Before you begin, make sure you have the following:

AWS accounts and permissions

An active AWS account with Amazon Bedrock access enabled
IAM permissions bedrock:InvokeModel
Accessing Amazon Nova 2 Lite models in your region
AWS Command Line Interface (AWS CLI) configured for deployment

Development environment (for local testing)

Python 3.8 or later
AWS SDK for Python (Boto3) version 1.28.0+
Python Imaging Library (PIL/Pillow)

install:

pip install boto3 pillow

Estimated cost

Amazon Bedrock: $0.0003 per 1,000 input tokens, $0.0025 per 1,000 output tokens
Typical images: 230 input tokens (~$0.000069 per image) & ~200 output tokens (~$0.0005 per image)
Example: 10,000 images ≈ $5.69
AWS Lambda, Amazon API Gateway: Pay-as-you-go (minimum for testing)

Estimated time: 30-45 minutes

Object detection solutions use four main steps to identify and locate objects in images.

procedure:

rapid engineering – Configure prompts to specify objects and expected JSON output format
Amazon bedrock – Call Amazon Bedrock to access Amazon Nova 2 Lite without managing infrastructure and extract bounding box information from the response
Coordinate processing – Convert Nova normalized coordinates (0-1000 scale) to pixel locations
visualization – Render a bounding box on the image for verification

Submit the image and list of objects to detect through Amazon Bedrock’s Converse API. Amazon Nova 2 Lite analyzes the image and returns a JSON response containing the bounding box coordinates of each detected object. Next, convert the normalized coordinates (0-1000 scale) to pixel locations based on the image dimensions. Finally, visualize the result by drawing a bounding box on the original image.

Deploy object detection in just a few hours. No model training, machine learning (ML) expertise, or infrastructure management required.

prompt

Rapid engineering plays a key role in achieving accurate detection. A prompt template (see example below) contains a carefully crafted set of instructions that specify important requirements. Two variables in the prompt template: elements and schema is dynamically constructed based on detected object types, so the prompt template can handle any object category without modification.

# Object Detection and Localization

## Objective

Your task is to detect and localize objects in the target image with high precision and recall.

## Instruction

- The objects to be detected are: {elements}

- Analyze the provided target image and return only the reasoning and a JSON object with bounding box data for detected objects

- Think step-by-step and then provide precise bounding box coordinates for each detection

- Detect all instances of the specified objects

- Fit bounding boxes tightly around each object

- Do not output duplicate bounding boxes

- Coordinates should use the format [x_min, y_min, x_max, y_max] where:

  * (x_min, y_min) is the top-left corner of the bounding box

  * (x_max, y_max) is the bottom-right corner of the bounding box

## Output Requirements and Examples

The JSON output should strictly follow this structure including the word json:

```json

{schema}

```

### Example JSON Structure:

```json

{{
"car": [{{
    "bbox": [321, 432, 543, 876],
}}],
"pedestrian": [{{
    "bbox": [432, 543, 654, 987],
}},
{{
    "bbox": [123, 234, 345, 678],
}}],
// Continue for all detected elements...
}}

```

Briefly explain the detection results and provide the specified JSON format wrapped within triple backticks.

For complete implementation details, please see the GitHub repository.

Example: Street scene detection

We tested the Nova 2 Lite with images of street scenes. Without any training or fine-tuning, we’ll ask Nova to detect two object types: “vehicles” and “stop signs.”

As shown in Figure 1, Nova accurately detects not only obvious objects but also small, distant, or partially occluded objects. The bounding box fits tightly around the object’s boundaries with minimal gaps. Nova achieves this accuracy by omitting detailed descriptions and using only basic object names such as “vehicle” and “stop sign.”

Architecture diagram showing a serverless object discovery application using Amazon CloudFront, Amazon S3, Amazon API Gateway, AWS Lambda, Amazon Bedrock, and Amazon Nova 2 Lite. — *Figure 1. Bounding boxes generated by Amazon Nova 2 Lite for two object types: “vehicle” and “stop sign.”*

Deploy to the cloud

Amazon Bedrock provides API access to Amazon Nova 2 Lite. This means you can call Amazon Nova 2 Lite from any AWS computing service. Choose the service that best fits your workload.

Choosing a computing platform

For event-driven workloads and API endpoints, AWS Lambda offers automatic scaling and a pay-per-invocation model that eliminates idle costs. If you need more control over your runtime environment or have long-running processes, Amazon Elastic Compute Cloud (Amazon EC2) gives you complete flexibility to configure your instances exactly as you need them. For container-based deployments with automatic scaling, use Amazon Elastic Container Service (Amazon ECS) or Amazon Elastic Kubernetes Service (Amazon EKS).

No matter which compute service you choose, they all call the same Amazon Bedrock Converse API to interact with your Nova model. This consistency makes it easy to integrate object detection into your existing infrastructure and migrate it between computing platforms as your requirements evolve.

Building an object detection application

I built a sample serverless web application that demonstrates object detection using Amazon Nova 2 Lite. This proof of concept includes a web interface, secure infrastructure, and autoscaling. Deploy to your AWS account in minutes.

This application follows a serverless-first architecture where multiple AWS services work together. Amazon CloudFront serves single-page applications from private Amazon Simple Storage Service (Amazon S3) buckets and provides global distribution and HTTPS enforcement through Origin Access Control. When a user uploads an image and specifies an object to discover, the front end sends a request to Amazon API Gateway, which routes the request to an AWS Lambda function.

The Lambda function acts as an orchestration layer and calls Amazon Bedrock’s Converse API to send images and discovery prompts to Amazon Nova 2 Lite. Nova returns the normalized bounding box coordinates of each detected object. The Lambda function converts it to pixel locations and renders it as an annotated box on the image. Annotated results return through the same path: Lambda, API Gateway, and front end. The user will see an image with the detected object highlighted.

Amazon CloudFront distributes front ends globally. API Gateway routes the request to Lambda, which calls Amazon Bedrock to perform object discovery. This architecture automatically scales and focuses each component on a single job.

AWS architecture diagram for a serverless object discovery application showing request flow from users through AWS Secrets Manager and Amazon CloudWatch Logs as services supporting Amazon CloudFront, an S3-hosted front end, Amazon API Gateway, an Image Grounding Lambda function, and Amazon Bedrock Nova Lite deployed in the us-west-2 region. — *Figure 2. Serverless object discovery sample application architecture*

try it yourself

The complete source code, including all AWS Cloud Development Kit (AWS CDK) infrastructure definitions and Lambda functions, is available in the GitHub repository. Deployment is easy by installing the AWS CLI and AWS CDK and enabling access to Amazon Nova 2 Lite in the Amazon Bedrock console.

This serverless pattern shows how you can quickly build AI applications using Nova models. Because it’s all infrastructure-as-code, you can version your entire application stack and consistently deploy it across multiple environments or AWS accounts.

cleaning

To avoid ongoing charges, delete the resources you created in this tutorial.

If you have deployed the sample application:

# Delete the AWS CloudFormation stack
cdk destroy

# Verify resources are removed
aws cloudformation list-stacks --stack-status-filter DELETE_COMPLETE

Manual cleanup (if required):

Delete an Amazon S3 bucket and content
Delete an AWS Lambda function
Delete an Amazon API Gateway endpoint
Delete an Amazon CloudFront distribution

Cost impact: Amazon Bedrock API calls are pay-as-you-go, with no ongoing infrastructure costs. When you delete a deployment resource, you only incur charges for making API calls.

Actual application example

The following examples demonstrate how Amazon Nova 2 Lite is applied to real-world use cases across industries.

manufacturing quality control

The metal manufacturing facility processes 10,000 parts each month. Each defective part shipped costs between $50 and $200 to return and rework. The large upfront investment to train traditional computer vision models is often prohibitive for their operations.

Using Amazon Nova 2 Lite, this facility automates quality inspections. Specify defects such as “scratches,” “dents,” and “rust,” and the system will automatically identify them. Analyzing 5 images per part costs about $8 per month.

precision agriculture

The 5,000-acre farm takes weekly drone images during the 20-week growing season to detect crop problems early. Early detection prevents overspraying of chemicals and damage to crops.

On the farm, we identify “leaf diseases,” “pest damage,” and “mold.” Processing 1.2 million high-resolution images per season costs approximately $200.

The same approach could enable GPS guidance equipment to detect obstacles (e.g., “vehicles,” “equipment,” “debris,” etc.), enabling autonomous field operations.

Logistics and fulfillment

Distribution centers identify damaged packages by specifying “torn box,” “crushed package,” or “water damage.” The system automatically flags items for inspection and sends them to the quality control area, ensuring consistent standards throughout the operation.

This approach extends to inventory monitoring (e.g., “empty shelves,” “misplaced items”) and safety compliance (e.g., “hard hats,” “safety vests,” “safety glasses”), making computer vision accessible to operations of all sizes.

conclusion

This post explained how Amazon Nova 2 Lite enables object detection. By specifying object names through natural language prompts, you can deploy computer vision applications in hours instead of months without managing infrastructure. It provides object detection performance through a single API, has a pay-as-you-go cost structure, and requires no machine learning (ML) expertise.

Ready to try it out? Deploy a sample application from the GitHub repository or explore Amazon Nova models in the Amazon Bedrock console.