Today, DBRX, an open, general-purpose large language model (LLM) developed by Databricks, is available for customers to deploy with one click and run inference through Amazon SageMaker JumpStart. The DBRX LLM employs a fine-grained mixture-of-experts (MoE) architecture and was pre-trained on 12 trillion tokens of carefully curated data with a maximum context length of 32,000 tokens.
You can try this model using SageMaker JumpStart, a machine learning (ML) hub that provides access to algorithms and models to get started with ML. This post describes how to discover and deploy DBRX models.
What is the DBRX model?
DBRX is a sophisticated decoder-only LLM built on a transformer architecture. It employs a fine-grained MoE architecture with a total of 132 billion parameters, of which 36 billion are active for any input.
The model was pre-trained on a dataset of 12 trillion tokens of text and code. In contrast to other open MoE models such as Mixtral and Grok-1, DBRX takes a fine-grained approach, using a larger number of smaller experts to optimize performance. Compared to other MoE models, DBRX has 16 experts and selects 4 of them for each input.
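To make the fine-grained claim concrete, the short calculation below counts how many distinct expert subsets a router can pick for one input. The DBRX figures (16 experts, 4 selected, 132 billion total and 36 billion active parameters) come from this post. The baseline of 8 experts with 2 selected is the configuration commonly cited for Mixtral and Grok-1 and is treated here as an assumption.

```python
from math import comb

# DBRX figures from this post: 16 experts, 4 selected per input.
DBRX_EXPERTS, DBRX_ACTIVE = 16, 4
# Assumed baseline: 8 experts with 2 selected, as commonly cited for Mixtral and Grok-1.
BASELINE_EXPERTS, BASELINE_ACTIVE = 8, 2

# Number of distinct expert subsets the router can choose from.
dbrx_combinations = comb(DBRX_EXPERTS, DBRX_ACTIVE)
baseline_combinations = comb(BASELINE_EXPERTS, BASELINE_ACTIVE)
ratio = dbrx_combinations / baseline_combinations

# Share of parameters that are active for any single input (36B of 132B).
active_fraction = 36 / 132

print(f"DBRX expert combinations: {dbrx_combinations}")
print(f"Baseline expert combinations: {baseline_combinations}")
print(f"Ratio: {ratio:.0f}x")
print(f"Active parameter share: {active_fraction:.1%}")
```

Under these assumptions, the router has 1,820 possible expert subsets to choose from, compared with 28 for the baseline, while only about 27% of the parameters are active for a given input.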
This model is made available for unrestricted use under the Databricks Open Model License.
What is SageMaker JumpStart?
SageMaker JumpStart is a fully managed platform that offers state-of-the-art foundation models for a variety of use cases, including content writing, code generation, question answering, copywriting, summarization, classification, and information retrieval. It accelerates the development and deployment of ML applications by providing a collection of pre-trained models that you can deploy quickly and easily. One of the key components of SageMaker JumpStart is the Model Hub, which offers a large catalog of pre-trained models, such as DBRX, for a variety of tasks.
You can now discover and deploy DBRX models with a few clicks in Amazon SageMaker Studio or programmatically through the SageMaker Python SDK. This lets you derive model performance and MLOps controls with Amazon SageMaker features such as Amazon SageMaker Pipelines, Amazon SageMaker Debugger, and container logs. The model is deployed in a secure AWS environment and under your VPC controls, which helps provide data security.
Discover models with SageMaker JumpStart
DBRX models can be accessed through SageMaker JumpStart in the SageMaker Studio UI and the SageMaker Python SDK. This section describes how to discover models in SageMaker Studio.
SageMaker Studio is an integrated development environment (IDE) that provides a single web-based visual interface where you can access purpose-built tools for every ML development step, from preparing data to building, training, and deploying your ML models. For more information about how to get started and set up SageMaker Studio, see Amazon SageMaker Studio.
In SageMaker Studio, you can access SageMaker JumpStart by choosing JumpStart in the navigation pane.

From the SageMaker JumpStart landing page, you can search for “DBRX” in the search box. The search results will list DBRX Instruct and DBRX Base.

Choose a model card to view details about the model, such as its license, the data used for training, and how to use the model. You will also find the Deploy button, which deploys the model and creates an endpoint.
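The same discovery step can also be approximated from the SageMaker Python SDK. The sketch below is a hedged illustration: it assumes that `list_jumpstart_models` from `sagemaker.jumpstart.notebook_utils` returns model ID strings, and it keeps that call inside a function because it needs the SDK, AWS credentials, and a Region. The helper names `filter_model_ids` and `list_dbrx_models` are made up for this example.

```python
from typing import Iterable, List


def filter_model_ids(model_ids: Iterable[str], keyword: str = "dbrx") -> List[str]:
    """Return the model IDs that contain the keyword (case-insensitive), sorted."""
    needle = keyword.lower()
    return sorted(model_id for model_id in model_ids if needle in model_id.lower())


def list_dbrx_models() -> List[str]:
    """List JumpStart model IDs that mention DBRX.

    Needs the SageMaker Python SDK plus AWS credentials and a Region,
    so the import is kept inside the function.
    """
    from sagemaker.jumpstart.notebook_utils import list_jumpstart_models

    return filter_model_ids(list_jumpstart_models())


# Offline illustration with a hand-written list. Only the first ID appears in
# this post; the other two are hypothetical placeholders.
sample_ids = [
    "huggingface-llm-dbrx-base",
    "example-llm-other-model",
    "example-text-classifier",
]
print(filter_model_ids(sample_ids))
```

Calling `list_dbrx_models()` from a notebook with SageMaker access should then return the DBRX model IDs that are available in your Region.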

Deploy the model with SageMaker JumpStart
To start deployment, choose the Deploy button. When deployment is complete, you will see that an endpoint has been created. You can test the endpoint by passing a sample inference request payload or by selecting the testing option that uses the SDK. If you select the option to use the SDK, you will see example code that you can use in the notebook editor of your choice in SageMaker Studio.
DBRX Base
To deploy using the SDK, start by selecting the DBRX Base model, specified by the model_id with the value huggingface-llm-dbrx-base. You can deploy any of the selected models on SageMaker using the following code. Similarly, you can deploy DBRX Instruct using its own model ID.
from sagemaker.jumpstart.model import JumpStartModel
accept_eula = True
model = JumpStartModel(model_id="huggingface-llm-dbrx-base")
predictor = model.deploy(accept_eula=accept_eula)
This deploys the model on SageMaker with default configurations, including the default instance type and default VPC configuration. You can change these configurations by specifying non-default values in JumpStartModel. The accept_eula value must be explicitly set to True to accept the end-user license agreement (EULA). Also, make sure your account-level service limit for endpoint usage allows one or more ml.p4d.24xlarge or ml.p4de.24xlarge instances. If it does not, you can request a service quota increase through the Service Quotas console.
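As a sketch of what specifying non-default values can look like, the example below passes an explicit `instance_type` to `JumpStartModel`. The helper names `build_model_kwargs` and `deploy_dbrx` are hypothetical, the default instance type is chosen only for illustration, and the allow-list simply mirrors the two instance types named in this post rather than anything published by the SDK. The deployment call is wrapped in a function because it creates billable resources.

```python
from typing import Any, Dict

# Instance types named in this post. Treating them as an allow-list is an
# assumption made for this sketch, not a list published by the SDK.
DBRX_INSTANCE_TYPES = ("ml.p4d.24xlarge", "ml.p4de.24xlarge")


def build_model_kwargs(
    model_id: str = "huggingface-llm-dbrx-base",
    instance_type: str = "ml.p4d.24xlarge",
) -> Dict[str, Any]:
    """Validate the instance type and return keyword arguments for JumpStartModel."""
    if instance_type not in DBRX_INSTANCE_TYPES:
        raise ValueError(
            f"Unexpected instance type {instance_type!r}; expected one of {DBRX_INSTANCE_TYPES}."
        )
    return {"model_id": model_id, "instance_type": instance_type}


def deploy_dbrx(**overrides: Any):
    """Deploy the model with non-default settings and return the predictor.

    Calling this creates billable AWS resources. It needs the SageMaker Python
    SDK, AWS credentials, and sufficient service quota.
    """
    kwargs = build_model_kwargs(**overrides)
    from sagemaker.jumpstart.model import JumpStartModel

    model = JumpStartModel(**kwargs)
    return model.deploy(accept_eula=True)


print(build_model_kwargs(instance_type="ml.p4de.24xlarge"))
```

Validating the arguments before importing the SDK means a typo in the instance type fails fast, before any AWS call is made.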
After deployment, you can perform inference on the deployed endpoints via SageMaker predictors.
payload = {
"inputs": "Hello!",
"parameters": {
"max_new_tokens": 10,
},
}
predictor.predict(payload)
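Because every request in this post has the same shape, it can be convenient to wrap payload construction and response handling in small helpers. The following is a minimal sketch, assuming the endpoint returns a dictionary with a `generated_text` key, which is how the examples in this post index the response. `build_payload`, `generate`, and `EchoPredictor` are hypothetical names; `EchoPredictor` only stands in for a deployed predictor so the example runs offline.

```python
from typing import Any, Dict


def build_payload(prompt: str, max_new_tokens: int = 10, **parameters: Any) -> Dict[str, Any]:
    """Build a request body in the shape used throughout this post."""
    if max_new_tokens < 1:
        raise ValueError("max_new_tokens must be a positive integer")
    return {"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens, **parameters}}


def generate(predictor: Any, prompt: str, **parameters: Any) -> str:
    """Send the prompt to the endpoint and return the stripped generated text."""
    response = predictor.predict(build_payload(prompt, **parameters))
    return response["generated_text"].strip()


class EchoPredictor:
    """Hypothetical stand-in for a deployed predictor so the sketch runs offline."""

    def predict(self, payload: Dict[str, Any]) -> Dict[str, str]:
        return {"generated_text": f"  echo: {payload['inputs']}  "}


print(build_payload("Hello!"))
print(generate(EchoPredictor(), "Hello!", max_new_tokens=5))
```

With a real endpoint, you would pass the `predictor` returned by `model.deploy` instead of `EchoPredictor()`.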
Example prompts
You can interact with the DBRX Base model as you would with any standard text generation model: the model processes an input sequence and outputs the predicted next words in the sequence. This section provides some example prompts and sample output.
Code generation
Using the previous example, you can use the code generation prompt as follows:
payload = {
"inputs": "Write a function to read a CSV file in Python using pandas library:",
"parameters": {
"max_new_tokens": 30,
},
}
response = predictor.predict(payload)["generated_text"].strip()
print(response)
The output is:
import pandas as pd
df = pd.read_csv("file_name.csv")
#The above code will import pandas library and then read the CSV file using read_csv
Sentiment analysis
You can perform sentiment analysis with DBRX using a prompt like the following:
payload = {
"inputs": """
Tweet: "I am so excited for the weekend!"
Sentiment: Positive
Tweet: "Why does traffic have to be so terrible?"
Sentiment: Negative
Tweet: "Just saw a great movie, would recommend it."
Sentiment: Positive
Tweet: "According to the weather report, it will be cloudy today."
Sentiment: Neutral
Tweet: "This restaurant is absolutely terrible."
Sentiment: Negative
Tweet: "I love spending time with my family."
Sentiment:""",
"parameters": {
"max_new_tokens": 2,
},
}
response = predictor.predict(payload)["generated_text"].strip()
print(response)
The output is:
Question answering
You can use a question answering prompt like the following with DBRX:
# Question answering
payload = {
"inputs": "Respond to the question: How did the development of transportation systems, such as railroads and steamships, impact global trade and cultural exchange?",
"parameters": {
"max_new_tokens": 225,
},
}
response = predictor.predict(payload)["generated_text"].strip()
print(response)
The output is:
The development of transportation systems, such as railroads and steamships, impacted global trade and cultural exchange in a number of ways.
The documents provided show that the development of these systems had a profound effect on the way people and goods were able to move around the world.
One of the most significant impacts of the development of transportation systems was the way it facilitated global trade.
The documents show that the development of railroads and steamships made it possible for goods to be transported more quickly and efficiently than ever before.
This allowed for a greater exchange of goods between different parts of the world, which in turn led to a greater exchange of ideas and cultures.
Another impact of the development of transportation systems was the way it facilitated cultural exchange. The documents show that the development of railroads and steamships made it possible for people to travel more easily and quickly than ever before.
This allowed for a greater exchange of ideas and cultures between different parts of the world. Overall, the development of transportation systems, such as railroads and steamships, had a profound impact on global trade and cultural exchange.
DBRX Instruct
The instruction-tuned version of DBRX accepts formatted instructions in which conversation roles must start with a prompt from the user and alternate between user instructions and the assistant (DBRX-instruct). The instruction format must be strictly respected, otherwise the model will generate suboptimal output. The template for building a prompt for the Instruct model is defined as follows:
<|im_start|>system
{system_message} <|im_end|>
<|im_start|>user
{human_message} <|im_end|>
<|im_start|>assistant\n
<|im_start|> and <|im_end|> are special tokens for beginning of string (BOS) and end of string (EOS). The prompt can contain multiple conversation turns between the system, user, and assistant, which lets you include few-shot examples to improve the model's responses.
The following code shows how to format a prompt in the instruction format.
from typing import Dict, List
def format_instructions(instructions: List[Dict[str, str]]) -> str:
    """Format instructions where conversation roles must alternate system/user/assistant/user/assistant/..."""
    prompt: List[str] = []
    for instruction in instructions:
        if instruction["role"] == "system":
            prompt.extend(["<|im_start|>system\n", (instruction["content"]).strip(), " <|im_end|>\n"])
        elif instruction["role"] == "user":
            prompt.extend(["<|im_start|>user\n", (instruction["content"]).strip(), " <|im_end|>\n"])
        else:
            raise ValueError(f"Invalid role: {instruction['role']}. Role must be either 'user' or 'system'.")
    # Leave the assistant turn open so the model generates the reply.
    prompt.extend(["<|im_start|>assistant\n"])
    return "".join(prompt)
def print_instructions(prompt: str, response: Dict[str, str]) -> None:
    # ANSI escape codes for bold terminal output.
    bold, unbold = '\033[1m', '\033[0m'
    print(f"{bold}> Input{unbold}\n{prompt}\n\n{bold}> Output{unbold}\n{response['generated_text'].strip()}\n")
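The formatter above accepts only system and user messages, while the template also allows earlier assistant turns. The following sketch extends the idea to multi-turn conversations. It assumes that a past assistant turn is rendered in the same way as a system or user turn, which this post does not spell out, and `format_chat` is a hypothetical name rather than part of any SDK.

```python
from typing import Dict, List

# Roles that may appear in the conversation history.
ALLOWED_ROLES = ("system", "user", "assistant")


def format_chat(messages: List[Dict[str, str]]) -> str:
    """Render a multi-turn conversation in the <|im_start|>/<|im_end|> template.

    After any system message, turns must alternate user/assistant and end with
    a user message. The prompt ends with an open assistant turn to complete.
    """
    parts: List[str] = []
    expected = "user"  # the first non-system turn must come from the user
    for message in messages:
        role, content = message["role"], message["content"].strip()
        if role not in ALLOWED_ROLES:
            raise ValueError(f"Invalid role: {role}. Expected one of {ALLOWED_ROLES}.")
        if role != "system":
            if role != expected:
                raise ValueError(f"Expected a {expected} turn but got {role}.")
            expected = "assistant" if role == "user" else "user"
        # Assumption: assistant history uses the same layout as other turns.
        parts.append(f"<|im_start|>{role}\n{content} <|im_end|>\n")
    if expected != "assistant":
        raise ValueError("The conversation must end with a user message.")
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


history = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "Name a Python plotting library."},
    {"role": "assistant", "content": "Matplotlib."},
    {"role": "user", "content": "Show the import statement for it."},
]
print(format_chat(history))
```

Checking the alternation up front surfaces a malformed history as an error instead of as degraded model output.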
Knowledge retrieval
You can use the following prompt for knowledge retrieval:
instructions = [
{"role": "system", "content": "You are an expert in suggesting diet plans optimized for different sports or fitness activities, taking into account energy requirements, and recovery requirements."},
{"role": "user", "content": "I am trying to build muscle mass and increase my strength. What kind of high-protein, calorie-dense diet plan would you recommend for me?"}
]
prompt = format_instructions(instructions)
payload = {
"inputs": prompt,
"parameters": {"max_new_tokens": 700, "temperature": 0.5, "do_sample": True}
}
response=predictor.predict(payload)
print_instructions(prompt, response)
The output is:
> Input
<|im_start|>system
You are an expert in suggesting diet plans optimized for different sports or fitness activities, taking into account energy requirements, and recovery requirements. <|im_end|>
<|im_start|>user
I am trying to build muscle mass and increase my strength. What kind of high-protein, calorie-dense diet plan would you recommend for me? <|im_end|>
<|im_start|>assistant
> Output
Absolutely, I'd be happy to help you with that. Given your goal of building muscle mass and increasing strength, you'll want a diet that's high in protein and provides sufficient calories to support muscle growth and recovery. Here's a basic outline of what your diet plan could look like:
1. **Breakfast:**
- 3 whole eggs
- 3 egg whites
- 1 cup of oatmeal
- 1 scoop of whey protein
- 1 banana
2. **Mid-Morning Snack:**
- 1.5 ounces of almonds
- 1 apple
3. **Lunch:**
- 8 ounces of chicken breast
- 1 cup of brown rice
- 1 cup of mixed vegetables
4. **Afternoon Snack:**
- 1.5 cups of Greek yogurt
- 1 cup of mixed berries
5. **Dinner:**
- 8 ounces of salmon or lean beef
- 1 cup of quinoa
- 1.5 cups of mixed vegetables
6. **Evening Snack:**
- 1 scoop of casein protein
- 1 tablespoon of almond butter
This is a high-protein, calorie-dense diet that includes a variety of nutrient-rich foods. It's important to note that everyone's nutritional needs are different, so you may need to adjust portion sizes or food choices based on your specific needs and preferences.
Also, remember to stay hydrated by drinking plenty of water throughout the day. Regular exercise, particularly strength training, is crucial to achieving your goals. It's also important to get enough sleep to aid in muscle recovery and growth.
Before starting any new diet or exercise program, it's a good idea to consult with a healthcare professional or a registered dietitian. They can provide personalized advice based on your health history and current lifestyle.
Code generation
The DBRX model demonstrates benchmarked strengths for coding tasks. For example, see the following code.
instructions = [
{"role": "system", "content": "You are an expert in coding and programming. Your role is to provide examples, explanations, and insights related to programming languages, algorithms, data structures, and software development practices."},
{"role": "user", "content": "Write a Python script to read a CSV file containing stock prices and plot the closing prices over time using Matplotlib."}
]
prompt = format_instructions(instructions)
payload = { "inputs": prompt, "parameters": {"max_new_tokens": 315, "temperature": 0.3, "do_sample": True} }
response=predictor.predict(payload)
print_instructions(prompt, response)
The output is:
> Input
<|im_start|>system
You are an expert in coding and programming. Your role is to provide examples, explanations, and insights related to programming languages, algorithms, data structures, and software development practices. <|im_end|>
<|im_start|>user
Write a Python script to read a CSV file containing stock prices and plot the closing prices over time using Matplotlib. <|im_end|>
<|im_start|>assistant
> Output
Sure, I'd be happy to help you with that. Here's a simple Python script that uses the `pandas` library to read a CSV file and `matplotlib` to plot the closing prices over time.
```python
import pandas as pd
import matplotlib.pyplot as plt
# Load the data from a CSV file
df = pd.read_csv('stock_prices.csv')
# Plot the closing prices over time
plt.figure(figsize=(12,5))
plt.plot(df['Date'], df['Close'])
plt.xlabel('Date')
plt.ylabel('Price')
plt.title('Stock Prices Over Time')
plt.show()
```
In this script, replace `'stock_prices.csv'` with the path to your CSV file. The file should have columns named 'Date' and 'Close' for this script to work correctly. If your file uses different column names, you'll need to adjust the script accordingly.
The `pandas` library is a powerful tool for data manipulation and analysis. It can read data from a variety of sources, including CSV files, and represent the data in a tabular format. The `matplotlib` library is a popular choice for creating static, animated, and interactive visualizations in Python.
This script first loads the data from the CSV file into a pandas DataFrame. It then plots the 'Close' column against the 'Date' column using matplotlib's `plot()` function. The `figure()` function is used to specify the size of the plot, and `show()` is used to display the plot.
Mathematics and reasoning
The DBRX model also reports strengths in mathematical accuracy. For example, see the following code.
instructions = [
{"role": "system", "content": "You are an expert in mathematics and reasoning. Your role is to provide examples, explanations, and insights related to mathematical concepts, problem-solving techniques, and logical reasoning."},
{"role": "user", "content": "A car rental company charges $40 per day plus $0.25 per mile driven. If I rented a car for 3 days and drove 250 miles, what would be the total cost? Break down the calculation steps before giving the final answer."}
]
prompt = format_instructions(instructions)
payload = {
"inputs": prompt,
"parameters": {"max_new_tokens": 400, "temperature": 0.2, "do_sample": True}
}
response=predictor.predict(payload)
print_instructions(prompt, response)
As the following output shows, DBRX breaks the problem down and works through the math step by step:
> Input
<|im_start|>system
You are an expert in mathematics and reasoning. Your role is to provide examples, explanations, and insights related to mathematical concepts, problem-solving techniques, and logical reasoning. <|im_end|>
<|im_start|>user
A car rental company charges $40 per day plus $0.25 per mile driven. If I rented a car for 3 days and drove 250 miles, what would be the total cost? Break down the calculation steps before giving the final answer. <|im_end|>
<|im_start|>assistant
> Output
Sure, I'd be happy to help with that! The problem you've presented involves both fixed costs (the daily rate) and variable costs (the cost per mile). Let's break it down:
1. First, let's calculate the total cost of the fixed daily rate. The car rental company charges $40 per day, and you rented the car for 3 days. So, the total cost for the daily rate is $40/day * 3 days = $120.
2. Next, let's calculate the total cost of the variable rate, which is the cost per mile driven. The company charges $0.25 per mile, and you drove 250 miles. So, the total cost for the mileage is $0.25/mile * 250 miles = $62.50.
3. Finally, to find the total cost of renting the car, we add the total cost of the daily rate and the total cost of the mileage. So, the total cost is $120 + $62.50 = $182.50.
So, the total cost of renting the car for 3 days and driving 250 miles would be $182.50.
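The arithmetic in the model's answer is easy to check independently. The snippet below recomputes the figures from the prompt; the variable names are chosen for this example.

```python
# Figures taken from the prompt above.
daily_rate = 40.00     # dollars per day
per_mile_rate = 0.25   # dollars per mile
days = 3
miles = 250

fixed_cost = daily_rate * days            # daily charge for the rental period
variable_cost = per_mile_rate * miles     # mileage charge
total_cost = fixed_cost + variable_cost   # sum of both parts

print(f"Fixed: ${fixed_cost:.2f}, variable: ${variable_cost:.2f}, total: ${total_cost:.2f}")
```

The recomputed total of $182.50 matches the model's answer.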
Clean up
Once your notebook has finished running, be sure to delete any resources you created during the process so that billing will stop. Use the following code:
predictor.delete_model()
predictor.delete_endpoint()
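If the first call fails, the second call never runs and the endpoint keeps running. A slightly more defensive variant is sketched below: it attempts both deletions in the same order and collects any errors instead of stopping at the first one. `clean_up` and `FakePredictor` are hypothetical names; `FakePredictor` exists only so the sketch runs without AWS access.

```python
from typing import Any, List


def clean_up(predictor: Any) -> List[str]:
    """Attempt to delete the model and then the endpoint.

    Returns a list of error messages; an empty list means both steps succeeded.
    """
    errors: List[str] = []
    for step in ("delete_model", "delete_endpoint"):
        try:
            getattr(predictor, step)()
        except Exception as exc:  # keep going so the endpoint deletion is still attempted
            errors.append(f"{step}: {exc}")
    return errors


class FakePredictor:
    """Hypothetical stand-in for a deployed predictor."""

    def __init__(self, fail_model: bool = False) -> None:
        self.fail_model = fail_model
        self.calls: List[str] = []

    def delete_model(self) -> None:
        self.calls.append("delete_model")
        if self.fail_model:
            raise RuntimeError("model not found")

    def delete_endpoint(self) -> None:
        self.calls.append("delete_endpoint")


print(clean_up(FakePredictor()))
print(clean_up(FakePredictor(fail_model=True)))
```

After running it against a real predictor, it is still worth confirming in the SageMaker console that no endpoint remains.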
Conclusion
In this post, we showed you how to get started with DBRX in SageMaker Studio and deploy the model for inference. Because the model is pre-trained, it can help lower training and infrastructure costs while still allowing customization for your use case. Visit SageMaker JumpStart in SageMaker Studio to get started today.
About the authors
Shikhar Kwatra is an AI/ML Specialist Solutions Architect at Amazon Web Services, working with leading global systems integrators. He has earned the title of one of India's youngest master inventors, with over 400 patents in the AI/ML and IoT domains. He has over 8 years of industry experience, from startups to large enterprises, in roles ranging from IoT Research Engineer and Data Scientist to Data & AI Architect. Shikhar helps organizations design, build, and maintain cost-effective, scalable cloud environments and helps GSI partners build strategic industry solutions.
Niithiyn Vijeswaran is a Solutions Architect at AWS. His areas of focus are generative AI and AWS AI accelerators. He holds a Bachelor's degree in Computer Science and Bioinformatics. Niithiyn works closely with the Generative AI GTM team to support AWS customers on a variety of fronts and accelerate their adoption of generative AI. He is an avid Dallas Mavericks fan and enjoys collecting sneakers.
Sebastian Bustillo is a Solutions Architect at AWS. He focuses on AI/ML technologies, with a deep passion for generative AI and compute accelerators. At AWS, he helps customers unlock business value through generative AI. When he's not working, he enjoys brewing the perfect specialty coffee and exploring the world with his wife.
Armando Diaz is a Solutions Architect at AWS. He focuses on generative AI, AI/ML, and data analytics. At AWS, Armando helps customers integrate cutting-edge generative AI capabilities into their systems to drive innovation and competitive advantage. When he is not working, he enjoys spending time with his wife and family, hiking, and traveling around the world.