
Images by the author | Canva
Building machine learning models and trying out new ones is fun, but honestly, your model is only useful to others once you make it available. To do that, it must be published via a web API so that other programs (or humans) can send data and get predictions back. That's where REST APIs come in.
In this article, you'll learn how to go from a simple machine learning model to a production-ready REST API in just under 10 minutes using FastAPI, one of the fastest and most developer-friendly web frameworks in Python. And instead of stopping at a "make it run" demo, we'll add features like:
- Validating incoming data
- Logging every request
- Running background tasks to avoid slowdowns
- Handling errors gracefully
Before moving on to the code, let's take a quick look at the project structure.
ml-api/
│
├── model/
│   ├── train_model.py     # Script to train and save the model
│   └── iris_model.pkl     # Trained model file
│
├── app/
│   ├── main.py            # FastAPI app
│   └── schema.py          # Input data schema using Pydantic
│
├── requirements.txt       # All dependencies
└── README.md              # Optional documentation
Step 1: Install what you need
This project relies on a few libraries: FastAPI for the API, uvicorn to serve it, scikit-learn for the model, joblib for saving and loading it, and Pydantic for input validation. Install them with pip:
pip install fastapi uvicorn scikit-learn joblib pydantic
And save your environment:
pip freeze > requirements.txt
Step 2: Train and save a simple model
Keep the machine learning part simple so you can focus on serving the model. We'll use the famous Iris dataset to train a random forest classifier that predicts the iris species from petal and sepal measurements.
Here is the training script. Create a file called train_model.py in the model/ directory:
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
import joblib, os

# Load the data and split off a held-out test set
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train the classifier on the training split
clf = RandomForestClassifier()
clf.fit(X_train, y_train)

# Save the trained model
os.makedirs("model", exist_ok=True)
joblib.dump(clf, "model/iris_model.pkl")
print("✅ Model saved to model/iris_model.pkl")
This script loads data, splits it, trains the model, and saves it using Joblib. Run it once to generate the model file.
python model/train_model.py
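As a quick sanity check, you can verify that joblib round-trips the model correctly by dumping it to a temporary file, reloading it, and comparing predictions. This small self-contained sketch retrains a classifier rather than touching model/iris_model.pkl:

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Train a small classifier just for the round-trip check
X, y = load_iris(return_X_y=True)
clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# Dump and reload via joblib, using a temporary file
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "iris_model.pkl")
    joblib.dump(clf, path)
    restored = joblib.load(path)

# The reloaded model should make identical predictions
assert np.array_equal(clf.predict(X), restored.predict(X))
print("round-trip OK")
```

If this check passes, the pickle file your API will load at startup behaves exactly like the in-memory model.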
Step 3: Define the inputs the API should expect
Next, you need to define how users interact with the API: what should they send, and in what format?
FastAPI is built on Pydantic, so we use it to create a schema that describes and validates incoming data. Specifically, the schema requires four positive float values: sepal length/width and petal length/width.
Add a new file, app/schema.py:
from pydantic import BaseModel, Field

class IrisInput(BaseModel):
    sepal_length: float = Field(..., gt=0, lt=10)
    sepal_width: float = Field(..., gt=0, lt=10)
    petal_length: float = Field(..., gt=0, lt=10)
    petal_width: float = Field(..., gt=0, lt=10)
Here we added value constraints (greater than 0 and less than 10) to keep inputs clean and realistic.
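To see the validation in action, you can instantiate the schema directly; out-of-range values raise a ValidationError before your endpoint code ever runs. Here's a small sketch that redeclares IrisInput so it runs standalone:

```python
from pydantic import BaseModel, Field, ValidationError

class IrisInput(BaseModel):
    sepal_length: float = Field(..., gt=0, lt=10)
    sepal_width: float = Field(..., gt=0, lt=10)
    petal_length: float = Field(..., gt=0, lt=10)
    petal_width: float = Field(..., gt=0, lt=10)

# A valid measurement parses fine
ok = IrisInput(sepal_length=6.1, sepal_width=2.8, petal_length=4.7, petal_width=1.2)
print(ok.sepal_length)

# A negative length violates gt=0 and is rejected
try:
    IrisInput(sepal_length=-1.0, sepal_width=2.8, petal_length=4.7, petal_width=1.2)
except ValidationError as e:
    print("rejected:", e.errors()[0]["loc"])
```

When the schema is used in a FastAPI endpoint, the same failure is returned to the client automatically as a 422 response.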
Step 4: Create an API
Now it's time to build the actual API with FastAPI. It will:
- Load the model
- Accept JSON input
- Predict the class and probabilities
- Log requests in the background
- Return a clean JSON response
Write the main API code inside app/main.py:
from fastapi import FastAPI, HTTPException, BackgroundTasks
from fastapi.responses import JSONResponse
from app.schema import IrisInput
import numpy as np, joblib, logging

# Load the model
model = joblib.load("model/iris_model.pkl")

# Set up logging
logging.basicConfig(filename="api.log", level=logging.INFO,
                    format="%(asctime)s - %(message)s")

# Create the FastAPI app
app = FastAPI()

@app.post("/predict")
def predict(input_data: IrisInput, background_tasks: BackgroundTasks):
    try:
        # Format the input as a NumPy array
        data = np.array([[input_data.sepal_length,
                          input_data.sepal_width,
                          input_data.petal_length,
                          input_data.petal_width]])

        # Run prediction
        pred = model.predict(data)[0]
        proba = model.predict_proba(data)[0]
        species = ["setosa", "versicolor", "virginica"][pred]

        # Log in the background so it doesn’t block the response
        background_tasks.add_task(log_request, input_data, species)

        # Return prediction and probabilities
        return {
            "prediction": species,
            "class_index": int(pred),
            "probabilities": {
                "setosa": float(proba[0]),
                "versicolor": float(proba[1]),
                "virginica": float(proba[2])
            }
        }
    except Exception as e:
        logging.exception("Prediction failed")
        raise HTTPException(status_code=500, detail="Internal error")

# Background logging task
def log_request(data: IrisInput, prediction: str):
    logging.info(f"Input: {data.dict()} | Prediction: {prediction}")
Let's pause and understand what's going on here.
The app loads the model once, when it starts up. When a user hits the /predict endpoint with valid JSON input, we convert it to a NumPy array, pass it to the model, and return the predicted class and probabilities. If something goes wrong, we log the failure and return a friendly error.
Pay attention to the BackgroundTasks part: it's a neat FastAPI feature that runs work after the response has been sent (such as writing logs). This keeps the API responsive and avoids delays.
Step 5: Run the API
To start the server, use uvicorn as follows:
uvicorn app.main:app --reload
Visit: http://127.0.0.1:8000/docs
You will see an interactive Swagger UI that lets you test the API.
Try this sample input:
{
  "sepal_length": 6.1,
  "sepal_width": 2.8,
  "petal_length": 4.7,
  "petal_width": 1.2
}
Alternatively, you can make the same request with curl:
curl -X POST "http://127.0.0.1:8000/predict" \
  -H "Content-Type: application/json" \
  -d '{
    "sepal_length": 6.1,
    "sepal_width": 2.8,
    "petal_length": 4.7,
    "petal_width": 1.2
  }'
Both produce the same response:
{
  "prediction": "versicolor",
  "class_index": 1,
  "probabilities": {
    "setosa": 0.0,
    "versicolor": 1.0,
    "virginica": 0.0
  }
}
Optional Step: Deploy the API
You can deploy this FastAPI app to platforms such as:
- Render (zero-config deployment)
- Railway (continuous-integration friendly)
- Heroku (via Docker)
You can also extend this into a production-ready service by adding authentication (such as API keys or OAuth) to protect your endpoints, monitoring requests with Prometheus and Grafana, and using Redis or Celery for a background job queue. You can also refer to my article: Step-by-Step Guide to Deploying Machine Learning Models with Docker.
Wrapping Up
That's it, and it's already better than most demos. What we built is more than a toy example. It:
- Automatically validates input data
- Returns meaningful responses with prediction confidence
- Logs every request to a file (api.log)
- Stays fast and responsive thanks to background tasks
- Handles errors gracefully
All of that in under 100 lines of code.
Kanwal Mehreen is a machine learning engineer and a technical writer with a deep passion for the intersection of data science, AI, and medicine. She co-authored the ebook "Maximizing Productivity with ChatGPT." As a Google Generation Scholar 2022 for APAC, she champions diversity and academic excellence. She is also recognized as a Teradata Diversity in Tech Scholar, Mitacs Globalink Research Scholar, and Harvard WeCode Scholar. Kanwal is an ardent advocate for change, having founded FEMCodes to empower women in STEM fields.
