
Hugging Face has developed a new serialization format called Safetensors, which aims to simplify and streamline the storage and loading of large and complex tensors. Tensors are the primary data structure in deep learning, and their size can pose efficiency challenges.
Safetensors uses a combination of efficient serialization and compression algorithms to reduce the size of large tensors, making it faster and more efficient than other serialization formats such as pickle. Compared to pytorch_model.bin, the traditional PyTorch serialization format, model.safetensors loads 76.6x faster on CPU and 2x faster on GPU. Check out the speed comparison.
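If you want to verify these numbers on your own hardware, a minimal timing sketch looks something like this (both file names are hypothetical and assume you have already saved the same weights in each format):

import time

import torch
from safetensors.torch import load_file

start = time.perf_counter()
state_dict = torch.load("pytorch_model.bin")  # pickle-based format
print(f"pickle load: {time.perf_counter() - start:.3f}s")

start = time.perf_counter()
state_dict = load_file("model.safetensors")  # safetensors format
print(f"safetensors load: {time.perf_counter() - start:.3f}s")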
Ease of use
Safetensors has a simple and intuitive API for serializing and deserializing tensors in Python, so developers can focus on building deep learning models instead of spending time on serialization and deserialization.
Cross-platform compatibility
You can serialize tensors in Python and easily load the resulting files in various programming languages and platforms, such as C++, Java, and JavaScript. This allows seamless sharing of models across different programming environments.
Speed
Safetensors is optimized for speed and can efficiently handle the serialization and deserialization of large tensors. This makes it an excellent choice for applications that use large language models.
Size optimization
It combines effective serialization and compression algorithms to reduce the size of large tensors, resulting in faster and more efficient performance compared to other serialization formats such as pickle.
Safety
To prevent corruption of serialized tensors during storage or transfer, Safetensors uses a checksum mechanism. This adds an extra layer of security, ensuring that data stored with Safetensors is accurate and reliable. It also helps prevent DoS attacks.
Lazy loading
When working in a distributed setup with multiple nodes or GPUs, it is useful to load only some of the tensors on each device. BLOOM takes advantage of this format to load the model on 8 GPUs in just 45 seconds, compared to 10 minutes with regular PyTorch weights.
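As a minimal sketch of what partial loading looks like with the safetensors Python API (using the model.safetensors file we create in the next section; the tensor name assumes the model defined there):

from safetensors import safe_open

# Open the file without reading every tensor into memory.
with safe_open("model.safetensors", framework="pt", device="cpu") as f:
    print(f.keys())                    # list tensor names without loading them
    weight = f.get_tensor("a.weight")  # load only the tensor you need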
This section describes the safetensors API and shows how to save and load tensor files with it.
You can easily install safetensors using the pip package manager:
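pip install safetensors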
Following the Torch shared tensors example, we will build a simple neural network and save the model using the safetensors.torch API for PyTorch.
from torch import nn

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.a = nn.Linear(100, 100)
        self.b = self.a  # shared tensor: b points to the same layer as a

    def forward(self, x):
        return self.b(self.a(x))

model = Model()
print(model.state_dict())
As you can see, the model was created successfully.
OrderedDict([('a.weight', tensor([[-0.0913, 0.0470, -0.0209, ..., -0.0540, -0.0575, -0.0679], [ 0.0268, 0.0765, 0.0952, ..., -0.0616, 0.0146, -0.0343], [ 0.0216, 0.0444, -0.0347, ..., -0.0546, 0.0036, -0.0454], ...,
Next, save the model by providing the model object and the file name. After that, load the saved file back into the model object created with nn.Module.
from safetensors.torch import load_model, save_model

save_model(model, "model.safetensors")  # write the model's tensors to disk
load_model(model, "model.safetensors")  # restore the weights into the model
print(model.state_dict())
OrderedDict([('a.weight', tensor([[-0.0913, 0.0470, -0.0209, ..., -0.0540, -0.0575, -0.0679], [ 0.0268, 0.0765, 0.0952, ..., -0.0616, 0.0146, -0.0343], [ 0.0216, 0.0444, -0.0347, ..., -0.0546, 0.0036, -0.0454], ...,
In the second example, we will save tensors created with torch.zeros. For that, we use the save_file function.
import torch
from safetensors.torch import save_file, load_file

# A dictionary of named tensors to serialize.
tensors = {
    "weight1": torch.zeros((1024, 1024)),
    "weight2": torch.zeros((1024, 1024)),
}
save_file(tensors, "new_model.safetensors")
To load the tensors back, use the load_file function.
load_file("new_model.safetensors")
{'weight1': tensor([[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
...,
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.]]),
'weight2': tensor([[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
...,
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.]])}
Safetensors APIs are available for PyTorch, TensorFlow, PaddlePaddle, Flax, and NumPy. You can learn more by reading the Safetensors documentation.
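For instance, here is a minimal sketch of the NumPy API, mirroring the PyTorch example above (the array and file names are made up for illustration):

import numpy as np
from safetensors.numpy import save_file, load_file

arrays = {"embedding": np.zeros((512, 512), dtype=np.float32)}
save_file(arrays, "numpy_arrays.safetensors")  # serialize NumPy arrays
loaded = load_file("numpy_arrays.safetensors")  # read them back
print(loaded["embedding"].shape)  # (512, 512)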

In short, Safetensors is a new way to store the large tensors used in deep learning applications. Compared to other techniques, it offers faster, more efficient, and more user-friendly serialization, while ensuring the safety and integrity of your data and supporting a variety of programming languages and platforms. Safetensors lets machine learning engineers save time and focus on building better models.
We highly recommend using Safetensors in your projects. Many top AI companies, such as Hugging Face, EleutherAI, and Stability AI, already rely on it.
Abid Ali Awan (@1abidaliawan) is a certified data scientist professional who loves building machine learning models. Currently, he is focusing on content creation and writing technical blogs on machine learning and data science technologies. Abid holds a Master's degree in Technology Management and a Bachelor's degree in Telecommunication Engineering. His vision is to build an AI product using a graph neural network for students struggling with mental illness.
