Privacy-first AI training on everyday devices

A new method developed by MIT researchers can speed up the training of privacy-preserving artificial intelligence by about 81 percent. This advancement will enable more accurate AI models to be deployed across a wide range of resource-constrained edge devices, such as sensors and smartwatches, while keeping user data secure.

MIT researchers have increased the efficiency of a technology known as federated learning, which involves a network of connected devices working together to train a shared AI model.

With federated learning, models are broadcast from a central server to wireless devices. Each device trains the model on its local data and sends only the resulting model updates back to the server, so raw user data never leaves the device.
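For context, here is a minimal sketch of one such round in Python. The function names and the placeholder gradient are illustrative assumptions, not the researchers' code; the sketch only shows the data flow in a standard federated averaging round.

```python
import numpy as np

def train_locally(global_params, local_data, lr=0.01):
    """Device-side step: start from the broadcast model and adjust it on local data."""
    grad = np.zeros_like(global_params)       # placeholder for a gradient computed on local_data
    return global_params - lr * grad          # only the updated parameters leave the device

def federated_round(global_params, all_local_data):
    """Server side: broadcast the model, collect an update from every device, then average them."""
    updates = [train_locally(global_params, data) for data in all_local_data]
    return np.mean(updates, axis=0)           # FedAvg-style aggregation

params = np.zeros(128)
params = federated_round(params, [None] * 5)  # five hypothetical devices, each with its own data
```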

However, not all devices in the network have sufficient capacity, computational power, or connectivity to store, train, and communicate models to and from servers in a timely manner. This causes delays and reduces training performance.

MIT researchers have developed a technique to overcome these memory constraints and communication bottlenecks. Their method is designed to handle heterogeneous networks of wireless devices with various limitations.

This new approach could make it more feasible to use AI models in high-stakes applications with strict security and privacy standards, such as healthcare and finance.

“This research aims to bring AI to small devices that currently cannot run these kinds of powerful models. We carry these devices around with us in our daily lives. We need to be able to run AI on these devices, not just huge servers or GPUs, and this research is an important step toward making that possible,” said Irene Tenison, a graduate student in electrical engineering and computer science (EECS) and lead author of a paper on the technology.

Her co-authors include Anna Murphy ’25, a machine learning engineer at Lincoln Laboratory, and Charles Beauville, a visiting student from Ecole Polytechnique Fédérale de Lausanne (EPFL) in Switzerland and a machine learning engineer at Flower Lab. The senior author is Lalana Kagal, a principal investigator at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL). The research will be presented at the IEEE International Joint Conference on Neural Networks.

Reduced lag time

Many federated learning approaches assume that all devices in the network have enough memory to train a complete AI model and a stable connection to quickly send updates to the server.

However, these assumptions do not hold true in networks of heterogeneous devices such as smartwatches, wireless sensors, and mobile phones. These edge devices have limited memory and computing power and often face intermittent network connectivity.

A central server typically waits to receive model updates from all devices and then averages them to complete a training round. This process is repeated until training is complete.

“This time lag can slow down the training procedure or even cause it to fail,” Tenison says.

To overcome these limitations, researchers at MIT have developed a new framework called FTTE (Federated Tiny Training Engine) that reduces the memory and communication overhead required on each mobile device.

Their framework includes three major innovations.

First, FTTE does not broadcast the entire model to all devices, but instead sends a smaller subset of model parameters, reducing memory requirements on each device. Parameters are internal variables that the model adjusts during training.

FTTE uses a special search procedure to identify the parameters that maximize model accuracy while staying within a given memory budget. That budget is set by the most memory-constrained device in the network.
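As an illustration of the idea, the sketch below greedily selects parameter blocks under a fixed memory budget. The scoring and the greedy rule are assumptions made for exposition, not FTTE's actual search procedure.

```python
import numpy as np

def select_subset(block_bytes, scores, budget_bytes):
    """Greedily keep the parameter blocks with the best estimated value per byte
    until the memory budget of the most constrained device is exhausted."""
    order = np.argsort(-(scores / block_bytes))           # best accuracy-per-byte first
    chosen, used = [], 0
    for i in order:
        if used + block_bytes[i] <= budget_bytes:
            chosen.append(int(i))
            used += int(block_bytes[i])
    return chosen

block_bytes = np.array([4_000, 16_000, 64_000, 8_000])    # size of each parameter block
scores = np.array([0.9, 0.5, 0.7, 0.4])                   # assumed accuracy-contribution estimates
print(select_subset(block_bytes, scores, budget_bytes=24_000))   # -> [0, 3]
```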

Second, the server updates the model asynchronously. Rather than waiting for responses from all devices, it accumulates incoming updates until a buffer reaches a certain capacity and then completes the training round.
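A rough sketch of such buffered, semi-asynchronous aggregation might look like the following. The class name, buffer capacity, and plain averaging are assumptions for illustration, not FTTE's actual server logic.

```python
import numpy as np

class BufferedServer:
    """Aggregate as soon as `capacity` updates have arrived, instead of waiting for all devices."""
    def __init__(self, params, capacity=4):
        self.params = params
        self.capacity = capacity
        self.buffer = []

    def receive(self, update):
        self.buffer.append(update)
        if len(self.buffer) >= self.capacity:          # buffer full: finish the round now
            self.params = np.mean(self.buffer, axis=0)
            self.buffer.clear()

server = BufferedServer(np.zeros(128))
for _ in range(10):                                    # updates trickle in from faster devices first
    server.receive(np.random.randn(128) * 0.01)
```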

Third, the server weights updates from each device based on when they are received, so that older updates contribute less to training. Left unweighted, such stale updates can degrade the model, slowing training and reducing accuracy.
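One simple way to implement that weighting, shown here purely as an assumed illustration rather than the paper's actual rule, is to discount each update by how many rounds old the model it was trained on is.

```python
import numpy as np

def staleness_weight(current_round, update_round, decay=0.5):
    """Weight 1.0 for an update trained on the current model, smaller the older it is."""
    staleness = current_round - update_round
    return (1.0 + staleness) ** -decay

def aggregate(updates, rounds_trained_on, current_round):
    """Average buffered updates, discounting the stale ones."""
    weights = np.array([staleness_weight(current_round, r) for r in rounds_trained_on])
    return np.average(updates, axis=0, weights=weights)

updates = [np.ones(4), 2 * np.ones(4)]
print(aggregate(updates, rounds_trained_on=[10, 6], current_round=10))  # fresh update dominates
```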

“We use this semi-asynchronous approach because we want the least powerful devices to be able to participate in the training process and contribute data to the model, but we don’t want the more powerful devices in the network to sit idle for too long and waste resources,” says Tenison.

Achieving acceleration

The researchers tested the framework in simulations using hundreds of heterogeneous devices and a variety of models and datasets. FTTE completed training on average 81 percent faster than standard federated learning approaches.

Their method reduced on-device memory overhead by 80 percent and communication payload by 69 percent, while achieving nearly the same accuracy as other techniques.

“There is a trade-off in accuracy, as the model needs to be trained as fast as possible to save battery life on these resource-constrained devices. However, for some applications, a slight loss in accuracy may be acceptable, especially since our method runs very fast,” she says.

FTTE also demonstrated effective scalability and improved performance for large groups of devices.

In addition to these simulations, the researchers tested FTTE on a small network of real devices with varying computing power.

“Not everyone has the latest Apple iPhone. For example, in many developing countries, users may have lower-end mobile phones. Our technology can bring the benefits of federated learning to these environments,” she says.

In the future, the researchers hope to study how their method can be used to improve the personalized performance of AI models on each device, rather than focusing on the average performance of the model. They also want to conduct large-scale experiments on real hardware.

Funding for this research was provided in part by a Takeda PhD Fellowship.


