“Edge computing” was originally developed to make big data processing faster and more secure, but it is now being combined with AI to provide cloud-free solutions. Everyday connected devices, from dishwashers to cars and smartphones, show how this real-time data processing works: machine learning models run directly on built-in sensors, cameras, or embedded systems.
Sensors are increasingly embedded in homes, offices, farms, hospitals, and transportation systems, creating important opportunities to improve public safety and quality of life.
In fact, connected devices, also known as the Internet of Things (IoT), include temperature and air quality sensors to improve indoor comfort, wearable sensors to monitor patient health, LiDAR and radar to support traffic management, and cameras or smoke detectors to enable rapid fire detection and emergency response.
These devices generate vast amounts of data that can be used to “learn” patterns from their operating environment and improve application performance through AI-driven insights.
For example, connectivity data from Wi-Fi access points and Bluetooth beacons deployed in large buildings can be analyzed with AI algorithms to identify occupancy and movement patterns across different building types (offices, hospitals, universities, etc.), different periods of the year, and different event types. These patterns can then be leveraged for purposes such as HVAC optimization and evacuation planning.
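As a minimal illustration of the kind of analysis involved, the sketch below counts distinct devices per hour in a toy Wi-Fi association log to estimate hourly occupancy and find the peak hour. The event format, device IDs, and hours are invented for illustration; a real pipeline would of course anonymize identifiers and handle far larger logs.

```python
from collections import defaultdict

def hourly_occupancy(events):
    """Estimate hourly occupancy from (hour, device_id) Wi-Fi
    association events by counting distinct devices per hour."""
    seen = defaultdict(set)
    for hour, device_id in events:
        seen[hour].add(device_id)
    return {hour: len(devices) for hour, devices in seen.items()}

# Toy association log: (hour of day, anonymized device id)
events = [
    (8, "d1"), (8, "d2"), (9, "d1"), (9, "d2"), (9, "d3"),
    (12, "d1"), (12, "d2"), (12, "d2"),  # duplicate association
]

occupancy = hourly_occupancy(events)
peak_hour = max(occupancy, key=occupancy.get)
print(occupancy)   # {8: 2, 9: 3, 12: 2}
print(peak_hour)   # 9
```

Counting distinct devices (rather than raw associations) avoids double-counting a phone that reconnects repeatedly within the same hour.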
Combining the Internet of Things and artificial intelligence comes with technical challenges
Artificial Intelligence of Things (AIoT) combines AI and IoT infrastructure to enable intelligent decision-making, automation, and optimization across interconnected systems. AIoT systems rely on large-scale real-world data to increase the accuracy and robustness of their predictions.
To support inference (i.e., insights from collected IoT data) and decision-making, IoT data must be effectively collected, processed, and managed. For example, occupancy data can be processed to estimate peak usage times for a building or to predict future energy demands. This is typically accomplished by leveraging cloud-based platforms such as Amazon Web Services and Google Cloud Platform, which host compute-intensive AI models, including the recently introduced foundation models.
What is a foundation model?
- A foundation model is a type of machine learning model that is trained on a wide range of data and is designed to be adaptable to a variety of downstream tasks. Examples include large language models (LLMs), which primarily process text but can also work with other modalities such as images, audio, video, and time series data.
- In generative AI, foundation models serve as the basis for generating content such as text, images, audio, and code.
- Unlike traditional AI systems that rely heavily on task-specific datasets and extensive preprocessing, foundation models (FMs) introduce zero-shot and few-shot capabilities, allowing them to adapt to new tasks and domains with minimal customization.
- Although FMs are still in their infancy, they have the potential to bring immense value to companies across a variety of sectors. Their rise thus represents a paradigm shift in applied artificial intelligence.
Limitations of cloud computing for IoT data
Hosting heavy-duty AI or FM-based systems on cloud platforms offers the benefit of rich computational resources, but also introduces some limitations. In particular, sending large amounts of IoT data to the cloud can significantly increase the response time of AIoT applications, often resulting in delays ranging from hundreds of milliseconds to seconds, depending on network conditions and data volume.
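A back-of-the-envelope model makes the latency point concrete. All numbers below (payload size, bandwidths, round-trip times, compute times) are illustrative assumptions, not measurements; the point is only that for large payloads the transfer term can dominate when data leaves the local network.

```python
def response_time_ms(payload_mb, bandwidth_mbps, rtt_ms, compute_ms):
    """Rough end-to-end response time: transfer time over the link,
    plus network round trip, plus server-side compute time."""
    transfer_ms = payload_mb * 8 / bandwidth_mbps * 1000
    return transfer_ms + rtt_ms + compute_ms

# Illustrative scenario: a 2 MB sensor batch sent either to a distant
# cloud over a 50 Mbit/s uplink, or to a nearby edge node over a
# 1 Gbit/s LAN. The edge node is assumed to have slower hardware.
cloud = response_time_ms(2, 50, rtt_ms=80, compute_ms=20)
edge = response_time_ms(2, 1000, rtt_ms=5, compute_ms=40)
print(cloud, edge)  # 420.0 61.0
```

With these toy figures the cloud path lands in the hundreds of milliseconds, matching the range mentioned above, while the edge path stays well under 100 ms despite the slower compute.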
Additionally, offloading data, especially sensitive and personal information, to the cloud raises privacy concerns and limits opportunities for local processing closer to the data source and end user.
For example, in a smart home, data from smart meters and lighting controls can reveal occupancy patterns or enable indoor localization (for example, detecting that Helen is usually in the kitchen preparing breakfast at 8:30 a.m.). Such insights are best captured close to the data source to minimize edge-to-cloud communication delays and reduce exposure of personal information on third-party cloud platforms.
What is edge computing and edge AI?
To reduce latency and enhance data privacy, edge computing is a good option: it provides computational resources (i.e., devices with memory and processing capabilities) close to IoT devices and end users, typically in the same building, at a local gateway, or in a nearby micro data center.
However, these edge resources have significantly more limited processing power, memory, and storage than centralized cloud platforms, creating challenges when deploying complex AI models.
To address this, the emerging field of edge AI (particularly active in Europe) is researching how to efficiently run AI workloads at the edge.
One such technique is split computing, which partitions deep learning models across multiple edge nodes within the same space (for example, a building) or even across different neighborhoods or cities. Deploying models in such a distributed environment is not easy: it requires careful model partitioning, scheduling, and coordination between nodes. The integration of foundation models adds another layer of complexity, making it even harder to design and implement a split computing strategy.
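The core idea of split computing can be sketched in a few lines: the layers of a model are placed on different machines, and only the intermediate activation crosses the network rather than the raw input. The toy two-layer network below is an invented example (random weights, NumPy only), not a real deployment; real systems must also choose the split point, compress the activations, and handle node failures.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0)

# Toy two-layer network; in split computing the layers live on
# different machines and only the intermediate activation crosses
# the network, never the raw sensor input.
W1 = rng.standard_normal((16, 8))   # "head", hosted on edge node A
W2 = rng.standard_normal((8, 4))    # "tail", hosted on edge node B

def node_a(x):
    """Head of the model: runs close to the sensor."""
    return relu(x @ W1)  # this intermediate tensor is what gets sent

def node_b(h):
    """Tail of the model: runs on a better-provisioned edge server."""
    return h @ W2

x = rng.standard_normal((1, 16))
full = relu(x @ W1) @ W2          # monolithic execution
split = node_b(node_a(x))         # split execution across two nodes
print(np.allclose(full, split))   # True: same result, different placement
```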
What changes in terms of energy consumption, privacy, and speed?
Edge computing significantly reduces response times by processing data closer to the end user, avoiding round trips to distant cloud data centers. It not only enhances performance but also privacy, especially with the advent of edge AI techniques.
For example, federated learning allows you to train machine learning models directly on local edge devices (or IoT devices with sufficient processing capabilities), ensuring that raw data remains on the device and only model updates are sent to edge or cloud platforms for aggregation into a final model.
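A minimal sketch of the aggregation step (weighted federated averaging, in the spirit of the FedAvg algorithm) shows what "only model updates are sent" means in practice. The client weight vectors and sample counts below are invented, and real systems average full model tensors, often with secure aggregation on top.

```python
def fed_avg(client_updates):
    """Federated averaging: combine per-client model weights into a
    global model, weighting each client by its number of local
    samples. Only these weight vectors leave the devices, never
    the raw training data."""
    total = sum(n for _, n in client_updates)
    dim = len(client_updates[0][0])
    return [
        sum(w[i] * n for w, n in client_updates) / total
        for i in range(dim)
    ]

# Toy weight vectors trained locally on three devices
updates = [
    ([1.0, 2.0], 10),   # (local weights, number of local samples)
    ([3.0, 4.0], 30),
    ([5.0, 6.0], 60),
]
global_weights = fed_avg(updates)
print(global_weights)  # [4.0, 5.0]
```

Weighting by sample count keeps a device with little data from pulling the global model as hard as one with a large local dataset.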
Privacy can also be preserved during inference: once trained, your AI model can be deployed at the edge, allowing you to process data locally without ever exposing it to cloud infrastructure.
This is especially valuable for industries and small businesses looking to run large language models within their own infrastructure. Such models can answer queries related to system monitoring, prediction, or other tasks where data sensitivity matters. For example, queries about the operational status of industrial machinery, such as predicting maintenance needs from sensor data, may involve sensitive usage data that should not leave the premises.
In these cases, keeping both queries and responses within your organization protects sensitive information and meets privacy and compliance requirements.
How does it work?
Unlike mature cloud platforms such as Amazon Web Services and Google Cloud, there is currently no established platform that supports large-scale deployment of applications and services at the edge.
However, communications providers are beginning to leverage existing local resources at antenna sites to provide computing capabilities closer to end users. Managing these edge resources remains challenging due to their variability and heterogeneity, often involving low-capacity servers and devices.
In my opinion, maintenance complexity is a major barrier to adoption of edge AI services. At the same time, advances in edge AI offer promising opportunities to enhance the utilization and management of these distributed resources.
Allocating resources across the IoT, edge, and cloud continuum for secure and efficient AIoT applications
Our research group is working with partners across Europe within the Horizon Europe project PANDORA to develop an AI-driven framework that enables reliable and efficient deployment of AIoT systems in smart spaces such as homes, offices, industries, and hospitals.
PANDORA provides AI models as a service (AIaaS) tailored to end-user requirements (latency, accuracy, energy consumption, etc.). These models can be trained at design time or runtime using data collected from IoT devices deployed in smart spaces. In addition, PANDORA provides compute resources as a service (CaaS) across the IoT, edge, and cloud continuum to support the deployment of AI models. This framework manages the entire lifecycle of AI models and ensures continuous, robust, intent-driven operation of AIoT applications for end users.
At runtime, AIoT applications are dynamically deployed across the IoT, edge, and cloud continuum based on performance metrics such as energy efficiency, latency, and compute power. CaaS intelligently allocates workloads to resources at the most appropriate layers (IoT-Edge-Cloud) to maximize resource utilization. Models are selected based on domain-specific intent requirements (such as minimizing energy consumption or reducing inference time) and are continuously monitored and updated to maintain optimal performance.
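To make the placement idea concrete, here is a toy intent-driven scoring function that picks a layer of the continuum for a workload. The layers, metric values, weights, and feasibility rule are all invented for illustration; they do not reflect PANDORA's actual allocation logic, which manages far richer constraints and lifecycles.

```python
def place_workload(workload, layers, weights=(0.5, 0.3, 0.2)):
    """Toy intent-driven placement: score each feasible layer (IoT,
    edge, cloud) by weighted latency and energy cost minus available
    compute, and pick the layer with the lowest score."""
    w_lat, w_energy, w_compute = weights
    best, best_score = None, float("inf")
    for name, m in layers.items():
        if m["compute"] < workload["min_compute"]:
            continue  # infeasible: not enough resources on this layer
        score = (w_lat * m["latency_ms"]
                 + w_energy * m["energy_mj"]
                 - w_compute * m["compute"])
        if score < best_score:
            best, best_score = name, score
    return best

# Invented per-layer metrics (latency, energy per request, compute units)
layers = {
    "iot":   {"latency_ms": 1,   "energy_mj": 5,  "compute": 1},
    "edge":  {"latency_ms": 10,  "energy_mj": 20, "compute": 10},
    "cloud": {"latency_ms": 120, "energy_mj": 80, "compute": 100},
}

# A model that needs at least 8 compute units lands on the edge layer;
# a tiny one can stay on the IoT device itself.
print(place_workload({"min_compute": 8}, layers))  # edge
print(place_workload({"min_compute": 0}, layers))  # iot
```

Changing the weight vector encodes the end user's intent: raising the latency weight pushes workloads toward the IoT and edge layers, while raising the compute weight favors the cloud.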
