Byte

Sep 26, 2023

While artificial intelligence (AI) algorithms running on larger, more powerful hardware often steal the spotlight, the significance of edge AI should not be underestimated. Edge AI refers to the deployment of AI algorithms on local devices such as smartphones, cameras, sensors, and other Internet of Things devices, rather than relying solely on cloud-based solutions. This decentralized approach offers numerous benefits and unlocks a wide range of possible applications.

One of the primary advantages of edge AI is reduced latency. By processing data locally on the device itself, edge AI eliminates the need for round-trips to the cloud, resulting in faster response times. This real-time capability is crucial in scenarios where immediate decision-making is vital, such as with autonomous vehicles, industrial automation, and critical infrastructure monitoring. Additionally, edge AI enhances privacy and security since sensitive data remains on the local device, reducing the risk of data breaches and ensuring user confidentiality.

Despite the numerous advantages, running more resource-intensive algorithms, such as complex object detection or deep learning models, on edge devices presents a significant challenge. Edge computing devices often have limited computational power, memory, and energy resources compared to cloud-based hardware. Striking a balance between algorithm accuracy and device constraints becomes crucial to ensure efficient operation. Optimizations like model compression, quantization, and efficient inference techniques are necessary to make these algorithms work well on edge devices.
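As a rough illustration of one of these optimizations, the sketch below applies symmetric 8-bit quantization to a short list of weights in plain Python. The function names and the sample weights are made up for this example, and real toolchains typically use more elaborate schemes (per-channel scales, calibration data, quantization-aware training).

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    # One shared scale factor; fall back to 1.0 if every weight is zero.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the stored integers."""
    return [q * scale for q in quantized]


# Hypothetical weights, chosen only for illustration.
weights = [0.5, -1.0, 0.25, 0.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(q, scale, max_error)
```

Each weight now fits in one byte instead of the four used by a 32-bit float, at the cost of a rounding error of at most half the scale factor.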

Because understanding and recognizing objects in images or videos is a fundamental task in visual perception, object detection algorithms are of special importance across various industries and applications. Great strides have been made in adapting object detection models to resource-constrained edge devices. Edge Impulse's FOMO algorithm, for example, runs up to 30 times faster than MobileNet SSD, yet requires less than 200 KB of memory for many use cases. But for such important and diverse application areas, there is still plenty of room for further advances.

The latest entrant into the field is a team of researchers from the Center for Project Based Learning at ETH Zurich. They have developed a highly flexible, memory-efficient, and ultra-lightweight object detection network that they call TinyissimoYOLO. The optimizations applied to this model make it well-suited for running on low-power microcontrollers.

TinyissimoYOLO is a convolutional neural network (CNN) based on the architecture of the popular YOLO algorithm. It is built from quantized convolutional layers with 3 x 3 kernels and a fully connected output layer. Both convolutional and fully connected linear layers are heavily optimized in the hardware and software toolchains of modern devices, which gives TinyissimoYOLO a boost in speed and efficiency. It is a generalized object detection network that can be applied to a wide range of tasks, and it requires no more than 512 KB of flash memory to store model parameters.
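To put the 512 KB figure in perspective, the following sketch converts a flash budget into a maximum parameter count for a few weight bit widths. The article does not say which bit width TinyissimoYOLO uses, so the widths here are assumptions, and the calculation ignores any flash needed for code or metadata. KB is taken as 1,024 bytes.

```python
def max_parameters(flash_bytes: int, bits_per_weight: int) -> int:
    """Largest number of weights that fit in the given flash budget."""
    return (flash_bytes * 8) // bits_per_weight


FLASH_BUDGET_BYTES = 512 * 1024  # 512 KB, as quoted in the article

# Bit widths are illustrative assumptions, not figures from the article.
budget_by_width = {
    bits: max_parameters(FLASH_BUDGET_BYTES, bits) for bits in (8, 4, 2)
}

for bits, count in budget_by_width.items():
    print(f"{bits}-bit weights: up to {count:,} parameters")
```

With 8-bit weights the budget covers roughly half a million parameters, and each halving of the bit width doubles that ceiling.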

The model can be deployed on virtually any hardware that meets its very modest requirements, including platforms with Arm Cortex-M processors or AI hardware accelerators. The team tested TinyissimoYOLO on a range of devices, including the Analog Devices MAX78000, Greenwaves GAP9, Sony Spresense, and Syntiant TinyML.

While evaluating their methods, the team found that they could run object detection on a MAX78000 board at a staggering 180 frames per second. This performance came with an ultra-low energy consumption of only 196 µJ per inference. Of course, none of this matters if the model does not work well. But amazingly, this tiny model also performed comparably to much larger object detection algorithms.
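The frame rate and energy figures can be combined into a back-of-the-envelope power estimate. The sketch below assumes that the board runs inferences back to back at the quoted frame rate and that both numbers were measured under the same conditions, which the article does not confirm. It also counts only inference energy, not the camera or the rest of the system.

```python
ENERGY_PER_INFERENCE_J = 196e-6  # 196 µJ per inference, as quoted
FRAMES_PER_SECOND = 180          # frame rate reported on the MAX78000

# Average power = energy per inference x inferences per second.
avg_power_mw = ENERGY_PER_INFERENCE_J * FRAMES_PER_SECOND * 1e3

# Time available per frame at the quoted rate.
frame_time_ms = 1e3 / FRAMES_PER_SECOND

print(f"Average inference power: {avg_power_mw:.2f} mW")
print(f"Time per frame: {frame_time_ms:.2f} ms")
```

Under those assumptions the inference workload averages roughly 35 mW, with about 5.6 ms available per frame.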

Naturally, some corners had to be cut to pull off such a feat. The input image size, for example, is limited to 88 x 88 pixels, which is insufficient resolution for many uses. Also, because the multiclass object detection problem gets more difficult as the number of objects increases, a maximum of three objects per image is supported.

Despite these limitations, the versatility, accuracy, and minimal hardware requirements of TinyissimoYOLO make it an attractive option for those looking to do object detection on the edge.