Generative AI Applications in Edge Environments

frog
This article presents content originally published in "Design Mind," a design journal operated by frog, under the supervision of Noriaki Okada of the Dentsu BX Creative Center.

To realize the future of IoT (Internet of Things) at relatively low cost, generative AI is shifting its focus toward the "edge."
The "edge" is where the digital and physical worlds intersect. It refers to the connected devices we use—like smartphones and laptops—as well as sensors and robotic actuators that digitize and transform the physical world.
This article explores "How to Utilize Generative AI in Edge Environments" while unraveling why edge AI is becoming the next frontier for generative AI.
<Table of Contents>
▼The Edge is the Next Frontier for Generative AI
▼Cost-wise, Small Models on Edge Processors Hold the Advantage
▼Generative AI Opens a New Era of IoT Leveraging Edge-Specific Value
▼The Usefulness of Unique Training Data Gained at the Edge Increases
▼Cross-Disciplinary Approaches and Teams Will Be Essential Going Forward
The "Edge" is the Next Frontier for Generative AI
The adoption of generative AI and the expectations placed upon it have surged dramatically over the past year. This is because it promises to bring fundamental changes that overturn conventional wisdom to companies, products, and their customers across all industries.
The related market is projected to grow at an average annual rate of 42% over the next decade, reaching a scale of $1.3 trillion.
To date, generative AI usage has centered on mass-market web applications that leverage large language models (LLMs) to generate useful text, images, audio, and now video content from prompts (instructions).
The shift from keyword-based information retrieval to prompt-driven content generation will be a key driver of business opportunities and transformation in the coming years.
Predicting the next step for this market requires a significant leap of imagination, akin to envisioning electricity's applications upon first seeing a light bulb illuminate. That said, one trend becoming increasingly clear is that the "edge" is poised to become the next frontier for generative AI.

Cost-wise, Small Models on Edge Processors Hold the Advantage
For many applications, smaller models running on edge processors offer a cost advantage.
AI models are becoming increasingly sophisticated. Major tech companies competing to lead the consumer-facing generative AI space see developing powerful AI using ever-greater volumes of training data as absolutely essential to winning the race.
Within just a few years, the number of parameters these models (GPT, Gemini, Mistral, Llama, etc.) rely on has grown more than 1,000-fold, and scores on the Massive Multitask Language Understanding (MMLU) benchmark (a measure of a model's knowledge, roughly akin to a high school graduation exam) have tripled.
This state of affairs makes sense. These AI models must handle everything from suggesting recipes based on ingredients to generating professional-quality portraits from a single photo. The drive to secure leadership in the mass market also fuels cost optimization for these commercial applications. This trend will continue.
These increasingly powerful AI models can cost several cents per query, with costs varying based on the scale and nature of the prompt and output. Consequently, a fierce competition for computational power has emerged, driving the stock price of NVIDIA (a semiconductor manufacturer at the forefront of AI technology) to record highs.
However, most companies and applications do not require these top-tier models. As applications become specialized for specific purposes, the scope of information needed to generate useful responses and the degree of model sophistication required decrease.
Furthermore, model performance is advancing rapidly. Even "small" models like Gemini Nano are now achieving MMLU scores comparable to "large" language models like GPT-3, which were developed less than two years ago.
For example, creating an effective customer support chatbot requires only an understanding of conversational language, documentation for the relevant products, and ideally transcripts of customer calls. Knowledge of poetry, music, or images is unnecessary, which significantly reduces the computational load on the model.
Today, tools like Edge Impulse enable developers to easily leverage large models like GPT-4o to train vastly smaller custom models. These models can then execute narrow-scope functions on edge device processors with significantly lower latency.
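This teacher-to-student workflow resembles knowledge distillation, and a toy sketch can make it concrete. In the hedged example below, the "teacher" is a stand-in Python function (in practice it would be a call to a large hosted model such as GPT-4o), the "student" is a deliberately tiny word-overlap classifier cheap enough for an edge processor, and all names and data are illustrative, not from any real toolchain.

```python
from collections import Counter, defaultdict

def teacher_label(text):
    """Stand-in for a large hosted model labeling raw support queries."""
    billing_words = ("invoice", "charge", "refund")
    return "billing" if any(w in text.lower() for w in billing_words) else "technical"

def train_student(texts):
    """Build a tiny on-device 'student': per-class word-frequency centroids
    learned from labels produced by the teacher."""
    centroids = defaultdict(Counter)
    for t in texts:
        centroids[teacher_label(t)].update(t.lower().split())
    return dict(centroids)

def student_predict(centroids, text):
    """Classify by word overlap only -- no network call, edge-friendly."""
    words = set(text.lower().split())
    return max(centroids,
               key=lambda lbl: sum(c for w, c in centroids[lbl].items() if w in words))

corpus = [
    "I was charged twice, please refund the duplicate invoice",
    "Why is there an extra charge on my invoice",
    "The app crashes when I open settings",
    "Device will not connect to wifi after the update",
]
student = train_student(corpus)
```

The key property is that the student never calls the teacher at inference time: `student_predict(student, ...)` runs entirely on the device, which is what removes the per-query cost and latency.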
Simultaneously, trends in platform subscription fees for services like Microsoft's AI tool Copilot and per-query costs for custom model applications are strongly driving the shift towards building efficient, moderately sized models.
When generative AI runs on edge hardware such as users' laptops, smartphones, or connected devices, queries become effectively free, provided the major challenge can be solved: achieving satisfactory AI model performance despite limited local processing power and memory.
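One common way model footprints are squeezed into such limited memory is weight quantization; the article does not name a specific technique, so the following is only an illustrative sketch of symmetric int8 quantization in plain Python, with made-up weight values.

```python
def quantize(weights):
    """Symmetric linear quantization of float weights to the int8 range [-127, 127]."""
    scale = max(abs(w) for w in weights) / 127.0
    return [round(w / scale) for w in weights], scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 values."""
    return [x * scale for x in q]

# 4 bytes per float32 weight become 1 byte per int8 weight: a 4x memory cut,
# at the cost of a small rounding error bounded by one quantization step.
weights = [0.12, -0.5, 0.33, 0.9, -0.07]
q, scale = quantize(weights)
restored = dequantize(q, scale)
```

In production, frameworks apply this idea per layer or per channel and calibrate on real activations, but the trade-off is the same: smaller, faster models in exchange for bounded precision loss.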
With these factors converging, we will likely see a surge of use cases leveraging generative AI with relatively small language models in specialized applications for specific business domains or products, both at the edge and in the cloud.
The figure below shows the correlation between operational costs and model size. "Tokens" represent words or character fragments, and query costs are determined by the number of tokens processed. This means that larger prompts or outputs require more tokens, increasing overall costs. "LLM active parameters" refer to the number of internal variables within a large language model that are used during processing. More parameters mean a larger, higher-performance model, but also greater resource requirements.
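The per-token pricing arithmetic described above can be sketched in a few lines. The per-1,000-token prices below are hypothetical placeholders, not real vendor rates.

```python
def query_cost(prompt_tokens, output_tokens,
               price_in_per_1k=0.0005, price_out_per_1k=0.0015):
    """Per-token billing: cost grows linearly with tokens in the prompt and output."""
    return (prompt_tokens / 1000) * price_in_per_1k \
         + (output_tokens / 1000) * price_out_per_1k

# A 2,000-token prompt answered with 500 tokens:
cost = query_cost(2000, 500)  # 0.001 + 0.00075 = 0.00175 (dollars)
```

Doubling the prompt and output doubles the cost, which is why larger prompts and outputs drive overall spend, and why moving queries onto edge hardware eliminates this line item entirely.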

Generative AI Opens a New Era of IoT Leveraging Edge-Specific Value
As tangible, analog beings, we humans live, find happiness, and get things done in the physical world. This is why the edge holds such significant influence. For us, the edge is a two-way window to the digital world, reached through touchscreens, voice assistants, and AR (augmented reality) headsets. It is where data-enabled actions and practical value ultimately take shape: you can buy airline tickets and check in for a flight on your smartphone, but the real benefit comes from actually boarding the plane and flying to your physical destination.
Edge processing and edge AI are not new concepts. Edge processing is the technology that enables execution on the device itself when cloud connectivity or cloud processing presents issues like unreliability, slowness, excessive cost, or high risk. There are already numerous valuable use cases enabled by edge processing and AI: autonomous vehicles avoiding obstacles, surgical robots using haptic feedback to make precise incisions, sensors controlling production processes in remote factories in real time.
The use of generative AI at the edge will open up new frontiers in practical applications. Generative AI is now driving a paradigm shift from search engines that merely display result lists to chatbots that generate content tailored to each individual user. Similarly, the use of generative AI at the edge is expected to focus on generating predictions, suggestions, and content customized for each user, based on information that combines real-time context from the physical world with knowledge from the digital world.
This is precisely the step up in value that everyone has long anticipated from IoT. Considering that IoT to date has mostly presented novelty features such as quantification and connectivity without delivering compelling value beyond them, this is a significant advance.
The possibilities are limitless, and expectations continue to grow. Many practical applications already exist, generally falling into two categories: products and services that deliver unprecedented user interfaces and experiences, and those that enable actual actions.
<User Interface and Experience>
- According to Apple's announcement last year, the company plans to integrate ChatGPT into iOS 18, iPadOS 18, and macOS Sequoia. The integration is touted as offering privacy benefits for users across all areas of content creation.
- For travelers, edge-based generative AI enables voice recognition, background noise removal, and real-time translation into local languages, facilitating smooth conversations across language barriers. Samsung's latest smartphone, the Galaxy S24, features this capability.
- In industrial facilities, edge-based generative AI enables text and voice-based queries on real-time camera footage using machine vision. This allows precise monitoring of complex events, such as verifying whether employees are wearing appropriate safety gear. NVIDIA has already demonstrated this use case.
- For consumers, the practicality of digital assistants and augmented reality devices such as Apple Vision Pro, Limitless Pendant, rabbit r1, Humane Ai Pin, and Brilliant Labs Frame AI smart glasses is expected to improve significantly through edge-based generative AI. These devices can generate suggestions and contextual information by incorporating the surrounding environment alongside user instructions, all with minimal latency, eliminating frustration. Qualcomm has already demonstrated large multimodal models running on Android smartphones.
- Google recently demonstrated edge-based generative AI capabilities on Android for the e-commerce platform Shopify. This enables retailers to leverage edge-based generative AI for modifying and preparing product images for their e-commerce sites.
- In healthcare, edge-based generative AI can generate medical recommendations in real time, securely and without transferring patient data to the cloud, by combining voice input from treatment teams, medical training data, and multimodal biosignal sensor data collected from patients.
<Edge Actions>
- In robotics, edge generative AI enables processing of voice commands that involve inference, going far beyond repetitive task automation and letting robots operate seamlessly as more useful assistants. Figure, a startup developing humanoid robots, recently demonstrated this capability through its partnership with OpenAI, the developer of ChatGPT. Its robot can receive indirect prompts like "Can I get something to eat?" and reason about usable items within its field of view. Embedding the generative AI interface directly into the robot at the edge reduces latency and lowers operational costs.
- For autonomous vehicles, leveraging generative AI at the edge can improve driving by feeding in sensor data, such as from LiDAR systems or cameras detecting the surrounding environment, to generate predictions about the movements of other objects. While previous machine vision systems could detect pedestrians, generative AI can predict that a pedestrian is about to cross a crosswalk. This processing must be done instantly on the vehicle's edge system.
The Usefulness of Unique Training Data Gained at the Edge Increases
In most AI application development, the share of time spent on model development is already smaller than that spent on data collection, processing, testing, and optimization. As the performance of readily available models, including open-source options like Meta's Llama, continues to improve, the focus of AI development is expected to shift further away from model development toward these other activities.
For applications specialized in specific tasks, the primary challenges lie in data collection, labeling, cleaning, and maintenance.
Many business-specific applications can leverage data already existing solely in the digital realm (such as internal knowledge management based on full-text documents in company directories). Conversely, accessing and digitizing data from the physical world via connected sensors and devices presents a significant opportunity for companies seeking to differentiate themselves in the current AI development race.
Privacy concerns and constraints will also be a significant driving factor. According to Capgemini Research Institute, 75% of consumers consider trust a key factor when deciding to purchase connected products. Therefore, companies are compelled not only to consider privacy and security but also to provide enough value that customers feel their data is worth sharing. If someone doesn't want to share their information, they can easily disable cookies. On the other hand, for those who have embraced connected products in their homes or workplaces because of the value they bring, letting go of those products is far more difficult than disabling cookies.
Cross-Disciplinary Approaches and Teams Will Be Essential Going Forward
Considering the cost pressures and computational resource constraints of cloud-based LLMs, together with the high-value use cases enabled by edge-based generative AI and data collection, adoption of edge generative AI is expected to grow rapidly from 2024 onward.
As this trend continues, companies must consider how to ride this wave. Since this field is still in its infancy, developing a solid strategy requires cross-disciplinary approaches and teams. It will require collaboration among engineers who can evaluate whether models can be built and operated on resource-constrained edge processors, user experience and strategy teams who understand the new technologies enabling this potential and the value they deliver to users, and business analysts who can calculate the trade-offs and benefits to ensure a return on investment (ROI).
Author

frog
frog is a company that delivers global design and strategy. We transform businesses by designing brands, products, and services that deliver exceptional customer experiences. We are passionate about creating memorable experiences, driving market change, and turning ideas into reality. Through partnerships with our clients, we enable future foresight, organizational growth, and the evolution of human experience. <a href="http://dentsu-frog.com/" target="_blank">http://dentsu-frog.com/</a>


