Make sound business decisions with audio AI
The beep of scanners, the clatter of sorting equipment, and the rumble of delivery trucks: these are the sounds of a logistics sector in constant motion, working tirelessly to deliver. With audio artificial intelligence (AI), these background noises become rich sources of information, enabling data-driven decisions and optimized operations.
What is Audio AI?
Audio AI is a subset of AI that focuses on the analysis and understanding of sound patterns by machines, allowing them to "hear" and interpret sound much as humans do.
The technology consists of a range of capabilities, including:
- Speech recognition: Transcribing spoken language into written text.
- Sound classification: Categorizing sounds into different types (e.g., engine noises, alarms, and customer voices).
- Environmental noise detection: Identifying and analyzing background noises to understand the context of a sound.
Driven by advances in technology, increasing adoption across sectors, and growing demand for personalized audio experiences, the audio AI market is booming.
This industry is projected to grow from US$25.5 billion (€24.2 billion) in 2025 to US$952 billion (€904 billion) by 2034, exhibiting a compound annual growth rate (CAGR) of a whopping 49.5 percent, according to a report by Market Research Future.
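As a quick sanity check on the projection above, the growth rate can be recomputed from the two market-size figures. This is a minimal sketch: the function name `cagr` is our own, and the nine-year span assumes the forecast runs from 2025 to 2034 as stated.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values over a number of years."""
    return (end_value / start_value) ** (1 / years) - 1

# Figures quoted in the report, in US$ billions.
rate = cagr(start_value=25.5, end_value=952.0, years=2034 - 2025)
print(f"Implied CAGR: {rate:.1%}")  # prints "Implied CAGR: 49.5%"
```

The result matches the quoted 49.5 percent, so the start figure, end figure, and growth rate are consistent with one another.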
But what could companies do with audio AI?
1. Quality control & maintenance
In many industries, quality checks and equipment maintenance are often performed manually by human inspectors. This manual approach can be subjective and prone to errors, and identifying subtle defects in products is challenging and time-consuming.
Audio AI, however, can significantly enhance this process. The BMW Group Plant Dingolfing uses an AI model that automatically carries out an audio-based quality check. Microphones on the car seats record all driving noises, and the AI model analyzes the recordings to recognize whether there is any background noise. This is the final check before the vehicle is handed over to the customer.
The advantages of audio quality testing using AI are obvious: the automated process is faster and more efficient, it removes subjective perception from the assessment, and it picks up sounds that human ears do not easily detect. This leads to higher product quality and, ultimately, improved customer satisfaction.
Audio AI can also help companies avoid reactive maintenance, where repairs are only initiated after equipment failures occur. Instead, they can take a proactive approach: by using audio AI to listen to the sounds their machines make, they can identify potential issues before these cause major problems. A study by McKinsey estimated that this can reduce machine downtime by 30 percent to 50 percent and increase machine life by 20 percent to 40 percent.
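To make the idea of "listening" for trouble concrete, here is a deliberately simple sketch. It is an illustration under our own assumptions, not how any vendor's system works: it measures the loudness (RMS level) of short frames of audio from a healthy machine, then flags frames in a new recording that deviate from that baseline by more than a chosen number of standard deviations. The function names, the threshold `k`, and the synthetic 50 Hz hum are all hypothetical; real systems would typically use richer spectral features and trained models.

```python
import math
import statistics

def frame_rms(samples, frame_size):
    """Split samples into non-overlapping frames and return each frame's RMS level."""
    levels = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        levels.append(math.sqrt(sum(x * x for x in frame) / frame_size))
    return levels

def find_anomalous_frames(baseline_levels, new_levels, k=3.0):
    """Return indices of new frames whose RMS lies more than k standard
    deviations away from the mean of the healthy baseline."""
    mean = statistics.fmean(baseline_levels)
    spread = max(statistics.pstdev(baseline_levels), 1e-9)  # avoid a zero threshold
    return [i for i, level in enumerate(new_levels) if abs(level - mean) > k * spread]

def hum(n, amplitude, rate=1000, freq=50):
    """Synthetic machine hum: n samples of a sine wave (stand-in for real audio)."""
    return [amplitude * math.sin(2 * math.pi * freq * t / rate) for t in range(n)]

# Baseline: ten frames of a healthy machine with slight natural variation.
healthy = [s for i in range(10) for s in hum(200, 1.0 + 0.01 * (i % 3))]
baseline = frame_rms(healthy, 200)

# New recording: the third frame is noticeably louder than normal.
recording = hum(200, 1.0) + hum(200, 1.01) + hum(200, 1.5) + hum(200, 1.0)
anomalies = find_anomalous_frames(baseline, frame_rms(recording, 200))
print(anomalies)  # [2]
```

A flagged frame would not prove a fault; it would simply prompt a technician to take a closer look before a failure occurs.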
2. Workers’ safety
Warehouse safety often relies heavily on human observation and manual reporting of incidents. This can lead to delayed responses during emergencies, because incidents take time to discover and report.
Audio AI can employ advanced algorithms to continuously monitor ambient sounds within the warehouse environment, including:
- Collisions: The impact of forklifts colliding with objects or other vehicles.
- Falls: The sound of a worker falling from a height or tripping.
- Screams: The sound of a worker in distress.
- Glass breaking: A sign of potential damage to equipment or structures.
These sounds, if detected accurately, can trigger real-time alerts to immediately notify safety personnel. Early intervention from timely notifications can limit the severity of injuries and property damage, creating a safer and more secure working environment for warehouse employees.
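The alerting step described above can be sketched in a few lines. This example assumes a sound classifier already exists and emits labelled detections with a confidence score; the `Detection` type, the label names, the 0.8 confidence threshold, and the 10-second cooldown are all hypothetical choices for illustration. The cooldown keeps one incident from flooding safety staff with duplicate alerts.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    timestamp: float   # seconds since monitoring started
    label: str         # classifier output, e.g. "collision" or "scream"
    confidence: float  # classifier score between 0 and 1

# Sound types that should notify safety personnel (hypothetical label names).
ALERT_LABELS = {"collision", "fall", "scream", "glass_breaking"}

def alerts_for(detections, min_confidence=0.8, cooldown_s=10.0):
    """Return the detections that should raise an alert, skipping low-confidence
    hits and repeats of the same label within the cooldown window."""
    last_alert = {}
    alerts = []
    for d in sorted(detections, key=lambda d: d.timestamp):
        if d.label not in ALERT_LABELS or d.confidence < min_confidence:
            continue
        previous = last_alert.get(d.label)
        if previous is not None and d.timestamp - previous < cooldown_s:
            continue  # same incident, already reported
        last_alert[d.label] = d.timestamp
        alerts.append(d)
    return alerts

stream = [
    Detection(1.0, "forklift_beep", 0.95),  # routine sound, not an alert label
    Detection(2.0, "collision", 0.91),      # alert
    Detection(4.0, "collision", 0.93),      # within cooldown, suppressed
    Detection(5.0, "scream", 0.55),         # below confidence threshold
    Detection(30.0, "collision", 0.88),     # new incident, alert
]
for alert in alerts_for(stream):
    print(f"ALERT at {alert.timestamp:.0f}s: {alert.label}")
```

In practice, the thresholds would be tuned against real warehouse recordings to balance missed incidents against false alarms.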
On the road, audio AI can also help anticipate dangerous situations and prevent accidents. Unilever’s trucks are equipped with Joctan, an AI system that can detect unsafe driving behaviors, such as talking on the phone or eating. Once detected, an in-cabin alarm is triggered, notifying the driver to react appropriately.
3. Smart voice-controlled interfaces
With warehouse labor as scarce as it has been, voice-guided technology has been a lifesaver for many operators, with some reporting reductions in errors and boosts in productivity as high as 35 percent.
Earlier voice systems allowed workers to input data using voice commands, but workers first had to undergo lengthy voice-training sessions to teach the system their unique voice, dialect, and jargon. This has changed with the integration of audio AI. Voice systems like Lydia Voice now boast over 99 percent accuracy in understanding worker commands. The technology has also become better at filtering out warehouse noise so that commands come through clearly.
Taking the functions of audio AI a step further, Honeywell’s voice-guided warehousing solution integrates picking, packing, and put-away. Workers can be directed around the warehouse with an auditory to-do list, keeping their hands and eyes free to focus on the task.
Beyond the audio AI technology
As with any new technology, ethical and security concerns will exist.
With audio AI, deepfake phishing could deceive unsuspecting employees into making unauthorized decisions. According to a McAfee global study, 25 percent of people have experienced an AI voice impersonation scam or know someone who has.
In the logistics sector, impersonation scams may aim to trick employees into releasing large payments, divulging sensitive cargo data, or revealing detailed delivery routes. In one case, a LastPass employee was targeted with deepfake audio that impersonated the company's CEO. Fortunately, the attempt failed because the employee recognized the deepfake audio.
Such risks can be mitigated through robust data encryption and user authentication, along with regular security audits and employee training on identifying and responding to potential threats.
The ethical implications of user consent and privacy are yet another concern. Logistics companies must be transparent with their employees and customers about how audio data is being collected, used, and stored.
One option is to clearly communicate the purpose of audio data collection and how the data will be used (e.g., for training AI models or improving customer service), and to obtain explicit, informed consent. Clear opt-out options should also be provided so that employees and customers can easily withdraw and have a say in what is done with their data.
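One way to picture consent and opt-out in practice is as a check that runs before any audio is processed. The sketch below is purely illustrative: the `ConsentRecord` structure, the purpose names, and the `may_process` helper are hypothetical, and a real deployment would also need to reflect applicable privacy law and company policy.

```python
from dataclasses import dataclass, field

@dataclass
class ConsentRecord:
    person_id: str
    allowed_purposes: set = field(default_factory=set)  # purposes the person agreed to
    opted_out: bool = False                             # a later opt-out overrides consent

def may_process(record, purpose):
    """Allow processing only with a consent record that covers this purpose
    and has not been withdrawn."""
    return record is not None and not record.opted_out and purpose in record.allowed_purposes

worker = ConsentRecord("employee-001", {"safety_monitoring"})
print(may_process(worker, "safety_monitoring"))  # True
print(may_process(worker, "model_training"))     # False: never agreed to this purpose

worker.opted_out = True
print(may_process(worker, "safety_monitoring"))  # False: consent withdrawn
print(may_process(None, "safety_monitoring"))    # False: no record, no processing
```

Treating a missing record as a refusal means audio is only ever used for purposes a person has actively agreed to.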
Play it by ear, audio AI is here to stay
Imagine a world where warehouses operate with near-zero human intervention, from the moment goods arrive to the second they ship out.
A fantasy? With the advancement of audio AI, it's slowly becoming a reality.
Audio AI has the potential to automatically direct parcels to the appropriate sorting area based on the sounds made by different packaging materials. At the same time, damaged goods (e.g., broken glass, crushed packaging) can be automatically flagged and rerouted for inspection or return.
Meanwhile, the sounds of items being placed on shelves, moved by robots, or picked for orders are analyzed to track inventory in real time, eliminating the need for manual stock checks. Unusual sounds, such as those from malfunctioning machines, are immediately detected by the system, triggering alerts and allowing for proactive maintenance or intervention.
During dispatch, audio sensors also monitor the loading process, so items are loaded correctly and securely onto trucks. The sounds of items shifting or falling are detected and addressed. Finally, real-time traffic sounds (e.g., congestion, accidents) are analyzed to optimize delivery routes and schedules, ensuring timely and efficient delivery.