
Accelerating conservation science with the ARMSoM CM5 SoM.

Real-time marine mammal detection for active management solutions using an inexpensive, open-source compute module.



Buoy photo and video credits: Cetaware

Company size: 1-10 employees

Company Industry: Environmental Sciences & Services

Website: www.cetaware.nz 

 


List of features and technical specifications

 

Real-time signal processing and inference for classification of cetacean vocalisations.

96 kHz sampling rate (up to 384 kHz)

Hydrophone sensitivity: -165 dB re 1 V/µPa

LTE Cat-M1 for telemetry/connectivity.

ARMSoM Compute Module (CM5) for edge processing and machine learning.
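The sensitivity spec above can be sanity-checked with a quick conversion from dB re 1 V/µPa to output voltage. A minimal sketch; the 120 dB received level is an illustrative example, not a measured figure:

```python
def hydrophone_output_volts(sensitivity_db, pressure_upa):
    """Convert hydrophone sensitivity (dB re 1 V/uPa) and received sound
    pressure (uPa) to output voltage: V = p * 10**(S/20)."""
    return pressure_upa * 10 ** (sensitivity_db / 20)

# A 120 dB re 1 uPa signal (p = 1e6 uPa) on a -165 dB re 1 V/uPa hydrophone:
v = hydrophone_output_volts(-165, 10 ** (120 / 20))
print(f"{v * 1e3:.3f} mV")  # -> 5.623 mV
```

A few millivolts at a fairly loud received level is why the pre-amplifier stage ahead of the ADC matters.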

 

 

Project background

 

By listening to the natural world, we can monitor how animals communicate with one another and obtain fundamental baselines against which future changes in response to environmental change can be confirmed. For example, by detecting the vocalisations of vocal species across a variety of habitats, researchers can investigate populations and plot trends in abundance, habitat use and biodiversity. These datasets are crucial for establishing baselines so that impacts from human activity, such as construction, can be quantified and better managed. An exciting aspect of ecological acoustics (such as bioacoustics) is that vocal animals lend themselves so well to passive acoustic monitoring (PAM) technology. PAM’s power is its scalability: many autonomous loggers (transmitting in real time or recording to memory cards) can be deployed in remote habitats that are too difficult to study with visual methods such as camera surveys, aerial surveys or line transects.

 

The ocean is one such place: it is very expensive and difficult to sample at scale. Weather and visibility are constant constraints when studying the habitat use and presence of marine mammals, as visual surveys require fine weather and daylight hours. This inherently introduces sampling biases into some datasets, where marine mammal ‘hot spots’ tend to overlap with areas of higher human presence. However, whales and dolphins are active at night and can move to different areas than those they use during the day. Passive acoustic monitoring (PAM) of whale and dolphin vocalisations, and even of fishes and invertebrates, can therefore reveal a great deal about their ecology and provides researchers with a more complete picture. Furthermore, human presence or activity in marine habitats can be monitored using the same devices by detecting the noise of vessels.

 

Challenges faced

 

Because PAM can provide continuous ecological data on difficult-to-study species without being restricted by weather or light conditions, acoustic loggers are becoming increasingly common in conservation science. In the oceans, hydrophones are used to eavesdrop on reefs, the deep ocean, estuaries and coastlines. The most common technology used around the world is still the autonomous recorder: a depth-rated device consisting of one or more hydrophones connected to a pre-amplifier, an analogue-to-digital converter (ADC), a controller and a storage device. Recorders are deployed at all depths (depending on the device used), typically for several months at a time (up to a year in many cases). Some PAM systems have hydrophones cabled to shore stations and operate on a more permanent basis over many years. These technologies leave researchers curating hundreds of thousands of hours of recordings that demand automated analyses to extract meaningful information. This is where deep learning has been a game changer.
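To see why automated analysis becomes unavoidable, consider the storage a single uncompressed recorder accumulates over one deployment. A minimal sketch of the arithmetic (the deployment figures are illustrative):

```python
def storage_gb(sample_rate_hz, bit_depth, channels, days, duty_cycle=1.0):
    """Uncompressed PCM storage needed for a recorder deployment, in GB."""
    bytes_per_sec = sample_rate_hz * (bit_depth // 8) * channels
    return bytes_per_sec * 86_400 * days * duty_cycle / 1e9

# Six months of continuous 96 kHz, 24-bit mono recording:
print(round(storage_gb(96_000, 24, 1, 182), 1), "GB")  # -> 4528.7 GB
```

Roughly 4.5 TB per recorder per six months, multiplied across a network of sites, is what ends up queued for analysis after retrieval.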

 

Deep-learning models and AI-accelerated hardware have unlocked PAM’s true scalability by enabling scientists to analyse terabytes of recordings in reasonable time frames. Transfer learning means researchers can use the annotated datasets collected over their careers to fine-tune highly efficient deep-learning models for their own purposes. And the advent of AI-accelerated hardware means those models run at unprecedented speeds compared with algorithms running on CPUs, including more traditional machine-learning classifiers such as random forest models. However, when autonomous loggers are retrieved, especially from multiple habitats after several months of recording, bottlenecks delay data analyses (assuming the instruments were successfully found and retrieved at all). Autonomous loggers are also at risk of water ingress (i.e. flooding) and memory card corruption, as with any technology submerged in water, which means data can be lost unknowingly, and the resulting gaps in baseline datasets can prevent scientists from drawing statistically robust conclusions.
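The transfer-learning idea above can be sketched end to end: freeze a pretrained embedding model and train only a small linear head on labelled clips. Everything below is synthetic stand-in data (random “embeddings”, labels from a hidden linear rule); a real pipeline would feed in embeddings from a pretrained audio model instead:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for embeddings from a frozen pretrained audio model,
# one 16-d vector per annotated clip (real embeddings are larger).
X = rng.normal(size=(200, 16))
w_true = rng.normal(size=16)
y = (X @ w_true > 0).astype(float)  # synthetic presence/absence labels

# "Fine-tune" only a lightweight linear head: logistic regression
# trained by plain gradient descent on top of the frozen embeddings.
w = np.zeros(16)
for _ in range(500):
    z = np.clip(X @ w, -30, 30)        # clip to avoid exp overflow
    p = 1 / (1 + np.exp(-z))           # sigmoid
    w -= 0.1 * X.T @ (p - y) / len(y)  # mean cross-entropy gradient

acc = ((X @ w > 0).astype(float) == y).mean()
print(f"training accuracy: {acc:.2f}")
```

Because only the head is trained, this kind of fine-tune runs in seconds even on modest hardware, which is what makes career-long annotation archives so valuable.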

 

Deploying deep-learning models at the edge circumvents these analysis bottlenecks, providing a substantial advantage to researchers who use ‘big data’ to study their ecosystems. Transmitting the processed data in real time also addresses data-security concerns, as detections can be immediately retrieved, archived and backed up.

 

However, the ocean is noisy, and marine mammal monitoring often occurs in areas that also contain a lot of human activity. For example, working ports, construction sites, shipping lanes, urbanised estuaries and harbours are all areas with many different anthropogenic (man-made) noise sources. Furthermore, weather, currents (tides), biofouling and other animal sounds (such as those of fishes and invertebrates) also produce sounds that need to be ‘filtered out’ (i.e., removed) so as not to mask target vocalisations or lead to false positives (i.e., reporting detections of vocalisations that are, in fact, from another source, such as weather or a vessel’s sonar). Regular model updates are therefore important, so that the onboard AI continues to be trained after the initial data is verified by trained bioacousticians (i.e., a human-in-the-loop training pipeline). Over-the-air updates mean advancing technology, such as mobile-friendly model architectures trained using Google Research’s agile modelling frameworks, can be pushed to sea without retrieving the device.
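The ‘filtering out’ step described above can be illustrated with a crude FFT brick-wall band-pass that strips low-frequency, vessel-like rumble while keeping a mid-frequency ‘whistle’. Real systems would use proper Butterworth or FIR filters; the frequencies here are illustrative:

```python
import numpy as np

def bandpass(signal, fs, lo, hi):
    """Zero spectral content outside [lo, hi] Hz (FFT brick-wall filter).
    A crude stand-in for the Butterworth/FIR filters used in practice."""
    spec = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), 1 / fs)
    spec[(freqs < lo) | (freqs > hi)] = 0
    return np.fft.irfft(spec, n=len(signal))

fs = 96_000                      # matches the buoy's sampling rate
t = np.arange(fs) / fs           # one second of audio
# An 8 kHz "whistle" buried under a louder 100 Hz vessel-like rumble:
x = np.sin(2 * np.pi * 8_000 * t) + 3 * np.sin(2 * np.pi * 100 * t)
y = bandpass(x, fs, 2_000, 20_000)
```

After filtering, the rumble’s energy at 100 Hz is gone while the 8 kHz tone survives, so a downstream classifier sees far fewer masking or false-positive triggers.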

 

The CM5 solution (why ARMSoM CM5?)

 

Adaptive, multi-threaded edge processing of multiple acoustic sources demands computationally capable, yet extremely efficient, hardware and software. ARMSoM’s system-on-modules (SoMs) strike the right balance between computational power for AIoT and power consumption (0.5-1 W idle running Armbian). The RK3576 SoC, with its 4× Cortex-A72 and 4× Cortex-A55 cores, allows for tailored resource use, while the NPU and LPDDR5 RAM provide extremely low latency for real-time audio stream classification.
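The ‘tailored resource use’ mentioned above usually means pinning workloads to specific CPU clusters. A minimal Linux sketch using `os.sched_setaffinity`; the mapping of core numbers to the A72/A55 clusters is an assumption and must be checked against the running board:

```python
import os

def pin_to_cores(cores):
    """Pin the calling process to a set of CPU cores (Linux only).

    On the RK3576 one might reserve the Cortex-A72 cluster for model
    inference and leave the Cortex-A55 cluster for audio I/O and
    telemetry. The core numbering is an assumption -- verify it against
    /proc/cpuinfo on the target board.
    """
    os.sched_setaffinity(0, set(cores))  # pid 0 = current process
    return os.sched_getaffinity(0)

# Example: keep this (lightweight) process on core 0 only.
print(pin_to_cores({0}))
```

Pinning inference threads to the big cores and housekeeping to the little cores keeps classification latency predictable without raising idle power draw.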

 

The advent of NPUs integrated with ARM processors presents real advantages for edge machine learning in bioacoustics. Microcontrollers such as the ESP32-S3 remain highly relevant, but ARMSoM’s adoption of RK35xx SoCs in its boards enables parallel processing pipelines and concurrent inference for audio classification. This is hugely powerful: different acoustic analyses can be performed on a single hydrophone stream to listen simultaneously for multiple species of whales and dolphins, as well as other sources.
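Concurrent inference on one stream can be sketched as fanning each audio chunk out to several detectors in parallel. The detectors below are trivial placeholder functions standing in for real NPU-accelerated models:

```python
import threading

# Placeholder detectors; real ones would be NPU-accelerated classifiers.
# Each returns a score for one target sound class.
def orca_detector(chunk):    return max(chunk) * 0.9
def dolphin_detector(chunk): return sum(chunk) / len(chunk)
def vessel_detector(chunk):  return min(chunk)

detectors = {"orca": orca_detector,
             "dolphin": dolphin_detector,
             "vessel": vessel_detector}

def run_concurrent(chunk):
    """Fan one audio chunk out to all classifiers in parallel threads,
    mirroring how a single hydrophone stream can feed multiple models."""
    results = {}
    def worker(name, fn):
        results[name] = fn(chunk)
    threads = [threading.Thread(target=worker, args=item)
               for item in detectors.items()]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

print(run_concurrent([0.1, 0.5, 0.2]))
```

In production the same fan-out pattern applies, with each worker dispatching its model to the NPU rather than computing a toy score.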

 

Then there is the cost benefit of using open-source SoMs like the ARMSoM CM5. Monitoring the oceans with scientific-grade hardware and instruments is very expensive, as are real-time acoustic monitoring systems and bioacoustic signal detectors/classifiers. This can make passive acoustic monitoring cost-prohibitive, especially for developing nations or not-for-profit organisations. A 2024 survey of researchers in governments, universities and NGOs across the world found 90% of respondents reporting a need for low-cost autonomous sound recording devices, giving rise to the international low-cost hydrophone project. By using open-source computing hardware, build costs for real-time acoustic monitoring can be lower than for systems built around specialised microcontrollers.

 

 

 

Implementation process

 

To demonstrate the advantages of open-source hardware for edge processing in bioacoustics, we needed a demonstration buoy. We specialise in developing and deploying real-time acoustic monitoring systems for marine science, but our technology requires a platform to work from. The ARMSoM CM5 and its peripheral hardware were installed inside a NexSens buoy in Whangarei Harbour in the north of New Zealand, in partnership with Northport Limited. Hydrophones were installed below the buoy, connected to a 24-bit, 96 kHz analogue-to-digital converter (ADC) and then to the compute module. Connectivity to AWS for real-time data transmission was achieved over the 4G cellular network. Detection data (audio, spectrograms and metadata) are sent to AWS, where further analyses can be done shore-side.
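The shore-side detection record can be sketched as a small JSON payload assembled on the module before transmission. The field names and S3 keys below are illustrative, not Cetaware’s actual schema:

```python
import json
import time

def detection_payload(species, confidence, spectrogram_key, audio_key):
    """Assemble the metadata record sent shore-side with each detection.
    Field names are illustrative, not the deployed system's schema."""
    return {
        "timestamp_utc": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "species": species,
        "confidence": round(confidence, 3),
        "spectrogram_s3_key": spectrogram_key,
        "audio_s3_key": audio_key,
    }

msg = json.dumps(detection_payload("orca", 0.9234,
                                   "spectrograms/det_0001.png",
                                   "audio/det_0001.wav"))
print(msg)
# In deployment this JSON would be published over the LTE link, e.g. via
# an S3 put or an AWS IoT MQTT publish, alongside the referenced files.
```

Keeping the payload small (metadata plus object-store keys, rather than raw audio inline) is what makes Cat-M1/4G bandwidth sufficient for real-time reporting.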


Orca photo credit: Ingrid Visser


Outcomes and benefits

 

The RK3576 is a powerful SoC that comfortably runs our entire real-time AI pipeline, with the added benefits of LPDDR5 RAM and lower power consumption than our previous hardware. Currently installed inside the buoy in Whangarei, it has transmitted data on a range of dolphin species in real time to our online dashboard without fail. Over the past 13 months, the buoy has provided acoustic data on 219 encounters with marine mammals entering or leaving the harbour. Of those 219 encounters, 24 were orca; bottlenose dolphins made up most of the remainder (three were NZ fur seals).



