Edge AI Evangelist’s Thoughts Vol.16: Smart Edge Devices with Spresense

Hello everyone, this is Haruyuki Tago, Edge Evangelist at HACARUS’ Tokyo R&D center.

In this series of articles, I will share some insights from my decades of experience in the semiconductor industry and I will comment on various AI industry-related topics from my unique perspective.

In this volume, I will talk about a recent experiment I conducted using Sony’s Spresense board to monitor the rotational noise from a model motor. The Spresense was also used to determine if the motor was running in a normal or abnormal state.  


Using an Edge Device to Determine Normal & Abnormal Motor Rotation

In previous volumes, I have used the Spresense to analyze Boston housing data with Lasso [1]. Once again, I used the Sony Spresense but this time it was as a smart edge device to analyze motor noise. I’ve included a short video above to demonstrate how the prototype ‘Realtime Sound Analyzer’ was used to classify the motor state as either normal or abnormal. 

Below, I will give a brief overview of the Realtime Sound Analyzer as well as its implementation in Spresense. I will also explain the experimental procedures and the accompanying results. 

Benefits of Sound Processing on Edge Devices

Frequency analysis is a common technique for analyzing noise profiles such as motor rotation. I used it in this experiment and considered two cases: one where the analysis is performed in the cloud and one where it is performed on the edge device. Figure 1 compares the data transfer rate required by each method.

A table that explains the method, data transfer rate, and formula for the cloud and edge device.

Figure 1 Comparison of data transfer rates for the cloud and the edge device

Based on Figure 1, uploading raw sound data to the cloud requires a transfer rate of 704 kbit/s. On the other hand, when an edge device analyzes the sound frequency, calculates the spectral data, and sends only that to the cloud, the data transfer rate is just 1.231 kbit/s. These values only apply to cases where the frequencies are fairly consistent, but they still show a difference in required data transfer rate of around 571 times.
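As a quick check of the arithmetic, the short Python sketch below recomputes the reduction factor from the two rates quoted from Figure 1. It also includes a generic helper for the bit rate of uncompressed PCM audio. The helper name `raw_pcm_bitrate` and the parameters passed to it (44,000 samples per second, 16 bits, one channel) are assumptions chosen for illustration because they happen to reproduce 704 kbit/s; they are not taken from the formula in Figure 1.

```python
def raw_pcm_bitrate(sample_rate_hz: int, bits_per_sample: int, channels: int = 1) -> int:
    """Bit rate of uncompressed PCM audio in bit/s."""
    return sample_rate_hz * bits_per_sample * channels


# Rates quoted in the text for Figure 1, in kbit/s.
RAW_UPLOAD_KBPS = 704.0
SPECTRUM_UPLOAD_KBPS = 1.231

# How many times less data the edge approach sends.
reduction_factor = RAW_UPLOAD_KBPS / SPECTRUM_UPLOAD_KBPS

# ASSUMPTION: these audio parameters are illustrative, not from the article.
assumed_raw_bps = raw_pcm_bitrate(44_000, 16, 1)

print(f"reduction factor: {reduction_factor:.1f}x")
print(f"assumed raw PCM rate: {assumed_raw_bps / 1000:.0f} kbit/s")
```

The ratio comes out slightly under 572, which matches the "around 571 times" figure above.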

Furthermore, the amount of data transferred and the strain on the cloud processing can be reduced even further by only notifying the cloud when an abnormal frequency is detected. This is where Spresense really shines because it can capture sound at a sample rate of 48,000 times per second, which is slightly higher than the 44,100 times per second used for a music CD. The Spresense can also perform the frequency analysis itself using a Fast Fourier Transform (FFT).
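The idea of notifying the cloud only on abnormal readings can be sketched in a few lines of Python. This is a minimal illustration, not the code that ran on the Spresense: the function name `abnormal_events`, the threshold, and the sample scores are all made up for the example.

```python
from typing import Iterable, Iterator, Tuple

Measurement = Tuple[float, float]  # (timestamp in seconds, spectral difference score)


def abnormal_events(measurements: Iterable[Measurement], threshold: float) -> Iterator[Measurement]:
    """Yield only the measurements whose score exceeds the threshold."""
    for timestamp, score in measurements:
        if score > threshold:
            yield (timestamp, score)


# Hypothetical scores, one per analysis cycle.
measurements = [(0.0, 0.02), (0.65, 0.03), (1.30, 0.41), (1.95, 0.38), (2.60, 0.02)]

# Only these entries would need to be sent to the cloud.
to_upload = list(abnormal_events(measurements, threshold=0.2))
print(f"{len(to_upload)} of {len(measurements)} measurements would be sent")
```

With a filter like this, the uplink carries nothing at all while the motor sounds normal.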

Real-time Sound Analyzer Overview & Experimental Steps

Starting with the experiment, Figure 2 shows the experimental setup for the prototype Realtime Sound Analyzer along with some simple explanations. 

A picture of the experimental device and a simple description for each of the components.

Figure 2 Real-time sound analyzer and experimental equipment

The Real-time Sound Analyzer shown consists of the Spresense, a connected microphone, and an operation panel (lower part of Figure 2). The system analyzed the rotational sound of the motor with factory noise from the PC speakers superimposed on it to simulate real working conditions. 

To simulate a variety of abnormal situations, I changed the resistance value between the model motor and the power supply. The voltage applied to the motor varied in four steps (2.88V, 2.28V, 2.00V, and 1.58V). The rotational sound set at 2.28V acted as the normal sound. I then tested to see if the remaining three rotational sounds could be detected as abnormal sounds. 

The Real-time Sound Analyzer also had two modes, ‘reference sound capture’ mode and ‘sound test’ mode, which were toggled using the mode switch. The process flow for each of these modes is shown in Figure 3.

To begin the test, I started by setting the mode switch to the reference sound capture mode. I then pressed the reference sound capture switch to acquire the reference sound. At this time, the reference sound was subjected to FFT to calculate the frequency spectrum, shown on the left of Figure 3.  

A diagram that shows the mode switch apparatus and the flowchart of both the sound test and reference sound capture modes.

Figure 3 Processing flow of the Real-time Sound Analyzer

This data was written to the micro SD card on the Spresense expansion board. Next, I switched the mode switch to sound test mode in order to analyze the sound to be tested. While in test mode, the Spresense repeated the following loop:

  1. Acquired the target sound
  2. Calculated the frequency spectrum of the target sound using FFT
  3. Calculated the difference between the reference sound spectrum and the test spectrum
  4. Compared the difference with the criterion value of normal and abnormal states and displayed the results on the LED shown on the right of Figure 3
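The loop above can be sketched in plain Python as follows. This is an illustration under several assumptions rather than the program that ran on the Spresense: the FFT is a small textbook radix-2 implementation, the sound is a synthetic sine tone instead of microphone input, the difference metric is a simple mean absolute difference used as a placeholder, and the threshold of 0.5 is arbitrary. All function names are hypothetical.

```python
import cmath
import math
from typing import List, Sequence


def fft(samples: Sequence[complex]) -> List[complex]:
    """Recursive radix-2 Cooley-Tukey FFT; len(samples) must be a power of two."""
    n = len(samples)
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])
    odd = fft(samples[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def magnitude_spectrum(samples: Sequence[float]) -> List[float]:
    """Magnitudes of the first N/2 bins (the non-redundant half for real input)."""
    bins = fft(samples)
    return [abs(b) for b in bins[: len(samples) // 2]]


def spectral_difference(reference: Sequence[float], test: Sequence[float]) -> float:
    """Placeholder metric: mean absolute difference between two spectra."""
    return sum(abs(r - t) for r, t in zip(reference, test)) / len(reference)


def classify(reference: Sequence[float], samples: Sequence[float], threshold: float) -> str:
    spectrum = magnitude_spectrum(samples)            # step 2: FFT of the target sound
    diff = spectral_difference(reference, spectrum)   # step 3: compare with the reference
    return "abnormal" if diff > threshold else "normal"  # step 4: apply the criterion


def tone(freq_hz: float, n: int = 1024, rate_hz: int = 48_000) -> List[float]:
    """Synthetic stand-in for step 1 (acquiring sound from the microphone)."""
    return [math.sin(2 * math.pi * freq_hz * i / rate_hz) for i in range(n)]


# Reference capture mode: store the spectrum of the "normal" sound.
reference = magnitude_spectrum(tone(1500.0))

# Sound test mode: classify a matching tone and a shifted tone.
print(classify(reference, tone(1500.0), threshold=0.5))
print(classify(reference, tone(1200.0), threshold=0.5))
```

A tone at the reference frequency is reported as normal, while a tone at a different frequency moves energy into other bins and is reported as abnormal.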

Implementation in Spresense – A Powerful Sound Processor

Looking at the hardware, the Spresense includes the CXD5247 chip, which handles sound input/output processing and power control. It also includes the CXD5602 chip, which comes equipped with six ARM Cortex-M4F cores, as shown in Figure 4 [2]. 

Shifting to the software development environment, I used the Spresense Arduino 1.8.13. It provides the Asymmetric MultiProcessing (ASMP) framework, which supports communication between the cores, as well as the sound input functions [3][4]. In this experiment, I used one analog microphone channel. 

A layout of the CXD5247 and CXD5602 chips.

Figure 4 Specifications of SPRESENSE with LSI (CXD5602 & CXD5247) [2]

Figure 5 shows the processing flow of the Spresense. The main program on MainCore communicates with the sound acquisition program of the CXD5247 chip and, through the ASMP framework, with the FFT program on SubCore1, which runs in parallel [4]. 

For the ASMP programming, I used the Spresense sample program, ‘Sound Detector’, as a reference [5]. A 1024-point FFT took only 0.065 seconds to perform. The stability of the resulting spectrum was further improved by running the FFT ten times and averaging the results. Even with ten runs, the cycle time was only around 0.65 seconds and any computational delay was unnoticeable.
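The averaging step and the timing figures can be illustrated with a short Python sketch. The FFT size, per-FFT time, and number of runs are taken from the text. The 48,000 Hz sample rate is the capture rate mentioned earlier, and applying it to this FFT is an assumption. The helper `average_spectra` and the tiny three-bin spectra are hypothetical and only show the element-wise mean.

```python
from typing import List, Sequence

SAMPLE_RATE_HZ = 48_000  # ASSUMPTION: capture rate mentioned earlier in the article
FFT_POINTS = 1024        # from the text
FFT_TIME_S = 0.065       # from the text
RUNS = 10                # from the text


def average_spectra(spectra: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise mean of equally sized spectra."""
    count = len(spectra)
    return [sum(bin_values) / count for bin_values in zip(*spectra)]


bin_width_hz = SAMPLE_RATE_HZ / FFT_POINTS      # frequency resolution per bin
frame_duration_s = FFT_POINTS / SAMPLE_RATE_HZ  # audio covered by one FFT frame
cycle_time_s = RUNS * FFT_TIME_S                # total time for ten FFT runs

# Two made-up spectra with three bins each.
noisy = [[1.0, 4.0, 0.0], [3.0, 4.0, 2.0]]
averaged = average_spectra(noisy)

print(f"bin width: {bin_width_hz:.3f} Hz")
print(f"frame duration: {frame_duration_s * 1000:.2f} ms")
print(f"cycle time: {cycle_time_s:.2f} s")
print(f"averaged spectrum: {averaged}")
```

Averaging several spectra reduces the bin-to-bin fluctuation caused by random noise, which is why the averaged spectrum gives a steadier basis for comparison than a single FFT.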

Flow charts for each of the programs run using the CXD5602 chip.

Figure 5: Spresense processing flow

Experimental Results

Figure 6 maps out the experimental scenario for Steps 1 through 7. Because the reference sound was captured at 2.28V, Steps 1, 5, and 7, which match it, should be considered normal sound states. This also means that Steps 2, 3, 4, and 6 should all be considered abnormal states.

Figure 6: Experimental condition scenario for the video

When checking for abnormal noise states, the difference between the reference and test sound spectra was calculated. There are various indices that can express the difference between two spectra. In this experiment, I used cross-entropy [6], which is widely used in machine learning.
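For readers unfamiliar with cross-entropy, here is a minimal Python sketch of how it could be applied to two spectra. The exact preprocessing used in the experiment is not described, so the normalization (scaling each magnitude spectrum to sum to 1) and the small `eps` added to avoid taking the logarithm of zero are assumptions, as are the function names and the four-bin example spectra.

```python
import math
from typing import List, Sequence


def normalize(spectrum: Sequence[float], eps: float = 1e-12) -> List[float]:
    """Scale a non-negative spectrum so it sums to 1, like a probability distribution."""
    total = sum(spectrum) + eps * len(spectrum)
    return [(value + eps) / total for value in spectrum]


def cross_entropy(reference: Sequence[float], test: Sequence[float]) -> float:
    """H(p, q) = -sum(p_i * log(q_i)) for the normalized reference p and test q."""
    p = normalize(reference)
    q = normalize(test)
    return -sum(p_i * math.log(q_i) for p_i, q_i in zip(p, q))


# Made-up spectra: the test peak has moved to a different bin.
reference = [0.0, 8.0, 1.0, 1.0]
same = [0.0, 8.0, 1.0, 1.0]
shifted = [8.0, 0.0, 1.0, 1.0]

baseline = cross_entropy(reference, same)
changed = cross_entropy(reference, shifted)
print(f"matching spectrum: {baseline:.3f}")
print(f"shifted spectrum:  {changed:.3f}")
```

Note that the cross-entropy of a spectrum with itself is not zero; it equals the entropy of that spectrum. What matters for detection is that the value rises above this baseline when the test spectrum departs from the reference.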

Figure 7 plots the spectrum differences for each of the seven steps compared to the reference sound spectrum. The horizontal axis shows the elapsed video time and the vertical axis shows the spectral difference at each measurement point. The green circles indicate the normal sounds while the red circles indicate abnormal sounds. These results are consistent with the predictions for the theoretical scenario shown in figure 6. 

A display that shows the spectrum groupings for each of the 7 steps. They are circled by a red circle to indicate abnormal noise and a green circle for normal noise.

Figure 7: Spectrum differences for test Steps 1 to 7


  • In today’s volume, I developed a prototype ‘Real-time Sound Analyzer’ using Sony’s Spresense as a smart edge device. The Spresense was used to dramatically reduce the amount of data transferred to the cloud. It also decreases the processing load on the cloud because it acquires, analyzes, and differentiates normal and abnormal sounds on the edge device. 
  • The experiment used the Real-time Sound Analyzer to differentiate normal and abnormal sounds for a model motor by monitoring the rotational noise. The motor noise was also superimposed with factory noise to simulate real working conditions.
  • By calculating the differences in sound spectra, the experimental setup was able to distinguish between normal and abnormal sounds. This was done using four voltage steps, where one acted as the reference sound and the other three created abnormal sounds. 


[1] Edge AI Evangelist’s Thoughts Vol.8: Edge AI with Spresense - Analyzing Boston Housing Data with LASSO


[2] Sony Spresense product information


[3] Spresense Arduino Library Developer Guide


[4] 5.7. ASMP Framework


[5] Spresense Arduino Examples & Tutorials, SoundDetector


[6] Cross entropy

