The intelligent classification method of aircraft cockpit sound based on deep learning

Di ZHANG; Yuantong CHAI; Peipei ZENG; Juan YANG

doi:10.1051/jnwpu/20254340784

Open Access

Issue: JNWPU Volume 43, Number 4, August 2025
Page(s): 784–793
DOI: https://doi.org/10.1051/jnwpu/20254340784
Published online: 08 October 2025

JNWPU 2025, 43(4): 784–793


基于深度学习的飞机舱音智能分类方法

Di ZHANG (张迪)¹, Yuantong CHAI (柴源通)², Peipei ZENG (曾佩佩)¹ and Juan YANG (杨娟)¹

¹ Engineering Technology Training Center, Civil Aviation University of China, Tianjin 300300, China
² College of Electronic Information and Automation, Civil Aviation University of China, Tianjin 300300, China

Received: 9 September 2024

Abstract

Key background sounds in the cockpit provide important evidence for flight monitoring evaluation and accident investigation. Cockpit voice recorder (CVR) audio recognition is highly complex and data-intensive, low-frequency transient background sounds are particularly difficult to identify, and engine noise causes interference. To address these problems, this paper proposes an intelligent classification method for CVR background sounds based on deep learning. A dataset of 10 types of CVR background sounds was established, acoustic features were extracted using three spectrogram methods, and a time-delay neural network (TDNN) model was built. Context-aware masking modules were used to reduce the impact of noise on switch and operational sounds, and a front-end convolution module was used to capture low-frequency transient signals, yielding an optimized hybrid convolutional and time-delay neural network model, TDNN-CF. The improved model achieved a classification accuracy of 98.90%, which is 13.04 and 2.99 percentage points higher than the traditional CNN and TDNN models, respectively. Compared with classic machine learning algorithms, namely decision trees, random forests, and K-nearest neighbors (KNN), accuracy improved by 18.07, 15.62, and 14.55 percentage points, respectively. The experimental results show that the proposed method classifies CVR audio efficiently.

摘要

飞机舱音的关键背景声为航空器飞行监控评估与事故调查分析提供了重要的依据。针对驾驶舱话音记录器(CVR)音频识别的高专业性和数据密集型特征、低频瞬时背景声识别难度高以及发动机噪声干扰的问题, 提出了一种基于深度学习的CVR背景声智能分类方法。该方法以十类CVR背景声建立数据集; 采用3种特征谱图提取声学特征, 并搭建时延神经网络模型; 利用上下文掩蔽模块降低噪声对开关和操作声音的影响, 使用前端卷积模块捕捉低频瞬时声信号, 进而优化出卷积神经和时延神经的混合模型TDNN-CF。改进后模型的CVR音频分类准确率达到98.90%, 相较于传统的卷积神经网络和时延神经网络模型, 其准确率分别提升了13.04和2.99个百分点。此外, 与决策树、随机森林和K近邻等经典机器学习算法相比, 准确率分别提升了18.07, 15.62和14.55个百分点。实验结果表明, 所提方法实现了CVR音频的高效分类。

Key words: cockpit voice recorder / sound classification / characteristic spectrum / time delay neural network / context-aware masking

关键字 : 驾驶舱话音记录器 / 声音分类 / 特征谱图 / 时延神经网络 / 上下文掩蔽模块
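The abstract reports TDNN-CF accuracy together with percentage-point gains over five baselines, but does not state the baseline accuracies themselves. The short Python sketch below recovers them by subtraction, assuming each gain is an absolute difference in accuracy (which is what "percentage points" normally means). The names `TDNN_CF_ACCURACY`, `GAINS`, and `implied_baselines` are illustrative and do not come from the paper; the derived figures are arithmetic consequences of the abstract, not values quoted from it.

```python
# Figures taken from the abstract: TDNN-CF accuracy (%) and its reported
# gains (percentage points) over each baseline model.
TDNN_CF_ACCURACY = 98.90
GAINS = {
    "CNN": 13.04,
    "TDNN": 2.99,
    "decision tree": 18.07,
    "random forest": 15.62,
    "KNN": 14.55,
}


def implied_baselines(best, gains):
    """Return the baseline accuracy implied by each percentage-point gain.

    Assumes gain = best - baseline (an absolute difference, not a ratio).
    """
    return {name: round(best - gain, 2) for name, gain in gains.items()}


baselines = implied_baselines(TDNN_CF_ACCURACY, GAINS)
for name, accuracy in baselines.items():
    print(f"{name}: {accuracy:.2f}%")
```

Under that assumption, the implied baseline accuracies are roughly 85.86% for the CNN, 95.91% for the TDNN, 80.83% for decision trees, 83.28% for random forests, and 84.35% for KNN.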

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
