Improved Patch-Mix Transformer and Contrastive Learning Method for Sound Classification in Noisy Environments
@article{Chen2024ImprovedPT,
title={Improved Patch-Mix Transformer and Contrastive Learning Method for Sound Classification in Noisy Environments},
author={Xu Chen and Mei Wang and Ruixiang Kan and Hongbing Qiu},
journal={Applied Sciences},
year={2024},
url={https://api.semanticscholar.org/CorpusID:273594112}
}
This paper proposes a contrastive learning-based Audio Spectrogram Transformer that incorporates a Patch-Mix mechanism and adaptive contrastive learning strategies, and that also improves and applies adaptive data augmentation techniques for model training.
26 References
ResNet Based on Multi-Feature Attention Mechanism for Sound Classification in Noisy Environments
- 2023
Computer Science, Environmental Science
A three-feature fusion method combining ResNet with attention (Net50_SE) is proposed to maximize information representation in environmental sound signals, overcoming the structural impact of urban noise on audio signals and improving classification accuracy.
EnvGAN: a GAN-based augmentation to improve environmental sound classification
- 2022
Environmental Science, Computer Science
This paper introduces an architecture named EnvGAN for the adversarial generation of environmental sounds, and presents a method for GAN-based augmentation in the context of environmental sound classification, especially suitable for handling class-imbalanced datasets.
Environmental sound classification using a regularized deep convolutional neural network with data augmentation
- 2020
Environmental Science, Computer Science
Sound Classification and Processing of Urban Environments: A Systematic Literature Review
- 2022
Computer Science, Engineering
The review finds that deep learning architectures, attention mechanisms, data augmentation techniques, and pretraining are the most crucial factors to consider when building an efficient sound classification model.
Efficient Classification of Environmental Sounds through Multiple Features Aggregation and Data Enhancement Techniques for Spectrogram Images
- 2020
Environmental Science, Computer Science
This paper reports on aggregating various acoustic features and applying data enhancement approaches to classify environmental sounds effectively with the transfer learning model DenseNet-161, and introduces two novel features based on the logarithmic scale of the Mel spectrogram, denoted L2M and L3M.
Environmental Sound Classification Based on Transfer-Learning Techniques with Multiple Optimizers
- 2022
Environmental Science, Computer Science
This paper aims to determine the effectiveness of pre-trained convolutional neural networks (CNNs) for audio categorization and the feasibility of retraining, and investigates hyper-parameters and optimizers (learning rate, number of epochs, and the Adam, Adamax, and RMSprop optimizers) for several pre-trained models.
High Accurate Environmental Sound Classification: Sub-Spectrogram Segmentation versus Temporal-Frequency Attention Mechanism
- 2021
Environmental Science, Computer Science
This paper proposes an ESC framework based on sub-spectrogram segmentation with score-level fusion, and adopts a proposed convolutional recurrent neural network (CRNN) to improve classification accuracy.
Transformers for Urban Sound Classification—A Comprehensive Performance Evaluation
- 2022
Engineering, Environmental Science
Many relevant sound events occur in urban scenarios, and robust classification models are required to identify abnormal and relevant events correctly. These models need to identify such events within…
Deep Convolutional Neural Networks and Data Augmentation for Environmental Sound Classification
- 2017
Environmental Science, Computer Science
It is shown that the improved performance stems from the combination of a deep, high-capacity model and an augmented training set: this combination outperforms both the proposed CNN without augmentation and a “shallow” dictionary learning model with augmentation.
Rethinking environmental sound classification using convolutional neural networks: optimized parameter tuning of single feature extraction
- 2021
Computer Science, Environmental Science
This paper proposes a novel technique that uses only a single feature, the Mel-frequency cepstral coefficient (MFCC), and a CNN with just three layers, and demonstrates that such a simple network can considerably outperform several conventional and deep learning-based algorithms.