Physician-patient speech separation method based on voiceprint technology and privacy protection

Open Access

Issue		BIO Web Conf., Volume 111, 2024: 2024 6th International Conference on Biotechnology and Biomedicine (ICBB 2024)


Article Number		03015
Number of page(s)		6
Section		Medical Testing and Health Technology Integration
DOI		https://doi.org/10.1051/bioconf/202411103015
Published online		31 May 2024

