Phil Rose papers on Forensic Voice Comparison

2026	Tianle Yang, Chengzhe Sun, Phil Rose, Cassandra L. Jacobs, Siwei Lyu: Assessing the Ability of Neural TTS Systems to Model Consonant-Induced F0 Perturbation. Computer Speech and Language 100. This study tests the ability of neural TTS models to reproduce consonant-induced f0 perturbation. Results show accurate reproduction for high-frequency words but poor generalization to low-frequency items, suggesting that the TTS architectures examined rely more on lexical-level memorization than on abstract segmental-prosodic encoding. A copy of this paper can be downloaded before 5th May 2026 from: https://authors.elsevier.com/c/1mo8J_K8BYv8vk [preprint.pdf]
2025	Tianle Yang, Chengzhe Sun, Siwei Lyu, Phil Rose: Forensic deepfake audio detection using segmental speech features. 2025 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), Hawaii. This study explores the potential of using acoustic features of segmental speech sounds to detect deepfake audio. The results demonstrate that certain segmental features commonly used in forensic voice comparison are effective in identifying deepfakes, whereas some global features provide little value. [preprint.pdf]
2024	Applications of the Likelihood Ratio Framework in Forensic Speech Science Cases Involving Disputed Utterances, Tampering and Voice Lineups. Peer-reviewed paper in Olga Maxwell & Rikke Bundgaard-Nielsen (eds.) 19th Australasian Int'l conf. on Speech Science and Technology, Melbourne. Examples are given of the use of the likelihood ratio framework in real world case-work involving disputed utterances and tampering, and its theoretical application in voice lineups. Some of the problems in application are pointed out, especially with respect to estimation of priors. [pdf]
2022	Likelihood Ratio-based Forensic Semi-automatic Speaker Identification with Alveolar Fricative Spectra in a Real-world Case. Peer-reviewed paper accepted for 18th Australasian Int'l conf. on Speech Science and Technology, Canberra. [this is intended as a useful summary of the carbombing case in the 2017 report below] [pdf]
2021	Forensic Semi-automatic Voice Comparison - an Explanation using Chinese Speech Sounds. This paper was originally commissioned as a chapter for the Palgrave Handbook of Chinese Language Studies, but I withdrew it due to incompetent copyediting. [pdf]
2018	(with ZHANG Cuiling) Conversational Style Mismatch: its Effect on the Evidential Strength of Long-term F0 in Forensic Voice Comparison. Proc. 17th Australasian Int'l conf. on Speech Science and Technology, Sydney (SST 2018), pp. 157-160. [full paper peer review]. [pdf]
2017	Likelihood ratio-based forensic voice comparison with higher level features: research and reality. In Eduardo Lleida & Luis J. Rodriguez-Fuentes (eds.) Recent Advances in Speaker and Language Recognition and Characterisation. Computer Speech and Language Special Issue: 476-502. [This is an extended and revised version of Bruce Wang's and my Odyssey 2016 paper. It includes additional results of testing with another Cantonese higher level feature - bandlimited cepstral spectrum of nasal /m/ - and discussion of more real-world forensic voice comparison cases.] [preprint pdf]
2017	Report in a real Forensic Voice Comparison case involving the (non-terrorist) blowing up of a car. This is the anonymised report of an interesting forensic voice comparison I did in 2017, where the aim was exculpation by identification. The case was interesting for several reasons. One was the feature used: the report compared questioned and known data with respect to the spectra of their alveolar fricatives /s/ and /z/, parametrised with band-limited cepstral coefficients between 1 kHz and 8 kHz. Another was the sparseness of the questioned data - one token each of [s] and [z]. The report also contains a section on validation, which many now consider an essential part of any case-work (you need to be able to demonstrate that your system does what you claim). [anonymised pdf]
2017	Evaluating Strength of Evidence in Voice Lineups. This is a powerpoint presentation at a festschrift-symposium to honour Andy Butcher on his seventieth birthday. I address how to properly – that is rationally – evaluate the outcome of a voice lineup. If an earwitness picks out the suspect in a lineup of 10 – what evidential value should you logically assign to that? I make use of a 2002 paper by D. Hodgson (Judge of the NSW Court of Appeal): A LAWYER LOOKS AT BAYES’ THEOREM. The Australian Law Journal 76: 109-118. Hodgson’s paper is an early attempt to explain to the legal profession how to rationally evaluate the prosecution and defence hypotheses, given the (numerical) evidence adduced in their respective favour. Justice Hodgson uses a gedankenexperiment of a lineup to explain, with Bayes’ theorem, how identification through a properly-conducted lineup can be very powerful, and becomes more powerful as the number in the lineup increases. This presentation explains how to quantify the strength of evidence in a voice lineup and then adduces some real data on speaker recognition by naïve unfamiliar listeners to see how useful the strength of evidence can be in a real world ‘multiple auditory confrontation’. [pdf of powerpoint]
2016	(with Wang Xiao) Cantonese forensic voice comparison with higher-level features: likelihood ratio-based validation using F-pattern and tonal F0 trajectories over a disyllabic hexaphone. Accepted after peer review for Odyssey 2016 Speaker and Language Recognition Workshop, Bilbao. [pdf]
2016	Forensic speech science report in a case involving the alleged tampering of a recording. [anonymised pdf]
2015	Forensic Voice Comparison with Monophthongal Formant Trajectories - a likelihood ratio-based discrimination of "Schwa" vowel acoustics in a close social group of young Australian females. Proc. Int'l Conf. on Acoustics Speech & Signal Processing (ICASSP 40) Brisbane, pp. 4819-4823. [pdf]
2013	Where the Science Ends and the Law Begins: likelihood ratio-based forensic voice comparison in a $150 million telephone fraud. Int'l Journal of Speech Language and the Law 20/2: 277-324. [pdf]
2013	More is better: Likelihood ratio-based forensic voice comparison with vocalic segmental cepstra frontends. Int'l Journal of Speech Language and the Law 20/1: 77-116. [pdf]
2013	David Vandyke, Phil Rose, Michael Wagner. The Voice Source in Forensic-Voice-Comparison: a Likelihood-Ratio based Investigation with the Challenging YAFM Database. Paper presented at International Association of Forensic Phonetics and Acoustics Conference, Tampa, Florida. [abstract pdf]
2012	Morrison, G. S., Rose, P., Zhang, C. Protocol for the collection of databases of recordings for forensic-voice-comparison research and practice. Australian Journal of Forensic Sciences, 44/2, 155–167. [pdf]
2012	Where the Science Ends and the Law Begins: Theory and Reality in Likelihood Ratio-based Forensic Voice Comparison. In D. T. Toledano, A. O. Giménez, A. Teixeira, J. González-Rodríguez, L. Hernández-Gómez, R. S. Segundo-Hernández & D. Ramos-Castro (eds.) Proceedings of Iberspeech 2012 VII Jornadas en Tecnología del Habla and III Iberian SLTech Workshop, pp. 27-39. ISBN: 84-616-1535-2. http://iberspeech2012.ii.uam.es/ pdf
2012	Caiyu Wang & Phil Rose (2012) Likelihood Ratio-based Forensic Voice Comparison with Cantonese /i/ F-pattern and Tonal F0. In F. Cox, K. Demuth, S. Lin, K. Miles, S. Palethorpe, J. Shaw & I. Yuen (eds.), Proceedings of the 14th Australasian International Conference on Speech Science and Technology, pp. 209-212. Australian Speech Science & Technology Association: Sydney, ISSN 1039-0227. pdf
2012	Jialin Pang & Phil Rose (2012) Likelihood Ratio-based Forensic Voice Comparison with the Cantonese Diphthong /ei/ F-pattern. In F. Cox, K. Demuth, S. Lin, K. Miles, S. Palethorpe, J. Shaw & I. Yuen (eds.), Proceedings of the 14th Australasian International Conference on Speech Science and Technology, pp. 205-208. Australian Speech Science & Technology Association: Sydney, ISSN 1039-0227. pdf
2012	Joanne Jingwen Li & Phil Rose (2012) Likelihood Ratio-based Forensic Voice Comparison with F-pattern and Tonal F0 from the Cantonese /eu/ Diphthong. In F. Cox, K. Demuth, S. Lin, K. Miles, S. Palethorpe, J. Shaw & I. Yuen (eds.), Proceedings of the 14th Australasian International Conference on Speech Science and Technology, pp. 201-204. Australian Speech Science & Technology Association: Sydney, ISSN 1039-0227. pdf
2012	Aishu Chen & Phil Rose (2012) Likelihood Ratio-based Forensic Voice Comparison with the Cantonese Triphthong /iau/. In F. Cox, K. Demuth, S. Lin, K. Miles, S. Palethorpe, J. Shaw & I. Yuen (eds.), Proceedings of the 14th Australasian International Conference on Speech Science and Technology, pp. 197-200. Australian Speech Science & Technology Association: Sydney, ISSN 1039-0227. pdf
2012	Ruijuan Zheng & Phil Rose (2012) Likelihood Ratio-based Forensic Voice Comparison with Cantonese Short-term Fundamental Frequency Distribution Parameters. In F. Cox, K. Demuth, S. Lin, K. Miles, S. Palethorpe, J. Shaw & I. Yuen (eds.), Proceedings of the 14th Australasian International Conference on Speech Science and Technology, pp. 153-156. Australian Speech Science & Technology Association: Sydney, ISSN 1039-0227. pdf
2012	Alex C.Y. Yim & Phil Rose (2012) Are Nasals Better? Likelihood Ratio-based Forensic Voice Comparison with Segmental Cepstra from Cantonese and Japanese Syllabic/Mora Nasals. In F. Cox, K. Demuth, S. Lin, K. Miles, S. Palethorpe, J. Shaw & I. Yuen (eds.), Proceedings of the 14th Australasian International Conference on Speech Science and Technology, pp. 5-8. Australian Speech Science & Technology Association: Sydney, ISSN 1039-0227. pdf
2012	"Yes, not too bad – Likelihood Ratio-Based Forensic Voice Comparison in a $150 Million Telephone Fraud." In F. Cox, K. Demuth, S. Lin, K. Miles, S. Palethorpe, J. Shaw & I. Yuen (eds.), Proceedings of the 14th Australasian International Conference on Speech Science and Technology, pp. 161-164. Australian Speech Science & Technology Association: Sydney, ISSN 1039-0227. pdf
2012	"The Likelihood Ratio goes to Monte Carlo: the effect of reference sample size on likelihood-ratio estimates." University of New South Wales Forensic Speech Science Conference. pdf
2011	Forensic Voice Comparison with Japanese Vowels – a likelihood ratio-based approach using segmental cepstra. In Wai-Sum Lee & Eric Zee (eds.), Proc. 17th International Congress of Phonetic Sciences, Hong Kong: 1718-1721. pdf
2011	Forensic Voice Comparison with Secular Shibboleths – a hybrid fused GMM-Multivariate likelihood-ratio-based approach using alveolo-palatal fricative cepstral spectra. Proc. International Conference on Acoustics Speech & Signal Processing (ICASSP), IEEE. 5900-5903. pdf with embedded video [this will play when downloaded]
2010	G.S. Morrison, J. Epps, P. Rose, T. Thiruvaran, C. Zhang. Measuring reliability in forensic voice comparison. Journal of the Acoustical Society of America 128/2378. abstract
2010	G.S. Morrison, C. Zhang, P. Rose. An empirical estimate of the precision of likelihood ratios from a forensic-voice-comparison system. Forensic Science International 208: 59-65.
2010	Bernard’s 18 – Vowel Inventory Size and Strength of Forensic Voice Comparison Evidence. In Tabain, Fletcher, Grayden, Hajek, Butcher (eds.) Proc. 13th Australasian International Conf. Speech Science and Technology, Australasian Speech Science and Technology Association: 30-33. pdf
2010	P. Rose & E. Winter. Traditional Forensic Voice Comparison with Female Formants: Gaussian mixture model and multivariate likelihood ratio analysis. In Tabain, Fletcher, Grayden, Hajek, Butcher (eds.) Proc. 13th Australasian International Conf. Speech Science and Technology (full paper review), Australasian Speech Science and Technology Association: 42-45. pdf
2010	The Effect of Correlation on Strength of Evidence Estimates in Forensic Voice Comparison: Uni- and Multivariate Likelihood ratio-based discrimination with Australian English Vowel Acoustics. International Journal of Biometrics 2/4: 316 – 329. pdf
2010	Combining linguistic and non-linguistic information in likelihood-ratio-based forensic voice comparison. Invited presentation at Special Session on Forensic Voice Comparison, Acoustical Society of America Conference, Cancun, November 2010. The presentation contains some ideas on FVC with a combination of segmental and long-term cepstra. The approaches were developed in later publications on FVC with cepstral spectra of fricatives (2011) and vowels (2013). [powerpoint pdf]
2010	M. Wagner et al. The Big Australian Speech Corpus (The Big ASC). Proc. Int'l Australasian Conference on Speech Science & Technology: 166-170.
2009	P. Rose & G. S. Morrison. A response to the UK Position Statement on forensic speaker comparison. Intl. Journal of Speech Language and the Law, 16(1): 139 – 163. [pdf] [Chinese version]
2009	Y. Kinoshita, S. Ishihara & P. Rose. Exploring the Discriminatory Potential of F0 Distribution Parameters in Traditional Forensic Speaker Recognition. Intl. Journal of Speech Language and the Law, 16(1): 91 – 111. [This is a slightly revised version of our Odyssey 2008 paper on forensic F0] [pdf]
2009	Report on Evaluation of Disputed Utterance Evidence in R v Bain, New Zealand. pdf
2009	Response to England and Wales Law Commission Paper: The Admissibility of Expert Evidence in Criminal Proceedings in England and Wales: A New Approach to the Determination of Evidentiary Reliability. pdf
2008	Morrison G, Zhang C, Rose P. Forensic Speaker Recognition in Chinese: A Multivariate Likelihood Ratio Discrimination on /i/ and /y/. In Fletcher & Loakes (eds.) Proc. 9th Annual Conference of International Speech Communication Association (Interspeech ’08) ISSN 1990-9772: 1937-1940. pdf
2008	张翠玲 (Zhang C.), Rose, P. 基于似然率方法的语音证据评价 Strength Evaluation of Forensic Speaker Recognition Evidence based on Likelihood Ratio Approach. 证据科学 [Evidence Science] 16/3: 337-342. pdf
2008	Morrison, G.S., Rose, P., Kinoshita, Y. Extraction of likelihood ratio forensic evidence from the formant trajectory of diphthongs. Journal of the Acoustical Society of America 123(5): 3877.
2008	Kinoshita Y, Ishihara S., Rose, P. Beyond the Long-term Mean: Exploring the Potential of F0 Distribution Parameters in Traditional Forensic Speaker Recognition. In Brummer (ed.) Proc. '08 Odyssey Speaker and Language Recognition Conference. (full paper review) ISBN 978-0-620-40331-3. pdf
2007	Forensic Speaker Discrimination with Australian English Vowel Acoustics. Proc. Intl. Congress of Phonetic Sciences 07. pdf
2007	Gonzalez-Rodriguez, J., Rose, P., Ramos, D., Torre, D., Ortega-Garcia, J. Emulating DNA: Rigorous Quantification of Evidential Weight in Transparent and Testable Forensic Speaker Recognition. IEEE Transactions on Audio Speech and Language Processing 15/7. [abstract]
2006	Rose, P., Kinoshita, Y., Alderman, T. Realistic Extrinsic Forensic Speaker Recognition with the Diphthong /ai/. In Warren & Watson (eds.) Proc. 11th Australasian International Conf. Speech Science and Technology. pdf
2006	The Intrinsic Forensic Discriminatory Power of Diphthongs. In Warren & Watson (eds.) Proc. 11th Australasian International Conf. Speech Science and Technology. pdf
2006	Accounting for Correlation in Linguistic-Acoustic Likelihood Ratio-based Forensic Speaker Discrimination. In Berkling (ed.) Proc. Odyssey Speaker and Language Recognition Workshop, Puerto Rico. pdf
2006	Forensic Speaker Recognition at the Beginning of the Twenty-first Century – an Overview and a Demonstration. Invited paper. The Australian Journal of Forensic Sciences, 37/2. draft pdf
2006	Catching Criminals by their Voice - Combining Automatic and Traditional Methods for Optimum Performance in Forensic Speaker Identification. Australian Research Council Grant Proposal. pdf
2006	Technical Forensic Speaker Recognition: evaluation, types and testing of evidence. Invited paper for Computer Speech and Language 20/2-3: 159-191. [pdf]
2004	Rose, P., Lucy, D., Osanai, T. Linguistic-Acoustic Forensic Speaker Identification with Likelihood Ratios from a Multivariate Hierarchical Random Effects Model – A Non-Idiot’s Bayes’ Approach. In S. Cassidy (ed.) Proc. 10th Australian International Conference on Speech Science & Technology. Australian Speech Science and Technology Association: 492-497. pdf
2004	Technical Forensic Speaker Identification from a Bayesian Linguist’s Perspective. In J. Ortega-Garcia et al. (eds.) ‘Proc. of Odyssey-04, The Speaker and Language Recognition Workshop’, Toledo: 3-10. pdf
2003	The technical comparison of forensic voice samples. In Freckelton & Selby (eds.) *Expert Evidence* 99. Sydney, Thomson Reuters: 1051-6102. pdf
2003	Rose, P., Osanai T., Kinoshita, Y. Strength of forensic speaker identification evidence: multispeaker formant- and cepstrum-based segmental discrimination with a Bayesian likelihood ratio as threshold. The International Journal of Speech Language and the Law 10/2: 179-202. pdf
2002	*Forensic Speaker Identification*. Taylor & Francis. New York, London. Now also published by Taylor & Francis CRC Press on the web as part of their FORENSICnetBASE (http://www.forensicnetbase.com/). This book has been very favourably reviewed. Reviews can be accessed here.
2002	Rose, P., Osanai T., Kinoshita, Y. Strength of forensic speaker identification evidence: multispeaker formant- and cepstrum-based segmental discrimination with a Bayesian likelihood ratio as threshold. In C. Bow (ed.) Proc. 9th Australian Intl. Conf on Speech Science & Technology, Melbourne: Australian Speech Science & Technology Association: 303-308. pdf
2002	Rose, P. DNA Can't Talk - Some facts about Forensic Speaker Identification. Invited presentation, 2002 Media Science Forum - The Science of Terrorism, University of Technology, Sydney, October 29th. pdf
2001	Rose, P., Clermont, F. A Comparison of Two Acoustic Methods for Forensic Speaker Discrimination. Acoustics Australia 29/1: 31-35. pdf
2000	Rose, P., Clermont, F. Comparative performance of cepstrum- and formant-based analyses on similar-sounding speakers for forensic speaker identification. In Michael Barlow (ed.) Proceedings of the 8th Australian International Speech Science and Technology Conference Canberra: Australian Speech Science and Technology Association: 172-177. pdf
2000	Ann Kumar, P. Rose ‘Lexical Evidence for Early Contact between Indonesian Languages and Japanese’. Oceanic Linguistics 39/2: 219 – 255. pdf
1999	*Long- and Short-term within-speaker differences in the formants of Australian hello.* Journal of the International Phonetic Association 29/1: 1-31. pdf
1999	Differences and Distinguishability in the Acoustic Characteristics of Hello in Voices of Similar-sounding speakers - a Forensic Phonetic Investigation. Australian Review of Applied Linguistics 21(2): 1-42. pdf
1998	A Forensic Phonetic Investigation into Non-contemporaneous Variation in the F-pattern of Similar-sounding Speakers. In Robert Mannell and Jordi Robert-Ribes (eds.) Proceedings of the 5th International Conference on Spoken Language Processing (ICSLP): Vol 2: 217-220. pdf
1997	Identifying Criminals by their Voice: the emerging applied discipline of Forensic Phonetics. Australian Language Matters 5/2: 6-7. pdf
1996	Observations on Forensic Speaker Recognition. Proc. 6th International Criminal Law Congress. Organising Committee, 6th ICLC. pdf
1996	Speaker Verification Under Realistic Forensic Conditions. In Paul McCormack & Alison Russell (eds.) Proceedings of the 6th Australian International Conf. on Speech Science and Technology, Australian Speech Science and Technology Association: 109-114. pdf
1996	Rose, P. and Simmons, A. F-pattern variability in Disguise and Over the Telephone - Comparisons for Forensic Speaker Identification. In Paul McCormack & Alison Russell (eds.) Proceedings of the 6th Australian International Conf. on Speech Science and Technology, Australian Speech Science and Technology Association: 121-126. pdf
1995	Rose, P. and Duncan, S. 'Naive auditory identification and discrimination of similar voices by familiar listeners'. Forensic Linguistics 2/1: 1-17. pdf
1995	S. Ran, B. Millar, I. Macleod, P. Rose 'Automatic Vowel Quality Description Using Four Primary Cardinal Vowels'. Proc. 13th Intl. Congress of Phonetic Sciences, Vol. 3, pp. 318-321. pdf
1994	S. Ran, B. Millar, I. Macleod, P. Rose 'Automatic Vowel Quality Description using a Cardinal Vowel Reference Model'. In Roberto Togneri (ed.) Proc. 5th Australian Intl. Conf. on Speech Science and Technology. Australian Speech Science and Technology Association, 387-392. pdf
1986	M. O'Kane, J. Gillis, P. Rose, M. Wagner 'Deciphering Speech Waveforms'. Proc. Int'l Conf. on Acoustics Speech & Signal Processing (ICASSP), vol. 11, IEEE, 2227-2230. pdf

Click on "pdf" to open document.

Documents without "pdf" can be requested for research purposes from: philjohn.rose@gmail.com
