Recently we've been working on a text-to-speech (TTS) synthesis system for Scottish Gaelic. TTS systems learn to generate synthetic speech audio from text, and are trained on paired examples of short recorded utterances and their corresponding transcripts.
For our systems, we are using audio from LearnGaelic's weekly Letter to Learners, written and recorded by Ruairidh Maclean. We are very grateful to Ruairidh for allowing us to use his vocal likeness, and to BBC ALBA and MG ALBA for their permission to make use of these recordings.
For full details of the systems presented below, and of how we prepared the training corpus, see our paper "A Low-Resource Pipeline for Text-to-Speech from Found Data With Application to Scottish Gaelic", presented at Interspeech 2023. See also the poster for a more schematic outline of the approach.
Audio conditions per utterance: Natural | Copy | Phone 8h | Phone 2h | Char 8h | Char 2h | Unit 8h | Unit 8+21h | Unit 2+21h

Utt ID | Transcript |
litir0904 | Bha comas leughaidh ann an Gàidhlig aca. |
litir0697 | Uill, leis an fhìrinn innse, cha robh mòran chnothan innte. |
litir0478 | Phaisg e an leanabh ann am plaide, dh'fhalbh e a-mach air uinneig air cùlaibh an taighe, agus chaidh e don uaimh. |
litir1175 | Bha mi ag innse dhuibh mu Leabhar Dhèir, The Book of Deer. An-diugh, tha mi a' dol a leughadh dhuibh a' chiad earrainn de na notaichean Gàidhlig ann. |
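
To make the idea of paired training examples more concrete, here is a minimal Python sketch of how utterance IDs and transcripts like those above could be held as (audio, text) pairs. The `Utterance` class, the `build_manifest` helper, the `wavs` directory and the `<utt_id>.wav` naming are all assumptions made for illustration; they are not the actual layout of the corpus or the pipeline described in the paper.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Utterance:
    """One training pair: a short recording and its transcript."""
    utt_id: str
    audio_path: Path  # hypothetical location; the real corpus layout may differ
    text: str


def build_manifest(transcripts: dict[str, str], audio_dir: Path) -> list[Utterance]:
    """Pair each utterance ID with an assumed '<utt_id>.wav' file, sorted by ID."""
    return [
        Utterance(utt_id, audio_dir / f"{utt_id}.wav", text)
        for utt_id, text in sorted(transcripts.items())
    ]


# Two of the transcripts from the sample table.
transcripts = {
    "litir0904": "Bha comas leughaidh ann an Gàidhlig aca.",
    "litir0697": "Uill, leis an fhìrinn innse, cha robh mòran chnothan innte.",
}

manifest = build_manifest(transcripts, Path("wavs"))
for utt in manifest:
    print(f"{utt.utt_id}\t{utt.audio_path}\t{utt.text}")
```

A flat list of this kind is roughly what many TTS training recipes expect as input, though the exact fields and file format vary between toolkits.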