Welsh language AI faces challenges
Artificial intelligence (AI) is a rapidly developing field that has many applications and benefits for different languages and cultures. However, some languages face more challenges than others when it comes to AI, especially those with smaller data sets and fewer resources. One of these languages is Welsh, which has about 870,000 speakers in Wales and around the world.
Welsh language AI relies on large language models, which use huge amounts of text such as webpages, books and articles to predict which words and phrases go together. However, relatively little Welsh data is available for these models, which means that translations and transcriptions are often inaccurate or incomplete. This affects the quality and functionality of AI services such as chatbots, voice assistants and live translation for online events.
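To illustrate why the amount of data matters, the idea of predicting which words go together can be sketched with a very simple word-pair counter. This is only a toy stand-in, not how a real large language model works (those are neural networks trained on vastly more text). The function names and the four-phrase Welsh corpus below are assumptions made up for this illustration.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count, for each word, which words follow it in the training text."""
    following = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        # Pair each word with the word that comes directly after it.
        for current, nxt in zip(words, words[1:]):
            following[current][nxt] += 1
    return following


def predict_next(following, word):
    """Return the most frequently seen next word, or None if the word was never seen."""
    counts = following.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Toy corpus, illustrative only: a handful of common Welsh phrases.
toy_corpus = [
    "bore da",
    "diolch yn fawr",
    "diolch yn fawr iawn",
    "croeso i Gymru",
]

model = train_bigrams(toy_corpus)
print(predict_next(model, "diolch"))    # a word seen in training: a prediction is returned
print(predict_next(model, "prynhawn"))  # a word never seen in training: no prediction (None)
```

With so little text, the counter has nothing to say about any word it has not already seen. Real models are far more capable, but the same basic dependence on training data is why a smaller pool of Welsh text tends to mean less accurate output.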
Welsh businesses and researchers call for more data
Some Welsh businesses and researchers are already using AI to provide bilingual services and products, but they say they need more cooperation and support from the Welsh government and other organisations to improve the accuracy and efficiency of their AI systems. They also say that more Welsh language data should be made available under permissive licences, so that they can train their models with more relevant and diverse information.
One of these businesses is Haia, an online events company based in Anglesey, which uses simultaneous translation software to enable speakers to talk in Welsh or English with translated subtitles. Its co-founder, Tom Burke, said their product could be improved if more Welsh language data was legally available. “One of the issues we have is how accurate it is. If you compare with German or Spanish, Welsh is a small data-set,” said Mr Burke. “We’ll often find there are inaccuracies in the translation or transcription and the way to improve that is for us to get access to the wealth of data that is actually available for the Welsh language.”
Another example is ChatGPT, a chatbot developed by OpenAI in the US and adapted by researchers at Bangor University’s Canolfan Bedwyr. It can understand and communicate in Welsh, and has impressed researchers with its ability to hold natural and fluent conversations in the language, but they say it still needs more data to reflect the reality of Wales and its culture.
The head of the Language Technologies Unit at Canolfan Bedwyr, Gruffudd Prys, said: “One of the things we can do to improve the quality of artificial intelligence is to enable the data that’s out there to be available under permissive licences so that the models reflect the reality of Wales and that they’re not overly American or international models.”
Welsh government plans to renew its AI strategy
The Welsh government said it recognises the importance and potential of AI for the Welsh language and economy, and that it plans to renew its AI strategy soon. A spokesperson said: “We are committed to supporting the development of artificial intelligence in Wales and ensuring that it benefits our people, businesses and public services. We are currently reviewing our AI strategy to ensure that it reflects our priorities and ambitions for the future.”
The spokesperson also said that the Welsh government has invested in several projects and initiatives related to AI and the Welsh language, such as the National Centre for Artificial Intelligence (NCAI), which aims to create a network of expertise and collaboration across Wales; the National Language Technologies Portal (NLTP), which provides access to tools and resources for Welsh language AI; and the Digital Innovation Fund (DIF), which supports innovative digital projects that address social and economic challenges.
The spokesperson added: “We recognise that there is more work to be done to ensure that the Welsh language is part of the AI revolution, and we will continue to work with our partners in academia, industry and civil society to achieve this goal.”