Limburgish on the digital map
For years, Limburgish has suffered from a major shortage of the digital resources and technical systems needed to support and study the language and all its dialects and to make them accessible. This shortage hinders not only scientific research but also the development of digital applications such as speech recognition, machine translation and other AI-based technologies.
A new project, carried out by Andreas Simons under the direction of Leonie Cornips (Chair of Language Culture in Limburg) and subsidised by the Hoes veur ’t Limburgs, now aims to change this.
Why a Limburgish Corpus?
Modern language technologies and linguistic research depend on so-called ‘corpora’: large databases of source material such as books, poetry, internet blogs and conversations. However, Limburgish is among the most poorly documented Germanic languages. As a result, the language is hardly accessible to researchers, developers, educators and governments. This lack of digital availability feeds a vicious circle in which the visibility, use and prestige of Limburgish continue to decline.
Despite these challenges, there is growing interest in Limburgish. Young people increasingly use the language on social media, and various resources such as local literature, dialect dictionaries and theatre scripts exist. What is missing is a central and publicly accessible repository for this material.
Digital infrastructure in the Hoes veur ’t Limburgs
A digital infrastructure (the digital resources and technical systems needed to store and manage the language and make it accessible) will be set up within one year to compile, manage and complete a Limburgish Corpus. At the end of the project, a basic version of the corpus will be available for further scientific research and industry applications. An edited version of the corpus will be made publicly available so that researchers, students and developers can work with the data. This is expected to create a snowball effect for further research and to position Limburgish as a ‘studyable’ language.
The infrastructure will also be easily expandable so that future projects can add to and further develop the corpus. This paves the way for training language models and other applications, similar to initiatives for other minority languages.
Picture by: Laura Knipsael