Chirp = Bird | Elia Formisano

Elia Formisano, professor of Neural Signal Analysis at the Faculty of Psychology and Neuroscience, recently published a paper in Nature Neuroscience in collaboration with Bruno Giordano at Aix-Marseille Université, France, and with Michele Esposito and Giancarlo Valente. The paper is titled "Intermediate acoustic-to-semantic representations link behavioural and neural responses to natural sounds".

And if you are a bit like me, that title sounds quite intimidating. So I met up with Professor Formisano to talk about his findings.

Sound processing

“In Maastricht, we’ve been doing research for years on how the brain makes sense of sounds. That can be any sound: a bird, a washing machine, a hammer. We use different brain imaging methods (e.g. fMRI, EEG) to read out how the brain responds to sounds. But in order to answer the question ‘how’, knowing which brain areas are active and at what time is not enough; you need computational models”. Computational and mathematical models are constructed to simulate what the brain does. “Nowadays there are computational models, so-called deep neural networks, that are trained to recognise sounds by being exposed to millions and millions of different sounds”. Among the models that Formisano used in his article were deep networks created by Google and trained on the sounds and tags of millions of existing YouTube videos, learning in this way what, for example, a bird sounds like.
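The core idea of learning a sound category from many labelled examples can be illustrated with a toy sketch. Everything below is invented for illustration (the feature vectors, labels, and the nearest-prototype "training" rule); it is in no way the actual Google network, which is a far larger deep model trained on real audio.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for sound embeddings: each labelled example is a short
# feature vector, loosely analogous to a spectrogram summary.
def make_examples(centre, n=100):
    return centre + 0.3 * rng.standard_normal((n, len(centre)))

centres = {"bird": np.array([1.0, 0.0, 0.0]),
           "washing machine": np.array([0.0, 1.0, 0.0]),
           "hammer": np.array([0.0, 0.0, 1.0])}

# "Training": average the many labelled examples of each sound category.
prototypes = {label: make_examples(c).mean(axis=0)
              for label, c in centres.items()}

def recognise(sound):
    # Assign the label whose learned prototype is nearest to the input.
    return min(prototypes, key=lambda lb: np.linalg.norm(sound - prototypes[lb]))

chirp = np.array([0.9, 0.1, -0.1])  # an unseen "chirp"-like feature vector
print(recognise(chirp))  # prints "bird"
```

After exposure to enough labelled examples, an unseen chirp-like input still lands closest to the "bird" prototype, which is the toy version of chirp = bird.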

Transforming sound into something meaningful

For a while now, models have existed that can simulate how the human brain processes simple sounds. But no model had been able to show how we transform a sound into something that has meaning. We hear a chirp, and we immediately know that it is a bird. How does this translation work? When does a sound become a bird? “These networks simulate this process, not exactly as the brain does, but well enough to give us a good idea of how the brain could perform these tasks”.


How do you then confirm the findings? “We took brain responses from an experiment where participants in the fMRI scanner listened to many different sounds. We played the same sounds to many different models and then compared the results”. In Marseille, Giordano ran the same experiment in a behavioural context, asking participants: how similar are these sounds? “By comparing the brain and behavioural data, we could see which of the AI models comes closest to the actual brain. And we could see how accurate the results were: with our fMRI results we could actually predict the behavioural responses of the participants in Marseille”.
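The comparison described here resembles what neuroscientists call representational similarity analysis: for each data source, compute how dissimilar every pair of sounds is, then correlate those dissimilarity patterns across sources. The sketch below uses simulated numbers throughout (the voxel counts, feature sizes, and both "models" are invented), not the study's actual data or models.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated data: responses of 50 "voxels" to 20 sounds, plus two model
# representations of the same 20 sounds.
n_sounds = 20
latent = rng.standard_normal((n_sounds, 5))                # shared structure
brain = latent @ rng.standard_normal((5, 50)) \
        + 0.1 * rng.standard_normal((n_sounds, 50))
model_a = latent @ rng.standard_normal((5, 30))            # shares the structure
model_b = rng.standard_normal((n_sounds, 30))              # unrelated control

def rdm(features):
    # Representational dissimilarity matrix: 1 minus the correlation
    # between the feature vectors of every pair of sounds.
    return 1.0 - np.corrcoef(features)

def similarity(rdm1, rdm2):
    # Correlate the upper triangles (all sound pairs) of two RDMs.
    iu = np.triu_indices(len(rdm1), k=1)
    return np.corrcoef(rdm1[iu], rdm2[iu])[0, 1]

brain_rdm = rdm(brain)
print(similarity(brain_rdm, rdm(model_a)))  # high: this model matches the brain
print(similarity(brain_rdm, rdm(model_b)))  # much lower: this model does not
```

The model whose dissimilarity pattern correlates most strongly with the brain's (and with the behavioural similarity judgements) is the one that "comes closest to the actual brain".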


“Our brain is able to recognise events from sounds, even if you have not heard that specific sound before. Much like in language, our brain takes small pieces of sound, like the syllables in a word, and stitches them together to determine that this must be, for example, a glass breaking. We haven’t completely solved the riddle yet; our brain does not work exactly like these models. We don’t need to first hear a million birds to conclude that chirp = bird”. After publishing these findings, Formisano’s work now focuses on filling in the parts of sound processing that the models cannot (yet) capture.

Sound recognition software

“With the results of this research, we started building our own neural network that comes even closer to the workings of the human brain. With this research and technology, we can contribute to the development of better sound recognition software”. For example: alarm systems, hearing robots, or systems that help elderly people around the house. “When they are going out the door, the system could remind them that the tea kettle is boiling, because it can recognise the sound”.

The full publication can be read here.