Artificial Intelligence Solved This Audio Illusion. Can You?
The Cocktail Party Effect is an auditory phenomenon that humans don't have to consciously solve – our selective attention automatically separates different sounds and voices. Artificial intelligence, though, found it much harder: for a long time, machines struggled to tell one audio track from another.
And Freeze. Now, did you hear what I was saying?
Clearly enough that you could, say, write it down in the comments?
If so, you just experienced a phenomenon called
The Cocktail Party Effect.
You can hear me while there are people talking right next to us or if there's a jazz band
across the room.
This is because of selective attention - our ability to focus on one particular thing while
tuning out our surroundings.
And it's the same effect that allows us to separate the vocals from the background
music in a song.
This comes so naturally to us, but machines find these tasks extremely hard.
To a machine, a voice singing is just another track in a song that isn't easily discernible
from the piano track or the violin track or the harmonica track.
So how do you train a machine to separate voices at a party, or vocals from a song?
Well, the answer lies in algorithms and lots of data.
Recently, researchers developed an algorithm that can identify the vocals in multiple songs.
And this is thanks to breakthroughs in machine learning - a method used in artificial intelligence
to allow machines to learn by analysing data.
To do so, researchers used a deep neural network - these networks are software inspired by
how our brain works.
They can learn using a method called deep learning, a kind of machine learning technique
that works through a series of layers.
An input layer, an output layer and middle hidden layers.
These hidden layers are where the magic happens.
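The layered structure described above can be sketched in a few lines of code. This is a minimal illustration, not the researchers' actual network: the layer sizes, the ReLU activation, and the random weights are all assumptions chosen just to show data flowing from an input layer, through a hidden layer, to an output layer.

```python
import numpy as np

# A minimal feedforward network: an input layer, one hidden layer,
# and an output layer. All sizes here are illustrative.
rng = np.random.default_rng(0)

def relu(x):
    # A common activation function: keeps positives, zeroes out negatives.
    return np.maximum(0.0, x)

# Weights connecting input (4 features) -> hidden (8 units) -> output (2 units)
W_hidden = rng.normal(size=(4, 8))
W_output = rng.normal(size=(8, 2))

def forward(x):
    hidden = relu(x @ W_hidden)   # the hidden layer, where the magic happens
    return hidden @ W_output      # the output layer

x = rng.normal(size=(1, 4))       # one example input
print(forward(x).shape)           # (1, 2)
```

In a real audio-separation network the input would be features of the mixed recording and the output a guess at the separated tracks, but the pattern of stacked layers is the same.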
And to train an artificial neural network, you have to feed it a ton of data - just
like us, the more it sees, the better it can learn.
So researchers trained their neural network by giving it 50 songs.
They let the neural network try to separate the vocals and the non-vocal components (the
other instruments), and compare its results with the correct answer - which is the particular
song already separated into the different components.
Every time the neural network gets closer to the correct result, it's rewarded.
So it improves with each run.
It was then tested with 13 new songs, and it correctly separated the vocals from the
background music in each one.
It taught itself to tell the vocals apart from the other instruments.
What separates deep learning from previous types of machine learning is this layered
structure, which is modelled specifically after the cortex, the wrinkly outer layer
of the brain.
It's the part responsible for higher-order brain function like sensory perception, cognition,
spatial reasoning and language.
Basically it's the part that makes you...
different from a lizard.
It's made up of 6 layers, and different aspects of processing happen at each level.
For example, when you see an apple, the first layer might identify the color red, the second
layer detects the round edges, and so on until finally the last layer puts it all together
and says hey, that's an apple!
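The apple example can be written out as code. To be clear, this is not a neural network, just the layered logic the analogy describes, with each hypothetical "layer" as a function whose answer feeds the next.

```python
# Each function stands in for one processing layer in the analogy.
def detect_color(thing):
    # First layer: identify the colour red.
    return thing["color"] == "red"

def detect_shape(thing):
    # Second layer: detect the round edges.
    return thing["shape"] == "round"

def recognize(thing):
    # Final layer: put it all together.
    if detect_color(thing) and detect_shape(thing):
        return "apple"
    return "unknown"

print(recognize({"color": "red", "shape": "round"}))  # apple
```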
Deep learning software tries to imitate this hierarchical structure of neurons in the cortex.
The first few layers of a deep neural network learn to identify simple patterns, like single notes.
The next layers learn to recognize more complicated patterns, like words.
Eventually, the result is that extremely complicated patterns like the entire vocals of a song
can be recognized and distinguished from the other instruments.
This layered process is at the heart of deep learning's success.
Starting with simple ideas and building them, layer by layer, into more and more generalized
concepts seems to capture something fundamental about intelligence.
Humans used to have a clear advantage in pattern recognition, but in 2015 a deep neural network
beat a human at image recognition for the first time.
This means we're able to make better and more sophisticated machines that can master
tasks we thought were unique to humans.
Machines are helping doctors make better diagnoses, and robots are learning to cook by watching
YouTube videos.
And when a robot can learn to cook by watching YouTube videos - that makes you question
what it really means to be human.