So I decided to see if I could design a simple system that would detect the direction a sound came from. How difficult could it be? As it turns out, it's not that easy, and I handicapped myself by using an eight-bit microcontroller. Despite that, the ROBOEAR system works surprisingly well and can also function as the data acquisition front-end for a more capable processing system.
In psychoacoustics, the ability to determine the direction
and distance of a sound source is called localization. The
distance to a sound source is normally judged by the
amplitude of the sound based on our past experiences.
If we hear a car horn, we can make a good guess about
how close it is. If we don’t know what we’re hearing, then
judging the distance can be difficult.
Two classic mechanisms describe how we determine the direction to a sound source: the phase difference and the amplitude difference between the signals arriving at our two ears. Phase difference predominates at low frequencies and amplitude difference at high frequencies; at middle frequencies (around 1,500 Hz), neither works particularly well.
Differences in amplitude are more effective at higher
frequencies because there is more shadowing effect from
the head and the amplitude difference between the ears is
more pronounced. Differences in phase are more effective
at low frequencies because sound wavelengths are long
compared to the size of the head and diffraction allows the
sound to bend around the far side of the head.
A common signal processing method to look at phase
differences of noisy signals is cross-correlation. Cross-correlation is the comparison of two different time series
to detect if there’s a correlation between their peaks and
valleys. Simply put, one signal is time shifted with respect to
the other and the correlation is measured at each time shift.
The mathematical expression of the cross-correlation is:

    cross(L) = Σ x(n) y(n + L), summed over n starting at n = 0

For each time shift, called a lag (L), the values of one signal are multiplied by the correspondingly shifted values of the other and the products are summed; the lag that produces the largest sum indicates the time offset between the two signals.

When I started the ROBOEAR project, I was thinking about the sensors robots use and how they compared with human senses. Humans interact with the world primarily through vision and hearing. Robot "vision" can range from line sensors to video cameras. Robot "hearing" is mostly ultrasonic ranging. However, human hearing provides us with both range and directional cues.

By Brian Beard
42 SERVO 09/10.2018