1. Unshaven and one bit short
To death and taxes, Benjamin Franklin’s binary list of life’s certainties, add the expectation that this six-note sequence:
Will continue with this:
Although we ponder ways to avoid or evade Franklin’s list of unavoidable events, we generally accept this more benign certainty as immutable. To demonstrate, consider this:
The penultimate note of the tune generates such strong and specific anticipation that you are likely finding it difficult to continue reading without resolving the sequence. That anxious pause is key to composition and music’s power. It creates a sense of prophetic certainty that allows musicians to play against expectations by thwarting the expected. The controlled manipulation of certainty and likelihood lurks behind those magical moments in which music has caused a shiver or made a tear fall. By infusing uncertainty or surprise into the mix, musicians literally play on our emotions. As a composer myself, I often lead listeners down musical paths that avoid expected arrivals.
The ubiquitous “Shave and a Haircut” and its aborted variant provide ideal stimuli to study how the brain responds to violated expectations. Understanding the mechanisms of violated expectations in music elucidates some of the basic functions of learning, memory, and our perception of time. Along with enhancing our understanding of music, the study of how we process expectations, and learn to revel in ambiguity and uncertainty, is important in understanding how we appreciate many aspects of art and life that involve solving puzzles and deciphering codes, from poetry to painting, science to math.
2. Penny Lane is in my ears
Contrary to the proverbial tree-falling-in-the-forest quandary, a musical note that fails to materialize is at least as present in our brain as it would be had it actually sounded. That’s because neural substrates of imagined sound correlate with those of perceived external sounds. The more vivid the image of what must happen, the more jarring it is when that certainty is subverted.
In the 1970s, psychologists Robert Rescorla and Allan R. Wagner proposed that we learn that one thing leads to another through the discrepancy between what we expect will occur and what actually transpires.1 When expectation is upended, the surprise makes a strong and lasting impression in our brains. Neuroscientists have found that the brain’s neural signals, critical to learning, are more active when confronted by surprise.
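The Rescorla-Wagner idea is commonly written as an update rule in which the change in associative strength is proportional to the prediction error, the gap between what occurred and what was expected. The Python sketch below illustrates that rule for a single cue. The learning rate of 0.3, the function name, and the trial sequence are assumptions chosen for illustration, not values from any study described here.

```python
def rescorla_wagner(outcomes, learning_rate=0.3, initial_strength=0.0):
    """Track associative strength V across trials for a single cue.

    outcomes: sequence of 1.0 (the expected event occurred) or 0.0 (omitted).
    learning_rate: stands in for the product of the model's salience
        parameters (alpha * beta); 0.3 is an arbitrary illustrative value.
    Returns a list of (prediction_error, updated_strength), one per trial.
    """
    strength = initial_strength
    history = []
    for outcome in outcomes:
        prediction_error = outcome - strength  # lambda - V
        strength += learning_rate * prediction_error
        history.append((prediction_error, strength))
    return history


# Hypothetical listener: ten trials in which the final note arrives,
# then one trial in which it is withheld.
trials = [1.0] * 10 + [0.0]
history = rescorla_wagner(trials)

first_error = history[0][0]             # large: nothing is expected yet
last_reinforced_error = history[9][0]   # small: the ending is now familiar
omission_error = history[10][0]         # large and negative: the note is missing
```

In this toy run the error shrinks as the ending becomes familiar, then jumps, with the opposite sign, on the one trial where the final note is withheld.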
I have shown that effect in my own studies at Stanford University’s Center for Computer Research in Music and Acoustics. A current experiment led by Stanford post-doctoral fellow, Daniel Abrams, utilizing “Shave and a Haircut,” shows that prediction involves signaling throughout the brain. Our hypothesis is that although that final note fails to arrive to the ear, the message that it should have arrived may be detectable in the brainstem. That would suggest that the sub-cortical level of the auditory system is being primed according to a belief system that originates in cortical structures up the chain of the auditory network.
My students and I further studied brain response in the absence of a highly expected event. In an experiment led by my Ph.D. student, Blair Kaneshiro, we presented subjects with a sequence of chords, an encapsulation of the most common harmonic progression in tonal music (“Heart and Soul,” “Blue Moon,” “Penny Lane,” and a zillion other pop tunes, as well as a formidable number of works in the classical repertoire). The final chord of the progression repeats the opening tonic chord and creates a firm expectation for resolution to this return:
As we did with “Shave and a Haircut,” we occasionally replaced the anticipated resolution with three different types of surprises, one which repeats the penultimate or dominant chord:
One which introduces a new, flatted chord that sits outside the set of available, or diatonic, chords:
And the third, our old friend, silence:
We found the brain recognizes and reacts to violated expectations in highly specific ways. Not only does it register a wrong event, it also reacts, even more strongly, to the missing event. Furthermore, both the cortical and sub-cortical responses to violated expectation (particularly when a silence replaces a firm and specific expectation) suggest a well-integrated network of brain activity that draws from experientially acquired schemas to focus the auditory system on expected events, and to immediately register and react to failed expectations.2
3. Answer the damn phone
Thwarting expectations and creating ambiguity is fundamental to great art. It is also a useful tool for grabbing consumers’ attention.
About 15 years ago I boarded a bus in the Middle East and was assaulted by a cacophony of ringing cell-phones, each shrieking the original Nokia ringtone:3
In those days, ringtones were short, unaccompanied melodies (“monophonic” in music parlance), synthesized using a blandly invariant triangle wave—the sort of sound heard in the earliest computer games. Every attempt I made to divert the Nokia ringtone failed. Unable to dislodge the tune from my brain, I mentally transcribed it. Despite its simplicity, I could not decide whether it started on an upbeat:
or on a downbeat:
The more the earworm plagued me, the more I obsessed over the ambiguity. No sooner did I convince myself that it was iambic than the trochaic interpretation took precedence. I switched back and forth (“exclusively allocating,” as psychologists call it), just as our visual perception responds to Rubin vase/faces and other figure-ground ambiguities.
Then it struck me that the ambiguity underlies the tune’s success as a ringtone. It is innocuous enough to avoid undue arousal, yet present enough to demand attention.
In an informal series of experiments tailored to both musically literate and untrained listeners, I asked others to determine where the accent was in the ringtone. Ultimately over 1,200 subjects responded. Of those respondents, 48 percent placed the accent on the initial note and 52 percent on the third note. I contacted Nokia and even they argued among themselves about where the accent fell. Once made aware of the alternative interpretation, few insisted that theirs was unequivocally correct.
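One way to gauge how close that split is to an even one is a simple test of a single proportion. The sketch below uses a normal approximation and assumes exactly 1,200 respondents (the text says only "over 1,200"), so the resulting numbers are illustrative rather than a reanalysis of the actual responses. An informal survey is also not a random sample, which the test takes for granted.

```python
import math


def two_sided_p_value(successes, n, null_proportion=0.5):
    """Normal-approximation z-test for a single proportion."""
    observed = successes / n
    standard_error = math.sqrt(null_proportion * (1 - null_proportion) / n)
    z = (observed - null_proportion) / standard_error
    # Two-sided tail probability under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


n_respondents = 1200  # assumption: the survey reports "over 1,200"
third_note_votes = round(0.52 * n_respondents)  # 52 percent heard the accent on note three
z, p = two_sided_p_value(third_note_votes, n_respondents)
```

Under these assumed numbers, z comes out near 1.39, short of the conventional 1.96 threshold, so a 48/52 split of this size is compatible with listeners being evenly divided.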
When that ringtone rang, our internal metronomes subconsciously struggled to resolve the ambiguity. The only way to stop struggling was to answer the damn phone.
4. The sly Mr. Bach
Not long ago, my daughter, Tamar, was learning the Sarabande from Bach’s 5th Suite for solo cello.4
One day she left her cello lesson frustrated because her teacher insisted she play the opening phrase in a manner that made no sense to her. On the surface, the music is unremarkable: five equal note durations followed by a longer note. Yet embedded within this seemingly simple stream are multiple possible interpretations.
Here is a synthesized version of the opening of the Sarabande with no rhythmic or accentual inflection:
The task of the performer is to shape the phrase by biasing one interpretation over the others. Tamar’s teacher preferred accenting the first beat of each duration:
While Tamar was hearing this, accenting every third beat:
Another possibility is to accent the second beat of each group, as Yo-Yo Ma often does:5
Of course, in the face of uncertainty, one might avoid commitment to any one interpretation and distort all possibilities, as did the great cellist Pablo Casals in this performance:
At the heart of the problem is carefully sown uncertainty as to which notes are elaborative and which are structural.6 No one solution consistently works. Well, not unless one considers the least likely solution. Suppose, for a moment, that Bach’s purpose was to create a puzzle based on a figure-ground ambiguity, an auditory version of the Rubin face/vase.
Unlike the visually ambiguous images, so loved by M.C. Escher, this temporal figure-ground confusion can be (and often is) biased by how the performer chooses to accentuate particular notes. However, when performed with this ambiguity in mind, both performer and listener are challenged to reveal the underlying logic of a work that has a discomforting lack of metric orientation.
SPOILER ALERT: I may be about to ruin this movement for you for the rest of your lives.
Imagine the work is not at all a Sarabande that starts on a downbeat, but is a Courante, in which that first downbeat is a pick-up (a preparatory, unaccented beat followed by an accented beat) and the accentual pattern is in three eighth-note patterns.
This is, in fact, the only consistently plausible interpretation—it is the only reading that works throughout without encountering a musical obstacle. The underlying ambiguity of the opening is, in fact, identical to that of our Nokia ringtone—just more ambiguous. Underlying the rather thin veil of a Sarabande, Bach is hiding an entirely separate dance!
This extraordinarily rich ambiguity consisting of multiple possible simultaneous accentual patterns and, ultimately, the simultaneous overlay of two highly contrasting dances is all the more remarkable in that the entire puzzle is embedded within a single unaccompanied melodic line.
5. Playing with time
Why do we humans spend so much time reveling in musical uncertainty? Marvin Minsky, the father of artificial intelligence, suggested an evolutionary perspective over a quarter of a century ago.
Minsky pondered why the highly repetitive theme of Beethoven’s Fifth Symphony retains its interest over time despite its enormous redundancy. He hypothesized that just as toddlers learn about spatial relationships by playing with blocks, we learn about temporal relationships by playing with segmented units of time. The “da-da-da-dum” gives us the temporal blocks, and both Beethoven and his interpreters play with those blocks, setting up expectations and artfully dashing them. We seem to be endlessly fascinated by such violated expectations and we revel in the uncertainty preceding them. This fascination starts in infancy with the crescendo and slight pitch rise in the elongated “a” of the “peek-a-boo” game. The expectation for the “boo” is strong and specific, but its time of arrival creates a pleasurable tension not at all unlike that of a pianist playing with the arrival time of a cadence. Beethoven’s iconic “da-da-da-dum” grabs us because its redundancy is coupled with variability in the interval that follows those three “da-da-das,” which is often, though not always, the same.
In 2007, neuroscientists Sridharan Devarajan and Vinod Menon, music cognition researchers Chris Chafe and Daniel Levitin, and I designed a study to measure what’s going on in the brain during those intervals. Specifically, we were interested in how the brain responds to musical phrase boundaries (the musical equivalent of a grammatical pause such as a comma or period), cadences, and the pause between movements in a symphony. A number of remarkable features emerged from the study.
First, the anticipation of the arrival of the cadence was evident in the group of subjects with no musical training—even preceding the temporal cues in the music itself. It seems that, simply by experiencing music—even with little consciousness or attention paid—listeners develop an awareness of musical structure. Second, the peak of activation occurred during the silence, suggesting that, as Minsky proposed, these temporal segmentation boundaries are critical in how humans organize events in time.
As shown in the following video clip, two distinct and seemingly dependent functional networks are activated: a ventral fronto-temporal network followed by a dorsal fronto-parietal network. The former has been shown in numerous studies to be critical in detecting violated expectations, which, by their very nature, chop the flow of perceived time with a discrete segmentation boundary. The dorsal network, meanwhile, is active in detecting salient features, directing attention to a particular object, and manipulating and monitoring these selected features in working memory. In short, one network works to predict imminent structural pauses, and the other kicks in to direct and maintain attention and update our sense of what is salient.
Essentially our study suggests that we anticipate that all-important structural silence and use that pause to update our working memory, implicitly supporting the notion that, beyond sheer pleasure, music serves a purpose in teaching and constantly reinforcing the temporal component in processing uncertainty.7
In a classic episode of Monty Python’s Flying Circus, Beethoven sits at the piano searching for what will follow “da-da-da-dum.” At the brink of discovery he is interrupted, first by his myna bird, then by his wife in search of the sugar bowl, then the jam spoon (“It was in the sugar bowl,” she says), then by her asking poor Ludwig whether he wants peanut butter or sandwich spread with his tea, then by her vacuuming. Each time he finds the missing note, he loses it again.
It’s a hilarious scene, but it illustrates a truth about the musical experience: at its heart lies the formulation of expectations, and the way the composer and performer choose to manage them. Even non-musicians are actively engaged, at least subconsciously, in tracking the ongoing development of a musical piece and forming predictions about what will come next.
Since music typically unfolds upon an underlying stream of pulses (the tactus), expectations as to when something will occur can be formulated along with expectations as to what will occur. Great music—and even, as we have seen, not-so-great music—plays off of these expectations. Most importantly, the temporal playground of music provides a virtual classroom for learning to revel in uncertainty8, and a laboratory to practice navigating ambiguity.
My violin concerto, Jiyeh, willfully wavers between building firm expectations destined to be violated, and imposing a sense of uncertainty and vagueness. The work, a cry against the senseless absurdity of war, is based upon an ecological disaster caused by a rocket attack on an aging power plant on the Lebanese coast. The explosion triggered a massive oil spill in the Mediterranean Sea that went undocumented, save for satellite photographs from NASA’s Advanced Spaceborne Thermal Emission and Reflection Radiometer satellite.
Following the sequence of daily photographs, I was able to measure the spread of the spill and turn this information into musical sound. The outer two movements frame the spread. In the first movement, the string orchestra plays the oozing expansion. In the final movement, the solo violin performs the evolution of the ornate, perversely beautiful Baroque patterns at the edges of the spill. The violin solo in the central movement, Adagio (heard below), is a mournful cry of anguish, characterized by repeatedly establishing a strong expectation for resolution, but persistently refusing the expected arrival until the very end of the movement.
Jonathan Berger is the Denning Family Provostial Professor in Music at Stanford University, where he teaches composition, music theory, and cognition at the Center for Computer Research in Music and Acoustics. His two chamber operas based on auditory hallucinations, Theotokia and The War Reporter, premiered in April 2013 at Stanford. His violin concerto, “Jiyeh,” was recently released on Harmonia Mundi’s Eloquentia label.
1. Rescorla, R.A. & Wagner, A.R., “A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and nonreinforcement.” In Classical Conditioning II, A. Black & W.F. Prokasy, Jr. (eds.), (1972).
2. The graphs below represent the averaged event-related potentials (ERP), or brain response to stimulus, from EEG recordings of 19 subjects and almost 7,000 trials (5,928 with the expected chord and 988 with violated expectations). These are 2 of 128 electrodes.
The integers on the X-axis represent beats, or each of the five chords in the sequence. The region of interest is from immediately after 5 (the final event) and beyond. Note the extreme response to the silence (the red line) just after 5, as well as the decreased ERP just following the silence.
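An averaged ERP of this kind is obtained by lining up the EEG segment recorded around each trial and averaging sample by sample, so that activity unrelated to the stimulus tends to cancel. The sketch below shows that averaging step on made-up numbers, along with the trial arithmetic from this note. The function name and the toy epochs are assumptions for illustration and do not reproduce the study's actual processing pipeline.

```python
from statistics import fmean


def average_erp(epochs):
    """Average equal-length epochs sample by sample."""
    if not epochs:
        raise ValueError("need at least one epoch")
    length = len(epochs[0])
    if any(len(epoch) != length for epoch in epochs):
        raise ValueError("all epochs must have the same length")
    # zip(*epochs) groups the i-th sample of every trial together.
    return [fmean(samples) for samples in zip(*epochs)]


# Toy data: three trials, three samples each (arbitrary units).
toy_epochs = [[0.0, 1.0, 0.5], [0.2, 0.8, 0.5], [0.1, 1.2, 0.5]]
erp = average_erp(toy_epochs)

# Trial counts quoted in this note.
expected_trials = 5928
violation_trials = 988
total_trials = expected_trials + violation_trials
violation_share = violation_trials / total_trials
```

The two counts sum to 6,916 trials, consistent with "almost 7,000," and the violated-expectation trials make up exactly one in seven of them.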
3. The Nokia ringtone quotes a 1902 composition by Spanish composer and guitarist Francisco Tárrega (1852-1909). Julian Treasure, author of the book Sound Business, who has done marketing work for Nokia, calculated that the melody rang about 1.8 billion times a day—about 20,000 times per second. Poor Tárrega missed out on a goldmine.
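The per-second figure follows from the daily one by simple division, as this short check shows. It assumes the rings are spread evenly over a 24-hour day.

```python
rings_per_day = 1.8e9            # figure attributed to Julian Treasure
seconds_per_day = 24 * 60 * 60   # 86,400 seconds
rings_per_second = rings_per_day / seconds_per_day
```

The result is a little under 21,000 rings per second, which rounds to the "about 20,000" quoted above.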
4. This recording of Christopher Costanza’s performance of the Sarabande from J.S. Bach’s 5th Suite for Solo Cello is used by permission of the performer. Costanza’s extraordinary performances of the entire set of the Bach Cello Suites, along with a wealth of materials on these works, are available at http://costanzabach.stanford.edu/.
5. Although this synthetic version may render this interpretation odd, it conforms to the traditional accentual pattern of the Sarabande.
6. The Sarabande from Bach’s 5th Suite avoids overt reference to the metric and harmonic schemas that typify the Sarabande. A more typical Sarabande, this one from Bach’s 3rd Cello Suite, provides a clear example of the stately triple dance with its elongated duration on each second beat:
J.S. Bach, Sarabande from the 3rd Cello Suite (C-Major-BWV-1009), performed by Christopher Costanza.
Here’s a typical Courante from Bach’s 1st Cello Suite:
J.S. Bach, Courante from the 1st Cello Suite (G-Major-BWV-1007), performed by Christopher Costanza.
7. Supplemental data from Sridharan, Levitin, Berger, Chafe, and Menon, “Neural Dynamics of Event Segmentation in Music: Converging Evidence of Dissociable Ventral and Dorsal Networks,” Neuron, Volume 55, Issue 3, pages 521-532 (2007). Caption from paper: “Brain responses starting ten seconds prior to the structural pause (the transition between movements) until ten seconds after the silence. The brain responses are predominantly right lateralized. With time, activity shifts along a ventral-dorsal axis, with the ventral regions—ventrolateral prefrontal cortex (VLPFC) and posterior temporal cortex (PTC)—active before and during the early part of the transition, and the dorsal regions—dorsolateral prefrontal cortex (DLPFC) and posterior parietal cortex (PPC)—active during and following the transition.”
8. Reveling in uncertainty is not always a joy. Loud noise creates uncertainty and can induce hallucinations. Thankfully our brains contain a gating mechanism that inhibits startled responses and modulates our sensitivity to noise. Individuals with schizophrenia lack this inhibitory mechanism. Their impaired sensorimotor mechanisms make them prone to a state of heightened startle, which triggers hallucinations. My recent chamber opera, Theotokia, presents the ritualistic delusional world of the protagonist, Leon, who suffers from schizophrenia. At the opening of the opera (heard below), Leon hears the chanting of congregants and the invocations of their spiritual leader, Mother Anne. The work charts the metamorphosis of Mother Anne from beckoning spiritual leader, to mythical and terrifying mother of god, to the voice of Leon’s actual mother. Scored for five singers, chamber ensemble and ambisonic digital audio (a three-dimensional sound field), Theotokia places the audience inside Leon’s tormented mind.