(a-d) UMAP projections of syllables from a single example bird from each species (Bengalese finch, California thrasher, Cassin's vireo, European starling). (e-h) Sequence transitions between syllables from the same birds as in a-d. (i-l) Markov model abstractions of e-h.
Human speech possesses a rich hierarchical structure that allows for meaning to be altered by words spaced far apart in time. Conversely, the sequential structure of nonhuman communication is thought to follow non-hierarchical Markovian dynamics operating over only short distances. Here, we show that human speech and birdsong share a similar sequential structure indicative of both hierarchical and Markovian organization. We analyze the sequential dynamics of song from multiple songbird species and speech from multiple languages by modeling the information content of signals as a function of the sequential distance between vocal elements. Across short sequence-distances, an exponential decay dominates the information in speech and birdsong, consistent with underlying Markovian processes. At longer sequence-distances, the decay in information follows a power law, consistent with underlying hierarchical processes. Thus, the sequential organization of acoustic elements in two learned vocal communication signals (speech and birdsong) shows functionally equivalent dynamics, governed by similar processes.
I wrote a short Twitter thread describing the main results:
We modeled the sequential relationships between elements of birdsong and speech as Mutual Information (MI) decay and compared them to hierarchical and Markovian models of sequential organization. pic.twitter.com/TQrAu8QkUN (Tim Sainburg, @tim_sainburg, August 12, 2019)
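As a rough illustration of what "MI as a function of sequential distance" means, here is a minimal sketch that estimates the mutual information between elements that sit d positions apart in a symbolic sequence. It uses a naive plug-in estimator from raw counts, which is an assumption made for brevity and not necessarily the estimator used in the paper; the function names `mutual_information_at_distance` and `mi_decay` are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information_at_distance(seq, d):
    """Plug-in MI estimate (in bits) between elements d positions apart.

    seq is any indexable sequence of hashable symbols (e.g. syllable labels
    or phonemes); d must be an integer >= 1.
    """
    # All pairs (x_i, x_{i+d}) in the sequence.
    pairs = list(zip(seq[:-d], seq[d:]))
    n = len(pairs)
    if n == 0:
        return 0.0

    joint = Counter(pairs)                 # counts of (a, b) pairs
    left = Counter(a for a, _ in pairs)    # marginal counts of first element
    right = Counter(b for _, b in pairs)   # marginal counts of second element

    mi = 0.0
    for (a, b), c in joint.items():
        # p(a, b) * log2( p(a, b) / (p(a) * p(b)) ), written with raw counts
        mi += (c / n) * log2(c * n / (left[a] * right[b]))
    return mi


def mi_decay(seq, max_distance):
    """MI estimates for distances 1..max_distance."""
    return [mutual_information_at_distance(seq, d)
            for d in range(1, max_distance + 1)]
```

For example, `mi_decay(list_of_syllable_labels, 100)` would give one MI value per distance, which can then be plotted against distance on log-log axes. Plug-in estimates are biased upward for small samples, so a real analysis would likely need a bias correction or a shuffled baseline.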
Markovian relationships decay exponentially in MI as a function of sequential-distance, while a power-law MI decay is indicative of hierarchical organization. pic.twitter.com/DnGFHhhQkr
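One simple way to tell the two decay shapes apart is that an exponential decay is a straight line when log(MI) is plotted against distance, whereas a power law is a straight line when log(MI) is plotted against log(distance). The sketch below compares the two with ordinary least squares on the transformed data. This is an illustrative diagnostic under that assumption only, not the model-fitting procedure from the paper, and the names `linear_fit` and `compare_decay_models` are hypothetical.

```python
from math import log


def linear_fit(xs, ys):
    """Ordinary least-squares line through (xs, ys).

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1.0 - ss_res / ss_tot


def compare_decay_models(distances, mi_values):
    """Compare exponential and power-law fits to an MI decay curve.

    distances must be positive and mi_values strictly positive,
    because both are log-transformed.
    """
    log_mi = [log(m) for m in mi_values]
    log_d = [log(d) for d in distances]

    # Exponential: MI = a * exp(-b * d)  ->  log MI is linear in d
    exp_slope, _, exp_r2 = linear_fit(distances, log_mi)
    # Power law:   MI = a * d ** (-b)    ->  log MI is linear in log d
    pow_slope, _, pow_r2 = linear_fit(log_d, log_mi)

    return {
        "exponential": {"rate": -exp_slope, "r_squared": exp_r2},
        "power_law": {"exponent": -pow_slope, "r_squared": pow_r2},
    }
```

Because the results described here involve both regimes (exponential at short distances, power law at long ones), neither pure form is expected to fit a whole curve well; comparing the fits over short and long distance ranges separately is one way to see the change in regime.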
In four speech datasets (English, Japanese, German, Calabrian) we found that MI decay showed signs of both Markovian and hierarchical organization, where Markovian organization was primarily observed in short-distance within-word relationships. pic.twitter.com/BNJBRZo5Uv
In birdsong (European starling, Cassin's vireo, Bengalese finch, California thrasher) we found MI decayed in a similar way. Long-range relationships between vocal elements showed signs of both hierarchical and Markovian organization. pic.twitter.com/eFNkPmNJWh
Collectively, our results reveal a common structure in both the short- and long-range sequential dependencies in birdsong and speech. At short timescales, information decay is largely exponential, indicating structure that is governed by Markovian processes.
Throughout vocal sequences, however, and especially for long-timescale dependencies, a power law, indicative of non-Markovian hierarchical processes, governs information decay in both birdsong and speech.
More to explore and more to come! Let me know what you all think!