The key distinctions in this paper are about multimodal relations that must balance grammar in multiple domains. Many models of multimodal relations describe the various meaningful (i.e., semantic) interactions between modalities. This paper extends beyond these relationships to talk about how the dominance of meaning in one modality or another must negotiate grammatical structure in one or multiple modalities.
This paper has had a long journey... I first had many of these ideas way back in 2003, and they were part of an early draft on my website about multimodality called "Interactions and Interfaces." In 2010, I started reconsidering how to integrate the theory in the context of my mentor Ray Jackendoff's model of language—the parallel architecture. The component parts were always the same, but articulating them in this way allowed for a better grounding in a model of cognition, and in further elaborating on how these distinctions about multimodality fit within a broader architecture. I then tinkered with the manuscript on and off for another 5 years...
So, 12 years later, this paper is finally out! It pretty much lays out how I conceive of language and different modalities of language (verbal, signed, visual), not to mention their relationships. I suppose that makes it a pretty significant paper for me.
The paper can be found on my Downloadable Papers page, and a direct link (pdf) is here.
Human communication is naturally multimodal, and substantial focus has examined the semantic correspondences in speech–gesture and text–image relationships. However, visual narratives, like those in comics, provide an interesting challenge to multimodal communication because the words and/or images can guide the overall meaning, and both modalities can appear in complicated ‘‘grammatical” sequences: sentences use a syntactic structure and sequential images use a narrative structure. These dual structures create complexity beyond those typically addressed by theories of multimodality where only a single form uses combinatorial structure, and also poses challenges for models of the linguistic system that focus on single modalities. This paper outlines a broad theoretical framework for multimodal interactions by expanding on Jackendoff’s (2002) parallel architecture for language. Multimodal interactions are characterized in terms of their component cognitive structures: whether a particular modality (verbal, bodily, visual) is present, whether it uses a grammatical structure (syntax, narrative), and whether it ‘‘dominates” the semantics of the overall expression. Altogether, this approach integrates multimodal interactions into an existing framework of language and cognition, and characterizes interactions between varying complexity in the verbal, bodily, and graphic domains. The resulting theoretical model presents an expanded consideration of the boundaries of the ‘‘linguistic” system and its involvement in multimodal interactions, with a framework that can benefit research on corpus analyses, experimentation, and the educational benefits of multimodality.
Cohn, Neil. 2015. A multimodal parallel architecture: A cognitive framework for multimodal interactions. Cognition 146: 304-323