This paper has been through a long journey. I posted the first draft on my website way back in 2003, ten years ago! I then expanded and refined it into its current form in 2009. After I pulled it from the journal where it was originally accepted (and where it languished for some time without being published), it finally found a published home. So, I'm glad it's now finally out!
The full paper can be accessed directly here (pdf), and is of course available with all my other papers.
Speech balloons and thought bubbles are among the most recognizable visual signs of the visual language used in comics. These enclosed graphic containers provide a way in which text and image can interface with each other. However, their stereotypical meanings as representing speech or thought betray much deeper semantic richness. This paper uses these graphic signs as a platform for examining the multimodal interfaces between text and image, and details four types of interfaces that characterize the connections between modalities: Inherent, Emergent, Adjoined, and Independent relationships. Each interface facilitates different levels of multimodal integration, tempered by principles of Gestalt grouping and underlying semantic features. This process allows the possibility of creating singular cohesive units of text and image that are on par with other multimodal interfaces, such as between speech and gesture.
Cohn, Neil. 2013. Beyond speech balloons and thought bubbles: The integration of text and image. Semiotica. 2013(197): 35-63.