Saturday, May 04, 2019

New paper: Being explicit about the implicit

My cascade of recent new papers continues with my latest paper, "Being explicit about the implicit: inference generating techniques in visual narrative", which has recently been published open access in Language and Cognition. This paper was gestating for quite a while, and it's fun to finally see it published.

This paper is about how inference is generated in visual narratives like comics—i.e., how you get meaning when it is not provided overtly. This has been a primary focus of studies of how comics communicate at least since McCloud's notion of "closure" in Understanding Comics, and many other scholars have posited ways that we "fill in the gaps" to understand what we don't see.

However, much of this work has posited vague principles (closure, arthrology, etc.) to say that people generate inferences, without discussing the specific cues and techniques that motivate those inferences in the first place. As I hope I demonstrate in this paper, inference is not a happenstance thing, and it also doesn't occur "in the gaps between panels," as most in comics studies seem to argue.

Rather, specific techniques motivate readers to create inferences. These techniques are patterned ways of showing, or not showing, information that in turn signal to readers that they need to make an inference. The figure below provides a handy-dandy summary of some of the techniques mentioned in the paper (though it isn't a figure in the paper). A high-res version is available here for personal printing and use.



The overarching argument, then, is that it's not enough to posit broad generalities for how visual narratives like comics are comprehended; rather, research should explore the specific methods and techniques that motivate that comprehension.

Not only does this paper list off these various techniques, but I also provide an analytical framework for characterizing their underlying features. This analysis actually goes back about 5 years, to when my former students Kaitlin Pederson and Ryan Taylor met with me in my office at UCSD to brainstorm about inference, resulting in this scrawled whiteboard, which laid the foundation for the table at the end of the article:



You can find the full article online here, or a pdf file here and via my downloadable papers page.
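Though the paper presents its feature analysis as a table, the logic of the approach can be sketched in code: each technique is characterized by a bundle of descriptive features, and techniques can be compared by the features they share. Below is a toy Python sketch; the technique names echo terms from my work, but the features and their boolean values are invented for illustration, not the paper's actual table.

```python
# A toy feature matrix for inference-generating techniques.
# The features and boolean values below are invented for illustration;
# they are not the actual table from the paper.

techniques = {
    "action star":       {"uses_morphology": True,  "omits_content": True,  "substitutes_panel": True},
    "tail-less balloon": {"uses_morphology": True,  "omits_content": True,  "substitutes_panel": False},
    "omitted panel":     {"uses_morphology": False, "omits_content": True,  "substitutes_panel": False},
}

def shared_features(a, b):
    """Return the descriptive features on which two techniques agree."""
    return {f for f in techniques[a] if techniques[a][f] == techniques[b][f]}

print(shared_features("action star", "tail-less balloon"))
# e.g. {'uses_morphology', 'omits_content'} (set order may vary)
```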

Abstract

Inference has long been acknowledged as a key aspect of comprehending narratives of all kinds, be they verbal discourse or visual narratives like comics and films. While both theoretical and empirical evidence points towards such inference generation in sequential images, most of these approaches remain at a fairly broad level. Few approaches have detailed the specific cues and constructions used to signal such inferences in the first place. This paper thereby outlines several specific entrenched constructions that motivate a reader to generate inference. These techniques include connections motivated by the morphology of visual affixes like speech balloons and thought bubbles, the omission of certain narrative categories, and the substitution of narrative categories for certain classes of panels. These mechanisms all invoke specific combinatorial structures (morphology, narrative) that mismatch with the elicited semantics, and can be generalized by a set of shared descriptive features. By detailing specific constructions, this paper aims to push the study of inference in visual narratives to be explicit about when and why meaning is ‘filled in’ by a reader, while drawing connections to inference generation in other modalities.


Cohn, Neil. 2019. Being explicit about the implicit: inference generating techniques in visual narrative. Language and Cognition.

Saturday, April 13, 2019

New paper: Your brain on comics

I'm very excited to announce the publication of my newest paper, "Your brain on comics: A cognitive model of visual narrative comprehension" in Topics in Cognitive Science. This journal issue is actually a themed issue about visual narratives edited by me, and this paper is my personal contribution.

This paper is in many ways a culmination of about 10 years of experimental research asking "how do we comprehend a sequence of images?" Much of this work comes from my studies measuring people's brainwaves while they read comics, but it integrates that work with research from the fields of discourse, event cognition, and other related disciplines. Here, I tie this work together in a cognitive model to provide an explanation for what happens in the brain when you progress through a sequence of images. My emphasis on brain studies gives the overall endeavor a neurocognitive focus, although the model itself is not specific to the brain.

The primary paper focuses on the evidence for two levels of representation in processing a sequence of images: a semantic structure, which computes the meaning, and a narrative structure, which organizes and presents that meaning in a sequence. In addition, I discuss how these mechanisms are connected to other aspects of cognition, like language and music processing, and I discuss the role of expertise and fluency in comprehending sequential images.
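For the computationally minded, the division of labor can be pictured as two parallel streams that each contribute their own processing cost. The following Python snippet is only a toy sketch of that idea, with invented feature sets, category orderings, and numbers; it is not an implementation of the model in the paper.

```python
# Toy sketch: semantic and narrative levels scored in parallel.
# All names and numbers here are invented for illustration.

from dataclasses import dataclass

@dataclass
class Panel:
    semantics: set        # referential/event features, e.g. {"boxer", "punch"}
    narrative_role: str   # e.g. "Establisher", "Initial", "Peak", "Release"

# A simplified canonical ordering of narrative categories
CANONICAL = ["Establisher", "Initial", "Prolongation", "Peak", "Release"]

def process_sequence(panels):
    """Score each panel for semantic and narrative discontinuity separately,
    loosely analogous to N400-like vs. P600-like processing costs."""
    context = set()   # the growing mental model of the discourse
    last_idx = -1     # position of the last narrative category seen
    for i, panel in enumerate(panels):
        # Semantic level: cost is higher when a panel shares little with context
        overlap = len(panel.semantics & context)
        semantic_cost = 1.0 / (1 + overlap)
        # Narrative level: cost when a category appears out of canonical order
        idx = CANONICAL.index(panel.narrative_role)
        narrative_cost = 1.0 if idx < last_idx else 0.0
        last_idx = idx
        context |= panel.semantics   # update the mental model
        print(f"Panel {i}: semantic={semantic_cost:.2f}, narrative={narrative_cost:.2f}")

process_sequence([
    Panel({"boxer", "glove"}, "Establisher"),
    Panel({"boxer", "punch"}, "Initial"),
    Panel({"boxer", "punch", "hit"}, "Peak"),
])
```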

Overall, this is the first full processing theory of visual narrative comprehension, making it a significant marker in the growth of this research field.

The paper is readable online with Open Access, and a downloadable pdf is available here and via my downloadable papers page. Here's the abstract:


The past decade has seen a rapid growth of cognitive and brain research focused on visual narratives like comics and picture stories. This paper will summarize and integrate this emerging literature into the Parallel Interfacing Narrative-Semantics Model (PINS Model)—a theory of sequential image processing characterized by an interaction between two representational levels: semantics and narrative structure. Ongoing semantic processes build meaning into an evolving mental model of a visual discourse. Updating of spatial, referential, and event information then incur costs when they are discontinuous with the growing context. In parallel, a narrative structure organizes semantic information into coherent sequences by assigning images to categorical roles, which are then embedded within a hierarchic constituent structure. Narrative constructional schemas allow for specific predictions of structural sequencing, independent of semantics. Together, these interacting levels of representation engage in an iterative process of retrieval of semantic and narrative information, prediction of upcoming information based on those assessments, and subsequent updating based on discontinuity. These core mechanisms are argued to be domain-general—spanning across expressive systems—as suggested by similar electrophysiological brain responses (N400, P600, anterior negativities) generated in response to manipulation of sequential images, music, and language. Such similarities between visual narratives and other domains thus pose fundamental questions for the linguistic and cognitive sciences.



Cohn, Neil. 2019. Your brain on comics: A cognitive model of visual narrative comprehension. Topics in Cognitive Science. doi:10.1111/tops.12421

Friday, April 05, 2019

Knowing the rules of comic page layouts

One of my more engaged-with blog posts of recent memory reviewed the data for whether the panel arrangement on the right was “confusing.” So, here’s a post with some additional thoughts on this and the “rules” of comic page layouts**…

First off, let me remind people that I've given this layout a name: When you have a vertical stack of panels next to a tall panel, I call it "blockage." You can find terms (and science!) related to page layout in my book and my scientific papers (also linked throughout).

Most of the claims I make about page layouts are based on the experiments that I and others have done about them. For this layout, the key experimental findings came from two studies presenting people with empty page layouts, and then asking them to choose the order that they would read the panels.

We found that for blockage layouts, around 90% of people say "down." Put conversely, less than 10% of choices in these situations followed the "left-to-right-and-down" Z-path that follows the order of written text. As I said in my previous blog post, this rate is essentially the inverse of what we find for pure grids, where 90% of responses follow the Z-path (i.e., go right) rather than other paths.


Now, one criticism people have about these studies is that they don't have content in the layouts. Yes, these experiments presented empty panels, which might be different from layouts with content. But there's a good reason for this: the question we were asking wasn't "how do people read these layouts?" but rather "what are people's preferences for ordering these layouts?" Having no content works just fine for doing good science and factoring out confounding variables, and it answers our question of whether people have preferences for orders: yes, they clearly do.

So, these results show that readers have a preference for the proper reading direction. In other words, the "rule" in their minds is that they should read downward in blockage layouts. You might think that the "rule" of reading comic page layouts is "left-to-right and down," like text, and thus that this layout is confusing. But that's not the rule. I'll explain this more in a bit…

When I say that "this layout is not confusing," I mean that readers have clear intuitions for what to do in these situations. The layout itself is not confusing, since people know what to do with it. What creates confusion, then, is when creators don't know or don't obey this "go downward" rule, and still use layouts where blockage is read to the right. This could feasibly create confusion, since it treats this layout as "neutral," as if there isn't a rule for its order.

However, there is a clear rule for it, and thinking it's neutral is wrong according to the experimental results for what people say their preferences are. Grids aren't used as if right and down are equal choices (though they're even more physically ambiguous), and nor should this layout be.

Certainly a creator can manipulate the reading path by using the content or balloons to go in a different direction. Creators do this all the time in effective and creative ways, even with grids, like in the layout to the left. But going against the downward path in blockage layouts has to be recognized as "breaking the rule" with artistic intent.

So, why isn’t “left to right and down” the real rule of layout? Well, it’s *one* rule in comic page layouts, but it’s just a surface choice within a broader overarching set of rules/principles.

Readers don’t just move through a comic page from panel to panel along the “surface” of the canvas, making choices like right, down, etc. While it is likely that surface features like balloons and bubbles can "direct" the eye, layouts themselves have rules that are not dependent on these surface features, as demonstrated by the consistent results using empty layouts.

(Note: To my knowledge, there are no controlled experimental results showing that content directs readers' eyes through layouts. There is one non-controlled study that has some hints about this though.)

Here are the actual rules of layout: readers go through layouts guided by a desire to create grouped structures out of panels. The surface decisions that they make are based on alignments between the edges of panels, but these choices are subservient to the larger goal of making hierarchic groupings.

I argue that these grouping mechanisms are what underlie readers’ choices when they move from panel to panel. This may involve some surface-level rules, but there is an overarching principle I call “Assemblage” that has four basic sub-principles (where “>” means “is preferred over”):

1. Grouped areas > non-grouped areas
2. Smooth paths > broken paths
3. Do not jump over units
4. Do not leave gaps

The reason so many people agree on going down in this layout is that it facilitates chunking the page into grouped structures, while the rightward path violates the Assemblage principles. This is why it’s a “rule.”
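To make this concrete, here is a minimal Python sketch of how these preferences could be cast as a scoring heuristic for choosing the next panel. The feature names and weights are my own invention for illustration; the experiments and papers do not specify numeric weights.

```python
# Toy scoring of candidate next-panels by Assemblage-like preferences.
# Features and weights are invented for illustration only.

def assemblage_score(move):
    """move: dict of boolean features describing a candidate next panel."""
    score = 0
    if move["stays_in_group"]:     # 1. grouped areas > non-grouped areas
        score += 2
    if move["smooth_path"]:        # 2. smooth paths > broken paths
        score += 1
    if move["jumps_over_unit"]:    # 3. do not jump over units
        score -= 3
    if move["leaves_gap"]:         # 4. do not leave gaps
        score -= 3
    return score

# Blockage layout: going down stays inside the vertical stack (a group),
# while going right to the tall panel would leave the stacked panels unread.
down  = {"stays_in_group": True,  "smooth_path": True,  "jumps_over_unit": False, "leaves_gap": False}
right = {"stays_in_group": False, "smooth_path": True,  "jumps_over_unit": False, "leaves_gap": True}

best = max([("down", down), ("right", right)], key=lambda m: assemblage_score(m[1]))
print(best[0])  # -> "down"
```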

So, if you’re a comic creator, knowing what readers are trying to do while they read can help you design layouts, including how to break those rules with intent if you need to do so artistically.



**This originally appeared as a Twitter thread, and has now been expanded for blog format.

Thursday, December 20, 2018

2018: My publications in review

The last few years I've closed out the year by summarizing all of my papers that came out (2016, 2017), and so this year I'm doing the same. It's been a diverse year of papers, with some theoretical papers, a few brainwave papers carried out by colleagues, and a corpus study. So, here are the papers that I published in 2018...

The cultural pages of comics (PDF) - This paper, coauthored with my student assistants, followed up our analysis of page layouts in superhero comics by comparing page layouts in 60 comics: 10 each from US superhero comics, US Indy comics, Japanese shonen manga, Hong Kong manhua, French bande dessinée, and Swedish comics. Overall, we found that cultures differ in their page layout features in patterned and systematic ways. For example, layouts in Asian comics use more vertical segments, while those from Europe and US Indy comics use more staggering of panels within horizontal rows.

In defense of a “grammar” in the visual language of comics (PDF) - This theoretical paper reviews my theory of narrative structure and defends it against critiques that sequential image comprehension requires only meaningful connections between panels. I review and compare the theories, and lay out arguments for why a narrative structure is both necessary and supported by the experimental evidence. I also take the hard line that any proposal for how visual narrative sequences are understood must account for the cognitive results from experimentation.

Combinatorial morphology in visual languages (PDF) - In this chapter from the recent book The Construction of Words: Advances in Construction Morphology, I try to formalize the linguistic structure of the morphology ("symbology") of visual representations like hearts or lightbulbs above the head, motion lines, and impact stars. It discusses both how these forms use systematic strategies to combine elements, and the ways they derive meaning through symbolic and metaphorical techniques.

Listening beyond seeing (PDF) - My coauthor Mirella Manfredi carried out this cool study, which showed people comics and, at the critical panel, also played them sounds. The panel showed an action, accompanied either by a spoken onomatopoeia that matched or mismatched the action, or by an actual sound effect that matched/mismatched the action. We measured people's brainwaves, and found that their processing of these multimodal meanings partially overlapped, but partially did not. Brainwaves to words and sounds differed at the start of their processing, but seemed not to differ in later stages, implying some sort of integrative process.

Visual Language Theory and the scientific study of comics (PDF) - This chapter appeared in the recent book Empirical Comics Research, which has a wide survey of studies using empirical methods (corpus, computational, cognitive) to study comics. My paper provides a review of my Visual Language Theory and its structures of vocabulary, layout, and narrative structure. I describe how theories of their structure combine with corpus analysis and psychological experimentation to give us a converging view of how the visual languages in comics are built. I think it's a relatively decent introductory paper for people who are unfamiliar with my theories.

Are emoji a poor substitute for words? (PDF, Poster) - Our conference paper from the 2018 Meeting of the Cognitive Science Society looked at how people process sentences when emoji are substituted for words. We found that people view emoji slower than words in sentences, but even slower when the emoji mismatches the part of speech (e.g., a "noun-ish" emoji in verb position). When people read the next word after seeing a congruous emoji, they process it just as easily as in an all-text sentence, but words after incongruous emoji are still read slower. This suggests that congruous emoji substituted for words can readily be integrated into the syntax of sentences. We also compared logos and emoji substituted in text, and found they didn't differ in their processing.

Visual and linguistic narrative comprehension in autism spectrum disorders (PDF) - My first paper with my colleague Emily Coderre compares the brainwaves of neurotypical individuals and individuals with autism while they comprehended both verbal and visual narratives. People have often claimed that autistic individuals do better with visual materials, but we show similar processing deficits for both verbal and visual materials, hinting at a more general issue with processing meaning across modalities. This is the first of my papers on autism and visual narratives with Emily, and we've got lots more on tap coming soon.

Workshop: How we make and understand drawings - Finally, not a publication, but back in April I gave two workshops at the University of Connecticut with philosopher Gabe Greenberg where we examined the structure and meaning of individual and sequential images. My portion (first day) examined how drawings are structured and how people learn to draw, which starts midway (02:18:15) through this video:



On the second day, my portion reviewed my findings about how visual narratives are processed, particularly the combination of narrative structure and meaning. I then presented my multimodal model of language and cognition. That's in the second half of this video (02:04:20), which unfortunately has poorer sound:




Forecasting ahead to next year, I can already say that it's going to be a big year. I have a special issue of a journal that I'm editing that has some great-looking papers. I also have two big review papers that should be coming out, one on processing and one on "fluency" of sequential images. Plus, we've now run five (!) brainwave studies in my now-operational EEG lab here in Tilburg, all of which are being written up. So, here's looking forward to a good 2019...

These and all my papers are available on my website here.

Monday, December 17, 2018

Review: Metaphoricity of Conventionalized Diegetic Images in Comics

Michał Szawerna's recent book Metaphoricity of Conventionalized Diegetic Images in Comics: A Study in Multimodal Cognitive Linguistics analyzes a variety of structural aspects of the visual languages of comics by taking a deep dive into Peircean semiotics and cognitive linguistics, particularly conceptual metaphor theory and cognitive grammar. The book seems to have flown largely under the radar of most discussions of comics theory, but it is interesting in several regards.

The book opens with an analysis of the history of scholarship on comics, emphasizing the structuralist and linguistic analyses. Included in this is a discussion of Polish research, which I had not previously seen discussed in other publications. It also extensively covers the semiotic theories of C.S. Peirce and the developments of conceptual metaphor theory over the past 30 years.

The substantive chapters then each delve into a different aspect of the structure of comics. This starts with a chapter on the abstract properties of panels and how they convey time across sequences, then progresses to a discussion of depictions of motion (motion lines, polymorphic panels). Subsequent chapters discuss the depictions of sound (balloons) and "mental experiences" (like thought bubbles and upfixes). A concluding chapter then summarizes the overall arguments.

The book throughout contains several insightful examples and analyses, and at the least makes one consider the complexity of various visual conventions. For example, the chapter on motion discusses what I've called "polymorphic" representations, where a single panel shows a character repeated across an action to imply motion. Here Szawerna observes that this overall pattern extends beyond motion, and can also depict transformations, like a werewolf's shift from a man to wolf-man. I don't think I've seen this representation discussed in any other paper, and it's a nice observation of its similarities to other polymorphic panels.

Other observations seem a little overly strong. For example, in the chapter on comic panels, Szawerna takes on the strong McCloudian position that the width of panels has a direct correspondence to time duration. He also claims that images in sequence map directly to a timeline of episodic events (a space = time metaphor), even comparing comics to the grid pattern of days on a calendar. I've long pointed out problems with this view, and evidence against it has been provided by several experiments.

This relates to my first critique of the book. Though the book has many good insights, it ultimately feels like a case of “if all you have is a hammer, then everything looks like a nail.” That is, metaphorical interpretations run so rampant throughout that no alternative interpretations are offered or considered. I don't disagree with metaphorical interpretations of various conventions, but it seems a metaphorical interpretation should be a "last resort" when a simpler explanation is possible. For example, experimentation on motion lines has implied that their understanding is not metaphorical or based on our perception of moving objects, but driven largely by conventionalization.

Also, while the work is clearly well-researched, at times the references seem selective or miss important arguments. For example, in the introductory chapter, Szawerna critiques my notion of visual language on the basis of Hockett’s design features for language, claiming that visual languages cannot be languages because they do not exhibit things like duality of patterning or arbitrariness. However, these issues are addressed in the second chapter of my book, which is cited, and perhaps more importantly, he does not acknowledge that those features do not hold up for sign languages, nor are they even consistent descriptors of spoken languages.

My second main critique of the book relates to cognition. Mostly the book seeks to describe what is happening in the visual language of comics, often in very intense detail. But these descriptions often amount to just giving labels to things, falling short of explaining the mechanisms and cognitive processes involved in these representations. Granted, description is important too, but I would have hoped for more of a balance.

More concerning is the repeated invocation of the “psychological reality” of the argued analyses, despite no evidence being provided for such interpretations. There are no theoretical diagnostic tests, nor is any empirical literature discussed, even though there have been relevant psychological experiments about many of the issues under analysis. Claims of "psychological reality" need to engage the actual experimental cognitive literature, as should any theoretical claims about how "comics work."

For example, the experimental literature would especially be useful for examining Szawerna's claim that people transparently understand images and conventions in visual languages (which he attributes to Miodrag). The empirical literature actually shows cultural differences for many conventions that occur in comics (and even basic drawings). Also, developmental psychology has shown trajectories for learning to understand basic images, image sequences, and morphemes like motion lines and carriers. Szawerna uses the assumption of transparency to ground claims of metaphoric knowledge motivated by universal and embodied understanding, but the literature does not seem to support this (although non-transparency does not rule out a metaphoric interpretation).

Finally, it should be noted that stylistically this book is not an easy read, particularly for those who don't often read research on linguistics. It is often weighed down by jargon and exceedingly long sentences. Some serious copyediting could beneficially cut at least a third of the book's 490-page length. This would have been useful, as I fear that sometimes the book’s insights are buried beneath the prose.

Criticisms aside, the book seems like it would be important for scholars to engage with if they are interested in the understanding of these elements of visual vocabulary and/or visual metaphor. In addition, this book seems to be a landmark in the study of the visual language of comics for what it does. It is, to my knowledge, the first book devoted extensively to rigorously analyzing just a few structural features of the visual domain. Such depth of analysis is indicative of the growing seriousness and sophistication of the linguistic and cognitive approach to visual languages, hopefully making Szawerna's book a harbinger of further works to come.



Szawerna, Michał. 2017. Metaphoricity of Conventionalized Diegetic Images in Comics: A Study in Multimodal Cognitive Linguistics. Łódź Studies in Language 54. Peter Lang Publishing.

Tuesday, November 27, 2018

New paper: The cultural pages of comics

I'm excited to announce that our paper, "The cultural pages of comics: cross-cultural variation in page layouts", has been published in the Journal of Graphic Novels and Comics! It actually came out around a year ago, but I was waiting for it to leave "early view." Since it's still unchanged, I figured it better to just post it and get it out rather than waiting around.

This paper is a follow-up to our prior paper looking at how page layout has changed in American superhero comics across time. This project, largely undertaken by my student co-authors, instead compared the page layouts in six different types of comics from around the world.

Overall, we found that page layout could be a factor that characterizes different types of comics, since different cultures' layouts differed in consistent ways. In particular, Asian layouts (like Japanese manga and Hong Kong manhua) use more vertical segments than Western comics. Indy comics from the US and European comics tend to use more horizontal staggering, while American mainstream comics use more "pure" grids.
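For readers curious what this kind of corpus work looks like in practice, here is a hedged sketch of per-page feature coding and tallying in Python. The feature names mirror ones discussed here, but the coding scheme and data are invented for illustration, not our actual corpus.

```python
# Toy corpus tally of page-layout features by culture.
# The coding scheme and data below are invented for illustration.

from collections import defaultdict

pages = [  # one entry per coded page
    {"culture": "Japan",         "vertical_segments": 3, "staggering": 0, "pure_grid": False},
    {"culture": "France",        "vertical_segments": 0, "staggering": 2, "pure_grid": False},
    {"culture": "US-mainstream", "vertical_segments": 0, "staggering": 0, "pure_grid": True},
]

totals = defaultdict(lambda: defaultdict(int))
for page in pages:
    c = page["culture"]
    totals[c]["pages"] += 1
    totals[c]["vertical_segments"] += page["vertical_segments"]
    totals[c]["staggering"] += page["staggering"]
    totals[c]["pure_grids"] += int(page["pure_grid"])

for culture, feats in totals.items():
    print(culture, dict(feats))
```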

These findings further contribute to showing that there are systematic cross-cultural differences between the "visual languages" used in comics of the world. We've shown in many studies (several of which are still on their way to being published) that cultures' comics differ across nearly every dimension possible, and often vary within cultures (such as between genres). To some degree, such diversity calls into question just how coherent an abstract notion of the "comics medium" is in the first place. More on this to come in the future for sure.

The full paper is downloadable here, and along with all my papers here.

Abstract:

Page layouts are a salient feature of comics, which have only recently begun to be studied using empirical methods. This preliminary study uses corpus analysis to investigate the properties of page layouts in comics from Europe (Sweden, France), Asia (Japan, Hong Kong), and America (Mainstream, Indy genres). Pages from Asian books used more vertical segments and bleeding panels, while European and American Indy pages used more horizontal staggering. Pages from American mainstream comics used widescreen panels spanning a whole row, and more variable distances between panels (separation, overlap). These results suggest that pages from different types of comics have different systematic characteristics, which can be studied by empirical methods.

Full reference:

Cohn, Neil, Jessika Axnér, Michaela Diercks, Rebecca Yeh, and Kaitlin Pederson. 2017. The cultural pages of comics: Cross-cultural variation in page layouts. Journal of Graphic Novels and Comics. doi: 10.1080/21504857.2017.1413667.

Thursday, September 13, 2018

New paper: Visual and linguistic narrative comprehension in autism spectrum disorders

My new paper with my collaborator, Emily Coderre, is finally out in Brain and Language. Our paper, "Visual and linguistic narrative comprehension in autism spectrum disorders: Neural evidence for modality-independent impairments," examines the neurocognition of how meaning is processed in verbal and visual narratives for individuals with autism and neurotypical controls.

We designed this study because there are many reports that individuals with autism do better with visual than verbal information. In the brain literature, we also see reduced brainwaves indicative of semantic processing during language processing in these individuals. So, we asked here: are these observations about semantic processing due to differences between visual and verbal information, or due to processing meaning across a sequence?

Thus, we presented both individuals with autism and neurotypical controls with either verbal or visual narratives (i.e., comics, or comics "translated" into text) and then introduced anomalous words/images at their end to see how incongruous information would be processed in both types of stimuli.

We found that individuals with autism had reduced semantic processing (the N400 brainwave response) to the incongruities in both the verbal and visual narratives. This implies that the deficit is not in processing a particular modality, but in a more general type of information processing.

The full paper is available at my Downloadable Papers page, or at this link (pdf).

Abstract

Individuals with autism spectrum disorders (ASD) have notable language difficulties, including with understanding narratives. However, most narrative comprehension studies have used written or spoken narratives, making it unclear whether narrative difficulties stem from language impairments or more global impairments in the kinds of general cognitive processes (such as understanding meaning and structural sequencing) that are involved in narrative comprehension. Using event-related potentials (ERPs), we directly compared semantic comprehension of linguistic narratives (short sentences) and visual narratives (comic panels) in adults with ASD and typically-developing (TD) adults. Compared to the TD group, the ASD group showed reduced N400 effects for both linguistic and visual narratives, suggesting comprehension impairments for both types of narratives and thereby implicating a more domain-general impairment. Based on these results, we propose that individuals with ASD use a more bottom-up style of processing during narrative comprehension.


Coderre, Emily L., Neil Cohn, Sally K. Slipher, Mariya Chernenok, Kerry Ledoux, and Barry Gordon. 2018. "Visual and linguistic narrative comprehension in autism spectrum disorders: Neural evidence for modality-independent impairments." Brain and Language 186:44-59.

Thursday, August 02, 2018

New paper: Visual Language Theory and the scientific study of comics

My latest paper is a chapter in the exciting new book collection, Empirical Comics Research: Digital, Multimodal, and Cognitive Methods, edited by Alexander Dunst, Jochen Laubrock, and Janina Wildfeuer. The book is a collection of empirical studies about comics, summarizing many of the works presented at the Empirical Studies of Comics conference at Bremen University in 2017.

It's fairly gratifying to see a collection like this combining various scholars' work using empirical methods to analyze comics. I've been doing this kind of work for almost two decades at this point, and most of it has been without many other people doing such research, and certainly not coming together in a collaborative way. So, a publication like this is a good marker for what is hopefully an emerging field.

My own contribution to the collection is the last chapter, "Visual Language Theory and the scientific study of comics." I provide an overview of my visual language research across the fields of the visual vocabulary of images, narrative structure, and page layout.

I also give some advice for how to go about such research and the necessity of an interdisciplinary perspective balancing theory, experimentation, and corpus analysis. The emphasis here is that all three of these techniques are necessary to make progress, and using one technique alone is limiting.

You can find a preprint version of my chapter here, though I recommend checking out the whole book:

Empirical Comics Research: Digital, Multimodal, and Cognitive Methods

Abstract of my chapter:

The past decades have seen the rapid growth of empirical and experimental research on comics and visual narratives. In seeking to understand the cognition of how comics communicate, Visual Language Theory (VLT) argues that the structure of (sequential) images is analogous to that of verbal language, and that these visual languages are structured and processed in similar ways to other linguistic forms. While these visual languages appear prominently in comics of the world, all aspects of graphic and drawn information fall under this broad paradigm, including diverse contexts like emoji, Australian aboriginal sand drawings, instruction manuals, and cave paintings. In addition, VLT’s methods draw from that of the cognitive and language sciences. Specifically, theoretical modeling has been balanced with corpus analysis and psychological experimentation using both behavioral and neurocognitive measures. This paper will provide an overview of the assumptions and basic structures of visual language, grounded in the growing corpus and experimental literature. It will cover the nature of visual lexical items, the narrative grammar of sequential images, and the compositional structure of page layouts. Throughout, VLT emphasizes that these components operate as parallel yet interfacing structures, which manifest in varying ‘visual languages’ of the world that temper a comprehender’s fluency for such structures. Altogether, this review will highlight the effectiveness of VLT as a model for the scientific study of how graphic information communicates.


Cohn, Neil. 2018. Visual Language Theory and the scientific study of comics. In Wildfeuer, Janina, Alexander Dunst, and Jochen Laubrock (Eds.), Empirical Comics Research: Digital, Multimodal, and Cognitive Methods (pp. 305-328). London: Routledge.

Sunday, July 08, 2018

New paper: Listening beyond seeing

Our new paper has just been published in Brain and Language, titled "Listening beyond seeing: Event-related potentials to audiovisual processing in visual narrative." My collaborator Mirella Manfredi carried out this study, which builds on her previous work looking at different types of words (Pow! vs. Hit!) substituted into visual narrative sequences.

Here, Mirella showed visual narratives where the climactic event either matched or mismatched auditory sounds or words. So, as in the figure to the right, a panel showing Snoopy spitting would be accompanied by the sound of spitting or the word "spitting." Or, we played incongruous sounds, like the sound of something getting hit, or the word "hitting."

We measured participants' brainwave responses (ERPs) to these panels/sounds. We found that these stimuli elicited an "N400 response"—which occurs in the processing of meaning in any modality (words, sounds, images, video, etc.). Though the overall semantic processing response (N400) was similar for both stimulus types, the incongruous sounds evoked a slightly different response across the scalp than the incongruous words. This suggests that, despite the overall process of computing meaning being similar, these stimuli may be processed in different parts of the brain.

In addition, these patterned responses very much resembled what is typical of showing words or sounds in isolation, and did not resemble what often appears in response to images. This suggests that, despite the multimodal image-sound/word interaction determining whether stimuli were congruent or incongruent, the semantic processing of the images did not seem to factor into the responses (or was equally subtracted out across stimulus types).

So, overall, this implies that semantic processing across different modalities uses a similar response (N400), but may differ in neural areas.

You can find the paper here (pdf) or along with my other downloadable papers.

Abstract
Every day we integrate meaningful information coming from different sensory modalities, and previous work has debated whether conceptual knowledge is represented in modality-specific neural stores specialized for specific types of information, and/or in an amodal, shared system. In the current study, we investigated semantic processing through a cross-modal paradigm which asked whether auditory semantic processing could be modulated by the constraints of context built up across a meaningful visual narrative sequence. We recorded event-related brain potentials (ERPs) to auditory words and sounds associated to events in visual narratives—i.e., seeing images of someone spitting while hearing either a word (Spitting!) or a sound (the sound of spitting)—which were either semantically congruent or incongruent with the climactic visual event. Our results showed that both incongruent sounds and words evoked an N400 effect, however, the distribution of the N400 effect to words (centro-parietal) differed from that of sounds (frontal). In addition, words had an earlier latency N400 than sounds. Despite these differences, a sustained late frontal negativity followed the N400s and did not differ between modalities. These results support the idea that semantic memory balances a distributed cortical network accessible from multiple modalities, yet also engages amodal processing insensitive to specific modalities.

Full reference:

Manfredi, Mirella, Neil Cohn, Mariana De Araújo Andreoli, and Paulo Sergio Boggio. 2018. "Listening beyond seeing: Event-related potentials to audiovisual processing in visual narrative." Brain and Language 185:1-8. doi: https://doi.org/10.1016/j.bandl.2018.06.008.

Sunday, April 22, 2018

New paper: Combinatorial morphology in visual languages

I'm very pleased to announce that my newest paper, "Combinatorial morphology in visual languages" has now been published in a book collection edited by Geert Booij, The Construction of Words: Advances in Construction Morphology. The overall collection looks excellent and is a great resource for work in linguistics on morphology across domains.

My own contribution makes a first attempt to formalize the structure of combinatorial visual morphology—how visual signs like motion lines or hearts combine with their "stems" to create a larger additive meaning.

This paper also introduces a new concept for these types of signs. Since various visual morphemes are affixes—like the "upfixes" that float above faces (right)—it raises the question: what are these affixes attaching to? In verbal languages, affixes attach to "word" units. But visual representations don't have words, so this paper discusses what type of structure would be required to fill that theoretical gap, and formalizes this within the parallel architecture model of language.
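As a rough illustration of the kind of formal object at stake, here is a toy Python sketch of an upfix construction as a form-meaning pairing over a stem. The class names and the way meaning composes are my own invention for this post, not the chapter's parallel-architecture formalism.

```python
# Toy representation of a combinatorial visual morpheme (an "upfix").
# Class names and composition are invented for illustration.

from dataclasses import dataclass

@dataclass
class VisualMorpheme:
    form: str      # graphic form, e.g. "gears", "hearts"
    meaning: str   # the meaning it contributes

@dataclass
class UpfixConstruction:
    stem: str                # the base unit the affix attaches to, e.g. a face
    affix: VisualMorpheme    # bound morpheme floating above the stem

    def composed_meaning(self):
        # Additive meaning: the stem plus the affix's contribution,
        # constrained by placement (an upfix must appear above the head)
        return f"{self.stem} + {self.affix.meaning} ('{self.affix.form}' above head)"

print(UpfixConstruction("face", VisualMorpheme("gears", "thinking")).composed_meaning())
```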

You can download a pre-print of the chapter here (pdf) or on my downloadable papers page.

Abstract

Just as structured mappings between phonology and meaning make up the lexicons of spoken languages, structured mappings between graphics and meaning comprise lexical items in visual languages. Such representations may also involve combinatorial meanings that arise from affixing, substituting, or reduplicating bound and self-standing visual morphemes. For example, hearts may float above a head or substitute for eyes to show a person in love, or gears may spin above a head to convey that they are thinking. Here, we explore the ways that such combinatorial morphology operates in visual languages by focusing on the balance of intrinsic and distributional construction of meaning, the variation in semantic reference and productivity, and the empirical work investigating their cross-cultural variation, processing, and acquisition. Altogether, this work draws these parallels between the visual and verbal domains that can hopefully inspire future work on visual languages within the linguistic sciences.


Cohn, Neil. 2018. Combinatorial morphology in visual languages. In Booij, Geert (Ed.), The Construction of Words: Advances in Construction Morphology (pp. 175-199). London: Springer.

Tuesday, April 10, 2018

Workshop: How We Make and Understand Drawings

A few weeks back I had the pleasure of doing a workshop with Gabriel Greenberg (UCLA) about the understanding of drawings and visual narratives at the University of Connecticut. The workshop was hosted by Harry van der Hulst from the Linguistics Department, and we explored the connections between graphic systems and the structure of language. UConn has now been nice enough to put our talks online for everyone, and I've posted them below.

On Day 1, Gabriel first talked about his theory of pictorial semantics. Then, I presented my theory about the structure of the "visual lexicon(s)" of drawing systems, and then about how children learn to draw. This covered what it means for people to say "I can't draw," as was the topic of my papers on the structure of drawing.



On Day 2, we covered the understanding of sequential images. Here our views diverged, with Gabriel taking more of a "discourse approach", while I presented my theory of Visual Narrative Grammar and several of the studies supporting it. I finished by presenting my "grand theory of everything" about a multimodal model of language and communication. Unfortunately, the mic ran out of batteries on the second day and we didn't know it, so the sound is very soft. But, if you crank up the volume and listen carefully, you should be able to hear it (hopefully).

Thursday, February 15, 2018

New Paper: In defense of a “grammar” in the visual language of comics

I'm excited to announce that my new paper, "In defense of a 'grammar' in the visual language of comics" is now published in the Journal of Pragmatics. This paper provides an overview of my theory of narrative grammar, and rigorously compares it against other approaches to sequential image understanding.

Since my proposal that a "narrative grammar" operates to guide meaningful information in (visual) narratives, there have been several critiques and misunderstandings about how it works. Some approaches have also been proposed as a counterpoint. I feel all of this is healthy in the course of development of a theory and (hopefully) a broader discipline.

In this paper I address some of these concerns. I detail how my model of Visual Narrative Grammar operates and I review the empirical evidence supporting it. I then compare it in depth to the specifics and assumptions found in other models. Altogether I think it makes for a good review of the literature on sequential image understanding, and outlines what we should expect out of a scientific approach to visual narrative.

The paper is available on my Downloadable Papers page, or direct through this link (pdf).

Abstract:

Visual Language Theory (VLT) argues that the structure of drawn images is guided by similar cognitive principles as language, foremost a “narrative grammar” that guides the ways in which sequences of images convey meaning. Recent works have critiqued this linguistic orientation, such as Bateman and Wildfeuer's (2014) arguments that a grammar for sequential images is unnecessary. They assert that the notion of a grammar governing sequential images is problematic, and that the same information can be captured in a “discourse” based approach that dynamically updates meaningful information across juxtaposed images. This paper reviews these assertions, addresses their critiques about a grammar of sequential images, and then details the shortcomings of their own claims. Such discussion is directly grounded in the empirical evidence about how people comprehend sequences of images. In doing so, it reviews the assumptions and basic principles of the narrative grammar of the visual language used in comics, and it aims to demonstrate the empirical standards to which theories of comics' structure should adhere.


Full reference:

Cohn, Neil. 2018. In defense of a "grammar" in the visual language of comics. Journal of Pragmatics 127: 1-19.