Sunday, October 27, 2019

Interview with A. David Lewis

I had the pleasure of being interviewed via streaming video by the comics scholar A. David Lewis recently, and he's now posted the video online! His primary line of questioning was whether my neurocognitive research could be considered a complementary side of Graphic Medicine (the field that uses graphics and comics to communicate and explore health-related concerns). Here's our discussion...

Tuesday, September 03, 2019

ERC Starting Grant for Visual Language research

I'm very happy to officially announce that I have received an ERC Starting Grant! This is my first major individual research grant (after many many tries), and I'm very excited to have the chance to work on a project I've been planning for over 10 years.

My project "Visual narratives as a window into language and cognition" (nicknamed "TINTIN") is going to build tools for analyzing visual and multimodal information, and then incorporate it into a corpus of data. All of these tools and data will be made publicly accessible for other researchers to explore, though we'll be using them to study whether there are cross-cultural patterns in the visual languages used in comics of the world, and whether those patterns connect to the spoken languages of their authors. In the coming months I'll be hiring a team of students and researchers to put this project into motion.

This project is a follow-up to and expansion of my previous corpus work in the Visual Language Research Corpus, which capped out at around 300 comics (+ 4,000 Calvin and Hobbes strips). We're finishing writing up this data, which has already appeared in papers about cross-cultural page layouts, and about American page layouts and storytelling over time. However, since the TINTIN project will be launching a new, more sophisticated coding scheme and methods, I plan on making the data of the VLRC publicly available soon as well.

Here's my official description of the TINTIN project:

"Drawn sequences of images are a fundamental aspect of human communication, appearing from instruction manuals and educational material to comics. Despite this, only recently have scholars begun to examine these visual narratives, making this an untapped resource to study the cognition of sequential meaning-making. The emerging field analysing this work has implicated similarities between sequential images and language, which raises the question: Just how similar is the structure and processing of visual narratives and language? I propose to explore this query by drawing on interdisciplinary methods from the psychological and linguistic sciences. First, in order to examine the structural properties of visual narratives, we need a large-scale corpus of the type that has benefited language research. Yet, no such databases exist for visual narrative systems. I will thus create innovative visual annotation tools to build a corpus of 1,500 annotated comics from around the world (Stage 1). With such a corpus, I will then ask, do visual narratives differ in their properties around the world, and does such variance influence their comprehension (Stage 2)? Next, we might ask why such variation appears, particularly: might differences between visual narratives be motivated by patterns in spoken languages, thereby implicating cognitive processes across modalities (Stage 3)? Thus, this proposal aims to investigate the domain-specific (Stage 2) and domain-general (Stage 3) properties of visual narratives, particularly in relation to language, by analysing both production (corpus analyses) and comprehension (experimentation). This research will be ground-breaking by challenging our knowledge about the relations between drawing, sequential images, and language. The goal is not simply to create tools to explore a limited set of questions, but to provide resources to jumpstart a budding research field for visual and multimodal communication in the linguistic and cognitive sciences."

Be ready to hear a lot more about this project over the next 5+ years!

Saturday, June 01, 2019

New paper: Structural complexity in visual narratives

2019 so far has been a flurry of published papers for me, and here's yet another. My paper "Structural complexity in visual narratives: Theory, brains, and cross-cultural diversity" is now published in the book collection Narrative Complexity and Media: Experiential and Cognitive Interfaces. The book is an extensive resource (468 pages!) including many chapters about the cognitive study of narrative. Mine is one of several that discuss visual narratives, along with complementary chapters by Joe Magliano and James Cutting. So, the book is highly recommended!

In this paper, I tackle the issue of "narrative complexity" in three ways. First, I describe how sequences of images are built in terms of their underlying structure. This complexity comes from the narrative structure, where various schematic principles combine to create patterns whose architectural "complexity" is similar to what is found in the syntactic structure of sentences.
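To make the syntax analogy concrete, such a narrative sequence can be represented as a constituent tree in which a whole sub-sequence fills a single narrative role, much like an embedded clause in a sentence. Here's a toy illustration using categories from my visual narrative grammar (the analysis itself is invented for illustration, not taken from the paper):

```python
# A toy constituent tree for a five-panel sequence. The embedded "Arc"
# plays the Initial (set-up) role within the larger arc, which is the
# kind of hierarchic patterning the chapter compares to syntax.
tree = ("Arc", [
    ("Initial", ("Arc", [
        ("Establisher", "panel 1"),
        ("Initial", "panel 2"),
        ("Peak", "panel 3"),
    ])),
    ("Peak", "panel 4"),
    ("Release", "panel 5"),
])
```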

The second level of complexity comes in how these narrative patterns manifest in different types of comics from around the world. We coded the properties of various comics to see how comics from Europe, the United States, and Asia might differ in their narrative patterns. We found that they indeed vary, with comics from Asia (Japan, Korea, Hong Kong) using more complex sequencing patterns than those from Europe or the United States. This is important because such diversity is systematic, implying that these patterns are encoded in the minds of their authors and readers.

The third level of complexity comes in how visual narratives like comics are processed. Many theories posit that we understand comics by simply linking meanings between panels. This implies a fairly uniform process guided only by updating meaning from image to image. However, neurocognitive research implies that the brain actually uses several interacting mechanisms in the processing of narrative image sequences, balancing both meaning and a narrative structure of the type described in the previous sections.

Altogether, this paper outlines a balance between theoretical, cross-cultural, and neurocognitive research that identifies complexity at multiple levels.

The paper appears in the book itself, but a downloadable preprint version is available here or on my downloadable papers page.


Cohn, Neil. 2019. Structural complexity in visual narratives: Theory, brains, and cross-cultural diversity. In Grishakova, Marina and Maria Poulaki (Eds.), Narrative Complexity and Media: Experiential and Cognitive Interfaces (pp. 174-199). Lincoln: University of Nebraska Press.

New paper: The neurophysiology of event processing in language and visual events

Yet another of my recent publications is a book chapter that had been awaiting publication for many years. My paper with my dear departed friend Martin Paczynski, "The neurophysiology of event processing in language and visual events", is now finally published in the Oxford Handbook of Event Structure.

Our chapter gives an overview of research on the understanding of events from the perspective of cognitive neuroscience, particularly research using EEG. We actually wanted the original paper to be titled "Events electrified" but the book collection wanted less punchy titles. Our focus is on the N400 and P600 ERP effects, as they manifest in both language about events and in the perception of visual events themselves.

The paper can be downloaded here or at my downloadable papers page.

First paragraph:

"Events are a fundamental part of human experience. All actions that we undertake, discuss, and view are embedded within the understanding of events and their structure. With the increasing complexity of neuroimaging over the past several decades, we have been able for the first time to examine how this tacit knowledge is processed and stored in people’s minds and brains. Among the techniques used to study the brain, electroencephalography (EEG) offers one of the few ways in which we can directly study information processed by the brain. Unlike functional imaging, whether PET or fMRI, which rely on metabolic consequences of neural activity, the EEG signal is generated by post-synaptic potentials in pyramidal cells which make up approximately 80% of neurons within the cerebral cortex. As such, EEG offers a temporal resolution measured in milliseconds, rather than seconds, making it well suited for exploring the rapid nature of language processing. Though there are numerous ways in which the EEG signal can be analyzed, in the current chapter we will focus our attention on the most common measure: event-related potentials (ERPs), the portion of the EEG signal time-locked to an event of interest, such as a word, image, or the start of a video clip."

Cohn, Neil and Martin Paczynski. 2019. The neurophysiology of event processing in language and visual events. In Truswell, Robert (Ed.), The Oxford Handbook of Event Structure (pp. 624-637). Oxford: Oxford University Press.

New paper: Visual narratives and the mind

My latest paper, "Visual narratives and the mind: Comprehension, cognition, and learning" is published in the collection Psychology of Learning and Motivation. This paper integrates a few threads of research that I've been working on lately.

The first section presents the cognitive processes that go into understanding a sequence of images, integrating two of the most recent psychological models on the issue. These include my own neurocognitive model of sequential image understanding that integrates both semantic and narrative structures, and an approach from some of my colleagues emphasizing aspects of scene perception and event cognition.

The second section then asks: given these cognitive processes of visual narrative understanding, how much of this processing is specialized for visual narratives specifically? Or are these general mechanisms that also apply to other aspects of cognition, like language? I argue for two levels: the more specialized processing mostly has to do with the modalities themselves, since how you engage written text might differ from how you engage pictures. However, the "back end" processes, such as how you compute meanings and order them into sequences, are likely more shared across domains.

Finally, I examine the relation between these cognitive processes and how children learn to understand a sequence of images. A wide literature points to children only starting to understand the sequential aspects of visual narratives between ages 4 and 6. So, I discuss the stages in children's development of understanding sequential images, and link this to the cognitive processes discussed in the first section.

You can find a direct preprint pdf version of the paper here, as well as on my downloadable papers page. Here's the abstract:

The way we understand a narrative sequence of images may seem effortless, given the prevalence of comics and picture stories across contemporary society. Yet, visual narrative comprehension involves greater complexity than is often acknowledged, as suggested by an emerging field of psychological research. This work has contributed to a growing understanding of how visual narratives are processed, how such mechanisms overlap with those of other expressive modalities like language, and how such comprehension involves a developmental trajectory that requires exposure to visual narrative systems. Altogether, such work reinforces visual narratives as a basic human expressive capacity carrying great potential for exploring fundamental questions about the mind.



Cohn, Neil. 2019. Visual narratives and the mind: Comprehension, cognition, and learning. In Federmeier, Kara D. and Diane M. Beck (Eds.), Psychology of Learning and Motivation: Knowledge and Vision. Vol. 70 (pp. 97-128). London: Academic Press.

Saturday, May 04, 2019

New paper: Being explicit about the implicit

My cascade of recent papers continues with my latest, "Being explicit about the implicit: inference generating techniques in visual narrative", which has recently been published open access in Language and Cognition. This is a paper that was gestating for quite a while, and it's fun to finally see it published.

This paper is about how inference is generated in visual narratives like comics, i.e., how you get meaning when it is not provided overtly. This has been a primary focus of studies of how comics communicate at least since McCloud's notion of "closure" in Understanding Comics, and many other scholars have posited how we "fill in the gaps" to understand what we don't see.

However, much of this work has posited vague principles (closure, arthrology, etc.) to claim that people generate inferences, without discussing the specific cues and techniques that motivate those inferences in the first place. As I hope I demonstrate in this paper, inference is not a happenstance thing, and it also doesn't occur "in the gaps between panels," as most in comics studies seem to argue.

Rather, specific techniques motivate readers to create inferences. These techniques are patterned ways of showing, or not showing, information that in turn signal to readers that they need to make an inference. The figure below provides a handy-dandy summary of some of the techniques mentioned in the paper (though it isn't a figure in the paper). A high-res version is available here if you want to print it for personal use.



The overarching argument thus is that it's not enough to posit broad generalities for how visual narratives like comics are comprehended, but rather research should explore the specific methods and techniques that motivate that comprehension.

Not only does this paper list these various techniques, but I also provide an analytical framework for characterizing their underlying features. This analysis actually goes back about five years, to when my former students Kaitlin Pederson and Ryan Taylor met with me in my office at UCSD to brainstorm about inference, resulting in this scrawled whiteboard which laid the foundation for the table at the end of the article:



You can find the full article online here, or a pdf file here and via my downloadable papers page.

Abstract

Inference has long been acknowledged as a key aspect of comprehending narratives of all kinds, be they verbal discourse or visual narratives like comics and films. While both theoretical and empirical evidence points towards such inference generation in sequential images, most of these approaches remain at a fairly broad level. Few approaches have detailed the specific cues and constructions used to signal such inferences in the first place. This paper thereby outlines several specific entrenched constructions that motivate a reader to generate inference. These techniques include connections motivated by the morphology of visual affixes like speech balloons and thought bubbles, the omission of certain narrative categories, and the substitution of narrative categories for certain classes of panels. These mechanisms all invoke specific combinatorial structures (morphology, narrative) that mismatch with the elicited semantics, and can be generalized by a set of shared descriptive features. By detailing specific constructions, this paper aims to push the study of inference in visual narratives to be explicit about when and why meaning is ‘filled in’ by a reader, while drawing connections to inference generation in other modalities.
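As a rough illustration of that framework, each technique can be thought of as pairing an overt cue with the meaning a reader must fill in. Here's a sketch of two techniques of the kind the abstract mentions; this two-feature breakdown is my simplification for illustration, not the article's actual feature table:

```python
# Representing inference techniques as small feature bundles. The split
# into "overt_cue" and "inferred" is an illustrative simplification of
# the shared descriptive features discussed in the article.
from dataclasses import dataclass

@dataclass(frozen=True)
class InferenceTechnique:
    name: str
    overt_cue: str   # what the reader actually sees on the page
    inferred: str    # what the reader must fill in

TECHNIQUES = [
    InferenceTechnique(
        "off-panel speech balloon",
        "a balloon whose tail points outside the panel border",
        "the unseen speaker and what they are doing off-panel",
    ),
    InferenceTechnique(
        "omitted narrative category",
        "a sequence missing an expected panel, such as its climax",
        "the event that the missing panel would have depicted",
    ),
]
```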


Cohn, Neil. 2019. Being explicit about the implicit: inference generating techniques in visual narrative. Language and Cognition.

Saturday, April 13, 2019

New paper: Your brain on comics

I'm very excited to announce the publication of my newest paper, "Your brain on comics: A cognitive model of visual narrative comprehension" in Topics in Cognitive Science. It appears in a themed issue about visual narratives that I edited, and this paper is my personal contribution to that issue.

This paper is in many ways a culmination of about 10 years of experimental research asking "how do we comprehend a sequence of images?" Much of this work comes from my studies measuring people's brainwaves while they read comics, but it integrates that work with research from discourse, event cognition, and other related fields. Here, I tie this work together in a cognitive model to provide an explanation for what happens in the brain when you progress through a sequence of images. My emphasis on brain studies gives the overall endeavor a neurocognitive focus, although the model itself is not specific to the brain.

The paper primarily focuses on the evidence for two levels of representation in processing a sequence of images: a semantic structure, which computes meaning, and a narrative structure, which organizes and presents that meaning in a sequence. In addition, I discuss how these mechanisms are connected to other aspects of cognition, like language and music processing, and I discuss the role of expertise and fluency in comprehending sequential images.

Overall, this is the first full processing theory of visual narrative comprehension, making it a significant marker in the growth of this research field.

The paper is readable online with Open Access, and a downloadable pdf is available here and via my downloadable papers page. Here's the abstract:


The past decade has seen a rapid growth of cognitive and brain research focused on visual narratives like comics and picture stories. This paper will summarize and integrate this emerging literature into the Parallel Interfacing Narrative-Semantics Model (PINS Model)—a theory of sequential image processing characterized by an interaction between two representational levels: semantics and narrative structure. Ongoing semantic processes build meaning into an evolving mental model of a visual discourse. Updating of spatial, referential, and event information then incur costs when they are discontinuous with the growing context. In parallel, a narrative structure organizes semantic information into coherent sequences by assigning images to categorical roles, which are then embedded within a hierarchic constituent structure. Narrative constructional schemas allow for specific predictions of structural sequencing, independent of semantics. Together, these interacting levels of representation engage in an iterative process of retrieval of semantic and narrative information, prediction of upcoming information based on those assessments, and subsequent updating based on discontinuity. These core mechanisms are argued to be domain-general—spanning across expressive systems—as suggested by similar electrophysiological brain responses (N400, P600, anterior negativities) generated in response to manipulation of sequential images, music, and language. Such similarities between visual narratives and other domains thus pose fundamental questions for the linguistic and cognitive sciences.
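To make that retrieve-predict-update cycle concrete, here is a deliberately toy sketch of the loop in Python. This is my illustration of the logic described in the abstract, not the PINS Model's actual formalism; the representations and cost numbers are placeholders:

```python
# Toy sketch of the PINS-style loop: two parallel levels (semantics and
# narrative structure), with updating costs when predictions fail.
# All structures and costs here are illustrative placeholders.
NARRATIVE_SCHEMA = ["Establisher", "Initial", "Peak", "Release"]  # canonical order

def process_sequence(panels):
    """Each panel is a dict like {"category": "Peak", "referents": {"hero"}}."""
    mental_model = set()   # toy stand-in for the evolving semantic situation model
    structure = []         # narrative categories assigned so far
    costs = []
    for panel in panels:
        # Prediction: the next canonical category, given the structure so far.
        expected = NARRATIVE_SCHEMA[min(len(structure), len(NARRATIVE_SCHEMA) - 1)]
        # Updating costs accrue from discontinuity: unfamiliar referents
        # (cf. N400 effects) or unexpected categories (cf. P600 effects).
        semantic_cost = len(panel["referents"] - mental_model)
        narrative_cost = 0 if panel["category"] == expected else 1
        costs.append(semantic_cost + narrative_cost)
        mental_model |= panel["referents"]    # update the semantic level
        structure.append(panel["category"])   # extend the narrative level
    return structure, costs
```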



Cohn, N. (2019). Your brain on comics: A cognitive model of visual narrative comprehension. Topics in Cognitive Science. doi:10.1111/tops.12421

Friday, April 05, 2019

Knowing the rules of comic page layouts

One of my more engaged-with blog posts of recent memory reviewed the data for whether the panel arrangement on the right was “confusing.” So, here’s a post with some additional thoughts on this and the “rules” of comic page layouts**…

First off, let me remind people that I've given this layout a name: when you have a vertical stack of panels next to a tall panel, I call it "blockage." You can find terms (and science!) related to page layout in my book and my scientific papers (also linked throughout).

Most of the claims I make about page layouts are based on the experiments that I and others have done on them. For this layout, the key experimental findings came from two studies that presented people with empty page layouts and asked them to choose the order in which they would read the panels.

We found that for blockage layouts, around 90% of choices go "down." Or, put conversely, fewer than 10% of choices in these situations followed the "left-to-right-and-down" Z-path that mirrors the order of written text. As I said in my previous blog post, this rate is essentially the inverse of what we find for pure grids: in simple grids, 90% of responses follow the Z-path (i.e., go right) rather than other paths.


Now, one criticism people have about these studies is that they don't have content in the layouts. Yes, these experiments presented empty panels, which might yield different results than layouts with content. But there's a good reason for this: the question we were asking wasn't "how do people read these layouts?" but rather "what are people's preferences for ordering these layouts?" Empty panels work just fine for doing good science and factoring out confounding variables, and they answer our question of whether people have ordering preferences: yes, they clearly do.

So, these results show that readers have a preference for the proper reading direction. In other words, the "rule" in their minds is that they should read downward in blockage layouts. You might think that the "rule" of reading comic page layouts is "left-to-right and down", like text, and thus that this layout is confusing. But that's not the rule. I'll explain this more in a bit…

When I say that "this layout is not confusing", I mean that readers have clear intuitions for what to do in these situations. The layout itself is not confusing, since people know what to do with it. What creates confusion, then, is when creators don't know or don't obey this "go downward" rule and still use layouts where blockage is read to the right. This can create confusion because it treats the layout as "neutral", as if there isn't a rule for its order.

However, there is a clear rule for it, and thinking it's neutral is wrong according to the experimental results about people's stated preferences. Grids aren't used as if right and down were equal choices (even though grids are even more physically ambiguous), and neither should this layout be.

Certainly a creator can manipulate the reading path by using the content or balloons to go in a different direction. Creators do this all the time in effective and creative ways, even with grids, like in the layout to the left. But going against the downward path in blockage layouts has to be recognized as "breaking the rule" with artistic intent.

So, why isn’t “left to right and down” the real rule of layout? Well, it’s *one* rule in comic page layouts, but it’s just a surface choice within a broader overarching set of rules/principles.

Readers don't just go from panel to panel along the "surface" of the canvas, making choices like right, down, etc. While it is likely that surface features like balloons and bubbles can "direct" the eye, layouts themselves have rules that are not dependent on these surface features, as demonstrated by the consistent results using empty layouts.

(Note: To my knowledge, there are no controlled experimental results showing that content directs readers' eyes through layouts. There is one non-controlled study that has some hints about this though.)

Here are the actual rules of layout: readers go through layouts guided by a desire to create grouped structures out of panels. The surface decisions they make are based on alignments between the edges of panels, but these choices are subservient to the larger goal of making hierarchic groupings.

I argue that these grouping mechanisms are what underlie readers' choices when they move from panel to panel. This may involve some surface-level rules, but there is an overarching principle I call "Assemblage" that has four basic sub-principles:

1. Grouped areas > non-grouped areas
2. Smooth paths > broken paths
3. Do not jump over units
4. Do not leave gaps

The reason so many people agree on going down in this layout is that it facilitates chunking the page into grouped structures, while a rightward path doesn't, and in fact violates the Assemblage principles. This is why it's a "rule." A toy sketch of how these principles could rank panel-to-panel moves follows below.
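To show how Assemblage could be operationalized, here's a toy Python scoring function for choosing the next panel from its bounding box. The geometry checks and weights are my assumptions for illustration only; they aren't parameters from any experiment:

```python
# Toy Assemblage-style scorer: prefer moves that keep building a grouped,
# smoothly connected area over moves that break out of it. Panels are
# (x, y, width, height) boxes; all weights are illustrative assumptions.

def overlap(a0, a1, b0, b1):
    """Length of overlap between two 1-D intervals."""
    return max(0.0, min(a1, b1) - max(a0, b0))

def move_score(current, candidate):
    cx, cy, cw, ch = current
    nx, ny, nw, nh = candidate
    score = 0.0
    # Smooth paths > broken paths: reward moves along shared, aligned edges.
    if overlap(cx, cx + cw, nx, nx + nw) > 0 and ny >= cy + ch:
        score += 2.0   # downward within an aligned vertical stack (keeps the group)
    elif overlap(cy, cy + ch, ny, ny + nh) > 0 and nx >= cx + cw:
        score += 1.0   # rightward along an aligned row
    # Do not jump over units / do not leave gaps: penalize distant moves.
    score -= 0.01 * (abs(nx - cx) + abs(ny - cy))
    return score

# In a blockage layout, the panel directly below the current one outscores
# the tall panel off to the right, mirroring the ~90% "down" preference.
```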

So, if you’re a comic creator, knowing what readers are trying to do while they read can help you design layouts, including how to break those rules with intent if you need to do so artistically.



**This originally appeared as a Twitter thread, and has now been expanded for blog format.