Monday, December 18, 2017

2017: My publications in review

Last year I summarized all the papers I published in 2016, and I thought it worked out so well I might as well keep it going. This year wasn't quite the flurry of books and papers that last year was (due largely to setting up a new EEG lab and submitting multiple grants), but we had several significant papers come out, balancing brainwave studies and corpus analyses.

So, here are the papers that I published in 2017...

Drawing the Line Between Constituent Structure and Coherence Relations in Visual Narratives (pdf) - This project with my former assistant Patrick Bender looked at people's intuitions for how to "segment" visual narratives into different subsections. Contrary to work on events and discourse, we found that breaks in categories of my model of narrative grammar were better predictors of segmentation than just changes in meaning between images (like spatial or character changes).

When a hit sounds like a kiss (pdf) - This project with Mirella Manfredi and Marta Kutas examined how the brain processes words that replace panels, like Pow! or Hit! replacing a climactic event. We found that the context of the sequence modulated the semantic processing of the words, and that descriptive words (Hit!) generated brain responses consistent with lower probability words than onomatopoeia (Pow!).

What's your neural function, narrative conjunction? (online article, pdf) - I consider this to be one of my coolest and most interesting studies to date. With Marta Kutas, I examined the brain response to a narrative pattern called Environmental-Conjunction. We found that it elicits two types of brain responses consistent with grammatical processing in language. Other work has shown that Environmental-Conjunction appears more in Japanese manga than Western comics, and indeed we found that readership of manga modulated this brain response. So: the brain uses grammatical processing for narrative patterns, and people familiar with this pattern process it in ways that are different from people who are less familiar with it. In other words, the way you process the sequences in comics depends on which ones you read.

A picture is worth more words over time (pdf) - This project with co-authors Ryan Taylor and Kaitlin Pederson is the companion to last year's paper by Kaitlin on how page layouts have changed in superhero comics over the past 80 years. Here, we look at how text-image interactions and storytelling methods have changed from the 1940s to 2010s in American superhero comics. Here's also a link to Ryan presenting this work at Comic-Con International a few years ago.

Path salience in motion events from verbal and visual languages (pdf) - In this corpus study we examined how paths are depicted in 35 different comics from 6 different countries around the world. We found that the patterns of paths differed along dimensions similar to what is found in distinctions of those authors' spoken languages, hinting at possible connections between a visual language that one draws and the spoken language one speaks or writes.

Not so secret agents (pdf) - This paper with Marta Kutas looked at how the brain processes certain postures of characters in events. We found that preparatory postures (like reaching back to throw a ball or to punch) differed from those that did not hint at such subsequent events.

Not a bad collection, if I do say so myself. I'm already excited about the new work set to come out next year, so stay tuned. All these papers and more are available online here.

Saturday, September 23, 2017

New paper: Not so secret agents

I'm excited to announce a new paper, "Not so secret agents: Event-related potentials to semantic roles in visual event comprehension," in the journal Brain and Cognition. This paper was done during my time in the lab of my co-author, Marta Kutas, and collaborating with my friend from grad school, co-author Martin Paczynski.

This paper is a follow-up to a study Martin and I did previously, which found that agents-to-be, the doers of actions, elicit more predictions about subsequent events than patients-to-be, the receivers of actions. For example, an agent-to-be would be a person reaching back their arm to punch (like in this image from the classic How to Draw Comics the Marvel Way), which conveys more information about that upcoming event than the patient-to-be (who is about to be punched).

In this follow-up, we measured participants' brainwaves to see whether this type of "agent advantage" appears when comparing agents in preparatory postures both against patients and against agents from whom we took away the preparatory postures. So, instead of reaching back to punch, the agent's arm would instead hang next to their body, not indicating an upcoming punch. We found that preparatory actions indeed appear to be more costly prior to an action, and appear to have a downstream influence on processing the subsequent action.

The paper is available here or on my downloadable papers page or this direct pdf link, and is summarized concisely in Experiment 1 of this poster, which has subsequently made for some keen pillows on my couch (right).


Research across domains has suggested that agents, the doers of actions, have a processing advantage over patients, the receivers of actions. We hypothesized that agents as “event builders” for discrete actions (e.g., throwing a ball, punching) build on cues embedded in their preparatory postures (e.g., reaching back an arm to throw or punch) that lead to (predictable) culminating actions, and that these cues afford frontloading of event structure processing. To test this hypothesis, we compared event-related brain potentials (ERPs) to averbal comic panels depicting preparatory agents (ex. reaching back an arm to punch) that cued specific actions with those to non-preparatory agents (ex. arm to the side) and patients that did not cue any specific actions. We also compared subsequent completed action panels (ex. agent punching patient) across conditions, where we expected an inverse pattern of ERPs indexing the differential costs of processing completed actions as a function of preparatory cues. Preparatory agents evoked a greater frontal positivity (600–900 ms) relative to non-preparatory agents and patients, while subsequent completed action panels following non-preparatory agents elicited a smaller frontal positivity (600–900 ms). These results suggest that preparatory (vs. non-) postures may differentially impact the processing of agents and subsequent actions in real time.

Full reference:

Cohn, Neil, Martin Paczynski, and Marta Kutas. 2017. Not so secret agents: Event-related potentials to semantic roles in visual event comprehension. Brain and Cognition. 119: 1-9.

Thursday, June 15, 2017

New paper: A picture is worth more words over time

I'm excited to announce we have another new paper, "A picture is worth more words over time: Multimodality and narrative structure across eight decades of American superhero comics," now out in the journal Multimodal Communication. This paper examines the changes in text-image relations and storytelling in American superhero comics from the 1940s through the 2010s.

This was a project first undertaken by students in my 2014 Cognition of Comics class, which became expanded into a larger study. My co-authors, Ryan Taylor and Kaitlin Pederson, coded 40 comics across 8 decades (over 9,000 panels), complementing Kaitlin's study of page layout across time in superhero comics.

We examined three aspects of structure: multimodality (text-image relationships and their balance of meaning and narrative), the framing of information in panels (image above), and the linear changes in meaning that occur between panels.

Overall, we found evidence that American superhero comics have shifted to relying less on text, and more towards the visual narrative sequencing carrying more weight of the storytelling. This has accompanied changes in the framing of information in panels to use fewer elements (as in the example figure), and to use fewer spatial location changes with more time changes across panels.

In addition, this trend is not new, but has been steadily occurring over the past forty years. That means it cannot just be attributed to the influence of manga since the 1980s (and indeed, as we discuss, our results suggest the influence of manga may be more complicated than people suspect).
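The kind of aggregation behind these trends can be sketched with a toy example. This is not the study's actual data or analysis code; all the counts below are invented, just to show how coded panel records roll up into a per-decade trend like "words per panel":

```python
# Toy corpus aggregation (invented counts, not the study's data):
# mean words per panel, grouped by decade.
from collections import defaultdict

# Each coded panel: (decade, word_count)
panels = [
    (1940, 30), (1940, 26), (1940, 34),
    (1980, 18), (1980, 14),
    (2010, 6), (2010, 10),
]

totals = defaultdict(lambda: [0, 0])  # decade -> [word_sum, panel_count]
for decade, words in panels:
    totals[decade][0] += words
    totals[decade][1] += 1

means = {d: s / n for d, (s, n) in sorted(totals.items())}
print(means)  # {1940: 30.0, 1980: 16.0, 2010: 8.0}
```

The real corpus coded many more dimensions per panel (multimodal interaction type, framing, linear changes), but each follows this same pattern of per-decade proportions.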

You can download the paper here (pdf), or along with all my other downloadable papers. You can also see Ryan presenting this study at Comic-Con International in our panel in 2015:


The visual narratives of comics involve complex multimodal interactions between written language and the visual language of images, where one or the other may guide the meaning and/or narrative structure. We investigated this interaction in a corpus analysis across eight decades of American superhero comics (1940–2010s). No change across publication date was found for multimodal interactions that weighted meaning towards text or across both text and images, where narrative structures were present across images. However, we found an increase over time of narrative sequences with meaning weighted to the visuals, and an increase of sequences without text at all. These changes coincided with an overall reduction in the number of words per panel, a shift towards panel framing with single characters and close-ups rather than whole scenes, and an increase in shifts between temporal states between panels. These findings suggest that storytelling has shifted towards investing more information in the images, along with an increasing complexity and maturity of the visual narrative structures. This has shifted American comics from being textual stories with illustrations to being visual narratives that use text.


Cohn, Neil, Ryan Taylor, and Kaitlin Pederson. 2017. A picture is worth more words over time: Multimodality and narrative structure across eight decades of American superhero comics. Multimodal Communication. 6(1): 19-37.

Wednesday, May 24, 2017

New paper: What's your neural function, narrative conjunction?

I'm excited to announce that my new paper "What's your neural function, narrative conjunction: Grammar, meaning, and fluency in sequential image processing" is now out in the open access journal Cognitive Research: Principles and Implications. This study was co-authored by Marta Kutas, who was my advisor while I was a postdoctoral fellow at UC San Diego.

Simple take home message: The way you process the sequences in comics depends on which ones you read.

The more detailed version... This study is maybe the coolest brain study I've done. Here, we examine a particular pattern in the narrative grammar used in comics: Environmental-Conjunction. This is when you have characters in different panels at the same narrative state, but you infer that they belong to the same spatial environment.

Most approaches to comprehending sequential images focus only on the comprehension of meaning (as in "panel transitions"). However, my theory says that this pattern involves both the construction of meaning and the processing of the narrative pattern itself, which uses a narrative grammar that is independent of meaning.

When analyzing people's brain responses to conjunction patterns, we found support for two processes. We found one brainwave associated with an "updating" of a mental model (the P600), and another associated with grammatical processing (an anterior negativity). Crucially, this grammatical processor was insensitive to manipulations of meaning, indicating that it was only processing the conjunction pattern. So, you don't just process meaning, but also the narrative pattern.
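To make the measurement logic concrete, here is a toy sketch (not the study's actual recordings or analysis pipeline; every number and waveform below is invented) of how an ERP effect is typically quantified: average the voltage within a time window for each condition, then compare the condition means.

```python
# Toy illustration of time-window ERP measurement (hypothetical data,
# not the study's actual recordings or analysis pipeline).

def mean_amplitude(samples, window, srate_hz=500):
    """Mean voltage (microvolts) within a time window (ms) of an epoch.

    `samples` is a list of voltages starting at 0 ms, sampled at `srate_hz`.
    """
    start, end = window
    step_ms = 1000 / srate_hz
    values = [v for i, v in enumerate(samples)
              if start <= i * step_ms < end]
    return sum(values) / len(values)

# Fake single-channel epochs: conjunction panels show a more negative
# anterior deflection in the 300-500 ms window than non-conjunctions.
srate = 500
n = int(800 * srate / 1000)  # 800 ms epoch
conjunction = [(-2.0 if 300 <= i * 2 < 500 else 0.0) for i in range(n)]
non_conjunction = [0.0] * n

effect = (mean_amplitude(conjunction, (300, 500), srate)
          - mean_amplitude(non_conjunction, (300, 500), srate))
print(effect)  # negative difference = larger anterior negativity
```

In the actual study these window means were computed per participant, condition, and electrode site, and then compared statistically; the sketch only shows the core measurement step.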

But, that's only the first part...

In other analyses, we've shown that Japanese manga use more Environmental-Conjunction than American or European comics. So, we ran a statistical analysis to test whether participants' background reading habits influenced their brain processing of conjunction. And... it did!

Specifically, we found that participants who more frequently read manga "while growing up" tended to rely more on the grammar processing, while infrequent manga readers used more updating. In other words, since frequent manga readers were exposed to the conjunction pattern more in their reading habits, their brains used a more automatic, grammatical process to comprehend it. Note: this result is especially cool because our stimuli were not manga; they were manipulated Peanuts strips that used a pattern frequent in manga.

This result contradicts the idea that comics are uniformly understood by all people, or even the idea that their processing uses a single cognitive process (like "closure"). Rather, comics are understood based on people's fluency with the patterns found in specific visual languages across the world.

You can read the paper online here, download the pdf here, or check out the poster summary.


Visual narratives sometimes depict successive images with different characters in the same physical space; corpus analysis has revealed that this occurs more often in Japanese manga than American comics. We used event related brain potentials to determine whether comprehension of “visual narrative conjunctions” invokes not only incremental mental updating as traditionally assumed, but also, as we propose, “grammatical” combinatoric processing. We thus crossed (non)/conjunction sequences with character (in)/congruity. Conjunctions elicited a larger anterior negativity (300-500ms) than non-conjunctions, regardless of congruity, implicating “grammatical” processes. Conjunction and incongruity both elicited larger P600s (500-700ms), indexing updating. Both conjunction effects were modulated by participants’ frequency of reading manga while growing up. Greater anterior negativity in frequent manga readers suggests more reliance on combinatoric processing; larger P600 effects in infrequent manga readers suggest more resources devoted to mental updating. As in language comprehension, it seems that processing conjunctions in visual narratives is not just mental updating but also partly grammatical, conditioned by comic readers’ experience with specific visual narrative structures.


Cohn, Neil and Marta Kutas. 2017. What’s your neural function, visual narrative conjunction? Grammar, meaning, and fluency in sequential image processing. Cognitive Research: Principles and Implications. 2(27): 1-13.

Sunday, April 16, 2017

Tourist traps in comics land*: Unpublished comics research

In a series of Twitter posts, I recently reflected on the pitfalls of various comics research that hasn't been published. Since I think it contains some valuable lessons, I'm going to repeat and expand on them here...

Though I've written the most about psychological studies about how people understand comics, other people have been doing these types of studies before me. What's interesting is that many of these studies were not published, because they found null results. There are a few trends in this work...

Space = Time

The topic I've heard about the most is the testing of McCloud's idea that panel size relates to the duration of conceived time, and that longer vs. shorter gutters relates to longer vs. shorter spaces of "time" between panels. I critiqued the theory behind this idea that "space = time" back in this paper, but I've heard of several scholars who have tested this with experiments. Usually these studies involved presenting participants with different size panels/gutters and then having participants rate their perceived durations.

In almost all of these studies, no one found any support for the idea that "physical space = conceived time". I can only think of one study that did find something supporting it, and it was only for a subset of the stimuli, and thus warranted further testing (which hasn't been done yet).

Because these studies found null results, they weren't deemed noteworthy enough to warrant publication. And since none got published, other labs didn't know about them, so they tried it too, with the same null results. I think it's a good case for the importance of publishing null results: they serve both to disprove hypotheses and to inform others not to grab at the same smoke.


Eye-tracking

The other type of study on comics that usually doesn't get published is eye-tracking. I know of at least half-a-dozen unpublished eye-tracking studies looking at people reading comic pages. The main reason these studies aren't published is that they're often exploratory, with no real hypotheses to be tested. Most comics eye-tracking studies just examine what people look at, which doesn't really tell you much if you don't manipulate anything. This can be useful for telling you basic facts about what people look at (types of information, how long, etc.), but without a specific manipulation, it is less informative and has lots of confounds.

An example: Let's say you run an eye-tracking study of a particular superhero comic and find that people spend more time fixating on text than on the images (which is a frequent finding). Now the questions arise: Is it because of the specific comic you chose? Is it because your comic had a particular uncontrolled multimodal interaction that weights meaning more to the text? Is it because your participants lacked visual language fluency, and so they relied more on text than images? Is it because you chose a superhero comic, but your participants read more manga? Without more controls, it's hard to know anything substantial.
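The kind of descriptive measure such a study produces can be sketched in a few lines. The fixation data below are entirely made up, just to show what a "more time on text than images" result looks like as a computation, and why it underdetermines any explanation:

```python
# Toy sketch of a descriptive eye-tracking measure (hypothetical fixations,
# not data from any real study): total dwell time per region type.
from collections import defaultdict

# Each fixation: (region_type, duration_ms)
fixations = [
    ("text", 220), ("image", 180), ("text", 250),
    ("image", 160), ("text", 240), ("image", 150),
]

dwell = defaultdict(int)
for region, duration in fixations:
    dwell[region] += duration

print(dict(dwell))  # {'text': 710, 'image': 490}
```

The output ("more dwell time on text") is compatible with every one of the explanations listed above; only a design that manipulates one factor at a time could tell them apart.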

Good science means testing a hypothesis, which means having a theory that can actually be tested by manipulating something. Without a testable theory you don't have any real hypothesis from which to create a manipulation, and the result is not a publishable eye-tracking study about comics. Eye-tracking is an informative tool, but the real "meat" of the research needs to be in the thing that is being manipulated.

I'll note that this is the same as when people do (or advise) using fMRI or EEG to study processing (visual) narratives in the brain. I've seen several studies of "narrative" or "visual narrative" where they simply measure the brain activity to non-manipulated materials and then claim that "these are the brain areas involved in comics/visual narrative/narrative!"

In fact, such research is wholly uninformative, because nothing specific is being tested, and it betrays an ignorance of just how complex these structures actually are. It would be inconceivable for any serious scholar of language to simply have someone passively read sentences and then claim to "know how they work" by measuring fMRI or eye-tracking responses to them. Why then the presumption of simplicity for visual narratives?

Final remarks

One feature of unpublished research on comics is that it is often undertaken by very good researchers who have little knowledge base for what goes on in comics and/or the background literature of the field. It is basically "scientific tourism." While it is of course great that people are interested enough in the visual language of comics to invest the time and effort to run experiments, it's also a recipe for diminishing returns. Without background knowledge or intuition, it's hard to know why your experiment might not be worth running.

Nevertheless, I also agree that it would be useful to know what types of unpublished studies people have done. Publishing such results would be informative for what isn't found, and would prevent future researchers from chasing topics they maybe shouldn't.

So, let me conclude with an "open call"...

If you've done a study on comics that hasn't been published (or know someone who has!): Please contact me. At the least, I'll feature a summary (or link) to your study on this blog, and if I accrue enough of them, perhaps I can curate a journal or article for reporting such results.

*Thanks to Emiel van Miltenburg for the post title!

Friday, February 24, 2017

New paper: When a hit sounds like a kiss

I'm excited to announce that I have a new paper out in the journal Brain and Language entitled "When a hit sounds like a kiss: an electrophysiological exploration of semantic processing in visual narrative." This was a project by the first author Mirella Manfredi, who worked with me during my time in Marta Kutas's lab at UC San Diego.

Mirella has an interest in the cognition of humor, and also wanted to know how the brain processes different types of information, like words vs. images. So, she designed a study using "action stars"—the star-shaped flashes that appear at the size of whole panels to indicate that an event happened without showing what it is. Into these action stars, she placed either onomatopoeia (Pow!), descriptions (Impact!), anomalous onomatopoeia or descriptions (Smooch!, Kiss!), or grawlixes (#$%?!).

We then measured people's brainwaves for these action star panels. We found a brainwave effect that is sensitive to semantic processing (the "N400")—how people process meaning—that suggested the anomalies were harder to understand than the congruous ones. This suggested that meaning garnered by the context of the visual sequence impacted how people processed the textual words. In addition, the grawlixes showed very little signs of this type of processing, suggesting that they don't hold specific semantic meanings.

In addition, we found that descriptive sound effects evoked another type of brain response (a late frontal positivity) often associated with the violation of very specific expectations (like getting a slightly different word than expected, even though it might not be anomalous).

This response was fairly interesting, because we also recently showed that American comics use descriptive sound effects far less than onomatopoeia. What this means is that this brain response wasn't just sensitive to certain words, but to the low expectations for a certain type of word: descriptive sound effects in the context of comics.

Mirella and I are now continuing to collaborate on more studies about the interactions between multimodal and crossmodal information, so nice to have this one to kick things off!

You can find the paper along with all my other Downloadable Papers, or directly here (pdf).


Researchers have long questioned whether information presented through different sensory modalities involves distinct or shared semantic systems. We investigated uni-sensory cross-modal processing by recording event-related brain potentials to words replacing the climactic event in a visual narrative sequence (comics). We compared Onomatopoeic words, which phonetically imitate action sounds (Pow!), with Descriptive words, which describe an action (Punch!), that were (in)congruent within their sequence contexts. Across two experiments, larger N400s appeared to Anomalous Onomatopoeic or Descriptive critical panels than to their congruent counterparts, reflecting a difficulty in semantic access/retrieval. Also, Descriptive words evinced a greater late frontal positivity compared to Onomatopoetic words, suggesting that, though plausible, they may be less predictable/expected in visual narratives. Our results indicate that uni-sensory cross-modal integration of word/letter-symbol strings within visual narratives elicits ERP patterns typically observed for written sentence processing, thereby suggesting the engagement of similar domain-independent integration/interpretation mechanisms.

Manfredi, Mirella, Neil Cohn, and Marta Kutas. 2017. When a hit sounds like a kiss: an electrophysiological exploration of semantic processing in visual narrative. Brain and Language. 169: 28-38.

Saturday, February 04, 2017

New paper: Drawing the Line Between Constituent Structure and Coherence Relations in Visual Narratives

I'm happy to announce that we have a new paper in the latest issue of the Journal of Experimental Psychology: Learning, Memory, and Cognition entitled "Drawing the Line Between Constituent Structure and Coherence Relations in Visual Narratives."

This was my final project at Tufts University, and was carried out by my former assistant (and co-author) Patrick Bender, who is now in graduate school at USC.

We wanted to examine the relationship between meaningful panel-to-panel relationships ("panel transitions") and the hierarchic constructs of my theory of narrative grammar. Many discourse theories have posited that people assess meaningful relations between each image in a visual sequence, and (as in my theory) that people make groupings. Yet, in these theories, the groupings are signaled by major changes in meaning, such as a "transition" with a big character change. We hypothesized that groupings are not actually motivated by changes in meaning, but by narrative category information that aligns with larger narrative structures.

So, we simply gave people various visual sequences and asked them to "draw a line" between panels that would best divide the sequence into two meaningful parts—i.e., to break up the sequence into groupings. People then continued to draw lines until all panels had lines between them, and we looked at what influenced their groupings. Similar tasks have been used in many studies of discourse and event cognition.

We found that panel transitions did indeed influence their segmentation of the sequences. However, narrative category information was a far greater predictor of where they divided sequences than these meaningful transitions between panels. That is: narrative structure better predicts how people intuit groupings in visual sequences than semantic "panel transitions."
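As a rough illustration of the comparison, here is a toy tally of segmentation rates by boundary type. This is a simplified stand-in for the paper's actual regression analyses, and all the counts below are invented:

```python
# Toy tally of segmentation rates by boundary type (invented data,
# standing in for the paper's regression analyses).

# Each boundary: (has_narrative_category_break, has_coherence_shift,
#                 times_segmented, times_shown)
boundaries = [
    (True,  True,  18, 20),
    (True,  False, 15, 20),
    (False, True,   7, 20),
    (False, False,  2, 20),
]

def rate(rows):
    """Proportion of presentations at which participants drew a line."""
    segmented = sum(s for _, _, s, n in rows)
    shown = sum(n for _, _, _, n in rows)
    return segmented / shown

narrative_rate = rate([b for b in boundaries if b[0]])
coherence_rate = rate([b for b in boundaries if not b[0] and b[1]])
print(narrative_rate, coherence_rate)
```

In a pattern like this, boundaries with a narrative category break attract far more line-drawing than boundaries with only a coherence shift, which is the qualitative shape of the result; the paper's regressions additionally estimate each predictor's influence while controlling for the others.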

The paper is downloadable here (pdf) or along with all of my other papers.

Full abstract:

Theories of visual narrative understanding have often focused on the changes in meaning across a sequence, like shifts in characters, spatial location, and causation, as cues for breaks in the structure of a discourse. In contrast, the theory of visual narrative grammar posits that hierarchic “grammatical” structures operate at the discourse level using categorical roles for images, which may or may not co-occur with shifts in coherence. We therefore examined the relationship between narrative structure and coherence shifts in the segmentation of visual narrative sequences using a “segmentation task” where participants drew lines between images in order to divide them into subepisodes. We used regressions to analyze the influence of the expected constituent structure boundary, narrative categories, and semantic coherence relationships on the segmentation of visual narrative sequences. Narrative categories were a stronger predictor of segmentation than linear coherence relationships between panels, though both influenced participants’ divisions. Altogether, these results support the theory that meaningful sequential images use a narrative grammar that extends above and beyond linear semantic shifts between discourse units.

Full Reference:

Cohn, Neil and Patrick Bender. 2017. Drawing the line between constituent structure and coherence relations in visual narratives. Journal of Experimental Psychology: Learning, Memory, and Cognition. 43(2): 289-301.

Sunday, December 18, 2016

2016: My Publications in Review

As 2016 nears its close, I thought I should do a post reflecting on all the research I've released with my colleagues over the past year. This was my biggest year of publishing yet, so I thought it would be good to just go over what we came out with.

First off, in January, Bloomsbury published my edited volume The Visual Narrative Reader. The book features 12 chapters summarizing or reprinting important and often overlooked papers in the field of visual narrative research, with topics ranging from metaphor theory and multimodality, to how kids draw and understand sequential images, to various examinations of cross-cultural and historical visual narrative systems. In my classes, I use it as a companion volume to my monograph, The Visual Language of Comics.

The rest of the year then saw a flurry of publications (title links go to blog summaries, pdf links to pdfs):

A multimodal parallel architecture (pdf) - Outlines a cognitive model for language and multimodal interactions between verbal (spoken language), visual-graphic (drawings, visual languages), and visual-bodily (gesture, sign languages) modalities. This paper essentially presents the core theoretical model of my research, and my vision for the cognitive architecture of the language system.

The vocabulary of manga (pdf) - This project with Sean Ehly coded over 5,000 panels in 10 shonen and 10 shojo manga to reveal that they mostly use the same 70 visual morphemes ("symbology"), though they use them in differing proportions. This suggested that there is a broad Japanese Visual Language in which genre-specific "dialects" manifest variations on this generalized structure.

The pieces fit (pdf) - This experiment with Carl Hagmann tested participants' comprehension of sequential images with rapidly presented panels (1 second, half a second) when we switched the positions of panels. In general, switches between panels near each other in the original sequence were easier to comprehend than panel switches across longer distances, but switches that crossed boundaries of groupings ("narrative constituents") were worse than those within groupings. This provides further evidence that people make groupings of panels, not just linear panel transitions.

Reading without words (pdf) - My project with collaborator Tom Foulsham reports on one of the first controlled eye-tracking studies using comics. We show that people's eyes move through a grid layout largely the same as they would across text—left-to-right and down, largely keeping to their row, and looking back mostly to adjacent frames. We also found that people mostly look at the same content of a panel whether shown in a grid or one panel at a time, but eye fixations to panels from scrambled sequences are slightly more dispersed than those to panels in normal sequences.

Meaning above the head (pdf) - This paper with my student Beena Murthy and collaborator Tom Foulsham explored the understanding of "upfixes"—the visual elements that float above characters' heads, like lightbulbs or hearts. We show that upfixes are governed by constraints: the upfix needs to go above the head, not next to it, and must "agree" with the facial expression (storm clouds can't go above a happy face). These constraints operate over both conventional and novel upfixes, suggesting that this is an abstract schematic pattern.

The changing pages of comics (pdf) - My student Kaitlin Pederson and I report on her project coding over 9,000 panels in 40 American superhero comics from the 1940s through the 2010s to see how page layout has changed over time. Overall, we argue that layouts over time have become both more systematic as well as more decorative.

Pow, punch, pika, and chu (pdf) - Along with students Nimish Pratha and Natalie Avunjian, we report on their analyses of sound effects in American comics (Mainstream vs. Indy) and Japanese manga (shonen vs. shojo) and show that the structure and content of sound effects differ both within and between cultures.

Sequential images are not universal, or caveats for using visual narratives in experimental tasks (pdf) - This conference proceedings paper reviews some of the research showing that sequential images are not understood universally, and are dependent on cultural and developmental knowledge to be understood.

Finally, I also had a book chapter come out in the book Film Text Analysis by Janina Wildfeuer and John Bateman. My chapter, "From Visual Narrative Grammar to Filmic Narrative Grammar" explores how my theory of narrative structure for static sequential images can also be applied to explaining the comprehension of film. I'll hopefully do a write up of it on this site sometime soon.

It was a big year of publications, and next year will hopefully be just as exciting! All my papers are available on this page.

Tuesday, December 06, 2016

New paper: Pow, punch, pika, and chu

I'm once again excited to announce the publication of another of my students' projects. Our paper, "Pow, Punch, Pika, and Chu: The Structure of Sound Effects in Genres of American Comics and Japanese Manga" is now published in the latest issue of Multimodal Communication. This was another paper derived from student projects in my 2014 class, The Cognition of Comics. The first two authors, Nimish Pratha and Natalie Avunjian, did research projects examining the use of onomatopoeia in genres of Japanese manga and American comics.

The biggest finding in American comics was that onomatopoetic sound effects (Pow!) are used in much greater proportion than descriptive sound effects (Punch!). In fact, though we found some descriptive sound effects in genres of "Independent" comics, we found none in the 10 superhero comics that were analyzed.

In Japanese manga, we found slightly different results. We categorized two types of "sound" effects in manga. Giongo are hearable sounds (crack!) while gitaigo are unhearable qualities (sparkle!). We found that shonen manga used more giongo than gitaigo, while shojo manga had the opposite trend: they used more gitaigo than giongo. In addition, these sound effects in shonen manga were more often written in the katakana script than hiragana, while the reverse occurred for shojo manga.

Overall, our results suggested that different types of comics can be characterized by the way they use "sound effects."

You can download the paper here, or at my downloadable papers page.

Here's Nimish speaking about the project at this past ComicCon (unfortunately cut just slightly short by time):


As multimodal works, comics are characterized as much by their use of language as by the style of their images. Sound effects in particular are exemplary of comics’ language-use, and we explored this facet of comics by analyzing a corpus of books from genres in the United States (mainstream and independent) and Japan (shonen/boys’ and shojo/girls’). We found variation between genres and between cultures across several properties of the content and presentation of sound effects. Foremost, significant differences arose between the lexical categories of sound effects (ex. onomatopoetic: Pow! vs. descriptive: Punch!) between genres within both cultures’ works. Additionally, genres in Japanese manga vary in the scripts used to write sound effects in Japanese (hiragana vs. katakana). We argue that, in English, a similar function is communicated through the presence or absence of textual font stylization. Altogether, these aspects of variation mark sound effects as important carriers of multimodal information, and provide distinctions by which genres and cultures of comics can be distinguished.


Pratha, Nimish K., Natalie Avunjian, and Neil Cohn. 2016. Pow, punch, pika, and chu: The structure of sound effects in genres of American comics and Japanese manga. Multimodal Communication. 5(2): 93-109.

Tuesday, November 22, 2016

New paper: The changing pages of comics

I'm excited to announce the publication of our latest paper, "The changing pages of comics: Page layouts across eight decades of American superhero comics" in the latest issue of Studies in Comics. This was a student project undertaken by the first author, Kaitlin Pederson, from my 2014 class the Cognition of Comics. She analyzed how page layouts have changed over time in American superhero comics from the 1940s to the 2010s. This is the first published, data-driven paper using corpus analysis on page layouts in comics, so that's quite exciting!

Kaitlin went panel-by-panel in these books analyzing various properties of their page layouts. She coded over 9,000 panels across 40 comics. Some of these features are captured in this figure:

She found that certain features have decreased over time (horizontal staggering, etc.), while others have increased over time (whole rows, etc.). Overall, her conclusion is that pages in earlier comics were fairly unsystematic in their layouts, while over time they grew more systematic and, at the same time, came to treat the page more as a whole "canvas." This is complemented especially by the shift toward "widescreen" layouts in pages over the past two decades.

You can download the paper here (pdf), or at my downloadable papers page.

Also, here's Kaitlin at Comic-Con 2015 reporting on her initial analyses of this project:

Full abstract:

Page layouts are one of the most overt features of comics’ structure. We hypothesized that American superhero comics have changed in their page layout over eight decades, and investigated this using a corpus analysis of 40 comics from 1940 through 2014. On the whole, we found that comics pages decreased in their use of grid-type layouts over time, with an increase in various non-grid features. We interpret these findings as indicating that page layouts moved away from conventional grids and towards a “decorative” treatment of the page as a whole canvas. Overall, our analysis shows the benefit of empirical methods for the study of the visual language of comics. 


Pederson, Kaitlin and Neil Cohn. 2016. The changing pages of comics: Page layouts across eight decades of American superhero comics. Studies in Comics. 7(1):7-28

Sunday, September 18, 2016

New paper: Meaning above the head

I'm happy to announce that our new paper, "Meaning above the head" is now published in the Journal of Cognitive Psychology! This one explores the structure of "upfixes," which are the class of visual signs that float above characters' heads, like lightbulbs or hearts.

In my book, The Visual Language of Comics, I made a few hypotheses about these elements. First, I argued that they were bound by a few constraints: 1) they are typically above the head, and are weird when moved to the side. 2) the upfix has a particular "agreement" relationship with the face (e.g., storm clouds go with a sad face, but are weird with a happy face). Also, I argued that upfixes are an abstract class, meaning they can easily allow for new ones, though they won't be quite as comprehensible as conventional ones (as in the image below).

With these hypotheses stated, my enterprising student Beena Murthy set out to test these ideas as part of an experiment she ran for a class project (many of the projects from that class are now published). We were then joined by my collaborator Tom Foulsham, who aided us in testing additional questions in a second experiment (which you may have taken online!).

Lo and behold, almost all of my hypotheses appear to be borne out! Overall, this means that upfixes use particular constraints in their construction, and allow for the creation of new, novel signs! We now plan to follow up these experiments with several more.

Check out the paper, which is available on my Downloadable Papers page, or directly here: PDF.

AND... don't forget that you can also get awesome t-shirts with both the normal and unconventional upfixes. The shirt designs (which are the images in this post) actually feature our stimuli from the experiments!


“Upfixes” are “visual morphemes” originating in comics where an element floats above a character’s head (ex. lightbulbs or gears). We posited that, similar to constructional lexical schemas in language, upfixes use an abstract schema stored in memory, which constrains upfixes to locations above the head and requires them to “agree” with their accompanying facial expressions. We asked participants to rate and interpret both conventional and unconventional upfixes that either matched or mismatched their facial expression (Experiment 1) and/or were placed either above or beside the head (Experiment 2). Interpretations and ratings of conventionality and face–upfix matching (Experiment 1) along with overall comprehensibility (Experiment 2) suggested that both constraints operated on upfix understanding. Because these constraints modulated both conventional and unconventional upfixes, these findings support that an abstract schema stored in long-term memory allows for generalisations beyond memorised individual items.

Full reference:

Cohn, Neil, Beena Murthy, and Tom Foulsham. (2016). Meaning above the head: combinatorial constraints on the visual vocabulary of comics. Journal of Cognitive Psychology. 28(5): 559-574.

Tuesday, August 30, 2016

Dispelling myths about comics page layout

There are many websites and twitter accounts that give advice about how to draw comics, and perhaps no piece of "advice" arises more than the repeated advocacy to avoid page layouts like the one in the image to the right. Advice-givers claim that this layout is confusing because a reader may not know whether to follow their usual left-to-right and down "Z-path" from A to C (resulting in a backtrack to B), or whether to go vertically from A to B, then to C. Because of this supposed confusion, advice-givers say this layout should be avoided at all costs, with the fervor of a grammar nazi for the visual language of comics.

This post aims to disentangle what we know and what we don't know about this layout, how people navigate through it, and how it occurs in comics. I here report on what science tells us about this layout, not "gut feelings" or conventional wisdom.

First off, let me give this phenomenon a name. In my papers and my book, The Visual Language of Comics, I labeled this layout as "blockage" because the long vertical panel "blocks" the flow of horizontal navigation. I called the flipped version of this layout (long panel on the left, vertical stack on the right) "reverse blockage," simply because it mirrors the blockage arrangement.

Is this layout confusing?

I understand why people think that this layout is confusing. That's why it was one of the key elements that I tested in the very first experiment I ever did about comics, conducted at Comic-Con International way back in 2004 (though it took many more years to write it up and get published).

In that first study, I presented people with empty page layouts—with no content in the panels—and asked them to number the panels in the order that they would read them. As in the graph to the side (red bar), the Z-path (horizontal reading) was chosen for blockage only around 32% of the time. People used the vertical reading roughly 62-68% of the time (see details in the paper). In other words, people actually preferred the vertical reading by roughly a 2-to-1 margin.

Also, one might note in the graph that, of all the layout features we tested in the study, this one was the most consistent at pushing people away from the Z-path of reading.

Now, admittedly, my first study was not that carefully controlled as an experiment. It was my first one, after all, and I did it before I even started my graduate training in experimental psychology. I essentially tested lots of different instances of this layout (among other aspects of layout), but I did not explicitly manipulate it to see what variables affect it. In particular, the relationship between vertical staggering and blockage was not clear. So, we did a second study...

In our follow-up study (pdf), we more carefully manipulated the layouts to ask the question: what is the relationship between a blockage layout and a "staggered" layout where the gutters are merely discontinuous? We used several page "templates" with basic aspects of layouts that were then filled with our experimental manipulations, which modulated the height of the right-hand border:

I should note that this design meant that it was not obvious that we were testing this phenomenon (we also tested other aspects of layout too). People saw lots of different whole page layouts with lots of variations. This is important, because it helped ensure participants were not aware of what was being tested, and thus could proceed in an unbiased manner (as was true in my first study, though less systematically controlled).

In this study, we found that in "pure blockage" arrangements ("Bottom full"), people went down vertically at a rate of 91%, and horizontally only 9% of the time. This could be modulated by raising the right-hand gutter, though: the higher the gutter was raised (i.e., the more the stagger), the more likely people were to go horizontally. The data were actually beautiful, and there's another graph below that shows this.

If people go vertical at a rate of 91%, that is pretty solid evidence that they prefer to read these arrangements vertically. This is not "confusion": there is overwhelming consistency. That's why when I see people harping about avoiding this layout, I send around the graphic above and say "no, it's not confusing! Feel free to use it!"

Now, one might say "But these data show that 1 out of 10 people will find this confusing! That's still confusing! Don't use it!" Let me unpack this. First off, almost no scientific study will show 100% of people doing something 100% of the time. Case in point: we didn't even get 100% consistency in reading 2 x 2 panel grids, whose navigation should be obvious (a point I'll return to below).

Second, these data are not counts of people, but averages across instances in the experiment (each person saw more than one version of the layout, and we averaged across them), and then we took an average across participants to analyze. So, it's not 1 out of 10 people; it's that there is a mean rate of 91% for people to go down rather than over, across individual instances and people.
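This "mean of means" logic can be sketched in a few lines of Python. The participant IDs and response values below are invented for illustration; they are not our actual data:

```python
# Each participant saw several blockage layouts. Code each instance as
# 1 = vertical path chosen, 0 = horizontal (Z-path). Values are made up.
responses = {
    "p1": [1, 1, 1, 0],  # this participant went vertical on 3 of 4 instances
    "p2": [1, 1, 1, 1],
    "p3": [1, 0, 1, 1],
}

# Step 1: average within each participant, across their instances
per_participant = {p: sum(r) / len(r) for p, r in responses.items()}

# Step 2: grand mean across participants — this is the reported "rate"
grand_mean = sum(per_participant.values()) / len(per_participant)

print(per_participant)           # {'p1': 0.75, 'p2': 1.0, 'p3': 0.75}
print(round(grand_mean, 3))      # 0.833
```

Note that a grand mean of 83% here does not mean 17% of the (hypothetical) participants were "confused": every one of them went vertical most of the time.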

Third, if you think that having a rate of 91%/9% is still "confusing" for people's preferences, then bear in mind that's roughly the same rate at which people didn't choose the Z-path for arrangements in a 2 x 2 grid that was also found throughout the experiment. The actual graph for our data is to the left. (As I said, it beautifully shows a stepwise pattern for the height of the gutter.) The rates at which people use the Z-path in blockage (red) and for the grid (grey) are essentially the inverse of each other.

In other words, the rate for going horizontal (Z-path) in blockage is the same as going vertical (non-Z-path) in pure grids. So, if you're going to harp on blockage for being confusing, does that mean you're also going to harp on basic grids for being confusing?

Caveats: What these experiments show is that people have intuitions for how to navigate through blockage (and other) layouts in consistent ways that are measurable and quantifiable. These experiments show what people's preferences are; i.e., how they would consciously choose to navigate a layout. And, they do this by using layouts with no content.

It is certainly the case that navigating comics with content is different than those with empty panels. The inclusion of content may push people in different ways, which we can study (color, balloon placement, characters overlapping panels, etc.). But, this is exactly consistent with my theory about page layouts: there are many factors ("preference rules") pushing around how you navigate. For example, if you colored panels A and C blue and colored B yellow, that visual feature might push you towards C instead of B.

However, this isn't your basic preference. By testing empty panels that lack these additional influences, we can factor them out and get at people's basic intuitions. This is how you do science: by isolating structures to understand their influence on a situation.

Finally, since these experiments tested people's preferences, they don't test people's actual behavior. In the one study that has looked at people's behavior with these layouts, a Japanese team found that eye-movements in these layouts caused more looks back and forth ("regressions") than when those same panels were rearranged post hoc. Note, though, that there were several problems with this experiment (described here). Nevertheless, the results should not be discounted, and they imply that there may be a disconnect between what people's behavior is (like eye-movements) and what their intuited preferences are for navigation. We're currently doing studies to tease this apart.

What about comic reading experience? 

One factor that could possibly influence how people read comics is their experience. I've shown that the frequency with which people read comics can influence lots of things about how they're comprehended, including people's brain responses. There is a "fluency" for the visual language used in comics. Maybe this could extend to blockage layouts?

In my first experiment, the only thing that modulated whether people used the Z-path in blockage layouts was whether they had any comic reading experience at all. People who said they "never" read comics were significantly more likely to use the Z-path than those who read comics to any degree whatsoever. This is the dotted blue line in the graph to the right.

In our second study, we used a more sophisticated measurement of comic reading expertise called the Visual Language Fluency Index (VLFI) score, which I've used in many studies. We didn't find any significant correlations between VLFI and blockage paths, but we did find an interesting trend. The statistics related to correlations (r-values) increased as the gutter got higher. This suggested that the more the layout used blockage, the more experience in reading comics seemed to matter. But, again, this wasn't statistically significant.

What about different types of comics?

Another factor that might influence this layout is the degree to which it appears in comics of the world. Over the past several years, my students and I have been gathering data about properties from different comics around the world, and this is one of the things we've coded for.

The first study to code for properties of page layout in comics was done by my student, Kaitlin Pederson. She analyzed how page layouts have changed across the last 80 years of American superhero comics. The paper for this study should come out soon (EDIT: here it is), but here is her presentation on this material from Comic-Con of 2015. Essentially, she found that blockage occurs in fairly small proportions in American comics, but its use in page layouts has been increasing over time (that is, it's being used more often more recently), though this trend only approached statistical significance.

If it is the case that blockage is increasing in usage over time, that would imply a corollary for cognition: we might expect younger readers (who encounter it more) to have less of an issue with it than older readers (who encountered it less frequently). However, in neither study did we find correlations between the age of participants and blockage choices.

In more recent work, we've looked at layouts from across the world. This work isn't published, but it was presented by my students at Comic-Con 2016. We found that blockage is used much more in Asian books (Japanese manga, Chinese manhua) than Western books (US superhero and Indy books, and books from France and Sweden). Paper hopefully being written up soon.

So, might it be the case that the rate at which people read manga (which use more blockage) impacts how they choose to navigate this layout? It doesn't seem to be the case. In my first study, I found no impact of people's reading frequency for manga on blockage layouts. This was actually a surprising finding for me, since my intuition was that blockage occurs more in manga (which we now seem to have data to support), and thus I figured experience reading manga matters. But, the data don't bear this out. I also went back into the data for my second study and looked at whether manga reading had an impact: Nope, no influence.

So, yes, this does vary across comics from different cultures and time periods. However—at least so far (and this could change with more studies)—it seems that the types of comics you read do not impact how you navigate pages. I'll also note, this is different than some other recent findings I have showing that the types of comics you read do impact how your brain comprehends image sequences (EDIT: Like this one).

Closing thoughts

In this post, I've discussed what science—not conventional wisdom or hearsay—tells us about "blockage" layouts. I've discussed data from two experiments published in peer-reviewed journals, which show that people are fairly consistent about how they choose to navigate these layouts—at least as consistent as people navigate through grids. This navigation is modulated somewhat by having experience reading comics, but not overwhelmingly. It also seems unaffected by which types of comics people read, even though it appears more in Asian books than Western ones.

At the outset of this post I likened harping on avoiding blockage layouts to being a "grammar nazi." I actually think this is an apt analogy. Like blockage, most of the so-called "rules" of language upheld by grammar nazis are not actually rules of English grammar. They're gospels hammered into people through rote conditioning, but have little reality in the way English is structured or comprehended. This is the visual language equivalent of one of those "rules."

So, I again say: this is not an overly problematic layout and people are making much ado about nothing. Feel free to use it without worry that people will be confused by it. The visual language of comics is incredibly complex and multifaceted in its structure, and the most complicated and impactful aspects of this structure usually go unnoticed or un-commented on by critics and creators alike. In the scope of that complexity, this layout is fairly minor in terms of people's comprehension of comics. Perhaps it's time to focus on other things?