Sunday, April 16, 2017

Tourist traps in comics land*: Unpublished comics research

In a series of Twitter posts, I recently reflected on the pitfalls of various comics research that hasn't been published. Since I think it contains some valuable lessons, I'm going to repeat and expand on them here...

Though I've written the most about psychological studies of how people understand comics, other people have been doing these types of studies since before me. What's interesting is that many of these studies were never published, because they found null results. There are a few trends in this work...

Space = Time

The topic I've heard about the most is the testing of McCloud's idea that panel size relates to the duration of conceived time, and that longer vs. shorter gutters relates to longer vs. shorter spaces of "time" between panels. I critiqued the theory behind this idea that "space = time" back in this paper, but I've heard of several scholars who have tested this with experiments. Usually these studies involved presenting participants with different size panels/gutters and then having participants rate their perceived durations.

In almost all of these studies, no one found any support for the idea that "physical space = conceived time". I can only think of one study that did find something supporting it, and even then only for a subset of the stimuli, thus warranting further testing (which hasn't been done yet).

Because these studies found null results, they weren't deemed noteworthy enough to warrant publication. And since none got published, other labs didn't know about them, so they tried the same thing with the same null results. I think it's a good case for the importance of publishing null results: they serve both to disprove hypotheses and to inform others not to grab at the same smoke.


Eye-tracking

The other type of study on comics that usually doesn't get published is eye-tracking. I know of at least half-a-dozen unpublished eye-tracking studies looking at people reading comic pages. The main reason these studies aren't published is that they're often exploratory, with no real hypotheses to be tested. Most comics eye-tracking studies just examine what people look at, which doesn't really tell you much if you don't manipulate anything. This can be useful for establishing basic facts about what people look at (types of information, how long, etc.), but without a specific manipulation, it is less informative and has lots of confounds.

An example: Let's say you run an eye-tracking study of a particular superhero comic and find that people spend more time fixating on text than on the images (which is a frequent finding). Now the questions arise: Is it because of the specific comic you chose? Is it because your comic had a particular uncontrolled multimodal interaction that weights meaning more to the text? Is it because your participants lacked visual language fluency, and so they relied more on text than images? Is it because you chose a superhero comic, but your participants read more manga? Without more controls, it's hard to know anything substantial.

Good science means testing a hypothesis, which means having a theory that can actually be tested by manipulating something. Without a testable theory you don't have a real hypothesis from which to create a manipulation, and the result is an eye-tracking study about comics that isn't publishable. Eye-tracking is an informative tool, but the real "meat" of the research needs to be in the thing that is being manipulated.

I'll note that this is the same as when people do (or advise) using fMRI or EEG to study processing (visual) narratives in the brain. I've seen several studies of "narrative" or "visual narrative" where they simply measure the brain activity to non-manipulated materials and then claim that "these are the brain areas involved in comics/visual narrative/narrative!"

In fact, such research is wholly uninformative, because nothing specific is being tested, and it betrays an ignorance of just how complex these structures actually are. It would be inconceivable for any serious scholar of language to simply have someone passively read sentences and then claim to "know how they work" by measuring fMRI or eye-tracking responses to them. Why then the presumption of simplicity for visual narratives?

Final remarks

One feature of unpublished research on comics is that it is often undertaken by very good researchers who had little knowledge-base for what goes on in comics and/or the background literature of that field. It is basically "scientific tourism." While it is of course great that people are interested enough in the visual language of comics to invest the time and effort to run experiments, it's also a recipe for diminishing returns. Without background knowledge or intuition, it's hard to know why your experiment might not be worth running.

Nevertheless, I also agree that it would be useful to know what types of unpublished studies people have done. Publishing such results would be informative for what isn't found, and would prevent future researchers from chasing topics they maybe shouldn't.

So, let me conclude with an "open call"...

If you've done a study on comics that hasn't been published (or know someone who has!): Please contact me. At the least, I'll feature a summary (or link) to your study on this blog, and if I accrue enough of them, perhaps I can curate a journal or article for reporting such results.

*Thanks to Emiel van Miltenburg for the post title!

Friday, February 24, 2017

New paper: When a hit sounds like a kiss

I'm excited to announce that I have a new paper out in the journal Brain and Language entitled "When a hit sounds like a kiss: an electrophysiological exploration of semantic processing in visual narrative." This was a project by the first author, Mirella Manfredi, who worked with me during my time in Marta Kutas's lab at UC San Diego.

Mirella has an interest in the cognition of humor, and also wanted to know about how the brain processes different types of information, like words vs. images. So, she designed a study using "action stars"—the star-shaped flashes that appear at the size of whole panels to indicate that an event happened without showing you what it is. Into these action stars, she placed either onomatopoeia (Pow!), descriptions (Impact!), anomalous onomatopoeia or descriptions (Smooch!, Kiss!), or grawlixes (#$%?!).

We then measured people's brainwaves to these action star panels. We found a brainwave effect sensitive to semantic processing (the "N400," an index of how people process meaning), which suggested the anomalies were harder to understand than the congruous ones. This suggested that meaning garnered from the context of the visual sequence impacted how people processed the textual words. In addition, the grawlixes showed very little sign of this type of processing, suggesting that they don't hold specific semantic meanings.
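For readers curious what quantifying such an effect looks like in practice, here is a minimal sketch of how an N400 effect is typically measured: average the trials in each condition, subtract the resulting ERPs, and take the mean amplitude in the 300-500 ms window. Everything below is simulated single-channel data invented for illustration, not our actual recordings.

```python
import numpy as np

rng = np.random.default_rng(1)
times = np.arange(-100, 800, 2)            # ms, i.e., 500 Hz sampling
n400_window = (times >= 300) & (times < 500)

def simulate(n_trials, n400_amp):
    # noise plus a negative-going deflection centered near 400 ms
    noise = rng.normal(0, 2, (n_trials, times.size))
    component = n400_amp * np.exp(-((times - 400) / 80) ** 2)
    return noise + component

congruent = simulate(40, -1.0)    # e.g., "Pow!" in a punch sequence
anomalous = simulate(40, -4.0)    # e.g., "Smooch!" in a punch sequence

# Average trials per condition, subtract, and measure the window
erp_diff = anomalous.mean(axis=0) - congruent.mean(axis=0)
n400_effect = erp_diff[n400_window].mean()
print(n400_effect)  # negative: larger N400 to the anomalous words
```

In real EEG work this is of course done per electrode, with statistics computed across participants rather than on a single simulated channel.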

In addition, we found that descriptive sound effects evoked another type of brain response (a late frontal positivity) often associated with the violation of very specific expectations (like getting a slightly different word than expected, even though it might not be anomalous).

This response was fairly interesting, because we also recently showed that American comics use descriptive sound effects far less than onomatopoeia. What this means is that this brain response wasn't just sensitive to certain words, but was sensitive to the low expectations for a certain type of word: descriptive sound effects in the context of comics.

Mirella and I are now continuing to collaborate on more studies about the interactions between multimodal and crossmodal information, so nice to have this one to kick things off!

You can find the paper along with all my other Downloadable Papers, or directly here (pdf).


Researchers have long questioned whether information presented through different sensory modalities involves distinct or shared semantic systems. We investigated uni-sensory cross-modal processing by recording event-related brain potentials to words replacing the climactic event in a visual narrative sequence (comics). We compared Onomatopoeic words, which phonetically imitate action sounds (Pow!), with Descriptive words, which describe an action (Punch!), that were (in)congruent within their sequence contexts. Across two experiments, larger N400s appeared to Anomalous Onomatopoeic or Descriptive critical panels than to their congruent counterparts, reflecting a difficulty in semantic access/retrieval. Also, Descriptive words evinced a greater late frontal positivity compared to Onomatopoetic words, suggesting that, though plausible, they may be less predictable/expected in visual narratives. Our results indicate that uni-sensory cross-modal integration of word/letter-symbol strings within visual narratives elicits ERP patterns typically observed for written sentence processing, thereby suggesting the engagement of similar domain-independent integration/interpretation mechanisms.

Manfredi, Mirella, Neil Cohn, and Marta Kutas. 2017. When a hit sounds like a kiss: an electrophysiological exploration of semantic processing in visual narrative. Brain and Language. 169: 28-38.

Saturday, February 04, 2017

New paper: Drawing the line in visual narratives

I'm happy to announce that we have a new paper in the latest issue of the Journal of Experimental Psychology: Learning, Memory, and Cognition entitled "Drawing the Line Between Constituent Structure and Coherence Relations in Visual Narratives."

This was my final project at Tufts University, and was carried out by my former assistant (and co-author) Patrick Bender, who is now in graduate school at USC.

We wanted to examine the relationship between meaningful panel-to-panel relationships ("panel transitions") and the hierarchic constructs of my theory of narrative grammar. Many discourse theories have posited that people do assess meaningful relations between each image in a visual sequence, and (as in my theory) that people make groupings. Yet, in these theories, the groupings are signaled by major changes in meaning, such as a "transition" with a big character change. We hypothesized that groupings were not actually motivated by changes in meaning, but by narrative category information that aligns with larger narrative structures.

So, we simply gave people various visual sequences and asked them to "draw a line" between panels that would best divide the sequence into two meaningful parts—i.e., to break up the sequence into groupings. People then continued to draw lines until all panels had lines between them, and we looked at what influenced their groupings. Similar tasks have been used in many studies of discourse and event cognition.

We found that panel transitions did indeed influence people's segmentation of the sequences. However, narrative category information was a far greater predictor of where they divided sequences than these meaningful transitions between panels. That is: narrative structure better predicts how people intuit groupings in visual sequences than semantic "panel transitions" do.
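For readers unfamiliar with this kind of analysis, here is a rough sketch of the logic of regressing segmentation choices on competing predictors. The data and effect sizes below are simulated for illustration and are not our actual results; the only point is that a regression can pit a narrative-boundary predictor against a coherence-shift predictor for the same line-drawing responses.

```python
import numpy as np

# Hypothetical data: one row per panel gap.
# narrative_break: 1 if a narrative-constituent boundary falls at this gap
# coherence_shift: 1 if a major semantic shift (e.g., character change) occurs
# drew_line: 1 if the participant segmented the sequence at this gap
rng = np.random.default_rng(0)
n = 400
narrative_break = rng.integers(0, 2, n)
coherence_shift = rng.integers(0, 2, n)

# Simulate the qualitative pattern reported in the paper: narrative
# structure is a stronger predictor of segmentation than coherence shifts.
logits = -1.5 + 2.0 * narrative_break + 0.6 * coherence_shift
drew_line = rng.random(n) < 1 / (1 + np.exp(-logits))

# Fit a logistic regression by gradient ascent (no external libraries)
X = np.column_stack([np.ones(n), narrative_break, coherence_shift])
y = drew_line.astype(float)
w = np.zeros(3)
for _ in range(5000):
    p = 1 / (1 + np.exp(-X @ w))
    w += 0.05 * X.T @ (y - p) / n

print(w)  # coefficient for narrative_break should exceed coherence_shift
```

In practice one would use a standard statistics package and model participants as a random factor, but the comparison of coefficient sizes is the core idea.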

The paper is downloadable here (pdf) or along with all of my other papers.

Full abstract:

Theories of visual narrative understanding have often focused on the changes in meaning across a sequence, like shifts in characters, spatial location, and causation, as cues for breaks in the structure of a discourse. In contrast, the theory of visual narrative grammar posits that hierarchic “grammatical” structures operate at the discourse level using categorical roles for images, which may or may not co-occur with shifts in coherence. We therefore examined the relationship between narrative structure and coherence shifts in the segmentation of visual narrative sequences using a “segmentation task” where participants drew lines between images in order to divide them into subepisodes. We used regressions to analyze the influence of the expected constituent structure boundary, narrative categories, and semantic coherence relationships on the segmentation of visual narrative sequences. Narrative categories were a stronger predictor of segmentation than linear coherence relationships between panels, though both influenced participants’ divisions. Altogether, these results support the theory that meaningful sequential images use a narrative grammar that extends above and beyond linear semantic shifts between discourse units.

Full Reference:

Cohn, Neil and Patrick Bender. 2017. Drawing the line between constituent structure and coherence relations in visual narratives. Journal of Experimental Psychology: Learning, Memory, and Cognition. 43(2): 289-301.

Sunday, December 18, 2016

2016: My Publications in Review

As 2016 nears its close, I thought I should do a post reflecting on all the research I've released with my colleagues over the past year. This was my biggest year of publishing yet, so I thought it would be good to just go over what we came out with.

First off, in January, Bloomsbury published my edited volume The Visual Narrative Reader. The book features 12 chapters summarizing or reprinting important and often overlooked papers in the field of visual narrative research, with topics ranging from metaphor theory and multimodality, to how kids draw and understand sequential images, to various examinations of cross-cultural and historical visual narrative systems. In my classes, I use it as a companion volume to my monograph, The Visual Language of Comics.

The rest of the year then saw a flurry of publications (title links go to blog summaries, pdf links to pdfs):

A multimodal parallel architecture (pdf) - Outlines a cognitive model for language and multimodal interactions between verbal (spoken language), visual-graphic (drawings, visual languages), and visual-bodily (gesture, sign languages) modalities. This paper essentially presents the core theoretical model of my research, and my vision for the cognitive architecture of the language system.

The vocabulary of manga (pdf) - This project with Sean Ehly coded over 5,000 panels in 10 shonen and 10 shojo manga to reveal that they mostly use the same 70 visual morphemes ("symbology"), though they use them in differing proportions. This suggested that there is a broad Japanese Visual Language in which genre-specific "dialects" manifest variations on this generalized structure.

The pieces fit (pdf) - This experiment with Carl Hagmann tested participants' comprehension of sequential images with rapidly presented panels (1 second, half a second) when we switched the positions of panels. In general, switches between panels near each other in the original sequence were easier to comprehend than switches across longer distances, but switches that crossed boundaries of groupings ("narrative constituents") were harder than those within groupings. This provides further evidence that people make groupings of panels, not just linear panel transitions.

Reading without words (pdf) - My project with collaborator Tom Foulsham reports on one of the first controlled eye-tracking studies using comics. We show that people's eyes move through a grid layout largely the same as they would across text—left-to-right and down, largely keeping to their row, and looking back mostly to adjacent frames. We also found that people mostly look at the same content of a panel whether shown in a grid or one panel at a time, but eye fixations to panels from scrambled sequences are slightly more dispersed than those to panels in normal sequences.

Meaning above the head (pdf) - This paper with my student Beena Murthy and collaborator Tom Foulsham explored the understanding of "upfixes"—the visual elements that float above characters' heads, like lightbulbs or hearts. We show that upfixes are governed by constraints: the upfix needs to go above the head, not next to it, and must "agree" with the facial expression (storm clouds can't go above a happy face). These constraints operate over both conventional and novel upfixes, suggesting that this is an abstract schematic pattern.

The changing pages of comics (pdf) - My student Kaitlin Pederson and I report on her project coding over 9,000 panels in 40 American superhero comics from the 1940s through the 2010s to see how page layout has changed over time. Overall, we argue that layouts over time have become both more systematic and more decorative.

Pow, punch, pika, and chu (pdf) - Along with students Nimish Pratha and Natalie Avunjian, we report on their analyses of sound effects in American comics (Mainstream vs. Indy) and Japanese manga (shonen vs. shojo) and show that the structure and content of sound effects differ both within and between cultures.

Sequential images are not universal, or caveats for using visual narratives in experimental tasks (pdf) - This conference proceedings paper reviews some of the research showing that sequential images are not understood universally, and are dependent on cultural and developmental knowledge to be understood.

Finally, I also had a book chapter come out in the book Film Text Analysis by Janina Wildfeuer and John Bateman. My chapter, "From Visual Narrative Grammar to Filmic Narrative Grammar" explores how my theory of narrative structure for static sequential images can also be applied to explaining the comprehension of film. I'll hopefully do a write up of it on this site sometime soon.

It was a big year of publications, and next year will hopefully be just as exciting! All my papers are available on this page.

Tuesday, December 06, 2016

New paper: Pow, punch, pika, and chu

I'm once again excited to announce the publication of another of my students' projects. Our paper, "Pow, Punch, Pika, and Chu: The Structure of Sound Effects in Genres of American Comics and Japanese Manga" is now published in the latest issue of Multimodal Communication. This was another paper derived from student projects in my 2014 class, The Cognition of Comics. The first two authors, Nimish Pratha and Natalie Avunjian, did research projects examining the use of onomatopoeia in genres of Japanese manga and American comics.

The biggest finding in American comics was that onomatopoetic sound effects (Pow!) are used in much greater proportion than descriptive sound effects (Punch!). In fact, though we found some descriptive sound effects in genres of "Independent" comics, we found none in the 10 superhero comics that were analyzed.

In Japanese manga, we found slightly different results. We categorized two types of "sound" effects in manga. Giongo are hearable sounds (crack!) while gitaigo are unhearable qualities (sparkle!). We found that shonen manga used more giongo than gitaigo, while shojo manga had the opposite trend: they used more gitaigo than giongo. In addition, these sound effects in shonen manga were more often written in the katakana script than hiragana, while the reverse occurred for shojo manga.
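To illustrate how such genre-by-type proportions can be compared statistically, here is a small sketch of a chi-square test of independence computed by hand. The counts below are invented for illustration and are not the paper's actual data.

```python
import numpy as np

# Hypothetical counts: rows = genre (shonen, shojo),
# columns = sound-effect type (giongo, gitaigo)
observed = np.array([
    [320, 180],   # shonen: more hearable sounds (giongo)
    [150, 250],   # shojo: more unhearable qualities (gitaigo)
])

# Pearson chi-square test of independence
row = observed.sum(axis=1, keepdims=True)
col = observed.sum(axis=0, keepdims=True)
expected = row @ col / observed.sum()          # counts expected if no association
chi2 = ((observed - expected) ** 2 / expected).sum()
df = (observed.shape[0] - 1) * (observed.shape[1] - 1)
print(chi2, df)  # a large chi2 on 1 df indicates genre and type are associated
```

A library routine such as a standard chi-square contingency test would give the same statistic plus a p-value; the hand computation just makes the expected-count logic explicit.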

Overall, our results suggested that different types of comics can be characterized by the way they use "sound effects."

You can download the paper here, or at my downloadable papers page.

Here's Nimish speaking about the project at this past ComicCon (unfortunately cut just slightly short by time):


As multimodal works, comics are characterized as much by their use of language as by the style of their images. Sound effects in particular are exemplary of comics’ language-use, and we explored this facet of comics by analyzing a corpus of books from genres in the United States (mainstream and independent) and Japan (shonen/boys’ and shojo/girls’). We found variation between genres and between cultures across several properties of the content and presentation of sound effects. Foremost, significant differences arose between the lexical categories of sound effects (ex. onomatopoetic: Pow! vs. descriptive: Punch!) between genres within both culture’s works. Additionally, genres in Japanese manga vary in the scripts used to write sound effects in Japanese (hiragana vs. katakana). We argue that, in English, a similar function is communicated through the presence or absence of textual font stylization. Altogether, these aspects of variation mark sound effects as important carriers of multimodal information, and provide distinctions by which genres and cultures of comics can be distinguished.


Pratha, Nimish K., Natalie Avunjian, and Neil Cohn. 2016. Pow, punch, pika, and chu: The structure of sound effects in genres of American comics and Japanese manga. Multimodal Communication. 5(2): 93-109.

Tuesday, November 22, 2016

New paper: The changing pages of comics

I'm excited to announce the publication of our latest paper, "The changing pages of comics: Page layouts across eight decades of American superhero comics" in the latest issue of Studies in Comics. This was a student project undertaken by the first author, Kaitlin Pederson, from my 2014 class the Cognition of Comics. She analyzed how page layouts have changed over time in American superhero comics from the 1940s to the 2010s. This is the first published, data-driven paper using corpus analysis on page layouts in comics, so that's quite exciting!

Kaitlin went panel-by-panel in these books analyzing various properties of their page layouts. She coded over 9,000 panels across 40 comics. Some of these features are captured in this figure:

She found that certain features have decreased over time (horizontal staggering, etc.), while others have increased (whole rows, etc.). Overall, her conclusion is that pages in earlier comics were fairly unsystematic in their layouts, while over time they grew more systematic and, at the same time, came to treat the page more as a whole "canvas." This is complemented especially by the shift towards "widescreen" layouts in pages over the past two decades.

You can download the paper here (pdf), or at my downloadable papers page.

Also, here's Kaitlin at Comic-Con 2015 reporting on her initial analyses of this project:

Full abstract:

Page layouts are one of the most overt features of comics’ structure. We hypothesized that American superhero comics have changed in their page layout over eight decades, and investigated this using a corpus analysis of 40 comics from 1940 through 2014. On the whole, we found that comics pages decreased in their use of grid-type layouts over time, with an increase in various non-grid features. We interpret these findings as indicating that page layouts moved away from conventional grids and towards a “decorative” treatment of the page as a whole canvas. Overall, our analysis shows the benefit of empirical methods for the study of the visual language of comics. 


Pederson, Kaitlin and Neil Cohn. 2016. The changing pages of comics: Page layouts across eight decades of American superhero comics. Studies in Comics. 7(1):7-28

Sunday, September 18, 2016

New paper: Meaning above the head

I'm happy to announce that our new paper, "Meaning above the head," is now published in the Journal of Cognitive Psychology! This one explores the structure of "upfixes," the class of visual signs that float above characters' heads, like lightbulbs or hearts.

In my book, The Visual Language of Comics, I made a few hypotheses about these elements. First, I argued that they were bound by a few constraints: 1) they are typically above the head, and are weird when moved to the side. 2) the upfix has a particular "agreement" relationship with the face (e.g., storm clouds go with a sad face, but are weird with a happy face). Also, I argued that upfixes are an abstract class, meaning they can easily allow for new ones, though they won't be quite as comprehensible as conventional ones (as in the image below).

With these hypotheses stated, my enterprising student Beena Murthy set out to test these ideas as part of an experiment she ran for a class project (many of the projects from that class are now published). We were then joined by my collaborator Tom Foulsham, who aided us in testing additional questions in a second experiment (which you may have taken online!).

Lo and behold, almost all of my hypotheses appear to be borne out! Overall, this means that upfixes use particular constraints in their construction, and allow for the creation of new, novel signs! We now plan to follow up these experiments with several more.

Check out the paper, which is available on my Downloadable Papers page, or directly here: PDF.

AND... don't forget that you can also get awesome t-shirts with both the normal and unconventional upfixes. The shirt designs (which are the images in this post) actually feature our stimuli from the experiments!


“Upfixes” are “visual morphemes” originating in comics where an element floats above a character’s head (ex. lightbulbs or gears). We posited that, similar to constructional lexical schemas in language, upfixes use an abstract schema stored in memory, which constrains upfixes to locations above the head and requires them to “agree” with their accompanying facial expressions. We asked participants to rate and interpret both conventional and unconventional upfixes that either matched or mismatched their facial expression (Experiment 1) and/or were placed either above or beside the head (Experiment 2). Interpretations and ratings of conventionality and face–upfix matching (Experiment 1) along with overall comprehensibility (Experiment 2) suggested that both constraints operated on upfix understanding. Because these constraints modulated both conventional and unconventional upfixes, these findings support that an abstract schema stored in long-term memory allows for generalisations beyond memorised individual items.

Full reference:

Cohn, Neil, Beena Murthy, and Tom Foulsham. (2016). Meaning above the head: combinatorial constraints on the visual vocabulary of comics. Journal of Cognitive Psychology. 28(5): 559-574.

Tuesday, August 30, 2016

Dispelling myths about comics page layout

There are many websites and twitter accounts that give advice about how to draw comics, and perhaps no other piece of "advice" arises more than the repeated advocacy to avoid page layouts like the one in the image to the right. Advice-givers claim that this layout is confusing because a reader may not know whether to follow their usual left-to-right and down "Z-path" from A to C (resulting in a backtrack to B), or whether to go vertically from A to B, then to C. Because of this confusion, this layout is advised to be avoided at all costs, with the fervor of a grammar nazi for the visual language of comics.

This post aims to disentangle what we know and what we don't know about this layout, how people navigate through it, and how it occurs in comics. I here report on what science tells us about this layout, not "gut feelings" or conventional wisdom.

First off, let me give this phenomenon a name. In my papers and my book, The Visual Language of Comics, I labeled this layout "blockage," because the long vertical panel "blocks" the flow of horizontal navigation. I called the flipped version of this layout (long panel on the left, vertical stack on the right) "reverse blockage," simply because it mirrors the blockage arrangement.

Is this layout confusing?

I understand why people think that this layout is confusing. That's why it was one of the key elements that I tested in the very first experiment I ever did about comics, conducted at Comic-Con International way back in 2004 (though it took many more years to write it up and get published).

In that first study, I presented people with empty page layouts—with no content in the panels—and asked them to number the panels in the order that they would read them. As in the graph to the side (red bar), blockage was read using the Z-path (horizontal reading) only around 32% of the time. People used the vertical reading roughly 62-68% of the time (see details in the paper). This showed that people actually preferred the vertical reading by a 2-to-1 margin.

Also, one might note in the graph that, of all the features we tested in the study, this one was the most impactful and the most consistent at pushing people away from the Z-path of reading.

Now, admittedly, my first study was not that carefully controlled as an experiment. It was my first one, after all, and I did it before I even started my graduate training in experimental psychology. I essentially tested lots of different instances of this layout (among other aspects of layout), but I did not explicitly manipulate it to see what variables affect it. In particular, the relationship between vertical staggering and blockage was not clear. So, we did a second study...

In our follow up study (pdf), we more carefully manipulated the layouts to ask the question: what is the relationship between a blockage layout and a "staggered" layout where the gutters are merely discontinuous. We used several page "templates" with basic aspects of layouts that were then filled with our experimental manipulations, which modulated the height of the right-hand border:

I should note that this design meant it was not obvious that we were testing this phenomenon (we also tested other aspects of layout too). People saw lots of different whole-page layouts with lots of variations. This is important, because it attempted to ensure participants were not aware of what was being tested, and thus could proceed in an unbiased manner (as was true in my first study, though less systematically controlled).

In this study, we found that in "pure blockage" arrangements ("Bottom full"), there was a rate of 91% to go down vertically, and only 9% to go horizontally. This could be modulated by raising the righthand gutter though. The higher the gutter was raised (i.e., the more the stagger), the more likely people were to go horizontally. The data were actually beautiful, and there's another graph below that shows this.

If people go vertical at a rate of 91%, that is pretty solid evidence that people prefer to read these arrangements vertically. This is not "confused"—there is overwhelming consistency. That's why when I see people harping about avoiding this layout, I send around the graphic above and say "no, it's not confusing! Feel free to use it!"

Now, one might say, "But these data show that 1 out of 10 people will find this confusing! That's still confusing! Don't use it!" Let me unpack this. First off, almost no scientific study will show 100% of people doing something 100% of the time. Case in point: we didn't even get 100% consistency in reading 2 x 2 panel grids, whose navigation should be obvious (a point I'll return to below).

Second, these data are not counts of people, but averages across instances in the experiment (each person saw more than one version of the layout—we averaged across them), and then we took an average across participants to analyze. So, it's not 1 out of 10 people; rather, there is a mean rate of 91% for people to go down rather than over, across individual instances and people.
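To make that distinction concrete, here is a tiny sketch (with invented choices) of how such a mean rate is computed: average each participant's choices across repeated instances of the layout, then average those per-participant rates.

```python
import numpy as np

# Hypothetical choices: rows = participants, columns = repeated instances
# of the blockage layout; 1 = went vertical, 0 = went horizontal
choices = np.array([
    [1, 1, 1, 0, 1],
    [1, 1, 1, 1, 1],
    [1, 0, 1, 1, 1],
    [1, 1, 1, 1, 0],
])

per_participant = choices.mean(axis=1)   # mean rate across instances
grand_mean = per_participant.mean()      # then averaged across participants
print(per_participant, grand_mean)
```

Note that every invented participant here goes vertical most of the time; a high grand mean does not imply that some fixed fraction of people always go horizontal.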

Third, if you think that a rate of 91%/9% still reflects "confusion" in people's preferences, then bear in mind that's roughly the same rate at which people didn't choose the Z-path for arrangements in the 2 x 2 grids that were also found throughout the experiment. The actual graph for our data is to the left. (As I said, it beautifully shows a stepwise pattern for the height of the gutter.) The rates at which people use the Z-path in blockage (red) and for the grid (grey) are essentially the inverse of each other.

In other words, the rate for going horizontal (Z-path) in blockage is the same as going vertical (non-Z-path) in pure grids. So, if you're going to harp on blockage for being confusing, does that mean you're also going to harp on basic grids for being confusing?

Caveats: What these experiments show is that people have intuitions for how to navigate through blockage (and other) layouts in consistent ways that are measurable and quantifiable. These experiments show what people's preferences are; i.e., how they would consciously choose to navigate a layout. And, they do this by using layouts with no content.

It is certainly the case that navigating comics with content is different than those with empty panels. The inclusion of content may push people in different ways, which we can study (color, balloon placement, characters overlapping panels, etc.). But, this is exactly consistent with my theory about page layouts: there are many factors ("preference rules") pushing around how you navigate. For example, if you colored panels A and C blue and colored B yellow, that visual feature might push you towards C instead of B.

However, this isn't your basic preference. By testing empty panels that don't have these additional influences, we can factor them out and get at people's basic intuitions. This is how you do science: by isolating structures to understand their influence on a situation.

Finally, since these experiments tested people's preferences, they don't test people's actual behavior. In the one study that has looked at people's behavior with these layouts, a Japanese team found that these layouts caused more back-and-forth eye movements ("regressions") than when those same panels were rearranged post hoc. Note, though, that there were several problems with this experiment (described here). Nevertheless, the results should not be discounted, and they imply that there may be a disconnect between people's behavior (like eye movements) and their intuited preferences for navigation. We're currently doing studies to tease this apart.

What about comic reading experience? 

One factor that might influence how people read comics is their experience. I've shown that the frequency with which people read comics can influence lots of things about how they're comprehended, including people's brain responses. There is a "fluency" for the visual language used in comics. Maybe this could extend to blockage layouts?

In my first experiment, the only thing that modulated whether people used the Z-path in blockage layouts was whether they had any comic reading experience at all. People who said they "never" read comics were significantly more likely to use the Z-path than those who read comics to any degree whatsoever. This is the dotted blue line in the graph to the right.

In our second study, we used a more sophisticated measurement of comic reading expertise called the Visual Language Fluency Index (VLFI) score, which I've used in many studies. We didn't find any significant correlations between VLFI and blockage paths, but we did find an interesting trend. The statistics related to correlations (r-values) increased as the gutter got higher. This suggested that the more the layout used blockage, the more experience in reading comics seemed to matter. But, again, this wasn't statistically significant.
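
For readers who want the mechanics, here is a rough sketch of that kind of correlation analysis. The VLFI scores and Z-path rates below are invented, and I'm computing a plain Pearson r by hand rather than reproducing the actual analysis pipeline.

```python
# Hypothetical correlation between comic-reading expertise (VLFI) and the
# rate of choosing the Z-path at one gutter height. Numbers are invented.
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

vlfi_scores = [8.0, 12.5, 15.0, 20.0, 25.5]   # hypothetical fluency scores
zpath_rates = [0.20, 0.15, 0.12, 0.08, 0.05]  # hypothetical Z-path use

r = pearson_r(vlfi_scores, zpath_rates)
print(round(r, 2))  # negative: more fluent readers use the Z-path less here
```

In the real analysis, the interesting pattern was that these r-values grew with gutter height, even though none reached significance.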

What about different types of comics?

Another factor that might influence this layout is the degree to which it appears in comics of the world. Over the past several years, my students and I have been gathering data about properties from different comics around the world, and this is one of the things we've coded for.

The first study to code for properties of page layout in comics was done by my student, Kaitlin Pederson. She analyzed how page layouts have changed across the last 80 years of American superhero comics. The paper for this study should come out soon, but here is her presentation on this material from Comic-Con 2015. Essentially, she found that blockage occurs in fairly small proportions in American comics, though it has been used more often in page layouts over time; this trend, however, was only approaching statistical significance.

If blockage is indeed increasing in usage over time, that would imply a cognitive corollary: we might expect younger readers (who encounter it more) to have less of an issue with it than older readers (who encountered it less frequently). However, in neither study did we find correlations between participants' ages and their blockage choices.

In more recent work, we've looked at layouts from across the world. This work isn't published, but it was presented by my students at Comic-Con 2016. We found that blockage is used much more in Asian books (Japanese manga, Chinese manhua) than Western books (US superhero and Indy books, and books from France and Sweden). Paper hopefully being written up soon.

So, might it be the case that the rate at which people read manga (which use more blockage) impacts how they choose to navigate this layout? It doesn't seem to be the case. In my first study, I found no impact of people's reading frequency for manga on blockage layouts. This was actually a surprising finding for me, since my intuition was that blockage occurs more in manga (which we now seem to have data to support), and thus I figured experience reading manga matters. But, the data don't bear this out. I also went back into the data for my second study and looked at whether manga reading had an impact: Nope, no influence.

So, yes, this layout does vary across comics from different cultures and time periods. However—at least so far (and this could change with more studies)—it seems that the types of comics you read do not impact how you navigate pages. I'll also note that this differs from some other recent findings of mine showing that the types of comics you read do impact how your brain comprehends image sequences.

Closing thoughts

In this post, I've discussed what science—not conventional wisdom or hearsay—tells us about "blockage" layouts. I've discussed data from two experiments published in peer-reviewed journals, which show that people are fairly consistent about how they choose to navigate these layouts—at least as consistent as people navigate through grids. This navigation is modulated somewhat by having experience reading comics, but not overwhelmingly. It also seems unaffected by which types of comics people read, even though it appears more in Asian books than Western ones.

At the outset of this post I likened harping on avoiding blockage layouts to being a "grammar nazi." I actually think this is an apt analogy. Like the complaints about blockage, most of the so-called "rules" of language upheld by grammar nazis are not actually rules of English grammar. They're gospels hammered into people through rote conditioning, but they have little reality in the way English is structured or comprehended. The prohibition on blockage is the visual language equivalent of one of these "rules."

So, I again say: this is not an overly problematic layout and people are making much ado about nothing. Feel free to use it without worry that people will be confused by it. The visual language of comics is incredibly complex and multifaceted in its structure, and the most complicated and impactful aspects of this structure usually go unnoticed or un-commented on by critics and creators alike. In the scope of that complexity, this layout is fairly minor in terms of people's comprehension of comics. Perhaps it's time to focus on other things?

Sunday, August 28, 2016

August craziness from the east side of the Atlantic

Busy times lately: This marks my first blog post on the east side of the Atlantic! I've now been living in the Netherlands for the better part of August, and things have been a bit crazy moving and getting settled in. We're now entering the new school year here at Tilburg University, where I'm teaching my introductory course on The Visual Language of Comics, and co-teaching Multimodal Communication with Joost Schilperoord. I've even got a keen new departmental profile page!

In between initially moving to the Netherlands and fully moving in, I actually flew back to the States to attend the recent Cognitive Science Society conference in Philadelphia. It was a great conference, especially with the theme of event comprehension.

For my part, I organized a symposium on "Comics and Cognitive Systems" (pdf) which featured an introductory talk by me about how you can use different methods (theory, experimentation, corpus analysis) to converge on greater knowledge of issues than using any one method. This was followed by a great talk by my colleague Joe Magliano about generating inferences in sequential images. My collaborator Emily Coderre then talked about our recent brainwave experiments looking at how autistics process visual narratives compared to sentences. Finally, my collaborator Lia Kendall discussed her behavioral and brainwave studies comparing cartoony and realistic drawing styles. It was an exciting session!

Later on, I gave a second talk about how "Sequential images are not universal" (pdf). This presentation was a caveat for people who use sequential images/comics in experiments and outreach under the belief that they are universally understood and transparent. I then showed cross-cultural research indicating that not everyone understands sequential images, and developmental work showing that understanding a sequence as a sequence follows a particular developmental trajectory.

More updates coming soon including a recently published paper, a few more papers about to be published, and hopefully video from recent presentations like Comic-Con.

Tuesday, July 19, 2016

Moves, Comic-Con, and one crazy summer!

Summer has been a wild one so far, and only getting crazier. In addition to several conferences in the US, I'm now preparing to move to The Netherlands for my new job as an assistant professor at Tilburg University, in the Tilburg center for Cognition and Communication! It's a big move, but I'm very excited to be joining an exciting department with colleagues doing very interesting work.

As if having movers come take all my stuff wasn't crazy enough this week, we've got Comic-Con here! Following on the success of last year's panel, I'll again be chairing a panel with my undergraduate researchers. They'll be presenting about their projects related to cross-cultural diversity in different comics. In these studies, researchers go panel by panel coding different dimensions of structure outlined in visual language theory. We then analyze this data to look at the trends displayed by different cultures.
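
For the curious, the coding itself is conceptually simple. Here is a toy sketch with invented panels and a single made-up dimension (this is not the actual coding scheme, which covers many dimensions per panel):

```python
# Toy corpus analysis: each record codes one panel on some dimension,
# and we compare proportions across cultures. All data are invented.
from collections import Counter

panels = [
    {"culture": "US",    "blockage": False},
    {"culture": "US",    "blockage": False},
    {"culture": "US",    "blockage": True},
    {"culture": "Japan", "blockage": True},
    {"culture": "Japan", "blockage": True},
    {"culture": "Japan", "blockage": False},
]

# Count panels per culture, and panels coded positive for the dimension.
totals = Counter(p["culture"] for p in panels)
hits = Counter(p["culture"] for p in panels if p["blockage"])

for culture in sorted(totals):
    print(culture, round(hits[culture] / totals[culture], 2))
```

Scaled up to thousands of annotated panels, proportions like these are what let us compare structural trends across cultures statistically.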

This year's talk is Friday at 10:30am in room 26AB, as part of the Comic Arts Conference:

Data-Driven Comics Research

Recent work analyzing comics has turned to scientific methods. Neil Cohn (Tilburg University) will chair this panel discussing projects that annotate properties of comics from around the world, and discuss growing efforts for analyzing comics within the cognitive sciences. Then, presentations by Jessika Axner (University of California, San Diego) and Michaela Diercks (University of California, San Diego) will explore the differences between the structures found in comics from America, Japan, Hong Kong, and various European countries, such as France and Sweden. Finally, Nimish Pratha (University of California, San Diego) will describe how sound effects differ across genres of American comics and Japanese manga. Together, these presentations show the benefits of a data-driven, scientific approach to studying comics.

I'll also be signing on Thursday and Friday afternoons at the Reading with Pictures booth (#2445) which has been kind enough to again let me join them.

Hope to see you there!

Friday, April 22, 2016

New paper: Reading without words

One of the most frequent questions that people ask about reading comics is "what are people's eyes doing when comprehending comics?" More and more, I see people planning eye-tracking experiments with comics where eyes are recorded for where they move across a page or strip. I've reviewed a number of these studies on this blog, and many use fairly ad hoc methods without systematically manipulating elements within the experiment.

I'm very proud to say that my new paper, "Reading without words: Eye movements in the comprehension of comic strips," with Tom Foulsham and Dean Wybrow in the journal Applied Cognitive Psychology, addresses these very issues. I was very happy to collaborate with Tom on this project, and it should be the first of several papers related to eye-tracking that we will produce. To our knowledge, this is the first paper on eye-tracking of comics that systematically manipulates aspects of the sequences in controlled experiments, rather than relying on free-form reading or post hoc alterations.

We did two experiments where participants read Peanuts comics in either a coherent or scrambled sequence. In Experiment 1, participants were presented with each panel one-at-a-time, while Experiment 2 presented them in a 3x2 grid. Overall, we found that people had more dispersed eye-movements for the scrambled strips, which also created more "regressions" (looks backward) to other panels in the sequence. This study also addressed a few myths of how comics are understood:

1) By and large, reading order in the 3x2 grid resembled that of text—a left-to-right and down motion with regressions to adjacent units. There was no "scanning" of the page prior to reading, as some have claimed.

2) We also found no real difference in eye-movements for the content of panels between layouts. That is, changing the layout did not affect the comprehension of the sequence.
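
As a side note, the "regressions" measure is easy to picture: given the order in which panels were fixated, a regression is any transition back to an earlier panel. A minimal sketch, with invented fixation sequences:

```python
# Count "regressions" (looks backward) in a sequence of fixated panel
# indices. The fixation sequences below are invented for illustration.

def count_regressions(fixated_panels):
    """Count transitions that move back to an earlier panel."""
    return sum(
        1 for prev, cur in zip(fixated_panels, fixated_panels[1:])
        if cur < prev
    )

# Hypothetical fixation orders over a 6-panel strip (panels numbered 1-6).
coherent = [1, 2, 3, 4, 5, 6]            # straight read-through
scrambled = [1, 2, 1, 3, 2, 4, 5, 4, 6]  # more looks back

print(count_regressions(coherent), count_regressions(scrambled))  # 0 3
```

More regressions for scrambled sequences is exactly the pattern we observed: when the narrative order is disrupted, readers look back to earlier panels more often.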

You can download the paper on my Downloadable Papers page or click here for a pdf.

Here's the full abstract:

"The study of attention in pictures is mostly limited to individual images. When we ‘read’ a visual narrative (e.g., a comic strip), the pictures have a coherent sequence, but it is not known how this affects attention. In two experiments, we eyetracked participants in order to investigate how disrupting the visual sequence of a comic strip would affect attention. Both when panels were presented one at a time (Experiment 1) and when a sequence was presented all together (Experiment 2), pictures were understood more quickly and with fewer fixations when in their original order. When order was randomised, the same pictures required more attention and additional ‘regressions’. Fixation distributions also differed when the narrative was intact, showing that context affects where we look. This reveals the role of top-down structures when we attend to pictorial information, as well as providing a springboard for applied research into attention within image sequences."

Foulsham, Tom, Dean Wybrow, and Neil Cohn. (2016) Reading Without Words: Eye Movements in the Comprehension of Comic Strips. Applied Cognitive Psychology. 30: 566-579

Tuesday, March 29, 2016

Dispelling myths of comics understanding

In reading through various works about comics understanding, I keep hearing several statements repeated over and over. But, several of these statements are not reflective of the way people actually understand comics. So, I'd like to go through several of these myths about "understanding comics" and explain why they aren't true:

1. Page layout ≠ sequential image understanding

This is one of the biggest recurring myths that I see, and it has been leveled against both McCloud's panel transitions (see below) and my own model of narrative grammar. Somehow, because we focus on the relations between panel content, we are seen as denying the "meaning" that arises from a page layout. This myth conflates content and layout.

Content is the meaningful connections between the depictions within panels. Page layout is the physical arrangement of panels on a canvas (like a page). While there are some cases where page layout can factor into the relations between panel content, these are not the same thing; they are largely independent structures.

First off, it is easy to see that layout and content are different because you can rearrange panels into different layouts without changing the meaning. So long as the order of panels remains the same, it doesn't matter whether six panels are in a 2 x 3 grid, arranged vertically, or arranged horizontally. Now, you may end up manipulating the visual composition of panel relations by rearranging them, but that is still not necessarily the same as the "understanding" derived from the relations between the meaningful content in images.

Second, we also know that page layout is different from content because we've done experiments on them. In two separate studies, we gave people comic pages with empty panels and asked them to number the panels in the order they'd read them. We find that people choose consistent orderings of panels, even in unusual patterns, and these orderings do not rely at all on panel content (pdf, pdf). That is, knowing how to "read" a page is not contingent on panel content.

Also, in a recent study we actually tested the difference between single-panel viewing and panels arranged in a 3 x 2 grid. We effectively found little difference between what people look at and how they understand the panels in the different "layouts." In this case, alterations of layout make no difference to the comprehension of content.

Finally, when we present panels one at a time in an experiment, it is not a confound of how people actually understand sequential images. In fact, it is the opposite. These types of experiments aren't looking at "sequential image understanding" in general; each experiment focuses on specific aspects of comprehension. In doing science, you want to factor out as many variables as you can so that your experiment focuses on the specific thing you are examining. If we included page layout, that would be one additional variable that might confound the specific things we're actually examining.

2. There is no "closure" that takes place between juxtaposed images

I have great fondness for McCloud's seminal theories on comics, but we've reached a point where it is clear that several of his claims are not accurate. In particular, the belief that we "fill in the gap" between panels is simply not correct. While it is the case that readers definitely make inferences for information that is not depicted, this inference does not occur "in the gutter" and also not in panel-to-panel juxtapositions (pdf).

We've now shown evidence of this across several experiments. First, when we examine how people make inferences (as with the action stars in the image to the left, or when that image is deleted), the evidence for understanding missing information is not reflected in the gap between images, but in understanding the content of the second image relative to the first (or relative to several previous ones). We see this in slower reading times at the panel after an inference (pdf), or in particular brain responses associated with "mental updating" at the panel after an inference (pdf).

Also, we've shown that people make predictions about what content will come in subsequent images. In one study, we showed that disrupting a sequence within a grouping of panels is worse than disrupting it between groupings (pdf, video). Crucially, the disruption within the first grouping was understood worse than at the break between groupings. Because both of these disruptions appeared before the break between images, people could not have been using the "panel transition" at the break as a cue for those groupings. Rather, people's brains had to have been predicting upcoming structure. This means that there was no "filling in the gaps" because the subsequent image relationship had not yet even been reached.

3. Not all juxtaposed image relationships are meaningful

There is a pervasive myth that all possible juxtapositions of panels can somehow be meaningful. This implies that, no matter which panels are put next to each other, some type of meaning will be construed, and thus that any image has roughly equal probability of appearing after any other. This is simply untrue. Not only do we show that different panel relations are understood by the brain in different ways, but people also choose to order panels in particular, non-random ways. This emerges in findings that...

1. Scrambling the orders of panels is worse than coherent narrative orders (pdf, pdf)
2. Fully random panels pulled from different comics are worse than narrative orders and those where random panels share meaningful associations (like being random panels, but all about baseball) (pdf)
3. Switching the position of some panels in a sequence is worse than others—the longer the switch, the worse the sequence. Also, switches across groupings of panels are worse than within groupings (pdf)
4. People choose to omit some panel types more than others. Those same types are also noticed as missing more often than the types that people choose to delete less often. (pdf)

You can also just ask people about relations between panels: if you give them a "yes/no" rating of whether panel relations are comprehensible, they will consistently say that those expected to be anomalous are indeed incongruous. Or if they rate sequences on a 1 to 7 scale, they will consistently rate the incongruous ones as lower than the congruous ones. While conscious interpretation can be somewhat problematic (see below), people are fairly uniform in their assessment of whether something "makes sense" or does not.

4. Consciously explaining a relation between panels is different than an immediate, unconscious brain response.

This one is particularly important, especially for understanding experimental results like reaction times or brain responses. When you derive meaning from a relationship between panels, your brain responds in a way that attempts to integrate those meanings together. Sometimes, no relation can be made, and you can measure this process by comparing different types of relations to each other. This brain response is also very fast: brainwaves reveal that people recognize the difference between incongruous or out-of-place panels and congruous panels in a sequence in less than half a second.

This type of "meaning" is different than what might be consciously explained. Sure, you may be able to concoct some far-flung way in which two totally unrelated images might have some relationship. However, this post hoc conscious explanation does not mean that is the way your brain derives meaning from the relation between images, and it is far slower than that brain process.

In fact, such explanations are evidence in and of themselves that the relationship may be incongruous: if you have to do mental gymnastics to consciously explain a relation, you are compensating for the lack of any actual relationship between those images.


Want more advice about how to do research on the visual language of comics? Check out this paper and these blog posts.