Saturday, February 04, 2017

New paper: Drawing the Line...in Visual Narratives

I'm happy to announce that we have a new paper in the latest issue of the Journal of Experimental Psychology: Learning, Memory, and Cognition entitled "Drawing the Line Between Constituent Structure and Coherence Relations in Visual Narratives."

This was my final project at Tufts University, and was carried out by my former assistant (and co-author) Patrick Bender, who is now in graduate school at USC.

We wanted to examine the relationship between meaningful panel-to-panel relationships ("panel transitions") and the hierarchic constructs of my theory of narrative grammar. Many discourse theories have posited that people assess meaningful relations between successive images in a visual sequence, and (as in my theory) that people make groupings. Yet, in these theories, the groupings are signaled by major changes in meaning, such as a "transition" with a big character change. We hypothesized that groupings are not actually motivated by changes in meaning, but by narrative category information that aligns with larger narrative structures.

So, we simply gave people various visual sequences and asked them to "draw a line" between panels that would best divide the sequence into two meaningful parts—i.e., to break up the sequence into groupings. People then continued to draw lines until all panels had lines between them, and we looked at what influenced their groupings. Similar tasks have been used in many studies of discourse and event cognition.

We found that panel transitions did indeed influence participants' segmentation of the sequences. However, narrative category information was a far greater predictor of where they divided sequences than these meaningful transitions between panels. That is: narrative structure better predicts how people intuit groupings in visual sequences than semantic "panel transitions" do.
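For readers curious about the mechanics of such an analysis, here is a minimal sketch in Python of the regression logic. This is an illustration only, not our actual analysis code: all of the data and column names below are invented.

```python
# A minimal sketch of the regression logic (not our actual analysis code;
# all data and column names here are invented for illustration).
import pandas as pd
import statsmodels.formula.api as smf

# One row per panel boundary: did a participant draw a line here (1/0)?
data = pd.DataFrame({
    "line_drawn":        [1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1],
    "constituent_break": [1, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0],  # predicted structural boundary
    "coherence_shift":   [1, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 0],  # e.g., a big character change
})

# Logistic regression: which cue better predicts where lines get drawn?
model = smf.logit("line_drawn ~ constituent_break + coherence_shift", data=data)
result = model.fit()
print(result.summary())  # compare the coefficients of the two predictors
```

The relative size (and significance) of the two coefficients is what tells you which cue does more of the work.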

The paper is downloadable here (pdf), or along with all of my other papers on my downloadable papers page.

Full abstract:

Theories of visual narrative understanding have often focused on the changes in meaning across a sequence, like shifts in characters, spatial location, and causation, as cues for breaks in the structure of a discourse. In contrast, the theory of visual narrative grammar posits that hierarchic “grammatical” structures operate at the discourse level using categorical roles for images, which may or may not co-occur with shifts in coherence. We therefore examined the relationship between narrative structure and coherence shifts in the segmentation of visual narrative sequences using a “segmentation task” where participants drew lines between images in order to divide them into subepisodes. We used regressions to analyze the influence of the expected constituent structure boundary, narrative categories, and semantic coherence relationships on the segmentation of visual narrative sequences. Narrative categories were a stronger predictor of segmentation than linear coherence relationships between panels, though both influenced participants’ divisions. Altogether, these results support the theory that meaningful sequential images use a narrative grammar that extends above and beyond linear semantic shifts between discourse units.

Full Reference:

Cohn, Neil and Patrick Bender. 2017. Drawing the line between constituent structure and coherence relations in visual narratives. Journal of Experimental Psychology: Learning, Memory, and Cognition. 43(2): 289-301.



Sunday, December 18, 2016

2016: My Publications in Review

As 2016 nears its close, I thought I should do a post reflecting on all the research I've released with my colleagues over the past year. This was my biggest year of publishing yet, so I thought it would be good to just go over what we came out with.

First off, in January, Bloomsbury published my edited volume The Visual Narrative Reader. The book features 12 chapters summarizing or reprinting important and often overlooked papers in the field of visual narrative research, with topics ranging from metaphor theory and multimodality, to how kids draw and understand sequential images, to various examinations of cross-cultural and historical visual narrative systems. In my classes, I use it as a companion volume to my monograph, The Visual Language of Comics.

The rest of the year then saw a flurry of publications (title links go to blog summaries, pdf links to pdfs):

A multimodal parallel architecture (pdf) - Outlines a cognitive model for language and multimodal interactions between verbal (spoken language), visual-graphic (drawings, visual languages), and visual-bodily (gesture, sign languages) modalities. This paper essentially presents the core theoretical model of my research, and my vision for the cognitive architecture of the language system.

The vocabulary of manga (pdf) - This project with Sean Ehly coded over 5,000 panels in 10 shonen and 10 shojo manga to reveal that they mostly use the same 70 visual morphemes ("symbology"), though they use them in differing proportions. This suggested that there is a broad Japanese Visual Language in which genre-specific "dialects" manifest variations on this generalized structure.

The pieces fit (pdf) - This experiment with Carl Hagmann tested participants' comprehension of sequential images with rapidly presented panels (1 second, half a second) when we switched the positions of panels. In general, switches between panels near each other in the original sequence were easier to comprehend than switches across greater distances, but switches that crossed boundaries of groupings ("narrative constituents") were worse than those within groupings. This provides further evidence that people make groupings of panels, not just linear panel transitions.

Reading without words (pdf) - My project with collaborator Tom Foulsham reports one of the first controlled eye-tracking studies using comics. We show that people's eyes move through a grid layout much as they would across text—left-to-right and down, mostly keeping to their row, and looking back mostly to adjacent frames. We also found that people mostly look at the same content of a panel whether it is shown in a grid or one panel at a time, but eye fixations to panels from scrambled sequences are slightly more dispersed than those to panels in normal sequences.

Meaning above the head (pdf) - This paper with my student Beena Murthy and collaborator Tom Foulsham explored the understanding of "upfixes"—the visual elements that float above characters' heads, like lightbulbs or hearts. We show that upfixes are governed by constraints: the upfix needs to go above the head, not next to it, and must "agree" with the facial expression (storm clouds can't go above a happy face). These constraints operate over both conventional and novel upfixes, suggesting that this is an abstract schematic pattern.

The changing pages of comics (pdf) - My student Kaitlin Pederson and I report on her project coding over 9,000 panels in 40 American superhero comics from the 1940s through the 2010s to see how page layout has changed over time. Overall, we argue that layouts over time have become both more systematic as well as more decorative.

Pow, punch, pika, and chu (pdf) - Along with students Nimish Pratha and Natalie Avunjian, we report on their analyses of sound effects in American comics (Mainstream vs. Indy) and Japanese manga (shonen vs. shojo) and show that the structure and content of sound effects differ both within and between cultures.

Sequential images are not universal, or caveats for using visual narratives in experimental tasks (pdf) - This conference proceedings paper reviews some of the research showing that sequential images are not understood universally, and are dependent on cultural and developmental knowledge to be understood.

Finally, I also had a book chapter come out in the book Film Text Analysis, edited by Janina Wildfeuer and John Bateman. My chapter, "From Visual Narrative Grammar to Filmic Narrative Grammar," explores how my theory of narrative structure for static sequential images can also be applied to explaining the comprehension of film. I'll hopefully do a write-up of it on this site sometime soon.

It was a big year of publications, and next year will hopefully be just as exciting! All my papers are available on this page.


Tuesday, December 06, 2016

New paper: Pow, punch, pika, and chu

I'm once again excited to announce the publication of another of my students' projects. Our paper, "Pow, Punch, Pika, and Chu: The Structure of Sound Effects in Genres of American Comics and Japanese Manga" is now published in the latest issue of Multimodal Communication. This was another paper derived from student projects in my 2014 class, The Cognition of Comics. The first two authors, Nimish Pratha and Natalie Avunjian, did research projects examining the use of onomatopoeia in genres of Japanese manga and American comics.

The biggest finding in American comics was that onomatopoetic sound effects (Pow!) are used in much greater proportion than descriptive sound effects (Punch!). In fact, though we found some descriptive sound effects in genres of "Independent" comics, we found none in the 10 superhero comics that were analyzed.

In Japanese manga, we found slightly different results. We categorized two types of "sound" effects in manga: giongo are audible sounds (crack!), while gitaigo depict qualities that can't literally be heard (sparkle!). We found that shonen manga used more giongo than gitaigo, while shojo manga had the opposite trend: they used more gitaigo than giongo. In addition, sound effects in shonen manga were more often written in the katakana script than hiragana, while the reverse occurred for shojo manga.
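For those who like to see the mechanics, a comparison of category proportions across genres boils down to a contingency-table test. Here is a minimal sketch with invented counts; these are not our published numbers.

```python
# Minimal sketch of a genre-by-category comparison (counts are invented
# for illustration; these are not the published data).
from scipy.stats import chi2_contingency

# Rows: shonen, shojo; columns: giongo (audible), gitaigo (non-audible)
counts = [
    [120, 60],   # shonen: more giongo than gitaigo
    [50, 110],   # shojo: the opposite trend
]

chi2, p, dof, expected = chi2_contingency(counts)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}, dof = {dof}")
```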

Overall, our results suggested that different types of comics can be characterized by the way they use "sound effects."

You can download the paper here, or at my downloadable papers page.

Here's Nimish speaking about the project at this past Comic-Con (unfortunately cut just slightly short by time):



Abstract:

As multimodal works, comics are characterized as much by their use of language as by the style of their images. Sound effects in particular are exemplary of comics’ language-use, and we explored this facet of comics by analyzing a corpus of books from genres in the United States (mainstream and independent) and Japan (shonen/boys’ and shojo/girls’). We found variation between genres and between cultures across several properties of the content and presentation of sound effects. Foremost, significant differences arose between the lexical categories of sound effects (ex. onomatopoetic: Pow! vs. descriptive: Punch!) between genres within both cultures’ works. Additionally, genres in Japanese manga vary in the scripts used to write sound effects in Japanese (hiragana vs. katakana). We argue that, in English, a similar function is communicated through the presence or absence of textual font stylization. Altogether, these aspects of variation mark sound effects as important carriers of multimodal information, and provide distinctions by which genres and cultures of comics can be distinguished.

Reference:

Pratha, Nimish K., Natalie Avunjian, and Neil Cohn. 2016. Pow, punch, pika, and chu: The structure of sound effects in genres of American comics and Japanese manga. Multimodal Communication. 5(2): 93-109.

Tuesday, November 22, 2016

New paper: The changing pages of comics

I'm excited to announce the publication of our latest paper, "The changing pages of comics: Page layouts across eight decades of American superhero comics," in the latest issue of Studies in Comics. This was a student project undertaken by the first author, Kaitlin Pederson, from my 2014 class, The Cognition of Comics. She analyzed how page layouts have changed over time in American superhero comics from the 1940s to the 2010s. This is the first published, data-driven paper using corpus analysis on page layouts in comics, so that's quite exciting!

Kaitlin went panel-by-panel in these books analyzing various properties of their page layouts. She coded over 9,000 panels across 40 comics. Some of these features are captured in this figure:


She found that certain features have decreased over time (horizontal staggering, etc.), while others have increased over time (whole rows, etc.). Overall, her conclusion is that pages in earlier comics were fairly unsystematic in their layouts, while over time layouts grew more systematic and, at the same time, treated the page more as a whole "canvas." This is complemented especially by the shift toward "widescreen" layouts in pages over the past two decades.
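As an illustration of what this kind of trend analysis looks like, here is a minimal sketch in Python. The numbers are invented, and this is not Kaitlin's actual data or analysis pipeline.

```python
# Minimal sketch of a trend analysis across decades (all numbers invented;
# this is not the study's actual data or pipeline).
from scipy.stats import linregress

decades = [1940, 1950, 1960, 1970, 1980, 1990, 2000, 2010]
# Hypothetical proportion of panels appearing in whole rows per decade
whole_row_rate = [0.10, 0.12, 0.15, 0.22, 0.28, 0.35, 0.41, 0.48]

result = linregress(decades, whole_row_rate)
print(f"slope per decade = {result.slope * 10:.3f}, p = {result.pvalue:.5f}")
```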

You can download the paper here (pdf), or at my downloadable papers page.

Also, here's Kaitlin at Comic-Con 2015 reporting on her initial analyses of this project:



Full abstract:


Page layouts are one of the most overt features of comics’ structure. We hypothesized that American superhero comics have changed in their page layout over eight decades, and investigated this using a corpus analysis of 40 comics from 1940 through 2014. On the whole, we found that comics pages decreased in their use of grid-type layouts over time, with an increase in various non-grid features. We interpret these findings as indicating that page layouts moved away from conventional grids and towards a “decorative” treatment of the page as a whole canvas. Overall, our analysis shows the benefit of empirical methods for the study of the visual language of comics. 

Reference:

Pederson, Kaitlin and Neil Cohn. 2016. The changing pages of comics: Page layouts across eight decades of American superhero comics. Studies in Comics. 7(1): 7-28.

Sunday, September 18, 2016

New paper: Meaning above the head

I'm happy to announce that our new paper, "Meaning above the head," is now published in the Journal of Cognitive Psychology! This one explores the structure of "upfixes," the class of visual signs that float above characters' heads, like lightbulbs or hearts.

In my book, The Visual Language of Comics, I made a few hypotheses about these elements. First, I argued that they are bound by a few constraints: 1) they typically go above the head, and look weird when moved to the side; 2) the upfix has a particular "agreement" relationship with the face (e.g., storm clouds go with a sad face, but are weird with a happy face). I also argued that upfixes form an abstract class, meaning new ones can easily be created, though they won't be quite as comprehensible as conventional ones (as in the image below).

With these hypotheses stated, my enterprising student Beena Murthy set out to test these ideas in an experiment she ran for a class project (many of the projects from that class are now published). We were then joined by my collaborator Tom Foulsham, who aided us in testing additional questions in a second experiment (which you may have taken online!).

Lo and behold, almost all of my hypotheses appear to be borne out! Overall, this means that upfixes use particular constraints in their construction, and allow for the creation of new, novel signs! We now plan to follow up these experiments with several more.

Check out the paper, which is available on my Downloadable Papers page, or directly here: PDF.

AND... don't forget that you can also get awesome t-shirts with both the normal and unconventional upfixes. The shirt designs (which are the images in this post) actually feature our stimuli from the experiments!


Abstract:

“Upfixes” are “visual morphemes” originating in comics where an element floats above a character’s head (ex. lightbulbs or gears). We posited that, similar to constructional lexical schemas in language, upfixes use an abstract schema stored in memory, which constrains upfixes to locations above the head and requires them to “agree” with their accompanying facial expressions. We asked participants to rate and interpret both conventional and unconventional upfixes that either matched or mismatched their facial expression (Experiment 1) and/or were placed either above or beside the head (Experiment 2). Interpretations and ratings of conventionality and face–upfix matching (Experiment 1) along with overall comprehensibility (Experiment 2) suggested that both constraints operated on upfix understanding. Because these constraints modulated both conventional and unconventional upfixes, these findings support that an abstract schema stored in long-term memory allows for generalisations beyond memorised individual items.

Full reference:

Cohn, Neil, Beena Murthy, and Tom Foulsham. (2016). Meaning above the head: combinatorial constraints on the visual vocabulary of comics. Journal of Cognitive Psychology. 28(5): 559-574.

Tuesday, August 30, 2016

Dispelling myths about comics page layout

There are many websites and Twitter accounts that give advice about how to draw comics, and perhaps no piece of "advice" arises more often than the repeated advocacy to avoid page layouts like the one in the image to the right. Advice-givers claim that this layout is confusing because a reader may not know whether to follow their usual left-to-right and down "Z-path" from A to C (resulting in a backtrack to B), or whether to go vertically from A to B, then to C. Because of this confusion, they advise avoiding this layout at all costs, with the fervor of a grammar nazi for the visual language of comics.

This post aims to disentangle what we know and what we don't know about this layout, how people navigate through it, and how it occurs in comics. I here report on what science tells us about this layout, not "gut feelings" or conventional wisdom.

First off, let me give this phenomenon a name. In my papers and my book, The Visual Language of Comics, I labeled this layout "blockage," because the long vertical panel "blocks" the flow of horizontal navigation. I called the flipped version of this layout (long panel on the left, vertical stack on the right) "reverse blockage," simply because it is blockage in reverse.




Is this layout confusing?

I understand why people think that this layout is confusing. That's why it was one of the key elements that I tested in the very first experiment I ever did about comics, conducted at Comic-Con International way back in 2004 (though it took many more years to write it up and get published).

In that first study, I presented people with empty page layouts—with no content in the panels—and asked them to number the panels in the order that they would read them. As shown in the graph to the side (red bar), the Z-path (horizontal reading) was chosen for blockage only around 32% of the time. People used the vertical reading roughly 62-68% of the time (see details in the paper). In other words, people actually preferred the vertical reading by a 2-to-1 margin.

One might also note from the graph that, of all the features we tested in the study, this one was the most consistent at pushing people away from the Z-path of reading.

Now, admittedly, my first study was not that carefully controlled an experiment. It was my first one, after all, and I did it before I even started my graduate training in experimental psychology. I essentially tested lots of different instances of this layout (among other aspects of layout), but I did not explicitly manipulate it to see what variables affect it. In particular, the relationship between vertical staggering and blockage was not clear. So, we did a second study...

In our follow up study (pdf), we more carefully manipulated the layouts to ask the question: what is the relationship between a blockage layout and a "staggered" layout where the gutters are merely discontinuous. We used several page "templates" with basic aspects of layouts that were then filled with our experimental manipulations, which modulated the height of the right-hand border:


I should note that this design meant it was not obvious that we were testing this phenomenon (we also tested other aspects of layout). People saw lots of different whole-page layouts with lots of variations. This is important, because it helped ensure that participants were not aware of what was being tested, and thus could proceed in an unbiased manner (as was also true in my first study, though less systematically controlled).

In this study, we found that in "pure blockage" arrangements ("Bottom full"), people went down vertically at a rate of 91%, and horizontally only 9% of the time. This could be modulated by raising the right-hand gutter, though: the higher the gutter was raised (i.e., the greater the stagger), the more likely people were to go horizontally. The data were actually beautiful, and there's another graph below that shows this.

If people go vertical at a rate of 91%, that is pretty solid evidence that they prefer to read these arrangements vertically. This is not "confused"—there is overwhelming consistency. That's why when I see people harping about avoiding this layout, I send around the graphic above and say "no, it's not confusing! Feel free to use it!"

Now, one might say "But these data show that 1 out of 10 people will find this confusing! That's still confusing! Don't use it!" Let me unpack this. First off, almost no scientific study will show 100% of people doing something 100% of the time. Case in point: we didn't even get 100% consistency in reading 2 x 2 panel grids, where the navigation should be obvious (a point I'll return to below).

Second, these data are not counts of people, but averages across instances in the experiment (each person saw more than one version of the layout—we averaged across them), and then we took an average across participants to analyze. So, it's not 1 out of 10 people; it's a mean rate of 91% of going down rather than over, across individual instances and people (a sketch of this arithmetic appears below).

Third, if you think that a rate of 91%/9% still shows "confusion" in people's preferences, then bear in mind that's roughly the same rate at which people didn't choose the Z-path for arrangements in a 2 x 2 grid, which also appeared throughout the experiment. The actual graph for our data is to the left. (As I said, it beautifully shows a stepwise pattern for the height of the gutter.) The rates at which people use the Z-path in blockage (red) and in the grid (grey) are essentially the inverse of each other.

In other words, the rate for going horizontal (Z-path) in blockage is the same as going vertical (non-Z-path) in pure grids. So, if you're going to harp on blockage for being confusing, does that mean you're also going to harp on basic grids for being confusing?
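As an aside on the arithmetic in the second point above, here is a tiny sketch with invented trial data: each participant contributes a rate averaged over their own trials, and the headline figure is the mean of those per-participant rates.

```python
# Minimal sketch of rate averaging: per-participant means first, then a
# grand mean across participants (all trial data here are invented).
import numpy as np

# 1 = chose the vertical path on a blockage trial, 0 = horizontal
trials_per_participant = [
    [1, 1, 1, 0],   # participant A: 75% vertical
    [1, 1, 1, 1],   # participant B: 100% vertical
    [1, 0, 1, 1],   # participant C: 75% vertical
]

per_participant_rates = [np.mean(t) for t in trials_per_participant]
grand_mean = np.mean(per_participant_rates)
print(per_participant_rates)             # [0.75, 1.0, 0.75]
print(f"grand mean = {grand_mean:.0%}")  # grand mean = 83%
```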

Caveats: What these experiments show is that people have intuitions for how to navigate through blockage (and other) layouts in consistent ways that are measurable and quantifiable. These experiments show what people's preferences are; i.e., how they would consciously choose to navigate a layout. And, they do this by using layouts with no content.

It is certainly the case that navigating comics with content is different than those with empty panels. The inclusion of content may push people in different ways, which we can study (color, balloon placement, characters overlapping panels, etc.). But, this is exactly consistent with my theory about page layouts: there are many factors ("preference rules") pushing around how you navigate. For example, if you colored panels A and C blue and colored B yellow, that visual feature might push you towards C instead of B.

However, these influences are not your basic preference. By testing empty panels that lack these additional influences, we can factor them out and get at people's basic intuitions. This is how you do science: by isolating structures to understand their influence on a situation.

Finally, since these experiments tested people's preferences, they don't test people's actual behavior. In the one study that has looked at behavior with these layouts, a Japanese team found that these layouts elicited more back-and-forth eye movements ("regressions") than when those same panels were rearranged post hoc. Note, though, that there were several problems with this experiment (described here). Nevertheless, the results should not be discounted, and they imply that there may be a disconnect between people's behavior (like eye movements) and their intuited preferences for navigation. We're currently doing studies to tease this apart.


What about comic reading experience? 

One factor that could possibly influence how people read comics is their experience. I've shown that the frequency with which people read comics can influence lots of things about how comics are comprehended, including people's brain responses. There is a "fluency" for the visual language used in comics. Maybe this could extend to blockage layouts?

In my first experiment, the only thing that modulated whether people used the Z-path in blockage layouts was whether they had any comic reading experience at all. People who said they "never" read comics were significantly more likely to use the Z-path than those who read comics to any degree whatsoever. This is the dotted blue line in the graph to the right.

In our second study, we used a more sophisticated measurement of comic reading expertise called the Visual Language Fluency Index (VLFI), which I've used in many studies. We didn't find any significant correlations between VLFI scores and blockage paths, but we did find an interesting trend: the correlation statistics (r-values) increased as the gutter got higher. This suggests that the more the layout used blockage, the more experience in reading comics seemed to matter. But, again, this wasn't statistically significant.
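For the statistically inclined, the logic of that analysis looks roughly like the sketch below. The VLFI scores and navigation rates here are invented, not our data.

```python
# Minimal sketch of a per-condition correlation analysis (VLFI scores and
# navigation rates here are invented, not data from the study).
from scipy.stats import pearsonr

vlfi = [8.2, 12.5, 15.1, 20.3, 25.7, 30.2]  # hypothetical fluency scores

# Hypothetical per-participant rates of vertical navigation per condition
vertical_rate_by_condition = {
    "low stagger":  [0.55, 0.58, 0.60, 0.62, 0.66, 0.64],
    "high stagger": [0.60, 0.70, 0.72, 0.80, 0.85, 0.88],
}

for condition, rates in vertical_rate_by_condition.items():
    r, p = pearsonr(vlfi, rates)
    print(f"{condition}: r = {r:.2f}, p = {p:.3f}")
```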


What about different types of comics?

Another factor that might influence this layout is the degree to which it appears in comics of the world. Over the past several years, my students and I have been gathering data about properties from different comics around the world, and this is one of the things we've coded for.

The first study to code for properties of page layout in comics was done by my student, Kaitlin Pederson. She analyzed how page layouts have changed across the last 80 years of American superhero comics. The paper for this study should come out soon, but here is her presentation on this material from Comic-Con 2015. Essentially, she found that blockage occurs in fairly small proportions in American comics, though its use in page layouts has been increasing over time (that is, it's being used more often more recently); this trend, however, was only approaching statistical significance.

If blockage is indeed increasing in usage over time, that would imply a corollary for cognition: we might expect younger readers (who experience it more) to have less of an issue with it than older readers (who experienced it less frequently). However, in neither study did we find correlations between the age of participants and blockage choices.

In more recent work, we've looked at layouts from across the world. This work isn't published yet, but it was presented by my students at Comic-Con 2016. We found that blockage is used much more in Asian books (Japanese manga, Chinese manhua) than in Western books (US superhero and Indy books, and books from France and Sweden). That paper will hopefully be written up soon.

So, might the rate at which people read manga (which uses more blockage) impact how they choose to navigate this layout? It doesn't seem so. In my first study, I found no impact of people's manga-reading frequency on blockage layouts. This was actually a surprising finding for me: my intuition was that blockage occurs more in manga (which we now have data to support), so I figured experience reading manga would matter. But the data don't bear this out. I also went back into the data from my second study and looked at whether manga reading had an impact: nope, no influence.

So, yes, this layout does vary across comics from different cultures and time periods. However—at least so far (and this could change with more studies)—it seems that the types of comics you read do not impact how you navigate pages. I'll also note that this differs from some other recent findings of mine showing that the types of comics you read do impact how your brain comprehends image sequences.

Closing thoughts

In this post, I've discussed what science—not conventional wisdom or hearsay—tells us about "blockage" layouts. I've discussed data from two experiments published in peer-reviewed journals, which show that people are fairly consistent about how they choose to navigate these layouts—at least as consistent as people navigate through grids. This navigation is modulated somewhat by having experience reading comics, but not overwhelmingly. It also seems unaffected by which types of comics people read, even though it appears more in Asian books than Western ones.

At the outset of this post I likened harping on avoiding blockage layouts to being a "grammar nazi." I actually think this is an apt analogy. Like blockage, most of the so-called "rules" of language upheld by grammar nazis are not actually rules of English grammar. They're gospels hammered into people through rote conditioning, but they have little reality in the way English is structured or comprehended. This is the visual language equivalent of one of those "rules."

So, I again say: this is not an overly problematic layout and people are making much ado about nothing. Feel free to use it without worry that people will be confused by it. The visual language of comics is incredibly complex and multifaceted in its structure, and the most complicated and impactful aspects of this structure usually go unnoticed or un-commented on by critics and creators alike. In the scope of that complexity, this layout is fairly minor in terms of people's comprehension of comics. Perhaps it's time to focus on other things?


Sunday, August 28, 2016

August craziness from the east side of the Atlantic

Busy times lately: This marks my first blog post on the east side of the Atlantic! I've now been living in the Netherlands for the better part of August, and things have been a bit crazy moving and getting settled in. We're now entering the new school year here at Tilburg University, where I'm teaching my introductory course on The Visual Language of Comics, and co-teaching Multimodal Communication with Joost Schilperoord. I've even got a keen new departmental profile page!

In between initially moving to the Netherlands and fully moving in, I actually flew back to the States to attend the recent Cognitive Science Society conference in Philadelphia. It was a great conference, especially with the theme of event comprehension.

For my part, I organized a symposium on "Comics and Cognitive Systems" (pdf) which featured an introductory talk by me about how you can use different methods (theory, experimentation, corpus analysis) to converge on greater knowledge of issues than using any one method. This was followed by a great talk by my colleague Joe Magliano about generating inferences in sequential images. My collaborator Emily Coderre then talked about our recent brainwave experiments looking at how autistics process visual narratives compared to sentences. Finally, my collaborator Lia Kendall discussed her behavioral and brainwave studies comparing cartoony and realistic drawing styles. It was an exciting session!

Later on, I gave a second talk about how "Sequential images are not universal" (pdf). This presentation was a caveat to people who use sequential images/comics in experiments and outreach believing that they are universally understood and transparent. I presented cross-cultural research showing that not everyone understands sequential images, and developmental work showing that understanding a sequence as a sequence follows a particular developmental trajectory.

More updates coming soon including a recently published paper, a few more papers about to be published, and hopefully video from recent presentations like Comic-Con.

Tuesday, July 19, 2016

Moves, Comic-Con, and one crazy summer!

Summer has been a wild one so far, and it's only getting crazier. In addition to several conferences in the US, I'm now preparing to move to the Netherlands for my new job as an assistant professor at Tilburg University, in the Tilburg center for Cognition and Communication! It's a big move, but I'm very excited to be joining a department with colleagues doing very interesting work.

As if having movers come take all my stuff wasn't crazy enough this week, we've got Comic-Con here! Following on the success of last year's panel, I'll again be chairing a panel with my undergraduate researchers. They'll be presenting about their projects related to cross-cultural diversity in different comics. In these studies, researchers go panel by panel coding different dimensions of structure outlined in visual language theory. We then analyze this data to look at the trends displayed by different cultures.

This year's talk is Friday at 10:30am in room 26AB, as part of the Comic Arts Conference:

Data-Driven Comics Research

Recent work analyzing comics has turned to scientific methods. Neil Cohn (Tilburg University) will chair this panel discussing projects that annotate properties of comics from around the world, and discuss growing efforts for analyzing comics within the cognitive sciences. Then, presentations by Jessika Axner (University of California, San Diego) and Michaela Diercks (University of California, San Diego) will explore the differences between the structures found in comics from America, Japan, Hong Kong, and various European countries, such as France and Sweden. Finally, Nimish Pratha (University of California, San Diego) will describe how sound effects differ across genres of American comics and Japanese manga. Together, these presentations show the benefits of a data-driven, scientific approach to studying comics.



I'll also be signing on Thursday and Friday afternoons at the Reading with Pictures booth (#2445) which has been kind enough to again let me join them.

Hope to see you there!

Friday, April 22, 2016

New paper: Reading without words

One of the most frequent questions that people ask about reading comics is "what are people's eyes doing when comprehending comics?" More and more, I see people planning eye-tracking experiments with comics where eyes are recorded for where they move across a page or strip. I've reviewed a number of these studies on this blog, and many use fairly ad hoc methods without systematically manipulating elements within the experiment.

I'm very proud to say that my new paper, "Reading without words: Eye movements in the comprehension of comic strips," with Tom Foulsham and Dean Wybrow in the journal Applied Cognitive Psychology, addresses these very issues. I was very happy to collaborate with Tom on this project, and it should be the first of several papers related to eye-tracking that we will produce. To our knowledge, this is the first eye-tracking paper on comics that systematically manipulates aspects of the sequences in controlled experiments, outside of free-form reading or post-hoc alterations.

We did two experiments where participants read Peanuts comics in either a coherent or scrambled sequence. In Experiment 1, participants were presented with each panel one-at-a-time, while Experiment 2 presented them in a 3x2 grid. Overall, we found that people had more dispersed eye-movements for the scrambled strips, which also created more "regressions" (looks backward) to other panels in the sequence. This study also addressed a few myths of how comics are understood:

1) By and large, reading order in the 3x2 grid resembled that of text—a left-to-right and down motion with regressions to adjacent units. There was no "scanning" of the page prior to reading, as some have claimed.

2) We also found no real difference in eye-movements for the content of panels between layouts. That is, changing the layout did not affect the comprehension of the sequence.
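As a side note for anyone planning their own eye-tracking studies: a "regression" in this sense is simply a fixation that lands on an earlier panel than the previous one. Here is a minimal sketch of counting them, with invented fixation sequences; this is not the paper's analysis pipeline.

```python
# A sketch of one way to count "regressions" (the fixation sequences are
# invented; this is not the paper's analysis pipeline).
def count_regressions(panel_sequence):
    """Count fixations that land on an earlier panel than the previous one."""
    return sum(
        1 for prev, cur in zip(panel_sequence, panel_sequence[1:])
        if cur < prev
    )

# Panels numbered 1-6 in reading order; a scrambled strip might elicit:
coherent_reader = [1, 2, 3, 4, 5, 6]
scrambled_reader = [1, 2, 1, 3, 4, 2, 5, 6]

print(count_regressions(coherent_reader))   # 0
print(count_regressions(scrambled_reader))  # 2
```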

You can download the paper on my Downloadable Papers page or click here for a pdf.

Here's the full abstract:

"The study of attention in pictures is mostly limited to individual images. When we ‘read’ a visual narrative (e.g., a comic strip), the pictures have a coherent sequence, but it is not known how this affects attention. In two experiments, we eyetracked participants in order to investigate how disrupting the visual sequence of a comic strip would affect attention. Both when panels were presented one at a time (Experiment 1) and when a sequence was presented all together (Experiment 2), pictures were understood more quickly and with fewer fixations when in their original order. When order was randomised, the same pictures required more attention and additional ‘regressions’. Fixation distributions also differed when the narrative was intact, showing that context affects where we look. This reveals the role of top-down structures when we attend to pictorial information, as well as providing a springboard for applied research into attention within image sequences."



Foulsham, Tom, Dean Wybrow, and Neil Cohn. 2016. Reading without words: Eye movements in the comprehension of comic strips. Applied Cognitive Psychology. 30: 566-579.

Tuesday, March 29, 2016

Dispelling myths of comics understanding

In reading through various works about comics understanding, I keep seeing several statements repeated over and over. But many of these statements do not reflect the way people actually understand comics. So, I'd like to go through several of these myths about "understanding comics" and explain why they aren't true:

1. Page layout ≠ sequential image understanding

This is one of the biggest recurring myths that I see, and it has been leveled against both McCloud's panel transitions (see below) and my own model of narrative grammar. Because we focus on the relations between panel content, we are somehow seen as denying the "meaning" that arises from a page layout. This myth conflates content and layout.

Content is the meaningful connections between the depictions within panels. Page layout is the physical arrangement of panels on a canvas (like a page). While there are some cases where page layout can factor into the relations between panel content, the two are not the same thing; they are independent structures.

First off, it is easy to see that layout and content are different because you can rearrange panels into different layouts without changing the meaning. So long as the order of panels remains the same, it doesn't matter whether six panels appear in a 2 x 3 grid, in a vertical column, or in a single horizontal row. Now, you may end up manipulating the visual composition of panel relations by rearranging them, but that is still not necessarily the same as the "understanding" that is derived from the relations between the meaningful content of images.

Second, we also know that page layout is different from content because we've done experiments on them. In two separate studies, we gave people comic pages with empty panels and asked them to number the panels in the order they'd read them. We find that people choose consistent orderings of panels, even in unusual arrangements, in ways that do not rely at all on panel content (pdf, pdf). That is, knowing how to "read" a page is not contingent on panel content.

Also, in a recent study we directly tested the difference between single-panel viewing and panels arranged in a 3 x 2 grid. We found little difference between what people look at and how they understand the panels in the different "layouts." In this case, alterations of layout made no difference to the comprehension of content.

Finally, when we do an experiment where we present panels one at a time, it is not a confound of how people actually understand sequential images. In fact, it is the opposite: these experiments aren't looking at "sequential image understanding" in general—each experiment focuses on specific aspects of comprehension. In doing science, you want to factor out as many variables as you can so that your experiment focuses on the specific thing you are examining. If we included page layout, that would be one additional variable that might confound the actual things we're examining.


2. There is no "closure" that takes place between juxtaposed images

I have great fondness for McCloud's seminal theories on comics, but we've reached a point where it is clear that several of his claims are not accurate. In particular, the belief that we "fill in the gap" between panels is simply not correct. While readers definitely make inferences about information that is not depicted, this inferencing does not occur "in the gutter," nor in panel-to-panel juxtapositions (pdf).

We've now shown evidence of this across several experiments. First, when we examine how people make inferences (as with the action stars in the image to the left, or when that image is deleted), the evidence for understanding the missing information is not reflected in the gap between images, but in understanding the content of the second image relative to the first (or relative to several previous ones). We see this in slower reading times at the panel after an inference (pdf), and in particular brain responses associated with "mental updating" at the panel after an inference (pdf).

Also, we've shown that people make predictions about what content will come in subsequent images. In one study, we showed that disrupting a sequence within a grouping of panels is worse than disrupting it between groupings (pdf, video). Crucially, the disruption within the first grouping was understood worse than at the break between groupings. Because both of these disruptions appeared before the break between images, people could not have been using the "panel transition" at the break as a cue for those groupings. Rather, people's brains had to have been predicting upcoming structure. This means that there was no "filling in the gaps" because the subsequent image relationship had not yet even been reached.


3. Not all juxtaposed image relationships are meaningful

There is a pervasive myth that all possible juxtapositions of panels can somehow be meaningful. This implies that, no matter which panels are put together, some type of meaning will be construed—and thus that any image has roughly equal probability of appearing after any other image. This is simply untrue. Not only do we show that the brain understands different panel relations in different ways, but people also choose to order panels in particular, not random, ways. This emerges in findings that...

1. Scrambling the orders of panels is worse than coherent narrative orders (pdf, pdf)
2. Fully random panels pulled from different comics are worse than narrative orders and those where random panels share meaningful associations (like being random panels, but all about baseball) (pdf)
3. Switching the position of some panels in a sequence is worse than others—the longer the switch, the worse the sequence. Also, switches across groupings of panels are worse than within groupings (pdf)
4. People choose to omit some panel types more than others, and those same types are also recognized as missing more often than the types people rarely choose to delete. (pdf)
Etc.

You can also just ask people about relations between panels: if you give them a "yes/no" rating of whether panel relations are comprehensible, they will consistently say that those expected to be anomalous are indeed incongruous. Or if they rate sequences on a 1 to 7 scale, they will consistently rate the incongruous ones as lower than the congruous ones. While conscious interpretation can be somewhat problematic (see below), people are fairly uniform in their assessment of whether something "makes sense" or does not.
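For what it's worth, the comparison behind such rating results is statistically simple. Here is a minimal sketch with invented ratings; these are not data from any of our studies.

```python
# Minimal sketch of comparing 1-7 comprehensibility ratings (ratings are
# invented for illustration, not data from any actual study).
from scipy.stats import ttest_rel

# One mean rating per participant for each sequence type
congruous   = [6.1, 5.8, 6.5, 5.9, 6.2, 6.0]
incongruous = [2.3, 3.1, 2.8, 2.5, 3.0, 2.7]

t, p = ttest_rel(congruous, incongruous)
print(f"t = {t:.2f}, p = {p:.5f}")
```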


4. Consciously explaining a relation between panels is different than an immediate, unconscious brain response.

This one is particularly important, especially for understanding experimental results like reaction times or brain responses. When you derive meaning from a relationship between panels, your brain responds in a way that attempts to integrate those meanings together. Sometimes no relation can be made, and you can measure this by comparing different types of relations to each other. This brain response is also very fast: brainwaves reveal that people recognize the difference between incongruous or out-of-place panels and congruous panels in a sequence in less than half a second.

This type of "meaning" is different from what might be consciously explained. Sure, you may be able to concoct some far-flung way in which two totally unrelated images have some relationship. However, this post hoc conscious explanation does not mean that this is the way your brain derives meaning from the relation between images, and it is far slower than that brain process.

In fact, such explanations are evidence in and of themselves that the relationship may be incongruous: if you have to do mental gymnastics to consciously explain a relation, you are compensating for the lack of an actual relationship between those images.

-----

Want more advice about how to do research on the visual language of comics? Check out this paper and these blog posts.

Wednesday, February 17, 2016

Mayan visual narratives in the BBC!

I'm very happy to say that David Robson over at the BBC has a new article out discussing Jesper Nielsen and Søren Wichmann's chapter in my new book, The Visual Narrative Reader.  Their chapter, and the BBC article, examine the structural properties of Mayan visual narratives found on the sides of pottery.

There are a lot of great things in their chapter that motivated me to invite them to be a part of the collection. Foremost, they nicely show that these Mayan systems share many properties with the "visual languages" used in comics and other cultures, ranging from the way they show sequences to the way they use text-image relationships and graphic signs like lines to show smells or speech.

In my conception of sequential image systems being like language, there is no one visual language, but rather there are many throughout the world. In addition, just as spoken languages change and die off over time, so do visual languages. The system used in the Mayan visual narratives thus reflects a “(Classic) Mayan Visual Language” tied to a particular time period and location. Similarly, we could identify historical visual languages from different time periods all over the world.

I'll point out also that this is different than saying that the Mayans used "comics." This is not the case. "Comics" are the context in which we use some visual languages in contemporary society, and casting that label back in time is inappropriate. Rather, the Mayans had a visual language that was used in its own context, tied to its era.

What makes the Mayan examples nicely illustrative is that they are an older, historical version of this that is preserved in the historical record. The visual language used in sand drawings (also discussed in two chapters of The Visual Narrative Reader) disappears once it is drawn, because of the medium of sand, while the accompanying gestures/signs and speech disappear because they are spoken in the moment. This means there is no historical record of them. But because the Mayan examples on pottery and ceramics are drawn and include writing, those artifacts can provide a window into past human behavior as a multimodal animal.

Finally, what I really liked about this article—beyond the subject matter—was the way in which that subject matter was analyzed using systematic linguistic methods. I think this nicely shows how much of what has previously been discussed in "art history" can be transported to the linguistic and cognitive sciences given the theory of visual language. If we're talking about the structure of older drawing systems, then we're not discussing "art" per se, but ancient visual languages and their structure. Further work along these lines can contribute toward building a study of historical visual linguistics, which can analyze such material the same way we would any other type of linguistic system.

Monday, February 01, 2016

New Paper: The pieces fit

Magical things happen at conferences sometimes. Back at the Cognitive Neuroscience Society conference in 2014, I ran into my graduate school friend, Carl Hagmann, who mentioned he was doing interesting work on rapid visual processing, where people are asked to detect certain objects within an image sequence that changes at really fast speeds (like 13 milliseconds). He noticed that I was doing things with image sequences too and thought we should try this rapid pace with visual narratives (similar to this old paper I blogged about).

Lo and behold, it actually happened, and now our paper is published in the journal Acta Psychologica!

Our paper examines how quickly people process visual narrative sequences by showing participants the images from comics at either 1 second or half a second each. In some sequences, we flipped the order in which images appeared. In general, "switches" across greater distances were recognized with better accuracy, and those sequences were rated as less comprehensible. Also, switches between groupings of panels were recognized better than those within groupings, again providing further evidence that visual narratives group information into constituents.

This was quite the fun project to work on, and it marks a milestone: It's the first "visual language" paper I've had published where I'm not the first author! Very happy about that, and there will be several more like it coming soon...

You can find the paper via direct link here (pdf) or on my downloadable papers page.


Abstract:

Recent research has shown that comprehension of visual narrative relies on the ordering and timing of sequential images. Here we tested if rapidly presented 6-image long visual sequences could be understood as coherent narratives. Half of the sequences were correctly ordered and half had two of the four internal panels switched. Participants reported whether the sequence was correctly ordered and rated its coherence. Accuracy in detecting a switch increased when panels were presented for 1 s rather than 0.5 s. Doubling the duration of the first panel did not affect results. When two switched panels were further apart, order was discriminated more accurately and coherence ratings were low, revealing that a strong local adjacency effect influenced order and coherence judgments. Switched panels at constituent boundaries or within constituents were most disruptive to order discrimination, indicating that the preservation of constituent structure is critical to visual narrative grammar.


Hagmann, Carl Erick, and Neil Cohn. 2016. "The pieces fit: Constituent structure and global coherence of visual narrative in RSVP." Acta Psychologica 164:157-164. doi: 10.1016/j.actapsy.2016.01.011.