Wednesday, December 19, 2012

Spider-Man's naughty adventures within or between panels

A friend of mine recently pointed me to an unusual debate raging about a recent Spider-Man comic where Peter Parker is apparently trapped within the mind of Dr. Octopus and forced to see his memories. One questionable scene occurs with Aunt May, where it is unclear whether she and Dr. Octopus kiss (and thus Peter experiences it), or whether more illicit behavior happens:



As my friend wrote to me, "Amid fan outrage, the writer, Dan Slott, actually started hitting up message boards to claim that his intent was to imply a pre-wedding kiss... most fans have assumed that much more went on. Who is right?"

The short answer is: what happens is entirely ambiguous. This is exactly the same as the example McCloud shows with the guy with the axe (right). The unseen actions must be filled in by your mind from the information you are provided. In both cases, you are given preparatory actions, and then shown something different, with only a word balloon to connect back to the preparatory action.

The key is that your mind tries to figure out what happened because of the associated word balloon and because the second panel doesn't show the action. It's worth emphasizing that, contrary to McCloud's claims, the "filling in of the information" does not happen between panels, but within that underspecified second panel. In both cases, the authors choose not to show the action.

In the Spider-Man example, the first panel is also slightly underspecified. It hints that Aunt May is about to kiss, but the action isn't defined well enough. The "No Stop!" balloon reads as anticipatory, suggesting that something is about to happen. The most direct answer would then be a kiss in the second panel, because it's set up directly in the previous panel (just like the axe sets up chopping in the second panel).

However, there is nothing to prevent other interpretations, since the first panel's "preparatory kiss" wasn't drawn that clearly, and the second panel is entirely ambiguous: all you see is a door, and all cues come from the word balloon. The question then becomes how strongly the word balloon "Ahhh!" is associated with just kissing or with something else.

(It is also worth mentioning the implied duration of an action. Kisses can be single pointed actions, or can be drawn out (as can...ahem... other actions). Showing a door gives no further information about how much time passes during the hidden event—this too is only suggested by the word balloon.)

So... from a structural perspective, there is no real "right answer." It is a (likely intentionally) suggestive and ambiguous depiction, but it nicely plays with properties of the visual language grammar to elicit varying interpretations.

For more information on these types of concerns, see my paper The limits of time and transitions (pdf here).

Thursday, December 13, 2012

Basic structures of visual language

One of the important basic tasks of doing research on the visual language used in comics is to identify the foundational components that go into our comprehension of sequential images. In Understanding Comics, McCloud implicitly broke down the medium into a few parts:

1. Graphic Style
2. Iconography and symbolism
3. Panel-to-panel relationships
4. Text-image relationships

These parts provided a nice initial foray into how the visual language of comics might be segmented. However, the crux of my research is that the structure of sequential images actually breaks down similarly to language, and can thereby be researched using similar tools. This gives us several components of the visual language of comics, many of which tie to McCloud's:

1. Graphic Structure is how we understand the visual pieces of an image. Are certain junctions of lines more appropriate for certain parts of an image? How do we understand lines and shapes? This is the equivalent of studying the sound properties of language, only here in the visual-graphic form.

2. The Lexicon is the vocabulary of systematic pieces used to create images and sequential images. These might range from the morphology of visual conventions (like motion lines) to systematic full panels (like those from Wally Wood's 22 Panels) or patterns of storytelling (like the set up-beat-punchline pattern). Basically, anything that is used as a pattern is a part of the lexicon of visual language.

2.2. Morphology is a particular part of the lexicon that deals with small components of meaning (like McCloud's iconography and symbols). However, morphology also includes the principles for how these parts combine together. Why do stars above heads mean one thing but replaced for eyes mean something else? Why can't lightbulbs also replace eyes to mean inspiration, like they do floating above heads? Why do motion lines always trail behind objects, but not in front?

3. Event Structure is how people understand the nature of events, and in sequential images we may have to rely on knowledge about parts of an event to understand the whole. If an image shows a person punching another, we infer that the puncher reached back their arm first. We also need to be able to make sense of the connections in meaning between and across panels.

4. Spatial Structure has to do with the knowledge that panels convey information about a fictitious spatial location. Each panel only frames a glimpse of this location, and our minds build the overall space. If one panel shows the exterior of a house and the next shows someone sitting at a table, how do we know that they are inside that house without overt cues? If panels in a sequence only depict individual characters, how do we know they belong to the same broader environment?

5. Narrative Structure is how we make sense of the meaning of a sequence of images—its grammar. The event or spatial structures convey the meaning of a sequence, but this meaning is guided through its presentation in a narrative structure. Why delay the climax of a sequence until after several lead-in panels? Why show a scene where each panel shows individual characters instead of all characters in just one panel? These have to do with the presentation of meaning, not just the meaning itself.

6. Navigational Structure is the system used to move through a page layout. Why do people read from left-to-right in America instead of vertically down-then-up? What happens when layouts depart from simple grids? These issues go beyond just the meaningful connections between panels and have to do with a reader's preferences for how to move from panel-to-panel on a page.

7. Multimodality is the phenomenon of getting information from different domains. In this case, we receive information from both text and image, and thus need to explore how these multiple signals cohere to form a single conception (or, in reverse for creation: how a single conception results in multiple signals).

------
These are the broad components at work in comprehending sequential images. Many questions have yet to be answered about their parts and their relationships. And, of course, we can also ask how these components might differ across cultures, how people learn these conventions, and how their understanding changes over development.

Importantly, when we look at these components through a linguistic or cognitive perspective, we can't simply think about them in terms of the components of the medium. Rather, we must think about these components in terms of what authors or readers must know in order to create/understand this visual language.

In other words, it shifts the focus to what's going on in people's minds and brains. Because of this type of shift, we can then ask how this knowledge may be similar or different from what we know about other cognitive systems, spoken and signed languages in particular.

Tuesday, December 04, 2012

Blog-iversary and updates

So, today marks 7 years that my blog has now been online, and 10.5 years for the website. How time flies!

I've received a lot of great feedback from people about my profile in Discover Magazine, so it's worth having a post here to just review what I'm working on. My original study described in the Discover article can be found here (pdf), or a short, "comic" version here (pdf). I do have a few other brainwave studies that examine sequential images (from my dissertation), though they are still being written up for publication.

All of these papers are based on a theoretical model of a "narrative grammar" I've been developing for the past 12 years. The seeds of that theory appear in my book, Early Writings on Visual Language, though the approach in there has been far surpassed by my recent work. A concise version of this narrative grammar is set to be published by Cognitive Science soon.

In addition to that, I will have a new book out next year published by Bloomsbury called The Visual Language of Comics: Introduction to the Structure and Cognition of Sequential Images. It will lay out the basics of my full theory of visual language, as well as summarize the experimental work done in psychology about how sequential images are comprehended. I'm very excited about the book, since it finally lays out the broader picture of my theories, and should provide a solid foothold for people who are interested in this research. I believe we're looking at a Fall release, so stay tuned for more updates as it gets closer...

Finally, I'm working on new research as part of my postdoctoral fellowship here at the Center for Research in Language at UC San Diego. We're currently designing a new brainwave study that looks at the intersection of how people make predictions and "fill in" missing information in a visual sequence.

These all look to be the foundations of a growing field, so I hope you stick around to see how things look once we really pick up steam.

Wednesday, November 28, 2012

Discover Magazine article

I'm proud to say that I'm featured in an article in this month's Discover Magazine! The article was written by the excellent Carl Zimmer, who good-naturedly let me run him through my experiment for the sake of the article. I'd actually read articles by Carl for many years, so it was fun to interact with him for the interview.

I should note a slight correction to the reported results of my study. While the difference between the Scrambled and Narrative Structure Only sequences did show a "left anterior negativity" (correlated with syntax), the difference between the amplitudes of those sequences and normal ones showed a different waveform, called the N400 (correlated with semantics). So…

Normal vs. Structural Only = N400
Normal vs. Scrambled = N400
Structural Only vs. Scrambled = Left Anterior Negativity

You can read the original article here (pdf), or a short, "comic" version here (pdf).

Overall though, Carl did a great job describing my study and this type of research. I'm very humbled to receive the attention. Go read!

Tuesday, November 20, 2012

Sequence vs. Singles in "visual language"

In my last post, I addressed the basic idea for a "visual language" as being a sequence of meaningful images guided by a system of constraints (i.e., a grammar). In the comments, I was asked a very good question:

Why is a sequence necessary for the graphic form to be considered "language"?

There are two main reasons for this, both of which relate to the analogy with verbal and sign languages. As I said in that post, my notion of "language" in "visual language" is not metaphorical, but rather based on conceptions from the linguistic sciences.

The first reason is structural. Languages are made up of three primary components:

1. The conceptual structure of meaning in the mind
2. A sensory modality they can be expressed in (i.e., sound, body motions, graphics)
3. A "grammar" that guides and constrains the sequential expressions of meaning

In the verbal form, the main grammar is syntactic structure, which allows us to sequentially order words and other expressions into coherent sentences. However, technically all of these components (meaning, modality, grammar) are built of rule-systems that constrain them. The phonological system that guides our production of sounds also is constrained by rules. This is why English cannot have words that start with the sound combination "tf" or why the "c" in elastic goes from a "k" sound to an "s" sound in elasticity. These are rules guiding the modality itself.

The analogy for the graphic form holds these same functions. The primary "grammar" guiding images is a "narrative grammar" which guides the presentation of meaning in coherent sequences. Both syntax and narrative function in the same general way: to present meaning in a coherent sequence. They also share methods of doing this, such as chunking units into groupings, making connections between distant units, and embedding groupings inside each other.

However, single images also have a constraining system which is analogous to phonology. You could call it "photology" or "graphology" perhaps. This system similarly constrains the modality itself. This is why certain junctions of lines are awkward, like when you want to show occlusion (one thing in front of another), but instead of using a "T" shaped junction of lines, you use a "Y" or "+" shaped junction.

So, structurally, single images are guided by a rule system, but that system is closer to that of phonology than syntax.

The second reason for sequence being important comes from analogies with development.

By and large, when people are not exposed to a language within the right time period of life, they won't learn language. They seem to be able to still acquire a limited set of vocabulary (i.e., words) but the most problematic component is the syntax.

Even when people are able to learn a spoken language, but never learn sign language, they can still all gesture. The manual modality doesn't disappear as a way to create meaning—it just functions using single expressions without a grammatical sequence.

This same trend is true of drawing and sequential images. Most people cannot draw a coherent narrative sequence. However, they can all use the drawing system ("photology") to create meaningful single images (albeit rudimentary ones if they haven't fluently developed the vocabulary of the drawing system either).

So, the analogy then holds that single images are to visual language what gestures are to sign languages. One type uses a modality for single novel expressions (single images/gestures), while the other uses complete grammars in sequences of expressions (sign language/visual language). The evidence is that even a rudimentary form of the simpler expressions (single images/gestures) is maintained even when the full grammar is not developed.

Incidentally, quite a lot of my discussion about the structure of single images and the development of the drawing system is available in my recently published paper, "Explaining 'I can't draw'" available as a pdf here.

Tuesday, November 13, 2012

Revisiting "visual language"

I've now had this website for over 10 years, and have been blogging for almost 6 years, so it may be worth revisiting the fundamental ideas of my research over the next few posts. Hopefully by this time next year my book, The Visual Language of Comics, will be out and describing these ideas in even more detail. Until then...

Let's start with the obvious: What is "visual language"?

There are several ways that the term "visual language" can be used. Sometimes it is used to talk about general visual information or visual culture. It might be used as a broad term for any combination of text and images. Some people use it to describe creative ways to use writing in pictorial ways.

None of these are what I mean by "visual language."

This isn't necessarily a bad thing. These and other applications of the term use "language" in a very metaphorical sense, usually by extension to mean "communication."

My meaning of "language" is actually very literal, based on the scientific definitions of language. By extension, my definition of "visual language" is also very specific.

So, what do I mean by "visual language"?

Human beings as a species can only convey our thoughts in three ways: we can 1) create sounds with our mouths, 2) move our bodies (especially hands and faces), and 3) draw things. That's it.

When any of these channels is put into a sequence, such that some sequences are good and others are bad, then the result is a "language." Thus, sequential sounds (words) become spoken languages, sequential body movements become sign languages (as opposed to gestures), and sequential images literally become visual languages.

Given this, individual images are similar to single expressions (which have their own rich structure), while sequential images form a visual language.

So, what is writing? Writing is the learned importation of the spoken form into the visual form (essentially a learned synesthesia). This is not natural, which is why it's so hard to learn to read and write, and why most of the world's languages use no writing systems.

By contrast, the ability to draw sequential images is a natural ability that is accessible to anyone who receives the proper exposure and practice at it.

Given all that, now what about "comics"? Well, comics are the place that we predominantly find these visual languages used. Just like novels are written in English, comics in America are written in American Visual Language. Or, manga in Japan are written in Japanese Visual Language.

And, of course, comics are not just written in the visual language of sequential images, they also use written language. So, technically, comics use two languages that combine to make a larger whole of communication. This is actually similar to the way we communicate generally. We constantly combine modalities: we gesture when we speak, text and image often come together, etc.

From this basic idea, that sequential images literally create a natural visual modality of language, innumerable other questions emerge about the nature of graphic communication, its cognition, and how it can be used in society.

Tuesday, October 23, 2012

New Article: Explaining "I can't draw"

I'm happy to say that I have a new article published in the journal Human Development that argues that learning how to draw is similar to learning how to speak.

I've always thought that this was among my best ideas, and apparently the journal agreed: they thought it was provocative enough that they invited two additional scholars to comment on my paper. From one of the reviews:

"Cohn’s paper can be viewed not just as an account of the development of drawing but also as representing a paradigmatic shift in the way we conceptualize the role of nature and nurture in development."

Here's the abstract:
Both drawing and language are fundamental and unique to humans as a species. Just as language is a representational system that uses systematic sounds (or manual/bodily signs) to express concepts, drawing is a means of graphically expressing concepts. Yet, unlike language, we consider it normal for people not to learn to draw, and consider those who do to be exceptional. Why do we consider drawing to be so different from language? This paper argues that the structure and development of drawing is indeed analogous to that of language. Because drawings express concepts in the visual-graphic modality using patterned schemas stored in a graphic lexicon that combine using ‘syntactic’ rules, development thus requires acquiring a vocabulary of these schemas from the environment. Without sufficient practice and exposure to an external system, a basic system persists despite arguably impoverished developmental conditions. Such a drawing system is parallel to the resilient systems of language that appear when children are not exposed to a linguistic system within a critical developmental period. Overall, this approach draws equivalence between drawing and the cognitive attributes of other domains of human expression.
The article is available directly here.

Thursday, October 11, 2012

The Graphic Canon Vol. 2

On the non-theory front, I'm happy to announce that I have a piece in the recently released second volume of The Graphic Canon edited by Russ Kick. It's a collection of great literature, in this case from the 1800s, and adapted into graphic form by various authors. The book is beautiful inside and out.

My contribution is the second of my two versions of John Keats' "La Belle Dame Sans Merci." I'd first read the poem in high school and decided to draw a version during my first semester in college. Shortly after, I realized a second interpretation could be done of it, perhaps closer to the way most people interpret the poem, so I drew a second version that used the same layout and composition as the first (along with many of the same full pages).

I'm very proud to be able to contribute to this collection, and am glad it's gotten so much attention. My piece pales in comparison to many of the others, so it's definitely worth checking out!

Monday, October 01, 2012

Review: How fast can you comprehend comic panels?

In this study, the authors wanted to know how much time it takes to comprehend each image of a sequence of images: both how long each panel stays on the screen, and how long the interval is between panels (the "interstimulus interval," or "ISI"). They compared normal four-panel strips with sequences where the third panel was swapped with an adjacent panel (1-2-4-3 or 1-3-2-4).

In the first experiment, they varied the length of time that each panel stayed on screen, 83 milliseconds (ms) versus 150ms, with a constant ISI of 300ms between exposures. They found that accuracy in judging whether the sequence was in correct or incorrect order varied with exposure duration. Panels shown for 83ms were correctly judged only 24% of the time, while those shown for 150ms were correctly judged 71% of the time. They conclude that 150ms is the minimum exposure time necessary.

Experiment 2 varied ISI—the time between each panel—keeping each panel exposed on screen for 150ms. They found that accuracy increased as ISI increased. At 133ms, accuracy reached around 70%, staying constant through 217ms and 300ms. They thus conclude that an ISI of more than 130ms is necessary. 

I actually find these numbers to be blazing fast. In my experiments, we used a consistent ISI of 300ms to avoid the effect of panels seeming like they turned into a flipbook-style animation. In self-paced reading, the speed of processing panels depended on both the complexity of the panel and its context in the sequence, but people would often average between 700ms and 1 second per panel. In our measure of brainwaves—which are even more sensitive to the timing of the brain's comprehension—we don't see a response to recognizing that something is awry in meaningful information until around 200ms to 250ms at the very soonest.

Thus, I find it highly surprising that an exposure time of 150ms and an ISI of 130ms would be sufficient to produce responses as accurate as theirs. I would think these numbers represent the absolute minimum amount of time necessary, and that they would grow larger if the panels were more complex (their stimuli looked even simpler than the Peanuts panels we use in our experiments).


Inui, Toshio, & Miyamoto, Kensaku (1981). The time needed to judge the order of a meaningful string of pictures. Journal of Experimental Psychology: Human Learning and Memory, 7(5), 393-396. DOI: 10.1037//0278-7393.7.5.393

Monday, September 24, 2012

New article: Framing attention in Japanese and American comics

I'm pleased to say that I now have a new article out about the cross-cultural differences between American comics—both Mainstream and Indy comics—and Japanese manga:

"Framing attention in Japanese and American comics: Cross-cultural differences in attentional structure."

I'm particularly excited about this paper, because my co-authors are two former undergraduates at Tufts University—Amaro Taylor-Weiner and Suzi Grossman—who worked very hard and took this project on.

This paper shows further evidence that the panels in Japanese manga structure space differently than the panels in American comics, regardless of genre. We argue that these patterns connect to deeper differences in cognition that have been found between Americans and Asians (here, Japanese specifically).

Download the new article (pdf)

Download my previous article on cross-cultural differences (pdf)

Full abstract:
Research on visual attention has shown that Americans tend to focus more on focal objects of a scene while Asians attend to the surrounding environment. The panels of comic books— the narrative frames in sequential images—highlight aspects of a scene comparably to how attention becomes focused on parts of a spatial array. Thus, we compared panels from American and Japanese comics to explore cross-cultural cognition beyond behavioral experimentation by looking at the expressive mediums produced by individuals from these cultures. This study compared the panels of two genres of American comics (Independent and Mainstream comics) with mainstream Japanese “manga” to examine how different cultures and genres direct attention through the framing of figures and scenes in comic panels. Both genres of American comics focused on whole scenes as much as individual characters, while Japanese manga individuated characters and parts of scenes. We argue that this framing of space from American and Japanese comic books simulate a viewer’s integration of a visual scene, and is consistent with the research showing cross-cultural differences in the direction of attention.

Cohn, Neil, Taylor-Weiner, Amaro, & Grossman, Suzanne (2012). Framing attention in Japanese and American comics: Cross-cultural differences in attentional structure. Frontiers in Psychology - Cultural Psychology, 3, 1-12. DOI: 10.3389/fpsyg.2012.00349

Monday, September 17, 2012

Prehistoric animation?

One of my research assistants from Tufts, Patrick Bender, sent along this link to an article that claims prehistoric cave paintings from 30,000 years ago actually featured animated figures. The recently published research discusses two types of "animation" found in cave paintings.

First, they claim that the presence of multiple limbs, heads, etc. in cave paintings gave the sense of animated movement when illuminated by the flickering of torchlight. In Reinventing Comics, Scott McCloud argued that this multiplicity in cave paintings implied motion as well, though he didn't jump all the way to claiming it was animation (as in the video below).




I definitely believe that this multiplicity would have implied motion, but the real question is whether it would be fully "animated" by the flicker of torchlight, as they seem to argue. I could understand how torchlight would give the sense of a strobe light, allowing a person to shift attention to different parts of the image, thereby giving the sense of motion.

Nevertheless, I would like to see a video that at least simulates this to really believe it fully. The movie above that makes the case for this nicely shows how the pieces of the pictures could create this effect. However, the movie deletes portions of the image at every step in the animation. A horse with multiple heads on the wall wouldn't do this. It would simply have all the heads all the time, and a viewer would have to try to shift their focus to the other parts throughout the flickering of light.

That's not to say that the animation effect can't happen under these conditions. I'd just like to see a "torchlight" demonstration before I believe it fully.

Their other example, I feel, is much more compelling. They claim to have evidence of early "thaumatropes" made of bone: small discs that show figures on either side. When spun on a string, they create the illusion of motion (like this animated gif). Toys like these were popular in the Victorian era, but they claim these prehistoric individuals came up with them thousands of years prior.

These seem a lot more probable as early animation, especially given their evidence that the bone discs had figures on both sides and holes in the middle where string could have been placed. The fact that these more convincing devices supposedly accompanied the cave walls perhaps gives more validity to the animation on the wall paintings as well.

At the very least, I take these examples to be far more credible than other examples of ancient animation that have been reported over the years.

Monday, September 10, 2012

Website down...

You've probably noticed the frequency of my posts increasing. Now that I've finished grad school, driven across America, and flown back from Japan, I'm settling into my postdoctoral fellowship at UCSD's Center for Research in Language, where I will continue my work looking at the cognition of sequential image comprehension.

For the blog, I'm hoping to keep up a pace of about a post a week. In the weeks to come, I expect to have a couple of new articles posted, some possible press, as well as a "return to basics" for a few posts. I'll be recapping some of the basic principles and organization underlying this research.

Thursday, September 06, 2012

Lecture on comics and psychology

Here's a lecture by psychologist Barbara Tversky about the understanding of events, spatial cognition, and comics. It's a bit long (an hour) but worth watching. (The parts explicitly about comics start around 25:20. Here's an alternate link where you can jump directly to certain slides.)

I've known Barbara and her student John Bresman for many years so it was fun to see. She covers a lot of ground in terms of general cognitive principles at work in the comprehension of sequential images, particularly the more poetic and creative uses of sequential images. I've talked about several of these principles under different terms on this blog and in my papers, and similar lists have been made in various scattered articles for years.

The part of the talk I found most interesting was towards the end, where she describes a recent experiment looking at the depiction of action in panels from comics around the world. On the whole they found that comics from China and America had more action than those from Japan or Italy. The comparison here is that Chinese and English are both languages that can combine the description of an action and the manner by which it happens into one single verb, while Japanese and Italian do not combine these into one verb (i.e. 'to swagger' vs. 'to walk in a swaggering way').

I've wanted to do a study on this comparison for a while, so I was glad to see the data. It is part of a broader interest that I have in the relationship between thought, verbal language, and visual language.

Thursday, August 30, 2012

Eye-movements for comic panels vs. photos

In the recent article, "Inferring Artistic Intention in Comic Art through Viewer Gaze," the authors examined whether people's eyes are more directed to parts of comic panels than they are when looking at other types of visual phenomena (particularly photos). The aim of the study was to investigate the claim among comic artists "that the artist is able to purposefully direct the visual attention of readers through the pictorial narrative."

The authors used an eyetracking device to measure where people's eyes look when they are reading individual comic panels (as opposed to across a whole page), as well as photographs taken by experts, amateur photographers, and a robot.

They found that participants had far more directed and consistent eye movements towards specific portions of comic panels than were found for photographs, where gaze was far more general. They suggest that these findings show that comic panels direct the flow of attention of their readers. 

I am not sure these findings fully support their goal of testing whether "artists purposefully direct the visual attention of readers through the pictorial narrative." This is a fairly vague hypothesis (direct visual attention to what? To the whole image? What does that mean?). Full evidence for this hypothesis would require comparing eye movements across a larger page layout with those within individual panels, assuming that this is what they mean by directing a reader's attention through the narrative.

What's appealing about these data, though, is the idea that panels—being created to appear in sequence—focus a reader's attention on specific parts of panels over others. This is an important finding, and it invites follow-up experiments that might better explore which portions of panels are or are not important for the comprehension of a sequence.

However, there are some limitations of their design that may make their overall conclusions a bit premature. My fear is that the compared stimuli are not equivalent or counterbalanced appropriately. For example, the comic panels come from Watchmen and an Iron Man book, which likely depict figures and actions. The amateur photographs, by contrast, covered a variety of topics, both figures and places (online photo albums), while the robot's photos were almost entirely static environmental shots of a building's interior. The subject matter across these stimuli is simply not equivalent.

More equivalent stimuli might be able to ask: Would photo versions of panels (as in a photo novella) elicit the same types of eye movements as drawn panels? What if the photos also showed figures engaged in actions instead of places? How do eye movements on comic panels differ from those on other artwork or film shots (where all are designed, but only comics and film are intentionally sequential)?

These would be more equivalent comparisons; otherwise it seems like comparing apples and oranges, since the stimuli are totally different in nature to begin with. The more important comparison isn't comic panels vs. photos per se—it has to bear in mind the content of those images.

Full Abstract:
Comics are a compelling, though complex, visual storytelling medium. Researchers are interested in the process of comic art creation to be able to automatically tell new stories, and also, summarize videos and catalog large collections of photographs for example. A primary organizing principle used by artists to lay out the components of comic art (panels, word bubbles, objects inside each panel) is to lead the viewer's attention along a deliberate visual route that reveals the narrative. If artists are successful in leading viewer attention, then their intended visual route would be accessible through recorded viewer attention, i.e., eyetracking data. In this paper, we conduct an experiment to verify if artists are successful in their goal of leading viewer gaze. We eyetrack viewers on images taken from comic books, as well as photographs taken by experts, amateur photographers and a robot. Our data analyses show that there is increased consistency in viewer gaze for comic pictures versus photographs taken by a robot and by amateur photographers, thus confirming that comic artists do indeed direct the flow of viewer attention.


Jain, E., Sheikh, Y., & Hodgins, J. (2012). Inferring artistic intention in comic art through viewer gaze. SAP '12: Proceedings of the ACM Symposium on Applied Perception, 55-62. DOI: 10.1145/2338676.2338688

Tuesday, August 07, 2012

Manga research meeting

I had a great time yesterday meeting with several manga researchers in Japan at Chiba University. I got to present my work to them, and then discussed several projects by other people.

For example, my host, Jun Nakazawa from Chiba University, presented a new study he did in collaboration with an American researcher. In this study, Nakazawa-sensei showed that Japanese participants have better comprehension of sequential images—comic strips, Western comics, and Japanese manga alike—than American participants. This continues his work looking at how different populations of people (different ages, levels of expertise, etc.) comprehend sequential images, which I have previously discussed.

Another study was presented by a graduate student, Hiromasa Hayashi from the University of Tokyo, whom I met at the Cognitive Science Society conference last week. His study examined how the length and number of motion lines affect the perceived speed of a moving object. The image of a ball appeared to move across a screen, then disappeared, and participants pressed a button when they thought it had reached another location on the screen. Reaction times were faster for objects with longer lines than with shorter lines, and for those with more lines (5 or 8) than with fewer lines (1 line), implying that participants viewed those balls as moving faster.
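To make the predicted pattern concrete, here is a minimal sketch with invented reaction times (the study's actual data are not reproduced here): faster button presses for balls drawn with longer or more numerous motion lines imply those balls were perceived as moving faster.

```python
from statistics import mean

# Hypothetical reaction times (ms) per motion-line condition; the numbers are
# invented purely to illustrate the reported ordering, not the real results.
rts = {
    "short_lines": [512, 498, 530],
    "long_lines":  [455, 462, 448],
    "1_line":      [520, 505, 515],
    "5_lines":     [470, 468, 475],
    "8_lines":     [450, 458, 446],
}
means = {cond: mean(vals) for cond, vals in rts.items()}

# Faster (smaller) mean RTs = the ball was judged to reach the target sooner,
# i.e., it was perceived as moving faster.
assert means["long_lines"] < means["short_lines"]
assert means["8_lines"] < means["5_lines"] < means["1_line"]
```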

After our meeting, we went to dinner with Fusanosuke Natsume (blog), who is a former manga artist and well known "comic theorist" in Japan. Natsume-sensei is a wealth of interesting ideas and stories about manga, and is also a martial artist to boot! On Friday, I'll be going to a manga museum with him and Nakazawa-sensei, which should be great fun.

Wednesday, August 01, 2012

Back to the land of manga

It's been 9 years since I was last here, but I just arrived in Japan for a few weeks. I'm currently in Sapporo, on the island of Hokkaido, where I'm attending the conference of the Cognitive Science Society. I'll be speaking on three separate panels about the cognitive implications of differences between American comics and Japanese manga, about the grammar of sequential images, and about evidence from the brain that we organize sequential images into chunks (as opposed to linear transitions).

Later on in my trip, I'll be heading down to Tokyo, where I'll be meeting several Japanese manga researchers for the first time. I'm looking forward to connecting with these international researchers!

Saturday, June 16, 2012

10 years!

It seems I'm always late with these notices, but as of May 31st, emaki.net has been on the Internet for 10 years! I have greatly enjoyed having this outlet to connect my work with everyone, as well as to interact with people over the Internet. I look forward to great things to come over the next 10 years. Thanks!

Friday, June 08, 2012

Review: Emotion in manga through loss of hands

This paper continues Charles Forceville's research paradigm of looking at the metaphorical underpinnings of meaning in comics. Here, he and Michael Abbott look at a particular phenomenon in some manga where a character's hands turn into stumps, as a visual metaphor for "loss of control" with regard to the body or emotions, which occurs repeatedly in the manga Azumanga Daioh.

They explain that this iconic convention is a conceptual metaphor—where one domain of concepts is mapped onto another. In particular, metaphors aid in understanding abstract concepts in terms of more concrete ones. In this case, the metaphor is "Loss of Control is Loss of Hands": when characters lose control emotionally, their hands disappear. Notably, this technique is quite subtle—it is not the primary focus of the images—and it occurs frequently throughout the book they analyze.

They also mention an interesting corollary: other characters undergo additional changes to their bodies. One character's feet turn into stumps ("foot loss") in a case of extreme happiness, while another shrinks to the size of a child without facial features when humiliated. They speculate that an overarching metaphor of "Emotion is Bodily Change" might cover all of these specific cases.

I find these changes interesting from a structural point of view as well. Many abstract signs in the visual language of comics use "substitution" or "suppletion" as a strategy of conveying signs. For example, hearts substitute for eyes to show love, or dollar signs substitute for eyes to show desire for money. In this case, the substitution is a deletion—getting rid of a body part. This makes me wonder if there are other abstract signs in manga that replace the hands, or to what extent other body parts may be deleted.

Official abstract:
Comics and manga have many ways to convey the expression of emotion, ranging from exaggerated facial expressions and hand/arm positions to the squiggles around body parts that Kennedy (1982) calls ‘pictorial runes’. According to Ekman at least some emotions – happiness, surprise, fear, sadness, anger, disgust – are universal, but this is not necessarily the case for their expression in comics and manga. While many of the iconic markers and pictorial runes that Forceville (2005) charted in an Asterix album to indicate that a character is angry occur also in Japanese manga, Shinohara and Matsunaka also found markers and runes that appear to be typical for manga. In this article we examine an unusual signal conveying that a character is emotionally affected in Volume 4 of Kiyohiko Azuma’s Azumanga Daioh: the ‘loss of hands’. Our findings (1) show how non-facial information helps express emotion in manga; (2) demonstrate how hand loss contributes to the characterization of Azuma’s heroines; (3) support the theorization of emotion in Conceptual Metaphor Theory.

Abbott, M., & Forceville, C. (2011). Visual representation of emotion in manga: Loss of control is loss of hands in Azumanga Daioh Volume 4. Language and Literature, 20(2), 91-112. DOI: 10.1177/0963947011402182

Tuesday, May 29, 2012

New article: Comics, linguistics, and visual language

I recently had a new paper published in an exciting book collection, Linguistics and the Study of Comics, edited by Frank Bramlett. The collection looks at various facets of comics under the lens of linguistics, ranging from the structural and cognitive to the socio-cultural.


My own chapter reviews the diverse previous research that looks at sequential images using linguistic methods. As I demonstrate, various approaches from linguistics have looked at comics, including structuralism, generative linguistics, cognitive linguistics, and others. These studies range from full theories of the structure of layout, meaning, and graphic signs, all the way to specific approaches to metaphor in comics. 


Throughout, I argue that the notion of "comics" is separate from the "visual language" that they are written in. I then outline how the structure of this visual language is analogous to spoken and signed languages, and I describe how it can be studied using the same questions that guide the study of those linguistic systems. 


In many ways, this chapter is a precursor to my book that will be published next Fall, and I recommend it for anyone seriously interested in the underlying theory behind sequential image structure.


Go check it out!


Cohn, Neil. 2012. Comics, linguistics, and visual language: The past and future of a field. In Bramlett, Frank (ed). Linguistics and the Study of Comics. New York: Palgrave Macmillan.

Monday, May 21, 2012

More changes

As of yesterday I fully received my PhD (and the fancy hood!) from Tufts University. It was a long and great journey, but now it's on to the next venture. As I mentioned in my last post, I will be spending next year working on an introductory book on visual language theory that is due for Fall of 2013. I've been working hard on it lately and am very excited about it.

I will also be joining the Center for Research in Language at UC San Diego as a post-doctoral fellow starting in September. I'll be further investigating the neurocognition of sequential image processing (i.e., what happens in the brain when people are reading comics). I also plan to start learning techniques for measuring eye-movements, so we can begin to examine what people are looking at in comic panels (and how that relates to what goes on in the brain). I'm very excited for this opportunity and the potential for enlightening new collaborations and research.

In the meantime, hopefully I can start posting on the blog more often. Hopefully...

Sunday, May 06, 2012

A User's Guide to Thought and Meaning

Lo and behold, I now have a new book out! My mentor, Ray Jackendoff, has a new book, A User's Guide to Thought and Meaning, and it is chock full of illustrations by me! (...along with some choice Zippy strips by Bill Griffith)

He has been working on this book throughout my time as his student, and I think the result is truly excellent. If you're looking for a good book about language, meaning, thought, and their relations, this is a good, non-technical read. I can't recommend it enough, and not just because my name is on the cover page. Check it out!

From the publisher's description:
A User's Guide to Thought and Meaning presents a profound and arresting integration of the faculties of the mind - of how we think, speak, and see the world. Ray Jackendoff starts out by looking at languages and what the meanings of words and sentences actually do. He shows that meanings are more adaptive and complicated than they're commonly given credit for, and he is led to some basic questions: How do we perceive and act in the world? How do we talk about it? And how can the collection of neurons in the brain give rise to conscious experience? As it turns out, the organization of language, thought, and perception does not look much like the way we experience things, and only a small part of what the brain does is conscious. Jackendoff concludes that thought and meaning must be almost completely unconscious. What we experience as rational conscious thought - which we prize as setting us apart from the animals - in fact rides on a foundation of unconscious intuition. Rationality amounts to intuition enhanced by language. Written with an informality that belies both the originality of its insights and the radical nature of its conclusions, A User's Guide to Thought and Meaning is the author's most important book since the groundbreaking Foundations of Language in 2002.

Monday, April 16, 2012

Review: "Language of Comics" in Art of Comics

The recent compilation The Art of Comics, edited by Aaron Meskin and Roy T. Cook, contains several new articles taking a "philosophical approach" to comic theory. The book tackles many interesting and pertinent topics about comics, with varying degrees of effectiveness.

Here I'm going to focus on Darren Hudson Hicks' article, "The Language of Comics," since it is especially pertinent to the topics of my research and this blog. Hicks explores the claim that "comics constitute a language" and analyzes it in light of Currie's arguments that film cannot be a language. There are many problems with this article, and I'll focus on some of the largest. In the philosophical tradition, then, I don't feel I'd do it any service by pulling my punches, so here goes...

In much of this article, Hicks defends the similarities between comics and natural languages and tries to point out where Currie's arguments fall flat. I should say upfront that I support this aim, since my own work has to respond to the same questions. To be complimentary upfront: I do agree with many of his points and think he is overall on the right track with many of his arguments. The biggest problem is his failure to know (or at least cite) the appropriate literature.

For example, Currie argues that film cannot be considered a language because its signs (shots) are not arbitrary symbols. Hicks then defends comics as having a great many conventional signs, while pointing out that language is also not entirely arbitrary. His conclusion is that symbolic and iconic signs actually lie on a continuum rather than forming discrete categories.

I agree with Hicks' position overall here and with his analysis (neither all words nor all images are solely iconic or symbolic). My problem is that Hicks' conclusion misses the mark out of unfamiliarity with previous research. He references C.S. Peirce's well-known division between iconic, indexical, and symbolic types of reference in discussing the difference (and overlap) between images and words. Yet he does not acknowledge that, within Peirce's system, conventionality is not solely associated with symbols. Peirce recognizes that all three types of reference can be conventional, but only symbols derive their meaning solely from conventionality (i.e., Peirce would readily say that smiley faces are conventional icons, just as words are most often conventional symbols). Had Hicks known this distinction, the idea of a continuum between iconic and symbolic reference would not be needed. This reveals a lack of actually knowing the literature being cited.

Another grating section is where Hicks discusses the relationships between panels on a page. Here he is trying to argue that individual panels on a page interact greatly with other panels. His point is well made, but in his discussion he resorts to reporting where "the eye" moves while reading the example comic page reprinted from Xenozoic Tales. As a cognitive psychologist, this grates on my nerves, because if eye-tracking experiments have taught us anything, it's that we often do not consciously know where our eyes are looking. To me, this renders his whole description a bit vacuous, because it rests on the faulty premise that he actually knows where his eyes are looking (which he doesn't: my own first reading of the page completely missed details he claims are "visually prominent" and that my eye should "gravitate towards," showing outright that his analysis might be wrong).

Overall though, the discussion of the relationship between comics and language is framed in the wrong way. Because he mostly adheres to the McCloudian conception that "comics ARE sequential images (± text)", he then must deal with the issue of whether "comics ARE language."

This is the wrong comparison, though it is frequently made. As I have discussed at length for the past 10 years (here, here, here, in talks, and many Comixpedia articles and blog posts), "comics" are not a language. Rather, "comics" are a cultural context in which a visual language of sequential images is used, where it often combines with text. Just as novels are written in English, comics are written in a visual language (plus maybe also a written language). Dylan Horrocks hinted towards a similar breakdown in his essay, "Inventing Comics."

If Hicks recognized this disparity, many of his arguments would be simplified, and he would not have to deal with the issue of "defining comics" in relation to language. He also would not have to deal with the sticky issue of text-image relationships: if comics ARE a language, what does it mean for a language to enclose another language? This whole issue is rendered moot if "comics" is not argued to be a language, but "sequential images" are a visual language that combines with written language in the socio-cultural object of "comics." (Another pet peeve here: he unnecessarily appeals to the brain processing text and images differently. Not only is mentioning the brain superfluous, but his citation for this is over 40 years old. Again... lit review?)

Not recognizing this argument for separating "visual language of sequential images" and "comics" again shows a lack of reading previous literature. In this case, it's a little personal, because it relates directly to my own work. Despite my work probably being the most vocal advocacy of the relationship between language and sequential images over the past 10 years, nowhere is my work mentioned or cited (though an actual comic of mine is featured and cited in a different essay in the book).

If this omission was on purpose (which I doubt), it raises the question of why. If it was not on purpose (as I suspect), it betrays a lack of basic research on this topic. This is my larger issue with the article. It's not so much that my ego is bruised (Horrocks should be mentioned, as should Mario Saraceni's dissertation, and others), but leaving this out seems an oversight in doing the appropriate background research for a paper topic that could greatly benefit from this point of view. (The book's editors should also have given feedback on this, especially those I've corresponded with.)

Knowing my work would also be useful for his concluding paragraphs. Here, he dismisses the idea of sequential images (re: "comics") being a type of full natural language because he cannot conceive of a "syntax" for sequential images. In fact, my book Early Writings on Visual Language laid out my first model of a generative "syntax" for sequential images all the way back in 2003, and my recent research has provided empirical evidence for the psychological validity of a "grammar" for sequential images. Granted, Hicks does mention two other approaches to "syntax," by Saraceni and Groensteen. But these approaches receive little attention outside an endnote and are not discussed in depth. One would think such an important topic would receive more than a passing dismissal as a concept too "difficult to wrap one's head around" in the concluding paragraph.

Essentially, then, this leaves me with the impression that Hicks is saying, "This idea is beyond me, so I can't address it well enough, and/or it must not actually exist." This does not seem like the lasting impression one wants to leave with an essay in a book collection purporting to be a solid foundation for a "philosophical" approach to comics.

Tuesday, April 10, 2012

Looking forward and backward

So, after 12 years of research and 6 years of grad school, I'm proud to say that I successfully defended my dissertation yesterday and now have a PhD in Psychology, studying how people's minds and brains understand comics. It's been a wild ride!

What's next then? Well, I have several papers that are set to come out soon or will soon be submitted to journals. Also, this seems as good a place as any to announce that I will have a new book coming out in late 2013 from Continuum Books. It will introduce visual language theory and outline the basic structure and cognition of visual narratives. I'm very excited about it, and will post updates periodically as it approaches.

Thanks, dear readers, for your continued support of this work, and I look forward to big things to come.

Friday, April 06, 2012

Public Defense

For anyone in the Boston area and interested in hearing about comics and the brain, I'm defending my dissertation on Monday, and it's open to the public. I'll be speaking about my research on the "grammar" of sequential images, as found in comics:

"Structure, Meaning, and Constituency in Visual Narrative Comprehension"
Monday, April 9th at 4:30pm
Kreplick Conference room on the first floor of the Tufts Psychology Building (490 Boston Ave, Medford 02155)

Sunday, March 04, 2012

New Article: Comics and the brain

I have a new paper available online (pdf), and I'm proud to say that this is my first brainwave study on comics. In this paper, now published in Cognitive Psychology, we argue that sequential images use a narrative "grammar" to distinguish coherent narrative sequences from random strings of images. We conducted two experiments measuring reaction times and brainwaves to examine the contributions of narrative structure and meaning to processing sequential images. Our findings provide evidence that sequential image comprehension uses a narrative structure that goes beyond "transitions" between panels.

Below is the abstract, though here's a pdf of a "graphic" version of the abstract...
Just as syntax differentiates coherent sentences from scrambled word strings, the comprehension of sequential images must also use a cognitive system to distinguish coherent narrative sequences from random strings of images. We conducted experiments analogous to two classic studies of language processing to examine the contributions of narrative structure and semantic relatedness to processing sequential images. We compared four types of comic strips: (1) Normal sequences with both structure and meaning, (2) Semantic Only sequences (in which the panels were related to a common semantic theme, but had no narrative structure), (3) Structural Only sequences (narrative structure but no semantic relatedness), and (4) Scrambled sequences of randomly-ordered panels. In Experiment 1, participants monitored for target panels in sequences presented panel-by-panel. Reaction times were slowest to panels in Scrambled sequences, intermediate in both Structural Only and Semantic Only sequences, and fastest in Normal sequences. This suggests that both semantic relatedness and narrative structure offer advantages to processing. Experiment 2 measured ERPs to all panels across the whole sequence. The N300/N400 was largest to panels in both the Scrambled and Structural Only sequences, intermediate in Semantic Only sequences and smallest in the Normal sequences. This implies that a combination of narrative structure and semantic relatedness can facilitate semantic processing of upcoming panels (as reflected by the N300/N400). Also, panels in the Scrambled sequences evoked a larger left-lateralized anterior negativity than panels in the Structural Only sequences. This localized effect was distinct from the N300/N400, and appeared despite the fact that these two sequence types were matched on local semantic relatedness between individual panels. These findings suggest that sequential image comprehension uses a narrative structure that may be independent of semantic relatedness. 
Altogether, we argue that the comprehension of visual narrative is guided by an interaction between structure and meaning.

Cohn, N., Paczynski, M., Jackendoff, R., Holcomb, P., & Kuperberg, G. (2012). (Pea)nuts and bolts of visual narrative: Structure and meaning in sequential image comprehension. Cognitive Psychology, 65(1), 1-38. DOI: 10.1016/j.cogpsych.2012.01.003
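For readers parsing the study's design, the four comic-strip types form a 2×2 crossing of two binary factors: narrative structure (present/absent) × semantic relatedness (present/absent). The sketch below uses invented mean reaction times chosen only to mirror the reported ordering in Experiment 1 (Scrambled slowest, Normal fastest, the others intermediate); they are not the study's actual data.

```python
from itertools import product

# The four sequence types as a 2x2 crossing of narrative structure and
# semantic relatedness (both binary: present = True, absent = False).
conditions = {
    (True, True): "Normal",
    (True, False): "Structural Only",
    (False, True): "Semantic Only",
    (False, False): "Scrambled",
}

# Hypothetical mean reaction times (ms) illustrating the reported ordering;
# the specific numbers are invented for this sketch.
mean_rt = {
    "Normal": 480,
    "Semantic Only": 520,
    "Structural Only": 525,
    "Scrambled": 560,
}

for structure, meaning in product([True, False], repeat=2):
    label = conditions[(structure, meaning)]
    print(f"structure={structure}, meaning={meaning} -> {label}")

# Both factors confer a processing advantage, and Normal benefits from both.
assert mean_rt["Normal"] < mean_rt["Semantic Only"] < mean_rt["Scrambled"]
assert mean_rt["Normal"] < mean_rt["Structural Only"] < mean_rt["Scrambled"]
```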


Thursday, February 09, 2012

Art and books you should check out

Here's a few links I've been meaning to post...

First, head over to my friend Helena's website. She's an amazing artist, so go check out her paintings!

Second, if you haven't already seen it on airport shelves and in bookstores everywhere, you should check out the new book Situations Matter, by my friend and colleague Sam Sommers, a professor in the psychology department here at Tufts. The book explores how the contexts people find themselves in often affect the way they behave...
"Every day and in all walks of life, we overlook the enormous power of situations—of context—in our lives. Just like the museum visitor neglects to notice the frames around paintings, so do most people miss the influence of ordinary situations on the way they think and act. But frames do matter: your experience viewing the paintings wouldn't be the same without them.

The same goes for human nature."
Go check it out! There's a reason it's been getting so much attention! And here's a short video promoting it as well:


Saturday, February 04, 2012

Downloadable Un-Defining "Comics" article

It's not exactly a new article, but I realized that my article "Un-Defining 'Comics'" from the International Journal of Comic Art way back in 2005 was not downloadable from my site. That is now fixed! A downloadable pdf of the article is now available.

This article was among my first published works on visual language (and it kind of shows...gulp), and is the first that argues for a separation between the idea of "comics" and a "visual language" made up of images (i.e. "comics ≠ sequential images"). Enjoy!

Monday, January 23, 2012

Little busy for a bit...

Unfortunately, I haven't been able to blog as much as I'd like these days, and the coming months are likely to be a little quiet around here. I'm hoping to have a few big announcements to make soon. However, I'm aiming to defend my dissertation this semester (!), so blog posts are likely to stay sparse for a while.