Wednesday, December 19, 2012

Spider-Man's naughty adventures within or between panels

A friend of mine recently pointed me to an unusual debate that's raging about a recent Spider-Man comic in which Peter Parker is apparently inside the mind of Dr. Octopus and is forced to see his memories. One questionable scene involves Aunt May, where it is unclear whether she and Dr. Octopus kiss (and thus Peter experiences it), or whether more illicit behavior happens:

As my friend wrote to me, "Amid fan outrage, the writer, Dan Slott, actually started hitting up message boards to claim that his intent was to imply a pre-wedding kiss... most fans have assumed that much more went on. Who is right?"

The short answer is: what happens is entirely ambiguous. This is exactly the same as the example McCloud shows with the guy with the axe (right). The unseen actions need to be filled in by your mind given the information you are provided. In both cases, you are given preparatory actions, and then shown something different, with only a word balloon to connect to the preparatory action.

The key is that your mind tries to figure out what happened because of the associated word balloon and because the second panel doesn't show the action. It's worth emphasizing that, contrary to McCloud's claims, the "filling in of the information" does not happen between panels, but within that underspecified second panel. In both cases, the authors choose not to show the action.

In the Spider-Man example, the first panel is also slightly underspecified. It hints that Aunt May is about to kiss, but the action isn't defined well enough. The "No Stop!" balloon reads as anticipatory, suggesting that something is about to happen. The most direct answer then would be a kiss in the second panel, because it's set up directly in the previous panel (just like the axe sets up chopping in the second panel).

However, there is nothing to prevent other interpretations, since the first panel's "preparatory kiss" wasn't drawn that clearly and the second panel is entirely ambiguous: all you see is a door, and all cues come from the word balloon. The question then becomes how strongly the word balloon "Ahhh!" is associated with just kissing or with something else.

(It is also worth mentioning the implied duration of an action. Kisses can be single, momentary actions, or can be drawn out (as can...ahem... other actions). Showing a door doesn't give any further information about how much time passes in the hidden event; this too is only suggested by the word balloon.)

So... from a structural perspective, there is no real "right answer." It is a (likely intentionally) suggestive and ambiguous depiction, but it nicely plays with properties of the visual language grammar to elicit varying interpretations.

For more information on these types of concerns, see my paper The limits of time and transitions (pdf here).

Thursday, December 13, 2012

Basic structures of visual language

One of the important basic tasks of doing research on the visual language used in comics is to identify the foundational components that go into our comprehension of sequential images. In Understanding Comics, McCloud implicitly broke down the medium into a few parts:

1. Graphic Style
2. Iconography and symbolism
3. Panel-to-panel relationships
4. Text-image relationships

These parts provided a nice initial foray into how the visual language of comics might be segmented. However, the crux of my research is that the structure of sequential images actually breaks down similarly to language, and can thereby be researched using similar tools. This gives us several components of the visual language of comics, many of which tie to McCloud's:

1. Graphic Structure is how we understand the visual pieces of an image. Are certain junctions of lines more appropriate for certain parts of an image? How do we understand lines and shapes? This is the equivalent of studying the sound properties of language, only here in the visual-graphic form.

2. The Lexicon is the vocabulary of systematic pieces used to create images and sequential images. These might range from the morphology of visual conventions (like motion lines) to systematic full panels (like those from Wally Wood's 22 Panels) or patterns of storytelling (like the set up-beat-punchline pattern). Basically, anything that is used as a pattern is a part of the lexicon of visual language.

2.2. Morphology is a particular part of the lexicon that deals with small components of meaning (like McCloud's iconography and symbols). However, morphology also includes the principles for how these parts combine. Why do stars above heads mean one thing but mean something else when substituted for eyes? Why can't lightbulbs also replace eyes to mean inspiration, like they do floating above heads? Why do motion lines always trail behind objects, but not in front?

3. Event Structure is how people understand the nature of events, and in sequential images we may have to rely on knowledge about parts of an event to understand the whole. If an image shows a person punching another, we infer that the puncher reached back their arm first. We also need to be able to make sense of the connections in meaning between and across panels.

4. Spatial Structure has to do with the knowledge that panels convey information about a fictitious spatial location. Each panel only frames a glimpse of this location, and our minds build the overall space. If one panel shows the exterior of a house and the next shows someone sitting at a table, how do we know that they are inside that house without overt cues? If panels in a sequence only depict individual characters, how do we know they belong to the same broader environment?

5. Narrative Structure is how we make sense of the meaning of a sequence of images—its grammar. The event or spatial structures convey the meaning of a sequence, but this meaning is guided through its presentation in a narrative structure. Why delay the climax of a sequence until after several lead-in panels? Why show a scene where each panel shows individual characters instead of all characters in just one panel? These have to do with the presentation of meaning, not just the meaning itself.

6. Navigational Structure is the system used to move through a page layout. Why do people read from left-to-right in America instead of vertically down-then-up? What happens when layouts depart from simple grids? These issues go beyond just the meaningful connections between panels and have to do with a reader's preferences for how to move from panel-to-panel on a page.

7. Multimodality is the phenomenon of getting information from different domains. In this case, we receive information from both text and image, and thus need to explore how these multiple signals cohere to form a single conception (or, in reverse for creation: how a single conception results in multiple signals).

These are the broad components at work in comprehending sequential images. Many questions have yet to be answered about their parts and their relationships. And, of course, we can also ask how these components might differ across cultures, how people learn these conventions, and how their understanding changes over development.

Importantly, when we look at these components through a linguistic or cognitive perspective, we can't simply think about them in terms of the components of the medium. Rather, we must think about these components in terms of what authors or readers must know in order to create/understand this visual language.

In other words, it shifts the focus to what's going on in people's minds and brains. Because of this type of shift, we can then ask how this knowledge may be similar or different from what we know about other cognitive systems, spoken and signed languages in particular.

Tuesday, December 04, 2012

Blog-iversary and updates

So, today marks 7 years that my blog has now been online, and 10.5 years for the website. How time flies!

I've received a lot of great feedback from people about my profile in Discover Magazine, so it's worth having a post here to just review what I'm working on. My original study described in the Discover article can be found here (pdf), or a short, "comic" version here (pdf). I do have a few other brainwave studies that examine sequential images (from my dissertation), though they are still being written up for publication.

All of these papers are based on a theoretical model of a "narrative grammar" I've been developing for the past 12 years. The seeds of that theory appear in my book, Early Writings on Visual Language, though the approach in there has been far surpassed by my recent work. A concise version of this narrative grammar is set to be published by Cognitive Science soon.

In addition to that, I will have a new book out next year, published by Bloomsbury, called The Visual Language of Comics: Introduction to the Structure and Cognition of Sequential Images. It will lay out the basics of my full theory of visual language, as well as summarize the experimental work done in psychology about how sequential images are comprehended. I'm very excited about the book, since it finally lays out the broader picture of my theories, and should provide a solid foothold for people who are interested in this research. I believe we're looking at a Fall release, so stay tuned for more updates as it gets closer...

Finally, I'm working on new research as part of my postdoctoral fellowship here at the Center for Research in Language at UC San Diego. We're currently designing a new brainwave study that looks at the intersection of how people make predictions and "fill in" missing information in a visual sequence.

These all look to be the foundations of a growing field, so I hope you stick around to see how things look once we really pick up steam.