Friday, April 25, 2008

Podcast: "Grammar" in visual language

I've done another podcast with the folks at VizThink, this time debating Yuri Engelhardt and Dave Gray on what constitutes a visual language and the nature of visual language grammar.

This new format allows you to skip around to different chapters to jump straight to parts of interest. (Please note, I object to the insinuation in the chapter title that "comics" can equal "visual language"):

Hint: Use the Full Screen Button to see this video in greater detail.

There is something I strove to point out throughout the discussion that I didn't articulate well enough, but to explain it I'll have to give a mini-linguistics lesson.

In the podcast, Yuri pointed out the view that language has two main parts: a set of units (lexicon) and a set of combinatorial rules (grammar). This view of two components is essentially Chomsky's view of grammar, and organizationally looks something like the diagram to the side.

In this traditional view, syntax/grammar is the generative component from which meaning derives, and only syntax has properties for combining elements together. I said that I agreed with this notion, but really I don't. When I mentioned that I subscribe to a view from Chomsky's student, Ray Jackendoff (my teacher), I should perhaps have elaborated more on the differences between those views, because they are extremely important and can resolve some of the conflict of the debate.

Jackendoff's view of grammar is different. This "Parallel Architecture" says that the mind has three main interfacing components: modality (auditory/manual/graphic), syntax, and conceptual structure (meaning). The "lexicon" is distributed across the interfaces between all three of these structures — it doesn't have its own "place." And, importantly, each of these structures has the capacity for infinite combinations — not just syntax. (Note the similarities to my listing of properties of Language). This would look like this:

Much of our debate focused on whether or not single images (diagrams) have "grammar." My objection is that their structure does not function like "syntax" does in a verbal grammar, though I acknowledge that there might be a hierarchy or a combinatorial system there. If you subscribe to the Chomskyan view of grammar, you're forced to say that the combinatorial element "is syntax," which is exactly what Yuri is doing:

If you follow the Parallel Architecture (as I do), syntax is not the only element that creates hierarchies. They all do. So, combinations within a single image or diagram are "grammar" only insofar as phonology is the "grammar" of sound. Essentially, Yuri's "visual grammar" is the combination system within the graphic structures, which is why I kept prodding about the difference between it and just the system of perception (and why most of its "constraints" are based on iconicity). This instead looks like this:

In contrast, my grammar for visual language needs a combinatorial system for individual images and for combining them together, looking like this:

To the extent that the narrative structure takes concepts and a modality and orders them coherently, it functions the same as syntax does in verbal language. This is "visual language grammar" analogous to the way that syntax is verbal language grammar (nouns and verbs). But all three structures have combinatorial properties. They don't all make reasonable analogies to "grammar" in the syntactic sense, but they may all be combinatorial.

(This is also why you can say that "gestures are to sign language what individual images are to visual language" in the context of sequential images, but not for individual images. There is no comparable developmental/fluency gap for "... visual objects are to individual images." That is, people don't learn how to draw simple graphic signs yet remain unable to put them into a diagrammatic arrangement.)

Making this shift in perception buys you a lot: it explains why single images may have hierarchy (like perception/phonology) but don't have grammar (like syntax). It addresses why most of that structure is guided by iconic and indexical constraints. And it may also give you a leg up in describing combinatorial aspects of images beyond diagrams (which occur within panels).

Finally, it is worth noting that language is not the only part of our cognitive system with consistent, patterned units that appear hierarchic in structure. The same appears in music, event structure, vision, social structure, and a myriad of other domains (discussed well here). But we don't have to call them "languages" because of this broad similarity.

Suggested reading:
Foundations of Language and
Language, Consciousness, Culture, both by Ray Jackendoff

Tuesday, April 22, 2008

Navigating page layouts = defining "comics"?

One of the topics I debated closing out my new essay on page layout (pdf) with was its relationship to McCloud's definition of "comics."

As most know, McCloud's definition is that "comics" are juxtaposed images "in deliberate sequence." Yet he never places any constraints on that. He means all sequential images are "comics" — regardless of the characteristics of content.

On the one hand, you can say that his notion of closure demands that there is some content to those panels, and that closure is the underlying force behind his definition. However, there are places where this appeal to content is a bit slim. This is most apparent in his panel transitions, where "non-sequitur" serves as a catch-all for anything his other transitions don't cover. Or in his claim that empty panels representing "time" still maintain the essence of "comics." In other words, it doesn't matter what's in them, as long as they're in sequence.

If all McCloud really means by "comics" is that two graphic units are placed next to each other, is he really just talking about the system I've proposed for layouts? That system just tells people how to navigate through a comic page — how to read from one panel to another. And since my experiment used blank comic pages, it has nothing to do with content — just like McCloud's definition.

So, if navigation between panels is really all he's talking about for "comics," isn't that a little... I don't know... unremarkable?

This interpretation is also interesting since (as I note in the paper) McCloud's notion of the Infinite Canvas essentially seeks to simplify this navigational structure. Food for thought...

Sunday, April 20, 2008

Clever Dumbo?

The Language Evolution blog posts this interesting youtube video showing an elephant painting a picture of an elephant:

I've often pointed out that, as much as there is debate about whether animals can or do have language (they don't), we know of no animals that draw. By drawing, I mean that they employ tool use to graphically achieve conceptual expression (that is usually iconic).

While it is pretty amazing to watch, I am highly dubious of it as a reflection of elephant cognition, or that elephants really can "draw" by the definition above.

I imagine that the elephant has memorized this as a specific sequential routine of actions, rather than executing a set of patterns stored in the head for representing a concept of "elephant" (or "me"). That is, it doesn't have a mapping of action/graphics to concept. Remembering such a sequence of fine-grained actions is still pretty impressive, but very different from being conceptually expressive.

This is likely a trick the elephants have been taught to show tourists like the ones who took the video. Note that in the beginning of the video there are multiple easels, perhaps for multiple elephants? Do they all draw the same patterns? Do they have any capability to recognize what they're drawing? Do they do this spontaneously without an audience?

Before conceding that elephants truly can draw, I'd like to know enough to rule this out as a "Clever Dumbo".

Wednesday, April 16, 2008

Panel Time!

1.5 seconds per panel. That's how long it takes on average for wordless panels to be read.

I recently completed a very exciting study that asked people to read four-panel comic strips one panel at a time. In this "Self-Paced Reading" task, they see four boxes on the screen, and with each button press a subsequent panel appears in the sequence. Only one panel is shown on the screen at a time.

While they do this, the computer records how long it takes them to move from each panel to the next. By manipulating the strip in various ways, I'm able to tell if certain manipulations have a greater impact on the reading from the original "normal" sequence.
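For the programming-inclined, the timing logic behind such a task is simple to sketch. This is purely an illustrative mock-up, not the actual experimental software; the function name and setup are my own invention here:

```python
import time

def self_paced_reading(num_panels, wait_for_press):
    """Show panels one at a time, recording how long the reader
    spends on each one before pressing a button to advance."""
    reading_times = []
    for panel in range(num_panels):
        start = time.monotonic()   # the panel appears on screen
        wait_for_press()           # blocks until the reader advances
        reading_times.append(time.monotonic() - start)
    return reading_times

# For a four-panel strip, with Enter standing in for the button:
# times = self_paced_reading(4, wait_for_press=input)
```

The key design point is that only the interval between panel onset and button press is recorded, so each time reflects processing of exactly one panel.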

I'm not going to go into the intricacies of the experiment (you can wait for the write up for that), but I thought I'd share some tidbits.

For instance, "normal" wordless four-panel Peanuts strips take a person on average 1.5 seconds per panel. The first panel is usually read relatively slowly (1.7), the second panel is fastest (1.3), the third is slightly less fast (1.5), and the fourth is back to where the first was (1.7).
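As a quick arithmetic check, the rounded per-panel means above are consistent with the overall figure of roughly 1.5 seconds per panel:

```python
panel_means = [1.7, 1.3, 1.5, 1.7]  # mean seconds per panel, rounded

mean_per_panel = sum(panel_means) / len(panel_means)  # about 1.55 s
whole_strip = sum(panel_means)                        # about 6.2 s

print(f"{mean_per_panel:.2f} s per panel, {whole_strip:.1f} s per strip")
```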

An interesting side note about this: since all these panels were the same size, yet were read at different speeds, it might imply a rejection of McCloud's claim that panel sizes affect reading time (which thereby somehow affects narrative time). However, this experiment didn't test that specifically, and no alterations in panel sizes or layouts were given either. So I can't say anything conclusive about that.

This was a very exciting project for me to complete, because it marks the first time I've (anyone has?) looked at data like this for how people understand sequential images.

Monday, April 14, 2008

Navigating Comics Plus

Since various concerns have popped up here and there about my latest essay on page layouts (pdf), I figured I should take the time to reiterate responses to some of them here...

First off, the types of navigation I talk about here are absolutely intended to be part of a broader network of how people move through layouts. Certainly, panel locations aren't the only influence on people's movement through layouts; other potential influences include color, content, etc.

What I was trying to get at is that people do have idealized preferences for reading directions given various conditions and that those preferences emerge *even in the absence of content* in non-left-to-right ways. My suspicion is that people use these sorts of preferences from panels as their first influence for navigational choice, which can then be further influenced by content and maybe color. However, that's something that would need to be empirically tested.

Another reason for creating this study that I didn't mention in my last post was that the year before I did the study, John Barber came out with his own paper about layout (unfortunately now taken offline). While he had some great ideas and observations, I disagreed with his basic claim that layout and meaning were expressly tied. Layout and content most definitely can be connected in important ways, but I think that this experiment nicely shows that they are governed by separate (yet interfacing) systems.

Finally, several people have been curious about instances where panels do not have borders at all. I do mention borderless panels, but only in a footnote, where I basically say "more testing needed"! My guess is that certain permutations follow the same principles that I outline in the paper, but others lead to greater violations because of the ambiguities they create (ahem, testing needed).

As with many of my papers, this should just serve to lay the groundwork for future work (by me or others). I'm still convinced that the really cool stuff will come far down the road!

Thursday, April 10, 2008

Essay origins

So far I've been very pleased at the response to my latest essay, "Navigating Comics", on how people navigate through page layouts (pdf). As several responses have been rolling in via email and elsewhere, I intend to do a post soon addressing concerns in that feedback. However, I think it'd be informative to first talk about the origins of this paper.

Back in 2003 when I was drawing our political book We the People, every now and then my editors would tell me they had trouble knowing exactly where to go in the sequence. Often this happened in consistent situations (like what I call "blockage" in the paper).

Most of the time I'd either simplify the layouts or make some graphic fix (like a trail) to indicate a clearer path. However, it got me thinking... My editors were quite a bit older than I was and weren't very experienced comic readers, so I wondered whether this lack of experience mattered in their reading habits (or if I was just needlessly making things difficult).

So, I designed this study to test that. I had a booth at ComicCon 2004 to promote the book and my other works, so I made a simple pamphlet people could record responses in and tested people throughout the convention.

I could tell immediately that the results would be interesting, I just had to wait another three years to learn the statistics necessary to show them (d'oh!). The theory with the tree structures predated the experiment by at least a year, but it didn't really say much without knowing about people's actual preferences. It's exciting to see that my suspicions for creating the experiment were borne out in data.

Every now and then I get a response to my work along the lines of "Why do theory? Why not do something related to praxis?" Theory can be interesting and enlightening in its own right, and much of science is simply about discovery without practical applications in mind (ex: penicillin). But another reason is that theory can sometimes wrap back around on praxis. I like to believe that this is one of those cases, especially given that it came from those origins.

Sunday, April 06, 2008

New Essay: Navigating Comics

I'm very happy to announce that I have a new essay online: Navigating Comics: Reading Strategies of Page Layouts (pdf). This paper reports the findings of an experiment I conducted looking at how people navigate through comic pages. The big finding: people don't just mimic text going left-to-right and down.

The full abstract:
The spatial domain is often considered to be non-linear, given the analog nature of visual information. However, the visual language of comics defies this by siphoning images into a deliberate reading sequence. Most often this sequence is assumed to be read in an order that mimics text: left-to-right and down, a “z-path.” However, several scenarios can violate this order, such as Gestalt groupings of panels that deny a z-path of reading. To investigate these concerns, an experiment asked 145 participants to number empty page layouts in the order they would read them, and showed that readers use an alternate strategy extending beyond both the traditional “z-path” and Gestalt groupings to navigate through comic page layouts.
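As an aside (my illustration, not from the paper itself), the baseline "z-path" assumption can be stated very concretely: for a simple grid of panels, it predicts reading left-to-right within each row, rows top to bottom. A minimal sketch, with a hypothetical `z_path` helper:

```python
def z_path(rows):
    """Order panels by the assumed "z-path": left-to-right within
    each row of the layout, with rows read top to bottom."""
    return [panel for row in rows for panel in row]

# A simple 2x2 page layout, panels labeled A-D:
layout = [["A", "B"],
          ["C", "D"]]
print(z_path(layout))  # a z-path reader visits A, B, C, D
```

The experiment's point is precisely that real layouts (e.g., with Gestalt groupings or panels spanning rows) push readers off this default ordering.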

I should also say that this paper took a very long time to complete. The study was run in 2004 and I know I've been talking about releasing the results all the way since last summer. Thanks to all who participated in the study and to all for being patient as I finished it up!


Wednesday, April 02, 2008

Time and The Torch

On this page I found another great example of a page by Jae Lee that defies the "temporal mapping" idea that successive panels are successive moments:

I'm unaware of the full context of the page, but the Human Torch is flying around some big monster and creates the number "4" (for Fantastic Four, no doubt) in his path. In doing so, his path begins by violating a constraint of page layout, entering at the bottom of the page, and then he flies over his own path, which crosses a panel he's already been in.

I'm not sure I agree with the analysis given on that blog, mainly because I think appealing to McCloud's transitions and closure only hurts his otherwise fairly good discussion.

Now, I don't want to suggest that there is no time being shown here, but I think there are two considerations that need to be reoriented.

First, let's not talk about "time"; let's talk about "events." To the human mind, time is only an extrapolation of events. Thinking in terms of a ticking-clock sort of absolutist Time is not on the same level as the understanding of time constructed in a person's head. From understanding events we can tell that time passes, not the other way around.

Second, panels do not necessarily have to equal moments. Rather, panels function as "attention units" grouping important information into meaningful chunks. These chunks don't have to be moments, but they do highlight relevant information in ways that the author intends.

This is exactly the case in this example. The interesting thing is that the flow of events runs counter to the standard reading path of panels in order to create the "4" emblem. If reading left-to-right as if these were independent moments, this would make no sense whatsoever. But, because this display uses image constancy (breaking up a single image into parts... what I'd call a Divisional panel, the understanding of which is what Gestalt psychology would call Closure), the panels only serve to divide up the conceptual space of the image to highlight the Torch at different positions within the space.

Yes, the countering of events vs. panels is a bit funky, but it's also a creative use of playing the two off each other to reveal their functions.

Note: For those more interested in these types of examples about Time, most of these ideas are written about more extensively in my paper Time Frames... Or Not. Attention Units are discussed more in A Visual Lexicon.

Tuesday, April 01, 2008

Some links and whatnots

Steven Seagle has a decent piece up at the First Second blog about visual storytelling. He nicely taps into a simplified version of some of the same things that I've been pushing for my theory of visual grammar. The exercise he uses to rearrange panels is very reminiscent of linguistics methods, and is also a good one that shows how a broader structure exists above and beyond the so-called 'transitions' between panels.

Along these lines, Matt Madden and Jessica Abel will soon have a "how to" book coming out about comics. I've been hearing that it's somewhat theory-oriented, so the book should be an interesting read. So, keep an eye out in the coming months.

Finally, keep an eye on this very site in the next week. I will finally — finally — be posting the results of my experiment about comic page layouts, shooting for next Monday. This one has been a long time coming — I first ran the experiment almost 4 years ago and have been working on the paper since last spring! The project tested whether or not people read comic page layouts using the "left-to-right and down" path like text. A preview: the answer is "not really."

Also, last month was my most trafficked month ever, so thanks to everyone who's been reading my site lately!