Legitimation of "comics"

Comics scholar Domingos Isabelinho posts a critique of Thierry Groensteen's paper on comics' search for legitimation. Domingos keys in on this quote in particular, so I'll do the same:
“Although comics have been in existence for over a century and a half, they suffer from a considerable lack of legitimacy. To those who know and love it, the art that has given us Rodolphe Töpffer and Wilhelm Busch, Hergé and Tardi, Winsor McCay and George Herriman, Barks and Gottfredson, Franquin and Moebius, Segar and Spiegelman, Gotlib and Brétecher, Crumb and Mattotti, Hugo Pratt and Alberto Breccia, not to mention The Spirit, Peanuts or Asterix… in short, comic art, has nothing left to prove.”

His critique of this is that an analogy to literature would be ridiculous: why would literature have anything to prove?

However, we can take this one step further. In his work, Groensteen conflates the socio-cultural "comics" with "sequential images with/out text". If he means the former part of this dichotomy (comics, the sociocultural phenomenon) then Domingos' argument stands. If he means the latter (the expressive system of sequential images), the claim becomes even more bizarre:

Why would any type of expressive form (drawing, writing, etc.) need legitimation? Do we really have to have justification for why drawings (in sequence) are worthy of attention at all? From a cognitive perspective, this seems crazy: all of these things are just ways in which humans express themselves.

To exemplify this, try replacing some of those names above with friends of yours, and rewording the last sentence as: "in short, English, has nothing left to prove."

The justification of English should be self-evident, and so it should be with the visual language of sequential images too.

Status report

I have, unfortunately, once again been struck by so much work that blogging has fallen by the wayside. Thankfully, this is a good thing in many ways, because I have a whole lot of really cool stuff going on!

For example, I'm busy planning my new class here at Tufts on The Visual Linguistics of Comics, the first time such a class has ever been offered. I've actually posted the syllabus online. If you're in the Boston area and interested in sitting in or taking it, I encourage you to email me.

I've also got a host of projects going. Inspired by the types of things I discussed in my last post on Action Star Substitution, I'm now running an experiment looking at this phenomenon. I'm hoping to have it done by the holiday break, so perhaps I can post some preliminary data if things look interesting.

I'm also pleased to say that a follow-up to my comparison of Japanese and American books (mentioned briefly here) has finally begun, after years of my wanting to do it. An enterprising helper in my lab volunteered to take on the project, so hopefully this will eventually lead to some very interesting results, especially given that I've expanded the scope of the project to a lot more than just comparisons between countries.

Oh, and I've also got a massive paper on visual language grammar being refined, along with another huge visual grammar experiment in the works. Busy busy busy...

In discussing this post of mine with Derik, I realized that I should post on the technique I used of substituting a whole panel for an "action star," like this:

This usage is somewhat similar to what I talk about in this older article on metonymy, and the same phenomenon crops up overtly in McCloud's famous "Closure" example from Understanding Comics:

In both cases, we never see the action, because it's replaced by a panel that implies action took place, but replaces the image with some neutral information. In the case of the action star though, we associate that sign with events, so it further indicates the presence of an action, whereas in McCloud's example the text does most of this work, since the cityscape is entirely neutral.

I've recently been exploring this phenomenon a lot more, especially since I keep seeing it in Peanuts strips, first shown (I believe) in this one:

So, we now have this phenomenon where we know we can substitute certain types of panels for others to get an entailment of the actions. For storytelling, this is pretty cool, since it forces the reader to draw an inference about the actions (a result interesting enough that McCloud extended this out to all interactions between panels).

However, are there also restrictions on which types of panels we can replace? Since the action star essentially just means "events occurring!" but doesn't show them, it can be considered as a kind of "visual pronoun." Because of this, it can also be used as a diagnostic for determining certain categories of panels versus others. This "pro-form" replacement is a common technique in linguistics for determining grammatical categories: we can replace Noun Phrases with the pronoun "it", and Prepositional Phrases with "there":

1. Martin pushed the really huge boulder up a massive hill.
2. Martin pushed [it] up a massive hill.
3. Martin pushed the really huge boulder [there].
4. *Martin pushed the really huge boulder [it].
5. *Martin pushed [there] up a massive hill.

In 2 and 3, we can see that this substitution works fine, but when we reverse which ones we're substituting for, in 4 and 5, it sounds awful (indicated by the asterisks).
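
As a rough illustration, the substitution test can be sketched as a toy Python script. The category labels, the PRO_FORMS table, and the substitute helper are all hypothetical names made up for this sketch, and the constituents are bracketed by hand rather than parsed; it only mirrors the logic of the diagnostic, under the assumption that a pro-form is acceptable exactly when it matches the category of the phrase it replaces.

```python
# Toy sketch of the pro-form substitution test (hypothetical names throughout).
# Assumption: each pro-form is licensed for exactly one phrase category.
PRO_FORMS = {"NP": "it", "PP": "there"}

# Hand-bracketed constituents of example sentence 1: (category, words).
sentence = [
    ("NP", "Martin"),
    ("V", "pushed"),
    ("NP", "the really huge boulder"),
    ("PP", "up a massive hill"),
]


def substitute(constituents, index, pro_form):
    """Replace the constituent at `index` with `pro_form`.

    Returns (text, acceptable). `acceptable` is True only when the pro-form
    matches the category of the replaced constituent; otherwise the text is
    prefixed with an asterisk, following the linguistic convention.
    """
    category, _ = constituents[index]
    acceptable = PRO_FORMS.get(category) == pro_form
    words = [
        f"[{pro_form}]" if i == index else text
        for i, (_, text) in enumerate(constituents)
    ]
    prefix = "" if acceptable else "*"
    return prefix + " ".join(words) + ".", acceptable


if __name__ == "__main__":
    # Reproduce examples 2-5: matched substitutions, then mismatched ones.
    for index, pro_form in [(2, "it"), (3, "there"), (3, "it"), (2, "there")]:
        text, _ = substitute(sentence, index, pro_form)
        print(text)
```

The point of the sketch is only that the test is a lookup against categories: if a replacement works, the replaced phrase belongs to the category that the pro-form stands in for.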

So... can we do this using an "action star" as a kind of visual "pronoun"? Check out these Peanuts strips, where the action star replaces either the second or third panels:

When the second panel is replaced, it does not seem to make much sense (nor would it make much sense in the first or final positions here either). However, substituting it for the third panel does work, hinting that those panels belong to a certain class of panels where a culmination of an action occurs (even when that action isn't an "impact").

Note also that an approach using linear "transitions" between panels would be unable to express this: what would the action star be a transition of? A "Non-sequitur"? That wouldn't be able to capture the understanding of the event occurring in that panel. Rather, this hints that transitions (based on the relations between panels) are not the way sequences are understood, and that some sort of global narrative structure (with categories for actual panels) must underlie the sequence.

Continuity across panels

Derik posts a quote from this article on Narration in Comics that discusses cognitive schema and comics. I'd read the article a while ago, but seized on this part of the quote:
"An extrinsic norm crucial to comics is the interpretation of a figure reappearing in several panels as one and the same figure shown at different moments in time (usually in chronological order)… Usually it is assumed that the event represented in the second panel happens after the event represented in the first one…"

This constraint is no doubt what led Saraceni to posit a principle of sequential images that weighs "new" versus "given" information across a sequence. It's also a type of constraint placed by Gestalt organization: Continuity.

However, what struck me on this reading of that quote is that this schema is exactly the sort of thing that people who lack knowledge of the visual grammar (or who have a competing grammar) have trouble with.

For instance, kids below four years old seem to have no ability to make coherent sense of juxtaposed panels: they can recognize the meaningful content of the things in each panel, but they can't seem to connect them as part of a narrative sequence. (They also seem unable to recognize any representations in the images that are predicated on understanding the causation between panels.)

A comparable thing happens with the native Australians who use sand narratives. They draw their narratives unfolding in the same space over time, and when presented with juxtaposed panels, think that each panel is a new scene. For them, their own system inhibits this recognition of continuity across panels.

What's also striking is that there are tons of examples where this constraint is not upheld immediately — many (most?) sequences don't feature the same characters over and over in panels. This is one of the reasons that a linear approach to sequential images (like "panel transitions") just can't work.

For example, let's say panel 1 shows person A, then panels 2 through 4 show other things, then you're back to person A at panel 5. You can't just integrate panels 4 and 5, because a purely panel-to-panel link would have lost track of person A across the three intervening panels. Rather, you have to keep them and their actions in mind somehow. Transitions can't capture this relationship.

There has to be a way of upholding this constraint of continuity across longer distances — which thus requires a bigger system than linear sequence alone provides.
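
To make the contrast concrete, here is a minimal Python sketch of the five-panel example. Everything in it is a hypothetical illustration: panels are reduced to sets of labels, linear_links stands in for a strictly panel-to-panel approach, and memory_links is just a simple "remember who you've seen" tracker. It is not a model of narrative structure, only a demonstration that adjacent-panel comparison alone never connects panel 1 to panel 5.

```python
# Hypothetical illustration: each panel is reduced to the set of things it shows.
panels = [{"A"}, {"tree"}, {"B"}, {"house"}, {"A"}]


def linear_links(panels):
    """Connect a panel only to its immediate predecessor (a 'transition' view).

    Returns (earlier_panel, later_panel, entity) triples, numbered from 1.
    """
    links = []
    for i in range(1, len(panels)):
        shared = panels[i - 1] & panels[i]
        for entity in sorted(shared):
            links.append((i, i + 1, entity))
    return links


def memory_links(panels):
    """Connect each panel to the most recent earlier panel showing the same entity.

    Keeps a running memory of where every entity was last seen, so a link can
    span any number of intervening panels.
    """
    last_seen = {}
    links = []
    for number, entities in enumerate(panels, start=1):
        for entity in sorted(entities):
            if entity in last_seen:
                links.append((last_seen[entity], number, entity))
            last_seen[entity] = number
    return links


if __name__ == "__main__":
    print(linear_links(panels))  # no adjacent panels share anything
    print(memory_links(panels))  # person A is linked from panel 1 to panel 5
```

Even this crude tracker needs state that persists across the whole sequence, which is exactly what a purely linear account leaves out.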

The Development of Language

This relates to both sequential images and linguistics, so how could I not post it? Via LanguageLog:

For example...

I've been working very hard lately on a few projects and papers that have been occupying a lot of my time and energy. One of them is a write-up of my model of visual language grammar that I've been developing over the past several years. This one is particularly important to me because it will frame a lot of the issues for future studies, especially for all the psychological experiments I'm now planning.

To drive home the points in this article, I've been trying to use a great many example sequences from various comic books, which show the diverse structures involved in examples that have not just been created (or altered) by me. However, the amusing thing about this is that comics themselves feature a lot of wild and diverse topics, so my examples often deal with wild themes.

Often linguistics papers have reasonably cut-and-dried example sentences. Not mine: my papers are filled with guys fighting zombies, samurai chopping each other in half, people psychically blasting each other, names being carved on the moon, and sex toys being cut off people's hands.

Whatever might be said about my theories, my papers at least have crazy examples!

Camera angles and meaning

Kraft, Robert N. 1987. The influence of camera angle on comprehension and retention of pictorial events. Memory and Cognition 15 (4):291-307.

Kraft explores the semantic associations made to different camera angles (high, eye-level, low) in a four-frame photo story. Subjects were asked to rate the story along a 7-point scale, take a recall test on the order, and then complete a recognition task. Each story used two characters, which were contrasted in each frame position with differing angles in different story sequence types.

Overall, the results strongly supported a correlation between camera angle and semantic meaning in how characters were perceived. Low angles supported senses of shortness, weakness, fear, timidity, and passivity, while characters at high angles were thought of as tall, strong, unafraid, bold, and aggressive. A lesser correlation was found to value judgements like good/bad. Eye level angles did not contrast between characters. These results seemed to be sustained across several experimental tasks.

In recall tasks, analysis did show that camera angles influenced a connotative meaning for how characters were remembered.

Explanations for this correlation claim it comes from our experience with the visual world, such as how looking upward at taller people gives them a sense of power (like children to adults). An alternative view says that the different angles allow the viewer to see different things in the images, from which they draw the semantic implications.

If these results extend to drawings, it would be interesting to do further study on the semantic correspondences. I find it hard to fully believe either of the explanations above, at least in a universal sense. There are no semantics attached to the aerial view in Australian sand narratives, nor do fixed high angles in Japanese children's representations or ancient Asian graphics have the semantic correlation that such a theory would require.

Rather, this may simply be a case of learned conventions. We've built up these meanings by continually viewing them, particularly in movies. Or... the "visual world" explanation could be valid, but only in systems that allow for flexibility in viewpoints, not those that have fixed perspective. To be honest, I had always had doubts about claims that camera angles had semantic meanings, so I'm glad there's actually work that backs it up.

Review: Drawing Words & Writing Pictures

Drawing Words and Writing Pictures by Matt Madden and Jessica Abel

Before I started reading DW&WP, Matt Madden warned me that it was a book for praxis, not theory. As amusing as I find it that such a disclaimer needs to be given to me, the book draws upon aspects of theory throughout in an informed and well-measured way, and I am led to think further about the relationship between theory and praxis.

First off, this book is a great resource. It’s put together well and lays out all the essential issues, from hand lettering to stretching to avoid tendonitis (I know from personal experience: important and overlooked!). It even has homework and lesson plans, along with digital resources. For praxis, this all is fantastic.

However, no review of mine should escape looking at theoretical issues, and reading this book has made me once again consider how praxis can draw from theory.

For instance, an immediate question of mine was: Why is it important to include defining “comics” at all? The book (wonderfully titled) talks entirely about the process, what I would call “writing in visual language.” So, why spend additional space trying to shoehorn various social manifestations of “writing pictures” (i.e. manga, comics, graphic novels, etc.) under the umbrella of “comics”?

First off, why should people who are aiming to create visual stories necessarily care about the arguments for what are or are not "comics"? These issues may be important for scholars, but most who read this book can just go by the "I know it when I see it" frame. Truly, if they needed to have their "horizons expanded" by a broader definition, they probably aren't the ones reading this book.

Furthermore, as I see it, inviting readers into the subculture of “comics” is unimportant to the aims of the book. Especially since they do well to state the applications of the ideas beyond genres and styles, this book is not about “drawing comics”; it’s about learning to be a visual writer. To this end, mentioning “comics” as a cover term at all undercuts this broad goal. What better way can people expand their applications of “writing in pictures” than by not immediately being co-opted into a subculture that they might not desire being in? Let them learn the visual language, then let them decide what they want to do with it on their own terms.

With the course of instruction, I thought the emphasis on thumbnails and not on scripts was a great choice that isn’t focused on enough. This is the central place that “writing in pictures” happens, and the assembly line style with scripting skirts this step often. It nicely reinforces the theme that this book is teaching people to be authors, not cogs in a manufacturing wheel.

I do have some problems with the instruction of panel transitions as how authors are guided to think about their craftsmanship. Now, heavily influenced by McCloud as a teenager, I did go through a period where I thought in terms of panel transitions (or at least I thought I did), and I certainly benefited from it.

However, when I look back on my thought processes with a broader theory in mind, I realize that I wasn’t looking just one panel ahead, as transitions would have us think. Really, I planned for whole sequences, to the point where (in thumbnails) I might draw an expected panel later in the sequence before filling in the ones in between. Such a process suggests that we think in terms of whole sequences, not linear panel-to-panel relationships, as my theories of sequential images imply. I suspect that others have had similar experiences.

On the plus side, what transitions do allow is a directed focus on storytelling methods that convey aspects of the scene beyond just the actions. They provide a cover for people to think about whether to slow “time” down, show other characters or the environment, etc. However, if we were to develop a more robust vocabulary of types of panels to be used and the potential for what panels might contain, we might not need the bootstrap of “transitions” to couch it in.

In fact, while I think that learning theory can be beneficial to practice, perhaps a more direct and simple learning tool would be to see that theory in action. For example, in section 3, they discuss rhythm and pacing (the decision making for what to show and when to show it) and use several variations of stories to demonstrate different pacings. This type of section could be expanded to account for the types of things covered by transitions.

I would also have enjoyed seeing a greater discussion of page layouts and the uses that one can make with them, especially regarding meaning and rhythm. While I disagree with McCloud that the size of panels has an effect on narrative time, I do believe it has an effect on pace — the rhythm and meter of reading. Elaborating on these concepts would be very useful for beginning authors.

Finally, while it has recently become a point of research for me, I found their comments on reading order a bit wanting. Despite their specific advocacy to not use “blockage” scenarios (where two panels are vertically stacked to the left of a long panel) they praise another page by Mike Mignola that uses this layout just three pages prior. The overall message of “using unambiguous layouts” rings true, but this inconsistency (along with personal knowledge that fluent readers don’t have ambiguity in such scenarios) was a little disappointing.

Despite these issues, this book is a fantastic resource and accomplishment given other books on the topic. I wish that I had it when I was a teenager, as I probably would have devoured it voraciously, doing all the exercises and then some. Indeed, while I probably would have gotten more mileage out of it then, I plan on using its resources from here on out.

Visual Linguistics of "Comics" Course

I am ecstatic to say that I've just learned that my "Visual Linguistics of 'Comics'" course for next Spring semester '09 has been approved! This will be the first course of its kind to cover my own visual language research and related studies in a complete package.

I'm beyond excited about it, so... if you're in the Boston area and might want to head over to Tufts for some spring classes, stay tuned.

Panels as Attention Units

I stumbled across this article recently about how current theories of perception are similar to what magicians have been exploiting for years. Essentially, the idea is that we can only "see" what our attention is focused on at a given time. They liken it to a "spotlight" which roams around and only lets you take in certain things under its view. In the case of vision, though, all the things outside the "spotlight" are still within your visual field. You just don't "see" them.

As I discuss in this video, panels in the visual language used in comics serve to facilitate this same sort of focusing of attention. Most of the time though, panels serve to exclude all information except for the elements that need to be focused on, or at least clearly distinguish what is relevant from irrelevant. This lets panels provide a graphic manifestation of this mental "spotlight," allowing the author to control that attention instead of the reader's wandering eyes (which is one of the reasons I formally call panels "Attention Units").

This ties into the argument for why you don't want to overload a panel with too much stuff, because it becomes too hard to disentangle the attentionally important from unimportant elements. (If you still want to pack info in, inset panels help facilitate this honing of attention.)

Even more, when you have too much in several panels sequentially, it becomes too difficult to track all the changes and carry-overs from one panel to another. This is what gives way to things like "parallel cutting." By switching back and forth between two (or more) scenes, you can highlight the individual aspects of each in panels without risking their being overlooked amid other information or overloading the system. Of course, doing so introduces other processing demands on the visual grammar, but at least your attention is focused exactly on what is intended to be conveyed.

Review: The System of Comics by Thierry Groensteen

Groensteen, Thierry. 2007. The System of Comics. Translated by B. Beaty and N. Nguyen: University Press of Mississippi.

The best thing about the new translated work The System of Comics by Thierry Groensteen is that it hopefully reflects an increase of English translations of international works on comic theory. There are numerous offerings by European, Japanese, and South American authors that rarely make their way into American scholarship, and more exchange of ideas can only be fruitful to the field. Groensteen is regarded as one of Europe’s leading historians and theorists of comics, and in its original French, the Système de la bande dessinée is heralded by many as a “must-read,” so it would be a logical starting point for such a trend.

Moreover, Groensteen’s book is offered by many as a “serious” alternative to Scott McCloud’s Understanding Comics, which has gained massive popularity, especially in America, yet remains dismissed in many scholarly circles since McCloud himself is not an academic. Comparisons between the two works have become commonplace (including in the book’s own introduction), and this review will be no exception. For his own part, Groensteen does not engage with McCloud's theories except in a single passing endnote, which is a shame since their ideas share many similarities.

While my French is not sufficient to assess the overall quality of translation, it certainly did not read comfortably. At times, the English felt unnatural, and word-choice often seemed clumsy if not uninformed. For instance, calling the psychological device of an "eye tracker" an "eye path follower" betrays a lack of competency and/or desire to find the accurate vocabulary (whether on the part of the translators or author is unknown). Additionally, while the endnotes were often helpful, the absence of a reference section is quite noticeable, and the endnotes do not include all the cited bibliographic information. For an “academic” book, such an oversight is a bit peculiar.

Nevertheless, the writing should only be a surface issue to the actual ideas in the book. Perhaps the most appropriate place to start is where he does: with the definition of “comics.”

Problems with definitions

Groensteen begins with what has become a common exercise in the study of comics: defining “comics” itself (no doubt bande dessinée originally). He carefully deconstructs the faults of various definitions that have been proposed by various realms of scholarship. He rightly shows it cannot be guided by a single "essence" like text/image interactions, and decries a definition of comics as episodic narratives, among others.

Rather, at the outset, he identifies “comics as a language,” a “system” that arises out of the “combination of a… collection of codes,” most strongly motivated by “iconic solidarity,” a fancy term for the contribution of several images. While somewhat more flexible, this emphasis on the visuals sounds a great deal like McCloud’s more rigid “juxtaposed sequential images,” but lacks a reference to or engagement with his ideas, despite discussing the work of many other scholars with whom Groensteen disagrees.

Truly, in its similarity, Groensteen’s definition falls into the same formalist trap as McCloud's in failing to separate the structural notions of creating images/writing from the socio-cultural role of “comics” as objects/artifacts. This habit may have been inherited by them both from Kunzle’s influential The Early Comic Strip, whose recasting of the term “comics” onto pre-1800s sequential images led many (like McCloud) to seek out a definition to include all possible historical examples. However, this goal is a red herring, and as Horrocks points out, there is no reason for sequential images from diverse historical contexts to be bound to the same socio-cultural context of the contemporary notion of “comics.”

Indeed, "comics" as a social artifact refers to numerous qualities, including 1) physical objects (strips and books), 2) a collection of genres, 3) an industry, 4) a culture/community, and others that are all tied to a context of the modern era. On the other hand, sequential images do create a language: a “visual language” that combines with text to be used within those social objects called "comics." "Comics" are not this visual language. "Comics" are a social object written in a visual language that combines with text. If novels or magazines are written in English, why should “comics” be a language, instead of being written in a language?

While Groensteen strives to stake out an abstract notion for his definition, he frequently reinforces the conflation of “comics as a medium” with “comics as a social artifact,” particularly a physical object. He writes that comics owe their birth to the technological development of lithography (8). Further on, his analyses of word balloons (69-75), of panels as “measurable in square centimeters” (28), and of the expressive power of sequential images through “arthrology” rest wholly on the physicality of and across pages. However, physical objects cannot be languages, especially in the abstract sense of them being a “code,” and his reinforcement of this physicality runs contrary to his own definition of “comics” that claims to guide the broader work.

The "System"

As invoked by the title, Groensteen’s work seeks to sketch a “system” by which “comics” operate. To his credit, he carefully moves in analysis through varying levels of interactions in comics’ structure, accounting for most of its forms. He eschews the method of previous structuralist approaches to treating the form as a “language,” which aim to dissect the medium into its minimal units, instead aiming for its broader “articulation” — larger levels of structure. However, Groensteen’s grand system is little more than an extensive taxonomy with terms that essentially mean “the principle by which (pick your taxonomic portion) operates.”

Groensteen begins by discussing the “spatio-topical system,” the various spatial elements at play in comic pages. He first notes the “hyperframe” as the delineated space of the page, in contrast to the “multiframe” — meaning the relation of all frames that constitute a comic piece, including “the sum of the hyperframes.” This chapter also includes discussions of the functions of a panel, usefulness of margins, positioning of balloons, and various facets of page layout, such as inset panels, the “strip” of a tier of panels, and possible functional roles for types of layouts.

Now, my harshest criticism is not that I think Groensteen's theory of comics is invalid or wrong, but that it is uninteresting.

While at times insightful, he labors through discussing details of relatively lackluster observations, such as the various ways in which a balloon visually interacts with a panel border. It is understandable why a taxonomic distinction can be made between balloons that touch the panel border and those that overlap it, but does it really add substantially to knowledge about the function and understanding of balloons as a graphic/narrative element on its own?

If this discussion led to a substantial observation about the constraints that this interaction creates on such a relationship, this breakdown would seem more significant (for instance, that two balloons cannot cross tails such that their speakers are in opposite panels from the balloons). However, no revelation of this sort is reached, and most of his analysis focuses solely on the physical aspects of the relations of balloons to panels. Such are the features of most of his discussions. All of this leaves the question of what this analysis offers a theory of comics understanding except for surface descriptions of a (fairly banal) phenomenon.

The next two sections are devoted to the principle of “arthrology” to describe either the linear relations of panels to each other (restrained arthrology) or the relations that one panel might have to others in a non-linear sense (general arthrology). Unlike McCloud’s panel transitions, Groensteen does well to recognize that panels make connections beyond their immediately juxtaposed neighbors, yet does not give any hint as to how. McCloud’s transitions at least attempted to characterize relationships between panels, which is largely why the approach is so appealing, but Groensteen leaves such detail aside, preferring instead gross-scale abstract principles. He describes “braiding” as the principle guiding arthrology: essentially the function of making connections across the multiframe.

On the surface, braiding and arthrology appear to be about the creation of meaning, like McCloud’s “closure,” or perhaps a comic version of Halliday and Hasan’s (1976) classic notions of local and global discourse coherence, which Saraceni (2000) also attempted to map to the analysis of comics. However, Groensteen does not even come close to talking about semantic concerns in this way, despite purporting to (the back cover describes the book as “An authoritative exploration of how the comics achieve meaning, form, and function”).

Really, braiding and arthrology are a theory of compositional relationships. What Groensteen focuses on is not in any way a system of how the content of panels lends towards making meaning — it is primarily about the relationship of the visual composition within a panel to others on a page and other panels scattered throughout the broader work. In particular, his theory observes the recurrence of compositional or thematic similarities (such as motifs) across varying distances of space, be it a page or a whole work, and whether the position in layout of those panels has significance (such as in the first or last position of a page).

While these observations provide a very different perspective on comics pages, they do not even scratch the surface of the “holy grail” of questions about comic sequences that they appear to address: how do sequences of images create meaning? Indeed, Groensteen dismisses such aspects as merely about "storytelling."

Rather, arthrology, like the other aspects of Groensteen’s system, once again highlights a physical feature of comic pages in what is proposed as an abstract code of understanding. The theory treats the medium as an “art object” to be analyzed like a multi-canvas painting, as opposed to a communicative medium in the McCloudian sense. Indeed, such an approach would be akin in verbal language to recognizing that various patterns of sound appear throughout different words in a story — perhaps telling you something about phonetics, but nothing about meaning. All told, Groensteen’s system limits itself to only describing the surface aspects of the medium, without pushing towards any deeper constraining principles.

Despite the criticism of the workings of his system, Groensteen is nevertheless an astute observer of various components of the comics form, and his numerous insights on comic elements betray a deep commitment to dissecting the medium. The System of Comics does reveal gems of this intuition at times when discussing various components, though the overall theoretical architecture in which they are embedded is not commensurate with the value of their insight.


The most troubling thing I found about The System of Comics is the overall orientation to scholarship that this work represents. Throughout, Groensteen’s writing conveys the attitude that its theories are entitled to significance, echoed in the introduction where Beaty and Nguyen hail it for emerging from the rich semiotics tradition, as if that inherently legitimates its ideas. This is a direct knock on McCloud, who they note has “been criticized for…lack of theoretical sophistication” (1). This criticism, though, has come only from academics who unfortunately perpetuate the stereotype of ivory tower snobbery, as if scholarship must be done in the academy. It is hypocritical at best for any scholar to deride McCloud for not engaging a broader literature while lauding a book with very similar ideas to McCloud’s without any citation or discussion of them. So much for the value of “engaging the literature.”

Truly, Groensteen is the anti-McCloud, keeping the keys to theory locked away in the ivory tower, reachable by only those willing to slog through the exposition to reach it. While McCloud’s work about comics is presented in its visual language, there is some irony that The System of Comics takes “iconic solidarity” of images as its crux, yet is almost devoid of graphic examples. Perhaps this is why McCloud is ridiculed so much by Groensteen-minded theorists: he willingly gives away the goods to the rabble without a fight.

If McCloud lies on one extreme of being too accessible (as if that’s a bad thing), Groensteen’s work is the inverse, reflecting the worst of academic jargon and inapplicability of “theoretical sophistication” — to the extent that the terminology obfuscates the actual theories (to academics and layfolk alike). Groensteen offers complicated names and lengthy descriptions to what are otherwise fairly facile observations about surface phenomena.

Moreover, numerous questions are left unanswered, for instance how this theory is useful or applicable to 1) describing how this medium of sequential images communicates, 2) contrasting various comics’ structure with each other, or 3) describing the relationship of the visual language in comics to other modes of human expression.

Groensteen’s “system” accomplishes none of these. These are all questions that are important to semiotics as a discipline, so why is Groensteen's approach unable to even broach them? It only states vague principles for nearly obvious observations, with fancy names and a semiotic tradition to validate its claims. It is scholarly hand-waving at its best, and a reflection of why the invocation of “semiotics” garners more eye-rolls than awe in many circles.

Paradigm Shifts

In Kuhn’s renowned discussion of paradigm shifts, he describes that prior to a major shift many similar theories will emerge in competition with each other, yet all pointing towards similar intuitions. In this case, Groensteen taps into the common intuition that the system found in comics is somehow similar to the system of language. Indeed, he begins with a strong statement about how comics are a language, yet his analysis paints a picture of the comic medium that is decidedly un-language-like.

While his stated focus is not on “minimal units” — thereby avoiding topics such as how the system is understood as a graphic domain and meaning-making for signs and symbols — he states nothing resembling a grammar for how the sequence creates meaning (though claiming that is his intent). If his aim is to describe a “code” that shows this is a language, one would think it would involve structures akin to language. All this leaves doubt as to whether Groensteen would actually know what being “a language” would entail in the first place (aside from metaphoric extension).

To this extent, Groensteen’s total work certainly marks a valiant attempt that can take its place next to other approaches that reflect the growing pains of a discipline, sharing the intuition that the comic form should be compared to language (including: Eisner 1985; McCloud 1993; Saraceni 2000; Cohn 2003, etc.). However, this piece and its theories are not the revolutionary new paradigm for comics that will sweep away all others, and will likely follow the path of the branch of semiotics from which it hails — being considered largely passé in the study of language.


Blogging has been slow lately for me what with the oncoming start of school next week. I've mainly been devoting my time to the set-up of my latest experiment, which, after a few behavioral studies, will finally start looking at people's brainwaves reading comic strips.

To give an idea of the scope of experiment preparation: I have to create 200 novel Peanuts strips using panels from existing strips, and then make an additional 600 that are not of the "normal" variety. So, 800 new strips for one study, which will all be rated and tested in three pre-experiment studies to make sure they will be acceptable to use. Oy!

I do, however, plan to post a few reviews in the coming weeks. First, I will finally post my review of Groensteen's System of Comics, though I'm debating how best to parse it up (one LONG post? several shorter ones...?), and I may actually get around to finishing my post on Abel and Madden's latest, Drawing Words & Writing Pictures.

Stay tuned.

Note: Especially when things are slow around here, I'm always open to requested topics if people want to pick my brain. Feel free to send me conversation pieces!

Equivalences for "Language"

In claiming that the graphic form (especially in sequence) is structured as a language, it might help to parse out how to make the analogy reasonably. It's not as if no one has ever noted the similarities between the forms; in fact, it's fairly common. However, in my opinion, mistakes are often made.

First, I assume that there is "Equivalence" between the different modalities, which can be summarized as "the expectation that the human mind/brain treats all modalities in an equal way, given modality specific constraints."

On this account, we would expect all modalities to feature the same sort of storage in memory for patterned signs of differing sizes and functions (from sound patterns of phonemes to sentence patterns of idioms) and to feature ways to combine all those elements at different levels of structure (phonology through discourse). We would also expect development to be similar, with a critical learning period and drop-offs after that.

However, this is not the approach that most comparisons of the verbal and visual forms take. Rather, they often try to make direct superficial analogies between specific types of structures. For example, "such and such" is the equivalent of a "word" or "sentence." This is often why many want to claim that single images have "grammar" (because a single image has lots of information in it, like a "sentence" and unlike a "word"), even though composition within single images behaves nothing like a grammar. (Nor should we expect it to, given the differences between sound and light!)

A similar endeavor has tried to find "minimal units" of the structure of the forms, following the school of Structuralism (most popular in American linguistics from around 1920-1960ish). However, again, just knowing minimal units doesn't tell you about the broader structure, and units larger than minimal units might also be useful and insightful. It also gives no beneficial comparison other than that "minimal units" exist in both domains.

All of this is an argument for looking beyond the superficial understandings of "language" and looking for comparisons in deeper, more fundamental aspects of structuring.

Wednesday, August 13, 2008

Diversity in Visuals

At the VaIL conference a few weeks ago, one of the frequent conversations revolved around the issue of creating a universal graphic system.

The belief that visuals are universal is not new, and largely stems from the fact that most drawings look like what they represent ("iconicity"). This is also a motivating factor behind so-called "universal writing systems" like Blissymbols and Icon Language.

However, when looking at drawings as reflecting patterns in the mind, then there is actually quite a lot of diversity. Patterned styles like those of Japanese manga compared to stereotypical American superhero comics reflect culturally diverse conventions of different populations. Even more relative are the sand drawings of native Australian communities, and the constraints they place on recognizing other graphic depictions. Several other stories exist of people not fully understanding iconic drawings as well, such as the story Mort Walker tells about natives (I can't recall where) who thought that a person's legs were cut off, despite just being "out of frame."

Given this, perhaps it's time that we get over the idea of a universal communication system, and we come to accept that populations of humans will always develop idiosyncratic and in-group tendencies.

While globalization especially has raised a desire for inter-cultural communication, diversity in communication systems may have had evolutionary advantages. Having an identity tied to your language separate from others' means you can identify your group members and, even more helpfully, keep secrets from other groups. If it weren't beneficial, we would all be speaking the same language (or so the argument could be made... diversity could just be a useful "spandrel" for the way other cognitive functions happened to turn out).

Perhaps instead of attempting to create universal systems, we should instead acknowledge the diversity that comes along with the way our brains are wired for social interactions. By accepting it, we can then strive to work with the constraints that our cognition and diversity bring or allow. That way, we can stop fighting the tide of inevitable diversity while wearing the rose-colored glasses of universality, and instead invite appreciation for relative systems and cooperation to meet common goals of communication.

Vocab gaps

This interesting and quite fun essay/post reviews the book Reading Comics and ponders the definition of "comics" and some other terminological issues. Its biggest query is 'what is the word for the act of making a comic... the producing of sequential images bit?' (paraphrased).

I agree there's somewhat of a gap in vocabulary, but think that it's partially symptomatic of a larger issue: not recognizing that drawings (especially in sequence) constitute a system of communication. That is, we lack the recognition that drawn works use a set of mental patterns the same way that, say, English does.

We in America speak in English.
We in America draw in ___?___.

Note that we also have many words describing the avenue or subroutine for picture-making: painting, sketching, penciling, inking, coloring, etc. But we have no word for what one actually does with those media besides the uninformative "drawing," which inadequately covers the dynamic process of creating sequences of images.

In a previous thesis of mine (¡Eye [heart] græfIk Semiosis!), I argued that a language's vocabulary for graphic creation partially relies on how that culture's graphic systems are structured. In Western representation, we have a dominantly sound-based "writing system" with a preference for highly "realistic" drawings, and our words for these things are very different: "Writing" versus "Drawing." Contrast this with Japan, where the "writing" uses a wide range of meaning/sound values and the drawings are much less realistic: Japanese uses a single word for both concepts (kaku: かく), while using two separate Chinese characters to distinguish the uses (writing: 書く, drawing: 描く).

So, in our case of English, what we're left with is a gap in vocabulary, for those representations that run a middle ground between our prototypical conceptions. Anyone who reads this blog should know my answer: The missing graphic system is what I call "visual language" (when done with images in sequence).

We in America speak in English.
We in America draw/write in American Visual Language.

If this is language, then we might as well just adopt language related vocabulary. I have no problem with the idea that I write in pictures (as well as words), and I doubt many others do either.

And, as I've argued before, recognizing that such a system exists (visual language) is the first step towards defining (or rather, un-defining) "comics." By taking away formal definitions that rely on the "comics=visual language" equation, "comics" can be defined as what we've always known it to be: socio-cultural objects (books, strips) in which a visual language is written, often pertaining to particular genres, and a culture surrounding them.

Finally, many have complained that debating vocabulary is a waste of time. But, something like this has realistic applications. Take for instance pushing the notion of visual language through the idea that sequential images are "written." Where would you learn such a thing?

You learn to write in a writing/language class, and such would be the case with visual language too. You learn to write in pictures in a writing class, not an art class — bringing it directly into the fold for education and development in a very practical and communication-driven way.

Comics, Webdesign, Closure, Cartooning

A friend of mine sent along a link to this lecture by Andy Clarke about how comics can inform webdesign. Most of the talk is a regurgitation of McCloud's theories, but he has some interesting parallels and ideas. (There's an mp3 of the talk that you can flip through the slides with).

Also, while I don't necessarily agree with the ideas, Gary Sullivan ruminates a bit on Closure.

Finally...I was sent a link to the Australian Cartooning School which has a decently formalist bent to analyzing and teaching cartooning. Definitely worth poking around the site.

Alright, heading back from San Diego soon. Perhaps I'll be able to muster up some convention/conference reports...

Talk talk talk...

I've been remiss in my reminders this year, but I have a few talks coming up if anyone is around San Diego.

The first is another appearance at the Visual and Iconic Languages Conference on July 21-22, which I believe is closed to the public (more's the pity), though my talk will be on a general overview of visual language theory. Hopefully, as with last year, they'll put the talk online.

A few days later I'll be over at Comic-Con, where my talk is on Friday at 12:30 in room 30AB. I'll be presenting a talk about manga and Japanese Visual Language. Here's the description of the panel:

12:30-2:00 COMICS ARTS CONFERENCE SESSION #7: VISUAL LANGUAGE - Neil Cohn (Tufts University) explores the visual language underlying the "manga style," how it works and how it differs from the visual languages in comics developed in other cultures. Robert O'Nale, Jr. (Henderson State University) uses David Mack’s Kabuki to illustrate how gestalt can be an important avenue for analyzing design and meaning in comics. Alec Hosterman (Indiana University South Bend) demonstrates the dominance of hyperreality in comics art and explains how it can be utilized for further study of the art form. Room 30AB

Both of the other presentations look promising, so it should be a fun panel. Come on out, enjoy the talks, and say hi!

Fun with Text

While preparing the Peanuts strips for my next project, I came across a fantastic integration of text subtly hidden in this panel:

If you look carefully, at the center of the starry smack mark where the ball hit the bat, there is text reading ".315", which I assume is a reference to Pig-Pen's batting average (pretty good). This is particularly interesting to me since it's a descriptive use of text as opposed to a sound effect.

xkcd has a similar usage where instead of sound effects, the text reads the name of the action being done. However, the result in xkcd is that it describes the actions straight-out.

These two types of usage give complementary aspects of the way languages structure actions and manner of motion. Compare:

a. The ball flew into the glove.
b. The ball spiraled into the glove.

In sentence (a) you are given the action that the ball does, but little about the characteristics of that action. Sentence (b) gives you the manner of the motion, from which you derive the action implicitly.

Sound effects often give you information about manner of motion. For instance, "Klunk" describes a golf ball falling into a hole, and "zoom" describes a speeding car. These elaborate on the action itself. The use of text in xkcd eschews this to just focus on the action, without any manner of motion. What is intriguing about the Peanuts example is that ".315" is neither manner nor action: it is purely descriptive in an additive sense.

Playing with this one step further, we can create some panel pairs that replace the action with the sound effect. One characteristic of these types of "action text" is that they can stand in for the actions themselves (discussed a bit in Interfaces and Interactions). Note that substituting for either an action or a manner of motion works fine:

However, substituting ".315" is a little weird — even with the expectation of the event — since it doesn't stand in for the action itself. It only gives you additional information about the action:

Looking through all these Peanuts strips, I think Schulz was more of a formalist than he's given credit for.

Manga "decompression"

In Understanding Comics, McCloud made the claim that manga supposedly uses more circuitous storytelling because of the formats of their books.

In my paper on Japanese VL, I dismiss this on the grounds that nothing about longer formats gives people the drive to make slower paced narrative. Just because you have ample space doesn't mean you're going to use it to let the story linger more. You could use that space to fill in even more "compressed" storytelling.

Thinking more about this, webcomics are another good example against this theory: to my knowledge, we haven't seen a vast decompression of storytelling on the web, even though authors are free to use completely unrestricted "infinite" space (though feel free to prove me wrong!). On the one hand, you could say that they aren't effectively using the space that they have at their disposal. However, the other side could say that they're using it to achieve just what they want: they have no restrictions, so what they're producing is entirely their preference.

Personally, I think that there are numerous explanations for what might be going on in manga storytelling. Here are a few, some of which were in my paper...

1) It's just an inherent part of the difference between Euro-American VLs and Japanese VLs. We don't expect spoken languages to be the same, why should visual languages? Could "decompression" simply be a result of the development of how JVL evolved?

2) They're using the VL as a language: Manga use less text than American and European books. Greater reliance on the visual modality over the written requires the visuals to take on more expressive weight. The result is more complex structure in the visual sequences. This is comparable to studies asking people to only gesture with no speaking. The result is something that looks closer to the patterns in sign languages (though still not SL).

3) Cross-cultural differences in focusing on environment over action require more space devoted to "setting a scene." Research seems to suggest that Asian minds are more interested in the broader environment than in the specifics, and individuating different elements of the environment takes up more panel space than simply presenting it as a whole, backgrounded to the actions.

Notice that in all of these cases, formatting is entirely secondary. Indeed, it's somewhat interesting to think that formatting is one of McCloud's explanations, because much of his work is about transcending formatting. Here, the explanations focus on cognitive reasoning — meaning we should see the effects no matter what the format.

Garfield experimentalism

Apparently we're upon the 30th anniversary of Jim Davis' Garfield strip. As a ten year old I was pretty obsessed with the Garfield books, and can probably mark meeting Jim Davis at the ABA as a highlight of my fourth grade life.

Perhaps unsurprisingly, I've gotten about seven emails from people linking to the Garfield Minus Garfield strips, which I first saw a few years ago even. I was always partial to the Is Garfield Dead? premise, though Nothing Garfield strips are interesting too (though Barfield does give me a good chuckle).

On a more theory-related note, the Garfield Generator is a great example of a few points of my research. It shows that there is an overarching coherent structure built into the whole strip (at least sometimes in this case), even when the immediate linear relationships don't make much sense. This is somewhat similar to the famous Chomsky sentence "Colorless green ideas sleep furiously", and I'm actually basing my next big experiment on 6-panel-long Peanuts strips of this same nature.

In some cases with the generator though, you can easily tell that the position of the panel is somewhere it doesn't belong. The thematic role of the panel belies its canonical positioning.

Anyhow, Happy Birthday Garfield, and thanks for the early influences on my comics obsessions!

Comics and the Brain... almost

Nagai, Masayoshi, Nobutaka Endo, and Kumada Takatsune. 2007. "Measuring Brain Activities Related to Understanding Using Near-Infrared Spectroscopy (NIRS)." In Human Interface and the Management of Information. Methods, Techniques and Tools in Information Design, 884-93. Heidelberg: Springer Berlin.

Looks like I was beaten to the punch... I've found a study from last year that analyzes the activity in the brain while reading comics. However, it doesn't say much.

The authors use near-infrared spectroscopy to measure blood flow in the brain while reading comics. This technique uses infrared light to measure where blood flows in the brain, which can thus indicate the brain regions involved in various behaviors. They find that "the left prefrontal lobe region is activated when people actively try to understand the comic stories and to memorize their contents for reporting in the future."

However, there are extensive problems with this study. First, the number of stimuli they use is extremely small (only 6 strips) as is their population (13 people... which does not add up to counterbalancing). Comparatively, the study I'm planning will use 180 stimuli per trial (720 strips total) and use somewhere from 24 to 36 people.
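To spell out the counterbalancing point: counterbalancing generally means splitting participants evenly across versions of the stimulus lists, and 13 is a prime number, so it can't be divided evenly into any number of groups. Here's a quick sketch in Python. The function name is made up for illustration, and I'm assuming an even split is the only requirement:

```python
def even_splits(n_participants):
    """Return the group counts (at least 2 groups, at least 2 people per group)
    that divide the sample evenly."""
    return [k for k in range(2, n_participants) if n_participants % k == 0]

print(even_splits(13))  # [] -> no even split is possible
print(even_splits(24))  # [2, 3, 4, 6, 8, 12]
print(even_splits(36))  # [2, 3, 4, 6, 9, 12, 18]
```

Samples of 24 or 36, by contrast, divide cleanly into several possible numbers of groups.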

Additionally, the increase in blood flow that they observe only occurs in "reported" conditions — where subjects are actively making a judgement about the stimuli, as opposed to scenarios when they are not. This seems more to reflect the well-reported cognitive activation for making judgements than anything about the structure of the comics themselves.

So... this really doesn't tell us much about comics and the brain, but it's nice to see other people are at least taking stabs at it as well.

Friday, June 06, 2008

Collaborative drawing

Last weekend on public television I saw a fantastic biography about Pete Seeger, the influential folk singer and activist. Throughout, Seeger stressed his desire to sing with people, not to people — motivating music as a collaborative endeavor. This sentiment is echoed in the accessible book, This is Your Brain on Music, which points out that music as "performance" by people on a stage to other people seems to be a fairly new thing. Traditionally, music was a group activity that was not reserved for those of express "skill" and training.

Drawing is much the same way. We often make a huge break between those with or without "talent" — resigning people to the misperception that they "can't draw", when really our biological endowment ensures that we all can draw. Really what is at issue is a level of fluency, and most people just don't develop with the proper exposure or practice.

Language, like this sense of music, is entirely collaborative. And, it is learned collaboratively, unlike most learning of drawing. In some cases, drawing might be instructed, often very well, though this is far from simply being interactive in the sense that you learn just by participatory immersion.

In a productive sense, drawing is also highly non-collaborative in our modern life. Belonging to a print-culture, most drawers and readers are separated by huge distances of space and time. This isn't always the case though. Sand narratives by native communities in Australia are highly interactive, drawn in real-time communication.

Humans are an intensely social animal, and my gut tells me that nearly all of our expressive capacities developed and thrive in such collaborative interactions. The question is: how in our modern ecology can we facilitate such usage for visual language? Will we have to rely on technological breakthroughs (ex. digital whiteboards), or can it grow organically without the crutch of engineering?

For those interested in more about this, my article from a few years ago "Interactive Comics" probed a lot of these ideas.

My "Homopholganger"

Huzzah! Today is the sixth year this website has been online! If I remember correctly, I posted everything online while the Lakers were in the playoffs about to go to the championships... and lo and behold the past is repeating! (yes, I'm a Laker fan... which will certainly be interesting living in Boston as they move on to play the Celtics in the Finals)

So, here's a semi-research-related story to commemorate the occasion. The Tufts Psych department (in which I'm a grad student) is hosting a conference this weekend, and one of the featured speakers is a psychologist named...Neal Cohen! (no relation)

Naturally, I thought it would be hilarious and awkward to meet him. The first day of the conference I had turned it into a scavenger hunt, with several faculty and other grad students all on the lookout for him. We came up with a great portmanteau word to describe someone who shares the same name as you: your "homopholganger." By the time I arrived Friday, I was getting asked over and over if I'd met him yet.

I actually did end up talking to him shortly after his own presentation, and hilarity ensued! Even cooler, I started making connections between some of his work on the hippocampus to things I'm finding in visual grammar. Naturally, I proposed a collaboration... He thought the ideas were pretty cool, so, who knows, perhaps in the next few years we'll see the fantastic byline: By Neil Cohn and Neal Cohen.

Random!... panel sequences that is

As long as we're on the topic of comics that people clip out for me, here's another one that my advisor passed along. For some reason, he's rather partial to Zippy the Pinhead (I think because of the philosophy jokes), and this one caught his eye. Particularly this first panel over to the side.

Zippy it seems comes from the Non-sequitur school of panel transitions (if you're into that sort of thing).

What makes this fun for me is that my next experiment is actually going to use various scrambled strips to help illustrate the differences in processing between those and normal strips (plus some other more complex strip types).
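For the curious, here's a minimal sketch in Python of what "scrambling" a strip amounts to. It's only an illustration and not the actual stimulus-preparation procedure; the function name `scramble_strip` and the panel labels are hypothetical:

```python
import random

def scramble_strip(panels, rng=None):
    """Return a reordered copy of `panels` whose order differs from the original.

    `panels` is a list of hashable panel labels (hypothetical placeholders here).
    If the strip has fewer than two distinct panels, no different order exists,
    so an unchanged copy is returned.
    """
    rng = rng or random.Random()
    original = list(panels)
    if len(set(original)) < 2:
        return original
    scrambled = list(original)
    # Reshuffle until the order actually differs from the normal strip.
    while scrambled == original:
        rng.shuffle(scrambled)
    return scrambled

normal = ["panel_1", "panel_2", "panel_3", "panel_4"]
print(scramble_strip(normal, random.Random(0)))
```

Passing a seeded `random.Random` makes the scrambled order reproducible, which would matter if the same stimuli had to be regenerated later.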

Not much is out there about this sort of research, but one study did show that people's comprehension of sequential "picture stories" (Mercer Mayer stories) correlated with their comprehension for text. Skilled readers showed a drop in recollection for scrambled compared to regular sequences. However, unskilled readers showed no comprehension differences at all.

I'm a bit dubious that fluency in visual language is comparable to general comprehension skills (they used no measure for graphic fluency), but this study at least showed some support for a domain general capacity.

Tuesday, May 20, 2008

Rory Root, we'll miss you

I learned with great shock and sorrow this morning that Rory Root, owner and operator of Comic Relief in Berkeley, has passed away. Rory was a phenomenal presence in the comic industry, and I remember fondly being first introduced to him by Beau Smith as having "the best comic store anywhere" when I was still working for TMP at the ComicCon as a teenager. When Beau discovered I was going to go to college at UC Berkeley, he made sure I knew Rory before going.

During my time at Berkeley, Rory was always interesting and encouraging, especially when I began my greater foray into theory. His was the first store that carried my Early Writings on Visual Language book on theory, and he always made sure I did booksignings with them at ComicCon.

Rory once took me to lunch under the auspices of giving advice for future bookselling. He was always quick to introduce me to people he thought might give me good exposure, one time unexpectedly taking one of my books out of my hands to give to a blogger saying "Trust me, this will be good publicity." (He was right)

He was a fixture in this industry, and a wonderful friend and benefactor. He will be greatly missed.

Split planes

I haven't gotten a newspaper at home in years, but every now and then my father sends me clippings of comics or articles. The comic that he seems to send the most is Pearls Before Swine which has a periodic flair for formalism.

In the last set that he sent, the characters do a fair amount of walking on the borders of the panels (Here, Here, and Here):

"Awareness" of panel borders by characters within them is nothing new, but doing so reveals that there are two levels of representations in this visual language of comics. There is a "Representational Plane" (RP) that the content exists in, and a "Framing Plane" (FP) that holds things like panel borders and balloons/bubbles/text boxes. Usually, the Framing Plane just lies "outside" the RP, but instances like these collapse the layers together. (see linked essay below for illustrations of this)

Another hint that these two layers exist comes from the fact that text carriers can become panels, as I discussed in my article on "Loopy Framing":

This commonality between their forms (both encapsulate information, and both are not part of the image matter but can be interacted with in a "meta" way) goes towards their being two aspects of a singular plane of Framing.

Note: For those more interested, I discuss this more extensively in my paper Interactions and Interfaces.

On PowerPoint

I was recently asked by the VizThink people to devote a post to the prompt: PowerPoint: A powerful tool poorly used or a poor tool overused?, so... here goes...

I do think that PowerPoint** can be a powerful tool that is misused, but its functions and autofills can also be overused. In many ways this is not the fault of the program, but the fault of the users for relying on the program to guide their presentations and thinking. (though, perhaps some blame should go to the programmers who design it to serve that purpose)

At its heart, the program is just a slide show. Its key function is to show slides one at a time after each other. Everything else is just bells and whistles. That potential is extremely powerful, but it can be misused. The choice of what to put on those slides makes all the difference. This is a reflection of the user — and their own abilities for storytelling, narrative, etc.

When slides are used as an alternative to substantial speaking they become a hindrance: Cramming too much information on a slide. Presenting information that people will be forced to digest at the same time as trying to follow your speech. Seeing the presentation as a rigid path not allowing the free form creativity of writing on a blackboard.

From my experience, slides should be used like gestures. Co-speech gesture occurs at a rate of roughly one gesture per clause, and usually elaborates upon some aspect of speech often adding a spatial dimension to it. It isn't something separate that happens to go with speech though — it expresses an integral and complementary part of the conceptualization that also goes into verbal expression.

Another analogy more pertinent to this blog (and one I discuss in this podcast) is that slides should serve as the equivalent of a "panel." The slide is the image content of a panel, while your speech is like the text. The biggest difference between the two is the temporal quality of presenting them; otherwise they serve largely the same function.

Not every use of slide shows needs to be clipped and truncated as the Powerpoint Gettysburg suggests. You can still have beautiful and powerful oration using slides, but it should not depend on the slides. Rather, the slides let speech be more than just sounds. It has to be a multimodal expression, where both slides and speech work in concert with each other to achieve something more. The ability to do this, I would suspect, is cognitively the same whether it's done in print or on a screen.

PowerPoint is not a substitute for narrative skills, and its problems can largely be traced to forgetting that, or to believing otherwise (whether as a user or a programmer). Excel can't make you a powerful statistician. Word won't make you a good writer. Why should we expect that PowerPoint is to blame for poor presentations?

**My personal preference is actually Keynote. I'm going to treat this as just a discussion of slide show programs in general. If we're targeting PowerPoint specifically, then you can amp up my dislike of the autofills, etc.

CogSci Comics

I am currently in the refractory period of the semester, enjoying the freedom of summer break starting and the ability to work on all those projects I usually don't get around to doing during the school year. I've now plotted out at least three papers I plan to write, plus a visual language class syllabus to refine.

I should have some more substantial blogging to do soon, but in the meantime, here are some goofy random comics related to cognitive science and linguistics:

Signifier vs. the Signified

Gava Guy

Cog Sci Supers!

Please feel free to post more as you find them... Enjoy!

I've been ridiculously too busy to blog lately, largely due to my upcoming exam/project on biopsychology. Once that's over it's summertime! (i.e. time for me to work on projects otherwise not given enough attention while in classes). In the meantime...

Here's an interesting site that attempts to show the relative sizes of things in the universe. I like how it's using digital tools to get at visualizing otherwise hard to conceive of things. In some respects it serves as an "Infinite Canvas" in McCloud's sense. (Beware: it has sound.)

Podcast: "Grammar" in visual language

I've done another podcast with the folks at VizThink, this time debating Yuri Engelhardt and Dave Gray on what constitutes a visual language and the nature of visual language grammar.

This new format allows you to skip around to different chapters to jump straight to parts of interest. (Please note, I object to the insinuation in the chapter title that "comics" can equal "visual language"):


I think that there is something I strived to point out throughout the discussion that I didn't articulate well enough, but to explain it I'll have to do a mini-linguistics lesson.

In the podcast, Yuri pointed out the view that language has two main parts: a set of units (lexicon) and a set of combinatorial rules (grammar). This view of two components is essentially Chomsky's view of grammar, and organizationally looks something like the diagram to the side.

In this traditional view, syntax/grammar is the component that offshoots meaning, and only syntax has properties for combining elements together. I said that I agreed with this notion, but really I don't. When I mentioned that I subscribe to a view from Chomsky's student, Ray Jackendoff (my teacher), I should perhaps have elaborated on the differences between those perceptions more, because they are extremely important and can resolve some of the conflict of the debate.

Jackendoff's view of grammar is different. This "Parallel Architecture" says that the mind has three main interfacing components: modality (auditory/manual/graphic), syntax, and conceptual structure (meaning). The "lexicon" is distributed across the interfaces between all three of these structures; it doesn't have its own "place." And, importantly, each of these structures has that capacity for infinite combinations, not just syntax. (Note the similarities to my listing of properties of Language). This would look like this:

Much of our debate focused around whether or not single images (diagrams) have "grammar." My objection is that it does not function like "syntax" does in a verbal grammar, though I acknowledge that there might be a hierarchy or a combinatorial system there. If you subscribe to the Chomskyan view of grammar, you're forced to say that the combinatorial element "is syntax," which is exactly what Yuri is doing:

If you follow the Parallel Architecture (as I do), syntax is not the only element that creates hierarchies. They all do. So, combinations within a single image or diagram are "grammar" insofar as phonology is the "grammar" of sound. Essentially, Yuri's "visual grammar" is the combination system within the graphic structures, which is why I kept prodding about the difference between it and just the system of perception (and why most of its "constraints" are based on iconicity). This instead looks like this:

In contrast, my grammar for visual language needs a combinatorial system for individual images and for combining them together, looking like this:

To the extent that the narrative structure takes concepts and a modality and orders them coherently, it functions the same as syntax in verbal language. This is "visual language grammar" analogous to the way that syntax is verbal language grammar (nouns and verbs). But, all three structures have combinatorial properties. They don't all make reasonable analogies to saying that they are like "grammar" in the syntactic sense, but they may be combinatorial.

(This is also why you can say that "gestures are to sign language what individual images are to visual language" in the context of sequential images, but not for individual images. There is no developmental/fluency gap like this for "... visual objects are to individual images". That is, people don't learn how to draw simple graphic signs yet remain unable to put them into a diagrammatic arrangement.)

Making this shift in perception buys you a lot: It explains why single images may have hierarchy (like perception/phonology), but don't have grammar (like syntax). It addresses why most of that structure is guided by iconic and indexical constraints. And, it also may give you a leg up in describing combinatorial aspects of images beyond diagrams (which occur within panels).

Finally, it is worth noting that not just aspects of language have consistent patterned units that appear hierarchic in structure within our cognitive system. This also appears in music, event structure, vision, social structure, and a myriad of other domains (discussed well here). But, we don't have to call them "languages" because of this broad similarity.

Suggested reading:
Foundations of Language and
Language, Consciousness, Culture, both by Ray Jackendoff

Navigating page layouts = defining "comics"?

One topic I debated using to close out my new essay on page layout (pdf) was its relationship to McCloud's definition of "comics."

As most know, McCloud's definition is that "comics" are "juxtaposed pictorial and other images in deliberate sequence." Yet, he never places any constraints on that. He means all sequential images are "comics," regardless of the characteristics of content.

On the one hand, you can say that his notion of closure demands that there is some content to those panels, and that closure is the underlying force behind his definition. However, there are places where this appeal to content is a bit slim. This is most apparent in his panel transitions, where "non-sequitur" serves as a catch-all for anything his other transitions don't cover. Or, in claiming that empty panels that represent "time" still maintain the essence of "comics." In other words, it doesn't matter what's in them, as long as they're in sequence.

If really all McCloud means by "comics" is that two graphic units are placed next to each other, is he really just talking about the system I've proposed for layouts? This system just tells people how to navigate through a comic page: how to read from one panel to another. And, since my experiment used blank comic pages, it has nothing to do with content, just like McCloud's definition.

So, if navigation between panels is really all he's talking about for "comics," isn't that a little... I don't know... unremarkable?

It's also interesting in this interpretation, since (as I note in the paper) McCloud's notion of the Infinite Canvas essentially desires to simplify this navigational structure. Food for thought...

Sunday, April 20, 2008

Clever Dumbo?

The Language Evolution blog posts this interesting YouTube video showing an elephant painting a picture of an elephant:

I've vocalized often that as much as there is a debate about whether animals can or do have language (they don't), we know of no animals that draw. By drawing, I mean that they employ tool use to graphically achieve conceptual expression (that is usually iconic).

While it is pretty amazing to watch, I am highly dubious of it as a reflection of elephant cognition, or that they really can "draw" by the definition above.

I imagine that the elephant has it remembered as a specific sequential routine of actions, rather than the execution of a set of patterns stored in the head for representing a concept of "elephant" (or "me"). That is, it doesn't have a mapping of action/graphics to concept. Remembering such a sequence of fine grained actions is still pretty impressive, but very different than being conceptually expressive.

This is likely a trick the elephants have been taught to show tourists like the ones who took the video. Note that in the beginning of the video there are multiple easels, perhaps for multiple elephants? Do they all draw the same patterns? Do they have any capability to recognize what they're drawing? Do they do this spontaneously without an audience?

Before conceding that elephants truly can draw, I'd like to know enough to rule this out as a "Clever Dumbo".

Wednesday, April 16, 2008

Panel Time!

1.5 seconds per panel. That's how long it takes on average for wordless panels to be read.

I recently completed a very exciting study that asked people to read four-panel comic strips one panel at a time. In this "Self-Paced Reading" task, they see four boxes on the screen, and with each button press a subsequent panel appears in the sequence. Only one panel is shown on the screen at a time.

While they do this, the computer records how long it takes them to move from each panel to the next. By manipulating the strip in various ways, I'm able to tell whether certain manipulations change reading times relative to the original "normal" sequence.

I'm not going to go into the intricacies of the experiment (you can wait for the write up for that), but I thought I'd share some tidbits.

For instance, for "normal" wordless four-panel Peanuts strips, it takes a person on average 1.5 seconds per panel. The first panel is usually read relatively slowly (1.7), the second panel is fastest (1.3), the third is slightly less fast (1.5), and the fourth is back to where the first was (1.7).
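For the curious, here is a rough Python sketch of the arithmetic behind those numbers. It is not the software I used in the study: the function name and the example timestamps are made up for illustration, and the only real data are the four rounded per-panel averages quoted above. Those rounded figures average out to about 1.55 seconds, which is in line with the roughly 1.5 seconds per panel overall.

```python
from statistics import mean

# Rounded per-panel averages (in seconds) quoted in this post.
panel_times = [1.7, 1.3, 1.5, 1.7]


def durations_from_presses(press_times):
    """Convert button-press timestamps into per-panel viewing times.

    Hypothetical helper: assumes each press reveals the next panel, so the
    time spent on a panel is the gap between two consecutive presses.
    """
    return [later - earlier
            for earlier, later in zip(press_times, press_times[1:])]


# Overall average across the four panel positions.
overall = mean(panel_times)

# 1-based position of the panel that was read fastest.
fastest_panel = panel_times.index(min(panel_times)) + 1

print(f"average per panel: {overall:.2f} s, fastest panel: #{fastest_panel}")
```

In a self-paced reading task the raw data are really just those timestamps, so a helper like `durations_from_presses` is the first step before any averaging or statistics.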

An interesting side note about this: since all these panels were the same size, yet were read at different speeds, it might imply a rejection of McCloud's claim that panel sizes affect reading time (which thereby somehow affects narrative time). However, this experiment didn't test that specifically, and no alterations in panel sizes or layouts were given either. So, I can't say anything conclusive about that.

This was a very exciting project for me to complete, because it marks the first time I've (anyone has?) looked at data like this for how people understand sequential images.

Monday, April 14, 2008

Navigating Comics Plus

Since various concerns have popped up here and there about my latest essay on page layouts (pdf), I figured I should take the time to reiterate responses to some of them here...

First off, the types of navigation I talk about here are absolutely intended to be part of a broader network of how people move through layouts. Certainly, panel locations aren't the only influence on people's movement through layouts. Other potential influences include color, content, etc.

What I was trying to get at is that people do have idealized preferences for reading directions given various conditions and that those preferences emerge *even in the absence of content* in non-left-to-right ways. My suspicion is that people use these sorts of preferences from panels as their first influence for navigational choice, which can then be further influenced by content and maybe color. However, that's something that would need to be empirically tested.

Another reason for creating this study that I didn't mention in my last post was that the year before I did the study, John Barber came out with his own paper about layout (unfortunately now taken offline). While he had some great ideas and observations, I disagreed with his basic claim that layout and meaning were expressly tied. Layout and content most definitely can be connected in important ways, but I think that this experiment nicely shows that they are governed by separate (yet interfacing) systems.

Finally, several people have been curious about instances where panels do not have borders at all. I do mention borderless panels, but only in a footnote, where I basically say "more testing needed"! My guess is that certain permutations follow the same principles that I outline in the paper, but others lead to greater violations because of the ambiguities they create (ahem, testing needed).

As with many of my papers, this should just serve to lay the groundwork for future work (by me or others). I'm still convinced that the really cool stuff will come far down the road!

Thursday, April 10, 2008

Essay origins

So far I've been very pleased at the response to my latest essay, "Navigating Comics", on how people navigate through page layouts (pdf). As several responses have been rolling in via email and elsewhere, I intend to do a post soon addressing concerns in that feedback. However, I think it'd be informative to first talk about the origins of this paper.

Back in 2003 when I was drawing our political book We the People, every now and then my editors would tell me they had trouble knowing exactly where to go in the sequence. Often this happened in consistent situations (like what I call "blockage" in the paper).

Most of the time I'd either simplify the layouts or make some graphic fix (like a trail) to indicate a clearer path. However, it got me thinking... My editors were quite a bit older than I was, and weren't all that experienced comic readers, so I wondered if this lack of experience mattered in their reading habits? (or if I was just needlessly making things difficult)

So, I designed this study to test that. I had a booth at ComicCon in 2004 to promote the book and my other works, so I designed a simple pamphlet people could make responses in and tested people throughout the convention.

I could tell immediately that the results would be interesting; I just had to wait another three years to learn the statistics necessary to show them (d'oh!). The theory with the tree structures predated the experiment by at least a year, but it didn't really say much without knowing about people's actual preferences. It's exciting to see that my suspicions for creating the experiment were borne out in data.

Every now and then I get a response to my work along the lines of "Why do theory? Why not do something related to praxis?" While theory can be interesting, enlightening, and much of science is simply about discovery without practical applications in mind (ex: penicillin), another reason is that theory can sometimes wrap back around on praxis. I like to believe that this is one of those cases, especially given that it came from those origins.

Sunday, April 06, 2008

New Essay: Navigating Comics

I'm very happy to announce that I have a new essay online: Navigating Comics: Reading Strategies of Page Layouts (pdf). This paper reports the findings of an experiment I conducted looking at how people navigate through comic pages. The big finding: people don't just mimic text going left-to-right and down.

The full abstract:
The spatial domain is often considered to be non-linear, given the analog nature of visual information. However, the visual language of comics defies this by siphoning images into a deliberate reading sequence. Most often this sequence is assumed to be read in an order that mimics text: left-to-right and down, a “z-path.” However, several scenarios can violate this order, such as Gestalt groupings of panels that deny a z-path of reading. To investigate these concerns, an experiment asked 145 participants to number empty page layouts in the order they would read them, and showed that readers use an alternate strategy extending beyond both the traditional “z-path” and Gestalt groupings to navigate through comic page layouts.
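To make the "z-path" baseline concrete, here is a small Python sketch. It is only an illustration and not code from the paper: the `Panel` type, its coordinates, and the example layout are all invented here, and panels are reduced to the position of their top-left corner. The function computes only the naive text-like order (rows top to bottom, left to right within a row); which order readers actually prefer is what the paper itself reports.

```python
from typing import NamedTuple


class Panel(NamedTuple):
    """Hypothetical panel record: a label plus its top-left corner."""
    name: str
    x: float  # distance of the left edge from the left side of the page
    y: float  # distance of the top edge from the top of the page


def z_path(panels):
    """Return panel names in naive text-like order.

    Panels are sorted by their top edge first (higher on the page comes
    first), then by their left edge (further left comes first).
    """
    ordered = sorted(panels, key=lambda p: (p.y, p.x))
    return [p.name for p in ordered]


# Invented layout: A and B are stacked on the left, and C is one tall
# panel on the right that starts level with A.
layout = [Panel("A", 0, 0), Panel("B", 0, 1), Panel("C", 1, 0)]

print(z_path(layout))
```

In a layout like this the z-path sends you from A across to C before coming back for B, which is exactly the kind of situation where a strict text-like order starts to look questionable.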

I should also say that this paper took a very long time to complete. The study was run in 2004 and I know I've been talking about releasing the results all the way since last summer. Thanks to all who participated in the study and to all for being patient as I finished it up!


Wednesday, April 02, 2008

Time and The Torch

On this page I found another great example of a page by Jae Lee that defies the "temporal mapping" idea that successive panels are successive moments:

I'm unaware of the full context of the page, but the Human Torch is flying around some big monster of sorts and creates the number "4" (for Fantastic Four no doubt) in his path. Doing so, his path begins by violating a constraint of page layout, entering at the bottom of the page, and then flies over his own path, which crosses a panel he's already been in.

I'm not sure I agree with the analysis given on that blog, mainly because I think appealing to McCloud's transitions and closure only hurts his otherwise fairly good discussion.

Now, I don't want to suggest here that there is not time being shown here, but I think that there are two considerations that need to be reoriented.

First, let's not talk about "time," let's talk about "events." To the human mind time is only an extrapolation of events. Thinking in terms of a clicking-clock type of absolutist Time is not on the same level with the understanding of time constructed in a person's head. From understanding events, we can tell that time passes, not the other way around.

Second, panels do not necessarily have to equal moments. Rather, panels function as "attention units" grouping important information into meaningful chunks. These chunks don't have to be moments, but they do highlight relevant information in ways that the author intends.

This is exactly the case in this example. The interesting thing is that the flow of events runs counter to the standard reading path of panels in order to create the "4" emblem. If reading left-to-right as if these were independent moments, this would make no sense whatsoever. But, because this display uses image constancy (breaking up a single image into parts... what I'd call a Divisional panel, the understanding of which is what Gestalt psychology would call Closure), the panels only serve to divide up the conceptual space of the image to highlight the Torch at different positions within the space.

Yes, the countering of events vs. panels is a bit funky, but it's also a creative use of playing the two off each other to reveal their functions.

Note: For those more interested in these types of examples about Time, most of these ideas are written about more extensively in my paper Time Frames... Or Not. Attention Units are discussed more in A Visual Lexicon.

Tuesday, April 01, 2008

Some links and whatnots

Steven Seagle has a decent piece up at the First Second blog about visual storytelling. He nicely taps into a simplified version of some of the same things that I've been pushing for my theory of visual grammar. The exercise he uses to rearrange panels is very reminiscent of linguistics methods, and is also a good one that shows how a broader structure exists above and beyond the so-called 'transitions' between panels.

Along these lines, Matt Madden and Jessica Abel will soon have a "how to" book coming out about comics. I've been hearing that it's somewhat theory oriented, so the book should be an interesting read. So, keep an eye out in the coming months.

Also, keep an eye on this very site in the next week. I will finally (finally!) be posting the results of my experiment about comic page layouts; I'm shooting for next Monday. This one has been a long time coming: I first ran the experiment almost 4 years ago and have been working on the paper since last spring! The project tested whether or not people read comic page layouts using the "left-to-right and down" path like text. A preview: the answer is "not really."

Finally, last month was my most trafficked month ever, so, thanks to everyone that's been reading my site lately!