I recently found this study (pdf) by Kinzer and colleagues that compares readers' comprehension and eye movements for narratives in comics and video games. Their main goal is to provide support for the use of comics and video games in educational contexts.
In this study, they presented sixth graders with either a video game version of a story or a comic created from the images of the video game. Overall, they found that participants understood the story better in the video game version than in the comic version. They also found that participants spent more time engaged with the game than with the comic.
I would take all of these findings with a grain of salt...
...because the stimuli appear to be extremely confounded: the comic versions of the story appear to be very badly created. Judging by the example in the text, the comic pages were clearly created by someone who had no real fluency in the visual language of comics. This is clear at a glance from the example that they provide in the paper:
If this is their example (which is probably the best one they have), I shudder to think what the other pages in the experiment look like. Seriously, if I wanted to design an experiment that had "incomprehensible comics pages" as one type of stimulus, I'd use pages like these.
It's no wonder they found that their participants had poorer comprehension for the comic version: using these stimuli is the equivalent of trying to test English comprehension while using broken English. It tells you next to nothing of interest.
There are two main points I'd like to make about this:
First, good experiments are hard to design, and having something worth saying must follow from having successfully designed an experiment that can give you good information. It pays to be critical as a creator and reader of scientific research (no matter what the topic).
Second, doing experiments using the visual language of comics is not trivial. Stimuli cannot be created by just anyone, without regard to their fluency in comic creation. Just because you can throw together some images and words into panels on a page does not mean you've successfully created an example of "native" visual language. Believing otherwise does a disservice to yourself and to others who might read and cite your research.