Brent Wilson and Marjorie Wilson have done a lot of great work on child drawing that has influenced my thinking, especially since they so often take comics and manga into account. I got to meet Brent last year at a manga exhibit, and he was a really nice guy. Here's another good (old) article that I recently read (it won't be the last):
Wilson, Brent, and Marjorie Wilson. "An Iconoclastic View of the Imagery Sources in the Drawings of Young People." Art Education 30.1 (1977): 4-12.
Wilson and Wilson present findings that contradict the century-old belief that imitation is bad for learning to draw. This view holds that children have an innate purity in their drawing that emerges out of their natural tendencies. Imitation is thought to defile this purity. Additionally, due to their iconicity, drawings have been seen as a correspondence between objects in the world and mental equivalents of those objects.
The Wilsons' data counter this view: of the hundreds of drawings gathered from high school students, virtually all could be traced back to imitation of some other source of representation (especially comics and cartoons, but not much fine art). They note that learning to draw might be more similar to learning words (though they don't seem to really know what that means beyond common-sense knowledge).
They propose that people use and create mental models for drawing, and that minor modifications to generalized structures can aid in creating specific representations. At times, a drawer would begin to employ one of these models, yet abandon it partway through if they couldn't produce the desired result with that schema.
People might also have mental models for one type of representation, but be unable to do drawings outside of that model. So, drawing novel objects either yields inadequate results or requires deploying existing models from other domains.
In total, people seem to be able to store hundreds of these types of mental models. For instance, one subject who drew comics could create figures in innumerable ways. When he draws a figure he hasn't drawn before, they hypothesize that he "averages" several of the models together to produce the new form. To me, this raises the question of whether the aggregation of these models creates a single, more abstract schema or whether it remains a catalogue of numerous "malleable" models.
This means the mind has a place to store such visual schemas, a "photological" component akin to a "phonological" component for spoken language. As I've said before, if perception is the desired stimulus for drawing, models aren't created from existing models — so the system never creates a conventional set of signs.
And just to riff on my previous post, this is also the reason why drawing perspective might be "awkward" for learners: it sidesteps mental modeling in order to exact a system of depth through measurement (as opposed to schemas... though "loose" drawing of perspective might involve some level of mental modeling created by learning how to do it. I'll have to see if there's research on that...).
I had a 13-ish year old student when I taught "drawing comics" in an afterschool program, who could not draw perspective for a tile floor in a hallway. He drew the receding lines of the hall walls with us as a group, then when drawing the floor on his own he drew it flat — like an aerial view. My interpretation: he couldn't override his existing mental models for spatial representation with new ones for perspective. That's not to say he couldn't if he worked at it, but at that moment he couldn't do it.
Finally, note that perspective and schemas are both learned — the difference is that one is acquired "effortlessly" through imitation (as they'd say in language acquisition) and the other is taught explicitly.