Attentional Intersections Between Psychology and Cinema

In our current technological environment, it can be difficult to tell the difference between what’s natural and what’s constructed. Our brains are natural and our computers are constructed, but is the knowledge computers produce unnatural by design? Is it any less efficient or powerful than the knowledge produced by the brains that constructed them? Is a screen a detriment to society for its addictive nature and brainwashing capabilities, or is it a superimposition of life itself, crafted for our brains to consume and reflect on? The cinema is constructed. Does that mean it’s less real or robust than natural human experiences? Is it more powerful, perhaps? Attention research and theories of cognition shed some light on the internal structure of cinema, especially as it relates to memory formation, entertainment, and empathy.

Although the language of cinema isn’t, in linguistic terms, a true language, it still holds great power to implant subconscious ideas, which are formed through story clues, set design, lighting, framing, perspective, and more. It can also influence emotional impact, attitude formation, and, occasionally, moral codes (expressed most extremely in fascist regimes and their tailored propaganda). As such, it can be inferred that cinema draws on an attention-based perceptual analysis which reveals the type of information humans are most receptive to, how far people are willing to suspend their disbelief before a picture is deemed unrealistic, and how certain perceptual clues may affect their attitudes in both the short term and the long term. Over four years of analysis, I’ve compiled an array of similarities between psychological and film theories, pursued personal research on the topic using an eye tracker, and analyzed the methodical undertones of edited films in an attempt to unearth how movie enthusiasts can take these intersections into account when analyzing the complexities of the screen and what filmmakers decide to put on it.

In either 1928 or 1929, René Magritte produced the famous painting popularly known as This is Not a Pipe (formally titled The Treachery of Images), which was meant to signify the difference between physical entities and representational ones. The painting of the pipe is just that, a painting. It is not a pipe, or even necessarily a recreation of one that exists; rather, it is an icon of what people generally picture a pipe to be. Some psychologists believe that our ability to recognize the painted pipe as a pipe, while also acknowledging that it’s different from a physical pipe, relies on our cognitive representational abilities. In other words, our minds hold representations of entities in the world, and we draw on those representations when perceiving information. Other psychologists, however, reject the idea of representations entirely, often opting for dynamical models instead. There are a number of alternative theories as well; some focus more on neuronal relationships and some on artificial intelligence systems, but these have mostly been rejected since, in the wake of a more integrated approach to explaining the human mind (inspired largely by the philosophies of cognitive science).

Psychologists who advocate for the existence of representations tend to believe that people perceive, categorize, and distinguish things through probabilistic inferences across multidimensional state spaces. For example, a person walking into a zoo would likely be able to tell, based on prior experience, which animals are which and to categorize them further. Through multiple exposures to these animals over a lifetime, the person would be able to track the subtle details of each animal and map them onto state spaces where members of the same species cluster. If the animals in question are similar, such as hamsters and mice, they might be mapped closer together. Over time and through compounding experiences, however, the distance between these clusters grows, and the person’s ability to distinguish the two rodents grows along with it. The similarities between them become less important and less noticeable. The same applies to faces: a person’s ability to distinguish someone’s sex based on their face increases with experience and, as a result, the opportunities for ambiguous perceptions decrease.
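The clustering idea above can be made concrete with a small sketch. This is only an illustration under stated assumptions, not a model taken from the literature discussed here: the two features (body size and tail length), the prototype coordinates, the equal priors, and the choice to model experience as a tighter cluster spread are all hypothetical.

```python
import math


def category_posterior(stimulus, prototypes, spread):
    """Posterior over categories, assuming equal priors and isotropic Gaussian clusters."""
    scores = {
        name: math.exp(-math.dist(stimulus, center) ** 2 / (2 * spread ** 2))
        for name, center in prototypes.items()
    }
    total = sum(scores.values())
    return {name: score / total for name, score in scores.items()}


# Hypothetical two-feature state space: (body size, tail length), arbitrary units.
PROTOTYPES = {"hamster": (0.0, 0.0), "mouse": (1.0, 1.0)}
AMBIGUOUS_RODENT = (0.6, 0.6)  # somewhat closer to the mouse prototype

# Assumption: more experience = tighter clusters, so less overlap between species.
novice = category_posterior(AMBIGUOUS_RODENT, PROTOTYPES, spread=1.0)
expert = category_posterior(AMBIGUOUS_RODENT, PROTOTYPES, spread=0.3)
print(f"novice P(mouse) = {novice['mouse']:.2f}")
print(f"expert P(mouse) = {expert['mouse']:.2f}")
```

With these made-up numbers, the same ambiguous animal is close to a coin flip for the novice (about 0.55 for "mouse") and a fairly confident call for the expert (about 0.90), which is the sense in which ambiguous perceptions become rarer with experience.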

Under this model, people would be able to recognize the painting of the pipe as a pipe because they’ve likely seen many pipes in the past. Thanks to their statistical inferences regarding pipes and objects resembling pipes, they don’t even need a three-dimensional object to make the connection. Psychologists who hold this theory generally believe that people carry an iconic image of each familiar entity in their minds; when they encounter such an entity, they draw on their probabilistic inferences along with the icon and then, within milliseconds, attribute an identity to the object and act accordingly. This would explain why people might immediately think, “oh, that’s a pipe,” upon noticing the painting and, once they read the famous caption beneath it, realize that they did, in fact, make a slight error in their mental semantics. The process of categorization and judgment seems to be instantaneous for people, and rightly so. If people had to wonder whether a door was a door every time they encountered one, a lot of time would be wasted staring at doors. The same reasoning applies to survival and adaptation: quick judgments are sometimes needed in order to navigate the world safely.

In regard to cinema, this representational ability would transcend the confinement of the screen, offering a relationship to what exists on it even though it is not physically present. In Lacanian terms, and relating to Christian Metz’s paper, “The Imaginary Signifier,” spectators might compare the screen to a mirror but, upon noticing that they’re profoundly missing from the reflection, will instead decide to view the screen through the eyes of the characters, suspending their disbelief about the movie’s inherent nonrealism.

 

Dynamical systems theory, however, follows the idea that there are no “states.” Just as time flows with no concern for those who record it, there are no specific “desirable outcomes.” Rather, there are consistent perturbations to stability. Biological systems will mostly lean toward semi-stable states, sometimes referred to as attractor basins. Through conditioning and higher levels of attention, one could reduce the size of one’s basins toward specific pursuits over time, relative to one’s effort. This theory argues that living systems, unlike computer systems, operate on multivariable levels far too complicated for simple cause-and-effect intentions. Rather, the system reacts to both the body and the environment, assembling its cognitive resources in an efficient, time-sensitive, and calculated manner.

These calculations aren’t always apparent to the individual because they are primarily subconscious, based on internal statistics and estimates of the self relative to the environment. Because this processing is biological, it isn’t considered a cognitive resource. For example, a person catching a ball won’t have to calculate actual physics in order to determine where it’s going to fall. Rather, the person collects the information from the light hitting their foveas, and the information gets processed almost instantaneously. The body reacts based on probabilistic inferences about the person’s physical location relative to the ball. This is calculated through bodily resources that are expended before any higher-level, more “abstract” resources are exercised. This is likely because people have a limited amount of space in their brains but a nearly infinite number of things to fit in them, so the brain has to make estimations in order to reduce the need for cognitive awareness in (what it perceives to be) common situations.

 

If behavior doesn’t require exact formulas but is instead molded by dynamical relationships, and the system is largely self-sufficient in terms of adaptation, then what we consider “autonomy” is really just the result of infinite relationships and patterns between a person’s body and environment. It cannot be calculated the way a computer calculates database information, or anything else for that matter. There is no input-output relationship occurring in biological bodies. It can better be likened to something like thermodynamics, the relationship between heat and other forms of energy. This comparison is made by Michael Spivey in The Continuity of Mind.

Dynamical systems are also widely studied in mathematics. One example is the double pendulum and the chaotic movement it produces; its motion is not easily calculated and depends heavily on its initial conditions. Similarly, a person’s initial experience with life may predict certain things about how the rest of their life will go, but not to the degree where it can be assumed or easily calculated. Experience plays an extremely large role in this assessment and, even then, it’s still an unstable predictor. Moreover, this theory stresses that mind-oriented processes unfold and change over time. Spivey suggests that researchers take this into account by measuring dependent variables over a period of time, to ensure plentiful information for their hypotheses.
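To make the double pendulum’s dependence on initial conditions tangible, here is a minimal numerical sketch. It assumes an idealized, frictionless pendulum with equal masses and rod lengths, uses the standard textbook equations of motion, and integrates them with a fixed-step fourth-order Runge-Kutta scheme; the constants, starting angles, and function names are illustrative choices of mine, not anything drawn from Spivey.

```python
import math

G = 9.81        # gravitational acceleration in m/s^2 (assumed)
L1 = L2 = 1.0   # rod lengths in metres (assumed)
M1 = M2 = 1.0   # bob masses in kilograms (assumed)


def derivatives(state):
    """Time derivatives of (theta1, omega1, theta2, omega2) for an ideal double pendulum."""
    t1, w1, t2, w2 = state
    delta = t1 - t2
    denom = 2 * M1 + M2 - M2 * math.cos(2 * delta)
    a1 = (-G * (2 * M1 + M2) * math.sin(t1)
          - M2 * G * math.sin(t1 - 2 * t2)
          - 2 * math.sin(delta) * M2
          * (w2 * w2 * L2 + w1 * w1 * L1 * math.cos(delta))) / (L1 * denom)
    a2 = (2 * math.sin(delta)
          * (w1 * w1 * L1 * (M1 + M2)
             + G * (M1 + M2) * math.cos(t1)
             + w2 * w2 * L2 * M2 * math.cos(delta))) / (L2 * denom)
    return (w1, a1, w2, a2)


def rk4_step(state, dt):
    """Advance the state by one fixed step of the classic Runge-Kutta method."""
    def shifted(slope, scale):
        return tuple(s + scale * k for s, k in zip(state, slope))

    k1 = derivatives(state)
    k2 = derivatives(shifted(k1, dt / 2))
    k3 = derivatives(shifted(k2, dt / 2))
    k4 = derivatives(shifted(k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def angle_gap(theta1_offset, seconds=20.0, dt=0.001):
    """Distance in angle space between two runs whose starting theta1 differs by theta1_offset."""
    run_a = (math.radians(120), 0.0, math.radians(120), 0.0)
    run_b = (run_a[0] + theta1_offset, 0.0, run_a[2], 0.0)
    for _ in range(int(round(seconds / dt))):
        run_a = rk4_step(run_a, dt)
        run_b = rk4_step(run_b, dt)
    return math.hypot(run_a[0] - run_b[0], run_a[2] - run_b[2])


if __name__ == "__main__":
    offset = 1e-6  # a millionth of a radian
    print(f"initial gap: {offset:.1e} rad, gap after 20 s: {angle_gap(offset):.3e} rad")
```

Two pendulums released a millionth of a radian apart end up on visibly different trajectories within seconds, which is the property the analogy to life experience leans on: the starting point matters a great deal, yet it does not make the outcome easy to predict.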

In an experiment titled “Joint Action Aesthetics” by Vicary, Sperling, von Zimmermann, Richardson, and Orgs, participants watched a live dance show while wearing heart monitors and continuously rated both their enjoyment of the show and how “together” the dancers seemed, using a virtual graph on a screen. The movement of their fingers throughout the show was used to determine the relationship between audience enjoyment and dance synchrony. Researchers also videotaped the show and measured actual dance synchrony by the pixel. The results showed that audience members who felt an overall like or dislike of the show did show more enjoyment when there was synchrony, both implicitly (heart rate) and explicitly (graph). Those who weren’t sure didn’t respond according to synchrony; the relationship appeared to be random or even reversed. Thus, we can infer that emotions attributed to a visual stimulus help audiences analyze that stimulus in more detail. An audience member watching a movie, for example, might be more in tune with its various parts (especially with how all the parts work together) if they have a strong emotional reaction to it.

It seems, however, that movies which elicit strong emotions often decrease audience awareness of their various parts. In an experiment titled “How attention is driven by film edits: A multimodal experience,” conducted by Arthur Shimamura, Brendan Cohn-Sheehy, Brianna Pogue, and Thomas Shimamura, analyses found that audience attention was heavily dependent on multimodal factors which assisted attentional engagement and reduced cognitive awareness of cuts. Audience members seemed to be more engaged when all the parts melded together in sufficient harmony; the lack of awareness of edits and other continuity techniques increased investment. This suggests a desire for synchrony, much like that seen with the dancers. However, it also shows a desire to “get lost” in the film being watched, essentially striving for an emotional attachment rather than a cognitive assessment. Audience members don’t necessarily perform a detailed analysis drawing on a holistic awareness; rather, they often narrow their attention to the story, the characters, or whatever else the filmmakers direct their attention to. The concept of manipulated attention seems to sit at the core of the cinema.

In another experiment titled “Perceiving Event Dynamics and Parsing Hollywood Films,” conducted by James Cutting, Kaitlin Brunick, and Ayse Candan, results did indeed show evidence for parsing in Hollywood films: viewers break films into parts in order to determine their syntactic roles, much as grammar is assembled in language. Researchers found that audience members assembled film into units through frames, shots, subscenes, scenes, sequences, and acts. Luminance and color discrepancies across frames helped predict this parsing, and the duration of shots helped establish pacing and characters’ emotions. All of these factors were analyzed by the viewers even when the characters’ motivations were unclear; perceptual information was analyzed before anything relating to the story was. We can infer, then, that perceptual clues are analyzed first and can heavily influence the rest of the movie experience. However, because of the event segmentation that results from parsing these units, audience members don’t need to consciously attend to the various parts; they’re simply analyzed as continuous stimuli, providing a particular atmosphere for the story to unfold under and reflect on.

In an experiment titled “Fractal Structure of Event Segmentation: Lessons From Reel and Real Events” conducted by Julia Blau, Stephanie Petrusz, and Claudia Carello, researchers found some evidence for fractal segmentation in movies and compared it to the fractal segmentation we experience in everyday life. Participants were asked to parse both films and a basketball game (the real-life event condition) by pressing a button at what they perceived to be breakpoints. Researchers found that filmmakers used perceptual tricks to drape invisibility over their cuts and encourage a subjective reality, often mirroring the perceptual experiences of life. Button presses were not simply motivated by cuts, but rather by an interaction between the participants’ environment, the movie, and their cognitive processes. This suggests a dynamical relationship between the perceiver and the movie’s various forms of information, much like the brain, body, and environment dynamics outlined in the continuous cognition theory proposed by Spivey. Researchers suggested that event analysis was based on an attentional ruler determined by the particular film’s editing style.

In their second experiment (under the same study), researchers found that perceivers of daily life use attentional rulers too, in order to assess events subjectively and parse the perceptual information they gather. Moreover, they found that fractal segmentation was present in most modern films but largely absent from films made before 1980. They hypothesized that fractal segmentation in films would rise over time, as continuity and perceptual techniques for film improve. Overall, researchers believed that the fractal segmentation in modern films is borrowed from nature in order to devise realistic, plausible moving images for consumers. This hierarchical structure gives viewers a sense of familiarity so that they can, in Lacanian terms, look at themselves in the mirror, detach from their identities upon noticing their absence, and ultimately latch onto the character(s) as they explore their environments. This is done without conscious awareness, but its existence seems to be a determining factor in whether or not the film is clear, trustworthy, and coherent.

Gaze dynamics in films also suggest lower levels of executive functioning when viewers are presented with well-regarded, coherent films and higher levels of executive functioning when they are presented with disorganized ones. In an experiment titled “How Narrative Film Captures Attention and Disrupts Goal Pursuit,” by Cohen, Shavalian, and Rube, researchers found common gaze fixations among people watching particular feature films. This is sometimes referred to as attentional synchrony. However, they also found that films of lower caliber had less consistent gaze points throughout (i.e., less synchrony among participants). Moreover, they found less conscious awareness and a decreased ability to complete memory tasks when participants were presented with coherent movies.
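One simple way to picture attentional synchrony is to ask how spread out viewers’ gaze points are on a given frame. The sketch below uses mean pairwise distance as a dispersion score, where lower dispersion means higher synchrony. This is an illustrative measure I am assuming for the sake of the example, not necessarily the metric used in the study above, and the gaze coordinates are invented.

```python
import math
from itertools import combinations


def gaze_dispersion(points):
    """Mean pairwise distance (in pixels) between viewers' gaze points on one frame."""
    pairs = list(combinations(points, 2))
    if not pairs:
        raise ValueError("need gaze points from at least two viewers")
    return sum(math.dist(p, q) for p, q in pairs) / len(pairs)


# Hypothetical gaze samples (x, y) from four viewers on a 1920x1080 frame.
coherent_frame = [(960, 540), (970, 535), (955, 548), (965, 542)]
disorganized_frame = [(200, 150), (1700, 900), (960, 100), (400, 1000)]

print(f"coherent film frame:     {gaze_dispersion(coherent_frame):.1f} px")
print(f"disorganized film frame: {gaze_dispersion(disorganized_frame):.1f} px")
```

Averaging a score like this across every frame of a film would give one rough number for how tightly a film gathers its viewers’ eyes, which is the kind of comparison the finding about higher- and lower-caliber films implies.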

In my own research, “Gaze Dynamics of Film Continuity: Spatial Issues, Argument, and Confusion,” I measured gaze dynamics among participants watching filmed arguments under two conditions: normal continuity and violation of the 180-degree rule. The 180-degree rule holds that the camera should not cross the axis connecting the actors at any point; otherwise, it could disorient viewers.

I hypothesized that breaking spatial continuity would result in a general lag between facial fixations and, moreover, that the break would alter participants’ perspectives on the arguments being tackled. Overall, the data seemed to support my hypotheses regarding the lag when watching spatially severed videos; however, the effect size was quite small. Participants answered questions regarding their perspectives on the argument and whether or not they noticed the difference between videos. Most participants seemed oblivious to what the actual difference was but claimed to enjoy the spatially severed video less. After watching the spatially severed video set, participants’ opinions on whether or not dinosaurs existed shifted. Sixteen percent more people said that they were “pretty confident that they existed” after the second set of videos. The other fifty percent said “of course they existed.” While minor inconsistencies in edited film may not consciously disturb or catch the attention of the viewer, the spatial relations within those inconsistencies may subconsciously affect their attitudes in the long term.

The attentional intersections between psychology and cinema stem from the similarities in their perceptual structures. Human minds, under Michael Spivey’s framework in “The Continuity of Mind,” are far from being sentient or capable of making their own decisions. Under his hypothesis, minds are a sum of brain, body, and environment, and the dynamical relationships between them are what ultimately determine the goals and wants of the agent. Movies mirror this framework by combining many variables which ultimately determine the thoughts, desires, and expectations of the viewer and, much like human beings’ perception of daily life, the experience among moviegoers tends to be relatively similar in essence.

Filmic worlds abide by their own rules, and through filmmakers’ tool set of editing, lighting, movement, sound, production design, costuming, and so on, movies capture their viewers and impose their own perceptual stimuli for their audience to react to. Audiences usually possess collective experiences with the dynamical relationships between the senses. Much like the pipe in René Magritte’s painting, the cinema mirrors the physical entities of daily life while still acknowledging that it’s fundamentally different. It is an immersive reflection of the world it mimics; however, it recognizes the true passivity of the human experience. In life, we perceive free will. In the cinema, we surrender it.

