The Future Is Electric Blue

… and other insights from the color spectrum of science fiction movies

Science fiction has never been precisely defined.  It can encompass all manner of settings (past, present, and future) and can accommodate any genre (from romance to neo-noir).  It can range from the dorkiest of hard sci-fi space opera, in which every plot turn is dictated by imagined, yet rigorous, physics, to an otherwise classic story with a single, technological plot element.

Despite the diversity, science fiction is a well-defined category in practice: like obscenity and ducks, we know it when we see it.  It stands to reason that science fiction might be defined by a diffuse set of attributes, perhaps none of which is sufficient to identify a work of sci-fi on its own, but which, together, drive an overall perception of sci-fi-ness.  I looked for a few of these attributes using my movie dataset.  As before, I was looking for aspects of the color spectrum that distinguish science fiction movies from non-science fiction.  Geared up for what I thought would be an arduous attempt at data-mining, I was promptly confronted with a rather obvious set of differences between sci-fi and non-sci-fi.

Science fiction can be identified by its color spectrum

Using a similar approach to that employed in my last post, I used a machine learning algorithm to classify movies according to their sci-fi status.  The dataset itself consists of mean color profiles for each of 79 movies (after filtering out the black & white movies, and those for which I could not determine the genre).  Briefly, I encoded each movie as being science fiction or not, and then computed a set of summary statistics on the color spectrum in each movie.  Importantly, sci-fi and non-sci-fi movies did not differ in their distributions of genre (p = .12).  After running them through a repeated, randomized, training/testing cycle, I began to get an appreciation for just how accurately science fiction could be identified.
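For readers who want to see what a repeated, randomized training/testing cycle looks like, here is a minimal sketch.  It is not the code behind the figures in this post: the post does not name the algorithm, so a simple nearest-centroid rule stands in for it, and the data are made-up (average hue, hue variance) pairs rather than measurements from the 79 movies.  The function name and its parameters are likewise hypothetical.

```python
import random
import statistics


def repeated_split_accuracy(features, labels, n_rounds=200, test_fraction=0.25, seed=0):
    """Mean held-out accuracy of a nearest-centroid rule over random splits."""
    rng = random.Random(seed)
    indices = list(range(len(features)))
    n_test = max(1, int(len(indices) * test_fraction))
    accuracies = []
    for _ in range(n_rounds):
        # Randomly re-divide the movies into a test set and a training set.
        rng.shuffle(indices)
        test_idx, train_idx = indices[:n_test], indices[n_test:]
        # Per-class centroid (feature-wise mean), computed from training rows only.
        centroids = {}
        for cls in set(labels):
            rows = [features[i] for i in train_idx if labels[i] == cls]
            if rows:
                centroids[cls] = [statistics.mean(col) for col in zip(*rows)]
        correct = 0
        for i in test_idx:
            # Predict the class whose centroid is closest (squared Euclidean distance).
            predicted = min(
                centroids,
                key=lambda cls: sum((a - b) ** 2 for a, b in zip(features[i], centroids[cls])),
            )
            correct += predicted == labels[i]
        accuracies.append(correct / len(test_idx))
    return statistics.mean(accuracies)


# Made-up data: (average hue, hue variance) pairs, NOT the post's measurements.
data_rng = random.Random(1)
sci_fi = [(data_rng.gauss(-1.0, 0.3), data_rng.gauss(0.6, 0.1)) for _ in range(40)]
other = [(data_rng.gauss(0.5, 0.3), data_rng.gauss(0.2, 0.1)) for _ in range(40)]
features = sci_fi + other
labels = [1] * 40 + [0] * 40

accuracy = repeated_split_accuracy(features, labels)
print(f"mean held-out accuracy: {accuracy:.2f}")
```

Averaging over many random splits keeps any single lucky or unlucky division of a small dataset from dominating the accuracy estimate.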

Science fiction can be accurately classified with only two variables.

Surprisingly, it’s quite easy to achieve greater accuracy than chance (50%; the green line) using only two variables: average hue and the variance in hue.  I’ll come back and explain what those variables mean in real, visual terms, but I want to note that it is possible to achieve ~70-75% accuracy in distinguishing science fiction movies from non-science fiction movies.  This fact suggests that there really are some properties which distinguish the color spectrum of sci-fi films.

Science fiction has lower average hue, more hue variance

The two variables which so accurately classify science fiction are both related to hue, and hue is a weird concept.  It’s basically a compound measurement of color which doesn’t take into account the brightness of a color, only its relative red/green/blue-ness.  This is better pictured than explained, so here are 1000 random colors, plotted according to their hues (on the y-axis):

Hue is a compound metric of color.

You can see that blue falls towards the low end of hue (towards -2), red at the top (+2), and green somewhere in the middle.  Various concoctions derived from mixtures of these colors occur in bands between the primary colors. Note that colors of a given hue can be bright or dark or anywhere in between.
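To make the "hue ignores brightness" point concrete, here is a small sketch using Python's standard colorsys module.  Note the assumption: colorsys reports hue on a 0 to 1 scale (0 is red, 1/3 is green, 2/3 is blue), which is not the -2 to +2 scale in the plot above, so the numbers are not comparable to mine; only the idea carries over.  The helper names and the palettes are hypothetical.

```python
import colorsys
import statistics


def hue_of(red, green, blue):
    """Hue on colorsys's 0-1 scale: 0 is red, 1/3 is green, 2/3 is blue."""
    hue, _saturation, _value = colorsys.rgb_to_hsv(red, green, blue)
    return hue


def hue_summary(colors):
    """Return (average hue, hue variance) for a list of (r, g, b) tuples."""
    hues = [hue_of(*color) for color in colors]
    return statistics.mean(hues), statistics.pvariance(hues)


# Brightness does not change hue: a bright and a dark blue share one hue value.
bright_blue = hue_of(0.0, 0.0, 1.0)
dark_blue = hue_of(0.0, 0.0, 0.4)

# Hypothetical palettes: one narrow (all blues), one mixing blue, green, and red.
narrow = [(0.0, 0.0, 1.0), (0.0, 0.0, 0.4), (0.1, 0.1, 0.9)]
mixed = [(0.0, 0.0, 1.0), (0.0, 0.8, 0.0), (0.9, 0.0, 0.0)]
print(hue_summary(narrow), hue_summary(mixed))
```

The narrow palette has essentially zero hue variance while the mixed one does not, which is the sense in which "variance in hue" captures how many different kinds of colors appear.  One simplification to keep in mind: hue is really circular, so treating it as a plain number when averaging is a rough approximation.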

Science fiction movies are characterized by two factors: they have lower average hue, so they tend to be more blue/green than red; and greater variance in hue in the same movie (so more different kinds of colors).  Both of these differences are statistically significant in my dataset, if you’re into that sort of thing.  Consider this plot of average hue versus hue variance, where science fiction movies are blue and everything else is red:

Average hue and variance in hue accurately discriminate between science fiction and non-science fiction.

One can clearly see that sci-fi movies tend towards the left side of the plot, characterized by bluer and greener average hues, and also sit higher in the plot, indicating more variance in hue.  This plot provides a powerful insight into how the classifier algorithm works: if one were to simply draw a vertical line at an average hue of -0.5, classifying the movies to the left of the line as science fiction, and those to the right as non-science fiction, one would probably achieve greater-than-random classification accuracy.
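The draw-a-line idea is simple enough to write down directly.  The sketch below is only an illustration: the -0.5 cutoff comes from the paragraph above, but the function names are hypothetical and the eight (average hue, is sci-fi) pairs are invented for the example, not taken from my dataset.

```python
def classify_by_hue(average_hue, threshold=-0.5):
    """Call a movie sci-fi when its average hue falls left of the threshold."""
    return average_hue < threshold


def threshold_accuracy(movies, threshold=-0.5):
    """Fraction of (average_hue, is_sci_fi) pairs the threshold rule gets right."""
    correct = sum(classify_by_hue(hue, threshold) == is_sci_fi for hue, is_sci_fi in movies)
    return correct / len(movies)


# Invented examples: (average hue, is it science fiction?).
movies = [
    (-1.2, True), (-0.8, True), (-0.6, True), (0.1, True),     # last one is missed
    (-0.7, False), (0.3, False), (0.6, False), (1.0, False),   # first one is missed
]

print(threshold_accuracy(movies))  # 6 of 8 correct -> 0.75
```

A real classifier can also use hue variance, but even this one-variable rule shows why the separation in the plot translates into better-than-chance accuracy.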

It’s also worth considering some of the science fiction movies which don’t follow the expected pattern.  I have labeled a few in the plot, including Buckaroo Banzai (labeled BB8), the original Star Trek movie, and Soylent Green.  These are noteworthy because they are a bit older than most of the movies in my dataset.  It stands to reason that the signature of science fiction might change over time, which would lead the classifier to misidentify older movies.  The idea of changes in color patterns over decades is definitely worth investigating, and something I plan to do more of as the dataset expands; but for now it will have to stand as only a hypothesis.

The last movie I want to draw your attention to is eXistenZ, which is not very old (1999), and yet still out of character for a science fiction movie.  Why?

The Cronenberg Effect

eXistenZ and another movie, Scanners (1981), are interesting because they are both frequently misclassified by the algorithm.  In fact, these movies are so atypical that they drag down the accuracy of the entire machine learning algorithm: examine the 4th column in the first graphic, and note that simply by removing those 2 movies, which constitute only 2.5% of the dataset, I am able to increase the algorithm’s accuracy by ~5%.  In addition to their atypical nature, they share another fascinating commonality.

Both movies were directed by David Cronenberg, a critically acclaimed but commercially unsuccessful director.  (I have seen relatively little of his work–neither eXistenZ nor Scanners–but I thought his movie Eastern Promises was fantastic).  Incidentally, Scanners is not only an atypical science fiction movie, it’s an atypical horror movie as well.  While most horror movies are dark and relatively plodding in pace, Scanners is bright and quick paced, more like an action movie.  Therefore in both cases we see Cronenberg rejecting genre norms of color palette to pursue atypical visual styles in his movies–an idea I shall term the Cronenberg effect.

I speculate that it may be this blatant disregard for the mores of visual style that makes Cronenberg’s films both critical darlings and commercial bombs.  I can only offer a little pop psychology to back this idea up: movie-goers expecting to see a certain kind of movie will be put off when they are confronted with a color palette not befitting that genre, while critics may find the juxtaposition of classic horror with colors that are not standard for the genre refreshing.  With only two movies of this nature in my dataset, it is too early to make a rigorous test of this hypothesis, but it’s something I look forward to pursuing as I gather more films.

The future is electric blue

All of this business about average hue and hue variance is fine, but I wanted to pin down something more concrete and visually obvious that separates sci-fi and non-sci-fi.  In order to do this, I set about looking for specific colors that are enriched in the science fiction movies relative to the non-science fiction.  I reasoned that these colors might function as effective markers of science fiction: somewhat subliminal hints to the audience of the kind of movie they are watching.  Without describing the methods in too much detail (see Appendix if interested), consider this set of the top ~150 colors which are enriched in sci-fi:

Representative set of colors enriched in science fiction vs. non-science fiction movies.

Naming colors is an inherently subjective exercise, but I submit to you that there are generally two kinds of colors which are present in this set: electric blues, and forest greens.

In fact, of the 159 colors which show enrichment in sci-fi movies, fully 92% are some shade of green or blue.  Leaving aside questions of nomenclature (and my [hopefully] pithy title), I think we can draw three general conclusions from this plot:

1) Red is not much favored in science fiction movies, at least relative to non-science fiction movies.

2) Bright blues, regardless of what you call them, are substantially over-represented.

3) The same is true of dark greens.

So there you have it, a testable prediction: insofar as sci-fi movies represent visions of the future, the future will be characterized by beautiful electric blues and deep, lush, greens.

In more seriousness, and after having thought about the dataset a bit more, I wonder if these colors are basically serving as “signs” (in the semiotic sense) of science fiction: essentially distinguishing marks which inform the audience that they are watching a science fiction movie, and prime them for appropriate shenanigans (spaceships and lasers and time travel, oh my!).  Directors, and/or cinematographers, may choose them in order to confer a feel of futurity which enables the suspension of disbelief necessary for sci-fi plot elements to work.  Conversely, creators of non-science fiction movies might avoid these colors in order to establish the authenticity of certain “present-day” or historical settings.  Alternatively, the choice of bright blues and dark greens may not represent any vision of the future so much as the conscious choice to find colors not frequently used in non-science fiction cinema.

Naturally, caveats apply.  First and foremost, it’s worth noting that a set of 79 samples, while a sizable dataset for much scientific inference, represents only a vanishingly small subset of all the movies ever made (and is likely non-random in many respects).  It could therefore be the case that electric blue is only a sci-fi diagnostic character in this limited sample of movies.  This is a difficulty worth noting, but one that can only be ameliorated through more and better sampling.  Secondly, as already noted, these characteristics of science fiction may only apply to movies which were created in the recent past; it is possible that older movies may have employed other signifiers of science fiction.

Finally, science fiction is a fluid and complex categorization, and some science fiction movies may disregard these colors entirely (as per the Cronenberg effect).  Nevertheless, I find it very interesting that these colors are not artificial.  What I mean by that is that these colors are not difficult to produce on camera without special effects.  If you had asked me before writing this to guess the top two colors which characterized science fiction, I might have guessed deep black (space!) or some kind of bright-orange/red (danger! lasers! suns!).  But these colors are, in fact, somewhat natural: blues the color of the evening sky, or a lake; green the color of trees.  What that says about the collective unconscious’s vision of the future, I don’t know (and probably nothing); but for my money, I profoundly hope that the future gets a little more electric blue.




Color enrichment analysis:

First, I quantized all of the colors by putting each continuous color measurement (red, green, and blue), which normally range from 0 to 1, into one of twenty bins (0-.05; .05-.1; and so on).  I then counted the number of colors in each bin for 1) all of the non-science fiction movies; and 2) all of the science fiction movies.  After that, I took the difference in color abundance between the science fiction movies and the non-science fiction movies.  I considered any color more abundant in the science fiction films to be “enriched” in science fiction; there were 159 such colors (plotted above).
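The steps above can be sketched in a few lines.  This is an illustration under stated assumptions, not the code I ran: it treats each quantized color as a triple of bin indices (one per channel), it compares raw counts without any normalization for the number of movies in each group, and the function names and sample colors are made up.

```python
from collections import Counter

N_BINS = 20  # twenty bins per channel: 0-.05, .05-.1, and so on


def quantize(color, n_bins=N_BINS):
    """Map an (r, g, b) tuple with channels in [0, 1] to a tuple of bin indices."""
    # A channel value of exactly 1.0 is clamped into the top bin.
    return tuple(min(int(channel * n_bins), n_bins - 1) for channel in color)


def enriched_colors(sci_fi_colors, other_colors):
    """Return quantized colors that are more abundant in sci-fi, with the excess count."""
    sci_counts = Counter(quantize(c) for c in sci_fi_colors)
    other_counts = Counter(quantize(c) for c in other_colors)
    # Assumption: raw count difference; a Counter returns 0 for unseen colors.
    differences = {color: sci_counts[color] - other_counts[color] for color in sci_counts}
    return {color: diff for color, diff in differences.items() if diff > 0}


# Made-up colors: two similar blues and a green and red in sci-fi; reds and a green elsewhere.
sci_fi_colors = [(0.12, 0.33, 0.93), (0.13, 0.34, 0.94), (0.12, 0.42, 0.22), (0.83, 0.12, 0.12)]
other_colors = [(0.83, 0.12, 0.12), (0.84, 0.13, 0.13), (0.12, 0.42, 0.22)]

enriched = enriched_colors(sci_fi_colors, other_colors)
print(enriched)  # only the blue bin survives: {(2, 6, 18): 2}
```

In this toy example the two blues fall into the same bin and have no counterpart outside sci-fi, so that bin is the only enriched color; the green cancels out and the red is more common in the other group.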

