An analysis of the average colour of feature films
COSC3000 visualisation report
2 Introduction
Colour has a large subconscious impact on the human body. In some cases it seems that evolution has hardwired the brain to respond to colours with certain emotions. Studies have tested sample images with modified colour to assess the effect of colour on people’s disgust in response to the pictures. Figure 1 highlights the fact that colour plays an integral role in people’s perception of disgusting and unhygienic situations (Curtis et al., 2011). In this case colour is tied to the content being viewed, but in other cases the colour, independent of content, leads to subconscious changes in behaviour.
Figure 1: Which image is more disgusting? Colour has been shown to be intimately tied with people's disgust response
Images: http://www.bbc.co.uk/science/humanbody/mind/surveys/disgust/index.shtml
Studies have shown that the colour of food has a large effect on the appetite of participants (Cornier, 2009). The ability to subconsciously react to subtle differences in colour associated with food could conceivably increase the amount of food gathered and thus the animal’s fitness. The same can be said for disgust in avoiding unhygienic situations.
This report aims to investigate how these inherent emotional responses to colour affect the emotions evoked by visual media, in particular feature films. Because colour can have such large subconscious effects on humans, the authors of visual media may utilise these effects to evoke their desired emotional responses in the viewer. Specifically, directors of movies may be, consciously or subconsciously, altering the average colours of the frames of their movies in a content-independent manner to evoke their desired emotions.
The purpose of this analysis was to determine whether directors and other film staff were altering the colour of frames to influence the emotions evoked in the viewer. Different scenes within a movie are intended to produce different emotional responses, so the scene would be a good predictor of the emotional purpose of each frame. Unfortunately there is no data on the director’s intended mood for each scene. This analysis works around this by using the genre as a guide to the dominant emotions intended for the entire movie. By averaging the colour over the entire movie, and using the genre as a guide to the movie’s dominant emotions, an analysis can be made as to whether the colours associated with the dominant emotions of a genre are more present in the average colour of movies within that genre.
3 Obtaining the data
3.1 Sampling methods
The top five movies of the 1920s and 1930s and the top ten movies from each decade thereafter, as rated by http://www.digitaldreamdoor.com/pages/movie-pages/, were gathered. These were added to a random collection of 109 modern movies for a total of 199 movies.
3.2 Frame averaging
The Open Source Computer Vision (OpenCV) library was used to process the movie files and output an average colour. The RGB components of every pixel of every frame were summed and then averaged.
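The averaging step can be sketched as follows. This is a minimal illustration rather than the code used for this report: the function names are hypothetical, and it assumes the `cv2` Python bindings for OpenCV are installed and that every frame of a file has the same resolution, so that the mean of the per-frame means equals the mean over every pixel.

```python
def frame_means_from_video(path):
    """Yield the mean (r, g, b) of each frame of a video file.

    Assumes the OpenCV Python bindings (cv2) are installed. The import is
    deferred so that average_colour() below can be used without them.
    """
    import cv2

    capture = cv2.VideoCapture(path)
    try:
        while True:
            ok, frame = capture.read()
            if not ok:  # end of file or unreadable frame
                break
            # OpenCV returns frames as height x width x 3 arrays in BGR order.
            b, g, r = frame.reshape(-1, 3).mean(axis=0)
            yield (float(r), float(g), float(b))
    finally:
        capture.release()


def average_colour(frame_means):
    """Average an iterable of per-frame (r, g, b) means into one colour."""
    totals = [0.0, 0.0, 0.0]
    count = 0
    for r, g, b in frame_means:
        totals[0] += r
        totals[1] += g
        totals[2] += b
        count += 1
    if count == 0:
        raise ValueError("no frames were read")
    return tuple(total / count for total in totals)
```

A call such as `average_colour(frame_means_from_video("movie.avi"))`, where the file name is a placeholder, would then give one average colour per movie.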
3.3 Metadata gathering
The Internet Movie Database (IMDb) was searched for the metadata of each movie using its filename. Initial attempts at searching tried to decode the sometimes cryptic filenames of the downloaded movie files. Using regular expressions, the year and file extension could be extracted from the title. However, the search algorithm of the IMDb proved too restrictive and rejected any search that wasn’t perfectly formatted. The final option involved manually changing the filenames to an easily parsable form. For example, “Safety.Last.[Harold.Lloyd].(1923).avi” was changed to “Safety Last (1923)”.
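As an illustration of the kind of regular expressions involved, the sketch below pulls the year and file extension out of a raw filename and splits a cleaned name into title and year. The patterns and function names are assumptions made for this example, not the expressions actually used for this report.

```python
import re

# A four-digit year starting with 19 or 20, e.g. the "1923" in "(1923)".
YEAR_RE = re.compile(r"\b(?:19|20)\d{2}\b")
# A trailing extension of two to four alphanumeric characters, e.g. ".avi".
EXT_RE = re.compile(r"\.([A-Za-z0-9]{2,4})$")
# The manually cleaned form: "Title (Year)".
CLEAN_RE = re.compile(r"(?P<title>.+?)\s*\((?P<year>\d{4})\)")


def extract_year_and_extension(filename):
    """Return (year, extension); either may be None if not found."""
    ext_match = EXT_RE.search(filename)
    extension = ext_match.group(1).lower() if ext_match else None
    stem = filename[:ext_match.start()] if ext_match else filename
    # Take the last year-like number so titles such as "2001 ..." still work.
    years = YEAR_RE.findall(stem)
    year = int(years[-1]) if years else None
    return year, extension


def parse_clean_name(name):
    """Split a cleaned name into (title, year), or return None if malformed."""
    match = CLEAN_RE.fullmatch(name)
    if match is None:
        return None
    return match.group("title"), int(match.group("year"))
```

Applied to the example above, the raw filename yields the year 1923 and the extension `avi`, while the cleaned name splits into the title and year needed for a search.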
In some cases the automatic annotation was unsuccessful and in others it was erroneous, so a final sweep of the data needed to be performed, manually checking the filename against the metadata and gathering any data that could not be gathered automatically.
4 Data analysis
4.1 Initial impressions of average colour
The first visualisation was a bar graph of the number of movies in each genre, with the bars coloured by the average colour of the movies they contained. This immediately showed some interesting effects in the data. Figures 2 and 3 show that the average colour of movies by genre is very close to grey. A histogram of the saturations shows that the majority of colours have saturations, or distances from grey, between 0 and 0.1. To enable the figure to better express the subtle differences in colour between the genres, the colours were converted to HSL format and the saturation was increased to 1. This showed that the average colour of genres was almost uniformly brown with only small degrees of variation.
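The saturation measurement and the saturation boost can be sketched with Python's standard `colorsys` module. This is an illustration under assumptions, not the report's own code: the function names are hypothetical, colours are taken to be 8-bit RGB triples, and note that `colorsys` orders the components as hue, lightness, saturation (HLS).

```python
import colorsys


def saturation_of(rgb):
    """Return the HSL saturation (0 to 1) of an 8-bit (r, g, b) colour."""
    r, g, b = (channel / 255.0 for channel in rgb)
    _hue, _lightness, saturation = colorsys.rgb_to_hls(r, g, b)
    return saturation


def boost_saturation(rgb, saturation=1.0):
    """Return the colour with its HSL saturation replaced by `saturation`.

    Pure greys have no hue, so they are returned unchanged rather than
    being pushed towards the default hue of red.
    """
    r, g, b = (channel / 255.0 for channel in rgb)
    hue, lightness, old_saturation = colorsys.rgb_to_hls(r, g, b)
    if old_saturation == 0:
        return tuple(rgb)
    boosted = colorsys.hls_to_rgb(hue, lightness, saturation)
    return tuple(round(channel * 255) for channel in boosted)
```

For a made-up near-grey colour such as (130, 125, 120) the saturation is about 0.04, inside the 0 to 0.1 range described above, and boosting it to full saturation gives a strongly orange-brown colour.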
Figure 2 A histogram of the saturation of the average colour of each movie
Figure 3 50 Shades of Grey: A bar chart of the number of movies by genre coloured by the average colour of each genre shows that the average colour of each genre is predominantly grey.
Figure 4 48 Shades of Brown: The same bar chart with increased saturation, showing that when not grey the average colour of each genre is predominantly brown.
4.2 Comparing Hues
Early comparisons between the colour components showed large differences between the components in both RGB and HSV format. Additionally, the colours seemed to be grouped very tightly within their colour range and were very similar across the different components. The grouping could be explained by the fact that the components of the RGB and CMY formats also encode the shade of the film. Given that the films have very little saturation and differ very little from grey, any colour differences were overshadowed by differences in shade, as can be seen in Figure 5.
Figure 5 The magnitude of the cyan (top left) and magenta (top right) components can be almost entirely explained by the lightness (bottom)
The solution to this problem was to use the hue for all further colour comparisons. However, this presented its own problem: what is an effective way to compare colours using a 360 degree angular value? This was solved by using vector projection. Each hue angle was converted into a vector on the unit circle. The similarity of one colour to another could then be measured as the component of its hue vector along the other colour’s hue vector. This gives a number between -1 and 1 as a measure of how similar the hues are. Because both vectors lie on the unit circle, this is simply the cosine of the difference between the two angles. This method also allows testing against any hue, as opposed to only being able to test the CMY and RGB components.
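The hue comparison can be written in a few lines. The sketch below uses hypothetical function names and assumes hues are given in degrees; it computes the component as a dot product of two unit vectors, which equals the cosine of the angle between them.

```python
import math


def hue_to_unit_vector(hue_deg):
    """Convert a hue angle in degrees to a point on the unit circle."""
    angle = math.radians(hue_deg)
    return (math.cos(angle), math.sin(angle))


def hue_component(hue_deg, reference_deg):
    """Component of one hue along a reference hue, from -1 to 1.

    The dot product of two unit vectors equals the cosine of the angle
    between them, so this is cos(hue_deg - reference_deg).
    """
    hx, hy = hue_to_unit_vector(hue_deg)
    rx, ry = hue_to_unit_vector(reference_deg)
    return hx * rx + hy * ry
```

A hue identical to the reference scores 1, a hue 60 degrees away scores 0.5, a hue 90 degrees away scores 0 and the opposite hue scores -1. Because only the difference in angle matters, hues of 350 and 10 degrees are treated as 20 degrees apart rather than 340.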
Figure 6 The vector component method of comparing hues. Yellow has a redness of √2/2 and green has a redness of -√2/2
Figure 7 The corrected cyan (left) and magenta (right) comparisons are no longer dependent on the shade of the colour
This method isn’t without its own drawbacks. The first is that 0 is the default hue for all colours: if a colour has no hue then it defaults to red. This was fixed by filtering all movies with no hue out of the graphs. The second is more of a general problem with colour theory: whether to include saturation in the comparison as well as hue. This could be easily implemented by scaling the vectors according to their saturation.
Figure 8 Which colour is more red? On the left the prediction without saturation is more accurate, while on the right the prediction with saturation is more accurate
The conundrum can be visualised in the two cases in Figure 8. In case one the saturation of yellow is one while the saturation of a colour very close to red is lower, yet if saturation is included both colours have the same amount of red. This case suggests that saturation should not be taken into account. Case two shows a colour that is nearly red at full saturation and a colour that is black with a hint of true red. This case suggests that saturation should be taken into account. The correct answer to how the data should be displayed lies in how the brain processes colours to produce emotions, which is certainly beyond the scope of this report. Further comparisons with hue used the approach that did not include saturation.
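The two options, together with the filtering of colours that have no hue, can be sketched as below. The function names and the example hue and saturation values are made up for illustration and are not taken from the report's data.

```python
import math


def hue_component(hue_deg, reference_deg, saturation=None):
    """Cosine similarity of two hues, optionally scaled by saturation."""
    component = math.cos(math.radians(hue_deg - reference_deg))
    if saturation is not None:
        component *= saturation  # shorter vector for less saturated colours
    return component


def with_hue_only(colours):
    """Drop (hue, saturation) pairs whose saturation is 0.

    Such colours are pure greys; their hue of 0 is only a default and
    would otherwise be counted as red.
    """
    return [(hue, sat) for hue, sat in colours if sat > 0]


RED = 0.0
saturated_far = (60.0, 1.0)   # fully saturated, 60 degrees from red
muted_near = (10.0, 0.5)      # half saturated, 10 degrees from red

# Without saturation the muted colour is clearly the redder of the two.
plain = [hue_component(hue, RED) for hue, _ in (saturated_far, muted_near)]
# With saturation the two colours score almost the same amount of red.
scaled = [hue_component(hue, RED, sat) for hue, sat in (saturated_far, muted_near)]
```

Here `plain` is roughly 0.50 and 0.98 while `scaled` is roughly 0.50 and 0.49, which mirrors the first case in Figure 8.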
4.3 Western Colour perception and genre
4.3.1 Grouping colour and genre
In order to reduce the number of false positives encountered, only distinct genre-colour pairs were tested. A genre and a colour were only paired if they shared an associated emotion. All colours and genres could have been compared, but this would have increased the number of false positives and would have added nothing to the analysis. Genres in which there would be a distinct lack of certain emotions were also included. The pairing process can be seen in Table 1.
Table 1 The process of selecting the genre colour pairs that were tested for interaction
As can be seen in Table 1, many of the film genres have no dominant emotions, while many of the colours have no emotions that are relevant to film making.
4.3.2 Plotting the results
Figure 9 Boxplots of the distribution of the colour component of the movies by genre. Genres were chosen according to Table 1 and distance to hue was calculated using the vector components.
Box plots of each genre were plotted against the component that the genre was being tested against. This gives an idea of the distribution of the component within each genre. The magnitude of each hue was measured with the vector component. A scatter plot of transparent squares was plotted next to the box plot for each genre to give an idea of the absolute number of movies as well as their average colour. Jitter and transparency were added to make the density of values visible, since the points overlapped considerably.
4.3.3 Interpretation
There are no genres whose mean component level falls outside the upper or lower quartiles of any of the other genres. This suggests that there is very little effect of genre on the average colour of movies.
T-tests between each genre-component pair and all films that weren’t in the genre were also performed. The only pair that showed a significant result was the family genre, which was significantly lighter than average (p=0.01). Because there was a large overlap between family films and animated films, a t-test was performed on the lightness of animated films compared to non-animated films. It was found that animated films are significantly lighter than non-animated films (p=0.008), and this probably explains the lightness of the family genre. This lightness is likely to be associated with the filming techniques of animation rather than with any genre differences and the associated dominant emotions.
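A genre-versus-rest comparison of this kind can be sketched as follows. The report does not state which form of t-test or which software was used, so this example assumes Welch's unequal-variance t-test, and the record layout (`"genres"` and a numeric field per movie) and function names are hypothetical. It returns only the t statistic and degrees of freedom; turning these into a p-value needs a t-distribution, for example via `scipy.stats.ttest_ind(a, b, equal_var=False)` if SciPy is available.

```python
import math
import statistics


def split_by_genre(movies, genre, key):
    """Split one numeric field into in-genre and out-of-genre samples."""
    inside = [m[key] for m in movies if genre in m["genres"]]
    outside = [m[key] for m in movies if genre not in m["genres"]]
    return inside, outside


def welch_t(sample_a, sample_b):
    """Return (t statistic, degrees of freedom) for Welch's t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error contributed by each sample.
    se_a = statistics.variance(sample_a) / n_a
    se_b = statistics.variance(sample_b) / n_b
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t, df
```

Each sample needs at least two values and some variation for the statistic to be defined.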
4.4 Periodicity
4.4.1 The Oscar effect
Another effect investigated is related to the “Oscar season”. The Oscar season describes the phenomenon of studios releasing their most award-worthy films during September, just before the eligibility cut-offs for the Academy Awards take place. This effect may introduce a periodicity to the colour data. This report takes a naive approach and attempts to find any yearly patterns which can then be further analysed.
4.4.2 Plotting for periodicity
Figure 10 The degree of each film towards the colour components when grouped by month. The line joins the mean component level for each month
The hue components, saturation and lightness were all graphed against release month. Each movie was drawn as a point coloured by its average colour and positioned according to the intensity of the measured hue and its month of release. Because there was a large amount of noise in the data, a line joining the means of each month was added.
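The monthly mean line amounts to grouping the per-movie values by release month and averaging each group. A minimal sketch, assuming the data is available as (month, value) pairs with months numbered 1 to 12 and using a hypothetical function name:

```python
from collections import defaultdict
from statistics import mean


def monthly_means(points):
    """Return {month: mean value} for an iterable of (month, value) pairs.

    Months with no movies are simply absent from the result.
    """
    by_month = defaultdict(list)
    for month, value in points:
        by_month[month].append(value)
    return {month: mean(values) for month, values in sorted(by_month.items())}
```

Because the hue components are cosine values rather than raw angles, an ordinary arithmetic mean is appropriate here and no circular averaging is needed.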
4.4.3 Interpretation
It appears that there is very little periodicity in the current colour data. This is to be expected given that genre has been shown to have no effect on average colour and that the Oscar effect only affects genre.
5 Conclusion
In conclusion, there seems to be very little relationship between the dominant emotion of a movie and its average colour.
One factor that could have been improved was the sampling methods. Taking a random, representative sample would have been much more useful than the sample that was taken. Some analyses, such as the progression of colour over time with different film technologies, were not able to be made due to the lack of a representative sample.
Given more time, further statistical analyses would have been performed on the data. ANOVA and multiple regression would have been able to determine with more certainty whether there were any differences between genres and months, which would have aided the analysis.
Another improvement could have been an increased sample size. The distribution of movies by their colour components shows a large amount of variance, which is understandable given that the content of a movie takes precedence over any of the effects being analysed. For example, a horror film set in an arctic environment will inevitably be lighter as a result of its content, no matter what the genre or dominant emotions involved. The director would most likely still have no problem creating a movie that was scary. Any effect of emotion on average colour is therefore a fringe effect and needs large sample sizes to be distinguished from the noise. With a much larger sample size there might be some differences that can be teased out of average movie colour.
6 References
Cornier, M.A. (2009). The effects of overfeeding and propensity to weight gain on the neuronal responses to visual food cues. Physiology & Behavior 97, 525-530.
Curtis, V., de Barra, M., and Aunger, R. (2011). Disgust as an adaptive system for disease avoidance behaviour. Philosophical Transactions of the Royal Society of London Series B, Biological Sciences 366, 389-401.