
Automated Saliency Prediction Applied to Cinema Studies

This article examines how filmmakers use visual saliency (the prominent elements that attract attention) to guide audiences emotionally. By comparing CNN-based saliency maps with real eye-tracking data, the study explores cinema’s historical shift from single-shot to multi-shot storytelling, suggesting that saliency guidance was one of the factors behind the emergence of film editing.

Dissemination:
-NECS Annual Conference 2022 @ Bucharest, Romania
-CUT/GENERATE Conference 2025 @ Paris, France
-Published in Projections: The Journal for Movies and Mind. Issue 3, Winter 2023. Co-authored with Professor Suren Jayasuriya PhD.

Abstract
In visual cognition research, saliency refers to the prominence of specific elements in a scene. Saliency guidance is also part of a filmmaker’s toolset for building narratives and guiding the audience toward emotive responses. This article compares two Convolutional Neural Network (CNN) saliency mapping models with maps of viewers’ eye positions to investigate the potential of automated saliency mapping in moving image studies, analyzing saliency’s role during cinema’s transition from one-shot to multiple-shot films. Although the exact moment when montage and editing methods appeared cannot be identified with precision, the findings suggest that saliency guidance was one of the reasons for this transition, hence its preponderance.
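
As an illustration of how a model's saliency map might be compared with a map of viewers' eye positions, the sketch below scores the agreement between two small grids using Pearson's correlation coefficient, a measure commonly used for this kind of comparison. This is a minimal sketch under stated assumptions: the metrics actually used in the article are not given here, and the function name, the 3x3 grids, and their values are all hypothetical.

```python
import math


def correlation_coefficient(saliency_map, fixation_map):
    """Pearson correlation between two equally sized 2D grids.

    Hypothetical helper for illustration only; not the article's method.
    Values near 1 mean the predicted saliency rises and falls with the
    observed gaze density; values near 0 mean no linear relationship.
    """
    # Flatten both grids into flat lists of cell values.
    pred = [value for row in saliency_map for value in row]
    gaze = [value for row in fixation_map for value in row]
    if len(pred) != len(gaze):
        raise ValueError("maps must have the same number of cells")

    mean_pred = sum(pred) / len(pred)
    mean_gaze = sum(gaze) / len(gaze)

    covariance = sum((p - mean_pred) * (g - mean_gaze) for p, g in zip(pred, gaze))
    var_pred = sum((p - mean_pred) ** 2 for p in pred)
    var_gaze = sum((g - mean_gaze) ** 2 for g in gaze)
    if var_pred == 0 or var_gaze == 0:
        raise ValueError("a constant map has no defined correlation")

    return covariance / math.sqrt(var_pred * var_gaze)


# Toy data (made up): a model's saliency prediction for one frame, and
# the number of viewer fixations that landed in each cell of that frame.
predicted = [
    [0.1, 0.2, 0.1],
    [0.2, 0.9, 0.3],
    [0.1, 0.2, 0.1],
]
observed = [
    [0, 0, 0],
    [0, 5, 1],
    [0, 0, 0],
]

score = correlation_coefficient(predicted, observed)
print(f"correlation between predicted and observed maps: {score:.3f}")
```

In practice such a score would be computed frame by frame and then averaged over a shot or a film, and real maps would be full-resolution images rather than 3x3 grids.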
