This script was used to perform an exploratory analysis of visual data and their labels selected from a data-driven tagging system based on the Grammar of Visual Design (Kress and van Leeuwen, 2006). The Grammar was tested on and adapted to digital tourism imagery, in particular Instagram and website photography.
Specifically, this script was used to perform a preliminary Principal Component Analysis (PCA), published in Frontiers in Communication in the Special Issue "Drawing Multimodality's Bigger Picture: Metalanguages and Corpora for Multimodal Analyses", which honors and discusses John Bateman's contributions to the field of Multimodality.
The paper is openly available here.
The R script documents the process of reading and cleaning both the data (data items, i.e. images, counts, variables, or labels/tags) and the metadata (channel, agency, metafunction) in order to perform a PCA and identify dimensions of variance. Specifically, the code reads yes/no responses so that the data can be mapped into clusters.
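As a rough illustration of this workflow, the following minimal sketch recodes yes/no annotations as 1/0 and runs a PCA with base R's `prcomp`. It is not the published script: the toy data frame, the column names (`image_id`, `channel`, `tag_a`, `tag_b`, `tag_c`) and the choice to center and scale the variables are all assumptions made for the example.

```r
# Hypothetical toy annotations: one row per image, one yes/no column per tag.
# All names and values here are invented for illustration only.
annotations <- data.frame(
  image_id = paste0("img", 1:6),
  channel  = c("instagram", "instagram", "instagram",
               "website", "website", "website"),
  tag_a    = c("yes", "no", "yes", "no", "yes", "no"),
  tag_b    = c("yes", "yes", "no", "no", "yes", "no"),
  tag_c    = c("no", "yes", "yes", "no", "no", "yes"),
  stringsAsFactors = FALSE
)

tag_cols <- c("tag_a", "tag_b", "tag_c")

# Cleaning step: trim whitespace, normalise case, recode yes/no as 1/0.
binary <- as.data.frame(lapply(
  annotations[tag_cols],
  function(x) as.integer(tolower(trimws(x)) == "yes")
))
rownames(binary) <- annotations$image_id

# Drop tags that never vary; they cannot be scaled and carry no variance.
keep <- vapply(binary, function(x) stats::var(x) > 0, logical(1))
binary <- binary[, keep, drop = FALSE]

# PCA on the binary matrix (centering and scaling are assumptions here).
pca <- prcomp(binary, center = TRUE, scale. = TRUE)

# Proportion of variance explained by each component.
print(summary(pca))

# Attach metadata to the component scores for later grouping or plotting.
scores <- data.frame(channel = annotations$channel, pca$x)
print(head(scores))
```

Keeping the metadata alongside the component scores makes it possible to compare, for instance, Instagram and website images along the first dimensions of variance.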
The code was initially used to analyze all annotations, without distinguishing between highly reliable and less reliable ones. This was done to offer a pluralist view of the corpus data and to compare it with the results of descriptive statistics. Because of space constraints, the paper mentioned above does not report the implementation of reliability measures in detail.
Reliability checks and subsequent analyses are available at this link. Additional open-access materials and articles will be published soon.
For further inquiries, please feel free to contact me on GitHub or via email: elena.mattei@unive.it
The author thanks Prof. John Bateman for his invaluable expertise and support in conducting the statistical analyses in R, including the implementation of appropriate reliability measures. All errors, omissions or misrepresentations are solely the author's responsibility.