Statistics, Department of

The R Journal
Date of this Version
8-2016
Document Type
Article
Citation
The R Journal (August 2016) 8(1); Editor: Michael Lawrence
Abstract
This software paper describes ‘Stylometry with R’ (stylo), a flexible R package for the high-level analysis of writing style. Stylometry (computational stylistics) is concerned with the quantitative study of writing style, e.g. authorship verification, an application with considerable potential in forensic contexts as well as in historical research. In this paper we introduce the possibilities of stylo for computational text analysis via a number of dummy case studies from English and French literature. We demonstrate how the package is particularly useful in the exploratory statistical analysis of texts, e.g. with respect to authorial writing style. Because stylo provides an attractive graphical user interface for high-level exploratory analyses, it is especially suited to an audience of novices without programming skills (e.g. from the Digital Humanities). More experienced users can benefit from our implementation of a series of standard pipelines for text processing, as well as a number of similarity metrics.
Included in
Numerical Analysis and Scientific Computing Commons, Programming Languages and Compilers Commons
Comments
Copyright 2016, The R Foundation. Open access material. License: CC BY 3.0 Unported