Big Data, Algorithms, and the Princely County of Gorizia and Gradisca

A lost world in a former empire in Europe has been brought to life thanks to University of Bristol researchers who used artificial intelligence (AI) techniques to analyse 47,000 multilingual pages from newspapers dating back to 1873.

The study, published in Historical Methods, aimed to discover whether historical changes could be detected from the collective content of local newspapers from the Princely County of Gorizia and Gradisca. The findings reveal a series of political and cultural events, some of them unknown until now, that took place in a forgotten corner of the Austrian Empire that is now divided between Italy and Slovenia.

A team of computer scientists and a historian digitised microfilms of old multilingual newspapers published in the County between 1873 and 1914. The images were then converted to text. The patterns that emerged from the automated analysis of 47,000 pages revealed not only the individual stories of thousands of people but also the collective trends of a population in the years leading up to WW1 and the final years of that Empire.

Professor Cristianini, Professor of Artificial Intelligence and lead author of the study, said: “Importantly, we get a glimpse in the last years of a world heading towards a new chapter in its history and during a period that transformed it beyond recognition. We see new technologies, new ideas, new economic opportunities, new cultural challenges and problems.”

The findings highlight how the war transformed the city and its county into something entirely different. The front lines crossed through the city itself and the urban population was largely relocated. The annexation of the city by Italy was quickly followed by twenty years of fascism, another war, and finally the Iron Curtain that ran right through the County itself, partly separating the city centre and some of its neighbourhoods.

Professor Cristianini added: “In this paper we have shown that, in the space of a few decades, the town embraced new ways to communicate, such as the cinema and the telephone, along with modes of transportation, like the car, the airplane, the bicycle and the train. Far from being a backwater of a decaying empire, this was a city with an eye on the future and an interest in new ideas – including political ones. It was, however, also a time in which new tensions emerged along ethnic lines and of rapid change, with problems and anxieties that sound very familiar to the modern ear.

“It is also incredibly fortunate that the collection of newspapers in the Biblioteca Isontina library survived so many threats.”

The Bristol group will hold a workshop on Digital Humanities in Windsor in June.

This project was partly funded by the ERC Advanced Grant ThinkBIG, which explores applications and implications of big-data technologies.


Large-scale content analysis of historical newspapers in the town of Gorizia 1873–1914 by N Cristianini, T Lansdall-Welfare and G Dato in Historical Methods: A Journal of Quantitative and Interdisciplinary History, DOI: 10.1080/01615440.2018.1443862
