Study of 800 million tweets finds distinct daily cycles in our thinking patterns

6 am: our peak time for analytical thinking

Our mode of thinking changes at different times of the day and follows a 24-hour pattern, according to new findings published in PLOS ONE. University of Bristol researchers were able to study our thinking behaviour by analysing seven billion words used in 800 million tweets.

Researchers in Artificial Intelligence and in Medicine used AI methods to analyse aggregated and anonymised UK Twitter content, sampled every hour over the course of four years across 54 of the UK’s largest cities, to determine whether our thinking modes change collectively.

The researchers tracked the use of specific words across the Twitter sample. These words are associated with 73 psychometric indicators, which are used to help interpret information about our thinking style. Variations in this language over the day revealed different emotional and cognitive modalities in our thoughts.
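As a rough illustration of this kind of word tracking, the sketch below counts how often words from a given list appear in each hour, relative to all words posted in that hour. The two word lists and the sample messages are invented for the example; they are not the 73 indicators or the data used in the study, and the authors' actual pipeline may differ.

```python
from collections import Counter, defaultdict

# Hypothetical word lists standing in for two psychometric indicators.
# The real study tracked 73 indicators; these vocabularies are invented.
INDICATORS = {
    "analytic": {"the", "of", "in", "report"},
    "positive_emotion": {"happy", "love", "great"},
}

def hourly_indicator_rates(tweets):
    """tweets: iterable of (hour, text) pairs.

    Returns {indicator: {hour: share of that hour's words in the list}}.
    """
    totals = Counter()            # total words seen in each hour
    hits = defaultdict(Counter)   # indicator -> hour -> matching words
    for hour, text in tweets:
        words = text.lower().split()
        totals[hour] += len(words)
        for name, vocab in INDICATORS.items():
            hits[name][hour] += sum(1 for w in words if w in vocab)
    return {
        name: {h: hits[name][h] / totals[h] for h in totals if totals[h]}
        for name in INDICATORS
    }

# Invented sample messages, one in the morning and one at night.
sample = [
    (6, "the report of the committee"),
    (22, "love this great night"),
]
rates = hourly_indicator_rates(sample)
```

Averaging such hourly rates over many days would give one 24-point daily profile per indicator, which is the kind of signal the article describes.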

Analytical thinking was shown to peak at 6 am, when the words and language used correlated with a more logical way of thinking. In the evenings and nights, however, this thinking style changed to a more emotional and existential one.

Although 73 different psychometric quantities were tracked, the team found there were just two independent underlying factors that explained most of the temporal variations across the data.

The first factor, with a peak expression time starting at around 5 am to 6 am, was linked with measures of analytical thinking through the high use of nouns, articles and prepositions, which has been related in other studies to intelligence, improved class performance and education. This early-morning period also showed increased concern with achievement and power. At the opposite end of the spectrum, the researchers found a more impulsive, social and emotional mode.

The second factor has a peak expression time starting at 3 am to 4 am. The aggregated Twitter content at this time was found to be correlated with the language of existential concerns but anticorrelated with the expression of positive emotions.

Overall, the study discovered strong evidence that our language changes dramatically between night and day, reflecting changes in our concerns and underlying cognitive and emotional processes. These shifts also occur at times associated with major changes in neural activity and hormonal levels, suggesting possible relations with our circadian clock. Furthermore, the study revealed that both cognitive and emotional states change in a predictable way during the 24 hours.

Professor Nello Cristianini, Professor of Artificial Intelligence and the project lead, said: “The analysis of media content, when done correctly, can reveal useful information for both social and biological sciences. We are still trying to learn how to make the most of it.”

Professor Stafford Lightman, Professor of Medicine and a neuroendocrinology expert at Bristol Medical School, and one of the study’s authors, added: “Circadian rhythms are a major feature of most systems in the human body, and when these are disrupted they can result in psychiatric, cardiovascular and metabolic disease. The use of media data allows us to analyse neuropsychological parameters in a large unbiased population and gain insights into how mood-related use of language changes as a function of time of day. This will help us understand the basis of disorders in which this process is disrupted.”



‘Diurnal Variations of Psychometric Indicators in Twitter Content’ by Fabon Dzogang, Stafford Lightman, Nello Cristianini in PLOS ONE [PubMed].


Circadian Mood Variations in Twitter Content

Twitter can reveal our Collective Emotions


In the largest study of its kind, researchers from the University of Bristol have analysed mood indicators in text from 800 million anonymous messages posted on Twitter. These tweets were found to reflect strong patterns of positive and negative moods over the 24-hour day.

Circadian rhythms, widely referred to as the ‘body clock’, allow people’s bodies to predict their needs over the dark and light periods of the day. Most of this circadian activity is regulated by a small region in the hypothalamus of the brain called the suprachiasmatic nucleus, which is particularly sensitive to light changes at dawn and dusk, and sends signals through nerves and hormones to every tissue in the body.


The research team looked at the use of words relating to positive and negative emotions, sadness, anger, and fatigue on Twitter over the course of four years. The public expressions of affect and fatigue were linked to the time they appeared on the social platform to reveal changes within the 24-hour day. Whilst previous studies have shown a circadian variation for positive and negative emotions, the current study was able to differentiate specific aspects of anger, sadness, and fatigue.

Lead author and machine learning researcher Dr Fabon Dzogang, working with neuroscientist and current British Neuroscience Association President Professor Stafford Lightman from Bristol Medical School: THS, and with Nello Cristianini, Professor of Artificial Intelligence from the Department of Engineering Mathematics, found distinct patterns of positive emotions and sadness between the weekends and the weekdays, and evidence of variation of these patterns across the seasons.


Dr Fabon Dzogang, research associate in the Department of Computer Science, said: “Our research revealed strong circadian patterns for both positive and negative moods. The profiles of anger and fatigue were found to be remarkably stable across the seasons and between weekdays and weekends. The patterns that our research revealed for the positive emotions and sadness showed more variability in response to these changing conditions, and higher levels of interaction with the onset of sunlight exposure. These techniques that we demonstrated on social media provide valuable tools for the study of our emotions, and for the understanding of their interaction within the circadian rhythm.”

Stafford Lightman, Professor of Medicine and co-author, added: “Since many mental health disorders are affected by circadian rhythms, we hope that this study will encourage others to use social media to help in our understanding of the brain and mental health disorders.”



Narrative Network Analysis of British Power Structures

By: Ana Rubio Denniss, cross-posted from

To say that our society is shaped by the actions of individuals and institutions may seem obvious, but how could one quantify the prevalence and importance of such actions on society? This is the question that a University of Bristol research collaboration between the Intelligent Systems Laboratory (Thomas Lansdall-Welfare, Saatviga Sudhahar and Nello Cristianini) and the Department of History (James Thompson) set out to answer in their recent work “The Actors of History: Narrative Network Analysis Reveals the Institutions of Power in British Society Between 1800-1950”.

This work looks specifically at the roles of key players in British society over a 150-year period from 1800 to 1950 and how these societal roles changed over this time. These players may be individuals such as the reigning King or Queen, or institutions such as the Church. The analysis was undertaken using narrative network analysis, turning digital information from 39.5 million local newspaper articles published in this time into a series of narrative networks. The complex social interactions were transformed into these visual networks by representing the different players of influence as network nodes, the actions of these influencers as links, and the closely interacting sections of society built around them as their surrounding communities.
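To make the node-and-link representation concrete, here is a minimal sketch that builds a small network from subject-verb-object triplets. The triplet format and the example triplets are assumptions made for illustration; the article does not specify how actions were extracted from the newspaper text.

```python
from collections import defaultdict

# Hypothetical (subject, verb, object) triplets, invented for illustration.
triplets = [
    ("queen", "opened", "parliament"),
    ("parliament", "passed", "bill"),
    ("church", "opposed", "bill"),
    ("queen", "addressed", "parliament"),
]

def build_network(triplets):
    """Return (nodes, edges); edges maps (subject, object) to its list of verbs."""
    nodes = set()
    edges = defaultdict(list)
    for subj, verb, obj in triplets:
        nodes.update((subj, obj))          # players become nodes
        edges[(subj, obj)].append(verb)    # actions become links
    return nodes, dict(edges)

def degree(nodes, edges):
    """Count how many distinct links touch each node."""
    deg = {n: 0 for n in nodes}
    for subj, obj in edges:
        deg[subj] += 1
        deg[obj] += 1
    return deg

nodes, edges = build_network(triplets)
deg = degree(nodes, edges)
```

In this toy network "parliament" and "bill" each touch two links, so a simple degree count already marks them as more central than "queen" or "church". Community detection, as described in the article, would operate on a structure of this kind at a far larger scale.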

The results of this data-driven analysis of 28.6 billion words taken from 150 years’ worth of newspaper reporting involve 29 networks comprising 156,738 nodes connected by 230,879 edges. The image below shows one narrative network in a ten-year period from 1905 to 1915, with colour-coded nodes denoting the broad topics to which they belong.



Colour-coded network topics

Another key element of the work was the detection of communities, which was performed by computing the frequency with which different key players performed interactions to or on each other. Once communities had been computed around the nodes, macro-communities were formed to show community structures which were persistent over the 150-year period across the majority of the 29 networks. The resulting macro-community for the most central 1000 players is shown below.





The findings of this project are interesting in themselves, giving insight into the power structures of British society and how these link together to influence public life. However, the project itself has much wider implications in terms of the application of computational tools, particularly AI algorithms in this case, to Digital Humanities. Due to the numerous library digitisation projects taking place worldwide, this field is rapidly expanding and requires tools with which to deal with the large number of resources becoming digitally available. The meaningful representations produced in this paper demonstrate that many of these tools already exist in the spheres of AI, big data and data visualisation, and that collaboration with these fields has great potential to provide information and understanding of qualitative resources.

“Content Analysis of 150 Years of British Periodicals” published in PNAS



What could be learnt about the world if you could read the news from over 100 local newspapers for a period of 150 years? This is what a team of Artificial Intelligence (AI) researchers from the University of Bristol have done, together with a social scientist and a historian, who had access to 150 years of British regional newspapers.

The patterns that emerged from the automated analysis of 35 million articles ranged from the detection of major events, to the subtle variations in gender bias across the decades. The study has investigated transitions such as the uptake of new technologies and even new political ideas, in a new way that is more like genomic studies than traditional historical investigation.

The team of academics, led by Professor Nello Cristianini, collaborated closely with the company findmypast, which is digitising historical newspapers from the British Library as part of its British Newspaper Archive project.

The main focus of the study was to establish if major historical and cultural changes could be detected from the subtle statistical footprints left in the collective content of local newspapers. How many women were mentioned? In which year did electricity start being mentioned more than steam? Crucially, this work goes well beyond counting words, and deploys AI methods to identify people and their gender, or locations and their position on the map.

The landmark study, part of the University of Bristol’s ThinkBIG project, collected a huge amount of regional newspapers from the UK, including geographical and time-based information that is not available in other textual data such as books. Over 35 million articles and 28.6 billion words, from the British Library’s newspaper collections, representing 14 per cent of all British regional outlets from 1800 to 1950, were used for the study.

Nello Cristianini, Professor of Artificial Intelligence, from the Department of Engineering Mathematics, said: “The key aim of the study was to demonstrate an approach to understanding continuity and change in history, based on the distant reading of a vast body of news, which complements what is traditionally done by historians.

“The research team showed that changes and continuities detected in newspaper content can reflect culture, biases in representation or actual real-world events.”

Simple content analysis allowed the researchers to detect specific key events like wars, epidemics, coronations or gatherings with high accuracy, while the use of more refined techniques from AI enabled the research team to move beyond counting words by detecting references to named entities, such as individuals, companies and locations.

Some of the results were to be expected, and acted as a rational check for the approach, while other outcomes were not so obvious at the start of the analysis.

In the areas of values, beliefs and UK politics, the researchers found that in the 19th century Gladstone was much more newsworthy than Disraeli; that until the 1930s Liberals were mentioned more than Conservatives; and that reference to British identity took off in the 20th century.

In the subjects of technology and economy, the research team tracked the steady decline of steam and the rise of electricity, with a crossing point of 1898; trains overtook horses in popularity in 1902; and the four largest peaks for ‘panic’ corresponded with negative market movements linked to banking crises in 1826, 1847, 1857 and 1866.
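A crossing point of this kind can be located by comparing two yearly frequency series and finding the first year in which one overtakes the other. The sketch below shows the idea; the frequency values are invented, chosen only so that the crossing falls in the reported year, and the function name is hypothetical.

```python
def first_crossing(years, rising, falling):
    """Return the first year in which `rising` exceeds `falling`
    after being at or below it in the previous year, or None."""
    for i in range(1, len(years)):
        if rising[i - 1] <= falling[i - 1] and rising[i] > falling[i]:
            return years[i]
    return None

# Invented relative frequencies (mentions per million words), illustration only.
years = [1896, 1897, 1898, 1899]
steam = [50.0, 46.0, 41.0, 37.0]
electricity = [38.0, 42.0, 43.0, 47.0]

crossing = first_crossing(years, electricity, steam)
```

On real data the yearly series would normally be smoothed first, since raw counts can cross back and forth several times before one term settles above the other.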

The researchers have shown in the subjects of social change and popular culture that the Suffragette movement fell within a delimited time interval 1906 to 1918; ‘actors’, ‘singers’ and ‘dancers’ began to increase in the 1890s, rising significantly from then on, while references to ‘politicians’, by contrast, gradually declined from the early 20th century; and that ‘football’ was more prominent than ‘cricket’ from 1909.

Replicating a previous study done on book content, the researchers then moved on to link famous people in the news to their profession, finding that politicians and writers are most likely to achieve notoriety within their lifetimes, while scientists and mathematicians are less likely to achieve fame but decline less sharply.

More importantly, the researchers found that males are systematically more present than females during the entire period studied, but there is a slow increase of the presence of women after 1900, although it is difficult to attribute this to a single factor at the time. Interestingly, the amount of gender bias in the news over the period of investigation is not very different from current levels.

Dr Tom Lansdall-Welfare, Research Associate in Machine Learning in the Department of Computer Science, who led the computational part of the study, said: “We have demonstrated that computational approaches can establish meaningful relationships between a given signal in large-scale textual corpora and verifiable historical moments.

“However, what cannot be automated is the understanding of the implications of these findings for people, and that will always be the realm of the humanities and social sciences, and never that of machines.”

The researchers believe that these data-driven approaches can complement the traditional method of close reading in detecting trends of continuity and change in historical corpora. The contribution that Big Data and AI can make to the field of Digital Humanities is still largely unexplored, and is one of the most exciting areas of cross-disciplinary research enabled by the new field of Data Science. The ThinkBIG project is aimed at exploring the interplay between Social Science, Humanities, and large-scale data-driven AI.

Paper: Content Analysis of 150 Years of British Periodicals by Thomas Lansdall-Welfare, Saatviga Sudhahar, James Thompson, Justin Lewis, The FindMyPast Newspaper Team and Nello Cristianini in Proceedings of the National Academy of Sciences of the United States of America (PNAS).


Secondary Data from the Study:

Two workshop papers accepted for presentation at ICDM 2016

Two papers from the ThinkBIG project have been accepted for presentation at workshops at the International Conference on Data Mining taking place in Barcelona, Spain from the 12th – 15th December 2016.

The first paper, Seasonal Fluctuations in Collective Mood Revealed by Wikipedia Searches and Twitter Posts, will be presented at the Sentiment Elicitation from Natural Text for Information Retrieval and Extraction workshop (SENTIRE) by Fabon Dzogang. In this paper, we investigate seasonal fluctuations in mood and mental health by analyzing the access logs of Wikipedia pages and the content of Twitter in the UK over a period of four years. By using standard methods of Natural Language Processing, we extract daily indicators of negative affect, anxiety, anger and sadness from Twitter and compare these with the overall daily traffic to Wikipedia pages about mental health disorders.

The second paper, Change-point Analysis of the Public Mood in UK Twitter during the Brexit Referendum, will be presented at the Data Mining in Politics workshop (DMiP) by Thomas Lansdall-Welfare. In this paper, we study the changes in public mood within the contents of Twitter in the UK, in the days before and after the Brexit referendum. We measure the levels of anxiety, anger, sadness, negative affect and positive affect in various geographic regions of the UK, at hourly intervals. We analyse these affect time series by looking for change-points common to all five components, locating points of simultaneous change in the multivariate series using the fast group LARS algorithm, an algorithm originally developed for bioinformatics applications.
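The idea of a change-point shared by several affect series can be illustrated with a much simpler method than the fast group LARS algorithm used in the paper. The sketch below is a stand-in, not the paper's method: it tries every split position and keeps the one that best fits each series as one constant level before the split and another after it. The scores are invented for the example.

```python
def best_common_change_point(series):
    """series: list of equal-length lists, one per affect component.

    Returns the index t (1 <= t < n) that minimises the total squared error
    when every component is modelled as one constant before t and another
    constant from t onwards. Finds a single change-point only.
    """
    n = len(series[0])

    def sse(values):
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    best_t, best_cost = None, float("inf")
    for t in range(1, n):
        # Sum the fit error over all components so the split is shared.
        cost = sum(sse(s[:t]) + sse(s[t:]) for s in series)
        if cost < best_cost:
            best_t, best_cost = t, cost
    return best_t

# Invented hourly scores for two components with a shared shift at index 4.
anxiety = [1.0, 1.1, 0.9, 1.0, 2.0, 2.1, 1.9, 2.0]
sadness = [0.5, 0.4, 0.6, 0.5, 1.5, 1.6, 1.4, 1.5]
change = best_common_change_point([anxiety, sadness])
```

Summing the error across components is what makes the change-point common to all of them: a split that suits one series but not the others is penalised.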


Public lecture on “Living in a Data Obsessed Society”


On 2nd December 2016, Nello Cristianini gave a public lecture with James Ladyman and Andrew Charlesworth, hosted by Abigail Fraser, to discuss the theme of “living in a data obsessed society” with the public.

A new unified data infrastructure that mediates a broad spectrum of our daily transactions, communications, and decisions has emerged from the data revolution of the past decade. New AI technologies permit this infrastructure to infer our inclinations and predict our behaviour for an increasing range of activities, whether social, economic or regulatory. As opting out is no longer a realistic option, we must strive to understand the effects this new reality can have on society.
Presently, we are ‘sleepwalking’ into unquestioning acceptance of a data ideology which presupposes that data-driven decisions are inherently neutral, objective and effective. Growing evidence to the contrary requires that such assumptions be rigorously and robustly questioned. From privacy to persuasion, this technology will affect all of us.

Issues that demand wider debate include addressing the risks of unintended discrimination, challenging spurious claims of objectivity, the need to uphold an ethics of privacy and autonomy, and the importance of understanding the future roles and capabilities of intelligent machines.

A data scientist, a philosopher of science, and a legal scholar will present their work on the theme of “living in a data obsessed society”.

Big data shows people’s collective behaviour follows strong periodic patterns

New research has revealed that by using big data to analyse massive data sets of modern and historical news, social media and Wikipedia page views, periodic patterns in the collective behaviour of the population can be observed that could otherwise go unnoticed.

Academics from the University of Bristol’s ThinkBIG project, led by Nello Cristianini, Professor of Artificial Intelligence, have published two papers that have analysed periodic patterns in daily media content and consumption: the first investigated historical newspapers, the second Twitter posts and Wikipedia visits.

The two sets of findings, taken together, show that people’s collective behaviour follows strong periodic patterns and is more predictable than previously thought.  However, these patterns can often only be revealed when analysing the activities of a large number of people for a very long time, and until recently this has been a very difficult task.

By using big data technologies it is now possible to obtain a unified look at newspaper content, for dozens of newspapers at the same time, spanning several decades or to analyse the contents posted on Twitter by large numbers of users, or even the Wikipedia pages visited.

Professor Nello Cristianini, from the Department of Engineering Mathematics, said: “What emerges is a glimpse at the regularities in our behaviour that are hidden behind the day-to-day variations in our lives.

“Our two papers have shown that by analysing massive data sets of modern and historical news, social media and Wikipedia page views, we can obtain an unprecedented look at our collective behaviour, revealing cycles that we certainly suspected, but that have never been observed before.”

The first paper, published in the journal PLOS ONE, analysed 87 years of US and UK newspapers between 1836 and 1922.  The researchers found people’s leisure and work were strongly regulated by the weather and seasons, with words like picnic or excursion consistently peaking every summer in the UK and US.

Much of our diet was influenced by the seasons too, with very predictable peak times for different fruits and foods, and even flowers, in the historical news. The same was found for diseases: the peak season for measles in both countries, for example, was found to be in late March to early April. Interestingly, a strong indicator was provided by the very periodic re-appearance of gooseberries every June, which is no longer found in modern news, along with many other lost traditions.

This may seem obvious, but the research team also noticed that certain activities that used to be highly regular, like Christmas lectures, have now all but disappeared, and have been replaced by other periodic activities, like football, Ibiza and Oktoberfest. In some ways, the TV has partly replaced the weather as a major factor of synchronisation of people’s lives.

In the second paper, to be presented next month at a workshop at the 2016 IEEE International Conference on Data Mining (ICDM), the researchers discovered that seasons may also have strong effects on mental health.  The team analysed the aggregate sentiment in Twitter in the UK, plus aggregate Wikipedia access over four years.  They found that negative sentiment is overexpressed in the winter, peaking in November, and anxiety and anger are overexpressed between September and April.

At the same time, an analysis of Wikipedia visits for mental health pages, globally but strongly dominated by northern hemisphere traffic, showed clear seasonality in searches for specific forms of mental issues. For example, visits to the page on seasonal affective disorder peak in late December and panic disorder visits peak in April, at the same time as visits to the page on acute stress disorder.

Together, these two articles show that the use of multiple sources of big data can enable researchers to look at the collective behaviour, and even the mood and mental health, of large populations, revealing cycles for the first time that have been suspected but were difficult to observe.

New study in PLOS ONE shows women are seen more than heard in online news

It has long been argued that women are under-represented and marginalised in relation to men in the world’s news media. New research, using artificial intelligence (AI), has analysed over two million articles to find out how gender is represented in online news. The study, which is the largest undertaken to date, found men’s views and voices are represented more in online news than women’s.

What is perhaps more interesting is that the research found – while being overall under-represented – women appear proportionally more in images than men, while men are mentioned more in text than women.  A breakdown of topics shows that women feature more in articles about fashion, followed by entertainment and art, while being least present in topics including sport and politics.

A team of AI experts at the University of Bristol’s Intelligent Systems Laboratory (ISL), led by Nello Cristianini, Professor of Artificial Intelligence, teamed up with social scientist Dr Cynthia Carter from Cardiff University to ask a very old question on a very new scale. How many men and how many women are mentioned in the news, or portrayed in newspaper images, over a long period of time and in hundreds of different newspapers?

Modern AI, which is frequently in the news, is a great tool to support research and can automate tasks that would take humans an impossible amount of person-hours to complete.  It is now possible to automate the task of recognising the gender of a face with a remarkable level of accuracy, and it is also possible to detect references to people in online text, along with their gender.

The paper, published in PLOS ONE, reports the findings from a large-scale, data-driven study of gender representation in online English language news media. The researchers analysed both words and images to give a broader picture of how gender is represented in online news.

The team gathered a body of news consisting of 2,353,652 articles collected over a period of six months from more than 950 different news outlets. From this dataset, they extracted 2,171,239 references to named persons and 1,376,824 images, resolving the gender of names and faces using AI.

The researchers found that males were represented more often than females in both images and text, but in proportions that changed across topics, news outlets and style.

Additionally, the proportion of females was consistently higher in images than in text for virtually all topics and news outlets.  Women were more likely to be represented visually than mentioned as a news actor or source.

Professor Nello Cristianini from the University’s Department of Engineering Mathematics, said: “Just a few years ago, it would not have been possible for a computer to determine the gender of a face, or to process such a large amount of text, with the necessary accuracy and speed. The analysis of millions of articles and images is one of the ways in which modern AI can help scientific research. When Big Data meets AI we see benefits in many areas of business and technology. Now we can also see benefits in the way we do science.”

Dr Cynthia Carter, Senior Lecturer in the Cardiff School of Journalism, Media and Cultural Studies, added: “Our large-scale, data-driven analysis offers important empirical evidence of macroscopic patterns in news content, supporting feminist researchers’ longstanding claim that the marginalisation of women’s voices in the news media under-values their potential contributions to society, and in the process, diminishes democracy.”