Motivation
“The world faces two major problems in 2020, the predicted climate change and the unexpected COVID-19 epidemic.”
“The Coronavirus Disease 2019 (COVID-19) pandemic is one of the most impactful crises that shook global economies, restricted social activities, emptied public spaces, infected 182 million people, and has taken more than 3.94 million lives from 213 countries (as of July 2021) since its outbreak in December of 2019.”
COVID-19 has caused major shifts in people’s lives through the implementation of unprecedented non-pharmaceutical interventions, which in turn created challenges not only for physical health but for mental health as well. This has prompted an unprecedented effort to better understand how the pandemic affects human needs and health concerns.
Pageviews of Wikipedia, the world’s largest online encyclopedia, can reflect the major developments and shifts in people’s attention over the course of the pandemic. Tracking these digital footprints is important: it helps governments understand what the public values and so respond better to potential public health emergencies in the future.
Data
We use the following datasets:
- Mobility data from Google
- COVID-19 case data from WHO and Johns Hopkins University
- Pageviews from the Wikipedia dataset
You can find the complete data about this story here, and more detailed data for the beginning of the journey can be found here. For a more detailed description of the data, you can refer to this paper: Sudden Attention Shifts on Wikipedia During the COVID-19 Crisis.
The mobility data describe the daily percentage change in the time people spend in different places, while the COVID-19 data give the number of new cases per day. For people’s online behaviour, we selected a total of 544 different pages in 3 categories and analysed how their page views change per day or per week.
In most countries the COVID-19 outbreak began in the spring of 2020. We are therefore interested in people’s mobility during the unusual year of 2020 and in how Wikipedia page views relate to the outbreak, especially from January to October 2020, when COVID-19 was particularly severe. As a comparison, we also collected data for the same period in 2019 as a reference for a typical year. In addition, we have data describing key public events in each area, such as the date of the first case.
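The comparison against the 2019 reference period can be sketched as follows. This is a minimal illustration, not necessarily the method used for this story: it assumes daily page views are available as a mapping from date to count, groups them into ISO weeks, and reports the percentage change of a 2020 week relative to the same week number in 2019. The function names `weekly_totals` and `relative_change` and the sample numbers are hypothetical.

```python
from collections import defaultdict
from datetime import date


def weekly_totals(daily_views):
    """Sum daily page views into (ISO year, ISO week) buckets."""
    totals = defaultdict(int)
    for day, views in daily_views.items():
        iso = day.isocalendar()  # (ISO year, ISO week, ISO weekday)
        totals[(iso[0], iso[1])] += views
    return dict(totals)


def relative_change(totals, week, year=2020, baseline_year=2019):
    """Percentage change of one week versus the same week of the baseline year."""
    base = totals[(baseline_year, week)]
    if base == 0:
        raise ValueError("baseline week has no page views")
    return (totals[(year, week)] - base) / base * 100.0


# Hypothetical page views for one page (made-up numbers).
daily = {
    date(2019, 3, 11): 100,
    date(2019, 3, 12): 100,
    date(2020, 3, 9): 150,
    date(2020, 3, 10): 150,
}
totals = weekly_totals(daily)
change = relative_change(totals, week=11)
```

Aligning by ISO week rather than by calendar date avoids the mismatch caused by 29 February 2020 and keeps weekdays lined up between the two years.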
In total, we collected Wikipedia pages in 12 different languages, corresponding to 24 different countries, as shown in the figure below. Because each entry in the Wikipedia dataset shares the same QID across languages, we only need to identify the entries in the English category to collect the data for all languages. Of course, not every entry exists in every language, so the dataset differs slightly between languages. To aggregate the mobility and COVID-19 case data across the countries of one language, we weight each country by its population. The story we tell is therefore not about the U.S. and the U.K. individually, but about English-speaking countries as a group, and likewise for the other languages.
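The population weighting described above can be sketched as a weighted mean over the countries that share a language. This is a minimal sketch under stated assumptions: the helper `weighted_mean`, the country codes, the population figures, and the mobility values are all illustrative and are not taken from the datasets.

```python
def weighted_mean(values, weights):
    """Population-weighted mean of per-country values.

    Only countries present in `values` contribute, so a country with
    missing data on a given day does not distort the weights.
    """
    total_weight = sum(weights[country] for country in values)
    if total_weight == 0:
        raise ValueError("total weight is zero")
    return sum(values[c] * weights[c] for c in values) / total_weight


# Hypothetical inputs: rough populations and one day's mobility change (%).
population = {"US": 330_000_000, "UK": 67_000_000}
mobility_change = {"US": -20.0, "UK": -40.0}

english_speaking = weighted_mean(mobility_change, population)
```

With these made-up inputs the result lies closer to the U.S. value than to the U.K. value, because the larger population carries more weight.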