Launch Announcement: The Brand-New Data Knowledge Hub for Monitoring Online Discourse
Cathleen Berger, Charlotte Freihse
What is the Data Knowledge Hub?
The Data Knowledge Hub aims to lower the barriers to monitoring social media and online discourse. Hosted as open source under a Creative Commons license on GitHub, it continuously welcomes contributions of new data, code, and written content, fostering a collaborative environment for all. Cooperation and collaboration among established actors on development, design, content, and scope are key to turning this Data Knowledge Hub into a useful tool and an enabler of future research.
Why are we launching it?
Online discourse has changed how we inform ourselves, what and whom we trust, and quite simply how we access information. On online platforms and social media in particular, recommender systems and other design features can be gamed to fuel disinformation, hate speech, and outrage. In addition, messaging services and alternative platforms are increasingly at risk of exploitation, providing agitators with vast audiences to spread falsehoods. But how and why exactly this is happening remains under-researched and illustrated only by anecdote. If we want to strengthen our information ecosystem and increase everyone’s ability to decide what’s trustworthy and what’s not, we need to move from anecdotes to broad, continuous, and ideally real-time data-driven insight.
There are renowned, well-established organisations that do incredible work on monitoring online discourse, including CeMAS, Democracy Reporting International, the SPARTA Project of the Bundeswehr University Munich, and the Institute for Strategic Dialogue. Yet even these established players face several challenges, among them the multitude of digital platforms, the sheer amount of data and the server capacity it requires, fast-developing and constantly changing narratives, and new and changing actors and agitators.
And this is indeed the challenge: given the growing number of social media and other digital platforms and the huge amounts of data to analyse, it is critical to enable and empower more researchers, both social scientists and data scientists. On the one hand, we need to be able to monitor platforms on a technical level, for example to support risk assessments under the Digital Services Act (DSA) or to design large-scale, real-time analyses of online public discourse. On the other hand, we must also empower researchers to assess this data in its socio-political context.
Journalists and other practitioners, too, will find renowned experts, insights, methods, and code in the hub to conduct their own data-driven analyses and boost their reporting.
Currently, you’ll find chapters on the legal basis and ethical standards, good practices for data collection, exemplary research on X (formerly Twitter) and TikTok, as well as code samples to monitor various platforms. The collection is still growing.
We welcome contributions on a rolling basis and would be happy to discuss your content ideas. Simply reach out via email.