Platform accountability through independent research: What is, what is missing, what is next

Cathleen Berger

Article

The key to successfully mitigating disinformation could lie in moving from anecdotal analyses of attacks on digital discourse to continuous, data-driven monitoring. That sounds simple, but it faces numerous hurdles in practice. Going forward, we need a hub for knowledge and data management to fill serious gaps.

Role and limits of monitoring social media platforms

In January 2024, the German Federal Foreign Office investigated a large-scale Russian disinformation campaign. Its strategic communication team identified over 50,000 bot accounts that were automatically spreading false information about the war in Ukraine and amplifying one another. Since 2022, EU DisinfoLab, Correctiv, and others have been uncovering new cases of the so-called ‘doppelganger’ campaign, in which the websites of established, wide-reach outlets such as Der Spiegel, Zeit Online, and Le Monde are copied and their content replaced with individual, manipulated articles and disinformation. Even though the URLs of the cloned pages differ from the originals, the deception is often not recognisable at first glance, so the supposedly ‘reputable’ but false articles are sometimes distributed widely on platforms and in messaging apps.

Such attacks are detected by monitoring digital discourse, which involves checking posts, interactions, and trends for anomalies. Without this monitoring, numerous attacks on our discourse would remain undetected – a major threat to the quality of and trust in our information ecosystem. In recent years, there has been a visible increase in the number of civil society organisations that use or want to use monitoring to propose more concrete changes to platforms and to support the mitigation of threats. The demands for better, more reliable, and easier access to platform data for independent research are as diverse as the respective contexts: be it CeMAS, the Institute for Strategic Dialogue, the Mozilla Foundation, Democracy Reporting International, Aos Fatos, Soch Fact Pakistan, Media Monitoring Africa, or the Coalition for Independent Tech Research. There are now dozens, if not hundreds, of organisations worldwide that integrate monitoring into their work and help to better understand and defuse disinformation campaigns.

Successes such as those mentioned above are heartening – and yet they are anecdotes. They identify individual attacks, reveal bot armies and networks, or sensitise people to patterns in disinformation narratives. However, they remain selective and are structurally limited in their reach and speed. There are several reasons for this.

Platform Governance: Private actors with public responsibility

One reason for this lies in the tension between the public sphere and private-sector platforms. In the digital public sphere, more and more of our societal discourse takes place on social media platforms, which, as private actors, have a disproportionate influence on our lived realities. Of course, this also entails a high level of responsibility, as numerous experts have repeatedly and emphatically pointed out and demanded in recent years. Where platforms initially based their rules for content moderation on internal business guidelines, today many such decisions are either legally pre-structured, supervised, or critically assessed from the outside.

The pressure to act responsibly has increased noticeably. At the same time, the imbalance of power persists: despite new requirements and increasing regulation in many countries around the world, our insight into the functional logic and available data of platforms remains limited. This is another reason why the importance and necessity of independent research and data analysis have been the subject of intense debate, including political debate, at least since the negotiations surrounding the Digital Services Act (DSA).

Under the DSA, disinformation is considered a ‘systemic risk’ to democracy. The DSA obliges the largest platforms (‘very large online platforms’), such as TikTok, YouTube, Facebook, and LinkedIn, to take decisive action to counter these risks. Platforms are required to prevent the spread of disinformation and to design and curate their services in such a way that risks can be minimised, violations can be tracked, and countermeasures can be evaluated. Civil society organisations and academics play a central role in the implementation of these obligations. They act as a corrective, an early warning system, and a source of inspiration for how to foster healthy digital discourse. To fulfil these functions, civil society organisations need access to platform data. This is regulated in Article 40 of the DSA – the clearest framework to date for research access to platform data.

Data access for research purposes: Data protection, applicability, coordination

Despite the supposedly clear legal framework, inconsistencies and uncertainties emerge in practice. Monitoring is based on data that the platforms already collect about their users, information flows, interactions, or the effect of their design choices. Such data can be used, for example, to visualise networks, track the virality of individual posts, and trace patterns of ‘comment bots’ and pseudo-accounts.
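To make this concrete, the following is a minimal, hypothetical sketch of two common monitoring heuristics: building an amplification network from repost data and flagging accounts that post at suspiciously regular intervals. The column names and thresholds are illustrative assumptions, not a description of any particular organisation’s method.

```python
# Minimal, hypothetical sketch of two common monitoring heuristics: building an
# amplification network from repost data and flagging accounts that post at
# suspiciously regular intervals. Column names (account, reposted_from, timestamp)
# and thresholds are illustrative assumptions.
import pandas as pd
import networkx as nx

def build_amplification_graph(df: pd.DataFrame) -> nx.DiGraph:
    """One edge per account pair, weighted by how often one account amplified the other."""
    graph = nx.DiGraph()
    for (src, dst), group in df.groupby(["account", "reposted_from"]):
        graph.add_edge(src, dst, weight=len(group))
    return graph

def flag_regular_posters(df: pd.DataFrame, min_posts: int = 20, max_cv: float = 0.2) -> list[str]:
    """Flag accounts whose gaps between posts are unusually uniform (a crude bot signal)."""
    flagged = []
    for account, group in df.groupby("account"):
        gaps = group["timestamp"].sort_values().diff().dt.total_seconds().dropna()
        if len(gaps) >= min_posts and gaps.mean() > 0 and gaps.std() / gaps.mean() < max_cv:
            flagged.append(account)
    return flagged
```

Virality can be traced in the same spirit, for example by tracking the cumulative share count of a post over time. The point is that all of these analyses depend on interaction data that only the platforms hold in full.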

However, a lot of platform data is also sensitive, as it contains personal information, individual preferences, or direct messages between individuals. Data protection experts express justified concerns when too much data is collected, stored, or analysed. Strict guidelines for independent research are therefore necessary to prevent intrusions into the privacy of individuals and disproportionate surveillance of our digital discourse. Awareness of data protection and sensitivity to the impact of monitoring vary, especially among civil society organisations, and often depend a great deal on context, national legal traditions, and the urgency or pressure civil society faces in different countries around the world. Due to a lack of resources and/or limited capacity, data protection and ethics advice is not always institutionalised in civil society organisations, and individual analysts must often assess and decide sensitive issues on their own.

There are also legal gaps when it comes to mandatory access to platform data. Journalists, non-affiliated researchers, and research consortia with non-European partners, for example, cannot invoke the DSA directly. Researchers outside the EU therefore often resort to commercial marketing tools or web scraping to conduct their analyses. This is legally vague and analytically limited, both with regard to the comparability of results and to the possibilities of filtering data and preparing it in a methodologically sound manner.

In addition, our research and exchanges with experts show that every organisation and every research network develops its questions, its research design, and, in case of doubt, its own code for collecting, evaluating, and analysing platform data from scratch. Existing knowledge is rarely built upon, and each organisation sets up its monitoring effort independently. This creates a chicken-and-egg problem: on the one hand, the capacities and competencies of civil society organisations are too limited to conduct long-term, legally compliant, and interlinked research based on platform data, so they concentrate on smaller, anecdotal monitoring projects. On the other hand, their resources and leverage remain limited not least because they do not build on existing knowledge or complement one another’s work: they lack the time and capacity for continuous international networking and for keeping up to date.

Platforms interpret the obligation to grant data access differently

A further hurdle arises from the lack of comparability between the platforms and the ways in which they implement the various legal provisions. Each platform provider implements existing legislation in line with its own interests and context.

For example, Meta’s platforms Facebook and Instagram were accessible to researchers for years via a tool called ‘CrowdTangle’. Organisations from all over the world could apply to Meta for access to the tool. For around two years, however, new applications were no longer possible, and Meta recently announced that the tool would be shut down. A new access point, set up to comply with the DSA, limits access to data from the last three months and has been criticised by first-time users for its limited options. Current developments and major campaigns can be monitored, but historical or regional comparisons are not possible. The clear criticism from civil society regarding the shutdown of CrowdTangle underlines the high relevance of independent, reliable monitoring for their work; it also shows that many organisations lack the resources and skills to learn and set up new access points with the necessary speed and urgency.

TikTok and YouTube, which for a long time were only accessible to researchers through workarounds such as web scraping or data donation methods, are also gradually setting up access points for research purposes on the basis of the DSA. However, both are limited in their respective ways. TikTok, for example, only allows access for researchers from the U.S. and the EU, permits a maximum of 10 researchers to join together in one network, and initial experience suggests that access is not reliable and is faulty in places. YouTube makes its data research programme available worldwide, but the sheer volume of data and its daily growth are so large that long-term, comparative studies usually lack sufficient server capacity for evaluation, which is why anecdotal research dominates here as well. The costs of server and storage capacity for long-term research are often prohibitive for civil society organisations anyway, not only for YouTube but especially there.
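To illustrate the scale problem, here is a minimal sketch of what such collection typically looks like in practice, using the public YouTube Data API v3; the query, key, and page limit are placeholders, and the specifics of the dedicated researcher programme are not shown here. Each request returns at most 50 items and consumes quota, which is why continuous, comparative collection quickly becomes a question of infrastructure and budget.

```python
# Minimal, hypothetical sketch: paginated collection of video metadata via the public
# YouTube Data API v3 search endpoint. The API key, query, and page limit are
# placeholders; every page returns at most 50 items and consumes request quota.
import requests

SEARCH_URL = "https://www.googleapis.com/youtube/v3/search"

def fetch_video_snippets(query: str, api_key: str, max_pages: int = 3) -> list[dict]:
    """Collect basic metadata (snippets) for videos matching a query, page by page."""
    items, page_token = [], None
    for _ in range(max_pages):
        params = {
            "part": "snippet",
            "q": query,
            "type": "video",
            "maxResults": 50,  # upper limit per request
            "key": api_key,
        }
        if page_token:
            params["pageToken"] = page_token
        response = requests.get(SEARCH_URL, params=params, timeout=30)
        response.raise_for_status()
        data = response.json()
        items.extend(data.get("items", []))
        page_token = data.get("nextPageToken")
        if not page_token:
            break
    return items
```

Even this toy example makes the cost driver visible: tracking a handful of narratives continuously over months means storing and re-processing very large volumes of metadata before a single video or comment thread has been analysed in depth.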

As each platform defines its own access, research across multiple platforms is extremely difficult and has not yet been standardised or harmonised. Analyses of the spread of narratives and manipulation tactics across platforms and between different networks are therefore virtually impossible. Entertainment on TikTok, shopping on Instagram, news on X or Threads – user behaviour is diverse, and we can only understand the long-term effects of manipulation attempts if we research them across these spaces.
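Since every platform also defines its own data formats, cross-platform comparison in practice starts with a home-grown normalisation layer. The following is a minimal sketch of such a shared schema; the field names and the example mapping are illustrative assumptions, not an existing standard.

```python
# Hedged sketch of a shared post schema for cross-platform comparisons. The field names
# are illustrative assumptions; each platform adapter would map its API response onto
# this structure so that the spread of a narrative can be compared across services.
from dataclasses import dataclass
from datetime import datetime

@dataclass
class NormalisedPost:
    platform: str      # e.g. "tiktok", "youtube", "x"
    post_id: str
    author_id: str
    created_at: datetime
    text: str
    engagement: int    # likes + shares + comments, collapsed into one comparable figure

def from_youtube_item(item: dict) -> NormalisedPost:
    """Map one YouTube search result item (shape assumed from the public API) onto the schema."""
    snippet = item["snippet"]
    return NormalisedPost(
        platform="youtube",
        post_id=item["id"].get("videoId", ""),
        author_id=snippet["channelId"],
        created_at=datetime.fromisoformat(snippet["publishedAt"].replace("Z", "+00:00")),
        text=(snippet.get("title", "") + " " + snippet.get("description", "")).strip(),
        engagement=0,  # engagement counts would require a separate statistics request
    )
```

Today, every research team builds a layer like this anew, with its own field names and its own definition of what counts as engagement, which is precisely why results are so hard to compare.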

Gaps that need to be filled

On the plus side, we know much more about the emergence of disinformation on platforms today than we did five years ago. And yet, existing gaps urgently need to be filled if we are to move beyond anecdotal knowledge to truly measurable, evidence-based successes in dealing with disinformation.

These challenges can be summarised as follows:

  • There is a lack of standardised research access to platform data. As platforms are privately organised, each one sets its own rules. This hinders research across multiple platforms. In addition, real-time analyses across different regions are currently limited or, for independent researchers, not possible at all.
  • Journalists, independent researchers, and especially non-European researchers are currently neglected, which limits the categorisation and assessment of systemic risks on social media platforms.
  • The capacities and competences of civil society are currently not yet sufficient to produce long-term analyses and thus make evidence-based design proposals for platforms. A lack of data protection and ethics supervision, too few training programmes, a lack of coordination and further development of existing research, limited access to comparative data, and the cost of sufficient server and storage capacity significantly limit the impact of independent monitoring.

The hurdles are great, but not insurmountable. In view of the high relevance of independent research and data analysis for our digital public sphere, things can and must change quickly.

Outlook: A central hub for knowledge and data management

Many wheels need to mesh here: legal improvements, pressure on platforms to assume their public responsibility through harmonised research access, and the strengthening and growth of civil society organisations so that they can fulfil their role as a corrective and initiator. Political decision-makers, technology companies, and philanthropic organisations are all called upon.

However, these solutions also require greater coordination between researchers and opportunities to share significantly more knowledge in order to shape digital discourse in a sustainable and trustworthy manner. Awareness of the need and the will alone will not be enough here. Instead, we need a central hub for knowledge and data management for independent research on digital discourses. Such a (service) organisation would have to act as a hub on three levels: (1) as a source of knowledge that provides, among other things, templates for legal and ethical issues relating to research questions, data collection, evaluation, and storage; (2) as a data manager that prepares methods and approaches for monitoring and, as a scientific fiduciary, provides cleaned, pre-coded data on shared server capacities for research purposes; and (3) as a spokesperson that collects the experiences of monitoring organisations from all over the world and represents them in a coordinated form to platform providers and political decision-makers to call for future improvements.

With the Data Knowledge Hub for researching digital discourse, we have launched an initial pilot for a knowledge database. The further development and networking of this concept must and will occupy us in the future.


Cathleen Berger

Co-Lead

Cathleen Berger’s professional experience spans across sectors: academia, government, non-profit, corporate, and start-up. Her work and research focus on the intersection of digital technologies, sustainability, and social impact. She currently works with the Bertelsmann Stiftung as Co-Lead for Upgrade Democracy as well as the Reinhard Mohn Prize 2024 and Senior Expert on future technologies and sustainability. In addition, she occasionally advises and works with social purpose companies and organisations on their climate and social impact strategies.

Previously, she directed the B Corporation certification process of a pre-seed climate tech start-up, launched and headed up Mozilla’s environmental sustainability programme, worked within the International Cyber Policy Coordination Staff at the German Foreign Office, as a consultant with Global Partners Digital, a research assistant at the German Institute for International and Security Affairs (SWP), and a visiting lecturer at the Friedrich Schiller University Jena.
