Lennart Stoy on the growing problems facing efforts to build science on a rationale of open data, in the context of upcoming European legislation

Lennart Stoy

Data is the new oil. This mantra has long made it past tech entrepreneurs and policy wonks into the public debate. But perceiving data as a competitive advantage in a geopolitical rivalry brings new challenges, particularly in a world where science and technology are increasingly in the crosshairs of great-power competition.

Many policy initiatives in Europe in the area of the digital single market – take the new European Strategy for Data [1] and the Open Data Directive [2] as prominent examples – aim to increase the supply of data to European researchers and companies. GDPR aimed to replace the fragmented landscape of national rules with a single framework across Europe that makes movement of data across European borders simpler.

The premise of much of this is that companies and researchers will need more and better data for research and development of new tools and services, including artificial intelligence. These policies seek to mitigate perceived disadvantages of Europe compared to international competitors, in particular China and the US with their massive tech companies collecting data from large populations and user bases.

Open access to data, in particular data from public sources or publicly funded research projects, has been embraced as a major component in this strategy. By making it available for all, with no or limited access restrictions, innovation is to be spurred and competitiveness enhanced. A study for the European Commission [3] has calculated that making research data findable, accessible, interoperable and re-usable (FAIR) would yield massive efficiency gains for European coffers. And a recent OECD report studying data access policies [4] finds policy rationales for data sharing including “promoting innovation and economic growth” and “enhancing social welfare for individuals and society at large”. Besides political motivations based on competitiveness, the Open Science movement advocates for better access to data from and for research to inspire scientific re-use, replication, and to enhance reproducibility.

However, increasing geopolitical competition vis-a-vis China and the US also demonstrates that not everything is as simple as it seems. When data is framed as a competitive advantage for a specific region or country, openness risks being tainted as giving away “the new oil” to geopolitical competitors. A good example illustrating this conundrum is an interview published in ScienceBusiness [5] in December 2019, where an interviewee called open access to data “naïve without a concrete plan on how the EU should protect the general macroeconomic interest”, further asking “do [competitors] also have an open science cloud to which EU would have unlimited access?”.

This issue hasn’t gone entirely unnoticed in policy circles. EARTO, an umbrella association of research organisations, recommends that “the EU should also seek to ensure reciprocity between the EU RD&I policy rules and measures and those of the third countries with which the EU has RD&I partnership agreements” [6] and even warns that “Imposing immediate open access to publications in Horizon Europe could create unbalances and a lack of reciprocity with our global competitors” – though this specific warning raises the question of how exactly paywalling publications would bar global competitors from accessing them. A draft of a study for the European Commission, part of the “2030 Vision on the European Universities of the Future”, included the recommendation to “Use access to RTD Framework Programme funding and Trade Agreements to leverage global reciprocity in Open Science”. And the Open Science Policy Platform, the now-defunct Open Science advisory body for the European Commission, has also adopted some of that language. Its 2020 final report [7] suggests that “dissemination of research knowledge should also take place on a reciprocal basis, especially at an international level. […] Open Science policies can boost the performance of both the European economy and global economy, while IPR ensures the added value falls within European boundaries when appropriate”.

Taking a step back before continuing with the theme of international competition, one doesn’t even have to think about foreign competitors to observe a perceived tension between openness requirements and IPR. Stakeholders across the research sector, such as research organisations and universities, are openly concerned. A recent blog post [8] from a university-based tech transfer professional opens with the leading question “is there a real friction between Openness and IPRs?” and concludes that “there is a need to clarify the principles of EU policy for research and innovation, especially now, on the eve of the new Horizon Europe” in order to make the lives of scientists and research managers easier.

Knowing this, it is clear why complete openness has already been replaced with the more timid credo that data should be “as open as possible, as closed as necessary” in the European policy sphere. The accessibility part of the FAIR principles [9] explicitly permits restricted, even paid, access to data. None of this is a problem in principle. The JRC itself recognises [10] that “there are no incompatibilities between IPR and Open Science” and instead finds that the “IPR framework, if correctly defined from the onset, becomes an essential tool to regulate open science and ensure that the efforts from different contributors are correctly rewarded”.

And indeed there is probably no lack of options for licensing research data [11]. Funding requirements allow exceptions to open data sharing: one draft legal text of Horizon Europe [12] – yet to be confirmed – mentions “legitimate interests of the beneficiaries and any other constraints, such as data protection rules, security rules or intellectual property rights” as reasons to opt out of fully open data sharing. Most of these legal exceptions should be sufficient to safeguard sensitive data and intellectual property.

This recognition in legal texts of the limits of fully open data has been accompanied by “more nuanced and technologically-focused efforts, which can accommodate for more sensitive and private data types” [13]. There are also international, private-sector initiatives such as the International Data Spaces Association [14] building safe data sharing infrastructures. At the local level, projects such as the Amsterdam Data Exchange [15] promise bottom-up solutions for data sharing across the public and private sectors. From the top, the idea of European data spaces with very similar objectives has been enshrined in the EU Data Strategy [16].

However, simply pointing to the sufficiency of current legal frameworks within Europe and some ideas for technical implementation doesn’t fully do justice to the problem of geopolitical competition. Think for instance about international research collaboration. Most guidelines will tell you that projects with international partners should be mutually beneficial, balanced and reciprocal, and that may include access to the respective partner’s data. With an EU-wide data sharing mandate that still asks data to be “as open as possible”, Europe may unintentionally strip its own researchers of this element of partnerships and give away some of the leverage needed to foster reciprocal access at project level. In other words, programme-level requirements could inadvertently harm project-level reciprocity. This comes on top of the basic conundrum described above: that opening data beyond Europe may be detrimental to the ‘general macroeconomic interest’, however that is defined.

Now, some of these issues may seem more like thought experiments. But at a time when science and technology are increasingly caught up in a global power shift of historic proportions, they have real consequences. The recent communication on the future of the European Research Area (ERA) [17] introduces the idea of “purposeful openness” and emphasises that a revamped ERA “will protect EU vital interests and sovereignty in strategic technology areas and critical infrastructures based on common values and promoting a global level playing field.” In practical terms, the European Commission is already considering “limiting international research in strategic areas including cybersecurity, sixth generation wireless and quantum technologies” [18] and may “exclude some companies, or countries from participation, notably for security reasons” [19]. To this end, the Council has introduced a clause in the legal proposal [20] that permits excluding “the participation of legal entities established in the Union or in associated countries directly or indirectly controlled by non-associated third countries or by legal entities of non-associated third countries from individual calls” – and a similar line citing “European Union global economic competitiveness” as a justification to limit open access to research data.

It is not clear at this stage what this would mean in practice for the access requirements to research outputs. Is there a way to reconcile open access with the call for reciprocity? Will “global economic competitiveness” actually be applied to justify access restrictions to otherwise non-sensitive research outputs? Could there be “geo-blocked” access regimes that depend on the geographic location of those who want to use certain research outputs? Are there (or can there be) truly “open” licences that only work within Europe or select partner countries? Will the EU push back on openness in science to safeguard economic competitiveness – and thereby re-introduce the very barriers for EU-based researchers and innovators that the Open Data Directive and Horizon Europe originally sought to tear down? Will international negotiations on trade or research collaboration start to demand data sharing, or to use access to European data as a bargaining chip?

Most of these questions cannot be answered today. But the fact remains that openness is intimately tied to irrevocable, worldwide open licences [21], which don’t allow the selective exclusion of countries. Clarity over what should and what shouldn’t be open is urgently needed, which is why the European Commission and member states ought to explain how they intend to balance seemingly conflicting objectives. Global competitiveness should not turn into a default justification for restricting access. Its use should at least be monitored across the Horizon Europe areas that will be flagged as sensitive. And to minimise the impact on the “free circulation of knowledge”, this should be accompanied by building infrastructure that makes controlled access to data as smooth as possible. It remains to be seen whether these heightened concerns prove to be warranted. But for anyone interested in open science, this is a space to watch closely.