ChatGPT was presented to the public in November 2022 and has since spurred discussions about the future of work. The advent of OpenAI's ChatGPT and its siblings, Google's Gemini and Anthropic's Claude, has sparked a wide range of positive visions of enhanced speed and efficiency as well as dystopian scenarios of job losses. From a sociological point of view, such debates primarily address the overall relationship between human expertise and technological capabilities. They are not new and have been taking place ever since the introduction of computers, or indeed ever since the first machines took over tasks previously performed by humans. However, generative AI applications no longer substitute only for standardized routines of manual labor. Owing to a rise in performance and capabilities, they have also reached the field of knowledge production, which until recently seemed to be a strictly human domain, raising the question of what specifically constitutes human performance and to what extent it can truly be replaced by machines.
This is also the central question currently being discussed in the science system. The focus here is less on the extent to which generative AI can actually replace scientists themselves and more on how it influences the subject of their work, the process of scientific knowledge production, both in its outcomes and in its legitimacy. Generative AI is considered a ‘game changer’, enabling the creation of text, from literature reviews to grant applications, in significantly less time and allowing the exploitation of ever larger datasets, not to mention the support it provides for programming and coding. At the same time, however, there are warnings that the increased use of these technologies could lead to scientific ‘monocultures’, in which certain forms of knowledge production dominate all others, potentially introducing bias in content and a loss of innovation (Messeri & Crockett, 2024; Parisi & Sutton, 2024; Watermeyer et al., 2024). Because the technology builds on statistical probabilities, it is feared that tools like ChatGPT could inadvertently narrow perspectives, even as the number of publications increases.
The gospel spreads
In addition to such expectations and concerns, the first quantitative studies of how scientists actually use generative AI have appeared. In a survey conducted by the journal Nature in October 2023, 31% of responding postdocs indicated that they already use ChatGPT or similar programs in their work, primarily for improving texts, but also for generating and checking code and for summarizing literature (see also Fecher et al., 2023). Thus, less than a year after the introduction of ChatGPT, a third of the responding scientists (at least in the context of Nature, i.e., in the natural sciences) were using generative AI, a remarkably rapid rate of adoption for a new technology.
Not only is usage steadily increasing in the scientific arena; more and more companies are also developing new applications that support research with generative AI. In addition to numerous developments by small startups aimed, for example, at helping users access text content more quickly, large companies in the scientific publishing and data analytics sector have also recognized the potential of building on large language models. Major corporations like Elsevier, Clarivate, and Digital Science are leveraging their databases (Scopus, Web of Science, and Dimensions, respectively) to develop tools tailored to specific needs in different disciplines. These tools are, for instance, designed to address the problem of hallucinations by ensuring that the material they draw on comes from a controlled and transparently documented set of publications.
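To illustrate the kind of grounding these vendors describe, the following is a minimal, purely illustrative Python sketch: a toy retriever selects passages from a small controlled corpus, and a prompt is composed that restricts generation to those cited passages. The corpus entries, the keyword-overlap scoring, and the answer_with_citations helper are hypothetical stand-ins for this idea, not any vendor's actual implementation.

# A minimal sketch of retrieval-grounded generation, assuming a small,
# citable corpus stands in for a controlled publication index.
from dataclasses import dataclass


@dataclass
class Passage:
    doc_id: str   # identifier of the indexed publication (e.g., a DOI)
    text: str     # passage text drawn from that publication


# Toy "controlled set of publications" (hypothetical examples).
CORPUS = [
    Passage("doi:10.0000/example-1",
            "Large language models can produce fluent but unsupported claims."),
    Passage("doi:10.0000/example-2",
            "Grounding generation in retrieved documents reduces unsupported claims."),
]


def retrieve(query: str, corpus: list[Passage], k: int = 2) -> list[Passage]:
    """Rank passages by naive keyword overlap with the query
    (a stand-in for a real lexical or dense retriever)."""
    q_terms = set(query.lower().split())
    scored = sorted(
        corpus,
        key=lambda p: len(q_terms & set(p.text.lower().split())),
        reverse=True,
    )
    return scored[:k]


def answer_with_citations(query: str) -> str:
    """Compose a prompt that restricts the model to the retrieved passages
    and requires it to cite their identifiers; the model call itself is
    left as a stub."""
    passages = retrieve(query, CORPUS)
    context = "\n".join(f"[{p.doc_id}] {p.text}" for p in passages)
    return (
        "Answer using ONLY the passages below and cite their identifiers.\n"
        f"Passages:\n{context}\n\nQuestion: {query}\nAnswer:"
    )  # in a real tool, this prompt would be sent to a language model


if __name__ == "__main__":
    print(answer_with_citations("Does grounding reduce hallucinations?"))

Because every answer must cite passages from the indexed corpus, unsupported statements become easier to detect and trace, which is the transparency benefit such tools advertise.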
Managing generative AI
One consequence of the high expectations and actual usage of generative AI is the question of how to manage these new possibilities. Specifically, since its presentation to the public, generative AI has placed significant pressure on academic institutions to develop policies and practices that address potential misuse and undesirable developments. Universities have primarily focused on potential fraud in the context of teaching and on their own responses to it. However, scientific journals and funding agencies are also confronted with the possibility that entire publications or grant applications could be generated by AI. As early as January 2023, Nature reported that two preprints and two published articles had listed ChatGPT as an additional author with its own affiliation (in one case, even with its own email address). Nature immediately characterized the acceptance of ChatGPT as an author as an error to be corrected. Some bibliographic databases, however, acted more swiftly, so that ChatGPT was indexed as an ‘author’ in PubMed and Google Scholar. In response, Nature developed its own policy on the use of ChatGPT, explicitly excluding it from authorship. The policy asserts that only human authors can take responsibility for a publication and its content. This approach has since been followed by many other journals as well as by the Committee on Publication Ethics (COPE).
In addition to journals, funding agencies also grapple with the use of generative AI in grant applications and in funded research. For instance, the European Research Council (ERC) has developed recommendations for scientists, scientific organizations, and funding agencies. These recommendations are explicitly termed ‘living guidelines’, reflecting the intention to adapt them as AI applications continue to develop. Similar to the journal requirements, these recommendations hold scientists accountable for a transparent and responsible use of generative AI. Scientists are encouraged to continually educate themselves about generative AI applications in order to use them effectively; sensitive areas such as peer review and evaluations, however, are explicitly excluded from these recommendations. Scientific institutions are urged to offer training programs, monitor changes in the variety of applications, and develop new support mechanisms as needed. At the same time, the ERC states that clear rules for AI usage should be integrated into an organization’s own standards of good scientific practice, and it emphasizes that preference should be given to AI tools that do not collect input data and use it for further training. The recommendations also address funding agencies, urging them to enforce the correct use of generative AI and to provide concrete methods for documenting its application. Funding agencies are likewise encouraged to promote training and educational opportunities, and they should retain control over AI usage within their own processes, specifically in review and evaluation procedures.
The future of generative AI in science?
These recommendations demonstrate that the question is no longer whether generative AI should be used, but rather how it should be implemented. Several key points stand out, though they also raise further questions. First, the extensive promotion of AI applications raises the question of how to develop training programs for a constantly evolving subject and who should provide these programs. More importantly, it brings up the issue of licensing: Which tools are freely accessible to scientists? Which scientific institutions can afford which licenses, or see the acquisition of such licenses as necessary and beneficial? What inequalities might arise from unequal access? And what kind of new market is emerging here, and who are the major players within it? Given that public alternatives are still scarce, the current trend may resemble what we can already observe in the scientific publishing industry: an expansion of the commercial market that strengthens the power of large corporations which already own numerous journals or possess relevant publication data that can be used to develop tools tailored specifically to science.
More pressing questions arise from the lack of transparency about what kind of data is used, and they concern more than the generation of scientific results. Another central point of discussion is data privacy when generative AI is used for peer review. There is concern that input data, such as the texts under review, will be retained and reused by the tools themselves. At the same time, there is growing apprehension that the use of generative AI could drastically increase the number of submitted publications and grant applications. Amid rising submission numbers, the question arises of how to ensure adequate peer review. Should AI be allowed for peer review? If so, how can fairness and transparency be guaranteed? How can biases against topics and approaches that are over- or underrepresented in the underlying data be prevented? And ultimately, is it possible to prevent a ‘feedback loop’ in which AI-generated text is evaluated again by AI? These questions are crucial for the future development and regulation of generative AI in scientific peer review.
These questions about the use of generative AI in assessing scholarly work lead to a third issue concerning the evaluation of individuals and the criteria applied, particularly if we end up facing a scenario of ‘publication inflation’. There are existing approaches, such as the San Francisco Declaration on Research Assessment (DORA), which advocates for the responsible use of quantitative metrics. Another current initiative, the Coalition for Advancing Research Assessment (CoARA), aims to move beyond quantified assessment altogether and return to qualitative evaluation. In this regard, it remains to be seen how the growing volume of assessment materials, such as publications and grant applications, coupled with their potential devaluation due to uncertainties about authorship, will affect assessment practices and standards.
Making sense of generative AI
Scientific institutions, publishing bodies, and funding agencies are thus confronted with both regulatory challenges and the task of promoting the use of generative AI in line with good scientific practice. It is clear that generative AI cannot hold authorship and that its use must be transparently documented. However, there is still debate over the specific requirements for documenting its use and over where to draw boundaries, which may vary according to the discipline and its level of standardization in text production. Moreover, it is becoming evident that there are also questions concerning the procurement of tools and the corresponding licenses. As with current problems in academic publishing, these include potentially costly license fees, dependence on commercial providers, and the search for open alternatives. Finally, generative AI is not only discussed as a tool for research purposes but also scrutinized for its application in peer review, raising broader questions about the future of evaluation systems in science. Altogether, these points highlight the complex landscape in which scientists, scientific institutions, publishing bodies, and funding agencies must navigate regulation, ethical considerations, financial implications, and the evolving dynamics of technological adoption, all of which already challenge the scientific system.
Yet, amid debates about new possibilities and the need for regulation to uphold the originality of scientific practice and the legitimacy of its outcomes, we can also witness a process of self-reflection addressing the core of scientific activity itself. What kinds of skills are still needed, and which need to be adapted or changed? Can generative AI complement or even replace human creativity and original thinking? And how is authenticity in knowledge production related to the validity and legitimacy of scientific results? In this sense, the motto “sapere aude,” which Immanuel Kant invoked in 1784 to characterize the Enlightenment, remains as relevant as ever, albeit from a different angle: have the courage to use your own reason, especially in deciding where, when, and how to apply generative AI!