Since 30 November 2022, they have regularly flooded the screens of social media users: screenshots of texts generated by the computer programme ChatGPT that read as if they had been written by a human being. Just days after the tool was released by its developer, OpenAI, a San Francisco-based company, more than a million users were already enthusiastically trying it out.

How should AI systems be used and regulated to realise their economic potential while minimising their potential dangers?

This success has made generative artificial intelligence (AI) a dominant theme in the technology industry. At the beginning of the year, OpenAI was valued at $29 billion; on 23 January, Microsoft announced a multi-year "multi-billion dollar" investment in the company. While the impact of ChatGPT is still being digested, its successor, GPT-4, is apparently almost ready for public release. How should such systems be used and regulated to realise their economic potential while minimising their potential dangers?

The San Francisco-based company "OpenAI" was founded in 2015 as an artificial intelligence research laboratory that, according to its website, aims to ensure that artificial general intelligence benefits all of humanity. In November 2022, it launched its model "ChatGPT", which was trained to interact conversationally.

Probable, yes, but true?

The recent technological breakthrough is mainly due to the increasing size of language models, as measured by the number of parameters and the amount of training data. On this basis, deep learning methods allow the probability of a word sequence to be predicted with high accuracy. Crucially, then, large language models do not aim to give the best or truest answer; they restrict themselves to grammatically and semantically correct sentences with a high probability value.
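To illustrate the principle, the toy sketch below estimates next-word probabilities from a tiny hand-written corpus using simple bigram counts; the corpus and function names are invented for illustration, and real systems such as ChatGPT use vastly larger transformer networks rather than word counts. The point carries over, though: the model ranks continuations by how often they appeared in training, not by whether they are true.

```python
from collections import Counter, defaultdict

# Tiny illustrative corpus; real models are trained on billions of words.
corpus = (
    "the cat sat on the mat . "
    "the cat sat on the sofa . "
    "the dog sat on the mat ."
).split()

# Count how often each word follows each other word (a bigram model).
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def next_word_probabilities(prev):
    """Estimated probability of each possible next word after `prev`."""
    counts = follows[prev]
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}

# Continuations are ranked purely by observed frequency, not by truth.
print(next_word_probabilities("the"))  # {'cat': 0.33, 'mat': 0.33, 'sofa': 0.17, 'dog': 0.17} (rounded)
print(next_word_probabilities("sat"))  # {'on': 1.0}
```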

This probabilistic nature means that applications such as ChatGPT do not simply copy existing texts but create something completely new, which gives rise to enormous potential for democratising knowledge and generating economic wealth. But as with any new technology, it is important to consider how it can be misused. Algorithmic summaries may contain errors or outdated information, or mask nuances and uncertainties without users noticing.

AI language models risk exacerbating environmental damage, social inequality and political fragmentation

These errors are not only irritating but can have negative social consequences. AI language models risk exacerbating environmental damage, social inequality and political fragmentation. Training and developing these models is financially and environmentally costly, a situation that favours established digital giants while also worsening the carbon footprint. The underlying training data does not appear to adequately represent marginalised communities; sexual and racial bias has been widely demonstrated. Finally, the ability of ChatGPT to generate misinformation in language that is self-confident and context-sensitive will enable manipulators to run cost-effective cross-platform disinformation campaigns.

Alarmingly, AI can spread disinformation in a context-sensitive manner, globally and within seconds. So far, sexual and racial biases have been demonstrated multiple times.

Should ChatGPT be regulated?

These problems raise the question of possible regulation. Instead of an outright ban or detailed regulation, which could slow down future innovation, users must acquire specific digital literacy or, in other words, there must be informed and technically astute application of the new language models on a broad societal level. Think, for example, of educational campaigns in schools and universities.

There must be informed and technically astute application of the new language models on a broad societal level

Due to uncertainty about the future path of innovation, European policy should also limit itself to the creation of sensitive regulatory frameworks, but these are still lacking. The Digital Markets Act targets so-called "gatekeepers", for which it specifies high thresholds that OpenAI will not meet for a long time to come. The Digital Services Act contains obligations for providers of digital intermediary services, which include search engines. This could become relevant if Microsoft implements its plans to integrate ChatGPT into Bing. Again, however, the legal obligations are graded according to the type and size of the service. The most relevant regulatory tool might be the proposed AI Act, which is currently being updated to cover generative language models as "general purpose AI" systems. Here too, however, standards are unlikely to arrive quickly, as the current compromise initially provides only for an impact assessment and a consultation and postpones specifics until later.

Own depiction of a 5-pillar model to ensure the safe and productive use of language models.

Framework for safe and productive application

To ensure safe and productive use, certain regulatory requirements, in addition to digital literacy, are essential. To this end, the Centres for European Policy Network proposes a 5-pillar model that calls for transparency of models, competition among providers, fair access, standards to prevent abuse, and protection of intellectual property and personal data.

Large language models are a kind of black box. Even when programming interfaces are available, there is typically no access to model weights or training data. The lack of transparency of popular models makes it difficult to benchmark them holistically or adapt them for specific applications. Developers should therefore explain which training corpora have been used and what logic is followed by the algorithms employed. It is also crucial that individual language models do not gain a dominant market position which would turn their programming interface into a bottleneck for downstream industries. Otherwise, there is a threat of monopolisation tendencies similar to those in the platform economy.
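To make that asymmetry concrete, the hypothetical sketch below shows what downstream use of a closed model typically looks like: text goes in and text comes out over a programming interface, while the weights and training data needed for auditing or local adaptation remain out of reach. The endpoint URL, field names and API key are invented for illustration and do not describe any particular provider's actual interface.

```python
import json
import urllib.request

# Hypothetical endpoint and payload format, for illustration only.
API_URL = "https://api.example-llm-provider.invalid/v1/generate"

def generate_via_api(prompt, api_key):
    """Call a closed model: the provider returns text, nothing more."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps({"prompt": prompt}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["text"]

# Auditing or fine-tuning, by contrast, would require what the interface
# never exposes: the model weights and the training corpus.
```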

Passivity ultimately means becoming dependent in the long term on non-transparent language models whose implicit value judgements are not comprehensible from the outside.

To counteract social inequalities, fair and reliable access to this new knowledge resource is necessary. OpenAI, for example, is now planning paid access. In a multipolar world economy, characterised by geo-political rivalry in the digital sphere, the situation is aggravated by the fact that the major language models come almost exclusively from the USA or China. In order to avoid strategic dependencies, the development of an alternative European model could be promoted, with training data that corresponds to democratic principles. Even if this seems radical at first glance, passivity ultimately means becoming dependent in the long term on non-transparent language models whose implicit value judgements are not comprehensible from the outside.

Systematic regulations and standards for language models could help prevent abuse. Developers should be required to build in certain restrictions to counteract disinformation, monitor their results to filter out toxic content, and block abusive users. European standards for training large-scale language models, requiring transparent algorithms and CO2 neutrality, would also be conceivable. Finally, there are unanswered questions about intellectual property, privacy and data protection, because training data is often scraped from the internet without permission.
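In the simplest case, such obligations would translate into provider-side output filtering and abuse tracking. The following toy sketch illustrates that idea; the term list, threshold and function names are purely illustrative, and real systems rely on trained classifiers rather than keyword lists.

```python
from collections import defaultdict

# Purely illustrative blocklist; production systems use trained classifiers.
BLOCKED_TERMS = {"toxic_example_term", "disinfo_example_term"}
ABUSE_LIMIT = 3  # illustrative threshold for blocking a user

abuse_counts = defaultdict(int)
blocked_users = set()

def moderate(user_id, generated_text):
    """Return the generated text if it passes the filter, otherwise refuse."""
    if user_id in blocked_users:
        return "[user blocked for repeated policy violations]"
    if any(term in generated_text.lower() for term in BLOCKED_TERMS):
        abuse_counts[user_id] += 1
        if abuse_counts[user_id] >= ABUSE_LIMIT:
            blocked_users.add(user_id)
        return "[output withheld: policy violation]"
    return generated_text

# A harmless answer passes; a flagged one is withheld and counted.
print(moderate("alice", "The Digital Services Act covers intermediary services."))
print(moderate("bob", "... toxic_example_term ..."))
```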

Implementing these regulatory frameworks could increase public trust in the new technology and thus realise innovations more quickly. At the same time, a better understanding of generative AI models, and embedding them in a regulatory framework would dispel diffuse fears about human creativity being superseded. On that basis, generative AI technology could be a significant opportunity to distribute market power, open up echo chambers and democratise knowledge.

This contribution is an abridged and revised version of the cepAdhoc No. 1 (2023) and is based on a guest commentary in Tagesspiegel Background.

Anselm Küsters is Head of Digitalisation/New Technologies at the Centrum für Europäische Politik (cep), Berlin. As a post-doctoral researcher at the Humboldt University in Berlin and as an associate researcher at the Max Planck Institute for Legal History and Theory in Frankfurt am Main, he conducts research in the field of Digital Humanities. Küsters gained his Master's degree in Economic History at the University of Oxford (M. Phil) and his PhD at the Johann Wolfgang Goethe University in Frankfurt am Main.


Copyright Header Picture: shutterstock/1stfootage; copyright picture of OpenAI/ChatGPT: shutterstock/Ascianno; copyright picture of puppeteer and world: shutterstock/Marko Aliaksandr; Depiction of the 5 pillars: own graph with pillar picture copyright: shutterstock/rawf8; picture of Author: copyright Anselm Küsters.