AI voices -
A possible addition

Artificial voices to support and augment human voice artists for customized voice solutions.

“Create AI voices online for free” or “Free text-to-speech and AI voice generator”: these are the enticing advertising messages of leading AI voice providers. Is it really that easy, and above all free, to generate AI voices? How easy are they to use, and how good is the quality in the end? We try to answer these questions as practically as possible using selected case studies.
You find a speech synthesis provider, upload a short audio sample of Bruce Willis's German voice actor and voilà: Germany's most famous voice speaks all your commercials and image films as a text-to-speech model, free of charge. Doesn't that sound too good to be true? Unfortunately, it is. The huge hype surrounding AI-generated voices, which received a major boost in 2024 in particular and caused a stir throughout the voiceover industry, ignored a few important “rules of the game” and to this day suggests that there are no legal or moral limits to the use of AI voices. Of course, this is not the case. Technological progress has simply outpaced political decision-makers. As a result, there is still no clear set of rules for the use of artificial intelligence.

"The great AI revolution: Create your dream voice artist
as an AI voice with online voice generators"


It should therefore be said clearly at this point: synthesizing other people's voices is not a trivial offence. It violates copyright and personal rights and is therefore punishable. Hence: no AI voice without the personal consent of the person concerned or the voice actor. Nevertheless, German AI voices are currently available in the databases of known and unknown providers. Even where reputable companies state that voices may not be trained without the consent of the rights holder, the protection mechanisms are often inadequate and data protection is hardly guaranteed. So who are the people behind these German voices, how do they compare to our professional voice actors, and does the use of AI actually make voice recordings cheaper? We explain the background and show you when AI offers real added value.

AI voice vs. professional voice artist: How good does artificial intelligence sound?

We encounter artificial voices more and more often in everyday life, whether on a smartphone, in a movie or in a large company's chatbot. But who is behind these voices? Only very rarely are they German professional voice talents or well-known dubbing voices. Large AI companies usually build their systems in English first. The voices either have an exact human template (voice cloning) or are a blend of several human voices (so-called “blended voices” created through morphing). In the latter case, it is almost impossible to draw conclusions about the people whose voices were actually synthesized. This process also offers great potential for misuse of real human voices.

Providers that offer AI voices in databases for a wide variety of purposes and 24/7 retrieval (usually as a paid subscription) work predominantly with unknown voice talents, at least in German-speaking countries. There is a very simple reason for this: professional voice artists justifiably see many risks in having their voice used as an artificial playback product. Many AI companies have their servers abroad and do not work in compliance with the GDPR. In most cases, control over how recordings are stored and processed is not guaranteed. In addition, the intended use is difficult to restrict, and the pricing models do not reflect reality. So if you want to continue working with real professional voices and well-known voice actors, we recommend that you keep seeking contact with real people. You will notice this not only in the personal exchange, but also in the quality, because in the end the difference is clearly audible.

"AI voices continue to lose out when expressing
emotions and certain intonations."


In a practical test, we therefore pitted our human voice talent Ulrike Kapfer against her own AI voice in different genres. This text-to-speech self-experiment is based on “voice clones” that were trained specifically for this comparison on genre-specific material with a leading provider of AI voice generation. It won't be too difficult for you to tell which recording is based on AI and which is the original:


Audiobook


Image film


Advertising 1


Advertising 2



Creating a “perfect” synthetic voice for all purposes is currently not readily possible. “Out of the box”, the files generated with text-to-speech are partially usable for applications without high standards, but they are still a long way from convincing a serious and critical audience. Even in the first example (audiobook), the AI fails to reproduce a whispering voice. In the other examples, a certain monotony of emphasis and exaggerated intonation become apparent. Emotional variance is barely audible within a genre; rather, the AI voice always seems somewhat impassive.
A large amount of source material and/or several voice models are required to simulate different emotions, intonations, pronunciations or speaking styles in a reasonably realistic way. In addition, the output, at least in text-to-speech, is always random to a certain extent, so that a word or sentence module (prompt) may have to be generated many times before the desired voice recording is produced. This causes additional time and costs during post-production in the recording studio and cancels out possible savings on the voiceover fee. In this respect, a voice recording with a human professional voice talent is still much more efficient.
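The effect of this randomness on post-production time can be estimated with a rough sketch. It assumes, purely for illustration, that every generated take is accepted independently with the same probability, so that the expected number of attempts per prompt is the reciprocal of that probability. The function names and all figures are hypothetical placeholders, not measurements from our test.

```python
def expected_attempts(acceptance_rate: float) -> float:
    """Expected number of generations until one take is accepted.

    Assumption: each attempt is accepted independently with the same
    probability (geometric distribution). Real results vary by prompt.
    """
    if not 0.0 < acceptance_rate <= 1.0:
        raise ValueError("acceptance_rate must be in (0, 1]")
    return 1.0 / acceptance_rate


def expected_review_minutes(num_prompts: int,
                            acceptance_rate: float,
                            minutes_per_attempt: float) -> float:
    """Rough time needed to generate and listen to all takes."""
    return num_prompts * expected_attempts(acceptance_rate) * minutes_per_attempt


# Placeholder figures: 40 prompts, one in four takes usable,
# 1.5 minutes to generate and check each take.
print(expected_review_minutes(40, 0.25, 1.5))  # 240.0
```

Under these assumed figures, a script of 40 prompts already ties up four hours of studio time before any editing begins.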

AI voice recordings: No studio = less cost?

Once the voice has been synthesized, the AI voice can be called up at the touch of a button: enter text and generate speech (text-to-speech). This can be done 24/7 from anywhere in the world. Not only does this sound simple, but future productions no longer require voice recordings in a sound studio or home studio. This saves the voice artist a possible commute to the recording studio and, above all, time. In terms of price, the use of AI voices should therefore result in significant savings. At least that is the understandable hope of many customers. However, voiceover fees have never been charged according to time spent. Rather, by paying the voiceover fee, you as a customer acquire a license for a specific purpose. This is an established billing model in the professional voiceover industry, and it will continue to apply to the use of artificial voices. Of course, some dubious providers (mainly from abroad) are trying to undermine this pricing model, but this comes with major losses in quality, be it the lack of access to experienced and well-known professional voice talents, the lack of exclusivity, the increased effort involved in AI generation or the technical limitations. The following therefore still applies: quality rightly has its price and will always be more popular with consumers.

Another scenario that at first glance seems to promise cost savings is dubbing a film with an AI voice in several languages: for example, a German (human) voice serves as the template, and the other languages are then localized with the same voice using AI. With a larger volume of voice work (keyword: volume discount) and reduced studio costs (as only one language is physically recorded in front of the microphone), shouldn't the project be much cheaper to realize? In practice, the supposedly lower studio costs are quickly offset by the generation of the AI versions and a more time-intensive post-production process. A certain discount on the voiceover fee may well be conceivable after consultation with the booked voiceover artist. But a completely different aspect undermines this appealing thought experiment: the lack of quality control. Who can guarantee that the AI-generated versions have been produced correctly? Without a check by native speakers, you run the risk of losing credibility in the respective country with a faulty localization, or even of getting into legal trouble because of incorrect translations. This is why you need to factor in a review listening pass when creating the AI voice versions and include it in your budget planning. If the quality check is carried out professionally, its costs offset the fees saved. In the worst case, you even pay more.
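The cost items named above can be laid side by side in a small budget sketch. The class and function names are hypothetical, and every amount is an invented placeholder in arbitrary currency units, chosen only to show how the comparison works rather than to reflect real prices.

```python
from dataclasses import dataclass


@dataclass
class ProductionBudget:
    voice_fees: float             # usage licenses paid to the voice talent(s)
    studio_costs: float           # recording sessions in front of the microphone
    ai_generation: float = 0.0    # generating the AI language versions
    post_production: float = 0.0  # editing, retakes, mastering
    native_review: float = 0.0    # quality check by native speakers

    def total(self) -> float:
        return (self.voice_fees + self.studio_costs + self.ai_generation
                + self.post_production + self.native_review)


def savings(classic: ProductionBudget, ai_based: ProductionBudget) -> float:
    """Positive: the AI variant is cheaper. Negative: it costs more."""
    return classic.total() - ai_based.total()


# Invented placeholder amounts, not real quotes.
classic = ProductionBudget(voice_fees=5000.0, studio_costs=2000.0,
                           post_production=500.0)
ai_based = ProductionBudget(voice_fees=3500.0, studio_costs=400.0,
                            ai_generation=900.0, post_production=1500.0,
                            native_review=1800.0)
print(savings(classic, ai_based))  # -600.0
```

With these assumed amounts the AI variant ends up more expensive, because the review and post-production items outweigh the discount on fees and studio time. Different inputs can of course tip the balance the other way.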

A small moral objection at this point: what was previously dubbed by ten different voice talents in ten languages is now realized with a single (AI) voice. The passionate work of nine native speakers is simply cut for the sake of efficiency. For all the cost efficiency, such savings seriously endanger human livelihoods.

The added value of AI voices and possible case studies

In most cases, artificial voices are neither completely free nor cheaper than their human counterparts and certainly not on a par in terms of quality when it comes to emotional variance. But what advantages do AI voices actually offer you?


Case study 1: The voice actor is on vacation


Let's assume a voiceover artist has spoken an advertisement and is on vacation two weeks later. Now your advertising client has asked for a necessary and urgent correction. What to do? Unfortunately, the broadcast cannot be postponed any further. With the consent of the voice artist, the desired change can be made by the recording studio at any time using their AI voice. Of course, the studio costs and voiceover fee remain unchanged, as with any other change in the past. The big advantage? The gain in flexibility at the same cost and a voice artist who is always available when you need them. 24 hours a day, 7 days a week.

Case study 2: Modular systems in the company


Or imagine a modular system. You want to address several hundred employees in your company individually in an e-learning course. What previously had to be recorded prompt by prompt with the voice talent or was not realized at all due to disproportionate effort can now be generated via text-to-speech and subsequently expanded or changed at any time with the same output quality. Subsequent recordings no longer lead to audible differences. However, the increased effort involved in generating the prompts and in quality control should be noted. In addition, the usage licenses must still be paid to the voice talent.
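How such a modular system might prepare its text modules can be sketched as follows. The template wording, the names and both helper functions are hypothetical. The actual speech generation is deliberately left out, since it depends on the provider used.

```python
from string import Template


def build_prompts(template: str, names: list[str]) -> dict[str, str]:
    """Create one personalized text module (prompt) per employee name."""
    sentence = Template(template)
    return {name: sentence.substitute(name=name) for name in names}


def pending_prompts(prompts: dict[str, str],
                    generated: dict[str, str]) -> dict[str, str]:
    """Return prompts that are new or whose text changed since the last run.

    Only these need to be generated and checked again.
    """
    return {key: text for key, text in prompts.items()
            if generated.get(key) != text}


prompts = build_prompts("Hello $name, welcome to the course.", ["Anna", "Ben"])
already_generated = {"Anna": "Hello Anna, welcome to the course."}
print(pending_prompts(prompts, already_generated))
# {'Ben': 'Hello Ben, welcome to the course.'}
```

Tracking which modules already exist keeps later additions small: only new or changed prompts go back through generation and quality control.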

Case study 3: Modular systems in advertising


The situation is very similar with modular advertising, for example when a spot is to be played out in a podcast to a specific target group. In coordination with the targeting, the artificial voice can be used to easily generate and play out individual local offers or claims. Variants that were previously unthinkable can now be realized with the support of AI.


Of course, there are many other scenarios in which artificial intelligence can help. However, we see it as a supplement to or extension of the human voice, not as a replacement. Real emotions and human intelligence are not interchangeable. That's why we provide you with the best and most professional voices as the human basis for AI projects. In close cooperation with our voice talents and, of course, with their personal consent, we realize innovative AI projects responsibly and in compliance with the law, be it speech synthesis, text-to-speech or a modular voice system.

We are happy to provide you with help and advice on all questions relating to AI at any time. Just get in touch with us!