An audio deepfake (also known as voice cloning) is a product of artificial intelligence: synthetic speech that convincingly sounds like a specific person saying things they did not say. The technology was initially developed for applications that improve human life. For example, it can be used to produce audiobooks and to help people who have lost their voices (due to throat disease or other medical problems) regain them. Commercially, it has opened the door to several opportunities, including more personalized digital assistants, natural-sounding text-to-speech, and speech translation services.
Audio deepfakes, recently called audio manipulations, are becoming widely accessible through simple mobile devices and personal computers. These tools have also been used to spread misinformation. This has raised cybersecurity concerns among the global public about the side effects of audio deepfakes, including their possible role in disseminating misinformation and disinformation on audio-based social media platforms. They can be used as a logical-access voice spoofing technique to manipulate public opinion for propaganda, defamation, or terrorism. Vast amounts of voice recordings are transmitted over the Internet daily, and spoofing detection is challenging. Audio deepfake attackers have targeted individuals and organizations, including politicians and governments. In early 2020, scammers used artificial intelligence-based software to impersonate the voice of a CEO and authorize a money transfer of about $35 million through a phone call. According to a 2023 global McAfee survey, one person in ten reported having been targeted by an AI voice cloning scam, and 77% of these targets reported losing money to it. Audio deepfakes could also pose a danger to the voice ID systems currently deployed for financial consumers.
Audio deepfakes can be divided into three different categories:
Replay-based deepfakes are malicious works that aim to reproduce a recording of the interlocutor's voice.
Hands-on introduction to data science and machine learning. We explore recommender systems, generative AI, chatbots, graphs, as well as regression, classification, clustering, dimensionality reduction
Machine learning methods are becoming increasingly central in many sciences and applications. In this course, fundamental principles and methods of machine learning will be introduced, analyzed and pr
Deepfakes (portmanteau of "deep learning" and "fake") are synthetic media that have been digitally manipulated to replace one person's likeness convincingly with that of another. Deepfakes are the manipulation of facial appearance through deep generative methods. While the act of creating fake content is not new, deepfakes leverage powerful techniques from machine learning and artificial intelligence to manipulate or generate visual and audio content that can more easily deceive.
15.ai is a non-commercial freeware artificial intelligence web application that generates natural emotive high-fidelity text-to-speech voices from an assortment of fictional characters from a variety of media sources. Developed by a pseudonymous MIT researcher under the name 15, the project uses a combination of audio synthesis algorithms, speech synthesis deep neural networks, and sentiment analysis models to generate and serve emotive character voices faster than real-time, particularly those with a very small amount of trainable data.
Artificial intelligence (AI) has been used in applications to alleviate certain problems throughout industry and academia. AI, like electricity or computers, is a general-purpose technology with a multitude of applications. It has been used in fields such as language translation, image recognition, credit scoring, and e-commerce. Recommendation systems: a recommendation system predicts the "rating" or "preference" a user would give to an item.
Explores emotion theories, applications, and predictive models in affective computing, analyzing NSF funding trends, emotion impact on education and medicine, and emotion detection through physiological signals and visual data.
Detecting digital face manipulation in images and video has attracted extensive attention due to the potential risk to public trust. To counteract the malicious usage of such techniques, deep learning-based deepfake detection methods have been employed and ...
Video DeepFakes are fake media created with Deep Learning (DL) that manipulate a person’s expression or identity. Most current DeepFake detection methods analyze each frame independently, ignoring inconsistencies and unnatural movements between frames. Som ...
In recent years, the remarkable progress in facial manipulation techniques has raised social concerns due to their potential malicious usage and has received considerable attention from both industry and academia. While current deep learning-based face for ...