Eliezer S. Yudkowsky (ˌɛliˈɛzər_ˌjʌdˈkaʊski ; born September 11, 1979) is an American artificial intelligence researcher and writer on decision theory and ethics, best known for popularizing ideas related to friendly artificial intelligence, including the idea of a "fire alarm" for AI. He is a co-founder and research fellow at the Machine Intelligence Research Institute (MIRI), a private research nonprofit based in Berkeley, California. His work on the prospect of a runaway intelligence explosion influenced philosopher Nick Bostrom's 2014 book Superintelligence: Paths, Dangers, Strategies.
Machine Intelligence Research Institute
Yudkowsky's views on the safety challenges posed by future generations of AI systems are discussed in the undergraduate textbook in AI, Stuart Russell and Peter Norvig's Artificial Intelligence: A Modern Approach. Noting the difficulty of formally specifying general-purpose goals by hand, Russell and Norvig cite Yudkowsky's proposal that autonomous and adaptive systems be designed to learn correct behavior over time:
Yudkowsky (2008) goes into more detail about how to design a Friendly AI. He asserts that friendliness (a desire not to harm humans) should be designed in from the start, but that the designers should recognize both that their own designs may be flawed, and that the robot will learn and evolve over time. Thus the challenge is one of mechanism design—to design a mechanism for evolving AI under a system of checks and balances, and to give the systems utility functions that will remain friendly in the face of such changes.
In response to the instrumental convergence concern, where autonomous decision-making systems with poorly designed goals would have default incentives to mistreat humans, Yudkowsky and other MIRI researchers have recommended that work be done to specify software agents that converge on safe default behaviors even when their goals are misspecified.
In the intelligence explosion scenario hypothesized by I. J.
This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.
Machine ethics (or machine morality, computational morality, or computational ethics) is a part of the ethics of artificial intelligence concerned with adding or ensuring moral behaviors of man-made machines that use artificial intelligence, otherwise known as artificial intelligent agents. Machine ethics differs from other ethical fields related to engineering and technology. Machine ethics should not be confused with computer ethics, which focuses on human use of computers.
Friendly artificial intelligence (also friendly AI or FAI) is hypothetical artificial general intelligence (AGI) that would have a positive (benign) effect on humanity or at least align with human interests or contribute to fostering the improvement of the human species. It is a part of the ethics of artificial intelligence and is closely related to machine ethics. While machine ethics is concerned with how an artificially intelligent agent should behave, friendly artificial intelligence research is focused on how to practically bring about this behavior and ensuring it is adequately constrained.
Instrumental convergence is the hypothetical tendency for most sufficiently intelligent beings (human and non-human) to pursue similar sub-goals, even if their ultimate goals are pretty different. More precisely, agents (beings with agency) may pursue instrumental goals—goals which are made in pursuit of some particular end, but are not the end goals themselves—without ceasing, provided that their ultimate (intrinsic) goals may never be fully satisfied.
L'objectif de ce séminaire est d'amener les étudiants à réfléchir aux enjeux éthiques que les nouvelles technologies peuvent soulever, parmi lesquels leur incompatibilité avec l'autonomie, la liberté