Artificial intelligence (AI) is the intelligence of machines or software, as opposed to the intelligence of humans or animals. It is also the field of study in computer science that develops and studies intelligent machines. "AI" may also refer to the machines themselves.
Artificial intelligence was founded as an academic discipline in 1956.[2] The field went through multiple cycles of optimism[3][4] followed by disappointment and loss of funding,[5][6] but after 2012, when deep learning surpassed all previous AI techniques,[7] there was a vast increase in funding and interest.
The general problem of simulating (or creating) intelligence has been broken down into sub-problems. These consist of particular traits or capabilities that researchers expect an intelligent system to display. The traits described below have received the most attention and cover the scope of AI research.[a]
Reasoning, problem-solving
Early researchers developed algorithms that imitated step-by-step reasoning that humans use when they solve puzzles or make logical deductions.[10] By the late 1980s and 1990s, methods were developed for dealing with uncertain or incomplete information, employing concepts from probability and economics.[11]
Many of these algorithms are insufficient for solving large reasoning problems because they experience a "combinatorial explosion": they become exponentially slower as the problems grow larger.[12]
Even humans rarely use the step-by-step deduction that early AI research could model. They solve most of their problems using fast, intuitive judgments.[13]
Accurate and efficient reasoning is an unsolved problem.
Knowledge representation
An ontology represents knowledge as a set of concepts within a domain and the relationships between those concepts.
Knowledge representation and knowledge engineering[14] allow AI programs to answer questions intelligently and make deductions about real-world facts. Formal knowledge representations are used in content-based indexing and retrieval,[15] scene interpretation,[16] clinical decision support,[17] knowledge discovery (mining "interesting" and actionable inferences from large databases),[18] and other areas.[19]
A knowledge base is a body of knowledge represented in a form that can be used by a program. An ontology is the set of objects, relations, concepts, and properties used by a particular domain of knowledge.[20] Knowledge bases need to represent things such as:
objects, properties, categories and relations between objects;[21]
situations, events, states and time;[22]
causes and effects;[23]
knowledge about knowledge (what we know about what other people know);[24]
default reasoning (things that humans assume are true until they are told differently and will remain true even when other facts are changing);[25]
and many other aspects and domains of knowledge.
Among the most difficult problems in KR are: the breadth of commonsense knowledge (the set of atomic facts that the average person knows is enormous);[26] and the sub-symbolic form of most commonsense knowledge (much of what people know is not represented as "facts" or "statements" that they could express verbally).[13]
Knowledge acquisition is the difficult problem of obtaining knowledge for AI applications.[c] Modern AI gathers knowledge by "scraping" the internet (including Wikipedia). The knowledge itself was collected by the volunteers and professionals who published the information (who may or may not have agreed to provide their work to AI companies).[29] This "crowd sourced" technique does not guarantee that the knowledge is correct or reliable. The knowledge of large language models (such as ChatGPT) is highly unreliable: they generate misinformation and falsehoods (known as "hallucinations"). Providing accurate knowledge for these modern AI applications is an unsolved problem.
Planning and decision making
An "agent" is anything that perceives and takes actions in the world. A rational agent has goals or preferences and takes actions to make them happen.[d][30]
In automated planning, the agent has a specific goal.[31] In automated decision making, the agent has preferences – there are some situations it would prefer to be in, and some situations it is trying to avoid. The decision making agent assigns a number to each situation (called the "utility") that measures how much the agent prefers it. For each possible action, it can calculate the "expected utility": the utility of all possible outcomes of the action, weighted by the probability that the outcome will occur. It can then choose the action with the maximum expected utility.[32]
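The expected-utility calculation described above can be sketched in a few lines. This is a minimal illustration under assumed inputs, not a method taken from the cited sources: the action names, probabilities and utility values are hypothetical, as are the function names `expected_utility` and `best_action`.

```python
from typing import Dict, List, Tuple

# Each action maps to a list of (probability, utility) pairs, one per outcome.
Outcomes = List[Tuple[float, float]]


def expected_utility(outcomes: Outcomes) -> float:
    """Utility of every outcome, weighted by the probability it occurs."""
    return sum(probability * utility for probability, utility in outcomes)


def best_action(actions: Dict[str, Outcomes]) -> str:
    """Choose the action with the maximum expected utility."""
    return max(actions, key=lambda name: expected_utility(actions[name]))


# Hypothetical decision: outcomes are (rain, no rain) with made-up utilities.
actions = {
    "take_umbrella": [(0.3, 8.0), (0.7, 6.0)],
    "leave_umbrella": [(0.3, 0.0), (0.7, 10.0)],
}

print(best_action(actions))
```

With these made-up numbers, leaving the umbrella scores about 7.0 against about 6.6 for taking it, so the agent would leave it; changing the probability of rain changes the choice.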
In classical planning, the agent knows exactly what the effect of any action will be.[33]
In most real-world problems, however, the agent may not be certain about the situation it is in (it is "unknown" or "unobservable") and it may not know for certain what will happen after each possible action (it is not "deterministic"). It must choose an action by making a probabilistic guess and then reassess the situation to see if the action worked.[34]
In some problems, the agent's preferences may be uncertain, especially if there are other agents or humans involved. These can be learned (e.g., with inverse reinforcement learning) or the agent can seek information to improve its preferences.[35] Information value theory can be used to weigh the value of exploratory or experimental actions.[36]
The space of possible future actions and situations is typically intractably large, so the agents must take actions and evaluate situations while being uncertain what the outcome will be.
A Markov decision process has a transition model that describes the probability that a particular action will change the state in a particular way, and a reward function that supplies the utility of each state and the cost of each action. A policy associates a decision with each possible state. The policy could be calculated (e.g. by iteration), be heuristic, or be learned.[37]
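As an illustration of calculating a policy by iteration, the following sketch applies value iteration to a toy Markov decision process. Everything in it is an assumption made for the example: the two states, the two actions, the probabilities, the discount factor `gamma` and the fixed number of sweeps. For brevity the reward depends only on the state, so action costs are left out.

```python
from typing import Dict, List, Tuple

# transition[state][action] is a list of (probability, next_state) pairs.
Transition = Dict[str, Dict[str, List[Tuple[float, str]]]]


def value_iteration(transition: Transition, reward: Dict[str, float],
                    gamma: float = 0.9, sweeps: int = 200):
    """Return (state values, greedy policy) for a small, fully known MDP."""
    values = {state: 0.0 for state in transition}
    for _ in range(sweeps):
        # Bellman update: reward now plus discounted value of the best action.
        values = {
            state: reward[state] + gamma * max(
                sum(p * values[nxt] for p, nxt in outcomes)
                for outcomes in transition[state].values()
            )
            for state in transition
        }
    # The policy associates each state with the action of highest expected value.
    policy = {
        state: max(
            transition[state],
            key=lambda action: sum(p * values[nxt]
                                   for p, nxt in transition[state][action]),
        )
        for state in transition
    }
    return values, policy


# Hypothetical two-state world: being "warm" is rewarded, being "cold" is not.
transition = {
    "cold": {"stay": [(1.0, "cold")], "move": [(0.8, "warm"), (0.2, "cold")]},
    "warm": {"stay": [(1.0, "warm")], "move": [(1.0, "cold")]},
}
reward = {"cold": 0.0, "warm": 1.0}

values, policy = value_iteration(transition, reward)
print(policy)
```

In this toy world the resulting policy moves out of the cold state and stays in the warm one.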
Game theory describes rational behavior of multiple interacting agents, and is used in AI programs that make decisions that involve other agents.[38]
Learning
Machine learning is the study of programs that can improve their performance on a given task automatically.[39]
It has been a part of AI from the beginning.[e]
There are several kinds of machine learning.
Unsupervised learning analyzes a stream of data and finds patterns and makes predictions without any other guidance.[42] Supervised learning requires a human to label the input data first, and comes in two main varieties: classification (where the program must learn to predict what category the input belongs in) and regression (where the program must deduce a numeric function based on numeric input).[43]
In reinforcement learning the agent is rewarded for good responses and punished for bad ones. The agent learns to choose responses that are classified as "good".[44] Transfer learning is when the knowledge gained from one problem is applied to a new problem.[45] Deep learning uses artificial neural networks for all of these types of learning.
Figure: Kismet, a robot with rudimentary social skills.[61]
Affective computing is an interdisciplinary umbrella that comprises systems that recognize, interpret, process or simulate human feeling, emotion and mood.[62] For example, some virtual assistants are programmed to speak conversationally or even to banter humorously; it makes them appear more sensitive to the emotional dynamics of human interaction, or to otherwise facilitate human–computer interaction.
However, this tends to give naïve users an unrealistic conception of how intelligent existing computer agents actually are.[63] Moderate successes related to affective computing include textual sentiment analysis and, more recently, multimodal sentiment analysis, wherein AI classifies the affects displayed by a videotaped subject.[64]
General intelligence
A machine with artificial general intelligence should be able to solve a wide variety of problems with breadth and versatility similar to human intelligence.[8]
Tools
AI research uses a wide variety of tools to accomplish the goals above.[b]
Search and optimization
AI can solve many problems by intelligently searching through many possible solutions.[65] There are two very different kinds of search used in AI: state space search and local search.
State space search
State space search searches through a tree of possible states to try to find a goal state.[66]
For example, planning algorithms search through trees of goals and subgoals, attempting to find a path to a target goal, a process called means-ends analysis.[67]
Adversarial search is used for game-playing programs, such as chess or go. It searches through a tree of possible moves and counter-moves, looking for a winning position.[70]
Local search uses mathematical optimization to find a numeric solution to a problem. It begins with some form of a guess and then refines the guess incrementally until no more refinements can be made. These algorithms can be visualized as blind hill climbing: we begin the search at a random point on the landscape, and then, by jumps or steps, we keep moving our guess uphill, until we reach the top. This process is called stochastic gradient descent.[71]
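The hill-climbing picture can be made concrete with a short sketch. It is plain hill climbing on a one-dimensional landscape, not gradient descent: no gradients are computed. The names `hill_climb`, `score` and `step` are illustrative, and the start point is fixed rather than random so that the run is reproducible.

```python
from typing import Callable


def hill_climb(score: Callable[[float], float], start: float,
               step: float = 1.0, max_steps: int = 10_000) -> float:
    """Move the guess uphill one step at a time until no neighbor is better."""
    current = start
    for _ in range(max_steps):
        # Look one step to the left and right and keep the better neighbor.
        best_neighbor = max((current - step, current + step), key=score)
        if score(best_neighbor) <= score(current):
            return current  # no more refinements can be made
        current = best_neighbor
    return current


# Hypothetical landscape with a single peak at x = 7.
print(hill_climb(lambda x: -(x - 7) ** 2, start=0, step=1))  # 7
```

On a landscape with several peaks the same procedure stops at whichever peak is nearest the starting guess, which is why practical variants add random restarts or random steps.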
Evolutionary computation uses a form of optimization search. For example, evolutionary algorithms may begin with a population of organisms (the guesses) and then allow them to mutate and recombine, selecting only the fittest to survive each generation (refining the guesses).[72]
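A minimal sketch of this mutate, recombine and select loop follows. The bit-string encoding, the population size, the mutation rate and the rule of keeping the fitter half of each generation are all assumptions chosen to keep the example short; they are not taken from the cited source.

```python
import random
from typing import Callable, List

Genome = List[int]


def evolve(fitness: Callable[[Genome], float], length: int = 12,
           population_size: int = 30, generations: int = 60,
           mutation_rate: float = 0.05, seed: int = 0) -> Genome:
    """Return the fittest bit string found after a number of generations."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(population_size)]
    for _ in range(generations):
        # Selection: only the fitter half survives the generation.
        population.sort(key=fitness, reverse=True)
        survivors = population[: population_size // 2]
        children = []
        while len(survivors) + len(children) < population_size:
            mother, father = rng.sample(survivors, 2)
            cut = rng.randrange(1, length)       # recombination point
            child = mother[:cut] + father[cut:]
            # Mutation: flip each bit with a small probability.
            child = [1 - bit if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        population = survivors + children
    return max(population, key=fitness)


# Hypothetical fitness: the number of 1 bits in the genome.
best = evolve(sum)
print(best, sum(best))
```

Because the survivors are carried over unchanged, the best fitness in the population can never decrease from one generation to the next.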
Neural networks and statistical classifiers (discussed below) also use a form of local search, where the "landscape" to be searched is formed by learning.
Logical inference (or deduction) is the process of proving a new statement (conclusion) from other statements that are already known to be true (the premises).[77]
A logical knowledge base also handles queries and assertions as a special case of inference.[78]
An inference rule describes what is a valid step in a proof. The most general inference rule is resolution.[79]
Inference can be reduced to performing a search to find a path that leads from premises to conclusions, where each step is the application of an inference rule.[80]
Inference performed this way is intractable except for short proofs in restricted domains. No efficient, powerful and general method has been discovered.[81]
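The idea of inference as repeated application of an inference rule can be illustrated with forward chaining over simple if-then rules. This sketch uses modus ponens, not the more general resolution rule mentioned above, and the facts and rules are hypothetical.

```python
from typing import FrozenSet, List, Set, Tuple

# A rule says: if every premise is known, the conclusion may be added.
Rule = Tuple[FrozenSet[str], str]


def forward_chain(facts: Set[str], rules: List[Rule], goal: str) -> bool:
    """Apply rules to the known facts until the goal appears or nothing changes."""
    known = set(facts)
    changed = True
    while changed and goal not in known:
        changed = False
        for premises, conclusion in rules:
            if premises <= known and conclusion not in known:
                known.add(conclusion)  # one inference step
                changed = True
    return goal in known


rules = [
    (frozenset({"rain"}), "wet_ground"),
    (frozenset({"wet_ground"}), "slippery"),
]

print(forward_chain({"rain"}, rules, "slippery"))  # True
print(forward_chain({"rain"}, rules, "sunny"))     # False
```

Each pass over the rule list is cheap here, but with richer logics the number of candidate steps grows rapidly, which is the intractability noted above.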
Figure: Expectation-maximization clustering of Old Faithful eruption data starts from a random guess but then successfully converges on an accurate clustering of the two physically distinct modes of eruption.
Many problems in AI (including in reasoning, planning, learning, perception, and robotics) require the agent to operate with incomplete or uncertain information. AI researchers have devised a number of tools to solve these problems using methods from probability theory and economics.[83]
Probabilistic algorithms can also be used for filtering, prediction, smoothing and finding explanations for streams of data, helping perception systems to analyze processes that occur over time (e.g., hidden Markov models or Kalman filters).[90]
The simplest AI applications can be divided into two types: classifiers (e.g. "if shiny then diamond"), on one hand, and controllers (e.g. "if diamond then pick up"), on the other hand.
Classifiers[95] are functions that use pattern matching to determine the closest match. They can be fine-tuned based on chosen examples using supervised learning. Each pattern (also called an "observation") is labeled with a certain predefined class. All the observations combined with their class labels are known as a data set. When a new observation is received, that observation is classified based on previous experience.[43]
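A nearest-neighbor rule is one simple way to classify a new observation by its closest match in a labeled data set. The sketch below is illustrative only: the two features (shininess and hardness), their values and the class labels are made up, and the function name `classify` is an assumption.

```python
import math
from typing import List, Sequence, Tuple

# A data set is a list of (observation, class label) pairs.
DataSet = List[Tuple[Sequence[float], str]]


def classify(data_set: DataSet, observation: Sequence[float]) -> str:
    """Label a new observation with the class of its closest labeled example."""
    _, label = min(data_set,
                   key=lambda pair: math.dist(pair[0], observation))
    return label


# Hypothetical observations: (shininess, hardness).
data_set = [
    ((0.9, 10.0), "diamond"),
    ((0.8, 9.5), "diamond"),
    ((0.2, 3.0), "pebble"),
    ((0.3, 2.5), "pebble"),
]

print(classify(data_set, (0.85, 9.0)))  # diamond
print(classify(data_set, (0.10, 2.0)))  # pebble
```

Using the single closest example keeps the sketch short; the k-nearest neighbor algorithm instead takes a vote among the k closest examples.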
There are many kinds of classifiers in use. The decision tree is the simplest and most widely used symbolic machine learning algorithm.[96] The k-nearest neighbor algorithm was the most widely used analogical AI until the mid-1990s, and kernel methods such as the support vector machine (SVM) displaced k-nearest neighbor in the 1990s.[97]
The naive Bayes classifier is reportedly the "most widely used learner"[98] at Google, due in part to its scalability.[99] Neural networks are also used as classifiers.[100]
Artificial neural networks
A neural network is an interconnected group of nodes, akin to the vast network of neurons in the human brain.
Artificial neural networks[100] were inspired by the design of the human brain: a simple "neuron" N accepts input from other neurons, each of which, when activated (or "fired"), casts a weighted "vote" for or against whether neuron N should itself activate. In practice, the input "neurons" are a list of numbers, the "weights" are a matrix, the next layer is the dot product (i.e., several weighted sums) scaled by an increasing function, such as the logistic function. "The resemblance to real neural cells and structures is superficial", according to Russell and Norvig.[101][i]
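The "list of numbers, matrix, weighted sums, logistic function" description corresponds to a very small amount of code. The following sketch computes one layer; the weights and inputs are arbitrary illustrative values, and bias terms are omitted because the text above does not mention them.

```python
import math
from typing import List


def logistic(x: float) -> float:
    """Increasing function that squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer(inputs: List[float], weights: List[List[float]]) -> List[float]:
    """Each output neuron is a weighted sum of the inputs, passed through logistic."""
    return [
        logistic(sum(w * x for w, x in zip(row, inputs)))
        for row in weights  # one row of weights per output neuron
    ]


inputs = [1.0, 2.0]
weights = [
    [0.5, -0.25],  # weighted sum is 0.0, so the output is exactly 0.5
    [1.0, 1.0],    # weighted sum is 3.0, so the output is close to 1
]

print(layer(inputs, weights))
```

A network with several layers simply feeds the output list of one call to `layer` in as the inputs of the next.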
Learning algorithms for neural networks use local search to choose the weights that will get the right output for each input during training. The most common training technique is the backpropagation algorithm.[102]
Neural networks learn to model complex relationships between inputs and outputs and find patterns in data. In theory, a neural network can learn any function.[103]
Figure: Representing images on multiple layers of abstraction in deep learning.[109]
Deep learning[107] uses several layers of neurons between the network's inputs and outputs. The multiple layers can progressively extract higher-level features from the raw input. For example, in image processing, lower layers may identify edges, while higher layers may identify the concepts relevant to a human such as digits or letters or faces.[110]
Deep learning has drastically improved the performance of programs in many important subfields of artificial intelligence, including computer vision, speech recognition, image classification[111] and others. The reason that deep learning performs so well in so many applications is not known as of 2023.[112]
The sudden success of deep learning in 2012–2015 did not occur because of some new discovery or theoretical breakthrough (deep neural networks and backpropagation had been described by many people, as far back as the 1950s)[j] but because of two factors: the incredible increase in computer power (including the hundred-fold increase in speed by switching to GPUs) and the availability of vast amounts of training data, especially the giant curated datasets used for benchmark testing, such as ImageNet.[k]
In the late 2010s, graphics processing units (GPUs), increasingly designed with AI-specific enhancements and used with specialized TensorFlow software, replaced the previously used central processing units (CPUs) as the dominant means of training large-scale (commercial and academic) machine learning models.[121]
Historically, specialized languages, such as Lisp, Prolog, and others, had been used.
Figure: For this 2018 project of the artist Joseph Ayerle, the AI had to learn the typical patterns in the colors and brushstrokes of Renaissance painter Raphael. The portrait shows the face of the actress Ornella Muti, "painted" by AI in the style of Raphael.
There are also thousands of successful AI applications used to solve specific problems for specific industries or institutions. In a 2017 survey, one in five companies reported they had incorporated "AI" in some offerings or processes.[126]
A few examples are energy storage,[127] medical diagnosis, military logistics, applications that predict the result of judicial decisions,[128] foreign policy,[129] or supply chain management.
AlphaFold 2 (2020) demonstrated the ability to approximate, in hours rather than months, the 3D structure of a protein.[147]
Ethics
AI, like any powerful technology, has potential benefits and potential risks. AI may be able to advance science and find solutions for serious problems: Demis Hassabis of DeepMind hopes to "solve intelligence, and then use that to solve everything else".[148] However, as the use of AI has become widespread, several unintended consequences and risks have been identified.[149]
Machine learning applications will be biased if they learn from biased data.[150]
The developers may not be aware that the bias exists.[151]
Bias can be introduced by the way training data is selected and by the way a model is deployed.[152][150] If a biased algorithm is used to make decisions that can seriously harm people (as it can in medicine, finance, recruitment, housing or policing) then the algorithm may cause discrimination.[153] Fairness in machine learning is the study of how to prevent the harm caused by algorithmic bias. It has become a serious area of academic study within AI. Researchers have discovered it is not always possible to define "fairness" in a way that satisfies all stakeholders.[154]
On June 28, 2015, Google Photos's new image labeling feature mistakenly identified Jacky Alcine and a friend as "gorillas" because they were black. The system was trained on a dataset that contained very few images of black people,[155] a problem called "sample size disparity".[156] Google "fixed" this problem by preventing the system from labelling anything as a "gorilla". Eight years later, in 2023, Google Photos still could not identify a gorilla, and neither could similar products from Apple, Facebook, Microsoft and Amazon.[157]
COMPAS is a commercial program widely used by U.S. courts to assess the likelihood of a defendant becoming a recidivist. In 2016, Julia Angwin at ProPublica discovered that COMPAS exhibited racial bias, despite the fact that the program was not told the races of the defendants. Although the error rate for both whites and blacks was calibrated equal at exactly 61%, the errors for each race were different: the system consistently overestimated the chance that a black person would re-offend and underestimated the chance that a white person would re-offend.[158] In 2017, several researchers[m] showed that it was mathematically impossible for COMPAS to accommodate all possible measures of fairness when the base rates of re-offense were different for whites and blacks in the data.[160]
A program can make biased decisions even if the data does not explicitly mention a problematic feature (such as "race" or "gender"). The feature will correlate with other features (like "address", "shopping history" or "first name"), and the program will make the same decisions based on these features as it would on "race" or "gender".[161]
Moritz Hardt said “the most robust fact in this research area is that fairness through blindness doesn't work.”[162]
Criticism of COMPAS highlighted a deeper problem with the misuse of AI. Machine learning models are designed to make "predictions" that are only valid if we assume that the future will resemble the past. If they are trained on data that includes the results of racist decisions in the past, machine learning models must predict that racist decisions will be made in the future. Unfortunately, if an application then uses these predictions as recommendations, some of these "recommendations" will likely be racist.[163] Thus, machine learning is not well suited to help make decisions in areas where there is hope that the future will be better than the past. It is necessarily descriptive and not prescriptive.[n]
Bias and unfairness may go undetected because the developers are overwhelmingly white and male: among AI engineers, about 4% are black and 20% are women.[156]
At its 2022 Conference on Fairness, Accountability, and Transparency (ACM FAccT 2022) the Association for Computing Machinery, in Seoul, South Korea, presented and published findings recommending that until AI and robotics systems are demonstrated to be free of bias mistakes, they are unsafe and the use of self-learning neural networks trained on vast, unregulated sources of flawed internet data should be curtailed.[165]
Most modern AI applications cannot explain how they have reached a decision.[166] The large number of relationships between inputs and outputs in deep neural networks, and the resulting complexity, make it difficult for even an expert to explain how they produced their outputs, making them a black box.[167]
There have been many cases where a machine learning program passed rigorous tests, but nevertheless learned something different than what the programmers intended. For example, Justin Ko and Roberto Novoa developed a system that could identify skin diseases better than medical professionals; however, it classified any image with a ruler as "cancerous", because pictures of malignancies typically include a ruler to show the scale.[168] A more dangerous example was discovered by Rich Caruana in 2015: a machine learning system that accurately predicted risk of death classified a patient who was over 65 and had asthma and difficulty breathing as "low risk". Further research showed that in high-risk cases like this, the hospital would allocate more resources and save the patient's life, decreasing the risk measured by the program.[169] Mistakes like these become obvious when we know how the program has reached a decision. Without an explanation, these problems may not be discovered until after they have caused harm.
A second issue is that people who have been harmed by an algorithm's decision have a right to an explanation. Doctors, for example, are required to clearly and completely explain the reasoning behind any decision they make.[170] Early drafts of the European Union's General Data Protection Regulation in 2016 included an explicit statement that this right exists.[o] Industry experts noted that this is an unsolved problem with no solution in sight. Regulators argued that nevertheless the harm is real: if the problem has no solution, the tools should not be used.[171]
DARPA established the XAI ("Explainable Artificial Intelligence") program in 2014 to try to solve these problems.[172]
There are several potential solutions to the transparency problem.
Multitask learning provides a large number of outputs in addition to the target classification. These other outputs can help developers deduce what the network has learned.[173] Deconvolution, DeepDream and other generative methods can allow developers to see what different layers of a deep network have learned and produce output that can suggest what the network is learning.[174] Supersparse linear integer models use learning to identify the most important features, rather than the classification. Simple addition of these features can then make the classification (i.e. learning is used to create a scoring system classifier, which is transparent).[175]
From the early days of the development of artificial intelligence there have been arguments, for example those put forward by Weizenbaum, about whether tasks that can be done by computers actually should be done by them, given the difference between computers and humans, and between quantitative calculation and qualitative, value-based judgement.[182]
Economists have frequently highlighted the risks of redundancies from AI, and speculated about unemployment if there is no adequate social policy for full employment.[183]
In the past, technology has tended to increase rather than reduce total employment, but economists acknowledge that "we're in uncharted territory" with AI.[184] A survey of economists showed disagreement about whether the increasing use of robots and AI will cause a substantial increase in long-term unemployment, but they generally agree that it could be a net benefit if productivity gains are redistributed.[185] Risk estimates vary; for example, in the 2010s Michael Osborne and Carl Benedikt Frey estimated 47% of U.S. jobs are at "high risk" of potential automation, while an OECD report classified only 9% of U.S. jobs as "high risk".[q][187] The methodology of speculating about future employment levels has been criticised as lacking evidential foundation, and for implying that technology (rather than social policy) creates unemployment (as opposed to redundancies).[183]
Unlike previous waves of automation, many middle-class jobs may be eliminated by artificial intelligence; The Economist stated in 2015 that "the worry that AI could do to white-collar jobs what steam power did to blue-collar ones during the Industrial Revolution" is "worth taking seriously".[188] Jobs at extreme risk range from paralegals to fast food cooks, while job demand is likely to increase for care-related professions ranging from personal healthcare to the clergy.[189]
In April 2023, it was reported that 70% of the jobs for Chinese video game illustrators had been eliminated by generative artificial intelligence.[190][191]
In order to leverage as large a dataset as is feasible, generative AI is often trained on unlicensed copyrighted works, including in domains such as images or computer code; the output is then used under a rationale of "fair use". Experts disagree about how well, and under what circumstances, this rationale will hold up in courts of law; relevant factors may include "the purpose and character of the use of the copyrighted work" and "the effect upon the potential market for the copyrighted work".[192]
Friendly AI are machines that have been designed from the beginning to minimize risks and to make choices that benefit humans. Eliezer Yudkowsky, who coined the term, argues that developing friendly AI should be a higher research priority: it may require a large investment and it must be completed before AI becomes an existential risk.[193]
Machines with intelligence have the potential to use their intelligence to make ethical decisions. The field of machine ethics provides machines with ethical principles and procedures for resolving ethical dilemmas.[194]
The field of machine ethics is also called computational morality,[194] and was founded at an AAAI symposium in 2005.[195]
The regulation of artificial intelligence is the development of public sector policies and laws for promoting and regulating artificial intelligence (AI); it is therefore related to the broader regulation of algorithms.[198]
The regulatory and policy landscape for AI is an emerging issue in jurisdictions globally.[199] According to AI Index at Stanford, the annual number of AI-related laws passed in the 127 surveyed countries jumped from one passed in 2016 to 37 passed in 2022 alone.[200][201]
Between 2016 and 2020, more than 30 countries adopted dedicated strategies for AI.[202]
Most EU member states had released national AI strategies, as had Canada, China, India, Japan, Mauritius, the Russian Federation, Saudi Arabia, United Arab Emirates, US and Vietnam. Others were in the process of elaborating their own AI strategy, including Bangladesh, Malaysia and Tunisia.[202]
The Global Partnership on Artificial Intelligence was launched in June 2020, stating a need for AI to be developed in accordance with human rights and democratic values, to ensure public confidence and trust in the technology.[202] Henry Kissinger, Eric Schmidt, and Daniel Huttenlocher published a joint statement in November 2021 calling for a government commission to regulate AI.[203] In 2023, OpenAI leaders published recommendations for the governance of superintelligence, which they believe may happen in less than 10 years.[204]
In a 2022 Ipsos survey, attitudes towards AI varied greatly by country; 78% of Chinese citizens, but only 35% of Americans, agreed that "products and services using AI have more benefits than drawbacks".[200] A 2023 Reuters/Ipsos poll found that 61% of Americans agree, and 22% disagree, that AI poses risks to humanity.[205] In a 2023 Fox News poll, 35% of Americans thought it "very important", and an additional 41% thought it "somewhat important", for the federal government to regulate AI, versus 13% responding "not very important" and 8% responding "not at all important".[206][207]
The study of mechanical or "formal" reasoning began with philosophers and mathematicians in antiquity. The study of logic led directly to Alan Turing's theory of computation, which suggested that a machine, by shuffling symbols as simple as "0" and "1", could simulate both mathematical deduction and formal reasoning, which is known as the Church–Turing thesis.[208] This, along with concurrent discoveries in cybernetics and information theory, led researchers to consider the possibility of building an "electronic brain".[r][210] The first paper later recognized as "AI" was McCulloch and Pitts' design for Turing-complete "artificial neurons" in 1943.[211]
The field of AI research was founded at a workshop at Dartmouth College in 1956.[s][2] The attendees became the leaders of AI research in the 1960s.[t] They and their students produced programs that the press described as "astonishing":[u] computers were learning checkers strategies, solving word problems in algebra, proving logical theorems and speaking English.[v][3]
By the middle of the 1960s, research in the U.S. was heavily funded by the Department of Defense[215] and laboratories had been established around the world.[216] Herbert Simon predicted, "machines will be capable, within twenty years, of doing any work a man can do".[217] Marvin Minsky agreed, writing, "within a generation ... the problem of creating 'artificial intelligence' will substantially be solved".[218]
They had, however, underestimated the difficulty of the problem.[w] Both the U.S. and British governments cut off exploratory research in response to the criticism of Sir James Lighthill[220] and ongoing pressure from the US Congress to fund more productive projects.
Minsky's and Papert's book Perceptrons was understood as proving that the artificial neural network approach would never be useful for solving real-world tasks, thus discrediting the approach altogether.[221] The "AI winter", a period when obtaining funding for AI projects was difficult, followed.[5]
In the early 1980s, AI research was revived by the commercial success of expert systems,[222] a form of AI program that simulated the knowledge and analytical skills of human experts. By 1985, the market for AI had reached over a billion dollars. At the same time, Japan's fifth generation computer project inspired the U.S. and British governments to restore funding for academic research.[4] However, beginning with the collapse of the Lisp Machine market in 1987, AI once again fell into disrepute, and a second, longer-lasting winter began.[6]
Many researchers began to doubt that the current practices would be able to imitate all the processes of human cognition, especially perception, robotics, learning and pattern recognition.[223] A number of researchers began to look into "sub-symbolic" approaches.[224] Robotics researchers, such as Rodney Brooks, rejected "representation" in general and focussed directly on engineering machines that move and survive.[x]
Judea Pearl, Lotfi Zadeh and others developed methods that handled incomplete and uncertain information by making reasonable guesses rather than precise logic.[83][229] But the most important development was the revival of "connectionism", including neural network research, by Geoffrey Hinton and others.[230] In 1990, Yann LeCun successfully showed that convolutional neural networks can recognize handwritten digits, the first of many successful applications of neural networks.[231]
AI gradually restored its reputation in the late 1990s and early 21st century by exploiting formal mathematical methods and by finding specific solutions to specific problems. This "narrow" and "formal" focus allowed researchers to produce verifiable results and collaborate with other fields (such as statistics, economics and mathematics).[232]
By 2000, solutions developed by AI researchers were being widely used, although in the 1990s they were rarely described as "artificial intelligence".[233]
Several academic researchers became concerned that AI was no longer pursuing the original goal of creating versatile, fully intelligent machines. Beginning around 2002, they founded the subfield of
artificial general intelligence (or "AGI"), which had several well-funded institutions by the 2010s.[8]
Deep learning's success led to an enormous increase in interest and funding in AI.[z]
The amount of machine learning research (measured by total publications) increased by 50% in the years 2015–2019,[202]
and
WIPO reported that AI was the most prolific
emerging technology in terms of the number of
patent applications and granted patents.[238]
According to 'AI Impacts', about $50 billion annually was invested in "AI" around 2022 in the US alone and about 20% of new US Computer Science PhD graduates have specialized in "AI";[239]
about 800,000 "AI"-related US job openings existed in 2022.[240]
In 2016, issues of fairness and the misuse of technology were catapulted into center stage at machine learning conferences, publications vastly increased, funding became available, and many researchers re-focussed their careers on these issues. The
alignment problem became a serious field of academic study.[241]
Alan Turing wrote in 1950 "I propose to consider the question 'can machines think'?"[242]
He advised changing the question from whether a machine "thinks", to "whether or not it is possible for machinery to show intelligent behaviour".[242]
He devised the Turing test, which measures the ability of a machine to simulate human conversation.[243] Since we can only observe the behavior of the machine, it does not matter if it is "actually" thinking or literally has a "mind". Turing notes that we cannot determine these things about other people,[aa] but "it is usual to have a polite convention that everyone thinks".[244]
Russell and
Norvig agree with Turing that AI must be defined in terms of "acting" and not "thinking".[245] However, they are critical that the test compares machines to people. "
Aeronautical engineering texts," they wrote, "do not define the goal of their field as making 'machines that fly so exactly like
pigeons that they can fool other pigeons.'"[246] AI founder
John McCarthy agreed, writing that "Artificial intelligence is not, by definition, simulation of human intelligence".[247]
McCarthy defines intelligence as "the computational part of the ability to achieve goals in the world."[248] Another AI founder,
Marvin Minsky, similarly defines it as "the ability to solve hard problems".[249] These definitions view intelligence in terms of well-defined problems with well-defined solutions, where both the difficulty of the problem and the performance of the program are direct measures of the "intelligence" of the machine; on this view, no further philosophical discussion is required, and none may even be possible.
Another definition has been adopted by Google,[250] a major practitioner in the field of AI.
This definition stipulates the ability of systems to synthesize information as the manifestation of intelligence, similar to the way it is defined in biological intelligence.
Evaluating approaches to AI
No established unifying theory or
paradigm has guided AI research for most of its history.[ab] The unprecedented success of statistical machine learning in the 2010s eclipsed all other approaches (so much so that some sources, especially in the business world, use the term "artificial intelligence" to mean "machine learning with neural networks"). This approach is mostly
sub-symbolic,
soft and
narrow (see below). Critics argue that these questions may have to be revisited by future generations of AI researchers.
Symbolic AI and its limits
Symbolic AI (or "
GOFAI")[252] simulated the high-level conscious reasoning that people use when they solve puzzles, express legal reasoning and do mathematics. It was highly successful at "intelligent" tasks such as algebra or IQ tests. In the 1960s, Newell and Simon proposed the
physical symbol systems hypothesis: "A physical symbol system has the necessary and sufficient means of general intelligent action."[253]
However, the symbolic approach failed on many tasks that humans solve easily, such as learning, recognizing an object or commonsense reasoning.
Moravec's paradox is the discovery that high-level "intelligent" tasks were easy for AI, but low-level "instinctive" tasks were extremely difficult.[254]
Philosopher
Hubert Dreyfus had
argued since the 1960s that human expertise depends on unconscious instinct rather than conscious symbol manipulation, and on having a "feel" for the situation, rather than explicit symbolic knowledge.[255]
Although his arguments had been ridiculed and ignored when they were first presented, eventually, AI research came to agree.[ac][13]
The issue is not resolved:
sub-symbolic reasoning can make many of the same inscrutable mistakes that human intuition does, such as
algorithmic bias. Critics such as
Noam Chomsky argue that continuing research into symbolic AI will still be necessary to attain general intelligence,[257][258] in part because sub-symbolic AI is a move away from
explainable AI: it can be difficult or impossible to understand why a modern statistical AI program made a particular decision. The emerging field of
neuro-symbolic artificial intelligence attempts to bridge the two approaches.
"Neats" hope that intelligent behavior is described using simple, elegant principles (such as
logic,
optimization, or
neural networks). "Scruffies" expect that it necessarily requires solving a large number of unrelated problems. Neats defend their programs with theoretical rigor, while scruffies rely mainly on incremental testing to see if they work. This issue was actively discussed in the 1970s and 1980s,[259]
but eventually was seen as irrelevant. Modern AI has elements of both.
Finding a provably correct or optimal solution is
intractable for many important problems.[12] Soft computing is a set of techniques, including
genetic algorithms,
fuzzy logic and
neural networks, that are tolerant of imprecision, uncertainty, partial truth and approximation. Soft computing was introduced in the late 1980s, and most successful AI programs in the 21st century are examples of soft computing with neural networks.
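As an illustration of how fuzzy logic tolerates partial truth, the following minimal Python sketch assigns each statement a degree of membership between 0 and 1 rather than a strict true or false value. The function names, the triangular membership shape and the temperature thresholds are assumptions chosen for this example and are not taken from any particular system; minimum, maximum and complement are one common choice of fuzzy operators.

```python
def triangular(x, left, peak, right):
    """Degree (0.0 to 1.0) to which x belongs to a triangular fuzzy set.

    Membership is 0 outside (left, right) and rises linearly to 1 at peak.
    """
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def fuzzy_and(a, b):
    """Fuzzy conjunction: the weaker of the two degrees of truth."""
    return min(a, b)


def fuzzy_or(a, b):
    """Fuzzy disjunction: the stronger of the two degrees of truth."""
    return max(a, b)


def fuzzy_not(a):
    """Fuzzy negation: the complement of the degree of truth."""
    return 1.0 - a


# Hypothetical fuzzy sets over temperature in degrees Celsius.
temperature = 22.0
warm = triangular(temperature, 15.0, 25.0, 35.0)
cold = triangular(temperature, 0.0, 10.0, 25.0)

# "Warm and not cold" is partially true rather than simply true or false.
comfortable = fuzzy_and(warm, fuzzy_not(cold))
print(f"warm={warm:.2f} cold={cold:.2f} comfortable={comfortable:.2f}")
```

In this sketch a temperature of 22 degrees is mostly "warm" and slightly "cold" at the same time, which a two-valued logic could not express without an arbitrary cut-off.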
AI researchers are divided as to whether to pursue the goals of artificial general intelligence and
superintelligence directly or to solve as many specific problems as possible (
narrow AI) in hopes these solutions will lead indirectly to the field's long-term goals.[260][261]
General intelligence is difficult to define and difficult to measure, and modern AI has had more verifiable successes by focusing on specific problems with specific solutions. The experimental sub-field of artificial general intelligence studies this area exclusively.
The
philosophy of mind does not know whether a machine can have a
mind,
consciousness and
mental states, in the same sense that human beings do. This issue considers the internal experiences of the machine, rather than its external behavior. Mainstream AI research considers this issue irrelevant because it does not affect the goals of the field: to build machines that can solve problems using intelligence.
Russell and
Norvig add that "[t]he additional project of making a machine conscious in exactly the way humans are is not one that we are equipped to take on."[262] However, the question has become central to the philosophy of mind. It is also typically the central question at issue in
artificial intelligence in fiction.
David Chalmers identified two problems in understanding the mind, which he named the "hard" and "easy" problems of consciousness.[263] The easy problem is understanding how the brain processes signals, makes plans and controls behavior. The hard problem is explaining how this feels or why it should feel like anything at all, assuming we are right in thinking that it truly does feel like something (Dennett's consciousness illusionism says this is an illusion). Human
information processing is easy to explain; however, human
subjective experience is difficult to explain. For example, it is easy to imagine a color-blind person who has learned to identify which objects in their field of view are red, but it is not clear what would be required for the person to know what red looks like.[264]
Computationalism is the position in the
philosophy of mind that the human mind is an information processing system and that thinking is a form of computing. Computationalism argues that the relationship between mind and body is similar or identical to the relationship between software and hardware and thus may be a solution to the
mind–body problem. This philosophical position was inspired by the work of AI researchers and cognitive scientists in the 1960s and was originally proposed by philosophers
Jerry Fodor and
Hilary Putnam.[265]
Philosopher
John Searle characterized this position as "
strong AI": "The appropriately programmed computer with the right inputs and outputs would thereby have a mind in exactly the same sense human beings have minds."[ad]
Searle counters this assertion with his Chinese room argument, which attempts to show that, even if a machine perfectly simulates human behavior, there is still no reason to suppose it also has a mind.[269]
If a machine has a mind and subjective experience, then it may also have
sentience (the ability to feel), and if so it could also suffer; it has been argued that this could entitle it to certain rights.[270]
Any hypothetical robot rights would lie on a spectrum with
animal rights and human rights.[271]
This issue has been considered in
fiction for centuries,[272]
and is now being considered by, for example, California's
Institute for the Future; however, critics argue that the discussion is premature.[273]
Future
Superintelligence and the singularity
A
superintelligence is a hypothetical agent that would possess intelligence far surpassing that of the brightest and most gifted human mind.[261]
If research into
artificial general intelligence produced sufficiently intelligent software, it might be able to reprogram and improve itself. The improved software would be even better at improving itself, leading to what
I. J. Good called an "
intelligence explosion" and
Vernor Vinge called a "
singularity".[274]
However, most technologies do not improve exponentially indefinitely, but rather follow an S-curve, slowing when they reach the physical limits of what the technology can do.[275] Consider, for example,
transportation: it experienced exponential improvement from 1830 to 1970, but the trend abruptly stopped when it reached physical limits.
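The difference between the two growth patterns can be sketched numerically. The Python example below compares unbounded exponential growth with a logistic (S-shaped) curve that starts from the same value and the same initial growth rate but levels off at a ceiling. The rate of 0.5 and the ceiling of 100 are arbitrary values assumed for illustration; they do not model any real technology.

```python
import math


def exponential(t, rate=0.5):
    """Unbounded exponential growth starting from 1 at t = 0."""
    return math.exp(rate * t)


def logistic(t, rate=0.5, ceiling=100.0):
    """S-curve starting from 1 at t = 0 and saturating at `ceiling`."""
    return ceiling / (1.0 + (ceiling - 1.0) * math.exp(-rate * t))


# Early on the two curves are nearly indistinguishable; later the
# logistic curve flattens near its ceiling while the exponential keeps rising.
for t in (0, 2, 5, 10, 20, 40):
    print(f"t={t:>2}  exponential={exponential(t):>14.1f}  logistic={logistic(t):>6.1f}")
```

The early part of an S-curve looks exponential, so observations from the early phase alone cannot show whether a trend will continue or saturate.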
It has been argued AI will become so powerful that humanity may irreversibly lose control of it. This could, as the physicist
Stephen Hawking puts it, "
spell the end of the human race".[276] This scenario has been common in science fiction, when a computer or robot suddenly develops a human-like "self-awareness" (or "sentience" or "consciousness") and becomes a malevolent character.[ae] These sci-fi scenarios are misleading in several ways.
First, AI does not require human-like "sentience" to be an existential risk. Modern AI programs are given specific goals and use learning and intelligence to achieve them. Philosopher
Nick Bostrom argued that if one gives almost any goal to a sufficiently powerful AI, it may choose to destroy humanity to achieve it (he used the example of a
paperclip factory manager).[278] Stuart Russell gives the example of a household robot that tries to find a way to kill its owner to prevent it from being unplugged, reasoning that "you can't fetch the coffee if you're dead."[279] In order to be safe for humanity, a
superintelligence would have to be genuinely
aligned with humanity's morality and values so that it is "fundamentally on our side".[280]
Second,
Yuval Noah Harari argues that AI does not require a robot body or physical control to pose an existential risk. The essential parts of civilization are not physical. Things like
ideologies,
law,
government,
money and the
economy are made of
language; they exist because there are stories that billions of people believe. The current prevalence of
misinformation suggests that an AI could use language to convince people to believe anything, even to take actions that are destructive.[281]
The opinions amongst experts and industry insiders are mixed, with sizable fractions both concerned and unconcerned by risk from eventual superintelligent AI.[282] Personalities such as
Stephen Hawking,
Bill Gates, and
Elon Musk have expressed concern about existential risk from AI.[283]
In the early 2010s, experts argued that the risks are too distant in the future to warrant research, or that humans will be valuable from the perspective of a superintelligent machine.[284]
However, after 2016, the study of current and future risks and possible solutions became a serious area of research.[241]
In 2023, AI pioneers including
Geoffrey Hinton,
Yoshua Bengio,
Demis Hassabis, and
Sam Altman issued the joint statement that "Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war"; some others such as
Yann LeCun consider this to be unfounded.[285]
The word "robot" itself was coined by
Karel Čapek in his 1921 play R.U.R., the title standing for "Rossum's Universal Robots".
Thought-capable artificial beings have appeared as storytelling devices since antiquity,[288]
and have been a persistent theme in
science fiction.[289]
Isaac Asimov introduced the
Three Laws of Robotics in many books and stories, most notably the "Multivac" series about a super-intelligent computer of the same name. Asimov's laws are often brought up during lay discussions of machine ethics;[291]
while almost all artificial intelligence researchers are familiar with Asimov's laws through popular culture, they generally consider the laws useless for many reasons, one of which is their ambiguity.[292]
AI safety – Research area on making AI safe and beneficial
AI alignment – Conformance to the intended objective
Artificial intelligence in healthcare – Machine-learning algorithms and software in the analysis, presentation, and comprehension of complex medical and health care data
^It is among the reasons that
expert systems proved to be inefficient for capturing knowledge.[27][28]
^"Rational agent" is a general term used in
economics,
philosophy and theoretical artificial intelligence. It can refer to anything that directs its behavior to accomplish goals, such as a person, an animal, a corporation, a nation, or, in the case of AI, a computer program.
^Alan Turing discussed the centrality of learning as early as 1950, in his classic paper "
Computing Machinery and Intelligence".[40] In 1956, at the original Dartmouth AI summer conference,
Ray Solomonoff wrote a report on unsupervised probabilistic machine learning: "An Inductive Inference Machine".[41]
^
Compared with symbolic logic, formal Bayesian inference is computationally expensive. For inference to be tractable, most observations must be
conditionally independent of one another.
AdSense uses a Bayesian network with over 300 million edges to learn which ads to serve.[85]
^Expectation-maximization, one of the most popular algorithms in machine learning, allows clustering in the presence of unknown
latent variables.[87]
^Russell and
Norvig suggest the alternative term "computational graphs" – that is, an abstract network (or "
graph") where the edges and nodes are assigned numeric values.
^Geoffrey Hinton said, of his work on neural networks in the 1990s, "our labeled datasets were thousands of times too small. [And] our computers were millions of times too slow".[120]
^The
Smithsonian reports: "Pluribus has bested poker pros in a series of six-player no-limit Texas Hold'em games, reaching a milestone in artificial intelligence research. It is the first bot to beat humans in a complex multiplayer competition."[136]
^Moritz Hardt (a director at the
Max Planck Institute for Intelligent Systems) argues that machine learning "is fundamentally the wrong tool for a lot of domains, where you're trying to design interventions and mechanisms that change the world."[164]
^When the law was passed in 2018, it still contained a form of this provision.
^See table 4; 9% is both the OECD average and the US average.[186]
^"Electronic brain" was the term used by the press around this time.[209]
^
Daniel Crevier wrote, "the conference is generally recognized as the official birthdate of the new science."[212] Russell and
Norvig called the conference "the inception of artificial intelligence."[211]
^Russell and
Norvig wrote "for the next 20 years the field would be dominated by these people and their students."[213]
^Russell and
Norvig wrote "it was astonishing whenever a computer did anything kind of smartish".[214]
^Matteo Wong wrote in
The Atlantic: "Whereas for decades, computer-science fields such as natural-language processing, computer vision, and robotics used extremely different methods, now they all use a programming method called "deep learning." As a result, their code and approaches have become more similar, and their models are easier to integrate into one another."[234]
^Jack Clark wrote in
Bloomberg: "After a half-decade of quiet breakthroughs in artificial intelligence, 2015 has been a landmark year. Computers are smarter and learning faster than ever," and noted that the number of software projects that use machine learning at
Google increased from a "sporadic usage" in 2012 to more than 2,700 projects in 2015.[236]
^Nils Nilsson wrote in 1983: "Simply put, there is wide disagreement in the field about what AI is all about."[251]
^
Daniel Crevier wrote that "time has proven the accuracy and perceptiveness of some of Dreyfus's comments. Had he formulated them less aggressively, constructive actions they suggested might have been taken much earlier."[256]
^
Searle presented this definition of "Strong AI" in 1999.[266] Searle's original formulation was "The appropriately programmed computer really is a mind, in the sense that computers given the right programs can be literally said to understand and have other cognitive states."[267] Strong AI is defined similarly by
Russell and
Norvig: "Strong AI – the assertion that machines that do so are actually thinking (as opposed to simulating thinking)."[268]
Newquist, HP (1994). The Brain Makers: Genius, Ego, And Greed In The Quest For Machines That Think. New York: Macmillan/SAMS. ISBN 978-0-672-30412-5.
Nilsson, Nils (2009). The Quest for Artificial Intelligence: A History of Ideas and Achievements. New York: Cambridge University Press. ISBN 978-0-521-12293-1.
Cybenko, G. (1988). Continuous valued neural networks with two hidden layers are sufficient (Report). Department of Computer Science, Tufts University.
Gers, Felix A.; Schraudolph, Nicol N.; Schmidhuber, Jürgen (2002). "Learning Precise Timing with LSTM Recurrent Networks" (PDF). Journal of Machine Learning Research. 3: 115–143. Archived (PDF) from the original on 9 October 2022. Retrieved 13 June 2017.
Goertzel, Ben; Lian, Ruiting; Arel, Itamar; de Garis, Hugo; Chen, Shuo (December 2010). "A world survey of artificial brain projects, Part II: Biologically inspired cognitive architectures". Neurocomputing. 74 (1–3): 30–49. doi:10.1016/j.neucom.2010.08.012.
Robinson, A. J.; Fallside, F. (1987), "The utility driven dynamic error propagation network.", Technical Report CUED/F-INFENG/TR.1, Cambridge University Engineering Department
Williams, R. J.; Zipser, D. (1994), "Gradient-based learning algorithms for recurrent networks and their computational complexity", Back-propagation: Theory, Architectures and Applications, Hillsdale, NJ: Erlbaum
Goodfellow, Ian; Bengio, Yoshua; Courville, Aaron (2016), Deep Learning, MIT Press, archived from the original on 16 April 2016, retrieved 12 November 2017
Hinton, G.; Deng, L.; Yu, D.; Dahl, G.; Mohamed, A.; Jaitly, N.; Senior, A.; Vanhoucke, V.; Nguyen, P.; Sainath, T.; Kingsbury, B. (2012). "Deep Neural Networks for Acoustic Modeling in Speech Recognition – The shared views of four research groups". IEEE Signal Processing Magazine. 29 (6): 82–97. Bibcode:2012ISPM...29...82H. doi:10.1109/msp.2012.2205597. S2CID 206485943.
Linnainmaa, Seppo (1970). The representation of the cumulative rounding error of an algorithm as a Taylor expansion of the local rounding errors (Thesis) (in Finnish). Univ. Helsinki, 6–7.
Griewank, Andreas (2012). "Who Invented the Reverse Mode of Differentiation? Optimization Stories". Documenta Mathematica, Extra Volume ISMP: 389–400.
Werbos, Paul (1974). Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences (Ph.D. thesis). Harvard University.
Merkle, Daniel; Middendorf, Martin (2013). "Swarm Intelligence". In Burke, Edmund K.; Kendall, Graham (eds.). Search Methodologies: Introductory Tutorials in Optimization and Decision Support Techniques. Springer Science & Business Media. ISBN 978-1-4614-6940-7.
Galvan, Jill (1 January 1997). "Entering the Posthuman Collective in Philip K. Dick's "Do Androids Dream of Electric Sheep?"". Science Fiction Studies. 24 (3): 413–429. JSTOR 4240644.
Arntz, Melanie; Gregory, Terry; Zierahn, Ulrich (2016), "The risk of automation for jobs in OECD countries: A comparative analysis", OECD Social, Employment, and Migration Working Papers 189
Morgenstern, Michael (9 May 2015). "Automation and anxiety". The Economist. Archived from the original on 12 January 2018. Retrieved 13 January 2018.
Law Library of Congress (U.S.). Global Legal Research Directorate, issuing body. (2019). Regulation of artificial intelligence in selected jurisdictions. LCCN 2019668143. OCLC 1110727808.
Barfield, Woodrow; Pagallo, Ugo (2018). Research handbook on the law of artificial intelligence. Cheltenham, UK. ISBN 978-1-78643-904-8. OCLC 1039480085.
Iphofen, Ron; Kritikos, Mihalis (3 January 2019). "Regulating artificial intelligence and robotics: ethics by design in a digital society". Contemporary Social Science. 16 (2): 170–184. doi:10.1080/21582041.2018.1563803. ISSN 2158-2041. S2CID 59298502.
Omohundro, Steve (2008). The Nature of Self-Improving Artificial Intelligence. presented and distributed at the 2007 Singularity Summit, San Francisco, CA.
"Kismet". MIT Artificial Intelligence Laboratory, Humanoid Robotics Group.
Archived from the original on 17 October 2014. Retrieved 25 October 2014.
Smoliar, Stephen W.; Zhang, HongJiang (1994). "Content based video indexing and retrieval". IEEE MultiMedia. 1 (2): 62–72. doi:10.1109/93.311653. S2CID 32710913.
Neumann, Bernd; Möller, Ralf (January 2008). "On scene interpretation with description logics". Image and Vision Computing. 26 (1): 82–101. doi:10.1016/j.imavis.2007.08.013. S2CID 10767011.
McGarry, Ken (1 December 2005). "A survey of interestingness measures for knowledge discovery". The Knowledge Engineering Review. 20 (1): 39–61. doi:10.1017/S0269888905000408. S2CID 14987656.
Bertini, M; Del Bimbo, A; Torniai, C (2006). "Automatic annotation and semantic retrieval of video sequences using multimedia ontologies". MM '06 Proceedings of the 14th ACM international conference on Multimedia. 14th ACM international conference on Multimedia. Santa Barbara: ACM. pp. 679–682.
Turing, Alan (1948), "Machine Intelligence", in Copeland, B. Jack (ed.), The Essential Turing: The ideas that gave birth to the computer age, Oxford: Oxford University Press, p. 412, ISBN 978-0-19-825080-7
Pennachin, C.; Goertzel, B. (2007). "Contemporary Approaches to Artificial General Intelligence". Artificial General Intelligence. Cognitive Technologies. Berlin, Heidelberg: Springer. pp. 1–30. doi:10.1007/978-3-540-68677-4_1. ISBN 978-3-540-23733-4.
Ransbotham, Sam; Kiron, David; Gerbert, Philipp; Reeves, Martin (6 September 2017). "Reshaping Business With Artificial Intelligence". MIT Sloan Management Review. Archived from the original on 19 May 2018. Retrieved 2 May 2018.
Lorica, Ben (18 December 2017). "The state of AI adoption". O'Reilly Media. Archived from the original on 2 May 2018. Retrieved 2 May 2018.
Butler, Samuel (13 June 1863). "Darwin among the Machines". Letters to the Editor. The Press. Christchurch, New Zealand. Archived from the original on 19 September 2008. Retrieved 16 October 2014 – via Victoria University of Wellington.
Fearn, Nicholas (2007). The Latest Answers to the Oldest Questions: A Philosophical Adventure with the World's Greatest Thinkers. New York: Grove Press. ISBN 978-0-8021-1839-4.
NRC (United States National Research Council) (1999). "Developments in Artificial Intelligence". Funding a Revolution: Government Support for Computing Research. National Academy Press.
Solomonoff, Ray (1956). An Inductive Inference Machine (PDF). Dartmouth Summer Research Conference on Artificial Intelligence. Archived (PDF) from the original on 26 April 2011. Retrieved 22 March 2011 – via std.com, pdf scanned copy of the original. Later published as Solomonoff, Ray (1957). "An Inductive Inference Machine". IRE Convention Record. Vol. Section on Information Theory, part 2. pp. 56–62.
Wason, P. C.; Shapiro, D. (1966). "Reasoning". In Foss, B. M. (ed.). New horizons in psychology. Harmondsworth: Penguin.
Archived from the original on 26 July 2020. Retrieved 18 November 2019.
Cukier, Kenneth, "Ready for Robots? How to Think about the Future of AI", Foreign Affairs, vol. 98, no. 4 (July/August 2019), pp. 192–98. George Dyson, historian of computing, writes (in what might be called "Dyson's Law") that "Any system simple enough to be understandable will not be complicated enough to behave intelligently, while any system complicated enough to behave intelligently will be too complicated to understand." (p. 197.) Computer scientist Alex Pentland writes: "Current AI machine-learning algorithms are, at their core, dead simple stupid. They work, but they work by brute force." (p. 198.)
Domingos, Pedro, "Our Digital Doubles: AI will serve our species, not control it", Scientific American, vol. 319, no. 3 (September 2018), pp. 88–93.
Gertner, Jon. (2023) "Wikipedia's Moment of Truth: Can the online encyclopedia help teach A.I. chatbots to get their facts right — without destroying itself in the process?" New York Times Magazine (July 18, 2023)
Johnston, John (2008) The Allure of Machinic Life: Cybernetics, Artificial Life, and the New AI, MIT Press.
Gary Marcus, "Artificial Confidence: Even the newest, buzziest systems of artificial general intelligence are stymied by the same old problems", Scientific American, vol. 327, no. 4 (October 2022), pp. 42–45.
Mitchell, Melanie (2019). Artificial intelligence: a guide for thinking humans. New York: Farrar, Straus and Giroux. ISBN 9780374257835.
Eka Roivainen, "AI's IQ: ChatGPT aced a [standard intelligence] test but showed that intelligence cannot be measured by IQ alone", Scientific American, vol. 329, no. 1 (July/August 2023), p. 7. "Despite its high IQ, ChatGPT fails at tasks that require real humanlike reasoning or an understanding of the physical and social world.... ChatGPT seemed unable to reason logically and tried to rely on its vast database of... facts derived from online texts."
Ashish Vaswani, Noam Shazeer, Niki Parmar et al. "Attention is all you need." Advances in neural information processing systems 30 (2017). Seminal paper on transformers.