large language model

description: a language model trained on large amounts of text

14 results

pages: 321 words: 113,564

AI in Museums: Reflections, Perspectives and Applications
by Sonja Thiel and Johannes C. Bernhardt
Published 31 Dec 2023

This issue highlights the importance of considering ethical implications in the use of AI-generated images and the need for greater transparency and communication regarding the use of artists’ works in AI development. The use and application of generative AI with multimodal models falls within a broader ongoing debate surrounding large language models. Several AI researchers have issued an open call for a moratorium on the development of large language models such as ChatGPT or GPT for at least six months until further research on the technology has been conducted (Open Letter n.d.). In addition to ethical concerns regarding the data used, there are overarching debates surrounding issues such as the potential loss of jobs, particularly for illustrators, who may feel threatened by the technology of prompt engineering and text-to-image generators.

Johannes C. Bernhardt and Sonja Thiel: The present volume stems from the conference Cultures of Artificial Intelligence: New Perspectives for Museums, which took place at the Badisches Landesmuseum in Karlsruhe on 1 and 2 December 2022 and was simultaneously streamed on the web. Artificial intelligence is not yet a mainstream topic in the cultural world, but it does feature in general debates about digitization and digitality. The use of machine learning, neural networks, and large language models has, however, been growing for years, contrary to common assumptions. Beyond a few prominent flagship projects, initial surveys of the international museum landscape list many hundreds of projects addressing issues of traditional museum work and the digitality debate by means of new approaches. The number is continually increasing, and it is not always easy to obtain an overview of all the developments.

If one speaks less sweepingly of systems that follow algorithmic rules, recognize patterns in data, and solve specific tasks, the challenges to human intelligence and related categories such as thinking, consciousness, reason, creativity, or intentionality pose themselves less sharply. The only thing that has changed dramatically in recent years is that such systems, from simple machine learning to neural networks and large language models, have achieved a level of complexity and efficiency that often produces astonishing results. But to view this correctly, it is necessary to think the other way round from Turing: the results may look intelligent, but they are not. And this leads to the core of the problem for the cultural sector and the still missing piece in Brown’s enigmatic museum scene: the approaches of AI are based on mathematical principles, logic, and probabilities, while culture is about the negotiation of meaning and ambivalence.

pages: 444 words: 117,770

The Coming Wave: Technology, Power, and the Twenty-First Century's Greatest Dilemma
by Mustafa Suleyman
Published 4 Sep 2023

From the book’s index, the entries relevant to this search: language, 27, 157 (see also large language models); large language models (LLMs), 62–65: bias in, 69–70, 239–40; capabilities of, 64–65; deepfakes and, 170; efficiency of, 68; open source and, 69; scale of, 65–66; synthetic biology and, 91; LLaMA system, 69.

They feature in shops, schools, hospitals, offices, courts, and homes. You already interact many times a day with AI; soon it will be many more, and almost everywhere it will make experiences more efficient, faster, more useful, and frictionless. AI is already here. But it’s far from done. AUTOCOMPLETE EVERYTHING: THE RISE OF LARGE LANGUAGE MODELS It wasn’t long ago that processing natural language seemed too complex, too varied, too nuanced for modern AI. Then, in November 2022, the AI research company OpenAI released ChatGPT. Within a week it had more than a million users and was being talked about in rapturous terms, a technology so seamlessly useful it might eclipse Google Search in short order.

Back in 2017 a small group of researchers at Google was focused on a narrower version of this problem: how to get an AI system to focus only on the most important parts of a data series in order to make accurate and efficient predictions about what comes next. Their work laid the foundation for what has been nothing short of a revolution in the field of large language models (LLMs)—including ChatGPT. LLMs take advantage of the fact that language data comes in a sequential order. Each unit of information is in some way related to data earlier in a series. The model reads very large numbers of sentences, learns an abstract representation of the information contained within them, and then, based on this, generates a prediction about what should come next.
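
A minimal sketch of that attention mechanism in Python with NumPy. Everything here is a stand-in: the embeddings and projection matrices are random rather than learned, so it shows the shape of the computation, not a trained model. Each position builds a weighted summary of earlier positions, and the weights express which parts of the sequence matter most for predicting what comes next.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: a sequence of 5 tokens, each embedded as an 8-dim vector.
seq_len, d = 5, 8
x = rng.normal(size=(seq_len, d))            # token embeddings (random stand-ins)

# Random projections play the role of learned weight matrices.
W_q, W_k, W_v = (rng.normal(size=(d, d)) for _ in range(3))
Q, K, V = x @ W_q, x @ W_k, x @ W_v          # queries, keys, values

# Scaled dot-product scores: how relevant is position j to position i?
scores = Q @ K.T / np.sqrt(d)

# Causal mask: when predicting what comes next, a position may only
# look backwards, never at future tokens.
future = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
scores[future] = -np.inf

# Softmax over each row turns scores into attention weights.
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)

context = weights @ V                        # weighted mix of earlier positions

# The last row shows how much each earlier token would contribute
# when predicting the token after position 5.
print(np.round(weights[-1], 2))
```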

pages: 848 words: 227,015

On the Edge: The Art of Risking Everything
by Nate Silver
Published 12 Aug 2024

It was an expensive and audacious bet—the funders originally pledged to commit $1 billion to it on a completely unproven technology after many “AI winters.” It inherently did seem ridiculous—until the very moment it didn’t. “Large language models seem completely magic right now,” said Stephen Wolfram, a pioneering computer scientist who founded Wolfram Research in 1987. (Wolfram more recently designed a plug-in that works with GPT-4 to essentially translate words into mathematical equations.) “Even last year, what large language models were doing was kind of babbling and not very interesting,” he said when we spoke in 2023. “And then suddenly this threshold was passed, where, gosh, it seems like human-level text generation.”

And we can usually tell when someone lacks this experience—say, an academic who’s built a model but never had to put it to the test. *24 The technical term for this quality is “interpretability”; the interpretability of LLMs is poor. *25 Timothy Lee makes the same comparison in his outstanding AI explainer, “Large language models, explained with a minimum of math and jargon,” understandingai.org/p/large-language-models-explained-with [inactive]. That’s the first place I’d recommend if you want to go beyond my symphony analogy to an LLM 101 class. For a more math-intensive, LLM 201 approach, I’d recommend Stephen Wolfram’s “What Is ChatGPT Doing…and Why Does It Work?”; stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work [inactive].

I return to SBF as he meets his fate in a New York courtroom and makes another bad bet. No spoilers, but the chapter ends with a bang. Chapter ∞, Termination, is the first of a two-part conclusion. I’ll introduce you to another Sam, OpenAI CEO Sam Altman, and others behind the development of ChatGPT and other large language models. Unlike the government-run Manhattan Project, the charge into the frontiers of AI is being led by Silicon Valley “techno-optimists” with their Riverian attitude toward risk and reward. Even as, by some indications, the world has entered an era of stagnation, both AI optimists like Altman and AI “doomers” think that civilization is on the brink of a hinge point not seen since the atomic bomb, and that AI is a technological bet with existential stakes.

The Singularity Is Nearer: When We Merge with AI
by Ray Kurzweil
Published 25 Jun 2024

Stephen Nellis, “Nvidia Shows New Research on Using AI to Improve Chip Designs,” Reuters, March 27, 2023, https://www.reuters.com/technology/nvidia-shows-new-research-using-ai-improve-chip-designs-2023-03-28. Blaise Aguera y Arcas, “Do Large Language Models Understand Us?,” Medium, December 16, 2021, https://medium.com/@blaisea/do-large-language-models-understand-us-6f881d6d8e75. With better algorithms, the amount of training compute needed to achieve a given level of performance decreases. A growing body of research suggests that for many applications, algorithmic progress is roughly as important as hardware progress.

Algorithmic innovations and the emergence of big data have allowed AI to achieve startling breakthroughs sooner than even experts expected—from mastering games like Jeopardy! and Go to driving automobiles, writing essays, passing bar exams, and diagnosing cancer. Now, powerful and flexible large language models like GPT-4 and Gemini can translate natural-language instructions into computer code—dramatically reducing the barrier between humans and machines. By the time you read this, tens of millions of people likely will have experienced these capabilities firsthand. Meanwhile, the cost to sequence a human’s genome has fallen by about 99.997 percent, and neural networks have begun unlocking major medical discoveries by simulating biology digitally.

My prediction that we’ll achieve this by 2029 has been consistent since my 1999 book The Age of Spiritual Machines, published at a time when many observers thought this milestone would never be reached.[10] Until recently this projection was considered extremely optimistic in the field. For example, a 2018 survey found an aggregate prediction among AI experts that human-level machine intelligence would not arrive until around 2060.[11] But the latest advances in large language models have rapidly shifted expectations. As I was writing early drafts of this book, the consensus on Metaculus, the world’s top forecasting website, hovered between the 2040s and the 2050s. But surprising AI progress over the past two years upended expectations, and by May 2022 the Metaculus consensus exactly agreed with me on the 2029 date.[12] Since then it has even fluctuated to as soon as 2026, putting me technically in the slow-timelines camp!

pages: 336 words: 91,806

Code Dependent: Living in the Shadow of AI
by Madhumita Murgia
Published 20 Mar 2024

ChatGPT and all other conversational AI chatbots have a disclaimer that warns users about the hallucination problem, pointing out that large language models sometimes make up facts. ChatGPT, for instance, has a warning on its webpage: ‘ChatGPT may produce inaccurate information about people, places, or facts.’ Judge Castel: Do you have something new to say? Schwartz’s lawyer: Yes. The public needs a stronger warning. * Making up facts wasn’t people’s greatest worry about large language models. These powerful language engines could be trained to comb through all sorts of information – financial, biological and chemical – and generate predictions based on it.

Amid all the early hype and frenzy were some of the same limitations of several other AI systems I’d already observed, reproducing for example the prejudices of those who created it, like facial-recognition cameras that misidentify darker faces, or the predictive policing systems in Amsterdam that targeted families of single mothers in immigrant neighbourhoods. But this new form of AI also brought entirely new challenges. The technology behind ChatGPT, known as a large language model or LLM, was not a search engine looking up facts; it was a pattern-spotting engine guessing the next best option in a sequence.5 Because of this inherent predictive nature, LLMs can also fabricate or ‘hallucinate’ information in unpredictable and flagrant ways. They can generate made-up numbers, names, dates, quotes – even web links or entire articles – confabulations of bits of existing content into illusory hybrids.6 Users of LLMs have shared examples of links to non-existent news articles in the Financial Times and Bloomberg, made-up references to research papers, the wrong authors for published books and biographies riddled with factual mistakes.
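
A toy illustration of that predictive mechanism in Python. The three “training” sentences are invented stand-ins for a corpus: a bigram model that only tracks which word tends to follow which can stitch fragments of real sentences into a fluent sequence that no source ever asserted, which is the confabulation mechanism described above.

```python
import random
from collections import defaultdict

random.seed(3)

# Invented stand-in corpus of three headlines.
corpus = [
    "the bank reported record profits in march",
    "the bank denied the merger in march",
    "regulators reported the merger was blocked",
]

# Count which word follows which: a tiny "pattern-spotting engine".
follows = defaultdict(list)
for sentence in corpus:
    words = sentence.split()
    for a, b in zip(words, words[1:]):
        follows[a].append(b)

# Generate by repeatedly guessing a plausible next word.
word, output = "the", ["the"]
for _ in range(6):
    if word not in follows:
        break
    word = random.choice(follows[word])
    output.append(word)

print(" ".join(output))
# One possible output: "the bank reported the merger was blocked":
# fluent and plausible, yet asserted by no sentence in the corpus.
```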

From the book’s index, the entries relevant to this search: GPT (Generative Pre-trained Transformer) ref1, ref2, ref3, ref4; GPT-4 ref1; hallucination problem ref1, ref2, ref3; large language model (LLM) ref1, ref2, ref3.

Four Battlegrounds
by Paul Scharre
Published 18 Jan 2023

See also Or Sharir et al., The Cost of Training NLP Models: A Concise Overview (AI21 Labs, April 19, 2020), https://arxiv.org/pdf/2004.08900.pdf, 2. Some research suggests that the optimal balance with a fixed amount of compute would be to scale training data size and model size equally and that many recent large language models would perform better if a smaller model were trained on a larger dataset. Jordan Hoffmann et al., Training Compute-Optimal Large Language Models (arXiv.org, March 29, 2022), https://arxiv.org/pdf/2203.15556.pdf. For an analysis of overall trends in dataset size in machine learning research, see Pablo Villalobos, “Trends in Training Dataset Sizes,” Epoch, September 20, 2022, https://epochai.org/blog/trends-in-training-dataset-sizes.
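
A back-of-the-envelope Python sketch of that compute-optimal tradeoff. The cost rule C ≈ 6 × N × D is a standard approximation for transformer training, and the ratio of roughly 20 training tokens per parameter is a rule of thumb commonly read off the Hoffmann et al. results; neither number comes from the note above, and the budgets are arbitrary examples.

```python
import math

def compute_optimal(flops_budget, tokens_per_param=20):
    """Split a fixed compute budget between model size and data size.

    Rough approximations, not the paper's fitted formula:
      C ~ 6 * N * D   (C = training FLOPs, N = parameters, D = tokens)
      D ~ 20 * N      (rule of thumb associated with Hoffmann et al.)
    Both N and D then grow as sqrt(C), i.e. they are "scaled equally".
    """
    n = math.sqrt(flops_budget / (6 * tokens_per_param))
    return n, tokens_per_param * n

for budget in (1e21, 6e23, 1e25):    # arbitrary example budgets
    n, d = compute_optimal(budget)
    print(f"C = {budget:.0e} FLOPs -> N ~ {n:.1e} params, D ~ {d:.1e} tokens")

# A model holding N far above this value for its budget is undertrained:
# the same compute spent on a smaller model and more tokens scores better,
# which is the point the note above makes.
```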

In supervised learning, an algorithm is trained on labeled data. For example, an image classification algorithm may be trained on labeled pictures. Over many iterations, the algorithm learns to associate the image with the label. Unsupervised learning is when an algorithm is trained on unlabeled data and the algorithm learns patterns in the data. Large language models such as GPT-2 and GPT-3 use unsupervised learning. Once trained, they can output sentences and whole paragraphs based on patterns they’ve learned from the text on which they’ve been trained. Reinforcement learning is when an algorithm learns by interacting with its environment and gets rewards for certain behaviors.
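
A compressed sketch of the three paradigms in plain Python; the tiny datasets, cluster starting points, and reward probabilities are all invented for illustration.

```python
import random

random.seed(0)

# Supervised: learn from labeled examples (feature -> label).
labeled = [(1.0, "cat"), (1.2, "cat"), (3.9, "dog"), (4.1, "dog")]
def classify(x):
    # Nearest labeled example stands in for a trained classifier.
    return min(labeled, key=lambda pair: abs(pair[0] - x))[1]
print(classify(3.5))                        # -> "dog"

# Unsupervised: find structure in unlabeled data (crude 1-D k-means).
data = [1.0, 1.2, 3.9, 4.1]
c1, c2 = 0.0, 5.0                           # arbitrary starting centres
for _ in range(10):
    g1 = [v for v in data if abs(v - c1) <= abs(v - c2)]
    g2 = [v for v in data if abs(v - c1) > abs(v - c2)]
    c1, c2 = sum(g1) / len(g1), sum(g2) / len(g2)
print(round(c1, 2), round(c2, 2))           # two clusters, no labels used

# Reinforcement: learn from rewards for actions (epsilon-greedy bandit).
payout = {"left": 0.3, "right": 0.7}        # hidden reward probabilities
value = {"left": 0.0, "right": 0.0}         # the agent's running estimates
for _ in range(500):
    if random.random() < 0.1:               # explore occasionally
        arm = random.choice(list(payout))
    else:                                   # otherwise exploit the best guess
        arm = max(value, key=value.get)
    reward = 1 if random.random() < payout[arm] else 0
    value[arm] += 0.1 * (reward - value[arm])
print(max(value, key=value.get))            # -> usually "right"
```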

Shifts in the significance of these inputs could advantage some actors and disadvantage others, further altering the global balance of AI power. One of the most striking trends in AI basic research today is the tendency toward ever-larger models with increasingly massive datasets and compute resources for training. The rapid growth in size of large language models, for example, is remarkable. In October 2018, researchers at Google announced BERT-Large, a 340-million-parameter language model. It was trained on a database of 3.3 billion words using 64 TPU chips running for four days. A few months later, in February 2019, OpenAI announced GPT-2, a 1.5-billion-parameter model trained on 40 GB of text.

pages: 194 words: 57,434

The Age of AI: And Our Human Future
by Henry A Kissinger , Eric Schmidt and Daniel Huttenlocher
Published 2 Nov 2021

We postulated that by 2040, artificial intelligence would be perhaps a million times more powerful than it was in 2021, following Moore’s law, which predicts a doubling in computer processing power every two years. While increases in the power of AI are harder to quantify than increases in computing power, it appears that their growth is even more rapid. For example, the power of large language models, neural networks that underlie much of today’s natural language processing, is growing even more rapidly, tripling in fewer than two years. Microsoft’s Megatron-Turing model,2 released in late 2021, and Google’s PaLM,3 released in early 2022, each has more than 525 billion parameters compared to 175 billion for OpenAI’s GPT-3, which we wrote about in previous chapters and which was released in June of 2020.
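
A rough check of that growth claim, using only the parameter counts quoted in the paragraph and treating parameter count as the chapter's own crude proxy for model power:

```latex
% Growth factor from GPT-3 (June 2020) to Megatron-Turing (late 2021):
\[
  \frac{N_{\text{Megatron-Turing}}}{N_{\text{GPT-3}}}
  \approx \frac{525 \times 10^{9}}{175 \times 10^{9}} = 3
  \quad \text{in roughly } 1.4 \text{ years,}
\]
% versus Moore's-law doubling every two years over the same span:
\[
  2^{1.4/2} \approx 1.6\times,
\]
% consistent with the claim of a tripling in fewer than two years.
```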

Patent law in the United States and most nations recognizes only human inventors, potentially leading to situations in which there are inventions with no inventors (or perhaps in which people play a de minimis co-inventor role). National governments have recognized AI’s threat to language: Hungary has commissioned its own large language model so that Hungarian does not automatically become obsolete in the digital realm.9 Governments have also begun to grapple with digital networks’ dilution of communal and national identity. Some states, including China and Russia, have been more aggressive in this effort than others. The balance between risks of anarchy and risks of oppression is being struck differently in different places.

pages: 169 words: 41,887

Literary Theory for Robots: How Computers Learned to Write
by Dennis Yi Tenen
Published 6 Feb 2024

As someone who studies the development of artifice and intellect historically, I know the current moment of excitement over new technology will subside, diminished by the frenzy of its grifters and soothsayers. What remains will be more modest and more significant. Viewed in the light of collective human intellectual achievement, large language models are built on the foundation of public archives, libraries, and encyclopedias containing the composite work of numerous authors. Their synthesized voice fascinates me not so much for what it says, on average. Rather, I hear it as a rebuke to my own, dearly held beliefs about authorship, and therefore agency and creativity.

pages: 484 words: 104,873

Rise of the Robots: Technology and the Threat of a Jobless Future
by Martin Ford
Published 4 May 2015

Google’s development team began by focusing on official documents prepared by the United Nations and then extended their effort to the Web, where the company’s search engine was able to locate a multitude of examples that became fodder for their voracious self-learning algorithms. The sheer number of documents used to train the system dwarfed anything that had come before. Franz Och, the computer scientist who led the effort, noted that the team had built “very, very large language models, much larger than anyone has ever built in the history of mankind.”8 In 2005, Google entered its system in the annual machine translation competition held by the National Institute of Standards and Technology, an agency within the US Commerce Department that publishes measurement standards. Google’s machine learning algorithms were able to easily outperform the competition—which typically employed language and linguistic experts who attempted to actively program their translation systems to wade through the mire of conflicting and inconsistent grammatical rules that characterize languages.

pages: 396 words: 117,149

The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World
by Pedro Domingos
Published 21 Sep 2015

“Relevance weighting of search terms,”* by Stephen Robertson and Karen Sparck Jones (Journal of the American Society for Information Science, 1976), explains the use of Naïve Bayes–like methods in information retrieval. “First links in the Markov chain,” by Brian Hayes (American Scientist, 2013), recounts Markov’s invention of the eponymous chains. “Large language models in machine translation,”* by Thorsten Brants et al. (Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, 2007), explains how Google Translate works. “The PageRank citation ranking: Bringing order to the Web,”* by Larry Page, Sergey Brin, Rajeev Motwani, and Terry Winograd (Stanford University technical report, 1998), describes the PageRank algorithm and its interpretation as a random walk over the web.

pages: 370 words: 112,809

The Equality Machine: Harnessing Digital Technology for a Brighter, More Inclusive Future
by Orly Lobel
Published 17 Oct 2022

Timnit Gebru—a rising star in AI research with an extraordinary path from Ethiopia to Eritrea to political asylum in the United States, to three degrees at Stanford, to Apple, to Microsoft, and then to Google—was ousted from Google over a dispute with company executives about publishing an article on the potential risks and harms of large language models. The article warns that ever larger natural language processing models may be too large to monitor, that the sheer mass of the data becomes inscrutable. The paper calls for curating and documenting data sets “rather than ingesting everything on the web.”3 It also warns against spreading claims about models understanding language and concepts, as opposed to simply identifying patterns in human-made texts, numbers, and images.

pages: 666 words: 181,495

In the Plex: How Google Thinks, Works, and Shapes Our Lives
by Steven Levy
Published 12 Apr 2011

Och’s official role was as a scientist in Google’s research group, but it is indicative of Google’s view of research that no step was required to move beyond study into actual product implementation. Because Och and his colleagues knew they would have access to an unprecedented amount of data, they worked from the ground up to create a new translation system. “One of the things we did was to build very, very, very large language models, much larger than anyone has ever built in the history of mankind.” Then they began to train the system. To measure progress, they used a statistical model that, given a series of words, would predict the word that came next. Each time they doubled the amount of training data, they got a .5 percent boost in the metrics that measured success in the results.
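
A toy version of that measurement loop in Python. The vocabulary, the hidden word chain, and all the numbers are synthetic stand-ins, so the output only illustrates the shape of the doubling experiment, not Google's 0.5 percent figure: train a bigram next-word predictor on progressively doubled slices of text and measure how often it guesses the held-out next word.

```python
import random
from collections import Counter, defaultdict

random.seed(1)

# Synthetic corpus: generated from a hidden word chain with skewed
# follower probabilities, so there are real patterns to learn.
vocab = ["the", "cat", "dog", "sat", "ran", "on", "mat", "fast"]
true_next = {w: random.sample(vocab, len(vocab)) for w in vocab}
weights = [2.0 ** -i for i in range(len(vocab))]   # earlier = likelier

def generate(n):
    words, w = [], "the"
    for _ in range(n):
        w = random.choices(true_next[w], weights)[0]
        words.append(w)
    return words

corpus, held_out = generate(64_000), generate(8_000)

def accuracy(train, test):
    """Bigram model: predict each next word as its most frequent follower."""
    counts = defaultdict(Counter)
    for a, b in zip(train, train[1:]):
        counts[a][b] += 1
    best = {w: c.most_common(1)[0][0] for w, c in counts.items()}
    hits = sum(best.get(a) == b for a, b in zip(test, test[1:]))
    return hits / (len(test) - 1)

# Double the training data and re-measure, as Och's team did at scale.
n = 100
while n <= len(corpus):
    print(f"{n:>6} words -> next-word accuracy {accuracy(corpus[:n], held_out):.3f}")
    n *= 2
```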

pages: 562 words: 201,502

Elon Musk
by Walter Isaacson
Published 11 Sep 2023

This meant that his engineers were actually ahead of OpenAI in creating full-fledged artificial general intelligence, which requires both abilities. “Tesla’s real-world AI is underrated,” he said. “Imagine if Tesla and OpenAI had to swap tasks. They would have to make Self-Driving, and we would have to make large language-model chatbots. Who wins? We do.” In April, Musk assigned Babuschkin and his team three major goals. The first was to make an AI bot that could write computer code. A programmer could begin typing in any coding language, and the X.AI bot would auto-complete the task for the most likely action they were trying to take.

pages: 2,466 words: 668,761

Artificial Intelligence: A Modern Approach
by Stuart Russell and Peter Norvig
Published 14 Jul 2019

Brandt, F., Conitzer, V., Endriss, U., Lang, J., and Procaccia, A. D. (Eds.). (2016). Handbook of Computational Social Choice. Cambridge University Press. Brants, T. (2000). TnT: A statistical part-of-speech tagger. In Proc. Sixth Conference on Applied Natural Language Processing. Brants, T., Popat, A. C., Xu, P., Och, F. J., and Dean, J. (2007). Large language models in machine translation. In EMNLP-CoNLL-07. Bratko, I. (2009). Prolog Programming for Artificial Intelligence (4th edition). Addison-Wesley. Bratman, M. E. (1987). Intention, Plans, and Practical Reason. Harvard University Press. Breck, E., Cai, S., Nielsen, E., Salib, M., and Sculley, D. (2016).