labeled data


description: group of samples that have been tagged with one or more labels


82 results

The Infinity Machine: Demis Hassabis, DeepMind, and the Quest for Superintelligence

by Sebastian Mallaby  · 30 Mar 2026  · 607pp  · 161,998 words

-shot prompting and supervised fine-tuning, RLHF elevated post-training to a third level. Whereas fine-tuning was a classic deep-learning exercise—taking in labeled data and mapping questions to answers—RLHF drew on the tradition of reinforcement learning personified by David Silver. The model would learn by choosing an action

Prophecy: Prediction, Power, and the Fight for the Future, from Ancient Oracles to AI

by Carissa Véliz  · 21 Apr 2026  · 503pp  · 129,255 words

’s also why it makes so many mistakes. A machine learning algorithm was asked to learn to distinguish between huskies and wolves. After processing some labeled data as input, it got quite good at the task. But when researchers investigated further, they realized that the AI was focusing on the background, and

Machine Learning Design Patterns: Solutions to Common Challenges in Data Preparation, Model Building, and MLOps

by Valliappa Lakshmanan, Sara Robinson and Michael Munn  · 31 Oct 2020

advance. For example, this could include labeling an image as “cat” or labeling a baby as being 2.3 kg at birth. You feed this labeled data to your model in hopes that it can learn enough to label new examples. With unsupervised learning, you do not know the labels for your

tuning, it might be worth searching within this range. Autoencoders Training embeddings in a supervised way can be hard because it requires a lot of labeled data. For an image classification model like Inception to be able to produce useful image embeddings, it is trained on ImageNet, which has 14 million labeled

lower cardinality by using autoencoders as an auxiliary learning task. Then, we solve the actual image classification problem for which we typically have much less labeled data using the embedding produced by the auxiliary autoencoder task. This is likely to boost model performance, because now the model only has to learn the

to illustrate that, provided there is random assignment at work, the Neutral Class design pattern can help us avoid losing model accuracy because of arbitrarily labeled data. In the real world In real-world situations, things may not be precisely random as in the synthetic dataset, but the arbitrary assignment paradigm still

Tcl/Tk in a Nutshell

by Paul Raines and Jeff Tranter  · 25 Mar 1999  · 1,064pp  · 114,771 words

} set y {0.0 0.1 2.3 4.5 1.2 5.4 9.6} graph .g -title "Example Graph" .g element create x -label "Data Points" -xdata $x -ydata $y pack .g * * * [3] The format of this command may change for the final Version 2.4 to require a specific

Getting Things Done: The Art of Stress-Free Productivity

by David Allen  · 31 Dec 2002  · 300pp  · 79,315 words

kept on lists, but I still maintain two categories of paper-based reminders. I travel with a “Read/Review” plastic file folder and another one labeled “Data Entry.” In the latter I put anything for which the next action is simply to input data into my computer (business cards that need to

Ajax: The Definitive Guide

by Anthony T. Holdener  · 25 Jan 2008  · 982pp  · 221,145 words

in the first place. Example 11-9 shows the JavaScript necessary to perform such an action. Example 11-9. Switching out the label data /* Example 11-9. Switching out the label data. */ /** * This function, reloadForm, takes the XMLHttpRequest JSON server response * /xhrResponse/ from the server and sets it equal to the global <label> element

Python Data Analytics: With Pandas, NumPy, and Matplotlib

by Fabio Nelli  · 27 Sep 2018  · 688pp  · 107,867 words

built into Python or provided by other libraries, two new data structures were developed. These data structures are designed to work with relational data or labeled data, thus allowing you to manage data with features similar to those designed for SQL relational databases and Excel spreadsheets. Throughout the book in fact, you

The AI-First Company

by Ash Fontana  · 4 May 2021  · 296pp  · 66,815 words

market improved accuracy and reliability to potential patients. The company building the AI can work with the medical facility to get a critical mass of labeled data, get their models to the PUT, figure out how best to deliver the prediction through existing hardware, work through regulatory issues, and receive feedback from

highly specific datasets, whether through outsourcing, hiring people, or having existing employees use products that generate data. HUMAN GENERATED Data Labeling Many ML models require labeled data for training recognition algorithms. There are some promising transfer and semisupervised learning techniques that may provide alternatives to gathering a great deal of

labeled data, especially for generic domains such as image, video, and language understanding. However, the state of the art doesn’t seem to offer enough just yet,

. Accessing and owning processed data to feed models can be the single hardest problem in starting a vertical, AI-First business. Supervised ML models need labeled data. Getting lots of labeled examples for specific domains is hard. For example, where would you find a hundred thousand images of 2001 Chevy Silverado fenders

manufacturer, a chain of body shops, or an insurance company. In the absence of existing labeled datasets, build one. This entails building a team to label data, which may include both experts and nonexperts, and requires tools to efficiently label large volumes of data. There is a burgeoning area of management practices

saved through automation/Cost of each label) * # labels. Perhaps it’s helpful to think of this operation as a factory. The “good” it produces is labeled data. The factory manager’s job is to find efficiencies along the production line. Tools Labeling often requires engineers to clean data before applying the labels

procure from users. Uncertainty sampling. Labeling those points for which the current model is least certain. Query by committee. Train many models on the same labeled data. Then have people manually label the data points that caused the most disagreement in output between the models. Expected model change. Have people label the

the accuracy of a classifier even if any one of those labels isn’t necessarily correct. A large volume of labeled data can also be an asset itself. Thus, tracking the total labeled data points can be informative of the value produced by the labeling operation. Labels aren’t free, and business models need

generators take a single object and offer unlimited perspectives by, for example, modeling the object in 3-D and then moving around it, generating a labeled data point at each step. Accessibility Labeling objects is often feasible because pictures of them are readily available, as with cars on a street. However, some

of such an object and drop it into various environments. Building such a generator can be expensive, but the cost can be amortized over all labeled data points because the one generator is used to produce many examples of the same object. These generators are typically built using the same tools that

agents click to label an email as “sensitive” if they think the customer who wrote it is particularly angry and needs attention in short order: labeled data to train the ML models to prioritize responses. Vertically integrating domain experts by hiring them to implement systems yields better ideas and better data to
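The excerpt above on uncertainty sampling and query by committee can be sketched in a few lines. This is a minimal illustration, not the book's method: the function names, the binary-probability input, and the majority-vote disagreement score are all assumptions made for the example.

```python
from collections import Counter


def uncertainty_sample(probabilities, k):
    """Return indices of the k points whose predicted positive-class
    probability is closest to 0.5, i.e. where a binary model is least certain."""
    ranked = sorted(range(len(probabilities)),
                    key=lambda i: abs(probabilities[i] - 0.5))
    return ranked[:k]


def committee_disagreement(votes):
    """Fraction of committee members that disagree with the majority label
    for a single data point (an assumed, simple disagreement score)."""
    majority_count = Counter(votes).most_common(1)[0][1]
    return 1 - majority_count / len(votes)


def query_by_committee(all_votes, k):
    """Return indices of the k points whose committee votes disagree the most."""
    ranked = sorted(range(len(all_votes)),
                    key=lambda i: committee_disagreement(all_votes[i]),
                    reverse=True)
    return ranked[:k]


# Hypothetical model outputs for four unlabeled points.
to_label = uncertainty_sample([0.95, 0.52, 0.10, 0.45], k=2)

# Hypothetical votes from three models on three unlabeled points.
most_contested = query_by_committee(
    [["a", "a", "a"], ["a", "b", "a"], ["a", "b", "c"]], k=1)
```

In both cases the returned indices are the points that would be sent to human labelers first, which is how these strategies aim to get more value out of each paid label.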

Text Analytics With Python: A Practical Real-World Approach to Gaining Actionable Insights From Your Data

by Dipanjan Sarkar  · 1 Dec 2016

involves several steps which we will be discussing in detail later in this chapter. Briefly, for a supervised classification problem, we need to have some labelled data that we could use for training a text classification model. This data would essentially be curated documents that are already assigned to some specific class

Artificial Intelligence: A Modern Approach

by Stuart Russell and Peter Norvig  · 14 Jul 2019  · 2,466pp  · 668,761 words

such systems can reach a high level of test-set accuracy—as shown by the ImageNet competition results, for example—they often require far more labeled data than a human would for the same task. For example, a child needs to see only one picture of a giraffe, rather than thousands, in

learning story; indeed, it may be the case that our current approach to supervised deep learning renders some tasks completely unattainable because the requirements for labeled data would exceed what the human race (or the universe) can supply. Moreover, even in cases where the task is feasible, labeling large data sets usually

requires scarce and expensive human labor. For these reasons, there is intense interest in several learning paradigms that reduce the dependence on labeled data. As we saw in Chapter 19, these paradigms include unsupervised learning, transfer learning, and semisupervised learning. Unsupervised learning algorithms learn solely from unlabeled inputs x

data to train an initial version of an NLP model. From there, we can use a smaller amount of domain-specific data (perhaps including some labeled data) to refine the model. The refined model can learn the vocabulary, idioms, syntactic structures, and other linguistic phenomena that are specific to the new domain

. During training a single sentence can be used multiple times with different words masked out. The beauty of this approach is that it requires no labeled data; the sentence provides its own label for the masked word. If this model is trained on a large corpus of text, it generates pretrained representations

that refer to very precisely delineated activities on simple backgrounds, is quite easy to deal with. Good results can be obtained with a lot of labeled data and an appropriate convolutional neural network. However, it can be difficult to prove that the methods actually work, because they rely so strongly on context
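The masked-word excerpt above says a sentence "provides its own label". A minimal sketch of that idea follows; the helper name, the `[MASK]` token string, whitespace tokenization, and the fixed random seed are assumptions for illustration, and real systems typically mask subword tokens rather than whole words.

```python
import random


def make_masked_examples(sentence, n_examples, mask_token="[MASK]", seed=0):
    """Build (masked_sentence, target_word) training pairs from one sentence.

    No human labeling is involved: the word that was hidden is the label.
    The same sentence can yield several examples with different words masked.
    """
    words = sentence.split()
    rng = random.Random(seed)
    examples = []
    for _ in range(n_examples):
        position = rng.randrange(len(words))  # pick a word to hide
        masked = words.copy()
        masked[position] = mask_token
        examples.append((" ".join(masked), words[position]))
    return examples


sentence = "the river rose five feet"
pairs = make_masked_examples(sentence, n_examples=3)
for masked_sentence, target in pairs:
    print(masked_sentence, "->", target)
```

Each pair is an ordinary supervised example (input text, target word), which is why a large unlabeled corpus is enough to train such a model.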

Data Mining: Concepts, Models, Methods, and Algorithms

by Mehmed Kantardzić  · 2 Jan 2003  · 721pp  · 197,134 words

Natural Language Annotation for Machine Learning

by James Pustejovsky and Amber Stubbs  · 14 Oct 2012  · 502pp  · 107,510 words

Data Science for Business: What You Need to Know About Data Mining and Data-Analytic Thinking

by Foster Provost and Tom Fawcett  · 30 Jun 2013  · 660pp  · 141,595 words

Data Mining: Concepts and Techniques

by Jiawei Han, Micheline Kamber and Jian Pei  · 21 Jun 2011

Radical Markets: Uprooting Capitalism and Democracy for a Just Society

by Eric Posner and E. Weyl  · 14 May 2018  · 463pp  · 105,197 words

Scikit-Learn Cookbook

by Trent Hauck  · 3 Nov 2014

Mastering Machine Learning With Scikit-Learn

by Gavin Hackeling  · 31 Oct 2014

Hands-On Machine Learning With Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems

by Aurélien Géron  · 13 Mar 2017  · 1,331pp  · 163,200 words

Data Mining the Web: Uncovering Patterns in Web Content, Structure, and Usage

by Zdravko Markov and Daniel T. Larose  · 5 Apr 2007

The Elements of Statistical Learning (Springer Series in Statistics)

by Trevor Hastie, Robert Tibshirani and Jerome Friedman  · 25 Aug 2009  · 764pp  · 261,694 words

Data Science from Scratch: First Principles with Python

by Joel Grus  · 13 Apr 2015  · 579pp  · 76,657 words

Architects of Intelligence

by Martin Ford  · 16 Nov 2018  · 586pp  · 186,548 words

Code: The Hidden Language of Computer Hardware and Software

by Charles Petzold  · 28 Sep 1999  · 566pp  · 122,184 words

Doing Data Science: Straight Talk From the Frontline

by Cathy O'Neil and Rachel Schutt  · 8 Oct 2013  · 523pp  · 112,185 words

Why Machines Learn: The Elegant Math Behind Modern AI

by Anil Ananthaswamy  · 15 Jul 2024  · 416pp  · 118,522 words

Democracy's Data: The Hidden Stories in the U.S. Census and How to Read Them

by Dan Bouk  · 22 Aug 2022  · 424pp  · 123,180 words

Code Dependent: Living in the Shadow of AI

by Madhumita Murgia  · 20 Mar 2024  · 336pp  · 91,806 words

Artificial Intelligence: A Guide for Thinking Humans

by Melanie Mitchell  · 14 Oct 2019  · 350pp  · 98,077 words

Python for Algorithmic Trading: From Idea to Cloud Deployment

by Yves Hilpisch  · 8 Dec 2020  · 1,082pp  · 87,792 words

Learning Scikit-Learn: Machine Learning in Python

by Raúl Garreta and Guillermo Moncecchi  · 14 Sep 2013  · 122pp  · 29,286 words

Mastering Pandas

by Femi Anthony  · 21 Jun 2015  · 589pp  · 69,193 words

Numpy Beginner's Guide - Third Edition

by Ivan Idris  · 23 Jun 2015  · 681pp  · 64,159 words

Hands-On Machine Learning With Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems

by Aurélien Géron  · 14 Aug 2019

Mining of Massive Datasets

by Jure Leskovec, Anand Rajaraman and Jeffrey David Ullman  · 13 Nov 2014

The Myth of Artificial Intelligence: Why Computers Can't Think the Way We Do

by Erik J. Larson  · 5 Apr 2021

Four Battlegrounds

by Paul Scharre  · 18 Jan 2023

The Geek Way: The Radical Mindset That Drives Extraordinary Results

by Andrew McAfee  · 14 Nov 2023  · 381pp  · 113,173 words

AI 2041: Ten Visions for Our Future

by Kai-Fu Lee and Qiufan Chen  · 13 Sep 2021

The Means of Prediction: How AI Really Works (And Who Benefits)

by Maximilian Kasy  · 15 Jan 2025  · 209pp  · 63,332 words

The Future of the Brain: Essays by the World's Leading Neuroscientists

by Gary Marcus and Jeremy Freeman  · 1 Nov 2014  · 336pp  · 93,672 words

Producing Open Source Software: How to Run a Successful Free Software Project

by Karl Fogel  · 13 Oct 2005

Deep Medicine: How Artificial Intelligence Can Make Healthcare Human Again

by Eric Topol  · 1 Jan 2019  · 424pp  · 114,905 words

Your Face Belongs to Us: A Secretive Startup's Quest to End Privacy as We Know It

by Kashmir Hill  · 19 Sep 2023  · 487pp  · 124,008 words

Nine Algorithms That Changed the Future: The Ingenious Ideas That Drive Today's Computers

by John MacCormick and Chris Bishop  · 27 Dec 2011  · 250pp  · 73,574 words

This Is Service Design Doing: Applying Service Design Thinking in the Real World: A Practitioners' Handbook

by Marc Stickdorn, Markus Edgar Hormess, Adam Lawrence and Jakob Schneider  · 12 Jan 2018  · 704pp  · 182,312 words

The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World

by Pedro Domingos  · 21 Sep 2015  · 396pp  · 117,149 words

Data Wrangling With Python: Tips and Tools to Make Your Life Easier

by Jacqueline Kazil  · 4 Feb 2016

Rule of the Robots: How Artificial Intelligence Will Transform Everything

by Martin Ford  · 13 Sep 2021  · 288pp  · 86,995 words

Supremacy: AI, ChatGPT, and the Race That Will Change the World

by Parmy Olson  · 284pp  · 96,087 words

What to Think About Machines That Think: Today's Leading Thinkers on the Age of Machine Intelligence

by John Brockman  · 5 Oct 2015  · 481pp  · 125,946 words

System Error: Where Big Tech Went Wrong and How We Can Reboot

by Rob Reich, Mehran Sahami and Jeremy M. Weinstein  · 6 Sep 2021

AI Superpowers: China, Silicon Valley, and the New World Order

by Kai-Fu Lee  · 14 Sep 2018  · 307pp  · 88,180 words

Applied Artificial Intelligence: A Handbook for Business Leaders

by Mariya Yao, Adelyn Zhou and Marlene Jia  · 1 Jun 2018  · 161pp  · 39,526 words

The Deep Learning Revolution (The MIT Press)

by Terrence J. Sejnowski  · 27 Sep 2018

Driverless: Intelligent Cars and the Road Ahead

by Hod Lipson and Melba Kurman  · 22 Sep 2016

Custodians of the Internet: Platforms, Content Moderation, and the Hidden Decisions That Shape Social Media

by Tarleton Gillespie  · 25 Jun 2018  · 390pp  · 109,519 words

Blockchain Basics: A Non-Technical Introduction in 25 Steps

by Daniel Drescher  · 16 Mar 2017  · 430pp  · 68,225 words

Everything Is Predictable: How Bayesian Statistics Explain Our World

by Tom Chivers  · 6 May 2024  · 283pp  · 102,484 words

Empire of AI: Dreams and Nightmares in Sam Altman's OpenAI

by Karen Hao  · 19 May 2025  · 660pp  · 179,531 words

The Equality Machine: Harnessing Digital Technology for a Brighter, More Inclusive Future

by Orly Lobel  · 17 Oct 2022  · 370pp  · 112,809 words

Futureproof: 9 Rules for Humans in the Age of Automation

by Kevin Roose  · 9 Mar 2021  · 208pp  · 57,602 words

I, Warbot: The Dawn of Artificially Intelligent Conflict

by Kenneth Payne  · 16 Jun 2021  · 339pp  · 92,785 words

The Soul of a New Machine

by Tracy Kidder  · 1 Jan 1981  · 299pp  · 99,080 words

NumPy Cookbook

by Ivan Idris  · 30 Sep 2012  · 197pp  · 35,256 words

The Big Nine: How the Tech Titans and Their Thinking Machines Could Warp Humanity

by Amy Webb  · 5 Mar 2019  · 340pp  · 97,723 words

The Loop: How Technology Is Creating a World Without Choices and How to Fight Back

by Jacob Ward  · 25 Jan 2022  · 292pp  · 94,660 words

The Creativity Code: How AI Is Learning to Write, Paint and Think

by Marcus Du Sautoy  · 7 Mar 2019  · 337pp  · 103,522 words

Succeeding With AI: How to Make AI Work for Your Business

by Veljko Krunic  · 29 Mar 2020

Co-Intelligence: Living and Working With AI

by Ethan Mollick  · 2 Apr 2024  · 189pp  · 58,076 words

The Data Journalism Handbook

by Jonathan Gray, Lucy Chambers and Liliana Bounegru  · 9 May 2012

Human + Machine: Reimagining Work in the Age of AI

by Paul R. Daugherty and H. James Wilson  · 15 Jan 2018  · 523pp  · 61,179 words

Talk to Me: How Voice Computing Will Transform the Way We Live, Work, and Think

by James Vlahos  · 1 Mar 2019  · 392pp  · 108,745 words

AIQ: How People and Machines Are Smarter Together

by Nick Polson and James Scott  · 14 May 2018  · 301pp  · 85,126 words

Cloudmoney: Cash, Cards, Crypto, and the War for Our Wallets

by Brett Scott  · 4 Jul 2022  · 308pp  · 85,850 words

The Coming Wave: Technology, Power, and the Twenty-First Century's Greatest Dilemma

by Mustafa Suleyman  · 4 Sep 2023  · 444pp  · 117,770 words

Reality Is Broken: Why Games Make Us Better and How They Can Change the World

by Jane McGonigal  · 20 Jan 2011  · 470pp  · 128,328 words

If Then: How Simulmatics Corporation Invented the Future

by Jill Lepore  · 14 Sep 2020  · 467pp  · 149,632 words

Power and Progress: Our Thousand-Year Struggle Over Technology and Prosperity

by Daron Acemoglu and Simon Johnson  · 15 May 2023  · 619pp  · 177,548 words

Evidence-Based Technical Analysis: Applying the Scientific Method and Statistical Inference to Trading Signals

by David Aronson  · 1 Nov 2006

Work in the Future: The Automation Revolution

by Robert Skidelsky and Nan Craig  · 15 Mar 2020

The Corruption of Capitalism: Why Rentiers Thrive and Work Does Not Pay

by Guy Standing  · 13 Jul 2016  · 443pp  · 98,113 words

Exponential: How Accelerating Technology Is Leaving Us Behind and What to Do About It

by Azeem Azhar  · 6 Sep 2021  · 447pp  · 111,991 words