description: group of samples that have been tagged with one or more labels
generative artificial intelligence
82 results
by Sebastian Mallaby · 30 Mar 2026 · 607pp · 161,998 words
-shot prompting and supervised fine-tuning, RLHF elevated post-training to a third level. Whereas fine-tuning was a classic deep-learning exercise—taking in labeled data and mapping questions to answers—RLHF drew on the tradition of reinforcement learning personified by David Silver. The model would learn by choosing an action
by Carissa Véliz · 21 Apr 2026 · 503pp · 129,255 words
’s also why it makes so many mistakes. A machine learning algorithm was asked to learn to distinguish between huskies and wolves. After processing some labeled data as input, it got quite good at the task. But when researchers investigated further, they realized that the AI was focusing on the background, and
by Valliappa Lakshmanan, Sara Robinson and Michael Munn · 31 Oct 2020
advance. For example, this could include labeling an image as “cat” or labeling a baby as being 2.3 kg at birth. You feed this labeled data to your model in hopes that it can learn enough to label new examples. With unsupervised learning, you do not know the labels for your
…
tuning, it might be worth searching within this range. Autoencoders Training embeddings in a supervised way can be hard because it requires a lot of labeled data. For an image classification model like Inception to be able to produce useful image embeddings, it is trained on ImageNet, which has 14 million labeled
…
lower cardinality by using autoencoders as an auxiliary learning task. Then, we solve the actual image classification problem for which we typically have much less labeled data using the embedding produced by the auxiliary autoencoder task. This is likely to boost model performance, because now the model only has to learn the
…
to illustrate that, provided there is random assignment at work, the Neutral Class design pattern can help us avoid losing model accuracy because of arbitrarily labeled data. In the real world In real-world situations, things may not be precisely random as in the synthetic dataset, but the arbitrary assignment paradigm still
by Paul Raines and Jeff Tranter · 25 Mar 1999 · 1,064pp · 114,771 words
} set y {0.0 0.1 2.3 4.5 1.2 5.4 9.6} graph .g -title "Example Graph" .g element create x -label "Data Points" -xdata $x -ydata $y pack .g * * * [3] The format of this command may change for the final Version 2.4 to require a specific
by David Allen · 31 Dec 2002 · 300pp · 79,315 words
kept on lists, but I still maintain two categories of paper-based reminders. I travel with a “Read/Review” plastic file folder and another one labeled “Data Entry.” In the latter I put anything for which the next action is simply to input data into my computer (business cards that need to
by Anthony T. Holdener · 25 Jan 2008 · 982pp · 221,145 words
in the first place. Example 11-9 shows the JavaScript necessary to perform such an action. Example 11-9. Switching out the label data /* Example 11-9. Switching out the label data. */ /** * This function, reloadForm, takes the XMLHttpRequest JSON server response * /xhrResponse/ from the server and sets it equal to the global <label> element
by Fabio Nelli · 27 Sep 2018 · 688pp · 107,867 words
built into Python or provided by other libraries, two new data structures were developed. These data structures are designed to work with relational data or labeled data, thus allowing you to manage data with features similar to those designed for SQL relational databases and Excel spreadsheets. Throughout the book in fact, you
by Ash Fontana · 4 May 2021 · 296pp · 66,815 words
market improved accuracy and reliability to potential patients. The company building the AI can work with the medical facility to get a critical mass of labeled data, get their models to the PUT, figure out how best to deliver the prediction through existing hardware, work through regulatory issues, and receive feedback from
…
highly specific datasets, whether through outsourcing, hiring people, or having existing employees use products that generate data. HUMAN GENERATED Data Labeling Many ML models require labeled data for training recognition algorithms. There are some promising transfer and semisupervised learning techniques that may provide alternatives to gathering a great deal of
…
labeled data, especially for generic domains such as image, video, and language understanding. However, the state of the art doesn’t seem to offer enough just yet,
…
. Accessing and owning processed data to feed models can be the single hardest problem in starting a vertical, AI-First business. Supervised ML models need labeled data. Getting lots of labeled examples for specific domains is hard. For example, where would you find a hundred thousand images of 2001 Chevy Silverado fenders
…
manufacturer, a chain of body shops, or an insurance company. In the absence of existing labeled datasets, build one. This entails building a team to label data, which may include both experts and nonexperts, and requires tools to efficiently label large volumes of data. There is a burgeoning area of management practices
…
saved through automation/Cost of each label) * # labels. Perhaps it’s helpful to think of this operation as a factory. The “good” it produces is labeled data. The factory manager’s job is to find efficiencies along the production line. Tools Labeling often requires engineers to clean data before applying the labels
…
procure from users. Uncertainty sampling. Labeling those points for which the current model is least certain. Query by committee. Train many models on the same labeled data. Then have people manually label the data points that caused the most disagreement in output between the models. Expected model change. Have people label the
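A minimal sketch of the two selection strategies named in this excerpt, uncertainty sampling and query by committee. The function names, the input shapes (per-sample class probabilities, per-sample committee votes), and the disagreement score are assumptions made for illustration; they are not taken from the book.

```python
from typing import List, Sequence


def least_confident(probabilities: Sequence[Sequence[float]], k: int) -> List[int]:
    """Uncertainty sampling: indices of the k samples whose top class
    probability is lowest, i.e. where the current model is least certain."""
    confidence = [max(row) for row in probabilities]
    ranked = sorted(range(len(confidence)), key=lambda i: confidence[i])
    return ranked[:k]


def committee_disagreement(votes: Sequence[Sequence[str]], k: int) -> List[int]:
    """Query by committee: indices of the k samples on which the models
    disagree most. votes[i] holds each model's predicted label for sample i."""

    def disagreement(sample_votes: Sequence[str]) -> float:
        # Share of votes that do NOT go to the most popular label (assumed score).
        top = max(sample_votes.count(v) for v in set(sample_votes))
        return 1.0 - top / len(sample_votes)

    ranked = sorted(range(len(votes)), key=lambda i: disagreement(votes[i]), reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    probs = [[0.9, 0.1], [0.55, 0.45], [0.6, 0.4]]
    print(least_confident(probs, 2))

    votes = [["a", "a", "a"], ["a", "b", "c"], ["a", "a", "b"]]
    print(committee_disagreement(votes, 1))
```

In both cases the returned indices are the data points that would be sent to human labelers first.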
…
the accuracy of a classifier even if any one of those labels isn’t necessarily correct. A large volume of labeled data can also be an asset itself. Thus, tracking the total labeled data points can be informative of the value produced by the labeling operation. Labels aren’t free, and business models need
…
generators take a single object and offer unlimited perspectives by, for example, modeling the object in 3-D and then moving around it, generating a labeled data point at each step. Accessibility Labeling objects is often feasible because pictures of them are readily available, as with cars on a street. However, some
…
of such an object and drop it into various environments. Building such a generator can be expensive, but the cost can be amortized over all labeled data points because the one generator is used to produce many examples of the same object. These generators are typically built using the same tools that
…
agents click to label an email as “sensitive” if they think the customer who wrote it is particularly angry and needs attention in short order: labeled data to train the ML models to prioritize responses. Vertically integrating domain experts by hiring them to implement systems yields better ideas and better data to
by Dipanjan Sarkar · 1 Dec 2016
involves several steps which we will be discussing in detail later in this chapter. Briefly, for a supervised classification problem, we need to have some labelled data that we could use for training a text classification model. This data would essentially be curated documents that are already assigned to some specific class
by Stuart Russell and Peter Norvig · 14 Jul 2019 · 2,466pp · 668,761 words
such systems can reach a high level of test-set accuracy—as shown by the ImageNet competition results, for example—they often require far more labeled data than a human would for the same task. For example, a child needs to see only one picture of a giraffe, rather than thousands, in
…
learning story; indeed, it may be the case that our current approach to supervised deep learning renders some tasks completely unattainable because the requirements for labeled data would exceed what the human race (or the universe) can supply. Moreover, even in cases where the task is feasible, labeling large data sets usually
…
requires scarce and expensive human labor. For these reasons, there is intense interest in several learning paradigms that reduce the dependence on labeled data. As we saw in Chapter 19, these paradigms include unsupervised learning, transfer learning, and semisupervised learning. Unsupervised learning algorithms learn solely from unlabeled inputs x
…
data to train an initial version of an NLP model. From there, we can use a smaller amount of domain-specific data (perhaps including some labeled data) to refine the model. The refined model can learn the vocabulary, idioms, syntactic structures, and other linguistic phenomena that are specific to the new domain
…
. During training a single sentence can be used multiple times with different words masked out. The beauty of this approach is that it requires no labeled data; the sentence provides its own label for the masked word. If this model is trained on a large corpus of text, it generates pretrained representations
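A small sketch of how one sentence can supply its own labels when words are masked out, as this excerpt describes. The `[MASK]` placeholder, the whitespace tokenization, and the function name are assumptions for illustration; real systems use subword tokenizers and mask only a random subset of positions.

```python
from typing import List, Tuple

MASK = "[MASK]"  # placeholder token; the exact string is an assumption


def masked_examples(sentence: str) -> List[Tuple[str, str]]:
    """Build one (masked sentence, target word) pair per word position.

    The target comes from the sentence itself, so no human labeling is needed.
    """
    words = sentence.split()
    examples = []
    for i, word in enumerate(words):
        masked = words[:i] + [MASK] + words[i + 1:]
        examples.append((" ".join(masked), word))
    return examples


if __name__ == "__main__":
    for masked, target in masked_examples("the river rose"):
        print(f"{masked!r} -> {target!r}")
```

Each pair can serve as a separate training example, which is why a single sentence can be reused several times with different words hidden.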
…
that refer to very precisely delineated activities on simple backgrounds, is quite easy to deal with. Good results can be obtained with a lot of labeled data and an appropriate convolutional neural network. However, it can be difficult to prove that the methods actually work, because they rely so strongly on context
by Mehmed Kantardzić · 2 Jan 2003 · 721pp · 197,134 words
by James Pustejovsky and Amber Stubbs · 14 Oct 2012 · 502pp · 107,510 words
by Foster Provost and Tom Fawcett · 30 Jun 2013 · 660pp · 141,595 words
by Jiawei Han, Micheline Kamber and Jian Pei · 21 Jun 2011
by Eric Posner and E. Weyl · 14 May 2018 · 463pp · 105,197 words
by Trent Hauck · 3 Nov 2014
by Gavin Hackeling · 31 Oct 2014
by Aurélien Géron · 13 Mar 2017 · 1,331pp · 163,200 words
by Zdravko Markov and Daniel T. Larose · 5 Apr 2007
by Trevor Hastie, Robert Tibshirani and Jerome Friedman · 25 Aug 2009 · 764pp · 261,694 words
by Joel Grus · 13 Apr 2015 · 579pp · 76,657 words
by Martin Ford · 16 Nov 2018 · 586pp · 186,548 words
by Charles Petzold · 28 Sep 1999 · 566pp · 122,184 words
by Cathy O'Neil and Rachel Schutt · 8 Oct 2013 · 523pp · 112,185 words
by Anil Ananthaswamy · 15 Jul 2024 · 416pp · 118,522 words
by Dan Bouk · 22 Aug 2022 · 424pp · 123,180 words
by Madhumita Murgia · 20 Mar 2024 · 336pp · 91,806 words
by Melanie Mitchell · 14 Oct 2019 · 350pp · 98,077 words
by Yves Hilpisch · 8 Dec 2020 · 1,082pp · 87,792 words
by Raúl Garreta and Guillermo Moncecchi · 14 Sep 2013 · 122pp · 29,286 words
by Femi Anthony · 21 Jun 2015 · 589pp · 69,193 words
by Ivan Idris · 23 Jun 2015 · 681pp · 64,159 words
by Aurélien Géron · 14 Aug 2019
by Jure Leskovec, Anand Rajaraman and Jeffrey David Ullman · 13 Nov 2014
by Erik J. Larson · 5 Apr 2021
by Paul Scharre · 18 Jan 2023
by Andrew McAfee · 14 Nov 2023 · 381pp · 113,173 words
by Kai-Fu Lee and Qiufan Chen · 13 Sep 2021
by Maximilian Kasy · 15 Jan 2025 · 209pp · 63,332 words
by Gary Marcus and Jeremy Freeman · 1 Nov 2014 · 336pp · 93,672 words
by Karl Fogel · 13 Oct 2005
by Eric Topol · 1 Jan 2019 · 424pp · 114,905 words
by Kashmir Hill · 19 Sep 2023 · 487pp · 124,008 words
by John MacCormick and Chris Bishop · 27 Dec 2011 · 250pp · 73,574 words
by Marc Stickdorn, Markus Edgar Hormess, Adam Lawrence and Jakob Schneider · 12 Jan 2018 · 704pp · 182,312 words
by Pedro Domingos · 21 Sep 2015 · 396pp · 117,149 words
by Jacqueline Kazil · 4 Feb 2016
by Martin Ford · 13 Sep 2021 · 288pp · 86,995 words
by Parmy Olson · 284pp · 96,087 words
by John Brockman · 5 Oct 2015 · 481pp · 125,946 words
by Rob Reich, Mehran Sahami and Jeremy M. Weinstein · 6 Sep 2021
by Kai-Fu Lee · 14 Sep 2018 · 307pp · 88,180 words
by Mariya Yao, Adelyn Zhou and Marlene Jia · 1 Jun 2018 · 161pp · 39,526 words
by Terrence J. Sejnowski · 27 Sep 2018
by Hod Lipson and Melba Kurman · 22 Sep 2016
by Tarleton Gillespie · 25 Jun 2018 · 390pp · 109,519 words
by Daniel Drescher · 16 Mar 2017 · 430pp · 68,225 words
by Tom Chivers · 6 May 2024 · 283pp · 102,484 words
by Karen Hao · 19 May 2025 · 660pp · 179,531 words
by Orly Lobel · 17 Oct 2022 · 370pp · 112,809 words
by Kevin Roose · 9 Mar 2021 · 208pp · 57,602 words
by Kenneth Payne · 16 Jun 2021 · 339pp · 92,785 words
by Tracy Kidder · 1 Jan 1981 · 299pp · 99,080 words
by Ivan Idris · 30 Sep 2012 · 197pp · 35,256 words
by Amy Webb · 5 Mar 2019 · 340pp · 97,723 words
by Jacob Ward · 25 Jan 2022 · 292pp · 94,660 words
by Marcus Du Sautoy · 7 Mar 2019 · 337pp · 103,522 words
by Veljko Krunic · 29 Mar 2020
by Ethan Mollick · 2 Apr 2024 · 189pp · 58,076 words
by Jonathan Gray, Lucy Chambers and Liliana Bounegru · 9 May 2012
by Paul R. Daugherty and H. James Wilson · 15 Jan 2018 · 523pp · 61,179 words
by James Vlahos · 1 Mar 2019 · 392pp · 108,745 words
by Nick Polson and James Scott · 14 May 2018 · 301pp · 85,126 words
by Brett Scott · 4 Jul 2022 · 308pp · 85,850 words
by Mustafa Suleyman · 4 Sep 2023 · 444pp · 117,770 words
by Jane McGonigal · 20 Jan 2011 · 470pp · 128,328 words
by Jill Lepore · 14 Sep 2020 · 467pp · 149,632 words
by Daron Acemoglu and Simon Johnson · 15 May 2023 · 619pp · 177,548 words
by David Aronson · 1 Nov 2006
by Robert Skidelsky and Nan Craig · 15 Mar 2020
by Guy Standing · 13 Jul 2016 · 443pp · 98,113 words
by Azeem Azhar · 6 Sep 2021 · 447pp · 111,991 words