computer vision

202 results

Practical Python and OpenCV
by Adrian Rosebrock

Now we can analyze breast histology images for cancer risk factors much faster. Of course, computer vision can also be applied to other areas of the medical field. Analyzing X-rays, MRI scans, and cellular structures can all be performed using computer vision algorithms. Perhaps the biggest computer vision success story you may have heard of is the Xbox 360 Kinect. The Kinect can use a stereo camera to understand the depth of an image, allowing it to classify and recognize human poses, with the help of some machine learning, of course. The list doesn’t stop there. Computer vision is now prevalent in many areas of your life, whether you realize it or not.

Facial recognition is an application of computer vision in the real world. What other types of useful applications of computer vision are there? Well, we could build representations of our 3D world using public image repositories like Flickr. We could download thousands and thousands of pictures of Manhattan, taken by citizens with their smartphones and cameras, and then analyze them and organize them to construct a 3D representation of the city. We would then virtually navigate this city through our computers. Sound cool? Another popular application of computer vision is surveillance. While surveillance tends to have a negative connotation of sorts, there are many different types of surveillance.

Did you then pick up a pair and head to the dressing room? These are all types of questions that computer vision surveillance systems can answer. Computer vision can also be applied to the medical field. A year ago, I consulted with the National Cancer Institute to develop methods to automatically analyze breast histology images for cancer risk factors. Normally, a task like this would require a trained pathologist with years of experience – and it would be extremely time-consuming! Our research demonstrated that computer vision algorithms could be applied to these images and automatically analyze and quantify cellular structures – without human intervention!

Programming Computer Vision with Python
by Jan Erik Solem
Published 26 Jun 2012

Chapter 10: Shows how to use the Python interface for the commonly used OpenCV computer vision library and how to work with video and camera input. There is also a bibliography at the back of the book. Citations of bibliographic entries are made by number in square brackets, as in [20].
Introduction to Computer Vision
Computer vision is the automated extraction of information from images. Information can mean anything from 3D models, camera position, object detection and recognition to grouping and searching image content. In this book, we take a wide definition of computer vision and include things like image warping, de-noising, and augmented reality.[1] Sometimes computer vision tries to mimic human vision, sometimes it uses a data and statistical approach, and sometimes geometry is the key to solving problems.

Symbols 3D plotting, A Sample Data Set 3D reconstruction, 3D Reconstruction Example 4-neighborhood, 9.1 Graph Cuts A affine transformation, 3.1 Homographies affine warping, Affine Transformations affinity matrix, Clustering Images agglomerative clustering, 6.2 Hierarchical Clustering alpha map, Image in Image AR, 4.3 Pose Estimation from Planes and Markers array, Interactive Annotation array slicing, Array Image Representation aspect ratio, 4.1 The Pin-Hole Camera Model association, 9.2 Segmentation Using Clustering augmented reality, 4.3 Pose Estimation from Planes and Markers B bag-of-visual-words, Inspiration from Text Mining—The Vector Space Model bag-of-word representation, Searching Images baseline, Bundle adjustment Bayes classifier, Classifying Images—Hand Gesture Recognition binary image, Morphology—Counting Objects blurring, Using the Pickle Module bundle adustment, Bundle adjustment C calibration matrix, 4.1 The Pin-Hole Camera Model camera calibration, Computing the Camera Center camera center, Camera Models and Augmented Reality camera matrix, Camera Models and Augmented Reality camera model, Camera Models and Augmented Reality camera pose estimation, 4.3 Pose Estimation from Planes and Markers camera resectioning, Triangulation CBIR, Searching Images Chan-Vese segmentation, 9.3 Variational Methods characteristic functions, 9.3 Variational Methods CherryPy, 7.6 Building Demos and Web Applications, Image Search Demo class centroids, Clustering Images classifying images, Classifying Image Content clustering images, Clustering Images, Clustering Images complete linking, 6.2 Hierarchical Clustering confusion matrix, Classifying Images—Hand Gesture Recognition content-based image retrieval, Searching Images convex combination, Image in Image corner detection, Local Image Descriptors correlation, 2.1 Harris Corner Detector corresponding points, 2.1 Harris Corner Detector cpickle, PCA of Images cross-correlation, Finding Corresponding Points Between Images 
cumulative distribution function, Graylevel Transforms cv, OpenCV, 10.4 Tracking cv2, OpenCV D de-noising, Reading and writing .mat files Delaunay triangulation, Piecewise Affine Warping dendrogram, Clustering Images dense depth reconstruction, Bundle adjustment dense image features, A Simple 2D Example dense SIFT, A Simple 2D Example descriptor, 2.1 Harris Corner Detector difference-of-Gaussian, Finding Corresponding Points Between Images digit classification, Hand Gesture Recognition Again direct linear transformation, 3.1 Homographies directed graph, Image Segmentation distance matrix, Clustering Images E Edmonds-Karp algorithm, 9.1 Graph Cuts eight point algorithm, Plotting 3D Data with Matplotlib epipolar constraint, 5.1 Epipolar Geometry epipolar geometry, Multiple View Geometry epipolar line, 5.1 Epipolar Geometry epipole, 5.1 Epipolar Geometry essential matrix, The calibrated case—metric reconstruction F factorization, Factoring the Camera Matrix feature matches, Finding Corresponding Points Between Images feature matching, Matching Descriptors flood fill, Displaying Images and Results focal length, 4.1 The Pin-Hole Camera Model fundamental matrix, 5.1 Epipolar Geometry fundamental matrix estimation, 5.3 Multiple View Reconstruction G Gaussian blurring, Using the Pickle Module Gaussian derivative filters, Image Derivatives Gaussian distributions, 8.2 Bayes Classifier gesture recognition, Dense SIFT as Image Feature GL_MODELVIEW, PyGame and PyOpenGL GL_PROJECTION, PyGame and PyOpenGL Grab Cut dataset, Segmentation with User Input gradient angle, Blurring Images gradient magnitude, Blurring Images graph, Image Segmentation graph cut, Image Segmentation GraphViz, Matching Using Local Descriptors graylevel transforms, Array Image Representation H Harris corner detection, Local Image Descriptors Harris matrix, Local Image Descriptors hierarchical clustering, 6.2 Hierarchical Clustering hierarchical k-means, 6.3 Spectral Clustering histogram equalization, 
Graylevel Transforms Histogram of Oriented Gradients, A Simple 2D Example HOG, A Simple 2D Example homogeneous coordinates, Image to Image Mappings homography, Image to Image Mappings homography estimation, 3.1 Homographies Hough transform, Inpainting I Image, Basic Image Handling and Processing image contours, Plotting Images, Points, and Lines image gradient, Blurring Images image graph, 9.1 Graph Cuts image histograms, Plotting Images, Points, and Lines image patch, 2.1 Harris Corner Detector image plane, Camera Models and Augmented Reality image registration, Piecewise Affine Warping image retrieval, Searching Images image search demo, 7.6 Building Demos and Web Applications image segmentation, Visualizing the Images on Principal Components, Image Segmentation image thumbnails, Convert Images to Another Format ImageDraw, Clustering Images inliers, 3.3 Creating Panoramas inpainting, Using generators integral image, Color Spaces interest point descriptor, 2.1 Harris Corner Detector interest points, Local Image Descriptors inverse depth, 4.1 The Pin-Hole Camera Model inverse document frequency, Inspiration from Text Mining—The Vector Space Model io, Useful SciPy Modules iso-contours, Plotting Images, Points, and Lines J JSON, Downloading Geotagged Images from Panoramio K k-means, Clustering Images k-nearest neighbor classifier, Classifying Image Content kernel functions, 8.3 Support Vector Machines kNN, Classifying Image Content L Laplacian matrix, 6.3 Spectral Clustering least squares triangulation, Triangulation LibSVM, 8.3 Support Vector Machines local descriptors, Local Image Descriptors Lucas-Kanade tracking algorithm, Optical Flow M marking points, Interactive Annotation mathematical morphology, Morphology—Counting Objects Matplotlib, Create Thumbnails maximum flow (max flow), 9.1 Graph Cuts measurements, Morphology—Counting Objects, Extracting Cells and Recognizing Characters metric reconstruction, 5.1 Epipolar Geometry, Computing the Camera Matrix from a 
Fundamental Matrix minidom, Registering Images minimum cut (min cut), 9.1 Graph Cuts misc, Useful SciPy Modules morphology, Morphology—Counting Objects, Morphology—Counting Objects, Exercises mplot3d, A Sample Data Set, 3D Reconstruction Example multi-class SVM, Selecting Features multi-dimensional arrays, Interactive Annotation multi-dimensional histograms, Clustering Images multiple view geometry, Multiple View Geometry N naive Bayes classifier, Classifying Images—Hand Gesture Recognition ndimage, Affine Transformations ndimage.filters, Computing Disparity Maps normalized cross-correlation, Finding Corresponding Points Between Images normalized cut, 9.2 Segmentation Using Clustering NumPy, Interactive Annotation O objloader, Tying It All Together OCR, Hand Gesture Recognition Again OpenCV, Chapter Overview, OpenCV OpenGL, PyGame and PyOpenGL OpenGL projection matrix, From Camera Matrix to OpenGL Format optic flow, 10.4 Tracking optical axis, Camera Models and Augmented Reality optical center, The Camera Matrix optical character recognition, Hand Gesture Recognition Again optical flow, 10.4 Tracking optical flow equation, 10.4 Tracking outliers, 3.3 Creating Panoramas overfitting, Exercises P panograph, Exercises panorama, 3.3 Creating Panoramas PCA, PCA of Images pickle, PCA of Images, The SciPy Clustering Package, Creating a Vocabulary pickling, PCA of Images piecewise affine warping, Image in Image piecewise constant image model, 9.3 Variational Methods PIL, Basic Image Handling and Processing pin-hole camera, Camera Models and Augmented Reality plane sweeping, 5.4 Stereo Images plot formatting, Plotting Images, Points, and Lines plotting, Create Thumbnails point correspondence, 2.1 Harris Corner Detector pose estimation, 4.3 Pose Estimation from Planes and Markers Prewitt filters, Blurring Images Principal Component Analysis, PCA of Images, 8.2 Bayes Classifier principal point, The Camera Matrix projection, Camera Models and Augmented Reality projection 
matrix, Camera Models and Augmented Reality projective camera, Camera Models and Augmented Reality projective transformation, Image to Image Mappings pydot, Matching Using Local Descriptors pygame, PyGame and PyOpenGL pygame.image, PyGame and PyOpenGL pygame.locals, PyGame and PyOpenGL Pylab, Create Thumbnails PyOpenGL, PyGame and PyOpenGL pyplot, Exercises pysqlite, Setting Up the Database pysqlite2, Setting Up the Database Python Imaging Library, Basic Image Handling and Processing python-graph, 9.1 Graph Cuts Q quad, From Camera Matrix to OpenGL Format query with image, Querying with an Image quotient image, Exercises R radial basis functions, 8.3 Support Vector Machines ranking using homographies, 7.5 Ranking Results Using Geometry RANSAC, 3.3 Creating Panoramas, 5.3 Multiple View Reconstruction rectified image pair, Bundle adjustment rectifying images, Extracting Cells and Recognizing Characters registration, Piecewise Affine Warping rigid transformation, 3.1 Homographies robust homography estimation, RANSAC ROF, Reading and writing .mat files, 9.3 Variational Methods RQ-factorization, Factoring the Camera Matrix Rudin-Osher-Fatemi de-noising model, Reading and writing .mat files S Scale-Invariant Feature Transform, Finding Corresponding Points Between Images scikit.learn, Exercises Scipy, Using the Pickle Module scipy.cluster.vq, The SciPy Clustering Package, Clustering Images scipy.io, Useful SciPy Modules, Reading and writing .mat files scipy.misc, Reading and writing .mat files scipy.ndimage, Blurring Images, Morphology—Counting Objects, Extracting Cells and Recognizing Characters, Rectifying Images, Exercises scipy.ndimage.filters, Blurring Images, Blurring Images, 2.1 Harris Corner Detector scipy.sparse, Exercises searching images, Searching Images, Adding Images segmentation, Image Segmentation self-calibration, Bundle adjustment separating hyperplane, Using PCA to Reduce Dimensions SfM, The calibrated case—metric reconstruction SIFT, Finding 
Corresponding Points Between Images similarity matrix, Clustering Images similarity transformation, 3.1 Homographies similarity tree, 6.2 Hierarchical Clustering simplejson, Downloading Geotagged Images from Panoramio, Downloading Geotagged Images from Panoramio single linking, 6.2 Hierarchical Clustering slicing, Array Image Representation Sobel filters, Blurring Images spectral clustering, Clustering Images, 9.2 Segmentation Using Clustering SQLite, Setting Up the Database SSD, Finding Corresponding Points Between Images stereo imaging, Bundle adjustment stereo reconstruction, Bundle adjustment stereo rig, Bundle adjustment stereo vision, Bundle adjustment stitching images, Robust Homography Estimation stop words, Inspiration from Text Mining—The Vector Space Model structure from motion, The calibrated case—metric reconstruction structuring element, Morphology—Counting Objects Sudoku reader, Hand Gesture Recognition Again sum of squared differences, Finding Corresponding Points Between Images Support Vector Machines, Using PCA to Reduce Dimensions support vectors, 8.3 Support Vector Machines SVM, Using PCA to Reduce Dimensions T term frequency, Inspiration from Text Mining—The Vector Space Model term frequency–inverse document frequency, Inspiration from Text Mining—The Vector Space Model text mining, Searching Images tf-idf weighting, Inspiration from Text Mining—The Vector Space Model total variation, Reading and writing .mat files total within-class variance, Clustering Images tracking, 10.4 Tracking triangulation, 5.2 Computing with Cameras and 3D Structure U unpickling, PCA of Images unsharp masking, 1.5 Advanced Example: Image De-Noising urllib, Downloading Geotagged Images from Panoramio V variational methods, 9.3 Variational Methods variational problems, 9.3 Variational Methods vector quantization, The SciPy Clustering Package vector space model, Searching Images vertical field of view, From Camera Matrix to OpenGL Format video, Displaying Images and 
Results visual codebook, Inspiration from Text Mining—The Vector Space Model visual vocabulary, Inspiration from Text Mining—The Vector Space Model visual words, Inspiration from Text Mining—The Vector Space Model visualizing image distribution, Visualizing the Images on Principal Components VLFeat, Interest Points W warping, Affine Transformations watershed, Inpainting web applications, 7.6 Building Demos and Web Applications webcam, Optical Flow word index, Setting Up the Database X XML, Registering Images xml.dom, Registering Images About the Author Jan Erik Solem is a Python enthusiast and a computer vision researcher and entrepreneur. He is an applied mathematician and has worked as associate professor, startup CTO, and now also book author. He sometimes writes about computer vision and Python on his blog www.janeriksolem.net. He has used Python for computer vision in teaching, research and industrial applications for many years. He currently lives in San Francisco. Colophon The animal on the cover of Programming Computer Vision with Python is a bullhead. Often referred to as “bullhead catfish,” members of the genus Ameiurus come in three common types: the black bullhead (Ameiurus melas), the yellow bullhead (Ameiurus natalis), and the brown bullhead (Ameiurus nebulosus).

Programming a computer and designing algorithms for understanding what is in these images is the field of computer vision. Computer vision powers applications like image search, robot navigation, medical image analysis, photo management, and many more. The idea behind this book is to give an easily accessible entry point to hands-on computer vision with enough understanding of the underlying theory and algorithms to be a foundation for students, researchers, and enthusiasts. The Python programming language, the language of choice for this book, comes with many freely available, powerful modules for handling images, mathematical computing, and data mining. When writing this book, I used the following principles as a guideline.

pages: 350 words: 98,077

Artificial Intelligence: A Guide for Thinking Humans
by Melanie Mitchell
Published 14 Oct 2019

[FIGURE 48: Four straightforward instances of “walking a dog”]
While thinking about this topic, I was particularly taken by a delightful and insightful blog post written by Andrej Karpathy, the deep-learning and computer-vision expert who now directs AI efforts at Tesla. In his post, titled “The State of Computer Vision and AI: We Are Really, Really Far Away,”[24] Karpathy describes his reactions, as a computer-vision researcher, to one specific photo, shown in figure 50. Karpathy notes that we humans find this image quite humorous, and asks, “What would it take for a computer to understand this image as you or I do?”

v=v1dW7ViahEc. 14.  K. He et al., “Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification,” in Proceedings of the IEEE International Conference on Computer Vision (2015), 1026–34. 15.  A. Linn, “Microsoft Researchers Win ImageNet Computer Vision Challenge,” AI Blog, Microsoft, Dec. 10, 2015, blogs.microsoft.com/ai/2015/12/10/microsoft-researchers-win-imagenet-computer-vision-challenge. 16.  A. Hern, “Computers Now Better than Humans at Recognising and Sorting Images,” Guardian, May 13, 2015, www.theguardian.com/global/2015/may/13/baidu-minwa-supercomputer-better-than-humans-recognising-images; T.

What’s more, “dog pixels” might look a lot like “cat pixels” or other animals. Under some lighting conditions, a cloud in the sky might even look very much like a dog.
[FIGURE 7: Object recognition: easy for humans, hard for computers]
Since the 1950s, the field of computer vision has struggled with these and other issues. Until recently, a major job of computer-vision researchers was to develop specialized image-processing algorithms that would identify “invariant features” of objects that could be used to recognize these objects in spite of the difficulties I sketched above. But even with sophisticated image processing, the abilities of object-recognition programs remained far below those of humans.

pages: 138 words: 27,404

OpenCV Computer Vision With Python
by Joseph Howse
Published 22 Apr 2013

In 2005, he finished his studies in IT at the Universitat Politécnica de Valencia with honors in human-computer interaction supported by computer vision with OpenCV (v0.96). His final project was based on this subject, and he published it at the Spanish HCI congress. He contributed to the source code of Blender, an open source 3D software project, and worked on his first commercial movie, Plumiferos—Aventuras voladoras, as a Computer Graphics Software Developer. David now has more than 10 years of experience in IT, with more than seven years of experience in computer vision, computer graphics, and pattern recognition, working on different projects and startups and applying his knowledge of computer vision, optical character recognition, and augmented reality.

We have also practiced wrapping this functionality in a high-level, reusable, and object-oriented design. Congratulations! You now have the skill to develop computer vision applications in Python using OpenCV. Still, there is always more to learn and do! If you liked working with NumPy and OpenCV, please check out these other titles from Packt Publishing: NumPy Cookbook, by Ivan Idris; OpenCV 2 Computer Vision Application Programming Cookbook, by Robert Laganière, which uses OpenCV's C++ API for desktops; Mastering OpenCV with Practical Computer Vision Projects (by multiple authors), which uses OpenCV's C++ API for multiple platforms; the upcoming book OpenCV for iOS How-to, which uses OpenCV's C++ API for iPhone and iPad; and OpenCV Android Application Programming, my upcoming book, which uses OpenCV's Java API for Android. Here ends our tour of OpenCV's Python bindings.

Generating Haar Cascades for Custom Targets Gathering positive and negative training images Finding the training executables On Windows On Mac, Ubuntu, and other Unix-like systems Creating the training sets and cascade Creating <negative_description> Creating <positive_description> Creating <binary_description> by running <opencv_createsamples> Creating <cascade> by running <opencv_traincascade> Testing and improving <cascade> Summary Index OpenCV Computer Vision with Python * * * OpenCV Computer Vision with Python Copyright © 2013 Packt Publishing All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

pages: 321 words: 113,564

AI in Museums: Reflections, Perspectives and Applications
by Sonja Thiel and Johannes C. Bernhardt
Published 31 Dec 2023

Paullada, Amandalynne/Raji, Inioluwa Deborah/Bender, Emily M. et al. (2021). Data and its (Dis)Contents: A Survey of Dataset Development and Use in Machine Learning Research. Patterns 2 (11). https://doi.org/10.1016/j.patter.2021.100336. Prabhu, Vinay Uday/Birhane, Abeba (2020). Large Image Datasets: A Pyrrhic Win for Computer Vision? 2021 IEEE Winter Conference on Applications of Computer Vision (WACV), 1536–46. https://doi.org/10.1109/WACV48630.2021.00158. Raffel, Colin/Shazeer, Noam/Roberts, Adam et al. (2020). Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. The Journal of Machine Learning Research 21 (1), 5485–51. Available online at https://jmlr2020.csail.mit.edu/papers/volume21/20-074/20-074.pdf.

In this case, a neural network algorithm has been trained to classify meaning-forming symbols and motifs, attempt to record them, select the works in which they appear, and perform frequency analysis in order to capture their popularity across the centuries.
[Footnote 1: The Digital Curator project was presented at the conference at the Landesmuseum Karlsruhe in December 2022. The text is based on a dissertation that the author defended at the University of Arts, Architecture and Design in Prague in September 2022 (Digital Curator: Algorithms and Computer Vision in the World of Big Cultural Historical Data).]
Database
Despite the fact that the principles of computer vision are rooted in linear algebra, the data itself is more than merely mathematics. Arguably, it is the responsibility of the curator or art historian to continuously use their critical eye with respect to the specific data, take an active role in its selection, and also suggest changes to its composition, form, and processing techniques.
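The frequency analysis mentioned in this excerpt (how often a classified motif appears in works from each century) can be sketched in a few lines. This is only an illustration under stated assumptions: the `detections` records and the helper name `motif_frequency_by_century` are hypothetical, and the excerpt does not describe the project's actual data format or code.

```python
from collections import Counter, defaultdict

# Hypothetical classifier output: (year the work was created, motifs detected in it).
detections = [
    (1512, ["skull", "angel"]),
    (1588, ["skull"]),
    (1634, ["skull", "dog"]),
    (1671, ["dog"]),
    (1699, ["skull"]),
]

def motif_frequency_by_century(records):
    """Count how many works per century contain each motif."""
    counts = defaultdict(Counter)
    for year, motifs in records:
        century = (year - 1) // 100 + 1  # e.g. 1512 falls in the 16th century
        counts[century].update(set(motifs))  # count each motif once per work
    return counts

frequencies = motif_frequency_by_century(detections)
for century in sorted(frequencies):
    print(century, dict(frequencies[century]))
```

Counting each motif once per work (via `set`) measures in how many works a motif appears, which seems closest to "popularity"; counting every occurrence would be an equally defensible choice.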

In his fundamental study of 1950, Alan Turing argued that the thinking of intelligent humans could not be precisely defined and therefore any output of a machine that cannot be recognized as such by humans should also be regarded as intelligent (Turing 1950; Vater 2023); a little later, the research field of artificial intelligence was established at the famous Dartmouth Workshop of 1956 (McCorduck 2004; Moor 2006). Since then, the concept of AI has changed again and again, been differentiated into subfields such as expert systems, speech recognition, or computer vision, and experienced booms and busts (Nilsson 2004; Seising 2021). AI functions as an umbrella term for a multitude of technical approaches that are often taken as a provocation of human intelligence and regularly trigger both fantasies and fears. If one speaks more modestly of systems that follow algorithmic rules, recognize patterns in data, and solve specific tasks, the challenges to human intelligence and related categories such as thinking, consciousness, reason, creativity, or intentionality arise less sharply.

Robot Futures
by Illah Reza Nourbakhsh
Published 1 Mar 2013

In 2008, a new company, Willow Garage, threw its hat into the ring with the Robot Operating System (ROS). This product takes some inspiration from the success that Intel Corporation had achieved in the prior decade with the Intel OpenCV library, an open-source collection of computer vision routines that greatly influenced hobby, educational, and research computer vision work (Bradski and Kaehler 2008). By selecting important computer vision capabilities, then optimizing software for performing those skills on Intel computer chips, the company delivered highly competent vision behaviors into the hands of those who wished to create end user applications (Quigley et al. 2009).

But what about the creation and manipulation of customer desire—what are the real-world equivalents of A/B split testing and customized online pricing? Companies are already prototyping digital walls that will replace fixed advertising posters throughout physical stores (Müller et al. 2009). These digital walls will contain embedded computer vision systems that track face and eye movement, giving them direct access to knowledge about who is looking at the wall. Computer vision will not only track fine-grained human behavior, but will also be able to estimate age, sex, even fashion sense. Spoken language accents will yield clues about each customer’s socioeconomic class, ethnicity, and educational level.

The store achieves faster throughput because you do not wait, and the store throws away far less expired food, because it did not have to make a steady stream of fries no one purchases to keep customer lines short. It made fries when it knew, almost for certain, that you would buy them. Everyone is happier in this model, and the store improves efficiency and profit margins. Now the kicker: this is not science fiction; it was demonstrated five years ago. Hyperactive Bob, a computer vision system tied to cameras around a store perimeter, watched the incoming cars (Shropshire 2006). After months of data mining on makes and models of cars and which orders correlate to each type of vehicle, the system reliably estimated what the short-order cooks should deliver as customers drove up.

pages: 414 words: 109,622

Genius Makers: The Mavericks Who Brought A. I. To Google, Facebook, and the World
by Cade Metz
Published 15 Mar 2021

Krizhevsky, Sutskever, and Hinton went on to publish a paper describing their system (later christened AlexNet), which Krizhevsky unveiled at a computer vision conference in Florence, Italy, near the end of October. Addressing an audience of more than a hundred researchers, he described the project in his typically soft, almost apologetic tones. Then, as he finished, the room erupted with argument. Rising from his seat near the front of the room, a Berkeley professor named Alexei Efros told the room that ImageNet was not a reliable test of computer vision. “It is not like the real world,” he said. It might include hundreds of photos of T-shirts and AlexNet might have learned to identify these T-shirts, he told the room, but the T-shirts were neatly laid out on tables without a wrinkle, not worn by real people.

The career path of one of Hinton’s key collaborators, a young researcher named Sara Sabour, exemplified the international nature of AI and how susceptible it was to political interference. In 2013, after completing a computer science degree at the Sharif University of Technology in Iran, Sabour had applied to the University of Washington, hoping to study computer vision and other forms of artificial intelligence, and she had been accepted. But then the U.S. government denied her a visa, apparently because she had grown up and studied in Iran and aimed to specialize in an area, computer vision, that potentially played into military and security technologies. The following year, she enrolled at the University of Toronto, before finding her way to Hinton and Google. Meanwhile, the Trump administration continued to focus on keeping people out of the country.

This was his way of showing that vision was more complex than it might seem, that people understood what was in front of them in ways machines still could not. “It is a fact that is ignored by researchers in computer vision,” he said. “And that is a huge mistake.” He was pointing to the limitations of the technology he helped build over the last four decades. Researchers in computer vision now relied on deep learning, he said, and deep learning solved only part of the problem. If a neural network analyzed thousands of coffee cup photos, it could learn to recognize a coffee cup. But if those photos pictured coffee cups only from the side, it couldn’t recognize a coffee cup turned upside down.

Four Battlegrounds
by Paul Scharre
Published 18 Jan 2023

It was slow and painstaking work for humans to do manually. The sheer volume of data was unmanageable. The goal was to have computer vision help, not to replace the job of human intel analysts but to assist them. Computer vision algorithms that could recognize objects could sift through reams of video to find objects of interest. Colonel Brown explained, “Where we expected a computer vision solution to help us was, you could go back into the data, ‘Tell me every time a car left this location.’” Then, “the computer vision algorithm would timestamp when certain activities would happen in certain places.” Humans would still need to be intimately involved in intelligence analysis, but automated tools could speed up the process.
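A query like "Tell me every time a car left this location" can be pictured as a scan over timestamped detections. The sketch below is an assumption-laden toy: the `frames` data and the function `departure_times` are hypothetical and say nothing about how the actual system described here worked.

```python
def departure_times(frames, label="car"):
    """Return timestamps at which `label` was seen in the previous frame
    but is absent from the current one."""
    departures = []
    previously_present = False
    for timestamp, labels in frames:
        present = label in labels
        if previously_present and not present:
            departures.append(timestamp)
        previously_present = present
    return departures

# Hypothetical detector output for one watched location: (seconds, labels seen).
frames = [
    (0, {"car"}),
    (10, {"car", "person"}),
    (20, {"person"}),
    (30, set()),
    (40, {"car"}),
    (50, set()),
]
print(departure_times(frames))
```

The object detector does the hard part (producing the labels); once detections are timestamped, the analyst's question reduces to a simple search over that record.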

Traditional defense contractors like Lockheed Martin or Northrop Grumman could make a stealth airplane, but the leaders in computer vision were tech companies like Google and Microsoft. DoD was largely flying in the dark with these relationships. That’s where Brendan came in. In July 2017, McCord traveled with Maven leads Lieutenant General Jack Shanahan, Colonel Drew Cukor, and Air Force Colonel Jason Brown to Honolulu to the Computer Vision and Pattern Recognition conference to meet with some of the top minds on computer vision. They also pitched Google, who would go on later that year to join Maven as one of its biggest partners.

DoD’s traditional processes were too sluggish to adopt a rapidly maturing technology like computer vision. An acquisition timeline that took seven to ten years to achieve first fielding would be several generations behind the state of the art and would be too slow for most commercial companies which operate on faster timelines. In his tasking memo, Bob Work directed the Maven team to “integrate algorithmic-based technology” in “90-day sprints,” which was effectively light speed for the DoD bureaucracy. The second problem was that, to harness computer vision technology, DoD needed to reach beyond the traditional defense contractors and tap into technology companies.

The Deep Learning Revolution (The MIT Press)
by Terrence J. Sejnowski
Published 27 Sep 2018

From Peterson, Mountfort, and Hollom, Field Guide to the Birds of Britain and Europe, 5th ed., p.16. Little did anyone suspect in the 1960s that it would take fifty years and a millionfold increase in computer power before computer vision would reach human levels of performance. The misleading intuition that it would be easy to write a computer vision program is based on activities that we find easy to do, such as seeing, hearing, and moving around—but that took evolution millions of years to get right. Much to their chagrin, early AI pioneers found the computer vision problem to be extremely hard to solve. In contrast, they found it much easier to program computers to prove mathematical theorems—a process thought to require the highest levels of intelligence—because computers turn out to be much better at logic than we are.

A document from the archives at MIT confirms his version of the story.2 What looked like it would be an easy problem to solve proved to be quicksand that swallowed a generation of researchers in computer vision. Why Vision Is a Hard Problem We rarely have difficulty identifying what an object is despite differences in the location, size, orientation, and lighting of the object. One of the earliest ideas in computer vision was to match a template of the object with the pixels in the image, but that approach failed because the pixels of the two images of the same object in different orientations don’t match.
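
The template-matching failure described above can be seen in a toy case. The following is a minimal sketch, not taken from the book: the 3x3 "L" shape, the sum-of-squared-differences score, and the helper names `ssd` and `rotate90` are all illustrative assumptions.

```python
def ssd(a, b):
    """Sum of squared pixel differences between two equal-sized grids."""
    return sum(
        (pa - pb) ** 2
        for row_a, row_b in zip(a, b)
        for pa, pb in zip(row_a, row_b)
    )


def rotate90(grid):
    """Rotate a grid of pixels 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


# A tiny binary "L" shape (hypothetical data) used as the stored template.
template = [
    [1, 0, 0],
    [1, 0, 0],
    [1, 1, 1],
]

# The same object seen in the same pose, and seen rotated by 90 degrees.
same_pose = [row[:] for row in template]
rotated = rotate90(template)

score_same = ssd(template, same_pose)    # zero: pixels line up exactly
score_rotated = ssd(template, rotated)   # nonzero: same object, pixels disagree
print(score_same, score_rotated)
```

A score of zero means a perfect pixel match. The rotated view is the same object, yet it scores worse than the unrotated one, which is the mismatch the excerpt refers to.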

It was generally thought that wider neural networks with a greater number of hidden units were more effective than deeper networks with a greater number of layers, but this was shown not to be the case for networks trained layer by layer,4 and the vanishing error gradient problem was identified, which slowed down learning near the input layer.5 When this problem was eventually overcome, however, it became possible to train deep backprop networks that performed favorably on benchmarks.6 And, as deep backprop networks began to challenge traditional approaches in computer vision, the word at the 2012 NIPS Conference was that the “Neural” was back in “Neural Information Processing Systems.” In computer vision, steady progress in recognizing objects in images over the last decade of the previous century and the first decade of the current one had improved performance on benchmarks (used to compare different methods) by a fraction of a percent per year.
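
The vanishing error gradient mentioned above can be illustrated with a short calculation. This sketch makes simplifying assumptions that are not from the book: a chain with one sigmoid unit per layer, every weight equal to 1, and every unit at pre-activation 0, where the sigmoid's slope reaches its maximum of 0.25. The function names are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    """Derivative of the sigmoid; never larger than 0.25."""
    s = sigmoid(x)
    return s * (1.0 - s)


def backprop_scale(num_layers, pre_activation=0.0, weight=1.0):
    """Factor by which an error signal is scaled after being passed
    back through `num_layers` sigmoid units arranged in a chain."""
    scale = 1.0
    for _ in range(num_layers):
        scale *= weight * sigmoid_grad(pre_activation)
    return scale


# Even in this best case for the sigmoid slope, the signal shrinks
# geometrically with depth.
scales = {n: backprop_scale(n) for n in (1, 5, 10)}
for n, s in scales.items():
    print(n, s)
```

Under these assumptions the error signal reaching a layer ten steps from the output is less than a millionth of its original size, which is why learning near the input layer is slow.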

pages: 586 words: 186,548

Architects of Intelligence
by Martin Ford
Published 16 Nov 2018

We thought that just from the inputs and outputs, you should be able to learn all these weights; and that was just unrealistic. You were going to have to wire in lots of knowledge to make anything work. That was the view of people in computer vision until 2012. Most people in computer vision thought this stuff was crazy, even though Yann LeCun sometimes got systems working better than the best computer vision systems, they still thought this stuff was crazy, it wasn’t the right way to do vision. They even rejected papers by Yann, even though they worked better than the best computer vision systems on particular problems, because the referees thought it was the wrong way to do things. That’s a lovely example of scientists saying, “We’ve already decided what the answer has to look like, and anything that doesn’t look like the answer we believe in is of no interest.”

There is a very large dataset—the ImageNet dataset—which is used in computer vision, and people in that field would only believe in our deep learning methods if we could show good results on that dataset. Geoffrey Hinton’s group actually did it, following up on earlier work by Yann LeCun on convolutional networks—that is, neural networks which were specialized for images. In 2012, these new deep learning architectures with extra twists were used with huge success and showed a big improvement on existing methods. Within a couple of years, the whole computer vision community switched to these kinds of networks. MARTIN FORD: So that’s the point at which deep learning really took off?

It was revolutionary for the fact that the same technology of deep learning could be used for both computer vision and speech recognition. It drove a lot of attention toward the field. MARTIN FORD: Thinking back to when you first started in neural networks, are you surprised at the distance things have come and the fact that they’ve become so central to what large companies, like Google and Facebook, are doing now? YOSHUA BENGIO: Of course, we didn’t expect that. We’ve had a series of important and surprising breakthroughs with deep learning. I mentioned earlier that speech recognition came around 2010, and then computer vision around 2012. A couple of years later, in 2014 and 2015, we had breakthroughs in machine translation that ended up being used in Google Translate in 2016. 2016 was also the year we saw the breakthroughs with AlphaGo.

pages: 215 words: 56,215

The Second Intelligent Species: How Humans Will Become as Irrelevant as Cockroaches
by Marshall Brain
Published 6 Apr 2015

LIDAR and radar are the two most essential sensor packages on a self-driving car. There is a simple reason for this difference: the computer vision systems that exist in production today (2015) are still fairly primitive. Computer scientists still have a ways to go when it comes to perfecting general vision systems. Yes, there are simple things that computer vision systems can do (for example, this video [14] shows a simple camera system to detect pancakes on a conveyor belt). But at this moment in history, there is not a computer vision system that can look at a common scene of a farm and say, “that is a barn, that is a horse, that is a man, that is the man's hat, that is grass, that is a tree, etc.”

The truckers who become unemployed will really have nowhere in the modern economy to go for new jobs. It is going to be a horrible situation for them. But truckers are just the tip of the iceberg. Many, many other people will become unemployed by the second intelligent species in the near future... Chapter 5 - How Computer Vision Systems will Destroy Jobs If you look back at the description of self-driving cars in the previous chapter, notice that computer vision does not really play a role. Current self-driving cars do not have two eyes on the roof or the hood looking out at the road and deciding what to do based on visual input. Self-driving cars do have an optical camera, but it plays a small role.

Computer scientists simply have not created the algorithms yet for computer vision at that level. But research in this area is occurring on many different fronts, both for the general case and specific situations. In the same way that Chess-playing computers eventually beat human players after several decades of research, and a Jeopardy-playing computer beat the best human players, there will eventually be computers running algorithms that are better than human beings at seeing the world. We simply haven't arrived there yet. The thing to understand is that we will arrive there eventually. As this computer vision research bears fruit, a surprising thing will happen.

AI 2041: Ten Visions for Our Future
by Kai-Fu Lee and Qiufan Chen
Published 13 Sep 2021

When we “see,” we are actually applying our accumulated knowledge of the world—everything we’ve learned in our lives about perspective, geometry, common sense, and what we have seen previously. These come naturally to us but are very difficult to teach a computer. Computer vision is the field of study that tries to overcome these difficulties to get computers to see and understand. COMPUTER VISION APPLICATIONS We are already using computer vision technologies every day. Computer vision can be used in real time, in areas ranging from transportation to security. Existing examples include: driver assistants installed in some cars that can detect a driver who nods off; autonomous stores like Amazon Go, where cameras recognize when you’ve put a product in your shopping cart; airport security (counting people, recognizing terrorists); gesture recognition (scoring your moves in an Xbox dancing game); facial recognition (using your face to unlock your mobile phone); smart cameras (your iPhone’s portrait mode recognizes and extracts people in the foreground, and then “beautifully” blurs the background to create a DSLR-like effect); military applications (separating enemy soldiers from civilians); autonomous navigation of drones and automobiles. In the opening of “Gods Behind the Masks,” we saw the use of real-time facial recognition to automatically deduct payment by recognizing commuters as they pass through a turnstile.

We also saw pedestrians interact with cartoon animals in ads, using hand gestures. And Amaka’s smartstream used computer vision to recognize the street ahead of him and gave him directions to get to his destination. Computer vision can also be applied to images and videos—in less immediate but no less important ways. Some examples: smart editing of photos and videos (tools like Photoshop use computer vision extensively to find facial borders, remove red eyes, and beautify selfies); medical image analysis (to determine if there are malignant tumors in a lung CT); content moderation (detection of pornographic and violent content in social media); related advertising selection based on the content of a given video; smart image search (that can find images from keywords or other images); and, of course, making deepfakes (replacing occurrences of one face with another in a video). In “Gods Behind the Masks,” we saw a deepfake-making tool that is essentially an automatic video-editing tool that replaces one person with another, from face, fingers, hand, and voice to body language, gait, and facial expression.

—AFRICAN PROVERB NOTE FROM KAI-FU: This story revolves around a Nigerian video producer who is recruited to make an undetectable deepfake with dangerous consequences. A major branch of AI, computer vision teaches computers to “see,” and recent breakthroughs allow AI to do so like never before. The story imagines a future world marked by unprecedented high-tech cat-and-mouse games between the fakers and detectors, and between defenders and perpetrators. Is there any way to avoid a world in which all visual lines are blurred? I’ll explore that question in my commentary, as I describe recent and impending breakthroughs in computer vision, biometrics, and AI security, three AI technology areas enabling deepfakes and many other applications.

pages: 688 words: 107,867

Python Data Analytics: With Pandas, NumPy, and Matplotlib
by Fabio Nelli
Published 27 Sep 2018

Population in 2014 Conclusions Chapter 12: Recognizing Handwritten Digits Handwriting Recognition Recognizing Handwritten Digits with scikit-learn The Digits Dataset Learning and Predicting Recognizing Handwritten Digits with TensorFlow Learning and Predicting Conclusions Chapter 13: Textual Data Analysis with NLTK Text Analysis Techniques The Natural Language Toolkit (NLTK) Import the NLTK Library and the NLTK Downloader Tool Search for a Word with NLTK Analyze the Frequency of Words Selection of Words from Text Bigrams and Collocations Use Text on the Network Extract the Text from the HTML Pages Sentimental Analysis Conclusions Chapter 14: Image Analysis and Computer Vision with OpenCV Image Analysis and Computer Vision OpenCV and Python OpenCV and Deep Learning Installing OpenCV First Approaches to Image Processing and Analysis Before Starting Load and Display an Image Working with Images Save the New Image Elementary Operations on Images Image Blending Image Analysis Edge Detection and Image Gradient Analysis Edge Detection The Image Gradient Theory A Practical Example of Edge Detection with the Image Gradient Analysis A Deep Learning Example: The Face Detection Conclusions Appendix A: Writing Mathematical Expressions with LaTeX With matplotlib With IPython Notebook in a Markdown Cell With IPython Notebook in a Python 2 Cell Subscripts and Superscripts Fractions, Binomials, and Stacked Numbers Radicals Fonts Accents Appendix B: Open Data Sources Political and Government Data Health Data Social Data Miscellaneous and Public Data Sets Financial Data Climatic Data Sports Data Publications, Newspapers, and Books Musical Data Index About the Author and About the Technical Reviewer About the Author Fabio Nelli is a data scientist and Python consultant, designing and developing Python applications for data analysis and visualization.

© Fabio Nelli 2018. Fabio Nelli, Python Data Analytics, https://doi.org/10.1007/978-1-4842-3913-1_14. 14. Image Analysis and Computer Vision with OpenCV. Fabio Nelli, Rome, Italy. In the previous chapters, the analysis of data was centered entirely on numerical and tabulated data, while in the previous one we saw how to process and analyze data in textual form. This book rightfully closes by introducing the last aspect of data analysis: image analysis. During the chapter, topics such as computer vision and face recognition will be introduced. You will see how the techniques of deep learning are at the base of this kind of analysis.

In recent years, especially because of the development of deep learning, image analysis has experienced huge development in solving problems that were previously impossible, giving rise to a new discipline called computer vision. In Chapter 9, you learned about artificial intelligence, which is the branch of calculation that deals with solving problems of pure “human relevance”. Computer vision is part of this, since its purpose is to reproduce the way the human brain perceives images. In fact, seeing is not just the acquisition of a two-dimensional image, but above all it is the interpretation of the content of that area.

pages: 2,466 words: 668,761

Artificial Intelligence: A Modern Approach
by Stuart Russell and Peter Norvig
Published 14 Jul 2019

Geometrical problems in computer vision are treated thoroughly in Multiple View Geometry in Computer Vision (Hartley and Zisserman, 2000). These books were written before the deep learning revolution, so for the latest results, consult the primary literature. Two of the main journals for computer vision are the IEEE Transactions on Pattern Analysis and Machine Intelligence and the International Journal of Computer Vision. Computer vision conferences include ICCV (International Conference on Computer Vision), CVPR (Computer Vision and Pattern Recognition), and ECCV (European Conference on Computer Vision). Research with a significant machine learning component is also published at NeurIPS (Neural Information Processing Systems), and work on the interface with computer graphics often appears at the ACM SIGGRAPH (Special Interest Group in Graphics) conference.

David Marr’s book Vision (Marr, 1982) played a historical role in connecting computer vision to the traditional areas of biological vision—psychophysics and neurobiology. While many of his specific models for tasks such as edge detection and object recognition haven’t stood the test of time, the theoretical perspective where each task is analyzed at an informational, computational, and implementation level is still illuminating. For the field of computer vision, the most comprehensive textbooks available today are Computer Vision: A Modern Approach (Forsyth and Ponce, 2002) and Computer Vision: Algorithms and Applications (Szeliski, 2011).

In turn, a reasonably reliable pedestrian detector is capable of producing estimates of the horizon, if there are several pedestrians in the scene at different distances from the camera. This is because the relative scaling of the pedestrians is a cue to where the horizon is. So we can extract a horizon estimate from the detector, then use this estimate to prune the pedestrian detector’s mistakes. 27.7 Using Computer Vision Here we survey a range of computer vision applications. There are now many reliable computer vision tools and toolkits, so the range of applications that are successful and useful is extraordinary. Many are developed at home by enthusiasts for special purposes, which is testimony to how usable the methods are and how much impact they have. (For example, an enthusiast created a great object-detection-based pet door that refuses entry to a cat if it is bringing in a dead mouse–a Web search will find it for you). 27.7.1 Understanding what people are doing If we could build systems that understood what people are doing by analyzing video, we could build human-computer interfaces that watch people and react to their behavior.
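
The horizon cue at the start of this excerpt can be sketched numerically. This is a rough illustration, not the textbook's method. It assumes a flat ground plane, a level camera, pedestrians of roughly equal real height, and image rows that increase downward. Under those assumptions a pedestrian's pixel height is proportional to the distance of their feet below the horizon row. The function names, the tolerance, and the sample detections are hypothetical.

```python
def fit_horizon(detections):
    """Least-squares fit of pixel_height = k * (foot_row - horizon_row).

    `detections` is a list of (foot_row, pixel_height) pairs.
    Returns (horizon_row, k).
    """
    n = len(detections)
    if n < 2:
        raise ValueError("need at least two pedestrians")
    mean_f = sum(f for f, _ in detections) / n
    mean_h = sum(h for _, h in detections) / n
    var_f = sum((f - mean_f) ** 2 for f, _ in detections)
    if var_f == 0:
        raise ValueError("pedestrians must be at different distances")
    k = sum((f - mean_f) * (h - mean_h) for f, h in detections) / var_f
    if k == 0:
        raise ValueError("pixel heights do not vary with foot position")
    intercept = mean_h - k * mean_f      # equals -k * horizon_row
    return -intercept / k, k


def is_consistent(detection, horizon_row, k, tolerance=0.25):
    """Keep a detection only if its pixel height is within a relative
    `tolerance` of the height the fitted horizon predicts."""
    foot_row, pixel_height = detection
    predicted = k * (foot_row - horizon_row)
    return predicted > 0 and abs(pixel_height - predicted) <= tolerance * predicted


# Hypothetical detections: (row of the feet, height of the box in pixels).
good = [(300, 50), (400, 100), (600, 200)]
horizon_row, k = fit_horizon(good)

# A box that is far too tall for where its feet are is likely a mistake.
suspicious = (320, 180)
print(horizon_row, is_consistent(suspicious, horizon_row, k))
```

The fit uses every detection, so a real system would need a robust estimator to stop the mistakes it wants to prune from skewing the horizon estimate.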

pages: 413 words: 119,587

Machines of Loving Grace: The Quest for Common Ground Between Humans and Robots
by John Markoff
Published 24 Aug 2015

At the time, the open-source software movement was incredibly popular. His background was in computer vision, and so he put two and two together and decided to create a project to build a library of open-source machine vision software tools. Taking the Linux operating system as a reference, it was obvious that when programmers worldwide have access to an extraordinary common set of tools, it makes everybody’s research a lot easier. “I should give everyone that tool in vision research,” he decided. While his boss was on sabbatical he launched OpenCV, or Open Source Computer Vision, a software library that made it easier for researchers to develop vision applications using Intel hardware.

“Better to seek forgiveness than to ask permission” was his motto. Eventually OpenCV contained a library of more than 2,500 algorithms including both computer vision and machine-learning software. OpenCV also hosted programs that could recognize faces, identify objects, classify human motion, and so on. From his initial team of just a handful of Intel researchers, a user community grew to more than 47,000 people, and more than ten million copies of the toolset have been downloaded to date. Gary Bradski created a popular computer vision software library and helped design robots. He would later leave robotics to work with a company seeking to build augmented reality glasses.

Indeed, during the mid-sixties there was virtually boundless optimism among the small community of artificial intelligence researchers on both coasts. In 1966, when SRI and SAIL were beginning to build robots and AI programs in California, another artificial intelligence pioneer, Marvin Minsky, assigned an undergraduate to work on the problem of computer vision on the other side of the country, at MIT. He envisioned it as a summer project. The reality was disappointing. Although AI might be destined to transform the world, Duvall, who worked on several SRI projects before transferring to the Shakey project to work in the trenches as a young programmer, immediately saw that the robot was barely taking baby steps.

pages: 487 words: 124,008

Your Face Belongs to Us: A Secretive Startup's Quest to End Privacy as We Know It
by Kashmir Hill
Published 19 Sep 2023

Ton-That summed up the findings: “People could predict a man’s IQ from a face but not a woman’s.” “Makes sense from evolution,” he cryptically remarked, suggesting that only a man’s intelligence matters for mating purposes.[*2] An automated tool able to judge people at a glance based on computer vision could be lucrative, the team seemed to think. Ton-That came across an article about a computer vision company called Clarifai that allowed people to take a photo of a product they liked, such as a pair of sneakers, and be shown similar versions for sale. The company had just raised $30 million from investors. “People putting crazy money into this shit,” he wrote in an email sharing the article with his partners.

It was able to match two different photos of the same person in fifteen of twenty cases, prompting Kanade to optimistically declare in a paper that a computer could extract facial features for identification “almost as reliably as a person.” That was an overstatement. But it was the first fully automated form of facial recognition, with no teens involved, and researchers around the world took notice. For that and other contributions, Kanade became widely recognized as a pioneer in the field of computer vision and was recruited by Carnegie Mellon University in the United States. Other computer scientists followed Kanade’s example, but their experiments succeeded only in matching the faces of a small sampling of individuals. That wasn’t going to help spot a wanted criminal or dangerous person in a large crowd.

DARPA’s goal was grander and more subtle than building a war machine the army could use right away. The agency hoped to seed the research—and the researchers—necessary for remarkable, and ongoing, technological breakthroughs. The point of the billion-dollar investment wasn’t just to make an Alvin; it was to groom engineers like Matthew Turk. Drawn to the challenge of improving computer vision, Turk applied to the MIT Media Lab, a new technology department whose well-branded mission was to “invent the future” and to make real the stuff of science fiction. When Turk arrived at the Massachusetts Institute of Technology in 1987, with shaggy brown hair and a handlebar mustache, Marvin Minsky was still there, even more gnomish than he had been in his youth, the fringe of hair remaining around his head grayed.

pages: 337 words: 103,522

The Creativity Code: How AI Is Learning to Write, Paint and Think
by Marcus Du Sautoy
Published 7 Mar 2019

AARON 117–18, 119, 121, 122 Adams, Douglas: The Hitchhiker’s Guide to the Galaxy 66–7, 268 adversarial algorithms 132–42, 298, 300 AIVA 229–30; Genesis 230 Alberti bass pattern 197, 197 algebra 44, 47, 65, 158–60, 158, 171, 182, 237 Algorithmic Justice League 94 algorithms 2, 5, 11, 13, 17, 21, 24; adversarial 132–42, 298, 300; art and see art; biases and blind spots 91–5; characteristics, key 46; computer vision and see computer vision; consciousness and see consciousness; dating/matching and 57–61, 58, 59, 60; first 44–7, 45, 158–9; free will and 112–13, 300, 301; games and see individual game name; Google search 47–56, 50, 51, 52, 57; language and see language; Lovelace Test and 7–8, 102–3, 219–20; mathematics and see mathematics; music and see musical composition; neural networks and see neural networks; Nobel Prize and 57; recommender 79–80, 81–91, 85, 86; reinforcement learning and 27, 96–7; spam filters and 90–1; sports and 55–6; supervised learning and 95–6, 97, 137; tabula rasa learning and 97, 98; term 46; training 89–91; unexpected consequences of 62–5 see also individual algorithm name Al-Khwarizmi, Muhammad 46, 47, 159 AlphaGo 22, 29–43, 95–6, 97–8, 131, 145, 168, 209, 219–20, 233 AlphaZero 97–8 Al Qadiri, Fatima 224 Altamira, Cave of, Spain 104, 105 Amazon (online retailer) 62, 67, 286 Amiga Power 23 Analytical Engine 1–2, 44 Android Lloyd Webber 290 Annals of Mathematics 152, 170–1, 177, 243 Appel, Kenneth 170, 174 Apple 117 Archer, Jodie 283 Argand, Jean-Robert 237 Aristophanes 165 Aristotle: The Art of Rhetoric 166 Arnold, Malcolm 231 art: AARON and 117–18, 119, 121, 122; adversarial networks and generating new 132–42, 135, 136, 137, 140; animals and 107–9; BOB (artificial life form) and 146–8; bone carvings, ancient 104–5; cave art, ancient 103–4, 156; coding the visual world 110–12; commercial considerations and 131–2; copyright ownership and 108–9; creativity and see creativity; definition of 103–7; emotional response, AI and 106–7; fractals and 
113–16, 124–5; future of AI 148–9; identifying artists and waves of creativity with AI 134–9, 135, 136; mathematics and 99–103, 106, 146, 155; origins of human 103; ‘The Painting Fool’ 119–22, 200, 291; Pollock, attempts to fake a 123–6; Rembrandt, recreating 127–32; rules and 1; sale of computer generated work, first 141; visual recognition algorithms, understanding 142–5; Wundt Curve and 139–40, 140 Art Basel 141, 142, 143, 145, 151 artificial intelligence (AI): algorithms and see algorithms; art and see art; birth of 1–2, 67; computer vision and see computer vision; consciousness and see consciousness; creativity and see creativity; data, importance of 67–8; games and see individual game name; language and see language; Lovelace Test and 7–8, 102–3, 219–20; mathematics and see mathematics; music and see musical composition; neural networks and see neural networks; systems see individual system name; term 24; transformational impact of 66–7 Ascent of Man, The (TV series) 104 Ashwood, Mary 48 Associated Press 293, 294 Atari 25–8, 92, 97, 115–16, 132 Atiyah, Michael 179, 248–9 Augustus, Ron 127 Automated Insights 293 Babbage, Charles 1, 7, 65 Babylonians, Ancient 157–60, 161, 165 Bach, Carl Philipp Emanuel 189–90, 193–4; ‘Inventions by Which Six Measures of Double Counterpoint Can be Written without Knowledge of the Rules’ 193–4 Bach, Johann Sebastian 10, 185, 186–7, 189–93, 204, 205, 207, 230, 231, 299; AIVA and 230; algorithms and method of composing music 189–94, 191; The Art of Fugue 186, 198; DeepBach and 207–12, 232; Emmy and 195–6, 197, 198, 200, 201; The Musical Offering 189–94; ‘Ricercar’ 192; St John Passion 207–8 Baroque 10, 13, 138 Barreau, Pierre 230 Barreau, Vincent 230 Barry, Robert 106 Barthes, Roland 251–2 Bartók, Béla 186–7, 197, 205 Batten, Dan 234 Beatles, the 224; ‘Yesterday’ 223 Beckett, Samuel 17 Beethoven, Ludwig van 10, 41, 127, 200, 230, 244 Belamy, Edmond 141 BellKor’s Pragmatic Chaos 87–8 Berlyne, D.

Sure, the machine was programmed by humans, but that doesn’t really seem to make it feel better. AlphaGo has since retired from competitive play. The Go team at DeepMind has been disbanded. Hassabis proved his Cambridge lecturer wrong. DeepMind has now set its sights on other goals: health care, climate change, energy efficiency, speech recognition and generation, computer vision. It’s all getting very serious. Given that Go was always my shield against computers doing mathematics, was my own subject next in DeepMind’s cross hairs? To truly judge the potential of this new AI we are going to need to look more closely at how it works and dig around inside. The crazy thing is that the tools DeepMind is using to create the programs that might put me out of a job are precisely the ones that mathematicians have created over the centuries.

pages: 523 words: 61,179

Human + Machine: Reimagining Work in the Age of AI
by Paul R. Daugherty and H. James Wilson
Published 15 Jan 2018

From the Mechanistic to the Organic The potential power of AI to transform businesses is unprecedented, and yet there is an urgent and growing challenge. Companies are now reaching a crossroad in their use of AI, which we define as systems that extend human capability by sensing, comprehending, acting, and learning. As businesses deploy such systems—spanning from machine learning to computer vision to deep learning—some firms will continue to see modest productivity gains over the short run, but those results will eventually stall out. Other companies will be able to attain breakthrough improvements in performance, often by developing game-changing innovations. What accounts for the difference?

For instance, Minsky, with Seymour Papert, wrote what was considered the foundational book on scope and limitations of neural networks, a kind of AI that uses biological neurons as its model. Other ideas like expert systems—wherein a computer contained deep stores of “knowledge” for specific domains like architecture or medical diagnosis—and natural language processing, computer vision, and mobile robotics can also be traced back to the event. One conference participant was Arthur Samuel, an engineer at IBM who was building a computer program to play checkers. His program would assess the current state of a checkers board and calculate the probability that a given position could lead to a win.

In other words, it automates existing processes. But in order to reimagine processes, firms must utilize more advanced technologies—namely, AI. (See the sidebar “AI Technologies and Applications: How Does This All Fit Together?” at the end of this chapter.) Now we’re talking about systems that deploy AI techniques such as computer vision, or machine-learning tools to analyze unstructured or complex information. It might be able to read various styles of invoices, contracts, or purchase orders, for instance. It can process these documents—no matter the format—and put the correct values into forms and databases for further action.

pages: 307 words: 88,180

AI Superpowers: China, Silicon Valley, and the New World Order
by Kai-Fu Lee
Published 14 Sep 2018

By 2017, almost every team had driven error rates below 5 percent—approximately the accuracy of humans performing the same task—with the average algorithm of that year making only one-third of the mistakes of the top algorithm of 2012. In the years since the Oxford experts made their predictions, computer vision has now surpassed human capabilities and dramatically expanded real-world use-cases for the technology. Those amped-up capabilities extend far beyond computer vision. New algorithms constantly set and surpass records in fields like speech recognition, machine reading, and machine translation. While these strengthened capabilities don’t constitute fundamental breakthroughs in AI, they do open the eyes and spark the imaginations of entrepreneurs.

Soon, these juiced-up neural networks—now rebranded as “deep learning”—could outperform older models at a variety of tasks. But years of ingrained prejudice against the neural networks approach led many AI researchers to overlook this “fringe” group that claimed outstanding results. The turning point came in 2012, when a neural network built by Hinton’s team demolished the competition in an international computer vision contest. After decades spent on the margins of AI research, neural networks hit the mainstream overnight, this time in the form of deep learning. That breakthrough promised to thaw the ice from the latest AI winter, and for the first time truly bring AI’s power to bear on a range of real-world problems.

After that, a smaller number of Chinese entrepreneurs and venture-capital funds like my own began to invest in this area. But the great majority of China’s technology community didn’t properly wake up to the deep-learning revolution until its Sputnik Moment in 2016, a full decade behind the field’s breakthrough academic paper and four years after it proved itself in the computer vision competition. American universities and technology companies have for decades reaped the rewards of the country’s ability to attract and absorb talent from around the globe. Progress in AI appeared to be no different. The United States looked to be out to a commanding lead, one that would only grow as these elite researchers leveraged Silicon Valley’s generous funding environment, unique culture, and powerhouse companies.

pages: 205 words: 20,452

Data Mining in Time Series Databases
by Mark Last , Abraham Kandel and Horst Bunke
Published 24 Jun 2004

Shen, P. S. P. Wang and T. Zhang) Vol. 45: Hidden Markov Models: Applications in Computer Vision (Eds. H. Bunke and T. Caelli) Vol. 46: Syntactic Pattern Recognition for Seismic Oil Exploration (K. Y. Huang) Vol. 47: Hybrid Methods in Pattern Recognition (Eds. H. Bunke and A. Kandel ) Vol. 48: Multimodal Interface for Human-Machine Communications (Eds. P. C. Yuen, Y. Y. Tang and P. S. P. Wang) Vol. 49: Neural Networks and Systolic Array Design (Eds. D. Zhang and S. K. Pal ) Vol. 50: Empirical Evaluation Methods in Computer Vision (Eds. H. I. Christensen and P. J. Phillips) Vol. 51: Automatic Diatom Identification (Eds.

Fagin, R. and Stockmeyer, L. (1998). Relaxing the Triangle Inequality in Pattern Matching. Int. Journal on Computer Vision, 28(3), 219–231. 7. Frances, M. and Litman, A. (1997). On Covering Problems of Codes. Theory of Computing Systems, 30(2), 113–119. 8. Fred, A.L.N. and Leitão, J.M.N. (1998). A Comparative Study of String Dissimilarity Measures in Structural Clustering. Proc. of Int. Conf. on Document Analysis and Recognition, pp. 385–394. 9. Gramkow, C. (2001). On Averaging Rotations. Int. Journal on Computer Vision, 42(1/2), 7–16. 10. Gregor, J. and Thomason, M.G. (1993). Dynamic Programming Alignment of Sequences Representing Cyclic Patterns.

Qu, Y., Wang, C., and Wang, X.S. (1998). Supporting Fast Search in Time Series for Movement Patterns in Multiple Scales. Proceedings of the Seventh International Conference on Information and Knowledge Management, pp. 251–258. 57. Sahoo, P.K., Soltani, S., Wong, A.K.C., and Chen, Y.C. (1988). A Survey of Thresholding Techniques. Computer Vision, Graphics and Image Processing, 41, 233–260. 58. Samet, H. (1990). The Design and Analysis of Spatial Data Structures. Addison-Wesley, Reading, MA. 59. Sheikholeslami, G., Chatterjee, S., and Zhang, A. (1998). WaveCluster: A Multi-Resolution Clustering Approach for Very Large Spatial Databases.

pages: 625 words: 167,349

The Alignment Problem: Machine Learning and Human Values
by Brian Christian
Published 5 Oct 2020

Zeiler, Matthew D., and Rob Fergus. “Visualizing and Understanding Convolutional Networks.” In European Conference on Computer Vision, 818–33. Springer, 2014. Zeiler, Matthew D., Dilip Krishnan, Graham W. Taylor, and Rob Fergus. “Deconvolutional Networks.” In 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), 2528–35. IEEE, 2010. Zeiler, Matthew D., Graham W. Taylor, and Rob Fergus. “Adaptive Deconvolutional Networks for Mid and High Level Feature Learning.” In 2011 International Conference on Computer Vision, 2018–25. IEEE, 2011. Zeng, Jiaming, Berk Ustun, and Cynthia Rudin. “Interpretable Classification Models for Recidivism Prediction.”

It was understood that, in principle, a big-enough neural network, with enough training examples and time, can learn almost anything. But no one had fast-enough computers, enough data to train on, or enough patience to make good on that theoretical potential. Many lost interest, and the field of computer vision, along with computational linguistics, largely moved on to other things. As Hinton would later summarize, “Our labeled datasets were thousands of times too small. [And] our computers were millions of times too slow.” Both of these things, however, would change. With the growth of the web, if you wanted not fifty but five hundred thousand “flash cards” for your network, suddenly you had a seemingly bottomless repository of images.

In 2007, Princeton professor Fei-Fei Li used Amazon Mechanical Turk to recruit human labor, at a scale previously unimaginable, to build a dataset that was previously impossible. It took more than two years to build, and had three million images, each labeled, by human hands, into more than five thousand categories. Li called it ImageNet, and released it in 2009. The field of computer vision suddenly had a mountain of new data to learn from, and a new grand challenge. Beginning in 2010, teams from around the world began competing to build a system that can reliably look at an image—dust mite, container ship, motor scooter, leopard—and say what it is. Meanwhile, the relatively steady progress of Moore’s law throughout the 2000s meant that computers could do in minutes what the computers of the 1980s took days to do.

pages: 340 words: 97,723

The Big Nine: How the Tech Titans and Their Thinking Machines Could Warp Humanity
by Amy Webb
Published 5 Mar 2019

Tribes typically observe rules and rituals, so let’s explore the rites of initiation for AI’s tribes. It begins with a rigorous university education. In North America, the emphasis within universities has centered on hard skills—like mastery of the R and Python programming languages, competency in natural language processing and applied statistics, and exposure to computer vision, computational biology, and game theory. It’s frowned upon to take classes outside the tribe, such as a course on the philosophy of mind, Muslim women in literature, or colonialism. If we’re trying to build thinking machines capable of thinking like humans do, it would seem counterintuitive to exclude learning about the human condition.

In my own meetings at the Pentagon with Department of Defense officials, an alternative view on the future of warfare (code vs. combat) has taken a long time to find widespread alignment. For example, in 2017, the DoD established an Algorithmic Warfare Cross-Functional Team to work on something called Project Maven—a computer vision and deep-learning system that autonomously recognizes objects from still images and videos. The team didn’t have the necessary AI capabilities, so the DoD contracted with Google for help training AI systems to analyze drone footage. But no one told the Google employees assigned to the project that they’d actually been working on a military project, and that resulted in high-profile backlash.

Since they favor optimization over precision and are basically made up of dense linear algebra operations, it makes sense that a new neural network architecture would lead to greater efficiencies and, more importantly, speed in the design and deployment process. The faster research teams can build and test real-world models, the closer they can get to practical-use cases for AI. For example, training a complicated computer vision model currently takes weeks or months—and the end result might only prove that further adjustments need to be made, which means starting over again. Better hardware means training models in a matter of hours, or even minutes, which could lead to weekly—or even daily—breakthroughs. That’s why Google created its own custom silicon, called Tensor Processing Units (TPUs).

Driverless: Intelligent Cars and the Road Ahead
by Hod Lipson and Melba Kurman
Published 22 Sep 2016

Source: Jakob Engel, Jorg Stuckler, and Daniel Cremers, “Large-Scale Direct SLAM with Stereo Cameras,” in 2015 IEEE International Conference on Intelligent Robots and Systems (IROS), pp. 1935–1942. IEEE, 2015; Andreas Geiger, Philip Lenz, and Raquel Urtasun, “Are We Ready for Autonomous Driving? The KITTI Vision Benchmark Suite,” in 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3354–3361, IEEE, 2012. Figure 12.1 GM’s Electric Networked-Vehicle (EN-V) Concept pod, an autonomous two-seater codeveloped with Segway for short trips in cities. Source: General Motors Figure 12.2 The most common job in most U.S. states in 2014 was truck driving.

George Hotz’s home-built Acura is evidence that today a skilled developer can build a pretty good driverless car in a fairly short period of time. However, as Musk points out, when it comes to software you will trust with your life, the leap from 99 percent to 99.9999 percent accuracy is a big one. In the past few years, mobile robots have gotten better at finding their way around their environment. It helps that the performance of computer vision software has improved dramatically, aided by the advent of big data, high resolution digital cameras, and faster processors. Another catalyst has been the successful application of machine-learning software to solve thorny problems of machine vision, sparking a mini-renaissance in the study of artificial perception.

Both of these approaches worked some of the time but were too slow and still didn’t provide the software with a crucial skill, the ability to consistently recognize objects in unfamiliar settings. To automate the process of object recognition, it’s necessary to have software that can extract visual information from raw data in order to identify the objects depicted. Over the years, researchers have attempted to do this in several different ways. One of the earliest forms of computer vision software developed in the 1960s worked by distilling digital images into simple line drawings. A famous example of this approach was a robot named Shakey, described somewhat optimistically by his creator, Stanford researcher Charles Rosen, as “the first electronic person.” Shakey’s “body” consisted of a stack of heavy boxes containing electronic equipment stacked onto a cart.

pages: 416 words: 118,522

Why Machines Learn: The Elegant Math Behind Modern AI
by Anil Ananthaswamy
Published 15 Jul 2024

That year, Stanford University professor Fei-Fei Li and her students presented a paper at the Computer Vision and Pattern Recognition (CVPR) conference. Titled “ImageNet: A Large-Scale Hierarchical Image Database,” the paper included an immense dataset of millions of hand-labeled images consisting of thousands of categories (immense by the standards of 2009). In 2010, the team put out the ImageNet challenge: Use 1.2 million ImageNet images, binned into 1,000 categories, to train your computer vision system to correctly categorize those images, and then test it on 100,000 unseen images to see how well the system recognizes them.

However, one of the most significant developments over the past five years—one that has led to the enormous explosion of interest in AIs such as ChatGPT—is something called self-supervised learning, a clever method that takes unlabeled data and creates implicit labels without human involvement and then supervises itself. A BET IN BERKELEY In 2014, a group of researchers at the University of California, Berkeley, among them Jitendra Malik, a formidable expert in computer vision, developed a deep neural network solution that performed admirably on a computer vision task called pattern analysis, statistical modeling, and computational learning (PASCAL) for visual object classes (VOC). The task entailed learning, given a small dataset of images, how to draw boxes around, or to segment, different categories of objects in those images, such as bicycles, cars, a horse, a person, and sheep, and then to name them.

Marcello Pelillo, a computer scientist at the University of Venice, Italy, has been doing his best to draw attention to Alhazen’s ideas. THE MAKINGS OF AN ALGORITHM One day, when he wandered into a bookstore in New Haven, Connecticut, Pelillo stumbled upon a slim book called Theories of Vision from Al-Kindi to Kepler. It was the late 1990s, and Pelillo was then a visiting professor at Yale. Besides doing research in computer vision, pattern recognition, and machine learning, he had a penchant for the history and philosophy of science and a love of math. The slim book, at just over two hundred pages, was alluring. It argued that Alhazen was “the most significant figure in the history of optics between antiquity and the seventeenth century.”

pages: 348 words: 119,358

The Long History of the Future: Why Tomorrow's Technology Still Isn't Here
by Nicole Kobie
Published 3 Jul 2024

For example, natural language processing (NLP) has long been a core focus of AI, as it can be used for machine translation – useful for governments and militaries keeping watch on rival nations – but also because it enables other technologies, be it voice assistants or analysing large tranches of unstructured text. Another way AI can be used is computer vision, letting machines understand objects that they ‘see’. Both examples are specialist forms of AI, and complex tasks can be achieved by combining them. Robotics might pair a decision-making tool and analytics for movement with NLP and computer vision to allow an android to walk around, talk and see. A driverless car’s system may have separate but linked AIs for a multitude of tasks: computer vision to analyse what the sensors pick up; automated decision-making to decide when to stop or speed up; and a system that can translate that into the car’s controls.

Dickmanns was a professor at Bundeswehr University Munich, teaching computer vision and machine guidance courses after a career in aerospace research. It may seem an odd route, to go from space vehicles to driverless cars, but a robot sent to another planet to explore must be able to look around and navigate independently. If you can make it work on the moon, you can do it in Munich. In the late 1970s, Dickmanns started work on Versuchsfahrzeug für autonome Mobilität und Rechnersehen (VaMoRs) or, roughly translated to English, ‘Test vehicle for autonomous mobility and computer vision’ – and it did just what it said on the tin.

The carmaker then known as Daimler-Benz (now Mercedes-Benz Group) began supplying vehicles. Dickmanns’ research focused on computer vision; he managed these feats of driverless prowess using only cameras and a bit of modelling, believing that to be entirely sufficient. Daimler-Benz had its own version of the automation technology with added lidar and radar for additional safety, as it sought to eventually commercialise the idea rather than just to write a research paper about computer vision. Because of the extra sensors, the team needed an even bigger van to fit all the computing required. The VaMoRs tests won Dickmanns and his team funding from a key driverless project.

pages: 362 words: 97,288

Ghost Road: Beyond the Driverless Car
by Anthony M. Townsend
Published 15 Jun 2020

When in-house projects failed to produce convincing results, many companies simply acquired promising startups to get hold of the needed technology instead. In a two-year period during 2016 and 2017 alone, some $80 billion surged into self-driving vehicle technologies. The biggest deal, Intel’s panicked 2017 acquisition of computer-vision pioneer Mobileye, an Israel-based maker of computer-vision systems, was valued at an eye-watering $15 billion. As this flurry of mergers and acquisitions unfolded, the web of partnerships and cross holdings linking automakers and the tech sector grew ever more tangled. Two of the world’s biggest consumer industries—computers and cars—had seen their future in each other.

This sea of human teachers—including, for instance, the team Google kept in India to train its first AVs and the 300,000 online gig workers of Seattle-based Mighty AI—performs endless hours of mind-numbing human intelligence tasks (HITs in AI jargon), the most underappreciated role in the creation story of this technology. Some of this work is done once, early in the development of AV software, to provide a baseline for algorithmic training. But many human handlers must be kept on to decipher imagery that computer vision can’t interpret. Despite autonomists’ overreaches, progress toward full autonomy continues, as measured by the number of disengagements—incidents where human safety engineers are forced to intervene and take back control from stymied self-driving computers during test drives. For instance, in 2017, GM’s fleet of Chevy Bolts disengaged on average once every 1,254 miles during testing on the challenging terrain of San Francisco streets, a huge leap over the previous year, when the computers balked every 235 miles.
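The two disengagement figures in the excerpt above imply a rough improvement factor. A minimal sketch of that arithmetic, using only the mileages quoted (the function name is hypothetical and chosen for illustration):

```python
def improvement_factor(miles_before: float, miles_after: float) -> float:
    """Ratio of miles driven per disengagement, after versus before."""
    if miles_before <= 0:
        raise ValueError("miles_before must be positive")
    return miles_after / miles_before


# Figures quoted in the excerpt: one disengagement every 235 miles the
# previous year, and one every 1,254 miles in 2017.
factor = improvement_factor(235, 1254)
print(f"Roughly {factor:.1f}x more miles between disengagements")
```

The ratio says nothing about how hard the routes were in each year, so it is only a coarse summary of the numbers in the text.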

When the space shuttle Endeavour was moved across Los Angeles in 2012, the 12-mile journey required a custom-built 160-wheel carrier and hundreds of human escorts, at a cost of more than $10 million. But could such a heavy lift become an inexpensive, routine, automated operation in cities of the future? Civic caravans wouldn’t simply be self-driving versions of prefab government trailers. Computer vision, at such low speeds, wouldn’t be used just to look for obstructions ahead. Its gaze could be turned down as well, collecting precise imagery of potholes and road conditions. Pairing such superhuman sensing with active, computer-controlled suspensions could create a mobile platform as stable as the ground below—allowing delicate, light, and airy structures of metal and glass to rise above.

pages: 296 words: 66,815

The AI-First Company
by Ash Fontana
Published 4 May 2021

Integrators collate data from their customers through a set of integrations with other products, but only if that fulfills a specific customer need and goes through existing pipes between the customer’s data storage and the application to be integrated. This also means that vendors don’t have to negotiate deals with each of those external data sources. For instance, computer vision offers a promising path to helping physicians identify diseased tissues such as cancers of the skin, breast, lung, and so on. However, computer vision models tend to need many images on which to train, and there are strict controls on handling patient data such as X-rays. Companies building computer vision–based systems need that data but can’t get it without either running their own medical facilities—which would take years to build and approve—or obtaining it from an existing medical facility.

Having a greater volume of data enables customers to effectively average across many data points when training their systems. Members can see different variations of the same category of data from other members. For example, images of a product taken in different lighting conditions can be used to train computer vision models to recognize the product in different environments. Members can validate data points for each other. For example, if they have a user with the same (not personally identifiable) data in some dimension—email, let’s say—but not others, such as a preference for liquid or powder detergent, they can correct those other data points by unifying on the email address and filling out the “detergent preference” column for the user with that email address.
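The "unify on the email address and fill out the column" step described above can be sketched as follows. This is an illustrative sketch only: the record layout, the field names, and the `fill_missing` helper are all assumptions, not anything specified in the excerpt.

```python
def fill_missing(members, key, field):
    """Fill gaps in `field` using values other members hold for the same `key`.

    `members` is a list of datasets, one per coalition member; each dataset
    is a list of dict records (a hypothetical layout).
    """
    # First pass: remember the first known value seen for each key.
    known = {}
    for records in members:
        for rec in records:
            value = rec.get(field)
            if value is not None:
                known.setdefault(rec[key], value)

    # Second pass: return copies with missing values filled where possible.
    filled = []
    for records in members:
        out = []
        for rec in records:
            new_rec = dict(rec)
            if new_rec.get(field) is None:
                new_rec[field] = known.get(rec[key])
            out.append(new_rec)
        filled.append(out)
    return filled


member_a = [{"email": "u1@example.com", "detergent_preference": "liquid"}]
member_b = [
    {"email": "u1@example.com", "detergent_preference": None},
    {"email": "u2@example.com", "detergent_preference": None},
]
result = fill_missing([member_a, member_b], "email", "detergent_preference")
print(result[1])
```

Here the first value seen for a key wins; a real data coalition would presumably need a rule for members whose values conflict, which the excerpt does not describe.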

Sometimes the priority is avoiding missed diagnoses, in the case of particularly deadly cancers such as lung cancer, while in others it’s about cost-effectively screening lots of patients on a regular basis, in the case of particularly prevalent cancers like basal cell carcinoma of the skin. Medical facilities achieve this by using AI to augment the physicians on staff but don’t typically have the ML computer vision expertise to build AI. They have a valuable asset to leverage as well as protect, so they typically strike partnerships that give them exclusive access to the AI product for a period of time, include strict controls on data, and allow for integration into existing hardware such as X-ray machines.

pages: 272 words: 103,638

Unit X: How the Pentagon and Silicon Valley Are Transforming the Future of War
by Raj M. Shah and Christopher Kirchhoff
Published 8 Jul 2024

If a new problem flared up on the Korean Peninsula, analysts were called away from whatever they’d been working on. McCord and the others all knew that computer vision could solve that problem. And that’s why McCord went back to Boston and convinced his fiancée to move with him to San Francisco so he could join our team. McCord, like so many others we recruited from the private sector, gave up a seven-figure compensation package to do this. Most members of our team were making a fraction of what they could make in the private sector. They were there because they believed in the mission. McCord began by convening top computer vision and AI talent in the Valley—people who were in academia, big tech companies like Google, or tiny startups—and learning about the state of the art in computer vision research.

In artificial intelligence we were working with a company called C3.ai, whose software predicted maintenance for aircraft, saving the air force millions of dollars and keeping more planes airborne. Our AI team also helped orchestrate an initiative called Project Maven, in which Amazon, Microsoft, and Google were developing computer vision algorithms—computer code that can see—which enhanced the military’s ability to track ISIS fighters. In human systems, we worked with a company that used wearables to monitor soldiers in reconnaissance platoons for dehydration—a major cause of mission failure. Another startup developed earbuds that used bone conduction to communicate, enabling warfighters to talk in high-noise environments.

McCord, who was director of software and intelligence there, had an engineering degree from MIT and an MBA from Harvard. He’d heard about DIU and knew we were opening an office in Cambridge, home to his alma maters. Raj invited McCord to come to the ribbon-cutting for the new office, where Carter gave a speech that McCord recalls was “a clarion call for progress in AI and especially for computer vision,” which was exactly what McCord was working on. He was further inspired by a meeting in which Eric Schmidt and a handful of eminent Silicon Valley technologists—including John Giannandrea, a top AI scientist at Google who’d soon become the head of AI research at Apple—discussed the initiative that would come to be known as Maven.

pages: 569 words: 156,139

Amazon Unbound: Jeff Bezos and the Invention of a Global Empire
by Brad Stone
Published 10 May 2021

“Just as he got really enthusiastic about computer voice recognition, he was also really excited about computer vision.” The allure of computer vision, along with his interest in pressing Amazon’s advantage in the cloud to push the frontiers of artificial intelligence, again sparked the fertile imagination of Amazon’s founder. More than 90 percent of retail transactions were conducted in physical stores, according to the U.S. Census Bureau. Perhaps there was a way to tap this vast reservoir of sales with a completely self-service physical store that harnessed emerging technologies like computer vision and robotics. In 2012, Bezos pitched this broad idea at an off-site meeting to the S-team.

Bezos liked to say Amazon was “stubborn on vision, flexible on details,” and here was an illustration: groups working on parallel tracks would essentially compete to fulfill the “Just Walk Out” ideal and solve the problem of the cashierless store. Kumar’s group continued to develop a store with futuristic computer vision technology embedded in the ceilings and shelves. Meanwhile, Kessel asked Jeremy De Bonet, an Amazon technology director based in Boston, to form his own internal startup of engineers and computer vision scientists. They would end up flipping the problem around and integrating computer vision technology and sensors into a shopping cart, instead of blanketing them around the store. In some ways, this was a harder problem. While the Go store could partially deduce the identity of an item based on where in the store it was located, a so-called “smart cart” would have to account for the possibility of a shopper selecting, say, a bag of oranges from the produce aisle but scanning them somewhere else in the store.

But Bezos didn’t want them to take an easy path; he wanted them to innovate in the field of computer vision, which he saw as important to Amazon’s future. So they settled on the idea of cameras in the ceiling and algorithms behind the scenes that would try to spot when customers selected products and charge them for it. Scales hidden inside the shelves would provide another reliable sensor to determine when products were being removed and corroborate who was buying what. Over the next few years, Dilip Kumar recruited experts from outside Amazon, such as the University of Southern California’s renowned computer vision scientist, Gérard Medioni, as well as engineers from inside who worked on complex technologies like Amazon’s pricing algorithms.

Demystifying Smart Cities
by Anders Lisdorf

We have no idea how the neural net splits up the information into discrete patterns and really no way of knowing. We can only stand by and watch the output and decide whether it is accurate. It is great for situations where a lot of information has to be condensed into categorical knowledge like computer vision, where we are interested in parsing an image and extracting data about objects in the image. Use cases for cities would be in computer vision where images could be converted into counts of pedestrians, cars, and bicycles or for allowing speech interaction with city services. The resident could speak, and the input converted into text that could then be processed as questions and answers.
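The idea of converting images into counts of pedestrians, cars, and bicycles can be sketched as a tally over a detector's output. The sketch assumes some object detector has already produced `(label, score)` pairs for one image; the detector itself, the label names, and the 0.5 score threshold are all hypothetical.

```python
from collections import Counter


def count_objects(detections, wanted=("pedestrian", "car", "bicycle"), min_score=0.5):
    """Tally detections per category, ignoring low-confidence and unwanted labels."""
    counts = Counter(
        label
        for label, score in detections
        if label in wanted and score >= min_score
    )
    # Report every wanted category, including those with zero detections.
    return {label: counts[label] for label in wanted}


# Hypothetical detector output for a single street-camera frame.
frame = [
    ("car", 0.91),
    ("car", 0.42),        # below the threshold, so not counted
    ("pedestrian", 0.88),
    ("pedestrian", 0.77),
    ("dog", 0.95),        # not a category of interest
]
print(count_objects(frame))
```

Only the counts leave this function, which fits the excerpt's point that the city cares about the condensed categorical output, not the raw pixels.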

Consider traffic counts: most cities need to manually count traffic to understand the flow of traffic in central corridors. This is a tedious and resource-intensive task, since you need a human to manually count each vehicle. IoT offers alternatives such as counting vehicles by pneumatic tubes on the ground, infrared light, and radar or using computer vision built in to cameras. This can be done at a much bigger scale, since humans need to go and sleep every once in a while, whereas devices never sleep and will keep counting when they are set up, and they will be doing it at a lower cost. This makes devices an attractive alternative for cities. However, devices are also liabilities in terms of security.

This is a good case to show how simple tracking of vehicles can give insight and transparency to a concern for city residents, but also why privacy concerns can impact a solution. Exteros This New York startup has developed a device that can count and categorize people, for example, in a shopping mall. It uses computer vision and artificial intelligence to categorize and count people as they move through the field of vision. The device is based on a Raspberry Pi, a camera, and a 3D printed case. This is a great example of the flexibility and availability of components to make innovative IoT solutions today. Earlier a vendor would have had to find an adequate camera and microcontroller.

From Airline Reservations to Sonic the Hedgehog: A History of the Software Industry
by Martin Campbell-Kelly
Published 15 Jan 2003

There were also two major cross-industry markets: office automation and computer-aided design. The market leaders were Wang Laboratories and Computer Vision, respectively. Both went beyond writing software and integrating it with hardware; they also manufactured their own hardware. The turnkey supplier epitomizes the problem of measuring the size of the software industry. For example, Wang, the leading producer of word processors, was usually classified as a hardware supplier, whereas Computer Vision was regarded as a software vendor, yet these firms provided similar packages of hardware and software.

Wang suffered huge losses in the late 1980s and filed for Chapter 11 bankruptcy in 1992. Computer Vision and CAD Systems What the word processor did for the ordinary office, the computer-aided design system did for the engineering design office. But whereas the turnkey word processor displaced an earlier generation of noncomputerized word processors, the CAD system came out of nowhere. CAD was made possible by the minicomputer, the graphical display tube, and the graph-plotter, all of which became affordable in the late 1960s. The first entrant into the CAD turnkey market was Computer Vision, founded in Bedford, Massachusetts, in 1969. Rather than develop a software package to run on a commercially available minicomputer, Computer Vision developed its own processor, which was specially designed for graphics work and was therefore more effective than a general-purpose minicomputer. This initial advantage was short-lived.

In the early 1970s, Computer Vision was joined in the CAD market by Intergraph, Calma, Applicon, Auto-trol, and Gerber, all of whose systems used standard minicomputers. IBM, DEC, Prime, and Data General also provided systems for the booming turnkey market. CAD systems were expensive, costing perhaps 10 times as much per user as word processing systems.

Machine Learning Design Patterns: Solutions to Common Challenges in Data Preparation, Model Building, and MLOps
by Valliappa Lakshmanan , Sara Robinson and Michael Munn
Published 31 Oct 2020

Embeddings, Hashed Feature, Neutral Class, Multimodal Input, Transfer Learning, Two-Phase Predictions, Cascade, Windowed Inference
Computer Vision: Computer vision is the broad parent name for AI that trains machines to understand visual input, such as images, videos, icons, and anything where pixels might be involved. Computer vision models aim to automate any task that might rely on human vision, from using an MRI to detect lung cancer to self-driving cars. Some classical applications of computer vision are image classification, video motion analysis, image segmentation, and image denoising.
Reframing, Neutral Class, Multimodal Input, Transfer Learning, Embeddings, Multilabel, Cascade, Two-Phase Predictions
Predictive Analytics: Predictive modeling uses historical data to find patterns and determine the likelihood of a certain event occurring in the future.

MLPerf name and logo are trademarks. See www.mlperf.org for more information.
3 Jia Deng et al., “ImageNet: A Large-Scale Hierarchical Image Database,” IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR) (2009): 248–255.
4 For more information, see “CS231n Convolutional Neural Networks for Visual Recognition.”
5 Victor Campos et al., “Distributed training strategies for a computer vision deep learning algorithm on a distributed GPU cluster,” International Conference on Computational Science, ICCS 2017, June 12–14, 2017.
6 Ibid.
7 Jeffrey Dean et al., “Large Scale Distributed Deep Networks,” NIPS Proceedings (2012).
8 Priya Goyal et al., “Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour” (2017), arXiv:1706.02677v2 [cs.CV].

, Data and Model Tooling, Concept, Saving predictions Cloud AI Platform Pipelines, Solution, Running the pipeline on Cloud AI Platform Cloud AI Platform Predictions, Lambda architecture Cloud AI Platform Training, Solution, Running the pipeline on Cloud AI Platform Cloud Build, Integrating CI/CD with pipelines Cloud Composer/Apache Airflow, Scheduled retraining Cloud Dataflow, Lambda architecture Cloud Functions, Triggers for retraining, Integrating CI/CD with pipelines Cloud Run, Create web endpoint, Other serverless versioning tools Cloud Spanner, Cached results of batch serving clustering, Models and Frameworks clustering models, Models and Frameworks CNN, Images as tiled structures, Why It Works-Why It Works cold start, Problem, Cold start combinatorial explosion, Grid search and combinatorial explosion completeness, Data Quality components, definition of, Solution computer vision, Computer Vision concept drift, Problem, Estimating retraining interval confidence, Inputs with overlapping labels, When human experts disagree, Saving predictions confusion matrix, Problem, Evaluating model performance consistency, Data Quality-Data Quality containers, Design Pattern 25: Workflow Pipeline, Solution, Why It Works context language models, Context language models-Context language models(see also BERT, Word2Vec) Continued Model Evaluation design pattern, Design Patterns for Resilient Serving, Design Pattern 18: Continued Model Evaluation-Estimating retraining interval, Model versioning with a managed service, Responsible AI, Automating data evaluation, Pattern Interactions Continuous Bag of Words (see CBOW) continuous evaluation, Continuous evaluation-Continuous evaluation continuous integration and continuous delivery (see CI/CD) convolutional neural network (see CNN) Coral Edge TPU, Phase 1: Building the offline model counterfactual analysis, Counterfactual analysis and example-based explanations-Counterfactual analysis and example-based explanations counterfactual 
reasoning, Capturing ground truth cryptographic algorithms, Cryptographic hash custom serving function, Custom serving function D DAG, Why It Works, Apache Airflow and Kubeflow Pipelines Darwin, Charles, Genetic algorithms data accuracy, Data Quality data analysts, Roles data augmentation, Data augmentation data collection bias, Before training, Before training data distribution bias, Problem data drift, Data Drift-Data Drift, Problem, Estimating retraining interval, Continuous evaluation for offline models, Problem data engineers, Roles, Scale, Solution data parallelism, Solution-Solution, Synchronous training, Why It Works, Model parallelism data preprocessing, Data and Feature Engineering(see also data transformation, feature engineering) data representation, Data Representation Design Patterns-Data Representation Design Patterns data representation bias, Before training data scientistsrole of, Roles, Multiple Objectives-Multiple Objectives, Why It Works, Problem tasks of, Problem, Problem, Solution data transformation, Data and Feature Engineering data validation, Data and Feature Engineering, Data validation with TFX data warehouses, Embeddings in a data warehouse-Embeddings in a data warehouse dataset-level transformations, Efficient transformations with tf.transform datasets, definition of, Data and Feature Engineering Datastore, Cached results of batch serving decision trees, Models and Frameworks, Data Representation Design Patterns-Data Representation Design Patterns, Decreased model interpretability, Choosing a model architecture, Typical Training Loop, Solution Deep Galerkin Method, Data-driven discretizations-Unbounded domains deep learning, Models and Frameworks-Models and Frameworks, Multimodal feature representations and model interpretability deep neural network (see DNN model) default, definition of, Model versioning with a managed service Dense layers, Solution, Using images with metadata design patterns, definition of, What Are Design Patterns?

pages: 326 words: 88,968

The Science and Technology of Growing Young: An Insider's Guide to the Breakthroughs That Will Dramatically Extend Our Lifespan . . . And What You Can Do Right Now
by Sergey Young
Published 23 Aug 2021

Your shower runs a full-body scan, before your ultrasound bathroom scale checks your organs, soft tissues, and arteries for any signs of tumors, disease, and obstructed blood flow. Your diagnostic toothbrush and microbiome-monitoring commode watch for dangerous changes in your cells and gut, while your computer-vision-equipped bedroom mirror checks your skin for potentially dangerous mole growth. As you sit down for breakfast, a tiny chip embedded at the tip of a blood vessel just beneath the surface of your skin tracks nutrients, immune cells, vitamins, minerals, foreign substances, and disease indicators.

There are bathroom scales that measure your body fat percentage and hydration level, home blood tests that monitor your cholesterol24 and blood glucose, and even home tests that help diagnose STDs, allergies, and food intolerances. Smartphone apps like UM SkinCheck, Miiskin, and MoleMapper leverage your smartphone’s camera and computer-vision AI to offer early guidance and detection of skin cancer. DIY health diagnostic devices like these are becoming increasingly portable, wearable, implantable, ingestible, and affordable. They are also becoming vastly more sophisticated. In 2018, the FDA granted approval for some of the Apple smart watch functionality, which now includes blood oxygen level readings and an electrocardiogram (ECG) monitoring function, to help detect atrial fibrillation, the most common heart rhythm disorder.25 (In 2019, a doctor friend of mine who travels frequently was called upon five times to assist in an in-flight medical emergency, and he used his Apple Watch to take an ECG every time.)

For these data to be used effectively, they must be cross-referenced with scores of pharmaceutical options, surgical treatments, lifestyle adjustments, and other interventions. To quote a viral YouTube video, “Ain’t nobody got time for that.” This is where artificial intelligence enters the picture. If you are familiar with terms like computer vision, deep neural networks, and machine learning, you probably already have a good sense of what happens next. I won’t clutter up the chapter with an AI primer. But AI is rapidly advancing to make precision medicine truly possible. Here are some examples of AI in action: 1. AI Case Study #1: Continuous Monitoring in the UK In the UK, more than one million people live with chronic obstructive pulmonary disease (COPD).

pages: 444 words: 117,770

The Coming Wave: Technology, Power, and the Twenty-First Century's Greatest Dilemma
by Mustafa Suleyman
Published 4 Sep 2023

The resulting paper by Hinton and his colleagues became one of the most frequently cited works in the history of AI research. Thanks to deep learning, computer vision is now everywhere, working so well it can classify dynamic real-world street scenes with visual input equivalent to twenty-one full-HD screens, or about 2.5 billion pixels per second, accurately enough to weave an SUV through busy city streets. Your smartphone recognizes objects and scenes, while vision systems automatically blur the background and highlight people in your videoconference calls. Computer vision is the basis of Amazon’s checkout-less supermarkets and is present in Tesla’s cars, pushing them toward increasing autonomy.

The challenge of managing the coming wave’s technologies means understanding them and taking them seriously, starting with the one I have spent my career working on: AI. THE AI SPRING: DEEP LEARNING COMES OF AGE AI is at the center of this coming wave. And yet, since the term “artificial intelligence” first entered the lexicon in 1955, it has often felt like a distant promise. For years progress in computer vision, for example—the challenge of building computers that can recognize objects or scenes—was slower than expected. The legendary computer science professor Marvin Minsky famously hired a summer student to work on an early vision system in 1966, thinking that significant milestones were just within reach.

Keep doing this, modifying the weights again and again, and you gradually improve the performance of the neural network so that eventually it’s able to go all the way from taking in single pixels to learning the existence of lines, edges, shapes, and then ultimately entire objects in scenes. This, in a nutshell, is deep learning. And this remarkable technique, long derided in the field, cracked computer vision and took the AI world by storm. AlexNet was built by the legendary researcher Geoffrey Hinton and two of his students, Alex Krizhevsky and Ilya Sutskever, at the University of Toronto. They entered the ImageNet Large Scale Visual Recognition Challenge, an annual competition designed by the Stanford professor Fei-Fei Li to focus the field’s efforts around a simple goal: identifying the primary object in an image.

pages: 404 words: 95,163

Amazon: How the World’s Most Relentless Retailer Will Continue to Revolutionize Commerce
by Natalie Berg and Miya Knights
Published 28 Jan 2019

The removal of any human interface from the most friction-filled process of any store-based shopping journey, ie checkout, affords the customer unprecedented speed and simplicity. Autonomous computing: AI-based computer vision, sensor fusion and deep learning technologies power Amazon Go’s Just Walk Out technology. Just Walk Out technology operates without manual intervention, eliminating the need for checkout staff or hardware. It also eliminates shrinkage as a major source of loss for traditional brick and mortar retailers. Customers are charged with whatever goods they walk out with, even if they try to hide the fact from the store’s extensive computer vision camera systems. The untapped potential of voice It’s taken a while to get here.

Even so, robots won’t be replacing humans altogether anytime soon. From self-checkout to no checkout So, we come to Amazon Go, which first opened its doors in Seattle in 2018. The computer vision-equipped, AI-powered store uses Amazon’s patented ‘Just Walk Out’ technology to enable customers to literally walk out with their goods without having to go through any checkout process at all. Customers have to scan their Amazon Go app to gain entry and register a form of payment that is charged when they leave according to what the computer vision systems detect they have taken from the shelves. Its beauty is Amazon knows precisely who is in its store and what they do at every move, while the technology eliminates shrink.

Addresses the perennial headache that is online returns, while driving footfall to Kohl’s. We expect this to be rolled out internationally. No 2018 Amazon Go Retail First checkout-free store. Shoppers scan their Amazon app to enter. The high-tech convenience store uses a combination of computer vision, sensor fusion and deep learning to create a frictionless customer experience. No 2019 and beyond Fashion or furniture stores would be a logical next step NOTE Amazon Go officially opened its doors to the public in 2018 SOURCE Amazon; author research as of June 2018 However, it was Amazon’s rather ironic launch of physical bookstores in 2015 that marked a genuine shift in strategy, as this was the first time Amazon mimicked digital merchandising and pricing in a physical setting.

pages: 175 words: 54,755

Robot, Take the Wheel: The Road to Autonomous Cars and the Lost Art of Driving
by Jason Torchinsky
Published 6 May 2019

He was able to prove that with the delay, the rover would not be reliably controllable at speeds over 0.2 mph, which is, as you can guess, really, really slow.¹⁷ To overcome this, experiments began to attempt to give the cart its own ability to “see” its environment, detect obstacles, and take steps to avoid them. This was the birth of nearly all computer vision systems employed by autonomous vehicles (and, really, any robot that uses some manner of camera-based synthetic vision) today. By 1964 the cart had been re-outfitted with a low-power television transmitter that broadcast TV signals to a PDP-6¹⁸ computer to process the images. With this setup, which I’m dramatically simplifying here, the cart was able to visually follow a high-contrast white line on the road at about 0.8 mph. This was a big deal, as it represented real computer vision controlling a moving machine, even if it was quite crude.

Extending from the two engine air scoops on each side of the nose of the car are probes or antennas which pick up wave impulses from the conductor strip in the center of the control lane.¹³ Of course, none of these things actually worked, but it is interesting to see how the very sticky problems of computer vision could be avoided if fully autonomous operation is limited to areas where an infrastructure has been built to guide the cars. No mention is made of obstacle avoidance or anything like that; presumably, it is the job of the singing gentlemen in the control towers to make sure everything is running smoothly and to warn drivers to stop if they’re approaching a broken-down vehicle or a coyote on the road or something else they don’t want to barrel through.

The cart was eventually able to navigate a chair-filled room in about five hours, and while it may be tempting to laugh at that idea now, I can think of plenty of times I’ve not been able to navigate a chair-filled room without running into half the chairs and looking like an idiot. 1977: Tsukuba Mechanical Engineering Lab, Japan Arguably the first fully autonomous, computer-vision-controlled car was shown in 1977 by the Tsukuba Mechanical Engineering Lab, in Japan. The project, headed by Sadayuki Tsugawa,¹⁹ modified a full-size car to follow special white road markings and was able to drive at speeds of nearly 20 mph. While still essentially a follower of specially contrived external visual guides, the fact that this technology was implemented in a full-size car driving at a reasonable speed (compared to, say, the Stanford Cart) and using computer-interpreted visual information made this a significant milestone. 1980s: Ernst Dickmanns: The Man Who Made Cars See If the overall concept of vehicles driven via true computer “vision” can be said to have a father, that father would have a German accent and a hilarious last name: Dickmanns.

pages: 161 words: 39,526

Applied Artificial Intelligence: A Handbook for Business Leaders
by Mariya Yao , Adelyn Zhou and Marlene Jia
Published 1 Jun 2018

The fields of pathology and radiology, both of which rely largely on trained human eyes to spot anomalies, are being revolutionized by advancements in computer vision. Pathology is especially subjective, with studies showing that two pathologists assessing the same slide of biopsied tissue will only agree about 60 percent of the time.(25) Researchers at Houston Methodist Research Institute in Texas announced an AI system for diagnosing breast cancer that utilizes computer vision techniques optimized for medical image recognition,(26) which interpreted patient records with a 99 percent accuracy rate.(27) In radiology, 12.1 million mammograms are performed annually in the United States, but half yield false positive results, which means that one in two healthy women may be wrongly diagnosed with cancer.

Much media attention has been focused on deep learning, and an increasing number of sophisticated technology companies have successfully implemented deep learning for enterprise-scale products. Google replaced previous statistical methods for machine translation with neural networks to achieve superior performance.(4) Microsoft announced in 2017 that they had achieved human parity in conversational speech recognition.(5) Promising computer vision startups like Clarifai employ deep learning to achieve state-of-the-art results in recognizing objects in images and video for Fortune 500 brands.(6) While deep learning models outperform older machine learning approaches to many problems, they are more difficult to develop because they require robust training of data sets and specialized expertise in optimization techniques.

We’ve probably all run into organizations that continue to insist on handwritten forms that some poor intern must then painstakingly enter into legacy databases. The need for manual entry creates a bottleneck and increases the risk of error, especially as the prevalence of keyboards has sent handwriting legibility into a steep decline. To deal with this problem, HyperScience utilizes advanced computer vision techniques to scan and process handwritten forms to eliminate the data entry bottleneck. Once a form is scanned, their software cleans the image, matches the format to the correct form, then extracts and stores the relevant information in the correct database. General Operations Most companies have tons of repetitive digital workflows.

Artificial Whiteness
by Yarden Katz

Brian Wallis, “Black Bodies, White Science: Louis Agassiz’s Slave Daguerreotypes,” American Art 9, no. 2 (1995): 39–61.   33.   Anh Nguyen, Jason Yosinski, and Jeff Clune, “Deep Neural Networks Are Easily Fooled: High Confidence Predictions for Unrecognizable Images,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, 427–36.   34.   Oriol Vinyals et al., “Show and Tell: A Neural Image Caption Generator,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, 3156–64.   35.   The incident occurred during a protest in the village of Nabi Saleh in the West Bank in the summer of 2015. See “A Perfect Picture of the Occupation,” editorial, Haaretz, August 31, 2015.   36.   

AI practitioners have steadily produced racial, gendered, and classed models of the self that both reflect and recharge these projects. This raises the question: Can something like what AI purports to be—an attempt to understand ourselves, our minds, and behavior in computational terms—be pursued differently? Might there be radical computational visions that shed AI’s epistemic forgeries and oppose the imperial and capitalist interests that have sustained that endeavor? Since AI has been contested from the start, even by insiders, can we find alternatives in these critiques? Or is the endeavor, by virtue of its framing around computation, or its place in a scientific tradition intimately connected to state power and capital, so compromised that it will merely reproduce AI’s flaws?

Lake et al., “Building Machines That Learn and Think Like People,” Behavioral and Brain Sciences 40, no. E253 (2016): 1–101.   29.   Lake et al., 8.   30.   DeepMind, “AlphaGo Zero.”   31.   Microsoft COCO is described in Tsung-Yi Lin et al., “Microsoft COCO: Common Objects in Context,” in Proceedings of the European Conference on Computer Vision, 2014, 740–55. For a similar data set, the “Caltech” objects, see Fei-Fei Li, Rob Fergus, and Pietro Perona, “One-Shot Learning of Object Categories,” IEEE Transactions on Pattern Analysis and Machine Intelligence 28, no. 4 (2006): 594–611.   32.   Brian Wallis, “Black Bodies, White Science: Louis Agassiz’s Slave Daguerreotypes,” American Art 9, no. 2 (1995): 39–61.   33.   

pages: 260 words: 67,823

Always Day One: How the Tech Titans Plan to Stay on Top Forever
by Alex Kantrowitz
Published 6 Apr 2020

Facebook, at the time, was unable to build features like tag suggestions on its own, because identifying faces in photos required expertise in machine learning that Facebook did not have. Hirsch and Shochat, meanwhile, were applying computer vision brilliantly within Zuckerberg’s own product, and he was eager to learn more about what they were doing. “Zuck was curious from the get-go,” Shochat told me. “He knew that something interesting was going on there, and he wanted to be close to such technology.” For the next ninety minutes, Zuckerberg interrogated Hirsch and Shochat about the future of computer vision and facial recognition. And as the conversation wrapped, his focus turned to acquisition. “If it makes sense, we should make this work,” he said before walking out.

Go has no lines, no waiting, and no cashiers. It feels like the future, and it very well might be. Go is powered by some impressive technology, much of which you can see by looking up. Cameras and sensors line its ceilings, pointing every which way to capture your body and its movements as you walk its aisles. Using computer vision (a subset of machine learning), Go figures out who you are, what you’ve taken, and what you’ve put back. Then it charges you. The store is almost always accurate, as I’ve found in my various attempts to trick it. No matter the method, be it concealing products or running in and out at my top speed (sixteen seconds total visit time), Go has never missed an item.

When a robot passes over a code, it’s instructed either to wait or to move to the next QR code, where it’s given more instructions. The system knows how fast each picker and stower works, and automatically sends more robots to the faster workers and fewer to the slower ones. At another FC I visited, in Kent, Washington, the robots stop in front of cameras that scan the racks, assess the amount of space left (using computer vision), and determine when they should be sent back for more stowing (or sent to a problem-solving team when items look askew). As pickers work, some compete voluntarily in “FC games,” which rank them on speed. The employees I met at the two FCs I visited seemed to be in good spirits and happy to work at Amazon.

Succeeding With AI: How to Make AI Work for Your Business
by Veljko Krunic
Published 29 Mar 2020

Let’s see what the ML community as a whole (all academics and quite a few industry practitioners) has achieved on probably the most widely used dataset in computer vision today. That dataset is a Modified National Institute of Standards and Technology (MNIST) dataset [102], which consists of 60,000 handwritten digits from 0 to 9. The MNIST dataset was often used to benchmark computer vision algorithms. The AI community has tracked the accuracy of the various computer vision algorithms on the MNIST dataset. According to LeCun et al. [102] and Benenson [104], algorithm improvements by the community between 1998 and 2013 resulted in the accuracy of digit recognition improving by only 2.19%: the error rate declined from 2.4% [102] to 0.21% [104].

Professional interpretation is also costly and something that your hospital would save money on if you could make an alternate system that’s helpful when diagnosing eye diseases. This use case is worth further investigation. Further research from your data science team shows that there has been significant progress in the application of computer vision to medical diagnosis. You find that Google’s team created an AI capable of diagnosing cases of moderate to severe diabetic retinopathy [49]. You have enough data from past optometry exams that you can train AI on that data. To make sure the Sense/Analyze/React loop is applicable in this use case, you need to cover only the Sense part.

According to LeCun et al. [102] and Benenson [104], algorithm improvements by the community between 1998 and 2013 resulted in the accuracy of digit recognition improving by only 2.19%—the error rate declined from 2.4% [102] to 0.21% [104]. Although the 2.19% better accuracy for digit recognition the community has achieved on MNIST is a significant improvement for computer vision algorithms, how relevant is 2.19% to us? We have to remember that, in our use case, 5% of our data is wrong. Moreover, the improvement in vision algorithms came at a significant cost. Some of the algorithms used to achieve that 2.19% improvement were the result of the best efforts of the entire ML community!
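The arithmetic behind these figures can be checked in a few lines. This is only an illustrative sketch: the variable names are made up here, the error rates are the ones quoted from [102] and [104], and the 5% figure is the share of wrong data mentioned for the use case. The final ratio is a hypothetical comparison, not a measure taken from the book.

```python
# Error rates on MNIST quoted in the text, in percent.
error_1998 = 2.4    # LeCun et al. [102]
error_2013 = 0.21   # Benenson [104]

# Accuracy is 100 minus the error rate, so the accuracy gain
# equals the drop in error rate (in percentage points).
accuracy_gain = round(error_1998 - error_2013, 2)

# Share of wrong data in the use case discussed in the text, in percent.
bad_data_share = 5.0

# Hypothetical comparison: size of the algorithmic gain relative to the data problem.
gain_vs_bad_data = round(accuracy_gain / bad_data_share, 3)

print(accuracy_gain)
print(gain_vs_bad_data)
```

Under these numbers, fifteen years of algorithmic progress moved accuracy by less than half of what the bad data alone puts at stake.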

pages: 252 words: 74,167

Thinking Machines: The Inside Story of Artificial Intelligence and Our Race to Build the Future
by Luke Dormehl
Published 10 Aug 2016

What the X stood for depended on who was doing the asking. One researcher wrote a checkers program capable of beating most amateurs, including himself. Another breakthrough included a perceptive AI able to rearrange coloured, differently shaped blocks on a table using a robotic hand: an astonishing feat in computer vision. A program called SAINT proved able to solve calculus integration problems of the level found on a first-year college course. Another, called ANALOGY, did the same for the geometric questions found in IQ tests, while STUDENT cracked complex algebra story conundrums such as: ‘If the number of customers Tom gets is twice the square of 20 per cent of the number of advertisements he runs, and the number of advertisements he runs is 45, what is the number of customers Tom gets?

The garage door had to be left open for ventilation. It must have seemed innocuous at the time, but over the next two decades, Larry Page and Sergey Brin’s company would make some of the biggest advances in AI history. These spanned fields including machine translation, pattern recognition, computer vision, autonomous robots and far more, which AI researchers had struggled with for half a century. Virtually none of it was achieved using Good Old-Fashioned AI. The company’s name, of course, was Google. CHAPTER 2 Another Way to Build AI IT IS 2014 and, in the Google-owned London offices of an AI company called DeepMind, a computer whiles away the hours by playing an old Atari 2600 video game called Breakout.

In a strikingly prescient 1958 article, marred by the hyperbolic title, ‘Human Brains Replaced?’, a writer for Science magazine gushed: ‘Perceptrons may eventually be able to learn, make decisions, and translate languages.’ A New Yorker article meanwhile quoted Rosenblatt as saying perceptrons should prove capable of telling ‘the difference between a dog and a cat’ using computer vision. In 1960, Rosenblatt oversaw the creation of an ‘alpha-perceptron’ computer called the MARK I, for which he received sponsorship from the Information Systems Branch of the Office of Naval Research. It became one of the first computers in history to be able to acquire new skills through trial and error.

pages: 326 words: 74,433

Do More Faster: TechStars Lessons to Accelerate Your Startup
by Brad Feld and David Cohen
Published 18 Oct 2010

We didn't need anyone else's money. We already had what we needed, which was a core competency in computer vision, a technology area that we believed had incredible intrinsic value. In fact, we were borderline arrogant about it—we hypothesized that we could just hack off a tiny chunk of this technology and turn it into revenue. We tested this, stayed small, and launched ClearCam on February 3, 2009. ClearCam is a $10 iPhone application that captures high-resolution photos with the aid of computer vision. ClearCam was popular and we immediately were cash-flow positive. Near-death averted and hypothesis reinforced.

In addition to WordPress, YouTube, Google Apps, and Skype, the TechStars companies told us that they routinely use the following free or very inexpensive products:
Balsamiq for screen prototyping
DimDim for web meetings
DropBox for file storage and sharing
Evernote for organizing tidbits of information
Gist for keeping on top of your contacts
GitHub for source code sharing
Jing for screencasting
MogoTest (TechStars 2009) for making sure your applications look great on every browser
Pivotal Tracker for issue tracking
SendGrid (TechStars 2009) for e-mail delivery
SnapABug (TechStars 2009) for chatting with customers who visit your web site
Twilio for audio conferencing and phone and SMS services
Vanilla (TechStars 2009) for hosting a great forum for your community
Be Tiny Until You Shouldn't Be
Jeffrey Powers
Jeffrey is a co-founder of Occipital, which uses state of the art computer vision in mobile applications for faster information capture and retrieval. On June 23, 2010, Occipital sold its RedLaser product line to eBay. Occipital remains an independent company. In December 2008, the situation for Occipital was dire. We had a $10,000 deferred legal bill, dried up personal bank accounts, and no revenue.

That led to a near-merger with a group of seasoned entrepreneurs and another failed attempt at getting investors excited. The chip on our shoulder got bigger and led us to hack off a slightly larger chunk of technology than ClearCam. This turned into RedLaser, the first iPhone barcode scanner that really worked because it used computer vision to compensate for blur. The response to our new product blew us away and RedLaser claimed a position in the top five paid applications on the iPhone App Store for many months. Today, we're more confident than ever about the technology area we have focused on, we have a growing reputation with consumers, and we have the money to stop worrying about the premature death of the company.

Mastering Machine Learning With Scikit-Learn
by Gavin Hackeling
Published 31 Oct 2014

Another disadvantage of the hashing trick is that the resulting model is more difficult to inspect, as the hashing function cannot recall what input token is mapped to each element of the feature vector.
Extracting features from images
Computer vision is the study and design of computational artifacts that process and understand images. These artifacts sometimes employ machine learning. An overview of computer vision is far beyond the scope of this book, but in this section we will review some basic techniques used in computer vision to represent images in machine learning problems.
Extracting features from pixel intensities
A digital image is usually a raster, or pixmap, that maps colors to coordinates on a grid.
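As a rough illustration of treating a raster as features, the sketch below flattens a small grid of intensities into a one-dimensional vector, one feature per pixel. The 3x3 image and the helper name pixel_features are made up for this example and do not come from the book.

```python
# A tiny 3x3 grayscale "image": each entry is a pixel intensity
# (0 = black, 255 = white). The values are invented for illustration.
image = [
    [0, 255, 0],
    [0, 255, 0],
    [0, 255, 0],
]


def pixel_features(img):
    """Flatten a 2D grid of intensities into a 1D feature vector, row by row."""
    return [pixel for row in img for pixel in row]


features = pixel_features(image)
print(len(features))  # one feature per pixel
print(features)
```

The vector grows with the number of pixels, so even modest images produce thousands of features under this representation.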

A model trained on our basic feature representations might not be able to recognize the same zero if it were shifted a few pixels in any direction, enlarged, or rotated a few degrees. Furthermore, learning from pixel intensities is itself problematic, as the model can become sensitive to changes in illumination. For these reasons, this representation is ineffective for tasks that involve photographs or other natural images. Modern computer vision applications frequently use either hand-engineered feature extraction methods that are applicable to many different problems, or automatically learn features without supervision using techniques such as deep learning. We will focus on the former in the next section.
Extracting points of interest as features
The feature vector we created previously represents every pixel in the image; all of the informative attributes of the image are represented and all of the noisy attributes are represented too.
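The sensitivity to small shifts can be seen with a toy example. The sketch below assumes a made-up 4x4 binary image containing a vertical stroke; the helper names (flatten, shift_right, squared_distance, overlap) are hypothetical and chosen only for this illustration.

```python
def flatten(img):
    """Flatten a 2D grid into a 1D feature vector, row by row."""
    return [pixel for row in img for pixel in row]


def shift_right(img, fill=0):
    """Shift every row one pixel to the right, padding the left edge with `fill`."""
    return [[fill] + row[:-1] for row in img]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def overlap(a, b):
    """Dot product: counts pixels that are 'on' in both binary vectors."""
    return sum(x * y for x, y in zip(a, b))


# A vertical stroke in the second column of a 4x4 binary image.
original = [
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
]
shifted = shift_right(original)

distance = squared_distance(flatten(original), flatten(shifted))
shared = overlap(flatten(original), flatten(shifted))
print(distance, shared)
```

Although a person would call the two images the same stroke, their raw pixel vectors share no active pixel, which is why a model trained only on pixel intensities can fail on a shifted input.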

pages: 661 words: 156,009

Your Computer Is on Fire
by Thomas S. Mullaney , Benjamin Peters , Mar Hicks and Kavita Philip
Published 9 Mar 2021

These include but are not limited to:
• A variety of what I would describe as “first-order,” more rudimentary, blunt tools that are long-standing and widely adopted, such as keyword ban lists for content and user profiles, URL and content filtering, IP blocking, and other user-identifying mechanisms;13
• More sophisticated automated tools such as hashing technologies used in products like PhotoDNA (used to automate the identification and removal of child sexual exploitation content; other engines based on this same technology do the same with regard to terroristic material, the definitions of which are the province of the system’s owners);14
• Higher-order AI tools and strategies for content moderation and management at scale, examples of which might include:
  ◦ Sentiment analysis and forecasting tools based on natural language processing that can identify when a comment thread has gone bad or, even more impressive, when it is in danger of doing so;15
  ◦ AI speech-recognition technology that provides automatic, automated captioning of video content;16
  ◦ Pixel analysis (to identify, for example, when an image or a video likely contains nudity);17
  ◦ Machine learning and computer vision-based tools deployed toward a variety of other predictive outcomes (such as judging potential for virality or recognizing and predicting potentially inappropriate content).18
Computer vision was in its infancy when I began my research on commercial content moderation. When I queried a computer scientist who was a researcher in a major R&D site at a prominent university about the state of that art some years ago and how it might be applied to deal with user-generated social media content at scale, he gestured at a static piece of furniture sitting inside a dark visualization chamber and remarked, “Right now, we’re working on making the computer know that that table is a table.”

Historically, these training sets have treated the white male adult face as a default against which computer vision algorithms are trained.37 Databases of missing and abused children held by NCMEC in the US disproportionately contain images of white children, and nonwhite children are statistically less likely to be reported as missing or have extensive case files of data. In examining how image-recognition software and content reviewers “see” abuse images, I consider how skin tone, as a category for detection, is made manifest as digital racial matter.38 In the course of my fieldwork, computer vision researchers often remark that they earnestly want to address the “problem” of underdetectability, or even undetectability, of darker skin tones, Black and East Asian features, and younger ages.

Different actors—be they law enforcement investigators, digital forensic startups, social media company reviewer teams, or the outsourced content review workers who are contracted by larger corporations to make the first reviews of potentially abusive images—learn to adapt shared ways of seeing images. Cristina Grasseni emphasizes that “one never simply looks. One learns how to look.”28 Such ways of seeing,29 distributed and honed across human and computer vision, become manifest as ways of accessing the world and managing it. Seeing Like an Image-Recognition Algorithm The image forensics software used by NCMEC searches through an in-house database of known images of child pornography to see if the new image might be similar, or even identical, to an image that has already gone through that system.

pages: 263 words: 81,527

The Mind Is Flat: The Illusion of Mental Depth and the Improvised Mind
by Nick Chater
Published 28 Mar 2018

Yet progress seemed slower, and the challenges far greater, than had been imagined. By the 1970s, serious doubts began to set in; by the 1980s, the programme of mining and systematizing knowledge started to grind to a halt. Indeed, the project of modelling human intelligence has since been quietly abandoned, in favour of specialist projects in computer vision, speech-processing, machine translation, game-playing, robotics and self-driving vehicles. Artificial intelligence since the 1980s has been astonishingly successful in tackling these specialized problems. This success has come, though, from completely bypassing the extraction of human knowledge into common-sense theories.

Neisser’s first intriguing finding was that, from the start, people found this apparently substantial complication of the task no problem at all – they were easily able to lock their attention onto one stream of video and ignore the other. The brain was able to monitor one video almost as if the other superimposed video was not there at all. By contrast, for current computer vision systems, ‘unscrambling’ the scenes, and attending to one and ignoring the other, would be enormously challenging. But Neisser’s second finding was the real surprise. He added a highly salient, and unexpected, event during the course of the video: a woman carrying a large umbrella strolled into view among the players, walked right across the scene, before disappearing from view.

To the extent that the brain focuses on just a few constraints, satisfies them as well as possible, and then looks at the remaining constraints, there is a real danger of heading up a cul-de-sac – the next constraints may not fit our tentative interpretation at all, and it will then have to be abandoned. The task of simultaneously matching a huge number of clues and constraints is just what the brain’s cooperative style of computation is wonderfully good at. But these are the calculations that our imagined computer vision program would have to carry out – and which, we can conjecture, the brain must carry out in order to create Idesawa’s spiky sphere. It turns out, in fact, that the brain may be particularly well adapted to solving problems in which large numbers of constraints must be satisfied simultaneously.

pages: 288 words: 86,995

Rule of the Robots: How Artificial Intelligence Will Transform Everything
by Martin Ford
Published 13 Sep 2021

DEEP LEARNING AND THE FUTURE OF ARTIFICIAL INTELLIGENCE 1. Martin Ford, Interview with Geoffrey Hinton, in Architects of Intelligence: The Truth about AI from the People Building It, Packt Publishing, 2018, pp. 72–73. 2. Matt Reynolds, “New computer vision challenge wants to teach robots to see in 3D,” New Scientist, April 7, 2017, www.newscientist.com/article/2127131-new-computer-vision-challenge-wants-to-teach-robots-to-see-in-3d/. 3. Ashlee Vance, “Silicon Valley’s latest unicorn is run by a 22-year-old,” Bloomberg Businessweek, August 5, 2019, www.bloomberg.com/news/articles/2019-08-05/scale-ai-is-silicon-valley-s-latest-unicorn. 4.

This data gusher would soon intersect with the latest machine learning algorithms to enable a revolution in artificial intelligence. One of the most consequential new data troves resulted from the efforts of a young computer science professor at Princeton University. Fei-Fei Li, whose work was focused on computer vision, realized that teaching machines to make visual sense of the real world would require a comprehensive teaching resource with properly labeled examples showing many variations of people, animals, buildings, vehicles, objects—and just about anything else one might encounter. Over a two-and-a-half-year period, she set out to give titles to more than three million images across over 5,000 categories.

Facial recognition, in particular, is being widely deployed in the United States and other democratic countries and has already led to intense debate and accusations of bias and misuse. These issues will become only more fraught as the technology continues to become more powerful and—unless it is strictly regulated—ubiquitous. CHINA’S LEAP TO THE FOREFRONT OF ARTIFICIAL INTELLIGENCE RESEARCH AND DEVELOPMENT In June 2018, a major conference on computer vision was held in Salt Lake City, Utah. In the six years since the famous 2012 ImageNet competition, the field of machine vision had advanced dramatically, and researchers were now focused on solving far more difficult problems. One of the highlights of the conference was the Robust Vision Challenge.

pages: 416 words: 112,268

Human Compatible: Artificial Intelligence and the Problem of Control
by Stuart Russell
Published 7 Oct 2019

Roughly speaking, satellites image the entire world every day at an average resolution of around fifty centimeters per pixel. At this resolution, every house, ship, car, cow, and tree on Earth is visible. Well over thirty million full-time employees would be needed to examine all these images;25 so, at present, no human ever sees the vast majority of satellite data. Computer vision algorithms could process all this data to produce a searchable database of the whole world, updated daily, as well as visualizations and predictive models of economic activities, changes in vegetation, migrations of animals and people, the effects of climate change, and so on. Satellite companies such as Planet and DigitalGlobe are busy making this idea a reality.
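The scale described above can be checked with a rough back-of-the-envelope sketch. It assumes Earth's total surface area is about 510 million square kilometres (a figure not given in the excerpt) and, as a simplification, counts oceans as well as land. The resolution and head count are the ones quoted; the variable names are illustrative only.

```python
# Rough scale estimate for imaging the whole Earth once per day.
# Assumption (not from the excerpt): Earth's surface is ~510 million km^2,
# oceans included, which overstates what anyone would actually inspect.
EARTH_SURFACE_KM2 = 510e6
RESOLUTION_M = 0.5   # fifty centimetres per pixel, as quoted
ANALYSTS = 30e6      # "well over thirty million" employees, as quoted


def pixels_per_day(surface_km2: float, resolution_m: float) -> float:
    """Pixels needed to cover the surface once at the given resolution."""
    surface_m2 = surface_km2 * 1e6            # 1 km^2 = 1,000,000 m^2
    return surface_m2 / (resolution_m ** 2)   # each pixel covers resolution^2 m^2


total = pixels_per_day(EARTH_SURFACE_KM2, RESOLUTION_M)
per_analyst = total / ANALYSTS

print(f"pixels per day:     {total:.2e}")
print(f"pixels per analyst: {per_analyst:.2e}")
```

Under these assumptions the daily total is on the order of 10^15 pixels, which gives a sense of why the excerpt treats purely human review as out of reach. The estimate does not attempt to reproduce how the thirty-million figure itself was derived.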

It is a straightforward process, using methods such as inductive logic programming,44 to create programs that propose new concepts and definitions in order to identify theories that are both accurate and concise. At present, we know how to do this for relatively simple cases, but for more complex theories the number of possible new concepts that could be introduced becomes simply enormous. This makes the recent success of deep learning methods in computer vision all the more intriguing. The deep networks usually succeed in finding useful intermediate features such as eyes, legs, stripes, and corners, even though they are using very simple learning algorithms. If we can understand better how this happens, we can apply the same approach to learning new concepts in the more expressive languages needed for science.

It searches for up to six hours in a given geographical region for any target that meets a given criterion and then destroys it. The criterion could be “emits a radar signal resembling antiaircraft radar” or “looks like a tank.” By combining recent advances in miniature quadrotor design, miniature cameras, computer vision chips, navigation and mapping algorithms, and methods for detecting and tracking humans, it would be possible in fairly short order to field an antipersonnel weapon like the Slaughterbot13 shown in figure 7 (right). Such a weapon could be tasked with attacking anyone meeting certain visual criteria (age, gender, uniform, skin color, and so on) or even specific individuals based on face recognition.

pages: 282 words: 63,385

Attention Factory: The Story of TikTok and China's ByteDance
by Matthew Brennan
Published 9 Oct 2020

The size of each video’s audience is decided predominantly by the system’s ever-changing and mysterious algorithms, and the key to gaming the system is understanding how these algorithms work. The moment a video is uploaded to TikTok, the clip and its text description are queued up to go through an automated audit. Computer vision is used to analyze and identify elements within the clip, which are then tagged and categorized with keywords. Videos suspected of violating the platform’s content guidelines are flagged for human review. The audit cross-checks the footage against a massive archive for duplicate content. This system is designed to prevent plagiarism, as well as the practice of downloading popular videos, removing the watermark, and reuploading them to a new account.
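The duplicate cross-check mentioned above is not described in any detail, so the following is only a minimal sketch of one well-known way to compare images approximately: an "average hash". It is a simple stand-in chosen for illustration, not ByteDance's actual method, and it assumes frames have already been decoded and shrunk to tiny grayscale grids. The function names and the distance threshold are hypothetical.

```python
# Average-hash sketch for near-duplicate detection.
# Assumption: frames are already decoded and downscaled to small
# grayscale grids (rows of 0-255 ints). Production systems are far
# more elaborate; this only illustrates the general idea.

def average_hash(gray):
    """Return a tuple of bits: 1 where a pixel is brighter than the mean."""
    pixels = [p for row in gray for p in row]
    mean = sum(pixels) / len(pixels)
    return tuple(1 if p > mean else 0 for p in pixels)


def hamming(a, b):
    """Count positions where two equal-length hashes differ."""
    return sum(x != y for x, y in zip(a, b))


def is_probable_duplicate(gray_a, gray_b, max_distance=2):
    """Hypothetical rule: hashes within max_distance bits count as duplicates."""
    return hamming(average_hash(gray_a), average_hash(gray_b)) <= max_distance


original = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [10, 20, 200, 210],
    [15, 25, 205, 215],
]
# The same picture, uniformly brightened (as a re-encode might do).
brightened = [[min(255, p + 20) for p in row] for row in original]
# A different picture: bright on top, dark below.
different = [
    [200, 210, 205, 215],
    [200, 210, 205, 215],
    [10, 20, 15, 25],
    [10, 20, 15, 25],
]

print(is_probable_duplicate(original, brightened))
print(is_probable_duplicate(original, different))
```

Because the hash records only which pixels are brighter than the image's own average, a uniform brightness change leaves it untouched, while a genuinely different layout flips many bits. Cropping, mirroring, or watermark removal would need sturdier techniques than this.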

This core system for recommending written articles with Toutiao was later adapted and used for short videos with TikTok and Douyin. All these apps make use of the same ByteDance backend recommendation engine system. Videos are more challenging as they tend to be uploaded without keyword tagging or accurate titles and descriptions, making for an exciting computer vision challenge to work out what is actually in the video. The beauty of relying on recommendations to improve engagement is that it creates a virtuous cycle of continual improvement over time, often referred to as a “data network effect.” The more time spent using the app, the more enriched becomes the user profile, which leads to more accurate content matches and better user experience.

In a presentation 203 discussing the rise of Douyin, Kelly Zhang drew attention to four factors—full-screen high definition, music, special effects filters, and personalized recommendations. Smartphone screens had overall gotten much larger with higher definition, greatly improving the video watching experience. Face recognition and augmented reality effects had become commonplace, which allowed for more engaging, fun special effects and filters. Image recognition and computer vision had made very considerable advances, greatly reducing the need for manual audits of inappropriate content and allowing for the classification of videos that lacked meta-data. Most relevant of all were the advances made in big data and recommendation technology in which ByteDance specialized, which leads us nicely into the next reason.

pages: 196 words: 61,981

Blockchain Chicken Farm: And Other Stories of Tech in China's Countryside
by Xiaowei Wang
Published 12 Oct 2020

And I am struck by her relationship to machines, and to her own body. In the same way hardware can have different enclosures, she says, she sees her own body as an enclosure. She performs body modification because she believes “you have to give the computer what it wants.” She anticipates a world of computer vision algorithms on video platforms that increase rankings based on the content of the video, with platforms placing “attractive women” first in search results. Naomi wants to show up first. In an ideal universe, she says, she would have a shop at Huaqiangbei, the famed electronics market of Shenzhen, known as “the market of the future.”

Each house is perfectly numbered, some with hyphens like 1-1 or 684-1. When he clicks on the house, a list of residents pops up in a small window. I ask him how the numbers are so precise, in the absence of formal addresses, and how they get the information about the residents. In my mind, I imagine some sophisticated computer vision tool that looks at the aerial image, calculates the boundary of the house, and then assigns it a number. I imagine that the city has sensors and surveillance cameras to capture how many people leave the house. I also imagine that the surveillance cameras would know the face and personal ID number of each resident, perhaps tracked all the way from their tiny rural village through the numerous cameras I see everywhere—in train stations, at vending machines, on the street.

Face recognition is a system with numerous parts, and each part is the domain of a private company—whether the one that owns the surveillance cameras used, the algorithm, or the computational power rented out on a server. The Face++ showroom has plush white carpeting and shiny white walls with inset screens. One wall features real-time camera footage from outside the showroom, in the office and outside the building. The display showcases how fast and precise Face++ computer vision algorithms are—as someone walks by the building, the algorithm detects their blue pants and umbrella. There’s also a hidden camera that you can stand in front of and the algorithm instantly classifies your age and gender. I am for some reason characterized by the Face++ algorithm as a twenty-seven-year-old male.

pages: 424 words: 114,905

Deep Medicine: How Artificial Intelligence Can Make Healthcare Human Again
by Eric Topol
Published 1 Jan 2019

The other was that “at least [Deep Blue] didn’t enjoy beating me.”18 These will be important themes in our discussion of what AI can (and can’t) do for medicine. Even if Deep Blue didn’t have much of anything to do with deep learning, the technology’s day was coming. The founding of ImageNet by Fei-Fei Li in 2007 had historic significance. That massive database of 15 million labeled images would help catapult DNN into prominence as a tool for computer vision. In parallel, natural-language processing for speech recognition based on DNN at Microsoft and Google was moving into full swing. More squarely in the public eye was man versus machine in 2011, when IBM Watson beat the human Jeopardy! champions. Despite the relatively primitive AI that was used, which had nothing to do with deep learning networks and which relied on speedy access to Wikipedia’s content, IBM masterfully marketed it as a triumph of AI.

Car perception is achieved by a combination of cameras, radar, LIDAR (light pulses reflected off objects), and the AI “multi-domain controller” that handles, with DNN, the inputs and the outputs of decisions. Simulating human perceptive capabilities through software is still considered a formidable challenge. Computer vision has since reduced its error rate at identifying a pedestrian from 1 out of 30 frames to 1 in 30 million frames. There’s the power of fleet learning to help, whereby the communication and sharing among all autonomous cars with the same operating system can make them smarter. There are other challenges besides perception, however.

It’s the combination of AI learning with key human-specific features like common sense that is alluring for medicine. All too commonly we ascribe the capability of machines to “read” scans or slides, when they really can’t read. Machines’ lack of understanding cannot be emphasized enough. Recognition is not understanding; there is zero context, exemplified by Fei-Fei Li’s TED Talk on computer vision. A great example is the machine interpretation of “a man riding a horse down the street,” which actually is a man on a horse sitting high on a statue going nowhere. That symbolizes the plateau we’re at for image recognition. When I asked Fei-Fei Li in 2018 whether anything had changed or improved, she said, “Not at all.”

pages: 49 words: 12,968

Industrial Internet
by Jon Bruner
Published 27 Mar 2013

Automotive Google captured the public imagination when, in 2010, it announced that its autonomous cars had already driven 140,000 miles of winding California roads without incident. The idea of a car that drives itself was finally realized in a practical way by software that has strong links to the physical world around it: inbound, through computer vision software that takes in images and rangefinder data and builds an accurate model of the environment around the car; and outbound, through a full linkage to the car’s controls. The entire system is encompassed in a machine-learning algorithm that observes the results of its actions to become a better driver, and that draws software updates and useful data from the Internet.

Silicon Valley and industry adapting to each other Nathan Oostendorp thought he’d chosen a good name for his new startup: “Ingenuitas,” derived from the Latin for “freely born” — appropriate, he thought, for a company that would be built on his own commitment to open-source software. But Oostendorp, earlier a co-founder of Slashdot, was aiming to bring modern computer vision systems to heavy industry, where the Latinate name didn’t resonate. At his second meeting with a salty former auto executive who would become an advisor to his company, Oostendorp says, “I told him we were going to call the company Ingenuitas, and he immediately said, ‘bronchitis, gingivitis, inginitis.

pages: 590 words: 152,595

Army of None: Autonomous Weapons and the Future of War
by Paul Scharre
Published 23 Apr 2018

Grainy SAR images of tanks, artillery, or airplanes parked on a runway often push the limits of human abilities to recognize objects, and historically ATR algorithms have fallen far short of human abilities. The poor performance of military ATR stands in stark contrast to recent advances in computer vision. Artificial intelligence has historically struggled with object recognition and perception, but the field has seen rapid gains recently due to deep learning. Deep learning uses neural networks, a type of AI approach that is analogous to biological neurons in animal brains. Artificial neural networks don’t directly mimic biology, but are inspired by it.

Within other parts of TensorFlow, though, lie more powerful tools to use existing neural networks or design custom ones, all within reach of a reasonably competent programmer in Python or C++. TensorFlow includes extensive tutorials on convolutional neural nets, the particular type of neural network used for computer vision. In short order, I found a neural network available for download that was already trained to recognize images. The neural network Inception-v3 is trained on the ImageNet dataset, a standard database of images used by programmers. Inception-v3 can classify images into one of 1,000 categories, such as “gazelle,” “canoe,” or “volcano.”
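To make "classify images into one of 1,000 categories" concrete, here is a small sketch of the final step such a classifier typically performs: converting raw scores into probabilities and picking the most likely labels. It deliberately avoids any TensorFlow calls. The three labels come from the excerpt, but the scores are invented for illustration, and a real Inception-v3 model would produce one score per category.

```python
import math

# Turning raw network scores into a ranked list of labels.
# Assumption: the scores below are made up for illustration; a real
# model would supply them, one per category.

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)                           # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(labels, scores, k=2):
    """Return the k most probable (label, probability) pairs."""
    probs = softmax(scores)
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


labels = ["gazelle", "canoe", "volcano"]   # categories named in the excerpt
scores = [2.0, 0.5, 4.0]                   # hypothetical raw outputs

for label, prob in top_k(labels, scores):
    print(f"{label}: {prob:.3f}")
```

With these invented scores, "volcano" ranks first because it has the highest raw score. Softmax preserves that ordering while rescaling the values into probabilities.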

ROBOTS EVERYWHERE Just because the tools needed to make an autonomous weapon were widely available didn’t tell me how easy or hard it would be for someone to actually do it. What I wanted to understand was how widespread the technological know-how was to build a homemade robot that could harness state-of-the-art techniques in deep learning computer vision. Was this within reach of a DIY drone hobbyist or did these techniques require a PhD in computer science? There is a burgeoning world of robot competitions among high school students, and this seemed like a great place to get a sense of what an amateur robot enthusiast could do. The FIRST Robotics Competition is one such competition that includes 75,000 students organized in over 3,000 teams across twenty-four countries.

pages: 269 words: 70,543

Tech Titans of China: How China's Tech Sector Is Challenging the World by Innovating Faster, Working Harder, and Going Global
by Rebecca Fannin
Published 2 Sep 2019

Determined to keep a lead in cutting-edge AI technology, Baidu budgeted $300 million for a second Silicon Valley research lab in 2017, supplementing its first in 2014, and the Beijing-based titan has set up an engineering office in Seattle to focus on autonomous driving and internet security. Baidu has pumped loads of capital into AI startups in the United States with technologies for deep learning, data analytics, and computer vision. See table 2-3. “Having missed out on the social mobile and e-commerce waves of the past few years, Baidu is trying not to repeat the same mistake by going all in on AI, on all fronts,” observes Evdemon of Sinovation Ventures, the Beijing-based venture capital firm headed by AI expert and investor Kai-Fu Lee.

In China, Baidu, Alibaba, and Tencent are working on similar technologies and racing with the US tech giants to become world leaders in AI. The Ministry of Science and Technology in China has earmarked specialties for each of these Chinese tech titans in its master plan for AI global dominance: Baidu for autonomous driving, Alibaba for smart-city initiatives, and Tencent for computer vision in medical diagnoses. The Chinese government also has designated two startups to lead AI development: SenseTime for facial recognition and iFlytek for speech recognition. Baidu, Alibaba, and Tencent are all powering up in autonomous driving, and each has a specialty focus area in AI. Baidu has its DuerOS line of smart household goods and Apollo, an open platform for self-driving technology solutions, and detoured on the AI journey several years before Google in 2015.

Baidu’s AI plate includes not only 95 partners in its ecosystem worldwide working on autonomous driving but also investments in AI-related startups in the US: ZestFinance in fintech underwriting, Kitt.ai in conversational language search, TigerGraph in data link analytics, Tiger Computing Solutions in big data, and xPerception in computer vision for self-driving. Tencent has a number of AI partnerships in the health-care space globally and has invested in 12 US startups in AI, including avatar creator ObEN and two in drug discovery based on deep learning, Atomwise and XtalPi. White House Weighs In China hasn’t created any world leaders in cars or semiconductors, but few pooh-pooh its growing ability of AI fundamental technology that touches our everyday lives, from e-commerce fraud detection to systems that can detect cancer; to sensors for self-driving; to robot-powered deliveries, education, and online lending.

pages: 475 words: 134,707

The Hype Machine: How Social Media Disrupts Our Elections, Our Economy, and Our Health--And How We Must Adapt
by Sinan Aral
Published 14 Sep 2020

Video accounts for 80 percent of all consumer Internet traffic: Mary Lister, “37 Staggering Video Marketing Statistics for 2018,” Wordstream Blog, June 9, 2019, https://www.wordstream.com/blog/ws/2017/03/08/video-marketing-statistics. Facebook’s “visual cortex”: Manohar Paluri, manager of Facebook’s Computer Vision Group, speaking at the LDV Capital “Vision Summit” in 2017, https://www.ldv.co/blog/2018/4/4/facebook-is-building-a-visual-cortex-to-better-understand-content-and-people. “We’ve pushed computer vision to the next stage”: Joaquin Quiñonero Candela, “Building Scalable Systems to Understand Content,” Facebook Engineering Blog, February 2, 2017, https://engineering.fb.com/ml-applications/building-scalable-systems-to-understand-content/.

VidMob is a portfolio company of Manifest Capital, the venture fund I started in 2016 with my longtime friend and business partner Paul Falzone. I work directly with VidMob on developing its Agile Creative Studio (ACS), the leading platform for video optimization. The task of video optimization is challenging. It requires a complex combination of machine learning, computer vision, predictive modeling, and optimization. But the basic process is easy to understand. The main goal is to understand, second by second, what’s in a video, what it’s about, its context, feelings, and sentiment, and to compare the presence or absence of these elements to key performance indicators (KPIs) like video view-throughs, retention, drop-off rates, clicks, engagement, brand recognition, and satisfaction.

By closing the loop of video production, analytics, optimization, and publishing, VidMob can improve its clients’ return on marketing investment. ACS automatically extracts video metadata and performs sentiment analysis. It uses deep learning and computer vision to identify the emotions, objects, logos, people, and words in videos; it can detect facial expressions like delight, surprise, or disgust. It then analyzes how each of these elements corresponds, for instance, to moments when viewers are dropping off from watching the video, and it recommends (and automates) editing that improves retention.

pages: 472 words: 80,835

Life as a Passenger: How Driverless Cars Will Change the World
by David Kerrigan
Published 18 Jun 2017

Google announce Car:
https://googleblog.blogspot.ie/2010/10/what-were-driving-at.html
http://www.makeuseof.com/tag/how-self-driving-cars-work-the-nuts-and-bolts-behind-googles-autonomous-car-program/
https://techcrunch.com/2017/02/12/wtf-is-lidar/
http://velodynelidar.com/hdl-64e.html
https://www.bloomberg.com/news/articles/2017-05-04/another-group-of-google-veterans-starts-a-self-driving-technology-company
http://www.wsj.com/articles/google-tries-to-make-its-cars-drive-more-like-humans-1443463523
Skill Atrophy:
http://cacm.acm.org/magazines/2016/5/201592-the-challenges-of-partially-automated-driving/fulltext
http://news.stanford.edu/2016/12/06/taking-back-control-autonomous-car-affects-human-steering-behavior/review/
https://arxiv.org/pdf/1704.07911.pdf
https://blogs.nvidia.com/blog/2017/04/27/how-nvidias-neural-net-makes-decisions/
https://www.technologyreview.com/s/601567/tesla-tests-self-driving-functions-with-secret-updates-to-its-customers-cars/
http://www.engadget.com/2016/01/11/ford-is-testing-autonomous-cars-in-the-snow/
http://www.engadget.com/2015/11/13/ford-first-self-driving-mcity-michigan/
https://www.dmv.ca.gov/portal/dmv/detail/vr/autonomous/disengagement_report_2016
If you want to understand more about the technicalities of SDC Computer Vision, there’s a massively detailed review of computer vision. https://arxiv.org/pdf/1704.05519.pdf
Chapter 4 - Safety
Cost of crashes:
http://www.rmiia.org/auto/traffic_safety/Cost_of_crashes.asp
http://www.who.int/mediacentre/factsheets/fs358/en/
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2610566/
http://www.nytimes.com/2016/05/23/science/its-no-accident-advocates-want-to-speak-of-car-crashes-instead.html?

In November 2015, in widely reported comments, an electronics researcher for Volkswagen said at the Connected Car Expo event in Los Angeles that even a tumbleweed in the road can bring a driverless car to a halt.[268] The point is valid in so far as an unknown object represents a challenge to a driverless car where normally none would exist for a human. I agree that we must plan for the unusual but we must also keep some perspective. Is it really any harm if a car stops for a tumbleweed? As long as the cars all stop, it’s better than the number of accidents caused by drivers avoiding more real obstacles today. More likely though, the computer vision technology will soon evolve to the point where it can identify tumbleweeds and deal appropriately with them. These challenges can either be presented in a sensationalist negative way or as a more matter-of-fact challenge to solve. Stop the Lights! Another often quoted challenge for driverless cars is identifying traffic lights.

Another frequently mooted challenge is the condition of roads.[286] Between their multitude of sensors, GPS and detailed map data, driverless cars can now cope much better with obscured, weathered, or inaccurate road edges or lane markings than they could at the start of their development. But it remains preferable for them to have good quality markings, and I, for one, would certainly prefer that the computer vision abilities of the car were scanning for any danger rather than exerting effort on simply trying to figure out where the road was supposed to be. We know that trains aren’t very good at running without tracks. But if we want trains, we put tracks down and maintain them. Similarly, we know that cars need roads.

pages: 339 words: 92,785

I, Warbot: The Dawn of Artificially Intelligent Conflict
by Kenneth Payne
Published 16 Jun 2021

And Shakey wasn’t even moving about the real world, but a carefully simplified lab version of it, whose surfaces were painted and brightly lit to assist its computer modelling of the environment. If there was too much novelty—something in the wrong place, lighting not quite right—the robot was flummoxed. Researchers would say that its intelligence was ‘brittle’—unable to cope with novelty. Shakey was hugely ambitious, combining distinct research sub-fields in computer vision, language processing and robotics. And like many AI projects of the era, it was funded by the Pentagon, through its Advanced Research Projects Agency (ARPA—a D, for Defense, was added in 1972). The Pentagon was an enthusiastic sponsor of many AI research projects, forging links with centres of excellence at universities around the United States, including Shakey’s home department at Stanford, but also MIT and Carnegie Mellon: today these remain leading departments in the field.

But now there was a new challenge—how to make use of the new technology landscape, with Google, Amazon and Facebook sponsoring cutting-edge research. In 2017, the Pentagon stood up an ‘algorithmic warfare cross functional team’, known as Project Maven. The team would consolidate ‘all initiatives that develop, employ, or field artificial intelligence, automation, machine learning, deep learning, and computer vision algorithms’.12 It was a small team, initially, but would grow rapidly. And one of its main contractors on image recognition? Google, of course. The stage was set for the arrival of deep warbots. Hype or hope? The deep learning revolution has produced a new wave of AI hype, only some of which is justified.

Democracies and authoritarians sometimes develop exactly the same technology, see Collingridge, John, and Rob Watts, ‘Huawei buys stake in UK spy firm Vision Semantics’, The Times, 19 July 2020, https://www.thetimes.co.uk/article/huawei-buys-stake-in-uk-spy-firm-vision-semantics-65t98vdz0. 25. Thys, Simen, Wiebe Van Ranst, and Toon Goedemé. ‘Fooling automated surveillance cameras: adversarial patches to attack person detection’, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 49–55. 2019. 26. Ilyas, Andrew, Logan Engstrom, Anish Athalye, and Jessy Lin. ‘Black-box adversarial attacks with limited queries and information’, arXiv preprint arXiv:1804.08598 (2018). 27. For the British Army, see chapter 6, ‘Mission Command’, of Land Warfare Development Centre, ‘Land Operations,’ Army Doctrine Publication AC 71940, 31 March 2017, https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/605298/Army_Field_Manual__AFM__A5_Master_ADP_Interactive_Gov_Web.pdf.

pages: 290 words: 90,057

Billion Dollar Brand Club: How Dollar Shave Club, Warby Parker, and Other Disruptors Are Remaking What We Buy
by Lawrence Ingrassia
Published 28 Jan 2020

They soon chose the name ThirdLove for the bra business, to convey the three attributes—style, feel, and fit—that they wanted women to “love” about their bras (in contrast to most brands, which offered fashion or comfort, but rarely both, and even more rarely all three). Their first hire was Ra’el Cohen, a lingerie designer who had worked for several fashion retailers and had even started her own boutique luxury bra company a few years earlier, though it hadn’t worked out. Their second hire was a NASA engineer who had expertise in computer vision technology, using cameras to collect and analyze digital images, to create a better-fitting bra. Even though Spector had come from the venture capital world, raising money wasn’t easy. After all, this was in the early days of direct-to-consumer brands. But there was another reason: Zak recalls that they pitched their idea to more than fifty VC companies, almost always conference rooms full of men who didn’t understand why women might need a better bra.

The photo’s digital data—though not the photo itself, to avoid privacy concerns—would then be sent to the company, where an algorithm would translate the two-dimensional image’s data into three-dimensional measurements to recommend the right size. In early 2013, when a version of the bra app (whose computer-vision imaging technology has since received two patents) was ready for testing, ThirdLove placed ads on Craigslist (where most ad postings are free) and invited women to come to the company’s small office in San Francisco wearing their best bra. About one hundred showed up and used the app, and then tried on the prototype bra that Zak and Cohen had created based on standard sizes used in the industry.

“For example, the way a certain blouse fits tightly on the shoulders and flaunts the upper-arms may provide value to some clients while being an undesirable quality to others … Machines are great at finding and applying these relationships.” As Stitch Fix has developed more sophisticated algorithms, it has incorporated the use of computer vision to help select clothing. “We have our machines look at photos of clothing that customers like (e.g., from Pinterest), and look for visually similar items,” the website explains. And while the company initially sold apparel and accessories made by others, its data scientists in 2017 started designing “Stitch Fix exclusive brand” items by combining different style characteristics from popular clothing.

When Computers Can Think: The Artificial Intelligence Singularity
by Anthony Berglas, William Black, Samantha Thalind, Max Scratchmann and Michelle Estes
Published 28 Feb 2015

But today there are several quite effective translation engines. They do not produce human quality output, but they are certainly very usable. Computer vision is another technology that is surprisingly difficult to implement. Yet today’s computers regularly review the vast quantity of recorded surveillance video. People can be recognized and tracked over time, and this data can then be stored and analyzed. The Curiosity rover on Mars uses computer vision technology to navigate over the terrain without getting stuck. None of the above involves human-level reasoning, but they address difficult problems that form a basis for that reasoning.

Traditional AGI systems were strictly symbolic systems that communicated with the outside world via people typing on a teletype. These machines had very little access to the “real” world — they were like a brain in a vat that could only send and receive written letters to other people. Today work on computer vision and robotics has progressed enormously. As robots leave the factory, they will indeed be both able to and required to see and touch the real world. This will produce much richer symbolic and pre-symbolic models that should provide plenty of “intentionality”. This is also known as the symbol grounding problem and will be discussed in part II of this book.

A naively applied support vector machine produced a very reasonable error rate of 1.1%. A tuned version was then built that centered the image and focused the kernels on nearby pixels. This produced an impressive 0.56% error rate. A shape-matching approach explicitly looked at edges between black and white pixels in a similar manner to computer vision systems. It then attempted to match corresponding features of each pair of images using nearest neighbour clustering. This produced an error rate of 0.63%. Decision tables have also been effective in image analysis, although their effectiveness largely depends on the tests that can be applied to the image.

pages: 482 words: 121,173

Tools and Weapons: The Promise and the Peril of the Digital Age
by Brad Smith and Carol Ann Browne
Published 9 Sep 2019

Paul Scharre, a former US defense official working at a think tank, brings to life increasingly pertinent questions in his book, Army of None: Autonomous Weapons and the Future of War.19 As he illustrates, a central question is not just when but how computers should be empowered to launch a weapon without additional human review. On the one hand, even though a drone with computer vision and facial recognition might exceed human accuracy in identifying a terrorist on the ground, this doesn’t mean that military officials need to or should take personnel and common sense out of the loop. On the other hand, if dozens of missiles are launched at a naval flotilla, the Aegis combat system’s antimissile defenses need to respond according to computer-based decision-making.

The technology that recognizes Cruise in the Gap store is informed by a chip embedded inside him. But the real-world technology advances of the first two decades of the twenty-first century have outpaced even Spielberg’s imagination, as today no such chip is needed. Facial-recognition technology, utilizing AI-based computer vision with cameras and data in the cloud, can identify the faces of customers as they walk into a store based on their visit last week—or an hour ago. It is creating one of the first opportunities for the tech sector and governments to address ethical and human rights issues for artificial intelligence in a focused and concrete way, by deciding how facial recognition should be regulated.

Given the nature and role of academic research, universities have begun to set up data depositories, where data can be shared for multiple uses. Microsoft Research is pursuing this data-sharing approach too, making available a collection of free data sets to advance research in areas such as natural language processing and computer vision, as well as in the physical and social sciences. It was this ability to share data that inspired Matthew Trunnell. He recognized that the best way to accelerate the race to cure cancer is to enable multiple research organizations to share their data in new ways. While this sounds simple in theory, its execution is complicated.

pages: 360 words: 100,991

Heart of the Machine: Our Future in a World of Artificial Emotional Intelligence
by Richard Yonck
Published 7 Mar 2017

This was the situation when a young computer engineer named Rosalind Picard came to the MIT Media Lab in 1987 as a teaching and research assistant before joining the Vision and Modeling group as faculty in 1991. There Picard taught and worked on a range of new technologies and engineering challenges, including developing new architectures for pattern recognition, mathematical modeling, computer vision, perceptual science, and signal processing. With a degree in electrical engineering and later computer science, Picard had already made major contributions in a number of these areas. But it was Picard’s work developing image modeling and content-based retrieval systems that led her in a direction no one could have foreseen, least of all herself.

She’d driven herself hard doing pioneering research in image pattern modeling and developing the world’s first content-based retrieval system. Additionally, as she wrote in an article for IEEE, the Institute of Electrical and Electronics Engineers: I was busy working six days and nights a week building the world’s first content-based retrieval system, creating and mixing mathematical models from image compression, computer vision, texture modeling, statistical physics, machine learning, and ideas from filmmaking, and spending all my spare cycles advising students, building and teaching new classes, publishing, reading, reviewing, and serving on non-stop conference and lab committees. I worked hard to be taken as the serious researcher I was, and I had raised over a million dollars in funding for my group’s work.

As they gathered more and more samples of expressive responses to each ad, the system got better and better. As el Kaliouby explained in a keynote address: We capture emotions by looking at the face. The face happens to be one of the most powerful channels for communicating social and emotion information. And we do that by using computer vision and machine learning algorithms that track your face, your facial features, your eyes, your mouth, your eyebrows, and we map those to emotional data points. Then we take all this information and we map it into emotional states, like confusion, interest, enjoyment. And what we’ve found over the past couple of years as we’ve started to run off all this data is that the more data we had, the more accurate our emotion classifiers were able to be.

Calling Bullshit: The Art of Scepticism in a Data-Driven World
by Jevin D. West and Carl T. Bergstrom
Published 3 Aug 2020

Essentially, they aim to determine whether advanced computer vision can reveal subtle cues and patterns that Lombroso and his followers might have missed. To test this hypothesis, the authors use machine learning algorithms to determine what features of the human face are associated with “criminality.” Wu and Zhang claim that based on a simple headshot, their programs can distinguish criminal from noncriminal faces with nearly 90 percent accuracy. Moreover, they argue that their computer algorithms are free from the myriad biases and prejudices that cloud human judgment: Unlike a human examiner/judge, a computer vision algorithm or classifier has absolutely no subjective baggages [sic], having no emotions, no biases whatsoever due to past experience, race, religion, political doctrine, gender, age, etc., no mental fatigue, no preconditioning of a bad sleep or meal.

It is called MNIST, the Modified National Institute of Standards and Technology database for handwritten digits, and it includes seventy thousand labeled images of handwritten digits, similar to those drawn below. So how does the algorithm “see” images? If you don’t have a background in computer vision, this may seem miraculous. Let’s take a brief digression to consider how it works. A computer stores an image as a matrix. A matrix can be thought of as a table of rows and columns. Each cell in this table contains a number. For simplicity, let’s assume that our image is black and white.
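As a minimal sketch of the idea in this passage, a small black-and-white image can be written directly as a table of numbers. The encoding here (0 for a white pixel, 1 for a black one) and the helper names `render` and `flatten` are assumptions made for illustration; the excerpt does not fix any of them.

```python
# A tiny 5x5 black-and-white image stored as a matrix (a list of rows).
# Assumed encoding for this sketch: 0 = white pixel, 1 = black pixel.
image = [
    [0, 0, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0],
]


def render(matrix):
    """Turn the matrix back into text, one character per cell."""
    return "\n".join(
        "".join("#" if cell else "." for cell in row) for row in matrix
    )


def flatten(matrix):
    """Unroll the rows into one long list of numbers."""
    return [cell for row in matrix for cell in row]


print(render(image))
print(len(image), "rows x", len(image[0]), "columns")
print(sum(flatten(image)), "black pixels")
```

Printed as text, the matrix shows a rough digit "1". A learning algorithm never sees the picture, only the numbers in the cells, often unrolled into a single list as `flatten` does.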

The Economist and Guardian stories described a research paper in which Stanford University researchers Yilun Wang and Michal Kosinski trained a deep neural network to predict whether someone was straight or gay by looking at their photograph. Wang and Kosinski collected a set of training images from an Internet dating website, photos of nearly eight thousand men and nearly seven thousand women, evenly split between straight and gay. The researchers used standard computer vision techniques for processing the facial images. When given pictures of two people, one straight and the other gay, the algorithm did better than chance at guessing which was which. It also did better than humans charged with the same task. There are so many questions one could ask about the training data.

pages: 245 words: 64,288

Robots Will Steal Your Job, But That's OK: How to Survive the Economic Collapse and Be Happy
by Federico Pistono
Published 14 Oct 2012

Robots will eventually steal your job, but before them something else is going to jump in. In fact, it already has, in a much more pervasive way than any physical machine could ever do. I am of course talking about computer programs in general. Automated Planning and Scheduling, Machine Learning, Natural Language Processing, Machine Perception, Computer Vision, Speech Recognition, Affective Computing, Computational Creativity, these are all fields of Artificial Intelligence that do not have to face the cumbersome issues that Robotics has to. It is much easier to enhance an algorithm than it is to build a better robot. A more accurate title for the book would have been “Machine intelligence and computer algorithms are already stealing your job, and they will do so ever more in the future” – but that was not exactly a catchy title.

What this means is a database of information (thirteen years of studies and training) connected to a visual recognition system (the radiologist’s brain) is a process that already exists today and finds many applications. Visual pattern recognition software is already highly sophisticated; one such example is Google Images. You can upload an image to the search engine, and Google uses computer vision techniques to match your image to other images in the Google Images index and additional image collections. From those matches, they try to generate an accurate “best guess” text description of your image, as well as find other images that have the same content as your uploaded image. * * * Figure 6.1: Front page of Google Images.

The approach by Andrew Ng inspired many others, who are now teaching under the umbrella of a non-profit called ‘Coursera’, with high level subjects such as Model Thinking, Natural Language Processing, Game Theory, Probabilistic Graphical Models, Cryptography, Design and Analysis of Algorithms, Software as a Service, Computer Vision, Computer Science, Machine Learning, Human-Computer Interaction, Making Green Buildings, Information Theory, Anatomy, and Computer Security. Needless to say, this is just the beginning. It is the natural evolution of education when combined with technology. Embrace change, or die. So, how does this apply to you?

pages: 371 words: 107,141

You've Been Played: How Corporations, Governments, and Schools Use Games to Control Us All
by Adrian Hon
Published 14 Sep 2022

Everywhere you find Digital Taylorism, gamification follows. Even in what might seem like unorganised environments, like a busy shop floor or a crowded cafe, any task can look repetitive and improvable if you use enough sensors and apply enough processing power. Percolata is a “machine learning–based retail staffing” tool that uses computer vision to surveil shoppers and employees.55 It combines this information with sales data, weather forecasts, and marketing calendars to predict future shopper traffic, all in order to optimise staffing levels so employers pay only the bare minimum labour costs. At the same time, it creates a “true productivity” score for workers, ranking them from most to least productive.

Perhaps we could place sensors onto the mop handle, so the game could tell whether you were mopping or not, and with good enough processing, it might even be able to learn which surfaces players were mopping. It’s more convenient in some ways, but on its own it wouldn’t be able to assess the cleanliness of the floor. The obvious best solution is to check for dirt in the same way humans do: by looking at the floor. These days, computer vision (i.e., combining cameras and algorithms to understand the real world) is quite powerful and likely up to the task. However, it poses a new challenge: getting visual coverage of the entire floor. Asking players to buy and mount cameras all over the walls and ceiling is a stretch, and also a bit creepy.

AR has the potential to make challenging and mundane activities entertaining, or at the very least, slightly more bearable. AR games could also provide social benefits. I occasionally go out with a bin bag and litter picker to tidy up my street, which is something of a thankless task. It wouldn’t be difficult to turn this into a game by using computer vision to identify and classify litter, awarding extra points for especially ugly or messy things, or for collecting in neglected areas. No doubt people would try to cheat, but on balance I suspect you’d end up with happier litter pickers and much cleaner neighbourhoods. Litter picking, waste recycling, safe driving, healthy cooking, fitness, mindfulness—the sky’s the limit for AR making the world a better place!

pages: 696 words: 143,736

The Age of Spiritual Machines: When Computers Exceed Human Intelligence
by Ray Kurzweil
Published 31 Dec 1998

A link to an article in USA Today on how Ian Goldberg, the graduate student from the University of California, cracked the 40-bit encryption code: <http://www.usatoday.com/life/cyber/tech/ct718.htm> Autonomous Agents Agent Web Links: <http://www.cs.bham.ac.uk/~amw/agents/links/index.html> Computer Vision Computer Vision Research Groups: <http://www.cs.cmu.edu/~cil/v-groups.html> DNA Computing “DNA-based computers could race past supercomputers, researchers predict.” A link to an article in the Chronicle of Higher Education on DNA computing, by Vincent Kiernan: <http://chronicle.com/data/articles.dir/art-44.dir/issue-14.dir/14a02301.htm> Explanation of Molecular Computing with DNA, by Fred Hapgood, Moderator of the Nanosystems Interest Group at MIT: <http://www.mitre.org/research/nanotech/hapgood_on_dna.html> The University of Wisconsin: DNA Computing: <http://corninfo.chem.wisc.edu/writings/DNAcomputing.html> Expert Systems/Knowledge Engineering Knowledge Engineering, Engineering Management Graduate Program at Christian Brothers University: Online Resources to a Variety of Links: <http://www.cbu.edu/~pong/engm624.htm> Genetic Algorithms/Evolutionary Computation The Genetic Algorithms Archive at the Navy Center for Applied Research in Artificial Intelligence: <http://www.aic.nrl.navy.mil/galist/> The Hitchhiker’s Guide to Evolutionary Computation, Issue 6.2: A List of Frequently Asked Questions (FAQ), edited by Jörg Heitkotter and David Beasley: <ftp://ftp.cs.wayne.edu/pub/EC/FAQ/www/top.htm> The Santa Fe Institute: <http://www.santafe.edu> Knowledge Management ATM Links (Asynchronous Transfer Mode): <http://www.ee.cityu.edu.hk/~splam/html/atmlinks.html> Knowledge Management Network: <http://kmn.cibit.hvu.nl/index.html> Some Ongoing KBS/Ontology Projects and Groups: <http://www.cs.utexas.edu/users/mfkb/related.html> Nanotechnology Eric Drexler’s web site at the Foresight Institute (includes the complete text of Engines of Creation): 
<http://www.foresight.org/EOC/index.html> Richard Feynman’s talk, “There’s Plenty of Room at the Bottom”: <http://nano.xerox.com/nanotech/feynman.html> Nanotechnology: Ralph Merkle’s web site at the Xerox Palo Alto Research Center: <http://sandbox.xerox.com/nano> MicroElectroMechanical Systems and Fluid Dynamics Research Group Professor Chih-Ming Ho’s Laboratory, University of California at Los Angeles: <http://ho.seas.ucla.edu/new/main.htm> Nanolink: Key Nanotechnology Sites on the Web: <http://sunsite.nus.sg/MEMEX/nanolink.html> Nanothinc: <http://www.nanothinc.com/> NEC Research and Development Letter: A summary of Dr.

New York: Times Books, 1978. Finkelstein, Joseph, ed. Windows on a New World: The Third Industrial Revolution. New York: Greenwood Press, 1989. Fischler, Martin A. and Oscar Firschein. Intelligence: The Eye, the Brain and the Computer. Reading, MA: Addison-Wesley, 1987. ————, eds., Readings in Computer Vision: Issues, Problems, Principles, and Paradigms. Los Altos, CA: Morgan Kaufmann, 1987. Fjermedal, Grant. The Tomorrow Makers: A Brave New World of Living Brain Machines. New York: Macmillan Publishing Company, 1986. Flanagan, Owen. Consciousness Reconsidered. Cambridge, MA: MIT Press, 1992. Flynn, Anita, Rodney A.

New York: Academic Press, 1980. Miller, Eric, ed. Future Vision: The 189 Most Important Trends of the 1990s. Naperville, IL: Sourcebooks Trade, 1991. Minsky, Marvin. Computation: Finite and Infinite Machines. Englewood Cliffs, NJ: Prentice-Hall, 1967. ─. “A Framework for Representing Knowledge.” In The Psychology of Computer Vision, edited by P. H. Winston. New York: McGraw-Hill, 1975. ─. The Society of Mind. New York: Simon and Schuster, 1985. ─, ed. Robotics. New York: Doubleday, 1985. ─, ed. Semantic Information Processing. Cambridge, MA: MIT Press, 1968. Minsky, Marvin and Seymour A. Papert. Perceptrons: An Introduction to Computational Geometry.

pages: 913 words: 265,787

How the Mind Works
by Steven Pinker
Published 1 Jan 1997

Rotating objects to recognize them: A case study on the role of viewpoint dependency in the recognition of three-dimensional shapes. Psychonomic Bulletin and Review, 2, 55–82. Tarr, M. J., & Black, M. J. 1994a. A computational and evolutionary perspective on the role of representation in vision. Computer Vision, Graphics, and Image Processing: Image Understanding, 60, 65–73. Tarr, M. J., & Black, M. J. 1994b. Reconstruction and purpose. Computer Vision, Graphics, and Image Processing: Image Understanding, 60, 113–118. Tarr, M. J., & Bülthoff, H. H. 1995. Is human object recognition better described by geon-structural-descriptions or by multiple views? Journal of Experimental Psychology: Human Perception and Performance, 21, 1494–1505.

The analysis begins with a goal to be attained and a world of causes and effects in which to attain it, and goes on to specify what kinds of designs are better suited to attain it than others. Unfortunately for those who think that the departments in a university reflect meaningful divisions of knowledge, it means that psychologists have to look outside psychology if they want to explain what the parts of the mind are for. To understand sight, we have to look to optics and computer vision systems. To understand movement, we have to look to robotics. To understand sexual and familial feelings, we have to look to Mendelian genetics. To understand cooperation and conflict, we have to look to the mathematics of games and to economic modeling. Once we have a spec sheet for a well-designed mind, we can see whether Homo sapiens has that kind of mind.

Two-dimensional surfaces can be curved in the third dimension, like a rubber mold or a blister package. Fifth, we don’t immediately see “objects,” the movable hunks of matter that we count, classify, and label with nouns. As far as vision is concerned, it’s not even clear what an object is. When David Marr considered how to design a computer vision system that finds objects, he was forced to ask: Is a nose an object? Is a head one? Is it still one if it is attached to a body? What about a man on horseback? These questions show that the difficulties in trying to formulate what should be recovered as a region from an image are so great as to amount almost to philosophical problems.

pages: 396 words: 117,149

The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World
by Pedro Domingos
Published 21 Sep 2015

Like Bayesian networks, Markov networks can be represented by graphs, but they have undirected arcs instead of arrows. Two variables are connected, meaning they depend directly on each other, if they appear together in some feature, like Ballad and By a hip-hop artist in Ballad by a hip-hop artist. Markov networks are a staple in many areas, such as computer vision. For instance, a driverless car needs to segment each image it sees into road, sky, and countryside. One option is to label each pixel as one of the three according to its color, but this is not nearly good enough. Images are very noisy and variable, and the car will hallucinate rocks strewn all over the roadway and patches of road in the sky.

Alchemy learned over a million such patterns from facts extracted from the web (e.g., Earth orbits the sun). It discovered concepts like planet all by itself. The version we used was more advanced than the basic one I’ve described here, but the essential ideas are the same. Various research groups have used Alchemy or their own MLN implementations to solve problems in natural language processing, computer vision, activity recognition, social network analysis, molecular biology, and many other areas. Despite its successes, Alchemy has some significant shortcomings. It does not yet scale to truly big data, and someone without a PhD in machine learning will find it hard to use. Because of these problems, it’s not yet ready for prime time.

My paper on Naïve Bayes, with Mike Pazzani, is “On the optimality of the simple Bayesian classifier under zero-one loss”* (Machine Learning, 1997; expanded journal version of the 1996 conference paper). Judea Pearl’s book,* mentioned above, discusses Markov networks along with Bayesian networks. Markov networks in computer vision are the subject of Markov Random Fields for Vision and Image Processing,* edited by Andrew Blake, Pushmeet Kohli, and Carsten Rother (MIT Press, 2011). Markov networks that maximize conditional likelihood were introduced in “Conditional random fields: Probabilistic models for segmenting and labeling sequence data,”* by John Lafferty, Andrew McCallum, and Fernando Pereira (International Conference on Machine Learning, 2001).

pages: 144 words: 43,356

Surviving AI: The Promise and Peril of Artificial Intelligence
by Calum Chace
Published 28 Jul 2015

For instance, when training a machine to recognise faces, or images of cats, the researchers will present the machine with thousands of images and the machine will devise statistical rules for categorising images based on their common features. Then the machine is presented with another set of images to see whether the rules hold up, or need revising. Machine learning has proven to be a powerful tool, with impressive performance in applications like computer vision and search. One of the most promising approaches to computer vision at the moment is “convolutional neural nets”, in which a large number of artificial neurons are each assigned to a tiny portion of an image. It is an interesting microcosm of the whole field of machine learning in that it was first invented in 1980, but did not become really useful until the 21st century when graphics processing unit (GPU) computer chips enabled researchers to assemble very large networks.
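The phrase "each assigned to a tiny portion of an image" can be made concrete with a hand-written convolution. In the sketch below every output value plays the role of one artificial neuron that sees only a 3x3 patch of the input. The toy image and the kernel values are assumptions picked by hand; in a real convolutional network the kernel weights are learned from training images, and many kernels and layers are stacked.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image; each output looks at one small patch.

    As in most neural-network libraries, the kernel is not flipped, so this is
    strictly a cross-correlation.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
            row.append(total)
        output.append(row)
    return output


# Toy 4x6 image: dark on the left (0), bright on the right (1).
image = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
]

# Hand-picked kernel that responds to a dark-to-bright change from left to right.
kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = convolve2d(image, kernel)
for row in feature_map:
    print(row)
```

The output is large only where a patch straddles the boundary between the dark and bright halves and is zero where the patch is uniform, so this one kernel acts as a vertical-edge detector.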

pages: 294 words: 81,292

Our Final Invention: Artificial Intelligence and the End of the Human Era
by James Barrat
Published 30 Sep 2013

Many know that DARPA (then called ARPA) funded the research that invented the Internet (initially called ARPANET), as well as the researchers who developed the now ubiquitous GUI, or Graphical User Interface, a version of which you probably see every time you use a computer or smart phone. But the agency was also a major backer of parallel processing hardware and software, distributed computing, computer vision, and natural language processing (NLP). These contributions to the foundations of computer science are as important to AI as the results-oriented funding that characterizes DARPA today. How is DARPA spending its money? A recent annual budget allocates $61.3 million to a category called Machine Learning, and $49.3 million to Cognitive Computing.

Consider the old joke about the drunk who loses his car keys and looks for them under a streetlight. A policeman joins the search and asks, “Exactly where did you lose your keys?” The man points down the street to a dark corner. “Over there,” he says. “But the light’s better here.” Search, voice recognition, computer vision, and affinity analysis (the kind of machine learning Amazon and Netflix use to suggest what you might like) are some of the fields of AI that have seen the most success. Though they were the products of decades of research, they are also among the easiest problems, discovered where the light’s better.

This axiom is known as Moravec’s Paradox, because AI and robotics pioneer Hans Moravec expressed it best in his robotics classic, Mind Children: “It is comparatively easy to make computers exhibit adult level performance on intelligence tests or playing checkers, and difficult or impossible to give them the skills of a one-year-old when it comes to perception and mobility.” Puzzles so difficult that we can’t help but make mistakes, like playing Jeopardy! and deriving Newton’s second law of thermodynamics, fall in seconds to well-programmed AI. At the same time, no computer vision system can tell the difference between a dog and a cat—something most two-year-old humans can do. To some degree these are apples-and-oranges problems, high-level cognition versus low-level sensor motor skill. But it should be a source of humility for AGI builders, since they aspire to master the whole spectrum of human intelligence.

pages: 259 words: 84,261

Scary Smart: The Future of Artificial Intelligence and How You Can Save Our World
by Mo Gawdat
Published 29 Sep 2021

The smartest are artificial intelligence machines. But if listening, understanding and speaking is still not impressive enough, look at how our computers can see. In the late 1960s, computer vision research started. It was designed to mimic the human visual system, as a stepping stone to endowing robots with intelligent behaviour based on what they could see. Studies in the 1970s formed the early foundations for many of the computer vision algorithms that exist today, including extracting edges from images, labelling lines, optical flow and motion estimation. The 1980s saw studies based on more rigorous mathematical analysis while, in the 1990s, research advanced 3D reconstructions.

This was also the decade when, for the first time, statistical learning techniques were used to recognize faces in images. All of the above, however, were based on traditional computer programming, and while they delivered impressive results, they failed to offer the accuracy and scale today’s computer vision can offer, due to the advancement of Deep Learning artificial intelligence techniques, which have completely surpassed and replaced all prior methods. This intelligence did not learn to see by following a programmer’s list of instructions, but rather through the very act of seeing itself. With AI helping computers see, they can now do it much better than we do, specifically when it comes to individual tasks.

pages: 326 words: 84,180

Dark Matters: On the Surveillance of Blackness
by Simone Browne
Published 1 Oct 2015

Not attending to these considerations and failing to consider social impacts diminishes their efficacy and can bring serious unintended consequences,” like the further marginalization, and in some cases the disenfranchisement, of people who because of industry-determined standard algorithms encounter difficulty in using this technology.6 When dark matter troubles algorithms in this way, it amounts to a refusal of the idea of neutrality when it comes to certain technologies. But if algorithms can be troubled, this might not necessarily be a bad thing. In other words, could there be some potential in going about unknown or unremarkable, and perhaps unbothered, where CCTV, camera-enabled devices, facial recognition, and other computer vision technologies are in use? The very thing that rendered Black Desi unseen in the “HP Computers Are Racist” video is what viewers of another YouTube video are instructed to employ in order to remain undetected by facial recognition technology. In her DIY (do-it-yourself) makeup tutorial on “how to hide from cameras,” artist Jillian Mayer demonstrates how to use black lipstick, clear tape, scissors, white cream, some glitter, and black eyeliner to distort one’s face in order to make it indiscernible to cameras and “look great.”7 Modeled in a format similar to popular makeup, hair, or other beauty tutorials on YouTube, Mayer tells her viewers that the most important thing “is to really break up your face.”

Mayer’s tutorial is based on artist Adam Harvey’s cv Dazzle project, which explores the role of camouflage in subverting face-recognition technology. Computer Vision (CV) Dazzle is a play on dazzle camouflage used during World War I, which saw warships painted with block patterns and geometric shapes in contrasting colors, so that rather than concealing a ship, dazzle camouflage was intended to make it difficult to visually assess its size and speed by way of optical illusion.

Harvey offers up “style tips for reclaiming privacy” and suggests that to decrease the possibilities of detection you should “apply makeup that contrasts with your skin tone in unusual tones and directions: light colors on dark skin, dark colors on light skin.” Makeup could be used not only to prevent recognition but to obscure skin texture analysis as well. These tactics, however, do not explicitly challenge the proliferation of CCTV and other computer vision technologies in public and private spaces, but rather leave it up to the individual to adapt. One of the tasks of Dark Matters has been to situate the dark, blackness, and the archive of slavery and its afterlife as a way to trouble and expand understandings of surveillance. Of course, some things are still left in the dark: the open secret that is the operation of black sites for rendition, torture, detention, and disappearance of people suspected as terroristic threats, or Edward Snowden’s revelations in the summer of 2013 of the National Security Agency’s warrantless wiretapping, a program representing what he called “a dangerous normalization of governing in the dark.”8 In the beginning of this book, I named dark sousveillance as a form of critique that centers black epistemologies of contending with surveillance, and I later looked to freedom acts such as escaping from enslavement by using falsified documents and aliases, or the Totau as celebratory resistance performed right under the surveillant gazes of white audiences, and Solange’s critique of TSA searches as “Discrim-FRO-nation.”

pages: 199 words: 47,154

Gnuplot Cookbook
by Lee Phillips
Published 15 Feb 2012

His main interests are functional programming and machine-learning algorithms. David Millán Escrivá was 8 years old when he wrote his first program on an 8086 PC in BASIC. He has more than 10 years of experience in IT. He has worked on computer vision, computer graphics, and pattern recognition. Currently he is working on different projects about computer vision and AR. I would like to thank Izanskun and my daughter Eider. www.PacktPub.com Support files, eBooks, discount offers, and more You might want to visit www.PacktPub.com for support files and downloads related to your book.

pages: 328 words: 90,677

Ludicrous: The Unvarnished Story of Tesla Motors
by Edward Niedermeyer
Published 14 Sep 2019

Mobileye’s deeply held view is that the long-term potential for vehicle automation to reduce traffic injuries and fatalities significantly is too important to risk consumer and regulatory confusion or to create an environment of mistrust that puts in jeopardy technological advances that can save lives. Indeed, Mobileye had been working on computer vision for autonomous vehicles for more than a decade, and it supplied the same EyeQ3 chip to so many automakers that its contract with Tesla made up only 1 percent of its revenue. Though Tesla’s Autopilot initially showed how capable Mobileye’s technology could be when pushed to its limits, the Brown crash suggested that Tesla was willing to sacrifice safety for the perception of a lead in the “race to autonomy.” With its head start on computer vision and supply contracts with half the industry, Mobileye had no need for the kind of risks Tesla was taking.

According to Musk, “Mobileye’s ability to evolve its technology is unfortunately negatively affected by having to support hundreds of models from legacy auto companies, resulting in a very high engineering drag coefficient.” The war of words continued into September, with Shashua arguing that Mobileye’s chip wasn’t designed to handle the kind of cross traffic that had caused the Brown crash and Musk contending that Mobileye was threatened by Tesla’s research into computer vision. The conflict came to a head when Musk claimed that Mobileye had “attempted to force Tesla to discontinue this development, pay them more and use their products in future hardware . . . When Tesla refused to cancel its own vision development activities and plans for deployment, Mobileye discontinued hardware support for future platforms and released public statements implying that this discontinuance was motivated by safety concerns.”

pages: 721 words: 197,134

Data Mining: Concepts, Models, Methods, and Algorithms
by Mehmed Kantardzić
Published 2 Jan 2003

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) http://computer.org/tpami/ IEEE TPAMI is a scholarly archival journal published monthly. Its editorial board strives to present the most important research results in areas within TPAMI’s scope. This includes all traditional areas of computer vision and image understanding, all traditional areas of pattern analysis and recognition, and selected areas of machine intelligence. Areas such as machine learning, search techniques, document and handwriting analysis, medical-image analysis, video and image sequence analysis, content-based retrieval of image and video, face and gesture recognition, and relevant specialized hardware and/or software architectures are also covered.

Kumar, Similarity Measures for Categorical Data: A Comparative Evaluation, SIAM Conference, 2008, pp. 243–254. Brachman, R. J., T. Khabaza, W. Kloesgen, G. S. Shapiro, E. Simoudis, Mining Business Databases, CACM, Vol. 39, No. 11, 1996, pp. 42–48. Chen, C. H., L. F. Pau, P. S. P. Wang, Handbook of Pattern Recognition & Computer Vision, World Scientific Publ. Co., Singapore, 1993. Clark, W. A. V., M. C. Deurloo, Categorical Modeling/Automatic Interaction Detection, Encyclopedia of Social Measurement, 2005, pp. 251–258. Dwinnell, W., Data Cleansing: An Automated Approach, PC AI, March/April 2001, pp. 21–23. Fayyad, U. M., G.

Thearling, Building Data Mining Applications for CRM, McGraw-Hill, New York, 2000. Brachman, R. J., T. Khabaza, W. Kloesgen, G. S. Shapiro, E. Simoudis, Mining Business Databases, CACM, Vol. 39, No. 11, 1996, pp. 42–48. Chen, C. H., L. F. Pau, P. S. P. Wang, Handbook of Pattern Recognition and Computer Vision, World Scientific Publ. Co., Singapore, 1993. Clark, W. A. V., M. C. Deurloo, Categorical Modeling/Automatic Interaction Detection, Encyclopedia of Social Measurement, 2005, pp. 251–258. Dwinnell, W., Data Cleansing: An Automated Approach, PC AI, March/April 2001, pp. 21–23. Eddy, W. F., Large Data Sets in Statistical Computing, in International Encyclopedia of the Social & Behavioral Sciences, N.

pages: 363 words: 109,077

The Raging 2020s: Companies, Countries, People - and the Fight for Our Future
by Alec Ross
Published 13 Sep 2021

One of the main reasons is that digital tools like AI are more difficult to categorize than traditional defense technologies. Fighter jets and warships are used for one thing: the projection and exercise of military power. But artificial intelligence is a general-purpose technology with both national security applications and completely benign commercial uses. A computer vision algorithm can be trained to spot enemy combatants on a battlefield, but it can also be used to tag friends in social media posts and power self-driving cars. AI takes on the values and intentions of its human masters. The same AI-enabled facial recognition technology that can identify known terrorism suspects can just as easily profile and track members of an ethnic minority.

Building effective artificial intelligence requires lots and lots of training data, and you would be hard-pressed to find a government with fewer qualms about general public data collection than China’s. Companies share data with the government, which then shares data back with other companies, which then refine their algorithms and continue collecting more data. The CEO of the Chinese computer vision company SenseTime, which helped construct the Xinjiang surveillance apparatus, referred to the government as the company’s “largest data source.” More data beget better algorithms, which beget better data. The surveillance state feeds itself and becomes more effective as it goes. As fifth-generation broadband networks enable China to embed more sensors on its streets, in its vehicles, and around its offices, homes, and public spaces, the panopticon will become more total.

As of 2019, China has deployed approximately one: Thomas Ricker, “The US, Like China, Has about One Surveillance Camera for Every Four People, Says Report,” The Verge, December 9, 2019, https://www.theverge.com/2019/12/9/21002515/surveillance-cameras-globally-us-china-amount-citizens; Charlie Campbell, “‘The Entire System Is Designed to Suppress Us.’ What the Chinese Surveillance State Means for the Rest of the World,” Time, November 21, 2019, https://time.com/5735411/china-surveillance-privacy-issues/. The CEO of the Chinese computer vision company SenseTime: Ross Andersen, “The Panopticon Is Already Here,” Atlantic, September 2020, https://www.theatlantic.com/magazine/archive/2020/09/china-ai-surveillance/614197/. But less than six months later: Amy Hawkins, “Beijing’s Big Brother Tech Needs More African Faces,” Foreign Policy, July 24, 2018, https://foreignpolicy.com/2018/07/24/beijings-big-brother-tech-needs-african-faces/; Kudzai Chimhangwa, “How Zimbabwe’s Biometric ID Scheme—and China’s AI Aspirations—Threw a Wrench in Elections,” GlobalVoices, January 30, 2020, https://globalvoices.org/2020/01/30/how-zimbabwes-biometric-id-scheme-and-chinas-ai-aspirations-threw-a-wrench-into-the-2018-election/.

Data Mining: Concepts and Techniques
by Jiawei Han, Micheline Kamber and Jian Pei
Published 21 Jun 2011

Machine learning and pattern recognition research is published in the proceedings of several major machine learning, artificial intelligence, and pattern recognition conferences, including the International Conference on Machine Learning (ICML), the ACM Conference on Computational Learning Theory (COLT), the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), the International Conference on Pattern Recognition (ICPR), the International Joint Conference on Artificial Intelligence (IJCAI), and the American Association of Artificial Intelligence Conference (AAAI). Other sources of publication include major machine learning, artificial intelligence, pattern recognition, and knowledge system journals, some of which have been mentioned before.

Wavelet transforms give good results on sparse or skewed data and on data with ordered attributes. Lossy compression by wavelets is reportedly better than JPEG compression, the current commercial standard. Wavelet transforms have many real-world applications, including the compression of fingerprint images, computer vision, analysis of time-series data, and data cleaning. 3.4.3. Principal Components Analysis In this subsection we provide an intuitive introduction to principal components analysis as a method of dimensionality reduction. A detailed theoretical explanation is beyond the scope of this book. For additional references, please see the bibliographic notes (Section 3.8) at the end of this chapter.
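The lossy wavelet compression mentioned in the excerpt above can be illustrated with a minimal sketch. It assumes the Haar wavelet, the simplest one, although the excerpt does not name a specific wavelet family. The function names (haar_forward, haar_inverse, compress) and the sample values are hypothetical, chosen only for this illustration, and the input length is assumed to be a power of two.

```python
import math

def haar_forward(signal):
    """Orthonormal Haar decomposition; len(signal) is assumed to be a power of two."""
    coeffs = list(signal)
    n = len(coeffs)
    while n > 1:
        half = n // 2
        # Pairwise scaled sums (smooth part) and differences (detail part).
        averages = [(coeffs[2 * i] + coeffs[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        details = [(coeffs[2 * i] - coeffs[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        coeffs[:n] = averages + details
        n = half  # recurse on the smooth part only
    return coeffs

def haar_inverse(coeffs):
    """Undo haar_forward by rebuilding each level from its smooth and detail parts."""
    out = list(coeffs)
    n = 1
    while n < len(out):
        merged = []
        for a, d in zip(out[:n], out[n:2 * n]):
            merged.append((a + d) / math.sqrt(2))
            merged.append((a - d) / math.sqrt(2))
        out[:2 * n] = merged
        n *= 2
    return out

def compress(signal, keep):
    """Lossy step: keep the `keep` largest-magnitude coefficients, zero the rest."""
    coeffs = haar_forward(signal)
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    kept = set(order[:keep])
    return [c if i in kept else 0.0 for i, c in enumerate(coeffs)]

# Illustrative values only (hypothetical data).
data = [2, 2, 0, 2, 3, 5, 4, 4]
sparse = compress(data, keep=4)   # only four coefficients need to be stored
approx = haar_inverse(sparse)     # approximate reconstruction of `data`
```

Keeping every coefficient reproduces the input up to floating-point error; lowering `keep` trades reconstruction accuracy for a sparser representation, which is the sense in which the compression is lossy.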

One example is genetic linkage analysis (e.g., the mapping of genes onto a chromosome). By casting the gene linkage problem in terms of inference on Bayesian networks, and using state-of-the-art algorithms, the scalability of such analysis has advanced considerably. Other applications that have benefited from the use of belief networks include computer vision (e.g., image restoration and stereo vision), document and text analysis, decision-support systems, and sensitivity analysis. The ease with which many applications can be reduced to Bayesian network inference is advantageous in that it curbs the need to invent specialized algorithms for each such application.

pages: 501 words: 114,888

The Future Is Faster Than You Think: How Converging Technologies Are Transforming Business, Industries, and Our Lives
by Peter H. Diamandis and Steven Kotler
Published 28 Jan 2020

Move: See: https://www.move.com/. Redfin: See: https://www.redfin.com/. invested millions: For an example of investment in real estate AI, see this VentureBeat article about one of Zillow’s latest computer vision tools: Kyle Wiggers, “Zillow Now Uses Computer Vision To Improve Property Value Estimates,” VentureBeat, June 26, 2019. See: https://venturebeat.com/2019/06/26/zillow-now-uses-computer-vision-to-improve-property-value-estimates/. Reinventing the City five hundred coastal cities now threatened by global warming: This World Economic Forum report predicts that 570 coastal cities around the world are vulnerable to a sea-level rise of 0.5 meters by 2050: http://www3.weforum.org/docs/WEF_Global_Risks_Report_2019.pdf.

pages: 447 words: 111,991

Exponential: How Accelerating Technology Is Leaving Us Behind and What to Do About It
by Azeem Azhar
Published 6 Sep 2021

And the Israeli-developed drones Harpy and Harop are often cited as autonomous weapons that are already in use. Harpy uses electromagnetic sensors to search for pre-specified targets; its follow-on system, Harop, uses visual and infrared sensors to hunt those targets. The development of facial recognition and computer vision will further add to the power of such technology on the battlefield. A commercial drone not even intended for military use, the Skydio R1, uses a cutting-edge computer vision system to recognise and track its owner autonomously. Thirteen on-board cameras do real-time mapping, path planning and obstacle avoidance.59 It sells for less than $2,500. Such drones, with high degrees of autonomy, are more and more commonplace.

Bleek, ‘Drones of Mass Destruction: Drone Swarms and the Future of Nuclear, Chemical, and Biological Weapons’, War on the Rocks, 14 February 2019 <https://warontherocks.com/2019/02/drones-of-mass-destruction-drone-swarms-and-the-future-of-nuclear-chemical-and-biological-weapons/> [accessed 26 April 2021]. 58 Missy Cummings, The Human Role in Autonomous Weapon Design and Deployment, 2014 <https://www.law.upenn.edu/live/files/3884-cummings-the-human-role-in-autonomous-weapons>. 59 Nick Statt, ‘Skydio’s AI-Powered Autonomous R1 Drone Follows You around in 4K’, The Verge, 13 February 2018 <https://www.theverge.com/2018/2/13/17006010/skydio-r1-autonomous-drone-4k-video-recording-ai-computer-vision-mapping> [accessed 2 January 2021]. 60 ‘Autonomous Weapons and the New Laws of War’, The Economist, 19 January 2019 <https://www.economist.com/briefing/2019/01/19/autonomous-weapons-and-the-new-laws-of-war> [accessed 26 March 2021]. 61 Burgess Laird, ‘The Risks of Autonomous Weapons Systems for Crisis Stability and Conflict Escalation in Future U.S.

pages: 391 words: 71,600

Hit Refresh: The Quest to Rediscover Microsoft's Soul and Imagine a Better Future for Everyone
by Satya Nadella, Greg Shaw and Jill Tracie Nichols
Published 25 Sep 2017

Chris Capossela, our chief marketing officer, who grew up in a family-run Italian restaurant in the North End of Boston, and joined Microsoft right out of Harvard College the year before I joined. Kevin Turner, a former Wal-Mart executive, who was chief operating officer and led worldwide sales. Harry Shum, who leads Microsoft’s celebrated Artificial Intelligence and Research Group operation, received his PhD in robotics from Carnegie Mellon and is one of the world’s authorities on computer vision and graphics. I had been a member of the SLT myself when Steve Ballmer was CEO, and, while I admired every member of our team, I felt that we needed to deepen our understanding of one another—to delve into what really makes each of us tick—and to connect our personal philosophies to our jobs as leaders of the company.

These are the building blocks of AI, and for many years Microsoft has invested in advancing each of these tiers—statistical machine learning tools to make sense of data and recognize patterns; computers that can see, hear, and move, and even begin to learn and understand human language. Under the leadership of our chief speech scientist, Xuedong Huang, and his team, Microsoft set the accuracy record with a computer system that can transcribe the contents of a phone call more accurately than a human professional trained in transcription. On the computer vision and learning front, in late 2015 our AI group swept first prize across five challenges even though we only trained our system for one of those challenges. In the Common Objects in Context challenge, an AI system attempts to solve several visual recognition tasks. We trained our system to accomplish just the first one, simply to look at a photograph and label what it sees.

The Ethical Algorithm: The Science of Socially Aware Algorithm Design
by Michael Kearns and Aaron Roth
Published 3 Oct 2019

To see a particularly egregious example, let’s go back to 2015, when the market for machine learning talent was heating up. The techniques of deep learning had recently reemerged from relative obscurity (its previous incarnation was called backpropagation in neural networks, which we discussed in the introduction), delivering impressive results in computer vision and image recognition. But there weren’t yet very many experts who were good at training these algorithms—which was still more of a black art, or perhaps an artisanal craft, than a science. The result was that deep learning experts were commanding salaries and signing bonuses once reserved for Wall Street.

See also traffic and navigation problems competition among scientific journals, 144–45 competitive equilibrium, 105–6, 108–9, 111, 115 and equilibrium states, 98–99 image recognition competition, 145–49 medical residency matchmaking, 126–30. See also games and game theory complex algorithms, 174–75 complex datasets, 9, 151, 155 compromising information, 40–45 computation, 11–12 computational complexity, 101 computational literacy, 172. See also interpretability of outputs computer science, 5 computer vision, 145–49 confirmation bias, 92 consumer data, 6–7, 13, 33 consumer Internet, 64 Consumer Reports, 116 convex minimization problems, 110 cooperation cooperative solutions in game theory, 113–15 and equilibrium in game theory, 99–100 and navigation problems, 112–13 through correlation, 113–15 Cornell University, 159 correlations cooperation through, 113–15 and dangers of adaptive data analysis, 152–55 and forbidden inputs, 66–67 and online shopping algorithms, 120 and torturing data, 159 in traffic equilibrium problems, 113–15 and word embedding, 68 costs of ethical behaviors, 19 cost structures, 103 counterfactuals, 156, 174, 191 credit.

pages: 410 words: 119,823

Radical Technologies: The Design of Everyday Life
by Adam Greenfield
Published 29 May 2017

Technological Unemployment and the Meaning of Life,” Science and Engineering Ethics, forthcoming, philpapers.org/archive/DANWLB.pdf 53.Hannah Arendt, The Human Condition, Chicago: University of Chicago Press, 1958. 54.Amos Zeeberg, “Alienation Is Killing Americans and Japanese,” Nautilus, June 1, 2016. 8Machine learning 1.Rob Kitchin, The Data Revolution: Big Data, Open Data, Data Infrastructures and their Consequences, London: Sage Publications, 2014. 2.Daniel Rosenberg, “Data Before the Fact,” in Lisa Gitelman, ed., “Raw Data” Is an Oxymoron, Cambridge, MA: MIT Press, 2013. 3.These questions are explored in greater depth in the excellent Critical Algorithm Studies reading list maintained by Tarleton Gillespie and Nick Seaver of Microsoft Research’s Social Media Collective: socialmediacollective.org/reading-lists/critical-algorithm-studies. 4.Nick Bostrom, Superintelligence: Paths, Dangers, Strategies, Oxford, UK: Oxford University Press, 2014. 5.For those inclined to dig deeper into such subjects, Andrey Kurenkov’s history of neural networks is fantastic: andreykurenkov.com/writing/a-brief-history-of-neural-nets-and-deep-learning. 6.Alistair Barr, “Google Mistakenly Tags Black People as ‘Gorillas,’ Showing Limits of Algorithms,” Wall Street Journal, July 1, 2015. 7.Aditya Khosla et al., “Novel dataset for Fine-Grained Image Categorization,” First Workshop on Fine-Grained Visual Categorization, IEEE Conference on Computer Vision and Pattern Recognition, 2011, vision.stanford.edu/aditya86/ImageNetDogs; ImageNet, “Large Scale Visual Recognition Challenge 2012,” image-net.org/challenges/LSVRC/2012. 8.David M. Stavens, “Learning to Drive: Perception for Autonomous Cars,” Ph.D dissertation, Stanford University Department of Computer Science, May 2011, cs.stanford.edu/people/dstavens/thesis/David_Stavens_PhD_Dissertation.pdf. 9.Tesla Motors, Inc.

“The Impact of Legalized Abortion on Crime,” Quarterly Journal of Economics, Volume 116, Issue 2, May 2001. 45.Daniel Rosenberg, “Data Before the Fact,” in Lisa Gitelman, ed., “Raw Data” Is An Oxymoron, MIT Press; see also Mary Poovey, A History of The Modern Fact: Problems of Knowledge in the Sciences of Wealth and Society, Chicago: University of Chicago Press, 1998. 46.Brian Eno and Peter Schmidt, Oblique Strategies: Over One Hundred Worthwhile Dilemmas, London: Opal Ltd., January 1975. 47.Karl Ricanek Jr. and Chris Boehnen, “Facial Analytics: From Big Data to Law Enforcement,” Computer, Volume 45, Number 9, September 2012. 48.Charles Arthur, “Quividi Defends Tesco Face Scanners after Claims over Customers’ Privacy,” Guardian, November 4, 2013. 49.Amrutha Sethuram et al., “Facial Landmarking: Comparing Automatic Landmarking Methods with Applications in Soft Biometrics,” Computer Vision—ECCV 2012, October 7, 2012. 50.Judith Butler, Gender Trouble: Feminism and the Subversion of Identity, New York and London: Routledge, 1990. 51.Vladimir Khryashchev et al., “Gender Recognition via Face Area Analysis,” Proceedings of the World Congress on Engineering and Computer Science 2012, Volume 1, October 24, 2012. 52.Shaun Walker, “Face Recognition App Taking Russia by Storm May Bring End to Public Anonymity,” Guardian, May 17, 2016. 53.Kevin Rothrock, “The Russian Art of Meta-Stalking,” Global Voices Advox, April 7, 2016. 54.Mary-Ann Russon, “Russian Trolls Outing Porn Stars and Prostitutes with Neural Network Facial Recognition App,” International Business Times, April 27, 2016. 55.Weiyao Lin et al., “Group Event Detection for Video Surveillance,” 2009 IEEE International Symposium on Circuits and Systems, May 24, 2009, ee.washington.edu/research/nsl/papers/ISCAS-09.pdf. 56.Paul Torrens, personal conversation, 2007; see also geosimulation.org/riots.html.

Here’s What He Said,” Recode, April 13, 2016. f 3.Brad Stone and Jack Clark, “Google Puts Boston Dynamics Up for Sale in Robotics Retreat,” Bloomberg Technology, March 17, 2016. 4.John Markoff, “Latest to Quit Google’s Self-Driving Car Unit: Top Roboticist,” New York Times, August 5, 2016. 5.Mark Harris, “Secretive Alphabet Division Funded by Google Aims to Fix Public Transit in US,” Guardian, June 27, 2016. 6.Siimon Reynolds, “Why Google Glass Failed: A Marketing Lesson,” Forbes, February 5, 2015. 7.Rajat Agrawal, “Why India Rejected Facebook’s ‘Free’ Version of the Internet,” Mashable, February 9, 2016. 8.Mark Zuckerberg, “The technology behind Aquila,” Facebook, July 21, 2016, facebook.com/notes/mark-zuckerberg/the-technology-behind-aquila/10153916136506634/. 9.Mari Saito, “Exclusive: Amazon Expanding Deliveries by Its ‘On-Demand’ Drivers,” Reuters, February 8, 2016. 10.Alan Boyle, “First Amazon Prime Airplane Debuts in Seattle After Secret Night Flight,” GeekWire, August 4, 2016. 11.Farhad Manjoo, “Think Amazon’s Drone Delivery Idea Is a Gimmick? Think Again,” New York Times, August 10, 2016. 12.CBS News, “Amazon Unveils Futuristic Plan: Delivery by Drone,” 60 Minutes, December 1, 2013. 13.Ben Popper, “Amazon’s drone program acquires a team of Europe’s top computer vision experts,” The Verge, May 10, 2016. 14.Danielle Kucera, “Amazon Acquires Kiva Systems in Second-Biggest Takeover,” Bloomberg, March 19, 2012. 15.Mike Rogoway, “Amazon Reports Price of Elemental Acquisition: $296 Million,” Oregonian, October 23, 2015. 16.Caleb Pershan, “Startup Doze Monetizes Nap Time for Tired Techies,” SFist, September 28, 2015; Kate Taylor, “Food-Tech Startup Soylent Snags $20 Million in Funding,” Entrepreneur, January 15, 2015; Michelle Starr, “Brain-to-Brain Verbal Communication in Humans Achieved for the First Time,” CNet, September 3, 2014; Frank Tobe, “When Will Sex Robots Hit the Marketplace?

pages: 494 words: 116,739

Geek Heresy: Rescuing Social Change From the Cult of Technology
by Kentaro Toyama
Published 25 May 2015

So in college I majored in physics, but, as often happens, one thing led to another, and I changed fields. I did a PhD in computer science, and after that, I took a job at Microsoft Research – one of the world’s largest computer science laboratories. What didn’t change was my search for technological solutions. At first I worked in an area called computer vision, which tries to give machines a skill that one-year-olds take for granted but that science still toils to explain: converting an array of color into meaning – a crib, a mother’s smile, a looming bottle. Computers still can’t recognize these objects reliably, but the field has made progress. For example, these days we don’t think twice about the little squares that track a person’s face on our mobile-phone cameras.

How Internet censorship actually works in China. The Atlantic, Oct. 2, 2013, www.theatlantic.com/china/archive/2013/10/how-internet-censorship-actually-works-in-china/280188/. Toyama, Kentaro, and Andrew Blake. (2001). Probabilistic tracking in a metric space. In Proceedings of the Eighth International Conference on Computer Vision 2:50–57, http://dx.doi.org/10.1109/ICCV.2001.937599. Tripathi, Salil. (2006). Microcredit won’t make poverty history. The Guardian, Oct. 17, 2006, www.theguardian.com/business/2006/oct/17/businesscomment.internationalaidanddevelopment. Tsotsis, Alexia. (2011). To celebrate the #Jan25 Revolution, Egyptian names his firstborn “Facebook.”

Information Technologies and International Development 5(1):81–95, http://itidjournal.org/itid/article/view/327/150. Venkatesh, Sudhir. (2008). Gang Leader for a Day: A Rogue Sociologist Crosses the Line. Allen Lane. Viola, Paul, and Michael Jones. (2001). Rapid object detection using a boosted cascade of simple features. In Proceedings of the Conference on Computer Vision and Pattern Recognition, http://dx.doi.org/10.1109/CVPR.2001.990517. Vornovytskyy, Marina, Alfred Gottschalck, and Adam Smith. (2011). Household debt in the U.S.: 2000 to 2011. US Census Bureau, www.census.gov/people/wealth/files/Debt%20Highlights%202011.pdf. Wahba, Mahmoud A., and Lawrence G.

pages: 458 words: 135,206

CTOs at Work
by Scott Donaldson, Stanley Siegel and Gary Donaldson
Published 13 Jan 2012

Donaldson: So you moved away from the financial services world and established the company that you currently run. Can you tell us a little about that transition and what your company does? Bloore: Sure. So what our company does is make images searchable by looking at the patterns within the pixels themselves. We are entirely focused on using computer vision and pattern recognition algorithms to make large image sets searchable. It's very different from using keywords to search images. G. Donaldson: That's a big shift from developing risk management software to pixel imagery and software. Where did that shift come from? What was your interest in going in that direction?

It is worthless to capture all the visual information around you if you can't make any sense of it, so that's something that I see will be an area for us to focus on, making sense of the visual world around us. S. Donaldson: With respect to the technical people you need to create these products and explore new areas, where do you look to recruit people and what do you look for in employees when you hire them? Bloore: When it comes to algorithms, computer vision algorithm work, of course we're looking at people who have advanced degrees in that field. Despite my own lack of a degree in that field, people who are able to write robust algorithms are usually coming out of an academic background. And we are very often recruiting through our network of other technology people, who are telling us about very talented people that they know.

Donaldson: In terms of keeping current and where the future is going, are you aligned at all with any universities or research labs? Bloore: Yes, we certainly are. We keep in touch with academics at both the University of Toronto and the University of Waterloo. We've done joint projects with the University of Toronto in the past, and we're likely to in the future as well, especially in the computer vision department. G. Donaldson: Do you see your role evolving over the next several years? Bloore: Well, as the team grows, at a certain point there will be less direct contact day-to-day with all the development staff. I'll see that as an unfortunate day, but it's probably a necessary step. G. Donaldson: You have a lead engineer for your technology staff?

pages: 486 words: 132,784

Inventors at Work: The Minds and Motivation Behind Modern Inventions
by Brett Stern
Published 14 Oct 2012

But many people, who have been here fifteen years or more, should be taking on somewhat of a mentorship role. In a research lab, even in periods when you are laying off, you’d better be hiring people who have the latest new skills—especially skills that I don’t have. A lot of new hires are much better at computer vision and video processing, for example, than I am. But I can help them formulate their ideas into solutions for problems, rather than just staying in their heads or on their desks. A lot of the mentoring I do is also showing them how to communicate their work and to get their solutions out to the community.

You want to give the real technical information that the person needs to know, not all of your grief in getting there. Stern: I realize that your work is proprietary, but can you give us a general sense of where your technology and research interests are going? Loce: Most of it is in video processing and computer vision, with key application fields of transportation, health care, and education. Transportation is probably the easiest one for me to explain without stepping into specifics. Half of the car fuel used in San Francisco is burned looking for a parking space. Thirty to fifty percent of the cars driving around the streets in Brooklyn are looking for parking spaces.

Some of the transportation problems that we are going after are to save fuel, reduce emissions, reduce congestion, improve highway safety, and reduce the cost of law enforcement. Solutions have worldwide application. Every country in the world is looking to make their transportation systems more intelligent. Computer vision and video processing are going to be key tools in making highways and transportation systems more intelligent. Look at the cameras that are out there already: there are a million CCTV cameras in London—forty thousand in the London subway system alone. Some of our buses here in Rochester have nine cameras on them.

pages: 291 words: 81,703

Average Is Over: Powering America Beyond the Age of the Great Stagnation
by Tyler Cowen
Published 11 Sep 2013

Self-scrutiny doesn’t have to be restricted to matters of the heart. Which products do we really like or really notice? How are we responding to advertisements when we see them? Consult your pocket device. There is currently a DARPA (Defense Advanced Research Projects Agency, part of the Department of Defense) project called “Cortically Coupled Computer Vision.” The initial applications help analysts scan satellite photos or help a soldier-driver navigate dangerous terrain in a Jeep. Basically, the individual wears some headgear and the device measures neural signals whenever the individual experiences a particular kind of subconscious alert (Danger!

See Freestyle chess conscientiousness, 29–40, 201–2 conservatism, 74, 98, 235, 254–56 consulting, 41, 42–43 consumer empowerment, 122–23 consumer quality quotients, 125 contempt aversion, 98 convergence, 150–51, 155–58 cooperative research, 207 Cortically Coupled Computer Vision, 14 cosmology, 211, 212–13, 218–19, 226 cost of living, 236, 244–46, 248 counterintuitiveness, 205 coupons, 24 Cox cable company, 111, 119 Cramton, Steven, 78 credentials, 40 credit markets, 55 credit ratings, 124–25 crime, 52, 253 Cuba, 171 cultural economy, 67 Danailov, Silvio, 149–50 data collection, 95–96, 219–20, 227 data quality, 224 dating market, 73, 125.

pages: 305 words: 89,103

Scarcity: The True Cost of Not Having Enough
by Sendhil Mullainathan
Published 3 Sep 2014

In the pilot study, they found that 106 employees at twelve banks showed increased performance on several metrics. Perhaps this sounds far-fetched. But how different is this from how we manage our bodies? To prevent repetitive strain injury, frequent computer users take mandated breaks. To help with computer vision syndrome, people are advised to look away from the screen every twenty minutes or so for about twenty seconds to rest the eyes. Why is it counterintuitive that our cognitive system should be so different from our physical one? The deeper lesson is the need to focus on managing and cultivating bandwidth, despite pressures to the contrary brought on by scarcity.

arousal, and performance artificial scarcity Asia Atkins diet attention bottom-up processing capture of performance and top-down processing attentional blink Australia automatic bill pay automatic impulse bandwidth building cognitive capacity and comes at a price economizing on executive control and tax terminology timeline Banerjee, Abhijit Bangladesh Bank of America bankruptcy banks bargaining basketball beer bees behavioral economics Benihana restaurants Berra, Yogi Bertrand, Marianne bills automatic payment late payment of Bjorkegren, Dan Bohn, Roger Bolivia borrowing Family Feud and payday loans traps tunneling and See also borrowing; debt Boston bottom-up processing Bowen, Bruce brain development lateralization perception See also mind bridges Bryan, Chris buffer stock cabinet castaways cancer carbohydrates carbon dioxide Carlin, George cars accidents cell phone use and eating in impulse purchases insurance registration repairs repossession shopping for traffic cash transfer programs castaways cell phones Center for Responsible Lending Chapanis, Alphonse checker-shadow illusion chemistry Chennai, India Chevys restaurant child care China choices burden of one-off choking Christmas Churchill, Winston cigarettes taxes clothing packing professional purchase mistakes cockpit errors cocktails cognitive capacity cognitive science Cohen, Amanda college deadlines exams financial aid programs loans tuition communal tables commuters computers shopping for software computer vision syndrome conditional cash transfers consistency Consumer Reports contextual cues control impulses cortisol Covey, Stephen creativity credit cards crop insurance crop yields culture customer service dating, online daycare deadlines benefits of focus dividend and debt in India leveraged buyout payday loans rolled-over traps tunneling and See also borrowing; loans decisions, linking and the timing of declarative memory Dempsey, Christy diabetes dichotic listening task Dickinson, Charlie dieting 
diminishing marginal utility discretion, lack of disease divorce Dominican Republic DOTS (directly observed therapy) DVD players economics behavioral expertise and in India scarcity and 2008 recession edema education college financial literacy noise and Eisenhower, Dwight Eliot, T.

pages: 339 words: 88,732

The Second Machine Age: Work, Progress, and Prosperity in a Time of Brilliant Technologies
by Erik Brynjolfsson and Andrew McAfee
Published 20 Jan 2014

Our digital machines have escaped their narrow confines and started to demonstrate broad abilities in pattern recognition, complex communication, and other domains that used to be exclusively human. We’ve also recently seen great progress in natural language processing, machine learning (the ability of a computer to automatically refine its methods and improve its results as it gets more data), computer vision, simultaneous localization and mapping, and many of the other fundamental challenges of the discipline. We’re going to see artificial intelligence do more and more, and as this happens costs will go down, outcomes will improve, and our lives will get better. Soon countless pieces of AI will be working on our behalf, often in the background.

As futurist Kevin Kelly put it, “You’ll be paid in the future based on how well you work with robots.”7 Sensing Our Advantage: So computers are extraordinarily good at pattern recognition within their frames, and terrible outside them. This is good news for human workers because thanks to our multiple senses, our frames are inherently broader than those of digital technologies. Computer vision, hearing, and even touch are getting exponentially better all the time, but there are still tasks where our eyes, ears, and skin, to say nothing of our noses and tongues, surpass their digital equivalents. At present and for some time to come, the sensory package and its tight connection to the pattern-recognition engine of the brain gives us a broader frame.

pages: 340 words: 90,674

The Perfect Police State: An Undercover Odyssey Into China's Terrifying Surveillance Dystopia of the Future
by Geoffrey Cain
Published 28 Jun 2021

Dave Gershgorn, “The Inside Story of How AI Got Good Enough to Dominate Silicon Valley,” Business Insider, June 28, 2018, https://qz.com/1307091/the-inside-story-of-how-ai-got-good-enough-to-dominate-silicon-valley/. 10. I am grateful to a former Google AI developer for illustrating the techniques and commercial applications of GPU technologies in an interview. 11. Allison Linn, “Microsoft Researchers Win ImageNet Computer Vision Challenge,” AI Blog, Microsoft, December 10, 2015, https://blogs.microsoft.com/ai/microsoft-researchers-win-imagenet-computer-vision-challenge/. 12. Crunchbase, “Series A—MEGVII,” announcement of Series A funding round, July 18, 2013, https://www.crunchbase.com/funding_round/megvii-technology-series-a--927a6b8b. 13. Shu-Ching Jean Chen, “SenseTime: The Faces behind China’s Artificial Intelligence Unicorn,” Forbes Asia, March 7, 2018, https://www.forbes.com/sites/shuchingjeanchen/2018/03/07/the-faces-behind-chinas-omniscient-video-surveillance-technology/?

pages: 332 words: 93,672

Life After Google: The Fall of Big Data and the Rise of the Blockchain Economy
by George Gilder
Published 16 Jul 2018

Perhaps Buterin, who launched Bitcoin Magazine while working as research assistant to the cryptographer Ian Goldberg, is the truest legatee of Shannon’s vision. Like Shannon he can move seamlessly between the light and dark sides of information, between communication and cryptography. Shannon’s information theory, like Turing’s computational vision, began with an understanding of codes. His first major paper, “A Mathematical Theory of Cryptography” (1945) proved that a perfect randomized one-time pad constitutes an unbreakable code, a singularity. The theory of information deals with a continuum between white noise (purely random) and perfect order (predictable and information-free).

He ended up with a cool hat, better Mandarin, and a sharper sales pitch but no manufacturer or market for the product. “The technology was not mature,” Stephen decided. Despite the disappointment, he still did not want to work on “something that wasn’t mine.” In early 2015, Gary Bradski—the robotics pioneer who developed computer vision at Intel, founded the Willow Garage robotics incubator, which convinced Wired’s Kevin Kelly that “robots have wants,” and started Industrial Perception, which made “stevedore robots” that could, as Stephen Balaban described it, “pick up and chuck a box so elegantly” that Google bought them—that Gary Bradski—invited Balaban to join his deep-learning team at Magic Leap.

pages: 336 words: 91,806

Code Dependent: Living in the Shadow of AI
by Madhumita Murgia
Published 20 Mar 2024

In particular, he was interested in teaching computers to identify faces even better than humans do. His goal seemed simple: first, unpick how humans see faces, and then teach computers how to do it more efficiently than we do. When he started out back in the eighties and nineties, Karl was developing AI technology to help the US Navy’s submarine fleet navigate autonomously. At the time, computer vision was a slow-moving field, in which machines were taught to merely recognize objects, rather than people’s identities. The technology was nascent – and pretty terrible. The algorithms he designed were trying to get the machine to say that’s a bottle, these are glasses, this is a table, these are humans.

When Karl first started working on the problem of facial recognition, it wasn’t supposed to be used live on protesters or pedestrians or ordinary people. It was supposed to be a photo analysis tool. From its inception in the nineties, researchers knew there were biases and inaccuracies in how the algorithms worked. But they hadn’t quite figured out why. The biometrics community viewed the problems as academic, an interesting computer vision challenge affecting a prototype still in its infancy. They broadly agreed that the technology wasn’t ready for primetime use and they had no plans to profit from it. As the technology steadily improved, Karl began to develop experimental AI analytics models to spot physical signs of illnesses like cardiovascular disease, Alzheimer’s or Parkinson’s from a person’s face.

pages: 407 words: 103,501

The Digital Divide: Arguments for and Against Facebook, Google, Texting, and the Age of Social Networking
by Mark Bauerlein
Published 7 Sep 2011

How the Web Learns: Explicit vs. Implicit Meaning. But how does the Web learn? Some people imagine that for computer programs to understand and react to meaning, meaning needs to be encoded in some special taxonomy. What we see in practice is that meaning is learned “inferentially” from a body of data. Speech recognition and computer vision are both excellent examples of this kind of machine learning. But it’s important to realize that machine learning techniques apply to far more than just sensor data. For example, Google’s ad auction is a learning system, in which optimal ad placement and pricing are generated in real time by machine learning algorithms.

Carr, David Carr, Nicholas CDDB Cell phones cameras in in Kenya task switching and Centralization Centre for Addiction and Mental Health Chevy.com Chevy Tahoe Chua, Amy Cicierega, Neil Citizendium Citizen journalism Citizen media Civic causes, Net Geners and Civic disengagement Civic engagement Classmates.com Click Health Clinton, Hillary Clocks Cloudmark Club Penguin CNN CNN Pipeline Coates, Tom Cognition Digital Native differences in Internet use and multitasking and Cognitive psychology Cognitive science Cognitive surplus Cohan, Peter Cohen, Leonard Col, Cynthia Collaboration Collective intelligence collegeabc.com Comcast Company of Friends Complex adaptive networks Comprehension Compressed workweeks Computer-aided design (CAD) Computer games. See Video games Computer vision Computing Power and Human Reason: From Judgement to Calculation (Weizenbaum) Conceptual models Conspicuous consumption Continuous partial attention Cooperating data subsystems Copernicus Counterculture craigslist Craik, Fergus Crary, Jonathan Crest Critical thinking Crowdsourcing Cunningham, Ward CUSeeMe Customization Cyberculture Cyberpunk Czerwinski, Mary Daily Kos Dallas (television series) Darwin, Charles Dateline Dean, Howard Deductions Deep reading Deep thinking del.icio.us Dell Dennett, Daniel The Departed (film) Department of Defense, U.S., learning games and The Diagnosis (Lightman) The Diffusion Group Digg Digital Immigrants Digital Media and Learning Initiative Digital Natives.

pages: 359 words: 96,019

How to Turn Down a Billion Dollars: The Snapchat Story
by Billy Gallagher
Published 13 Feb 2018

The video did not do much to clear up the perplexity surrounding the app. But Evan has also embraced opaqueness at Snapchat. The app is designed for existing users rather than new ones; this helped early growth at Snapchat as users showed friends how to use all of the app’s features in person. Confusion also lets Snapchat work in private, building hardware, computer-vision software capable of analyzing Snapchat pictures, and other moonshot projects that are key to the company’s future. This attitude, combined with Snapchat’s youth-focused design, has led outsiders to question how serious the company is. As Snapchat set out on its IPO roadshow, Evan, Bobby, and the team found themselves pitching the company to potential investors far outside Snapchat’s core demographic.

When they eventually sold the company to Snapchat, it was a reunion of sorts, as Rodriguez had lived with Evan (and Reggie) in the Donner dorm at Stanford back in their freshman year. Evan set up a new division of the company, dubbed Snap Lab, and filled it with the ex-Vergence team and engineers with experience working on computer vision, gaze tracking, and speech recognition. Over the next year, Snapchat recruited a dozen wearable technology experts, industrial designers, and people with experience in the fashion industry. Members of the Snap Lab team took frequent trips to Shenzhen, China, to prepare a potential supply chain for a Snapchat hardware product.

pages: 346 words: 97,330

Ghost Work: How to Stop Silicon Valley From Building a New Global Underclass
by Mary L. Gray and Siddharth Suri
Published 6 May 2019

Forbes, December 18, 2017. https://www.forbes.com/sites/frederickdaso/2017/12/18/bill-gates-elon-musk-are-worried-about-automation-but-this-robotics-company-founder-embraces-it/. Dayton, Eldorous. Walter Reuther: The Autocrat of the Bargaining Table. New York: Devin-Adair, 1958. Deng, J., W. Dong, R. Socher, L. Li, Kai Li, and Li Fei-Fei. “ImageNet: A Large-Scale Hierarchical Image Database.” In 2009 IEEE Conference on Computer Vision and Pattern Recognition, 248–55. Piscataway, NJ: IEEE. https://doi.org/10.1109/CVPR.2009.5206848. Denyer, Simon. Rogue Elephant: Harnessing the Power of India’s Unruly Democracy. New York: Bloomsbury Press, 2014. DePillis, Lydia. “The Next Labor Fight Is Over When You Work, Not How Much You Make.”

Murphy, Machine Learning: A Probabilistic Perspective (Cambridge, MA: MIT Press, 2012). 9. Fei-Fei Li, “ImageNet: Where Have We Been? Where Are We Going?,” ACM Learning Webinar, https://learning.am.org/, accessed September 21, 2017; Deng et al., “ImageNet: A Large-Scale Hierarchical Image Database,” in 2009 IEEE Conference on Computer Vision and Pattern Recognition (Piscataway, NJ: IEEE), 248–55. 10. Accuracy went from 72 percent to 97 percent between 2010 and 2016. 11. In fact, Alexander Wissner-Gross said, “Data sets—not algorithms—might be the limiting factor to development of human-level artificial intelligence.”

pages: 122 words: 29,286

Learning Scikit-Learn: Machine Learning in Python
by Raúl Garreta and Guillermo Moncecchi
Published 14 Sep 2013

Thanks to all the people of the Natural Language Group and the Instituto de Computación at the Universidad de la República. I am proud of the great job we do every day building the Uruguayan NLP and ML community. About the Reviewers Andreas Hjortgaard Danielsen holds a Master's degree in Computer Science from the University of Copenhagen, where he specialized in Machine Learning and Computer Vision. While writing his Master's thesis, he was an intern research student in the Lampert Group at the Institute of Science and Technology (IST), Austria in Vienna. The topic of his thesis was object localization using conditional random fields with special focus on efficient parameter learning. He now works as a software developer in the information services industry where he has used scikit-learn for topic classification of text documents.

pages: 214 words: 31,751

Software Engineering at Google: Lessons Learned From Programming Over Time
by Titus Winters, Tom Manshreck and Hyrum Wright
Published 17 Mar 2020

It’s All About Dependencies: In looking through the above problems, one theme repeats over and over: managing your own code is fairly straightforward, but managing its dependencies is much harder (“Dependency Management” is devoted to covering this problem in detail). There are all sorts of dependencies: sometimes there’s a dependency on a task (e.g. “push the documentation before I mark a release as complete”), and sometimes there’s a dependency on an artifact (e.g. “I need to have the latest version of the computer vision library to build my code”). Sometimes you have internal dependencies on another part of your codebase, and sometimes you have external dependencies on code or data owned by another team (either in your organization or a third party). But in any case, the idea of “I need that before I can have this” is something that recurs repeatedly in the design of build systems, and managing dependencies is perhaps the most fundamental job of a build system.
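The “I need that before I can have this” idea can be illustrated with a topological sort. The snippet below is only a minimal sketch, not how any particular build system works: the target names (`vision_lib`, `my_code`, `docs`, `push_docs`, `release`) are hypothetical, loosely modeled on the task and artifact examples in the excerpt, and the ordering comes from Python's standard `graphlib` module (Python 3.9 or later).

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical targets: each key maps to the things that must exist first.
deps = {
    "vision_lib": [],                     # external artifact, no prerequisites
    "my_code": ["vision_lib"],            # artifact dependency
    "docs": ["my_code"],
    "push_docs": ["docs"],                # a task rather than an artifact
    "release": ["my_code", "push_docs"],  # task dependency: docs pushed first
}


def build_order(graph):
    """Return an order in which every target comes after its dependencies."""
    return list(TopologicalSorter(graph).static_order())


order = build_order(deps)
print(order)

# A circular dependency has no valid order, so the sorter refuses it.
try:
    build_order({"a": ["b"], "b": ["a"]})
except CycleError as err:
    print("cycle detected:", err.args[1])
```

Because every target in this toy graph depends on the one before it, only one order is possible. Real build graphs usually allow many valid orders, which is what lets a build system run independent steps in parallel.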

pages: 394 words: 108,215

What the Dormouse Said: How the Sixties Counterculture Shaped the Personal Computer Industry
by John Markoff
Published 1 Jan 2005

McCarthy had previously gotten Licklider interested in time-sharing, and years later McCarthy said that if he had known that Licklider was going to underwrite the MIT work, he would never have come to Stanford. Initially, McCarthy had been successful in getting a small amount of funding for AI research from Licklider, and the Digital Equipment Corporation had donated the PDP-1 to the young professor. McCarthy had meanwhile become interested in some vexing issues in computer vision that would need to be solved if robots were to recognize and manipulate blocks successfully. In 1964, he had applied for a larger grant, which he received, and he even had the audacity to ask ARPA to allow him to hire an executive officer. By that time, Ivan Sutherland, the designer of the brilliant Sketchpad drawing system, had succeeded Licklider.

It was an instant success, but then the legend grew over time as the world came to realize what Engelbart and his research team had wrought. One reason the presentation worked as well as it did was because at the other end of the hall, standing on a raised platform, was Bill English, Engelbart’s lead engineer. It was easy for Engelbart to wave his hands and conceptualize his computing vision, but someone had to build the demonstration from scratch. And that someone was English. An absolute pragmatist, he had an uncanny knack for making things work. English was the one who had tracked down the remarkable Eidophor video projector for the demonstration. On loan from NASA, and with the blessing of Bob Taylor at ARPA, the Eidophor was the only technology that could create the kind of effect that Engelbart had in mind.

pages: 419 words: 109,241

A World Without Work: Technology, Automation, and How We Should Respond
by Daniel Susskind
Published 14 Jan 2020

The Electronic Frontier Foundation lists the winning systems in a similar chart, and also plots the human error rate; see https://www.eff.org/ai/metrics#Vision (accessed July 2018). For an overview of the challenge, see Olga Russakovsky, Jia Deng, Hao Su, et al., “ImageNet Large Scale Visual Recognition Challenge,” International Journal of Computer Vision 115, no. 3 (2015): 211–52. 25. Quoted in Susskind and Susskind, Future of the Professions, p. 161. 26. Not all researchers changed direction of travel in this way, though. Marvin Minsky in fact made the opposite move, moving from bottom-up approaches to AI, to top-down ones instead; see https://www.youtube.com/watch?

“The Supreme Court Forecasting Project: Legal and Political Science Approaches to Predicting Supreme Court Decisionmaking.” Columbia Law Review 104, no. 4 (2004): 1150–1210. Russakovsky, Olga, Jia Deng, Hao Su, et al. “ImageNet Large Scale Visual Recognition Challenge.” International Journal of Computer Vision 115, no. 3 (2015): 211–52. Russell, Bertrand. In Praise of Idleness and Other Essays. New York: Routledge, 2004. Saez, Emmanuel. “Striking It Richer: The Evolution of Top Incomes in the United States.” Published online at https://eml.berkeley.edu/~saez/ (2016). Saez, Emmanuel, and Thomas Piketty.

The Smart Wife: Why Siri, Alexa, and Other Smart Home Devices Need a Feminist Reboot
by Yolande Strengers and Jenny Kennedy
Published 14 Apr 2020

The Look Book records what you wear and when, so that you can “keep track of your favorites and take your closet with you.”25 Amazon’s Echo Look also includes all the other usual Alexa accessories, and its features are likely to continue expanding. In 2018, Look users could crowdsource votes on their outfit; they will eventually be able to make use of a “mirror” that dresses them in virtual clothes.26 Using computer vision, pattern recognition, neural networks, and machine learning, the device is part of a system that will one day be able to design clothes by analyzing the Look’s database of images, identifying emerging trends, and then applying the learning to generate new items from scratch. Such possibilities raise a whole host of other consumption and sustainability concerns associated with fast fashion that support our arguments from chapter 4.

From 2013 on, Jenny simultaneously worked on another ARC project, titled “An Investigation of the Early Adoption and Appropriation of High-Speed Broadband in the Domestic Environment.” Together and with our colleagues, in 2018 we received a gift from the Intel Corporation to reanalyze data from Yolande’s ARC Automating the Smart Home project around Intel’s ambient computing vision for the smart home: protection, productivity, and pleasure—or the 3Ps. In 2017, we came up with the idea for this book, as a collective product of the gendered concerns that we’d both been raising in relation to our respective projects. We unofficially started a smart wife side project—and hired a fabulous research assistant, Paula Arcari, to help us fill in the gaps from our research thus far.

pages: 918 words: 257,605

The Age of Surveillance Capitalism
by Shoshana Zuboff
Published 15 Jan 2019

Adding insult to injury, data rendered by this wave of things are notoriously insecure and easily subject to breaches. Moreover, manufacturers have no legal responsibility to notify device owners when data are stolen or hacked. There are other, even more grandiose ambitions for the rendition of all solitary things. Companies such as Qualcomm, Intel, and ARM are developing tiny, always-on, low-power computer vision modules that can be added to any device, such as your phone or refrigerator, or any surface. A Qualcomm executive says that appliances and toys can know what’s going on around them: “A doll could detect when a child’s face turns toward it.”17 Consider “smart skin,” developed by brilliant university scientists and now poised for commercial elaboration.

The idea here is that readily available devices “can be used to transmit information to only wireless receivers that are in contact with the body,” thus creating the basis for secure and private communications independent of normal Wi-Fi transmissions, which can easily be detected.37 Take a casual stroll through the shop at the New Museum for Contemporary Art in Manhattan, and you pass a display of its bestseller: table-top mirrors whose reflecting surface is covered with the bright-orange message “Today’s Selfie Is Tomorrow’s Biometric Profile.” This “Think Privacy Selfie Mirror” is a project of the young Berlin-based artist Adam Harvey, whose work is aimed at the problem of surveillance and foiling the power of those who surveil. Harvey’s art begins with “reverse engineering… computer vision algorithms” in order to detect and exploit their vulnerabilities through camouflage and other forms of hiding. He is perhaps best known for his “Stealth Wear,” a series of wearable fashion pieces intended to overwhelm, confuse, and evade drone surveillance and, more broadly, facial-recognition software.

Now he redirects that meaning to create garments that separate human experience from the powers that surveil.38 Another Harvey project created an aesthetic of makeup and hairstyling—blue feathers suspended from thick black bangs, dreadlocks that dangle below the nose, cheekbones covered in thick wedges of black and white paint, tresses that snake around the face and neck like octopus tentacles—all designed to thwart facial-recognition software and other forms of computer vision. Harvey is one among a growing number of artists, often young artists, who direct their work to the themes of surveillance and resistance. Artist Benjamin Grosser’s Facebook and Twitter “demetricators” are software interfaces that present each site’s pages with their metrics deleted: “The numbers of ‘likes,’ ‘friends,’ followers, retweets… all disappear.”

pages: 197 words: 35,256

NumPy Cookbook
by Ivan Idris
Published 30 Sep 2012

Another option is to obtain the latest development version by cloning the Git repository, or downloading the repository as a zip file from Github. Then, you will need to run the following command: python setup.py install. Detecting corners: Corner detection (http://en.wikipedia.org/wiki/Corner_detection) is a standard technique in Computer Vision. scikits-image offers a Harris Corner Detector, which is great, because corner detection is pretty complicated. Obviously, we could do it ourselves from scratch, but that would violate the cardinal rule of not reinventing the wheel. Getting ready: You might need to install jpeglib on your system to be able to load the scikits-learn image, which is a JPEG file.
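To show what a Harris detector computes, here is a dependency-free sketch of the Harris response, R = det(M) - k * trace(M)^2, where M sums gradient products over a small window. It is an illustration only and not the scikits-image implementation: the function name `harris_response`, the 3x3 window, the constant `k = 0.05`, and the toy image are all assumptions made for this example. For real work, a library routine such as `skimage.feature.corner_harris` is the better choice, as the excerpt advises.

```python
def harris_response(img, k=0.05):
    """Harris response for a 2-D list of floats; border pixels stay at 0."""
    h, w = len(img), len(img[0])
    # Central-difference gradients, left at zero on the image border.
    ix = [[0.0] * w for _ in range(h)]
    iy = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ix[y][x] = (img[y][x + 1] - img[y][x - 1]) / 2.0
            iy[y][x] = (img[y + 1][x] - img[y - 1][x]) / 2.0
    resp = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Sum gradient products over a 3x3 window (the structure tensor M).
            sxx = sxy = syy = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    gx, gy = ix[y + dy][x + dx], iy[y + dy][x + dx]
                    sxx += gx * gx
                    sxy += gx * gy
                    syy += gy * gy
            det = sxx * syy - sxy * sxy
            trace = sxx + syy
            resp[y][x] = det - k * trace * trace
    return resp


# Hypothetical test image: a bright square filling the lower-right quadrant.
image = [[1.0 if (y >= 5 and x >= 5) else 0.0 for x in range(10)]
         for y in range(10)]
response = harris_response(image)

# (row, column) of the strongest response.
best = max(((y, x) for y in range(10) for x in range(10)),
           key=lambda p: response[p[0]][p[1]])
print(best, response[best[0]][best[1]])
```

On this toy image the strongest response falls on the corner of the square. Pixels along a straight edge score below zero and flat regions score exactly zero, which is the property that lets the detector separate corners from edges.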

pages: 370 words: 112,809

The Equality Machine: Harnessing Digital Technology for a Brighter, More Inclusive Future
by Orly Lobel
Published 17 Oct 2022

The company’s lead product, Traffic Jam, helps sift through data online to search for victims and trafficking rings. Local, state, and federal law enforcement, including the FBI, have used Traffic Jam to identify thousands of victims of sex trafficking, and it has also been adopted in Canada and the United Kingdom. AI can perform tasks exponentially faster than humans, saving massive amounts of time. Computer vision can identify multiple victims advertised and sold from the same hotel bedroom, identifying the bedding or wallpaper pattern, for example. Traffic Jam also sorts the kinds of language that are coded in online human trafficking advertisements. In 2017, Marinus Analytics also released Face Search, the first facial recognition tool to fight sex trafficking.

.… It doesn’t matter what your disease is; today, A.I. is not yet part of clinical treatment.”14 Barzilay developed an algorithm that analyzes mammogram images differently, assessing risk before cancer develops, which is something that is not attempted by human radiologists. In terms of detection of existing cancer, Barzilay believes that today the best radiologists are still better than machines, though the gap is narrowing fast. The year after she was diagnosed, she created a system that uses computer vision technology to independently learn about the patterns of diagnosing breast cancer. She partnered with Dr. Constance Lehman, chief of breast imaging at Boston’s Massachusetts General Hospital. Lehman herself serves on several key national committees and was eager to apply deep learning to all aspects of breast cancer care, from prevention to detection to treatment.

pages: 677 words: 206,548

Future Crimes: Everything Is Connected, Everyone Is Vulnerable and What We Can Do About It
by Marc Goodman
Published 24 Feb 2015

Baxter can learn to do simple tasks, such as “pick and place” objects on an assembly line, in just five minutes. It has an adorable face on its head-mounted display screen and two highly dexterous arms, which can move in any direction required to get a task done. Baxter requires no special programming and learns by using its computer vision to watch an employee perform a task, which the bot can repeat ad infinitum. As costs drop even further, these robots will be competitively priced compared with cheap overseas labor, and many hope a rise in domestic robotics use may lead to a renaissance in American manufacturing. Today robots are showing up everywhere from restaurants to hospitals.

But now another deeply disruptive player has entered the world of robotics: Google. The search giant is on a robo-buying binge and purchased or acquired eight separate robotics companies in a six-month period through 2014, including companies that specialize in humanoid walking robots, robotic arms, robotics software, and computer vision. Its largest and most surprising robotics acquisition, however, was the military robotics company Boston Dynamics, the same folks who make BigDog, Cheetah, Sand Flea, RiSE, and PETMAN (a biped humanoid robot that might well be the soldier of the future). Google also bested Facebook’s offer to buy Titan Aerospace, a maker of jet-sized solar-powered drones that can remain aloft for three years without landing.

Today we have the following: • algorithmic trading on Wall Street (bots carry out stock buys and sells) • algorithmic criminal justice (red-light and speeding cameras determine infractions of the law) • algorithmic border control (an AI can flag you and your luggage for screening) • algorithmic credit scoring (your FICO score determines your creditworthiness) • algorithmic surveillance (CCTV cameras can identify unusual activity by computer vision analysis, and voice recognition can scan your phone calls for troublesome keywords) • algorithmic health care (whether or not your request to see a specialist or your insurance claim is approved) • algorithmic warfare (drones and other robots have the technical capacity to find, target, and kill without human intervention) • algorithmic dating (eHarmony and others promise to use math to find your soul mate and the perfect match) Though the inventors of these algorithmic formulas might wish to suggest they are perfectly neutral, nothing could be further from the truth.

Scikit-Learn Cookbook
by Trent Hauck
Published 3 Nov 2014

He is the author of the book, Code Explorer's Guide to the Open Source Jungle, available online at https://leanpub.com/opensourcebook. To my beloved. Xingzhong is a PhD candidate in Electrical Engineering at Stevens Institute of Technology, Hoboken, New Jersey, where he works as a research assistant, designing and implementing machine-learning models in computer vision and signal processing applications. Although Python is his primary programming language, occasionally, for fun and curiosity, his works might be written on golang, Scala, JavaScript, and so on. As a self-confessed technology geek, he is passionate about exploring new software and hardware.

pages: 159 words: 42,401

Snowden's Box: Trust in the Age of Surveillance
by Jessica Bruder and Dale Maharidge
Published 29 Mar 2020

Beyond those we’ve already discussed, the algorithms are strongly biased towards white men and are much more likely to misidentify women and people of color — amplifying preexisting racial and gender biases. Enter Hyphen-Labs, an international collective of women technologists of color. The group is working with Berlin artist and privacy advocate Adam Harvey on the HyperFace Project: a purple camouflage scarf packed with ghost faces, designed to scramble computer-vision algorithms. In the meantime, Harvey has also been developing CV Dazzle: a free toolkit of fashion-based strategies that use hair and makeup to thwart facial-recognition software. The name is an homage to Dazzle camouflage — also known as Razzle Dazzle — a series of striking, black-and-white patterns used by the Allies during World War I to conceal their battleships’ size and orientation.

pages: 481 words: 125,946

What to Think About Machines That Think: Today's Leading Thinkers on the Age of Machine Intelligence
by John Brockman
Published 5 Oct 2015

What was needed was not only much more computer power but also a lot more data to train the network. After thirty years of research, a million-times improvement in computer power, and vast data sets from the Internet, we now know the answer to this question: Neural networks scaled up to twelve layers deep, with billions of connections, are outperforming the best algorithms in computer vision for object recognition and have revolutionized speech recognition. It’s rare for any algorithm to scale this well, which suggests that they may soon be able to solve even more difficult problems. Recent breakthroughs have been made that allow the application of deep learning to natural-language processing.

More flexibility means a greater ability to capture the patterns appearing in data but a greater risk of finding patterns that aren’t there. In artificial intelligence research, this tension between structure and flexibility manifests in different kinds of systems that can be used to solve challenging problems like speech recognition, computer vision, and machine translation. For decades, the systems that performed best on those problems came down on the side of structure: They were the result of careful planning, design, and tweaking by generations of engineers who thought about the characteristics of speech, images, and syntax and tried to build into the system their best guesses about how to interpret those particular kinds of data.

pages: 742 words: 137,937

The Future of the Professions: How Technology Will Transform the Work of Human Experts
by Richard Susskind and Daniel Susskind
Published 24 Aug 2015

Fenech (2014). 111 Micklethwait and Wooldridge, God is Back, 268. 112 <http://www.askmoses.com>. 113 <http://www.christianmingle.com>, <http://jdate.com>, <http://www.muslima.com>. 114 Emily Greenhouse, ‘Treasures in the Wall’, New Yorker, 1 Mar. 2013. 115 Lior Wolf et al., ‘Identifying Join Candidates in the Cairo Genizah’, International Journal of Computer Vision, 94: 1 (2011), 118–35. 116 Lior Wolf and Nachum Dershowitz, ‘Automatic Scribal Analysis of Tibetan Writing’, abstract for panel at the International Association for Tibetan Studies 2013 <http://www.cs.tau.ac.il/~wolf/papers/genizahijcv.pdf> (accessed 7 March 2015). 117 Adiel Ben-Shalom et al., ‘Where is my Other Half?’

Wolf, Lior, and Nachum Dershowitz, ‘Automatic Scribal Analysis of Tibetan Writing’, abstract for panel at the International Association for Tibetan Studies 2013 <http://www.cs.tau.ac.il/~wolf/papers/genizahijcv.pdf> (accessed 7 March 2015). Wolf, Lior, Rotem Littman, Naama Mayer, Tanya German, Nachum Dershowitz, Roni Shweka, and Yaacov Choueka, ‘Identifying Join Candidates in the Cairo Genizah’, International Journal of Computer Vision, 94: 1 (2011), 118–35. Wootton, Richard, John Craig, and Victor Patterson (eds.), Introduction to Telemedicine, 2nd edn. (London: Hodder Arnold, 2011). Zittrain, Jonathan, The Future of the Internet—And How to Stop It (New Haven: Yale University Press, 2009). Zittrain, Jonathan, and Benjamin Edelman, ‘Documentation of Internet Filtering in Saudi-Arabia’, 12 Sept. 2002 <http://cyber.law.harvard.edu/filtering/saudiarabia/> (accessed 7 March 2015).

pages: 170 words: 49,193

The People vs Tech: How the Internet Is Killing Democracy (And How We Save It)
by Jamie Bartlett
Published 4 Apr 2018

Underneath Tony’s feet and behind his oversized wheel were work-in-progress wires, pumps, shiny levers and many cogs. They were connected to computers in the back of the cab, which were under Kartik’s command. The software controlled the pedals and steering wheel, which constantly adjust to real-time data collected by mounted radars and computer vision sensors that covered the vehicle: position, speed, road markings, other cars’ positions and speed and so on. We left the narrow residential roads, and joined Freeway 95. Tony turned to Stefan: ‘I could kick the system on if you guys are ready.’ ‘Rosebud on,’ Stefan shouted into his walkie-talkie to other crew members, who were following in a car.

pages: 187 words: 50,083

Collaborative Society
by Dariusz Jemielniak and Aleksandra Przegalinska
Published 18 Feb 2020

Antiscience: Social movements and groups convinced that the scientific world is either in conspiracy with industry or simply not competent enough; they thus treat scientific knowledge with suspicion and disbelief, and actively oppose it (e.g., antivaccination movements, Flat Earthers). Artificial intelligence (AI): A science and a set of computational technologies that are inspired by the ways people use their nervous systems and bodies to take actions, acquire knowledge, and reason about the world. AI is a very broad field, consisting of such domains as computer vision, natural language processing, robotic process automation, expert systems, and machine learning. Biohacking: This term, broadly referring to using technology to change or influence biological organisms, can be used in diverse contexts: do-it-yourself (DIY) biology (studying biology on one’s own and pursuing biological experiments); grinder biohacking (altering bodies by implanting DIY cybernetic devices and wearable technologies); nutrigenomics (using nutrition to hack human biology); and in the Quantified Self movement (for measuring activity, behaviors, and biomarkers to optimize health, well-being, and mental states).

pages: 194 words: 57,434

The Age of AI: And Our Human Future
by Henry A Kissinger, Eric Schmidt and Daniel Huttenlocher
Published 2 Nov 2021

Despite Facebook reportedly having tens of thousands of people working on content moderation—with the objective of removing offensive content before users see it—the scale is simply such that it cannot be accomplished without AI. Such monitoring needs at Facebook and other companies have driven extensive research and development in an effort to automate text and image analysis by creating increasingly sophisticated machine learning, natural language processing, and computer vision techniques. For Facebook, the number of removals is currently on the order of roughly one billion fake accounts and spam posts per quarter as well as tens of millions of pieces of content involving nudity or sexual activity, bullying and harassment, exploitation, hate speech, drugs, and violence.

pages: 219 words: 63,495

50 Future Ideas You Really Need to Know
by Richard Watson
Published 5 Nov 2013

Second, if something artificial were to develop consciousness, why would it automatically let us know? Perhaps it would keep this to itself and refuse to participate in childish intelligence tests. The 60s and 70s saw a great deal of progress in AI, but breakthroughs failed to come. Instead scientists and developers focused on specific problems, such as speech and text recognition and computer vision. However, we may now be less than a decade away from seeing the AI vision become a reality. The Chinese room experiment In 1980, John Searle, an American philosopher, argued in a paper that a computer, or perhaps more accurately a bit of software, could pass the Turing test and behave much like a human being at a distance without being truly intelligent—that words, symbols or instructions could be interpreted or reacted to without any true understanding.

Speaking Code: Coding as Aesthetic and Political Expression
by Geoff Cox and Alex McLean
Published 9 Nov 2012

“Hello world!” is expressed thus: >+++++++++[<++++++++>-]<.>+++++++[<++++>-]<+.+++++++..+++.>>>++++++++[<++++>-]<.>>>++++++++++[<+++++++++>-]<---.<<<<.+++.------.--------.>>+. Taking this indeterminacy further still, Brainfuck exceeds the world of computation in Bodyfuck, an interpreter using computer vision techniques to map bodily gestures to the Brainfuck instruction set.7 But as with all signifying systems, interpretation still takes place at all levels, even when they are as esoteric as the examples mentioned above. The reader, whether human or machine, is also cast as one of the objects of the software and operating system.
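To make the excerpt's Brainfuck listing concrete, here is a minimal interpreter sketch in Python. It is an illustration written for this page, not code from the book or from Bodyfuck. The function name `run_brainfuck`, the 30,000-cell tape and the 8-bit wrap-around are assumptions (they are common conventions, but the excerpt does not specify them), and the input command `,` is ignored because the listing never uses it. The program string is the "Hello world" listing with every loop decrementing its counter.

```python
def run_brainfuck(program: str, tape_len: int = 30000) -> str:
    """Interpret a Brainfuck program that reads no input; return its output."""
    # Pre-compute matching bracket positions so each loop jump is a lookup.
    jumps, stack = {}, []
    for pos, ch in enumerate(program):
        if ch == "[":
            stack.append(pos)
        elif ch == "]":
            start = stack.pop()
            jumps[start], jumps[pos] = pos, start

    tape = [0] * tape_len   # assumed tape size (a common convention)
    ptr = pc = 0            # data pointer and program counter
    out = []
    while pc < len(program):
        ch = program[pc]
        if ch == ">":
            ptr += 1
        elif ch == "<":
            ptr -= 1
        elif ch == "+":
            tape[ptr] = (tape[ptr] + 1) % 256   # assumed 8-bit cells
        elif ch == "-":
            tape[ptr] = (tape[ptr] - 1) % 256
        elif ch == ".":
            out.append(chr(tape[ptr]))
        elif ch == "[" and tape[ptr] == 0:
            pc = jumps[pc]                      # skip the loop body
        elif ch == "]" and tape[ptr] != 0:
            pc = jumps[pc]                      # jump back to the loop start
        pc += 1
    return "".join(out)


# The listing from the excerpt, split across lines for readability.
HELLO = (
    ">+++++++++[<++++++++>-]<.>+++++++[<++++>-]<+.+++++++..+++."
    ">>>++++++++[<++++>-]<.>>>++++++++++[<+++++++++>-]<---."
    "<<<<.+++.------.--------.>>+."
)

print(run_brainfuck(HELLO))
```

Because the language has only eight commands, any front end that can emit those eight symbols can drive an interpreter like this one. The excerpt describes Bodyfuck as producing them from recognised gestures rather than keystrokes.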

Natural Language Processing with Python and spaCy
by Yuli Vasiliev
Published 2 Apr 2020

You can also connect the statistical models trained by other popular machine learning (ML) libraries, such as TensorFlow, Keras, scikit-learn, and PyTorch. In addition, spaCy can operate seamlessly with other libraries in Python’s AI ecosystem, allowing you to, for example, take advantage of computer vision in your chatbot application, as you’ll do in Chapter 12. Who Should Read This Book? This book is for those interested in learning how to use NLP in practice. In particular, it might be interesting to people who want to develop chatbots for businesses or just for fun. Regardless of your background or experience with NLP or programming, you’ll be able to follow the code examples provided in this book because they all include detailed explanations of the process involved.

pages: 574 words: 164,509

Superintelligence: Paths, Dangers, Strategies
by Nick Bostrom
Published 3 Jun 2014

They also provide important insight into the concept of causality.28 One advantage of relating learning problems from specific domains to the general problem of Bayesian inference is that new algorithms that make Bayesian inference more efficient will then yield immediate improvements across many different areas. Advances in Monte Carlo approximation techniques, for example, are directly applied in computer vision, robotics, and computational genetics. Another advantage is that it lets researchers from different disciplines more easily pool their findings. Graphical models and Bayesian statistics have become a shared focus of research in many fields, including machine learning, statistical physics, bioinformatics, combinatorial optimization, and communication theory.35 A fair amount of the recent progress in machine learning has resulted from incorporating formal results originally derived in other academic fields.

pages: 561 words: 163,916

The History of the Future: Oculus, Facebook, and the Revolution That Swept Virtual Reality
by Blake J. Harris
Published 19 Feb 2019

I wanted to stay, but they made me get moved.” Iribe nodded. “Where are you headed?” “I’m going up to the bank. The SECU bank.” “That’s pretty far. Why don’t you get in and let’s talk? I’ll give you a ride.” As the two caught up, and Antonov talked about a job he’d recently taken—which involved UI (user interface) design and computer vision for handwriting recognition—Iribe realized that Antonov must actually be pretty great with computers. “I’m still working for that company Quatrefoil,” Iribe explained. “It’s a tech museum project and I could really use some help.” Antonov remained relatively neutral on the idea until Iribe started talking about graphics.

Luckey and Dycus were at the office trying to determine the ideal eye-relief settings—the ideal distance between lens and eyeball—that ought to be implemented into the eyecups of DK1 before manufacturing began in China. At a normal company, this would all be calibrated with some sort of fancy computer vision system. But this being a start-up, always short on time and money, they needed a faster and cheaper solution. “Chris! I got it!” Luckey had shouted minutes earlier. “We’re going to drill a hole in the center of the lens and then run a flat-top screw through the opening so it’s jutting out, you know?

pages: 533

Future Politics: Living Together in a World Transformed by Tech
by Jamie Susskind
Published 3 Sep 2018

‘Google’s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation’, arXiv, 8 October 2016 <https://arxiv.org/abs/1609.08144> (accessed 6 December 2017); Yaniv Taigman et al., ‘DeepFace: Closing the Gap to Human-Level Performance in Face Verification’, 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2014 <https://www.cs.toronto.edu/~ranzato/publications/taigman_cvpr14.pdf> (accessed 11 December 2017); Aäron van den Oord et al., ‘WaveNet: A Generative Model for Raw Audio’, arXiv, 19 September 2016 <https://arxiv.org/abs/1609.03499> (accessed 6 December 2017). 4.

Swift, Adam. Political Philosophy: A Beginners’ Guide for Students and Politicians (Second Edition). Cambridge: Polity Press, 2007. Taigman, Yaniv, Ming Yang, Marc’Aurelio Ranzato, and Lior Wolf. ‘DeepFace: Closing the Gap to Human-Level Performance in Face Verification’. 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014 <https://www.cs.toronto.edu/~ranzato/publications/taigman_cvpr14.pdf> (accessed 11 Dec. 2017). Takahashi, Dean. ‘Magic Leap Sheds Light on its Retina-based Augmented Reality 3D Displays’. VentureBeat, 20 Feb. 2015 <http://venturebeat.com/2015/02/20/magic-leap-sheds-light-on-its-retina-based-augmented-reality-3d-displays/> (accessed 30 Nov. 2017).

The Singularity Is Nearer: When We Merge with AI
by Ray Kurzweil
Published 25 Jun 2024

A growing body of research suggests that for many applications, algorithmic progress is roughly as important as hardware progress. According to a 2022 study, better algorithms halved compute requirements for a given level of performance every nine months from 2012–2021. See Ege Erdil and Tamay Besiroglu, “Algorithmic Progress in Computer Vision,” arXiv:2212.05153v4 [cs.CV] August 24, 2023, https://arxiv.org/pdf/2212.05153.pdf; Katja Grace, Algorithmic Progress in Six Domains, Machine Intelligence Research Institute technical report 2013-3, December 9, 2013, https://intelligence.org/files/AlgorithmicProgress.pdf. Anderljung et al., “Compute Funds and Pre-Trained Models.”
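The nine-month halving figure compounds quickly, and a short calculation shows how much. The sketch below assumes the halving rate stayed constant across the full nine years from 2012 to 2021. That is a simplification made here for illustration, and the helper name `compute_reduction` is hypothetical.

```python
def compute_reduction(months_elapsed: float, halving_months: float = 9.0) -> float:
    """Factor by which required compute shrinks if it halves every `halving_months`."""
    return 2.0 ** (months_elapsed / halving_months)


# Assumption: a constant rate over exactly nine years (2012 to 2021).
months = (2021 - 2012) * 12          # 108 months
halvings = months / 9.0              # 12 halvings
factor = compute_reduction(months)   # 2 ** 12

print(f"{months} months = {halvings:.0f} halvings, about a {factor:,.0f}x reduction")
```

Under that assumption, twelve successive halvings cut the compute needed for a fixed level of performance by a factor of 4,096.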

id=qrcuDwAAQBAJ; Thomas Anthony, Zheng Tian, and David Barber, “Thinking Fast and Slow with Deep Learning and Tree Search,” 31st Conference on Neural Information Processing Systems (NIPS 2017), revised December 3, 2017, https://arxiv.org/pdf/1705.08439.pdf; Kaiming He et al., “Deep Residual Learning for Image Recognition,” 2016 IEEE Conference on Computer Vision and Pattern Recognition, December 10, 2015, https://arxiv.org/pdf/1512.03385.pdf. Holt, “Waymo’s Autonomous Vehicles Have Clocked 20 Million Miles on Public Roads.” In 2021, the employed labor force in the US was around 155 million people.

pages: 237 words: 64,411

Humans Need Not Apply: A Guide to Wealth and Work in the Age of Artificial Intelligence
by Jerry Kaplan
Published 3 Aug 2015

A Japanese competitor claims that its technology can reduce strawberry picking time by 40 percent.23 Blue River Technologies, a Silicon Valley venture-funded startup headed by a Stanford graduate, is developing robots that can weed. To quote from their marketing materials: “We are creating systems that can distinguish crops from weeds in order to kill the weeds without harming the crops or the environment. Our systems use cameras, computer vision, and machine learning algorithms.”24 Note that the coming army of mechanical farmworkers doesn’t have to be faster than the workers they replace because, like autonomous vehicles, they can work in the dark and so aren’t limited to operating in daylight. Warehouse workers. Beyond the picking and packing of orders, as I’ve described above, there’s the loading and unloading of packages.

pages: 271 words: 62,538

The Best Interface Is No Interface: The Simple Path to Brilliant Technology (Voices That Matter)
by Golden Krishna
Published 10 Feb 2015

Among many things, they monitored defensive impact, the speed and distance of each player, and his number of passes.19 The players wouldn’t have to wear any extra gadgets to enable the system, and they didn’t have to download any apps to empower it; the cameras would work seamlessly and invisibly while the players did their typical thing during each game. To put it technically: There are six computer vision cameras set up along the catwalk of the arena—three per half court. These cameras are synched with complex algorithms extracting x, y, and z positioning data for all objects on the court, capturing 25 pictures per second.20 For fans, the data from the cameras gave them new ways to admire their favorite players.
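The speed and distance figures mentioned in the excerpt follow directly from per-frame positions. Here is a rough sketch of that arithmetic, not the vendor's actual pipeline. It assumes each player's position arrives as one (x, y, z) tuple per frame at the 25 frames per second quoted above and that coordinates are in metres. The function name `distance_and_speeds` and the sample track are hypothetical.

```python
import math

FPS = 25  # capture rate quoted in the passage


def distance_and_speeds(positions, fps=FPS):
    """Return (total distance, per-frame speeds) for a list of (x, y, z) samples.

    Assumes one sample per frame and coordinates in metres, so speeds are in m/s.
    """
    dt = 1.0 / fps                      # seconds between consecutive frames
    total = 0.0
    speeds = []
    for prev, curr in zip(positions, positions[1:]):
        step = math.dist(prev, curr)    # straight-line distance between frames
        total += step
        speeds.append(step / dt)
    return total, speeds


# Hypothetical track: a player moving 0.2 m per frame along a straight line.
track = [(0.0, 0.0, 0.0), (0.12, 0.16, 0.0), (0.24, 0.32, 0.0)]
total, speeds = distance_and_speeds(track)
print(f"distance: {total:.2f} m, speeds: {[round(s, 2) for s in speeds]} m/s")
```

Summing frame-to-frame steps slightly overestimates distance when the position data is noisy, so a production system would presumably smooth the track first. That step is left out here to keep the arithmetic visible.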

pages: 225 words: 70,241

Silicon City: San Francisco in the Long Shadow of the Valley
by Cary McClelland
Published 8 Oct 2018

I arrived here in September of 2004 to get my PhD, and by October, I was in the Mojave Desert. The research group I joined had started a really cool project, building a car that drives itself. There was a race called the DARPA Grand Challenge, 150 miles, autonomous cars racing through the desert.† I built the computer vision: using a combination of lasers and cameras to figure out what the road looks like and to decide when we could drive faster and slower. In the desert, you just had to drive straight ahead, but on real roads, you had to look all around you for other cars, for lane markers and so on. So we decided to just play with it.

pages: 234 words: 67,589

Internet for the People: The Fight for Our Digital Future
by Ben Tarnoff
Published 13 Jun 2022

This revival was made possible by a number of factors, foremost among them advances in computing power and the abundance of training data that could be sourced from the internet. Deep learning is the paradigm that underlies much of what is currently known as “artificial intelligence,” and has centrally contributed to significant breakthroughs in computer vision and natural language processing. See Andrey Kurenkov, “A Brief History of Neural Nets and Deep Learning,” Skynet Today, September 27, 2020, and Alex Hanna et al., “Lines of Sight,” Logic, December 20, 2020. 109, The sophistication of these systems … “Data imperative”: Marion Fourcade and Kieran Healy, “Seeing Like a Market,” Socio-Economic Review 15, no. 1 (2017): 9–29. 110, The same individual … Smartphone usage: “Mobile Fact Sheet,” April 7, 2021, Pew Research Center.

pages: 834 words: 180,700

The Architecture of Open Source Applications
by Amy Brown and Greg Wilson
Published 24 May 2011

He co-founded Kitware in 1998 and since then has helped grow the company to its current position as a leading R&D provider with clients across many government and commercial sectors. Aaron Mavrinac (Thousand Parsec): Aaron is a Ph.D. candidate in electrical and computer engineering at the University of Windsor, researching camera networks, computer vision, and robotics. When there is free time, he fills some of it working on Thousand Parsec and other free software, coding in Python and C, and doing too many other things to get good at any of them. His web site is http://www.mavrinac.com. Kim Moir (Eclipse): Kim works at the IBM Rational Software lab in Ottawa as the Release Engineering lead for the Eclipse and Runtime Equinox projects and is a member of the Eclipse Architecture Council.

Several common cases, like adding a new library, required changing that source and checking in a new binary. Although this was a unified system in some sense, it had many shortcomings. The other approach the developers had experience with was a gmake based build system for TargetJr. TargetJr was a C++ computer vision environment originally developed on Sun workstations. Originally TargetJr used the imake system to create Makefiles. However, at some point, when a Windows port was needed, the gmake system was created. Both Unix compilers and Windows compilers could be used with this gmake-based system. The system required several environment variables to be set prior to running gmake.

pages: 280 words: 74,559

Fully Automated Luxury Communism
by Aaron Bastani
Published 10 Jun 2019

While the world economy may be much bigger now than it was in 1900, employing more people and enjoying far higher output per person, the lines of work nearly everyone performs – drivers, nurses, teachers and cashiers – aren’t particularly new. Actually Existing Automation In March 2017 Amazon launched its Amazon GO store in downtown Seattle. Using computer vision, deep learning algorithms, and sensor fusion to identify selected items, the company looked to build a near fully automated store without cashiers. Here Amazon customers would be able to buy items simply by swiping in with a phone, choosing the things they wanted and swiping out to leave, their purchases automatically debited to their Amazon account.

pages: 296 words: 78,631

Hello World: Being Human in the Age of Algorithms
by Hannah Fry
Published 17 Sep 2018

Mahmood Sharif, Sruti Bhagavatula, Lujo Bauer and Michael Reiter, ‘Accessorize to a crime: real and stealthy attacks on state-of-the-art face recognition’, paper presented at ACM SIGSAC Conference, 2016, https://www.cs.cmu.edu/~sbhagava/papers/face-rec-ccs16.pdf. See also: https://commons.wikimedia.org/wiki/File:Milla_Jovovich_Cannes_2011.jpg. 64. Ira Kemelmacher-Shlizerman, Steven M. Seitz, Daniel Miller and Evan Brossard, The MegaFace Benchmark: 1 Million Faces for Recognition at Scale, Computer Vision Foundation, 2015, https://arxiv.org/abs/1512.00596 65. ‘Half of all American adults are in a police face recognition database, new report finds’, press release, Georgetown Law, 18 Oct. 2016, https://www.law.georgetown.edu/news/press-releases/half-of-all-american-adults-are-in-a-police-face-recognition-database-new-report-finds.cfm. 66.

pages: 685 words: 203,949

The Organized Mind: Thinking Straight in the Age of Information Overload
by Daniel J. Levitin
Published 18 Aug 2014

How would they divide up the United States into surveillable sections with a high-enough resolution to spot the balloons, but still be able to navigate the enormous number of photographs quickly? Would the satellite images be analyzed by rooms full of humans, or would the winning team perfect a computer-vision algorithm for distinguishing the red balloons from other balloons and from other round, red objects that were not the target? (Effectively solving the Where’s Waldo? problem, something that computer programs couldn’t do until 2011.) Further speculation revolved around the use of reconnaissance planes, telescopes, sonar, and radar.

Automatic object recognition applied to Where’s Waldo? Aerospace and Electronics Conference (NAECON), 2012 IEEE National, 117–120. and, Garg, R., Seitz, S. M., Ramanan, D., & Snavely, N. (2011, June). Where’s Waldo: Matching people in images of crowds. Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition, 1793–1800. Wikipedia is an example of crowdsourcing Ayers, P., Matthews, C., & Yates, B. (2008). How Wikipedia works: And how you can be a part of it. San Francisco, CA: No Starch Press, p. 514. More than 4.5 million people Kickstarter, Inc. (2014). Seven things to know about Kickstarter.

pages: 337 words: 86,320

Everybody Lies: Big Data, New Data, and What the Internet Can Tell Us About Who We Really Are
by Seth Stephens-Davidowitz
Published 8 May 2017

Politico, August 30, 2013, http://www.politico.com/media/story/2013/08/how-much-does-the-new-york-post-actually-lose-001176. 97 Shapiro told me: I interviewed Matt Gentzkow and Jesse Shapiro on August 16, 2015, at the Royal Sonesta Boston. 98 scanned yearbooks from American high schools: Kate Rakelly, Sarah Sachs, Brian Yin, and Alexei A. Efros, “A Century of Portraits: A Visual Historical Record of American High School Yearbooks,” paper presented at International Conference on Computer Vision, 2015. The photos are reprinted with permission from the authors. 99 subjects in photos copied subjects in paintings: See, for example, Christina Kotchemidova, “Why We Say ‘Cheese’: Producing the Smile in Snapshot Photography,” Critical Studies in Media Communication 22, no. 1 (2005). 100 measure GDP based on how much light there is in these countries at night: J.

pages: 361 words: 83,886

Inside the Robot Kingdom: Japan, Mechatronics and the Coming Robotopia
by Frederik L. Schodt
Published 31 Mar 1988

In an attempt to bolster its competitiveness, in 1985 it once more turned to a U.S. firm and took out a license for a new generation of robots made by Adept Technology, a small California firm founded by former Unimation employees. The current star of the U.S. robotics industry, Adept Technology was the first to manufacture a commercial "direct drive" robot, which uses special electric motors that eliminate the need for almost all gears and is exceedingly fast and accurate. Employing the latest in computer vision and American software, the Adept robot shows how American firms can still have an advantage over Japan in state-of-the-art technologies, and also how thoroughly intertwined the Japanese and American robotics industries have become. The first experimental direct-drive robot arm was developed in 1981 at Carnegie-Mellon University in the U.S., mainly by two Japanese scientists, Haruhiko Asada and Takeo Kanade.

pages: 330 words: 83,319

The New Rules of War: Victory in the Age of Durable Disorder
by Sean McFate
Published 22 Jan 2019

Rise of the robots: Matthew Rosenberg and John Markoff, “The Pentagon’s ‘Terminator Conundrum’: Robots That Could Kill on Their Own,” New York Times, 25 October 2016, www.nytimes.com/2016/10/26/us/pentagon-artificial-intelligence-terminator.html; Kevin Warwick, “Back to the Future,” Leviathan, BBC News, 1 January 2000, http://news.bbc.co.uk/hi/english/static/special_report/1999/12/99/back_to_the_future/kevin_warwick.stm. 5. Robots are stupid: Andrej Karpathy and Li Fei-Fei, “Deep Visual-Semantic Alignments for Generating Image Descriptions,” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2015): 3128–37, http://cs.stanford.edu/people/karpathy/cvpr2015.pdf. 6. What is “cyber”?: Cyber is a prefix used to describe anything having to do with computers, which doesn’t explain much. The term “cyber” was coined by the science fiction writer William Gibson in the 1980s but has advanced little as a concept since then.

pages: 301 words: 85,263

New Dark Age: Technology and the End of the Future
by James Bridle
Published 18 Jun 2018

Instead, they only find the faces in the bottom row appearing somewhat more relaxed than those in the top row. Perhaps, the different perceptions here are due to cultural differences.’10 What was left untouched in the original paper was the assumption that any such system could ever be free of encoded, embedded bias. At the outset of their study, the authors write: ‘Unlike a human examiner/judge, a computer vision algorithm or classifier has absolutely no subjective baggages, having no emotions, no biases whatsoever due to past experience, race, religion, political doctrine, gender, age, etc., no mental fatigue, no preconditioning of a bad sleep or meal. The automated inference on criminality eliminates the variable of meta-accuracy (the competence of the human judge/examiner) all together.’11 In their response, they double down on this assertion: ‘Like most technologies, machine learning is neutral.’

pages: 245 words: 83,272

Artificial Unintelligence: How Computers Misunderstand the World
by Meredith Broussard
Published 19 Apr 2018

“Today, all of the high-end cars have features like adaptive cruise control or parking assistance. It’s getting more and more automated,” explained Dan Lee, associate professor of engineering and the team’s adviser. “Now, to do it fully, the car has to have a complete awareness of the surrounding world. These are the hard problems of robotics: computer vision, having computers ‘hear’ sounds, having computers understand what’s happening in the world around them. This is a good environment to test these things.” For Little Ben to “see” an obstacle and drive around it, the automated driving and GPS navigation had to work properly, and the laser sensors on the roof rack had to observe the object.

pages: 301 words: 89,076

The Globotics Upheaval: Globalisation, Robotics and the Future of Work
by Richard Baldwin
Published 10 Jan 2019

Some of the more innovative uses of white-collar robots are in psychology. Ellie is an on-screen white-collar robot (some call it an avatar but that is focusing on the image and underplaying the technology driving the image). She looks and acts human enough to make people comfortable talking to her. Computer vision and a Kinect sensor allow her to record body language and subtle facial clues that she then codifies for a human psychologist to evaluate. Research shows that she is better at such data gathering than humans—in part because people feel freer to open up to a robot. University of Southern California researchers created Ellie as part of a program financed by the US Defense Advanced Research Projects Agency.

pages: 301 words: 85,126

AIQ: How People and Machines Are Smarter Together
by Nick Polson and James Scott
Published 14 May 2018

The Stanford researchers compiled 19 databases containing 129,450 images, each of them classified according to a taxonomy of 2,032 different skin lesions. More data means a wider range of experience and thus better pattern recognition, like a veteran dermatologist who’s been looking at skin lesions for decades and who’s seen it all. The second choice was their approach to computer vision, which involved the deep neural networks we met in chapter 2. These networks can extract subtle visual features, and they can combine those features into high-level visual concepts—like circles, edges, stripes, texture, or nuances of variegation—that can be used to distinguish 2,000 different types of skin lesion.

Know Thyself
by Stephen M Fleming
Published 27 Apr 2021

International Journal of Law and Psychiatry 62 (2019): 56–76. Kelley, W. M., C. N. Macrae, C. L. Wyland, S. Caglar, S. Inati, and T. F. Heatherton. “Finding the Self? An Event-Related fMRI Study.” Journal of Cognitive Neuroscience 14, no. 5 (2002): 785–794. Kendall, Alex, and Yarin Gal. “What Uncertainties Do We Need in Bayesian Deep Learning for Computer Vision?” arXiv.org, October 5, 2017. Kentridge, R. W., and C. A. Heywood. “Metacognition and Awareness.” Consciousness and Cognition 9, no. 2 (2000): 308–312. Kepecs, Adam, Naoshige Uchida, Hatim A. Zariwala, and Zachary F. Mainen. “Neural Correlates, Computation and Behavioural Impact of Decision Confidence.”

pages: 295 words: 81,861

Road to Nowhere: What Silicon Valley Gets Wrong About the Future of Transportation
by Paris Marx
Published 4 Jul 2022

The US government was also interested in the prospect of autonomous driving, both for military and civilian uses. In the 1980s, the DARPA Strategic Computing Initiative funded the Autonomous Land Vehicle project as part of its efforts to “bring new technologies to the battlefield.”9 The project made significant advances in the use of laser imaging and computer vision for autonomous navigation. One of the beneficiaries of that funding was Carnegie Mellon University, which used the money from DARPA to create its first Navlab autonomous vehicle. The experiment served as a foundation for future research aimed at civilian uses. To that end, the Intermodal Surface Transportation Efficiency Act of 1991 mandated the Department of Transportation (DOT) to “develop an automated highway and vehicle prototype from which future fully automated intelligent vehicle-highway systems can be developed.”10 To achieve its aims by 1997, it handed out nearly $100 million to partners in the private sector and university research centers, including the same team at Carnegie Mellon.

pages: 300 words: 81,293

Supertall: How the World's Tallest Buildings Are Reshaping Our Cities and Our Lives
by Stefan Al
Published 11 Apr 2022

The computer model can run virtual experiments and test policies before they are actually implemented. For instance, it can explore the impact of a new building or park on the shadows and wind flows. Systems like this may soon be able to calculate and evaluate the many opportunities for buildings to generate resources. This Big Brother may also be watching you, though, with computer vision monitoring your every garbage and sewage output. With skyscrapers producing and sharing energy, food, species, and more, our world may look quite different. Imagine a day in Singapore a few years from now. You throw your recyclable trash into a chute, where a system of pneumatic tubes sucks it to the recycling plant, avoiding the need for a polluting garbage truck.

pages: 336 words: 93,672

The Future of the Brain: Essays by the World's Leading Neuroscientists
by Gary Marcus and Jeremy Freeman
Published 1 Nov 2014

Despite the terrific progress that cognitive neuroscience of language has made in the last twenty years, mechanistic neurobiological explanations are lacking.

Some Promising Directions: Correlational Examples, with Explanatory Ambitions

Syntactic Primitives

The goals of syntactic research over the last twenty years align well with the goals of cognitive and systems neuroscience (for example, in work on computational vision, see chapter by Carandini): to identify fundamental neuronal computations that (i) underlie a large number of (linguistic) phenomena, and (ii) rely as little as possible on domain-specific properties. As a concrete example, the syntactic theory known as minimalism, developed by Chomsky and others, has formulated a two-step syntactic function called “Merge” (see above re concatenation) that separates into a domain-general computation that combines elements (somewhat akin to binding, in the context of systems neuroscience), and a probably more domain-specific computation that labels the output of the binding computation:

(1) Bind: Given an expression A and an expression B, bind $A, B \to \{A, B\}$
(2) Label: Given a combined $\{A, B\}$, label the complex A or B; $\to \{_A\ A, B\}$ or $\{_B\ A, B\}$

Recent work in linguistics suggests that many of the complex properties of natural languages can be modeled as repeated applications of these Bind and Label computations.
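The two-step Merge operation described in this excerpt can be illustrated with a small sketch. This is only a toy model under stated assumptions: the names `bind`, `label`, `merge`, and `Labeled` are hypothetical and do not come from the book or from any linguistics library, and expressions are represented as plain Python values.

```python
from typing import Any, NamedTuple


class Labeled(NamedTuple):
    """A bound pair together with the member chosen as its label (hypothetical type)."""
    label: Any
    members: frozenset


def bind(a: Any, b: Any) -> frozenset:
    """Step 1 (Bind): combine two expressions into an unordered set {A, B}."""
    return frozenset((a, b))


def label(combined: frozenset, head: Any) -> Labeled:
    """Step 2 (Label): mark the combined set with one of its own members."""
    if head not in combined:
        raise ValueError("the label must be one of the bound expressions")
    return Labeled(label=head, members=combined)


def merge(a: Any, b: Any, head: Any) -> Labeled:
    """Merge as Bind followed by Label."""
    return label(bind(a, b), head)


# Repeated application: the output of one Merge is an input to the next.
dp = merge("the", "dog", head="the")
vp = merge("saw", dp, head="saw")
print(vp.label, dp in vp.members)
```

Keeping `bind` and `label` as separate functions mirrors the excerpt's split between a domain-general combining step and a more domain-specific labeling step; which member becomes the label is supplied by the caller here, since the excerpt does not say how that choice is made.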

pages: 404 words: 92,713

The Art of Statistics: How to Learn From Data
by David Spiegelhalter
Published 2 Sep 2019

‘Narrow’ AI refers to systems that can carry out closely prescribed tasks, and there have been some extraordinarily successful examples based on machine learning, which involves developing algorithms through statistical analysis of large sets of historical examples. Notable successes include speech recognition systems built into phones, tablets and computers; programs such as Google Translate which know little grammar but have learned to translate text from an immense published archive; and computer vision software that uses past images to ‘learn’ to identify, say, faces in photographs or other cars in the view of self-driving vehicles. There has also been spectacular progress in systems playing games, such as the DeepMind software learning the rules of computer games and becoming an expert player, beating world-champions at chess and Go, while IBM’s Watson has beaten competing humans in general knowledge quizzes.

pages: 287 words: 95,152

The Dawn of Eurasia: On the Trail of the New World Order
by Bruno Macaes
Published 25 Jan 2018

One university lecturer in Chengdu is using face recognition technology, not only to register attendance but also to help determine boredom levels among his students. Translation apps make it easy for locals and tourists to have long conversations speaking in their own languages. An app developed by Baidu uses computer vision to help blind people by telling them what is in front of them, from simple but important information like the denomination of bank notes to trickier facts like the age of an interlocutor. Baidu has also partnered with a global food chain to open a new smart restaurant in Beijing, which employs facial recognition to make recommendations about what customers might order, based on factors like their age, gender and facial expression.

High-Frequency Trading
by David Easley, Marcos López de Prado and Maureen O'Hara
Published 28 Sep 2013

What interpretation can be given for a single order placement in a massive stream of microstructure data, or to a snapshot of an intraday order book, especially considering the fact that any outstanding order can be cancelled by the submitting party any time prior to execution?2 To offer an analogy, consider the now common application of machine learning to problems in natural language processing (NLP) and computer vision. Both of them remain very challenging domains. But, in NLP, it is at least clear that the basic unit of meaning in the data is the word, which is how digital documents are represented and processed. In contrast, digital images are represented at the pixel level, but this is certainly not the meaningful unit of information in vision applications – objects are – but algorithmically extracting objects from images remains a difficult problem.

pages: 442 words: 94,734

The Art of Statistics: Learning From Data
by David Spiegelhalter
Published 14 Oct 2019

‘Narrow’ AI refers to systems that can carry out closely prescribed tasks, and there have been some extraordinarily successful examples based on machine learning, which involves developing algorithms through statistical analysis of large sets of historical examples. Notable successes include speech recognition systems built into phones, tablets and computers; programs such as Google Translate which know little grammar but have learned to translate text from an immense published archive; and computer vision software that uses past images to ‘learn’ to identify, say, faces in photographs or other cars in the view of self-driving vehicles. There has also been spectacular progress in systems playing games, such as the DeepMind software learning the rules of computer games and becoming an expert player, beating world-champions at chess and Go, while IBM’s Watson has beaten competing humans in general knowledge quizzes.

pages: 302 words: 90,215

Experience on Demand: What Virtual Reality Is, How It Works, and What It Can Do
by Jeremy Bailenson
Published 30 Jan 2018

Bailenson, Nicole Kramer, and Benjamin Li, “Let the Avatar Brighten Your Smile: Effects of Enhancing Facial Expressions in Virtual Environments,” PLoS One (2016). 8. STORIES IN THE ROUND 1. Susan Sontag, Regarding the Pain of Others (New York: Farrar, Straus and Giroux, 2003), 54. 2. Jon Peddie, Kurt Akeley, Paul Debevec, Erik Fonseka, Maichael Mangan, and Michael Raphael, “A Vision for Computer Vision: Emerging Technologies,” July 2016 SIGGRAPH Panel, http://dl.acm.org/citation.cfm?id=2933233. 3. Zeke Miller, “Romney Campaign Exaggerates Size of Nevada Event with Altered Image,” Buzzfeed, October 26, 2012, https://www.buzzfeed.com/zekejmiller/romney-campaign-appears-to-exaggerate-size-of-neva. 4.

The Internet Trap: How the Digital Economy Builds Monopolies and Undermines Democracy
by Matthew Hindman
Published 24 Sep 2018

Google has even built new globally distributed database systems called Spanner and F1, in which operations across different data centers are synced using atomic clocks.22 The latest iteration of Borg, Google’s cluster management system, coordinates “hundreds of thousands of jobs, from many thousands of different applications, across a number of clusters each with up to tens of thousands of machines.”23 In recent years Google’s data centers have expanded their capabilities in other ways, too. As Google has increasingly focused on problems like computer vision, speech recognition, and natural language processing, it has worked to deploy deep learning, a variant of neural network methods. Google’s investments in deep learning have been massive and multifaceted, including (among other things) major corporate acquisitions and the development of the TensorFlow high-level programming toolkit.24 But one critical component has been the development of a custom computer chip built specially for machine learning.

pages: 277 words: 91,698

SAM: One Robot, a Dozen Engineers, and the Race to Revolutionize the Way We Build
by Jonathan Waldman
Published 7 Jan 2020

In his notes, he wrote, “How many brick buildings in NY?” Engineers from RPI finally showed Scott the sensing system they’d devised, and it was so complicated that it made the simple, level string line used by masons everywhere seem worthy of the Nobel Prize. RPI’s system used triangulation and computer vision and spinning lasers. The spinning lasers, fixed on one side of the building, defined two flat, parallel planes at the height of the course to be laid. On the robot’s gripper, two photo sensors detected the lasers—both the time the beams hit and the angle at which they hit. In this way, it was possible to calculate position and to figure out roll, but to get yaw and pitch, the engineers said they’d need another photo sensor and an accelerometer.

pages: 406 words: 88,977

How to Prevent the Next Pandemic
by Bill Gates
Published 2 May 2022

The change won’t come right away, since most people don’t own tools to enable this kind of capture yet, in contrast to the way the switch to video meetings was enabled by the fact that many people already had PCs or phones with cameras. Right now, you can use virtual reality goggles and gloves to control your avatar, but more sophisticated and less obtrusive tools—like lightweight glasses and contact lenses—will come along over the next few years. Improvements in computer vision, display technology, audio, and sensors will capture your facial expressions, eyeline, and body language with very little delay. Think about any time you’ve tried to jump in with a thought during a spirited video meeting, and how hard that was to do when you couldn’t see the way people’s body language shifts as they’re wrapping up a thought.

pages: 285 words: 86,858

How to Spend a Trillion Dollars
by Rowan Hooper
Published 15 Jan 2020

The system correctly identified gay and heterosexual men from single photographs in 81 per cent of cases, and in 71 per cent of cases with women. Humans trying to classify the photos managed 61 per cent accuracy for men and 54 per cent for women. The researchers, who obtained the photos from public profiles on a dating site, say that their work highlights the threat that computer vision algorithms could pose to the safety and privacy of gay people.5 That’s an illustration of the sort of tension between how AI may help or hamper us as a society. In other areas, AI is more unambiguously helpful. In 2016, DeepMind designed an AI to analyse the efficiency of the cooling system used in Google data-centres.

Future Files: A Brief History of the Next 50 Years
by Richard Watson
Published 1 Jan 2008

The true test for artificial intelligence dates to 1950, when the British mathematician Alan Turing suggested the criterion of humans submitting statements through a machine and then not being able to tell whether the responses had come from another person or the machine. The 1960s and 1970s saw a great deal of progress in AI, but real breakthroughs failed to materialize. Instead, scientists and developers focused on specific problems such as speech recognition, text recognition and computer vision. However, we may be less than ten years away from seeing Turing’s AI vision become a reality. For instance, a company in Austin, Texas has developed a product called Cyc. It is much like a “chatbot” except that, if it answers a question incorrectly, you can correct it and Cyc will learn from its mistakes.

Falter: Has the Human Game Begun to Play Itself Out?
by Bill McKibben
Published 15 Apr 2019

Ibid., p. vi. 8. Ibid., p. 215. 9. Ayn Rand, Fountainhead, p. 11. PART THREE: THE NAME OF THE GAME CHAPTER 13 1. Personal conversation, November 22, 2017. 2. James Bridle, “Known Unknowns,” Harper’s, July 2018. 3. “Rise of the Machines,” The Economist, May 22, 2017. 4. “On Welsh Corgis, Computer Vision, and the Power of Deep Learning,” microsoft.com, July 14, 2014. 5. Andrew Roberts, “Elon Musk Says to Forget North Korea Because Artificial Intelligence Is the Real Threat to Humanity,” uproxx.com, August 12, 2017. 6. Tom Simonite, “What Is Ray Kurzweil Up to at Google? Writing Your Emails,” Wired, August 2, 2017. 7.

Mindf*ck: Cambridge Analytica and the Plot to Break America
by Christopher Wylie
Published 8 Oct 2019

I started as a student at UAL and ended up working under the supervision of Carolyn Mair, who had a background in cognitive psychology and machine learning. Dr. Mair wasn’t a typical fashion professor, but the match made sense, as I wasn’t a typical fashion student. After I explained to her that I wanted to start researching fashion “models” of another kind—neural networks, computer vision, and autoencoders—she convinced the university’s postgraduate research committee to allow me to commence a Ph.D. in machine learning rather than in design. It was around this time that I also began my new job at SCL Group, so my days fluctuated between fashion models and cyberwarfare. I was keen to dive into my academic research on cultural trends, so I told Nix that I did not want to work for SCL full-time, and that if SCL wanted me, they would have to accept that I would be continuing my Ph.D. in parallel to their projects.

pages: 420 words: 100,811

We Are Data: Algorithms and the Making of Our Digital Selves
by John Cheney-Lippold
Published 1 May 2017

Simone Browne, “Digital Epidermalization: Race, Identity and Biometrics,” Critical Sociology 36, no. 1 (2010): 135. 27. Lev Manovich, Language of New Media (Cambridge, MA: MIT Press, 2001), 63. 28. Ibid., 63–64. 29. Ian Fasel, Bret Fortenberry, and Javier Movellan, “A Generative Framework for Real Time Object Detection and Classification,” Computer Vision and Image Understanding 98, no. 1 (2005): 182–210. 30. Jacob Whitehill, Gwen Littlewort, Ian Fasel, Marian Bartlett, and Javier Movellan, “Toward Practical Smile Detection,” IEEE Transactions on Pattern Analysis and Machine Intelligence 31, no. 11 (2009): 2107. 31. Ms Smith, “Face Detection Technology Tool Now Detects Your Moods Too,” Network World, July 14, 2011, www.networkworld.com; Yaniv Taigman and Lior Wolf, “Leveraging Billions of Faces to Overcome Performance Barriers in Unconstrained Face Recognition,” arXiv:1108.1122, 2011. 32.

pages: 418 words: 102,597

Being You: A New Science of Consciousness
by Anil Seth
Published 29 Aug 2021

Neuroscience of Consciousness. Haun, A. M., & Tononi, G. (2019). ‘Why does space feel the way it does? Towards a principled account of spatial experience’. Entropy, 21(12), 1160. He, K., Zhang, X., Ren, S., et al. (2016). ‘Deep residual learning for image recognition’. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Heilbron, M., Richter, D., Ekman, M., et al. (2020). ‘Word contexts enhance the neural representation of individual letters in early visual cortex’. Nature Communications, 11(1), 321. Herculano-Houzel, S. (2009). ‘The human brain in numbers: a linearly scaled-up primate brain’.

pages: 346 words: 97,890

The Road to Conscious Machines
by Michael Wooldridge
Published 2 Nov 2018

In April 2019, you may recall seeing the first ever pictures of a black hole.1 In a mind-boggling experiment, astronomers used data collected from eight radio telescopes across the world to construct an image of a black hole which is 40 billion miles across and 55 million light years away. The image represents one of the most dramatic scientific achievements this century. But what you might not know is that it was only made possible through AI: advanced computer vision algorithms were used to reconstruct the image, ‘predicting’ missing elements of the picture. In 2018, researchers from the computer processor company Nvidia demonstrated the ability of AI software to create completely convincing but completely fake pictures of people.2 The pictures were developed by a new type of neural network, one called a generative adversarial network.

pages: 502 words: 107,510

Natural Language Annotation for Machine Learning
by James Pustejovsky and Amber Stubbs
Published 14 Oct 2012

Snow, Rion, Brendan O’Connor, Daniel Jurafsky, and Andrew Y. Ng. 2008. “Cheap and Fast—But Is It Good? Evaluating Non-Expert Annotations for Natural Language Tasks.” In Proceedings of EMNLP-08. Sorokin, Alexander, and David Forsyth. 2008. “Utility data annotation with Amazon Mechanical Turk.” In Proceedings of the Computer Vision and Pattern Recognition Workshops. Index A note on the digital index A link in an index entry is displayed as the section title in which that entry appears. Because some sections have multiple index markers, it is not unusual for an entry to have several links to the same section.

pages: 383 words: 108,266

Predictably Irrational, Revised and Expanded Edition: The Hidden Forces That Shape Our Decisions
by Dan Ariely
Published 19 Feb 2007

BY THE WAY, a funny thing happened when we ran the experiment in the Carolina Brewery: Dressed in my waiter’s outfit, I approached one of the tables and began to read the menu to the couple there. Suddenly, I realized that the man was Rich, a graduate student in computer science, someone with whom I had worked on a project related to computational vision three or four years earlier. Because the experiment had to be conducted in the same way each time, this was not a good time for me to chat with him, so I put on a poker face and launched into a matter-of-fact description of the beers. After I finished, I nodded to Rich and asked, “What can I get you?”

Smart Mobs: The Next Social Revolution
by Howard Rheingold
Published 24 Dec 2011

I picked one in Chinese, which I don’t read. Following his directions, I pointed the lens of the device in my hand at the sign on the wall, clicked the shutter, pressed some buttons on the telephone, and in a few seconds, the English words “reservation desk” appeared on the screen of the Info-Scope. “We use computer-vision techniques to extract the text from the sign,” Haritaoglu explained. “That requires processor power.” The telephone sent the picture to a computer on IBM’s network, which crunched the numbers to parse the characters out of the image, crunched the numbers to translate the text, and sent it back to the device in my hand.

pages: 629 words: 109,663

Docker in Action
by Jeff Nickoloff and Stephen Kuenzli
Published 10 Dec 2019

Containers have access to some of the host’s devices by default, and Docker creates other devices specifically for each container. This works similarly to how a virtual terminal provides dedicated input and output devices to the user. On occasion, it may be important to share other devices between a host and a specific container. Say you’re running computer vision software that requires access to a webcam, for example. In that case, you’ll need to grant access to the container running your software to the webcam device attached to the system; you can use the --device flag to specify a set of devices to mount into the new container. The following example would map your webcam at /dev/video0 to the same location within a new container.

pages: 489 words: 106,008

Risk: A User's Guide
by Stanley McChrystal and Anna Butrico
Published 4 Oct 2021

The DoD knew that its competitors were changing and that it was entering an “AI arms race”—and it believed Google could provide the capabilities it needed to compete. Google eagerly signed the contract. Now identifying as an AI company (not a data company, as it had been formerly known) Google would create a “customized AI surveillance engine” to scour the DoD’s massive amount of footage. Google’s computer vision, which incorporated both machine learning and deep learning, would analyze the data to track the movements of vehicles and other objects. As they quietly engaged with Project Maven, Google’s AI services showed initial progress—Google’s software had greater success than humans in detecting important footage.

Reset
by Ronald J. Deibert
Published 14 Aug 2020

Central Asian countries like Uzbekistan and Kazakhstan have even gone so far as to advertise for Bitcoin mining operations to be hosted in their jurisdictions because of cheap and plentiful coal and other fossil-fuelled energy sources.349 Some estimates put electric energy consumption associated with Bitcoin mining at around 83.67 terawatt-hours per year, more than that of the entire country of Finland, with carbon emissions estimated at 33.82 megatons, roughly equivalent to those of Denmark.350 To put it another way, the Cambridge Centre for Alternative Finance says that the electricity consumed by the Bitcoin network in one year could power all the teakettles used to boil water in the entire United Kingdom for nineteen years.351 A similar energy-sucking dynamic underlies other cutting-edge technologies, like “deep learning.” The latter refers to the complex artificial intelligence systems used to undertake the fine-grained, real-time calculations associated with the range of social media experiences, such as computer vision, speech recognition, natural language processing, audio recognition, social network filtering, and so on. Research undertaken at the University of Massachusetts, Amherst, in which the researchers performed a life-cycle assessment for training several common large AI models, found that training a single AI model can emit more than 626,000 pounds of carbon dioxide equivalent — or nearly five times the lifetime emissions of the average American car (including its manufacturing).352 It’s become common to hear that “data is the new oil,” usually meaning that it is a valuable resource.

pages: 338 words: 104,815

Nobody's Fool: Why We Get Taken in and What We Can Do About It
by Daniel Simons and Christopher Chabris
Published 10 Jul 2023

Maybe it would, but in the face of a compelling demo, we tend to assume that the performance we’re seeing is generalizable to similar settings even when we have no direct evidence, at least from the demo, that it does.6 The practice of developing computer systems capable of performing with apparent intelligence in highly constrained situations and either claiming or implying that they would work just as well in a broad range of contexts goes back at least fifty years. Sometimes the developers are not deliberately deceptive—they’re just overly optimistic about how easy it will be to improve their own technology so that it works in more situations. For decades, computer vision and robotics experts assumed that if a robot could understand a scene containing regular geometric solids (cubes, pyramids, cylinders, etc.), then the hard work would be done, and it would take just a small step to generalize that capability to natural scenes. But time after time, artificial intelligence (AI) systems fall short when making the jump from an optimized “microworld” to the real world, much as potential medicines can perform well in laboratory experiments with animals but fail in human trials.

pages: 409 words: 112,055

The Fifth Domain: Defending Our Country, Our Companies, and Ourselves in the Age of Cyber Threats
by Richard A. Clarke and Robert K. Knake
Published 15 Jul 2019

Many of the current uses of AI are still, in fact, attempts to have machines do things that humans do. What is important for us, however, is that AI has moved on to do things that no individual human could do, indeed what even groups of highly trained humans could not reliably do in any reasonable amount of time. Using AI, machines can now have meaningful visual capacity, so-called computer vision. They can see things by translating images into code and classifying or identifying what appears in the image or video. Cars can now see other cars, view and understand certain traffic signs, and use the knowledge they gain from their visual capacity to make and implement decisions such as braking to avoid an accident.

System Error: Where Big Tech Went Wrong and How We Can Reboot
by Rob Reich, Mehran Sahami and Jeremy M. Weinstein
Published 6 Sep 2021

Will AGI put humanity: Edward Feigenbaum et al., Advanced Software Applications in Japan (Park Ridge, NJ: Noyes Data Corporation, 1995). problems in reasoning have the potential: Yaniv Taigman et al., “DeepFace: Closing the Gap to Human-Level Performance in Face Verification,” 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2014) (New York: IEEE, 2014), 1701–8, https://doi.org/10.1109/CVPR.2014.220. “nine-layer deep neural network”: Ibid. Anyone using Zoom videoconferencing: “Language Interpretation in Meetings and Webinars,” Zoom Help Center, https://support.zoom.us/hc/en-us/articles/360034919791-Language-interpretation-in-meetings-and-webinars.

pages: 523 words: 112,185

Doing Data Science: Straight Talk From the Frontline
by Cathy O'Neil and Rachel Schutt
Published 8 Oct 2013

There are convergence issues—the solution can fail to exist, if the algorithm falls into a loop, for example, and keeps going back and forth between two possible solutions, or in other words, there isn’t a single unique solution. Interpretability can be a problem—sometimes the answer isn’t at all useful. Indeed that’s often the biggest problem. In spite of these issues, it’s pretty fast (compared to other clustering algorithms), and there are broad applications in marketing, computer vision (partitioning an image), or as a starting point for other models. In practice, this is just one line of code in R: kmeans(x, centers, iter.max = 10, nstart = 1, algorithm = c("Hartigan-Wong", "Lloyd", "Forgy", "MacQueen")) Your dataset needs to be a matrix, x, each column of which is one of your features.
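To make the clustering loop behind that one-liner concrete, here is a minimal pure-Python sketch. It is an illustration under assumptions, not the R implementation: it uses the plain Lloyd-style assign-and-update iteration (R's `kmeans` defaults to Hartigan-Wong), it takes the starting centers as an explicit argument instead of choosing them at random, and the function name and the `iter_max` parameter are chosen here only to echo the R call.

```python
import math


def kmeans(points, centers, iter_max=10):
    """Lloyd-style k-means on a list of equal-length numeric tuples.

    points  : list of observations (rows); each position is one feature (column)
    centers : list of starting centers, one per cluster (assumed given)
    Returns (centers, labels) where labels[i] is the cluster index of points[i].
    """
    centers = [tuple(c) for c in centers]
    labels = []
    for _ in range(iter_max):
        # Assignment step: attach each point to its nearest center.
        labels = [
            min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        # Update step: move each center to the mean of its assigned points.
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centers.append(old)  # keep an empty cluster's center in place
        if new_centers == centers:
            break  # assignments can no longer change
        centers = new_centers
    return centers, labels


data = [(0, 0), (0, 1), (10, 10), (10, 11)]
final_centers, final_labels = kmeans(data, centers=[(0, 0), (10, 10)])
print(final_centers, final_labels)
```

The `iter_max` cap is what guards against the back-and-forth behaviour mentioned in the excerpt: if the loop never settles, it simply stops after a fixed number of passes and returns the current centers.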

pages: 389 words: 119,487

21 Lessons for the 21st Century
by Yuval Noah Harari
Published 29 Aug 2018

.: Arash Bahrammirzaee, ‘A comparative Survey of Artificial Intelligence Applications in Finance: Artificial Neural Networks, Expert System and Hybrid Intelligent Systems’, Neural Computing and Applications 19:8 (2010), 1165–95; analysis of complex data in medical systems and production of diagnosis and treatment: Marjorie Glass Zauderer et al., ‘Piloting IBM Watson Oncology within Memorial Sloan Kettering’s Regional Network’, Journal of Clinical Oncology 32:15 (2014), e17653; creation of original texts in natural language from massive amounts of data: Jean-Sébastien Vayre et al., ‘Communication Mediated through Natural Language Generation in Big Data Environments: The Case of Nomao’, Journal of Computer and Communication 5 (2017), 125–48; facial recognition: Florian Schroff, Dmitry Kalenichenko and James Philbin, ‘FaceNet: A Unified Embedding for Face Recognition and Clustering’, IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2015), 815–23; and driving: Cristiano Premebida, ‘A Lidar and Vision-based Approach for Pedestrian and Vehicle Detection and Tracking’, 2007 IEEE Intelligent Transportation Systems Conference (2007). 3 Daniel Kahneman, Thinking, Fast and Slow (New York: Farrar, Straus & Giroux, 2011); Dan Ariely, Predictably Irrational (New York: Harper, 2009); Brian D.

pages: 1,172 words: 114,305

New Laws of Robotics: Defending Human Expertise in the Age of AI
by Frank Pasquale
Published 14 May 2020

Ideally, machines learn to spot “evil digital twins”—tissue that proved in the past to be dangerous, which is menacingly similar to your own.9 This machine vision—spotting danger where even experienced specialists might miss it—is far different from our own sense of sight. To understand machine learning—which will come up repeatedly in this book—it is helpful to compare contemporary computer vision to its prior successes in facial or number recognition. When a facial recognition program successfully identifies a picture as an image of a given person, it is matching patterns in the image to those in a preexisting database, perhaps on a 1,000-by-1,000-pixel grid. Each box in the grid can be identified as either skin or not skin, smooth or not smooth, along hundreds or even thousands of binaries, many of which would never be noticeable by the human eye.
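The matching idea in this excerpt, comparing a grid of yes/no features against stored examples, can be sketched in a few lines. This is a deliberately simplified illustration of the passage's description, not how any particular facial recognition product works: the feature vectors, the names in the database, and the functions `hamming` and `best_match` are all hypothetical.

```python
def hamming(a, b):
    """Count the positions at which two equal-length binary vectors differ."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return sum(x != y for x, y in zip(a, b))


def best_match(query, database):
    """Return the database key whose stored vector differs least from the query."""
    return min(database, key=lambda name: hamming(query, database[name]))


# Each vector stands in for the per-cell binaries (skin / not skin, smooth / not smooth, ...).
known_faces = {
    "alice": (1, 0, 1, 1),
    "bob": (0, 0, 0, 1),
}
print(best_match((1, 0, 1, 0), known_faces))
```

A real system would work with far longer vectors and would typically reject a match whose distance exceeds some threshold; that threshold is omitted here to keep the sketch short.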

pages: 458 words: 116,832

The Costs of Connection: How Data Is Colonizing Human Life and Appropriating It for Capitalism
by Nick Couldry and Ulises A. Mejias
Published 19 Aug 2019

The Volokh Conspiracy (blog), January 23, 2012. http://volokh.com/2012/01/23/whats-the-status-of-the-mosaic-theory-after-jones/. Khatchadourian, Raffi. “We Know How You Feel.” The New Yorker, January 12, 2015. Khosla, Aditya, Byoungkwon An, Joseph J. Lim, and Antonio Torralba. “Looking Beyond the Visible Scene.” In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, 2014, 3710–17. Kirkpatrick, David. The Facebook Effect. New York: Simon and Schuster, 2010. Kitchin, Rob. The Data Revolution. London: Sage, 2014. Kitchin, Rob, and Martin Dodge. Code/Space. Cambridge, MA: MIT Press, 2011. Kitchin, Rob, and Gavin McArdle.

pages: 412 words: 116,685

The Metaverse: And How It Will Revolutionize Everything
by Matthew Ball
Published 18 Jul 2022

Sony Pictures, meanwhile, is the largest movie studio by revenue, as well as the largest independent TV/film studio overall. Sony’s semiconductor division is also the world leader in image sensors, with nearly 50% market share (Apple is a top customer), while its Imageworks division is a top visual effects and computer animation studio. Sony’s Hawk-Eye is a computer vision system used by numerous professional sports leagues globally to aid officiating through 3D simulations and playback (the football club Manchester City is also deploying the technology to create a live digital twin of its stadium, players, and fans during a match). Sony Music is the second-largest music label by revenue (Travis Scott is a Sony Music artist), while Crunchyroll and Funimation provide Sony with the world’s largest anime streaming service.

Visual Thinking: The Hidden Gifts of People Who Think in Pictures, Patterns, and Abstractions
by Temple Grandin, Ph.D.
Published 11 Oct 2022

A lion attacking you in the Hilton is a hallucination. Dreams have a hallucinatory component. Everything I see in my imagination is real. And one of the things my imagination works overtime visualizing is what happens when systems controlled by artificial intelligence (AI) are hacked. In 2015, Google introduced DeepDream, a computer vision program that used AI algorithms to generate and enhance images by detecting patterns. An example of a normal use for such a program would be to find pictures of dogs on the internet. When used for their intended purpose, the programs resemble visual thinking. When forced to look for things that are not there, however, they hallucinate similarly to a person with schizophrenia.

Stock Market Wizards: Interviews With America's Top Stock Traders
by Jack D. Schwager
Published 1 Jan 2001

Still, for certain types of problems, theoretically, you could get speeds that were a thousand times faster than the fastest supercomputer. To be fair, there were a few other researchers who were interested in these sorts of "fine-grained" parallel machines at the time—for example, certain scientists working in the field of computer vision—but it was definitely not the dominant theme within the field. You said that you were trying to design a computer that worked more like the brain. Could you elaborate? At the time, one of the main constraints on computer speed was a limitation often referred to as the "von Neumann bottleneck."

The Future of Technology
by Tom Standage
Published 31 Aug 2005

“The processing power is so much better than before that some of the seemingly simple things we humans do, like recognising faces, can begin to be done,” says Dr Kanade. While prices drop and hardware improves, research into robotic vision, control systems and communications have jumped ahead as well. America’s military and its space agency, NASA, have poured billions into robotic research and related fields such as computer vision. The Spirit and Opportunity rovers exploring Mars can pick their way across the surface to reach a specific destination. Their human masters do not specify the route; instead, the robots are programmed to identify and avoid obstacles themselves. “Robots in the first generation helped to generate economies of scale,” says Navi Radjou, an analyst at Forrester, a consultancy.

Autonomous Driving: How the Driverless Revolution Will Change the World
by Andreas Herrmann, Walter Brenner and Rupert Stadler
Published 25 Mar 2018

J., Lenz, B., Winner, H., Autonomous Driving, Berlin, 213–232. [57] Hoff, K. A., Bashir, M., 2015: Trust in Automation: Integrating Empirical Evidence on Factors that Influence Trust, in: Human Factors, 407–434. [58] Hong, T., Abrams, M., Chang, T., Shneier, M., 2008: An Intelligent World Model for Autonomous Off-Road Driving, in: Computer Vision and Image Understanding, 1–16. [59] Huang, P., Pruckner, A., 2016: Steer by Wire, in: Harrer, M., Pfeffer, P., Steering Handbook, Cham, 513–526. [60] Hyve Science Lab, 2015: Autonomous Driving: The User Perspective, Munich. [61] IBM, 2011: Global Parking Survey: Drivers Share Worldwide Parking Woes. [62] IHS Automotive, 2014: Emerging Technologies. [63] IHS Markit, 2016: Autonomous Industry Analysis. [64] Institute for Mobility Research, 2016: Autonomous Driving: The Impact of Vehicle Automation on Mobility Behaviour. [65] Isaac, L., 2016: How Local Governments Can Plan for Autonomous Vehicles, in: Meyer, G., Beiker, S., Road Vehicles Automation 3, Berlin, 59–71. [66] Kaufmann, S., Moss, M.

Text Analytics With Python: A Practical Real-World Approach to Gaining Actionable Insights From Your Data
by Dipanjan Sarkar
Published 1 Dec 2016

We will be exploring several of these libraries in this book. Even though the preceding list may seem a bit overwhelming, this is just scratching the surface of what is possible with Python. It is widely used in several other domains including artificial intelligence (AI), game development, robotics, Internet of Things (IoT), computer vision, media processing, and network and system monitoring, just to name a few. To read some of the widespread success stories achieved with Python in different diverse domains like arts, science, computer science, education, and others, enthusiastic programmers and researchers can check out www.python.org/about/success/.

pages: 448 words: 117,325

Click Here to Kill Everybody: Security and Survival in a Hyper-Connected World
by Bruce Schneier
Published 3 Sep 2018

Dudley (17 May 2016), “Deep Patient: An unsupervised representation to predict the future of patients from the electronic health records,” Scientific Reports 6, no. 26094, https://www.nature.com/articles/srep26094. 83 But although the system works: Will Knight (11 Apr 2017), “The dark secret at the heart of AI,” MIT Technology Review, https://www.technologyreview.com/s/604087/the-dark-secret-at-the-heart-of-ai. 83 A 2014 book, Autonomous Technologies: William Messner, ed. (2014), Autonomous Technologies: Applications That Matter, SAE International, http://books.sae.org/jpf-auv-004. 84 One research project focused on: Anh Nguyen, Jason Yosinski, and Jeff Clune (2 Apr 2015), “Deep neural networks are easily fooled: High confidence predictions for unrecognizable images,” in Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR ’15), https://arxiv.org/abs/1412.1897. 84 A related research project was able: Christian Szegedy et al. (19 Feb 2014), “Intriguing properties of neural networks,” in Conference Proceedings: International Conference on Learning Representations (ICLR) 2014, https://arxiv.org/abs/1312.6199. 84 Yet another project tricked an algorithm: Andrew Ilyas et al. (20 Dec 2017), “Partial information attacks on real-world AI,” LabSix, http://www.labsix.org/partial-information-adversarial-examples. 85 Like the Microsoft chatbot Tay: James Vincent (24 Mar 2016), “Twitter taught Microsoft’s AI chatbot to be a racist asshole in less than a day,” Verge, https://www.theverge.com/2016/3/24/11297050/tay-microsoft-chatbot-racist. 85 In 2017, Dow Jones accidentally: Timothy B.

pages: 402 words: 126,835

The Job: The Future of Work in the Modern Era
by Ellen Ruppel Shell
Published 22 Oct 2018

For as we’ve seen, they seem to have the most difficulty completing tasks that humans find simplest, like picking delicate items off a shelf. Computer scientist Gary Bradski, a Silicon Valley entrepreneur, is cofounder of Industrial Perception, a start-up acquired years ago by Google that developed computer vision systems and robotic arms for loading and unloading trucks. “For Amazon and all Internet retailers, moving things from one place to another is just about their entire cost,” he told me. “Basically people in that industry are used as an extension of a forklift. Human forklift extenders are pretty expensive.

pages: 993 words: 318,161

Fall; Or, Dodge in Hell
by Neal Stephenson
Published 3 Jun 2019

The VEIL had been engineered as a double-edged weapon. Yes, it jammed the facial-recognition algorithms that would enable any camera, anywhere, to know your true name. But the pattern of lights that a VEIL projected on the user’s face wasn’t mere noise. It was a signal designed to convey data to any computer vision system smart enough to read it. The protocol had been published by ENSU and was formidable in its sophistication, but the upshot was that a VEIL could, if the user so chose, project the equivalent of a barcode: a number linking the user to a PURDAH. It was all completely optional and unnecessary, which was part of the point; most people didn’t know or care about any of this and simply did things openly under their own names.

But it was easy, and free, and recommended, that when you were starting out in life you establish at least one PURDAH, so that you could begin compiling a record of things that you had done. You could link it to your actual legal name and face if you wanted, or not. And either way you could punch it into your VEIL system so that as you walked down the street, computer vision systems, even though they couldn’t recognize your actual face, could look up your PURDAH and from there see any activity blockchained to it. All three of the laughing, coffee-toting, VEIL-wearing schoolgirls had done this as a matter of course. It was probably a free-and-mandatory service offered by their prep school.

pages: 476 words: 132,042

What Technology Wants
by Kevin Kelly
Published 14 Jul 2010

Danny Hillis, another polymath and serial inventor, is cofounder of an innovative prototype shop called Applied Minds, which is another idea factory. As you might guess from the name, they use smart people to invent stuff. Their corporate tagline is “the little Big Idea company.” Like Myhrvold’s Intellectual Ventures, they generate tons of ideas in interdisciplinary areas: bioengineering, toys, computer vision, amusement rides, military control rooms, cancer diagnostics, and mapping tools. Some ideas they sell as unadorned patents; others they complete as physical machines or operational software. I asked Hillis, “What percentage of your ideas do you find out later someone else had before you, or at the same time as you, or maybe even after you?”

pages: 416 words: 129,308

The One Device: The Secret History of the iPhone
by Brian Merchant
Published 19 Jun 2017

The difference between this data-driven approach and the logic-driven approach is that this computer doesn’t know anything about Van Gogh or what an artist is. It is only imitating patterns—often very well—that it has seen before. “The thing that’s good for is perception,” Gruber says. “The computer vision, computer speech, understanding, pattern recognition, and these things did not do well with knowledge representations. They did better with data- and signal-processing techniques. So that’s what’s happened. The machine learning has just gotten really good at making generalizations over training examples.”

pages: 459 words: 140,010

Fire in the Valley: The Birth and Death of the Personal Computer
by Michael Swaine and Paul Freiberger
Published 19 Oct 2014

This computer could be a winner, or some other machine might be better. If Jobs and Wozniak really had something, Terrell figured they’d keep in touch with him. The next day, Jobs appeared, barefoot, at Byte Shop. “I’m keeping in touch,” he said. Terrell, impressed by his confidence and perseverance, ordered 50 Apple I computers. Visions of instant wealth flashed before Jobs’s eyes. But Terrell added a condition: he wanted the computers fully assembled. Woz and Jobs were back to their 60-hour work weeks. The two Steves had no parts and no money to buy them, but with a purchase order from Terrell for 50 Apple I computers, they were able to obtain net 30 credit from suppliers.

pages: 303 words: 67,891

Advances in Artificial General Intelligence: Concepts, Architectures and Algorithms: Proceedings of the Agi Workshop 2006
by Ben Goertzel and Pei Wang
Published 1 Jan 2007

Since NARS is designed according to an experience-grounded semantics, it is situated and embodied. However, because the system is not designed to duplicate concrete human behaviors and capabilities, it is not equipped with human sensors and effecters. For example, there is no doubt that vision plays a central role in human cognition, and that computer vision has great practical value, but it does not mean that vision is needed in every intelligent system. Because of these considerations, sensors and effecters are treated as optional parts of NARS. 84 P. Wang / From NARS to a Thinking Machine 3.2 Natural languages As mentioned previously, NARS uses Narsese, a formally defined language, to communicate with its environment.

pages: 588 words: 131,025

The Patient Will See You Now: The Future of Medicine Is in Your Hands
by Eric Topol
Published 6 Jan 2015

McCormick et al., “Giving Office-Based Physicians Electronic Access to Patients’ Prior Imaging and Lab Results Did Not Deter Ordering of Tests,” Health Affairs 31, no. 3 (2012): 488–496. 110. T. McMahon, “The Smartphone Will See You Now: How Apps and Social Media Are Revolutionizing Medicine,” Macleans, March 4, 2013, http://www2.macleans.ca/2013/03/04/the-smartphone-will-see-you-now. 111. “Turning Mobile Phones into 3D Scanners,” Computer Vision and Geometry Group, accessed August 13, 2014, http://cvg.ethz.ch/mobile/. 112. J. Lademann, “Optical Methods of Imaging in the Skin,” Journal of Biomedical Optics 18, no. 6 (2013): 061 201-1. 113. W. Sohn et al., “Endockscope: Using Mobile Technology to Create Global Point of Service Endoscopy,” Journal of Endourology 27, no. 9 (2013): 1154–1160. 114.

pages: 573 words: 142,376

Whole Earth: The Many Lives of Stewart Brand
by John Markoff
Published 22 Mar 2022

Brand asked Shel Kaphan, one of the young computer hackers who worked at the Truck Store, to put him in touch with the researchers at computer scientist John McCarthy’s Stanford Artificial Intelligence Laboratory. SAIL had been established to build a working artificial intelligence and he had collected an eclectic group of young researchers exploring technologies like robotics, computer vision, natural language understanding, and speech recognition. Simultaneously, Bill English opened the doors to the Palo Alto Research Center. Xerox had created PARC to compete directly with IBM, and Robert Taylor, a young psychologist who had funded the development of the ARPANET while at the Pentagon, had been given the charter of rethinking the future of the office based upon computers and networks.

The Book of Why: The New Science of Cause and Effect
by Judea Pearl and Dana Mackenzie
Published 1 Mar 2018

The one game it lost to Sedol is the only one it will ever lose to a human. All of this is exciting, and the results leave no doubt: deep learning works for certain tasks. But it is the antithesis of transparency. Even AlphaGo’s programmers cannot tell you why the program plays so well. They knew from experience that deep networks have been successful at tasks in computer vision and speech recognition. Nevertheless, our understanding of deep learning is completely empirical and comes with no guarantees. The AlphaGo team could not have predicted at the outset that the program would beat the best human in a year, or two, or five. They simply experimented, and it did. Some people will argue that transparency is not really needed.

pages: 496 words: 131,938

The Future Is Asian
by Parag Khanna
Published 5 Feb 2019

Japanese companies are applying AI to semiconductor manufacturing, helping them retain their edge in a critical components sector. India has dozens of promising AI companies. Its Fractal Analytics has a “consumer genomics” methodology that supports many of the world’s largest retail companies. Indian AI companies will dominate the Indian market and compete globally in areas such as computer vision, medical diagnostics, legal contract analysis, and customer satisfaction surveys. Google is deploying ever more capital to fund and buy Indian AI companies. Pakistan also has one of Asia’s leading AI outsourcing companies, Afiniti, which has more than three thousand employees and a valuation of $2 billion.

pages: 642 words: 141,888

Like, Comment, Subscribe: Inside YouTube's Chaotic Rise to World Domination
by Mark Bergen
Published 5 Sep 2022

Well, until the summer of 2018. When Pichai took over Google, he determined that its future lay primarily in two fields: business software sales, via cloud computing, and emerging market internet consumers, which he called the “next billion users.” Google had signed a contract with the Pentagon to provide drones with computer vision, paving the way for lucrative government cloud deals. In June, after weeks of raucous internal protests over the Pentagon deal, Google caved and pledged not to renew its contract. Then, that summer, employees discovered a shocking part of the “next billion users” plan: Google was building a search engine for mainland China with censored results.

pages: 310 words: 34,482

Makers at Work: Folks Reinventing the World One Object or Idea at a Time
by Steven Osborn
Published 17 Sep 2013

I went to the reception desk to ask for information, and they were starting to give me some information, when I realized I was in the wrong place. But they were so nice that I didn’t have the courage to say I had made a mistake. I just looked around and then something really strange happened. I saw the work that the students were doing there, which was electronics and computer vision—this was back in 1998 and the web was still very young. We still used modems to connect. Basically, something clicked, and it felt like I had found my place. So I never even applied to the film school. I actually never even went there. I basically went home, applied for ITP, and I got accepted.

Beautiful Data: The Stories Behind Elegant Data Solutions
by Toby Segaran and Jeff Hammerbacher
Published 1 Jul 2009

Document Unshredding and DNA Sequencing In Vernor Vinge’s science fiction novel Rainbows End (Tor Books), the Librareome project digitizes an entire library by tossing the books into a tree shredder, photographing the pieces, and using computer algorithms to reassemble the images. In real life, the German government’s E-Puzzler project is reconstructing 45 million pages of documents shredded by the former East German secret police, the Stasi. Both these projects rely on sophisticated computer vision techniques. But once the images have been converted to characters, language models and hill-climbing search can be used to reassemble the pieces. Similar techniques can be used to read the language of life: the Human Genome Project used a technique called shotgun sequencing to reassemble shreds of DNA.

pages: 565 words: 151,129

The Zero Marginal Cost Society: The Internet of Things, the Collaborative Commons, and the Eclipse of Capitalism
by Jeremy Rifkin
Published 31 Mar 2014

Sensors are being attached to vegetable and fruit cartons in transit to both track their whereabouts and sniff the produce to warn of imminent spoilage so shipments can be rerouted to closer vendors.23 Physicians are even attaching or implanting sensors inside human bodies to monitor bodily functions including heart rate, pulse, body temperature, and skin coloration to notify doctors of vital changes that might require proactive attention. General Electric (GE) is working with computer vision software that “can analyze facial expressions for signs of severe pain, the onset of delirium or other hints of distress” to alert nurses.24 In the near future, body sensors will be linked to one’s electronic health records, allowing the IoT to quickly diagnose the patient’s likely physical state to assist emergency medical personnel and expedite treatment.

pages: 514 words: 152,903

The Best Business Writing 2013
by Dean Starkman
Published 1 Jan 2013

Upgrading Distribution Inside a spartan garage in an industrial neighborhood in Palo Alto, Calif., a robot armed with electronic “eyes” and a small scoop and suction cups repeatedly picks up boxes and drops them onto a conveyor belt. It is doing what low-wage workers do every day around the world. Older robots cannot do such work because computer vision systems were costly and limited to carefully controlled environments where the lighting was just right. But thanks to an inexpensive stereo camera and software that lets the system see shapes with the same ease as humans, this robot can quickly discern the irregular dimensions of randomly placed objects.

pages: 660 words: 141,595

Data Science for Business: What You Need to Know About Data Mining and Data-Analytic Thinking
by Foster Provost and Tom Fawcett
Published 30 Jun 2013

Techniques and algorithms are shared between the two; indeed, the areas are so closely related that researchers commonly participate in both communities and transition between them seamlessly. Nevertheless, it is worth pointing out some of the differences to give perspective. Speaking generally, because Machine Learning is concerned with many types of performance improvement, it includes subfields such as robotics and computer vision that are not part of KDD. It also is concerned with issues of agency and cognition—how will an intelligent agent use learned knowledge to reason and act in its environment—which are not concerns of Data Mining. Historically, KDD spun off from Machine Learning as a research field focused on concerns raised by examining real-world applications, and a decade and a half later the KDD community remains more concerned with applications than Machine Learning is.

pages: 513 words: 152,381

The Precipice: Existential Risk and the Future of Humanity
by Toby Ord
Published 24 Mar 2020

Haub, C., and Kaneda, T. (2018). How Many People Have Ever Lived on Earth? https://www.prb.org/howmanypeoplehaveeverlivedonearth/. He, K., Zhang, X., Ren, S., and Sun, J. (2015). “Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification.” 2015 IEEE International Conference on Computer Vision (ICCV), 1,026–34. IEEE. Helfand, I. (2013). “Nuclear Famine: Two Billion People at Risk?” Physicians for Social Responsibility. Henrich, J. (2015). The Secret of Our Success: How Culture Is Driving Human Evolution, Domesticating Our Species, and Making Us Smarter. Princeton University Press.

pages: 573 words: 157,767

From Bacteria to Bach and Back: The Evolution of Minds
by Daniel C. Dennett
Published 7 Feb 2017

Yes, but at a huge cost, says Deacon: by taking these concerns off their hands, system designers create architectures that are brittle (they can’t repair themselves, for instance), vulnerable (locked into whatever set of contingencies their designers have anticipated), and utterly dependent on their handlers.40 This makes a big difference, Deacon insists. Does it? In some ways, I think, it does. At the height of GOFAI back in the 1970s, I observed that AI programs were typically disembodied, “bedridden” aspirants to genius that could only communicate through reading and writing typed messages. (Even work on computer vision was often accomplished with a single immovable video camera eye or by simply loading still images into the system the way you load pictures into your laptop, a vision system without any kind of eyes.) An embodied mobile robot using sense “organs” to situate itself in its world would find some problems harder and other problems easier.

pages: 499 words: 144,278

Coders: The Making of a New Tribe and the Remaking of the World
by Clive Thompson
Published 26 Mar 2019

“human-level performance,” as they noted: Steven Levy, “Inside Facebook’s AI Machine,” Wired, February 23, 2017, accessed August 19, 2018, https://www.wired.com/2017/02/inside-facebooks-ai-machine/; Yaniv Taigman, Ming Yang, Marc’Aurelio Ranzato, and Lior Wolf, “DeepFace: Closing the Gap to Human-Level Performance in Face Verification,” Conference on Computer Vision and Pattern Recognition (CVPR), June 24, 2014, accessed August 19, 2018, https://research.fb.com/publications/deepface-closing-the-gap-to-human-level-performance-in-face-verification. to navigate roads: Andrew J. Hawkins, “Inside Waymo’s Strategy to Grow the Best Brains for Self-driving Cars,” The Verge, May 9, 2018, accessed August 19, 2018, https://www.theverge.com/2018/5/9/17307156/google-waymo-driverless-cars-deep-learning-neural-net-interview.

pages: 524 words: 154,652

Blood in the Machine: The Origins of the Rebellion Against Big Tech
by Brian Merchant
Published 25 Sep 2023

We might marvel at the progress of, say, the self-driving car, but its autonomous navigation requires the labor of numerous invisible workers who do the thankless, drudgery-filled toil, often for very low wages, of labeling image after image to make the datasets the algorithm needs in order to operate. From Amazon’s Mechanical Turk to refugee camps in Europe, workers are paid pennies to sort endless reams of data, the raw materials for computer vision programs and self-driving vehicles. The researchers Mary L. Gray and Siddharth Suri call this “ghost work”—and it’s still ascendent today. The autonomous delivery robots now common on American college campuses and downtown areas may replace delivery people—but they are digitally overseen by other workers who can control them remotely, from places like Colombia, for $2 an hour.

pages: 561 words: 157,589

WTF?: What's the Future and Why It's Up to Us
by Tim O'Reilly
Published 9 Oct 2017

A participating merchant could recognize you as a customer, pulling up your stored payment credentials. As for the products you wanted to buy, I was thinking about the possibility of bar code readers in the cart, or possibly sensors that knew the exact location of each product in the store, or identified it by weight when you put it in the cart. Computer vision wasn’t yet at the point where it could reliably work the kind of magic Amazon is now practicing. Sometimes ideas are in the air, but the technology to make them a reality hasn’t yet arrived. I’ve had numerous other experiences like that. One of my earliest business ideas, back in 1981, was for an interactive hotel brochure using the new RCA LaserDisc player.

pages: 1,331 words: 163,200

Hands-On Machine Learning With Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems
by Aurélien Géron
Published 13 Mar 2017

In particular, it contains most of the state-of-the-art image classification nets such as VGG, Inception, and ResNet (see Chapter 13, and check out the models/slim directory), including the code, the pretrained models, and tools to download popular image datasets. Another popular model zoo is Caffe’s Model Zoo. It also contains many computer vision models (e.g., LeNet, AlexNet, ZFNet, GoogLeNet, VGGNet, inception) trained on various datasets (e.g., ImageNet, Places Database, CIFAR10, etc.). Saumitro Dasgupta wrote a converter, which is available at https://github.com/ethereon/caffe-tensorflow. Unsupervised Pretraining Suppose you want to tackle a complex task for which you don’t have much labeled training data, but unfortunately you cannot find a model trained on a similar task.

pages: 512 words: 165,704

Traffic: Why We Drive the Way We Do (And What It Says About Us)
by Tom Vanderbilt
Published 28 Jul 2008

A sophisticated analysis of motion, using several frames of motion at once, allows Richman’s system to distinguish the motion of cars from that of trees and shadows…. [Richman’s] system can track cars through shadows, a feat that is trivial for our visual intelligence but, heretofore, quite difficult for computer vision systems. It’s easy to underestimate our sophistication at constructing visual motion. That is, until we try to duplicate that sophistication on a computer. Then it seems impossible to overestimate it.” From Donald D. Hoffman, Visual Intelligence (New York: W. W. Norton, 1998), p. 170. “caution for the caution”: See, for example, Don Leavitt, “Insights at the Intersection,” Traffic Management and Engineering, October 2003.

pages: 596 words: 163,682

The Third Pillar: How Markets and the State Leave the Community Behind
by Raghuram Rajan
Published 26 Feb 2019

THE DIRECT EFFECTS ON JOBS As a number of researchers have pointed out, in recent years new technologies have eliminated jobs that involved well-specified routines or simple, predictable tasks.1 For example, the Amazon Go store (opened first in Seattle) tries to create a shopping experience with no lines and no checkout counters.2 As you walk in, you use the app on your phone to register your presence, pick up what you need, and walk out. Later, your Amazon account is billed. Computer vision and machine-learning algorithms, similar to the ones used in driverless cars, help identify what you pick up and tote up your bill. Not only does this do away with checkout clerks, the underlying software has also reduced the need for someone to monitor stock levels, order new inventory, or reconcile the store’s books at the end of the day.

pages: 626 words: 167,836

The Technology Trap: Capital, Labor, and Power in the Age of Automation
by Carl Benedikt Frey
Published 17 Jun 2019

Today, some 3.5 million Americans work as cashiers across the country. But if you go to an Amazon Go store, you will not see a single cashier or even a self-service checkout stand. Customers walk in, scan their phones, and walk out with what they need. To achieve this, Amazon is leveraging recent advances in computer vision, deep learning, and sensors that track customers, the items they reach for, and take with them. Amazon then bills the credit card passed through the turnstile when the customer leaves the store and sends the receipt to the Go app. While the rollout of the first Seattle, Washington, prototype store was delayed because of issues with tracking multiple users and objects, Amazon now runs three Go stores in Seattle and another in Chicago, Illinois, and plans to launch another three thousand by 2021.

pages: 598 words: 183,531

Hackers: Heroes of the Computer Revolution - 25th Anniversary Edition
by Steven Levy
Published 18 May 2010

They were always demanding that hackers get off the machine so they could work on their “Officially Sanctioned Programs,” and they were appalled at the seemingly frivolous uses to which the hackers put the computer. The grad students were all in the midst of scholarly and scientific theses and dissertations which pontificated on the difficulty of doing the kind of thing that David Silver was attempting. They would not consider any sort of computer-vision experiment without much more planning, complete review of previous experiments, careful architecture, and a setup which included pure white cubes on black velvet in a pristine, dustless room. They were furious that the valuable time of the PDP-6 was being taken up for this . . . toy! By a callow teenager, playing with the PDP-6 as if it were his personal go-cart.

pages: 706 words: 202,591

Facebook: The Inside Story
by Steven Levy
Published 25 Feb 2020

It is the horizon-exploring partner to the company’s Applied Machine Learning team, which directs its AI work to products. LeCun says that the integration worked superbly. The applied group imbued the product with machine learning, and the research group worked on general advances in natural-language understanding and computer vision. It often worked out that those advances helped Facebook. “If you ask Schrep or Mark, like, how much of an impact FAIR has had on product, they will say it’s much larger than they expected,” says LeCun. “They told us, Your mission is to really push the state of the art, the research. When things come out of it for a product impact, that’s great, but be ambitious.”

pages: 935 words: 197,338

The Power Law: Venture Capital and the Making of the New Future
by Sebastian Mallaby
Published 1 Feb 2022

But in 2017, unwilling to rest on its laurels, Founders Fund designated a partner named Trae Stephens to identify a third defense startup that might break into the major league. When Stephens scoured the Valley and came up with nothing, his comrades responded with a simple prompt. If no such company exists, start one.[64] Four years later, the resulting unicorn, Anduril, is building a suite of next-generation defense systems. Its Lattice platform combines computer vision, machine learning, and mesh networking to create a picture of a battlefield. Its Ghost 4 sUAS is a military reconnaissance drone. Its solar-powered Sentry Towers have been deployed on the U.S.-Mexico border. In an age when artificial intelligence will overwhelm the war machines of yesteryear, Anduril’s aspiration is to combine the coding virtuosity of a Google with the national-security focus of a Lockheed Martin.

pages: 1,409 words: 205,237

Architecting Modern Data Platforms: A Guide to Enterprise Hadoop at Scale
by Jan Kunigk , Ian Buss , Paul Wilkinson and Lars George
Published 8 Jan 2019

While certainly a hyped term, machine learning goes beyond classic statistics, with more advanced algorithms that predict an outcome by learning from the data—often without explicitly being programmed. The most advanced methods in machine learning, referred to as deep learning, are able to automatically discover the relevant data features for learning, which essentially enables use cases like computer vision, natural language processing, or fraud detection for any corporation. Many machine learning algorithms (even fairly simple ones) benefit from big data in a disproportionate, even unreasonable way, an effect which was described as early as 2001.2 As big data becomes readily available in more and more organizations, machine learning becomes a defining movement in the overall IT industry to take advantage of this effect.

pages: 562 words: 201,502

Elon Musk
by Walter Isaacson
Published 11 Sep 2023

The OpenAI team rejected that idea, and Altman stepped in as president of the lab, starting a for-profit arm that was able to raise equity funding. So Musk decided to forge ahead with building a rival AI team to work on Tesla Autopilot. Even as he was struggling with the production hell surges in Nevada and Fremont, he recruited Andrej Karpathy, a specialist in deep learning and computer vision, away from OpenAI. “We realized that Tesla was going to become an AI company and would be competing for the same talent as OpenAI,” Altman says. “It pissed some of our team off, but I fully understood what was happening.” Altman would turn the tables in 2023 by hiring Karpathy back after he became exhausted working for Musk.
41 The Launch of Autopilot Tesla, 2014–2016
Franz von Holzhausen with an early “Robotaxi”
Radar
Musk had discussed with Larry Page the possibility of Tesla and Google working together to build an autopilot system that would allow cars to be self-driving.

pages: 915 words: 232,883

Steve Jobs
by Walter Isaacson
Published 23 Oct 2011

(TV show), 458 “Why I Won’t Buy an iPad” (Doctorow), 563 Wigginton, Randy, 81–82, 92–93, 104, 161 Wikipedia, 386 Wilkes Bashford (store), 91 Williams, Robert, 329–30 Wired, 276, 295, 311–12, 317, 408, 466 Wolf, Gary, 295 Wolfe, Tom, 58 Wolff, Michael, 523 Wonder Boys (film), 412 Woodside Design, 196 Woolard, Ed, 310, 313, 314, 318–20, 336, 338, 359, 371 options compensation issue and, 364–66, 448 “Wooly Bully” (song), 413 Wordsworth, William, 69 “Working for/with Steve Jobs” (Raskin), 112 Worldwide Developers Conference, 532–33, 536 Wozniak, Francis, 22 Wozniak, Jerry, 77 Wozniak, Stephen, xvi, 21, 29, 32, 33, 59, 62, 69, 79, 93, 94, 102, 110, 124, 132, 163, 168, 170, 217, 305, 308, 317, 319, 334–35, 354, 363, 379, 393, 412, 464, 474, 524, 560, 565 in air crash, 115 Apple I design and, 61, 67–68, 534 Apple II design and, 72–75, 80–81, 84–85, 92, 534, 562 Apple left by, 192–93 Apple partnership and, 63–65 Apple’s IPO and, 103–4 background of, 21–22 Blue Box designed by, 27–30, 81 music passion of, 25–26 personal computer vision of, 60–61 Pong design and, 52–54 as prankster, 23–29 remote control device of, 193–94, 218, 221 SJ contrasted with, 21–22, 40, 64 on SJ’s distortion of reality, 118–19 SJ’s first meeting with, 25 SJ’s friendship with, 21–23 at SJ’s 30th birthday party, 189 in White House visit, 192–93 Wright, Frank Lloyd, 7, 330 Xerox, 95–96, 98, 119, 169, 195, 565, 566 Alto GUI of, 177 Star computer of, 99, 175–76 Xerox PARC, 94–96, 98–99, 100, 111, 114, 120, 177, 179, 474 Yahoo, 502, 545 Yeah Yeah Yeah (music group), 500 Yocam, Del, 4–5, 198, 202 Yogananda, Paramahansa, 35 York, Jerry, 321, 450, 482 “You Say You Want a Revolution” (song), 526 Zaltair hoax, 81, 189 Zander, Ed, 333, 465 Zap, 53 ZDNet, 137 Zen Buddhism, 15, 34–35, 41, 57 Zen Mind, Beginner’s Mind (Suzuki), 35, 49 Ziegler, Bart, 293 Zittrain, Jonathan, 563 Zuckerberg, Mark, 275, 545–46, 552 ILLUSTRATION CREDITS Numbers in roman type refer to illustrations in the Photos section; numbers in italics refer to book pages.

pages: 898 words: 266,274

The Irrational Bundle
by Dan Ariely
Published 3 Apr 2013

BY THE WAY, a funny thing happened when we ran the experiment in the Carolina Brewery: Dressed in my waiter’s outfit, I approached one of the tables and began to read the menu to the couple there. Suddenly, I realized that the man was Rich, a graduate student in computer science, someone with whom I had worked on a project related to computational vision three or four years earlier. Because the experiment had to be conducted in the same way each time, this was not a good time for me to chat with him, so I put on a poker face and launched into a matter-of-fact description of the beers. After I finished, I nodded to Rich and asked, “What can I get you?”

pages: 1,079 words: 321,718

Surfaces and Essences
by Douglas Hofstadter and Emmanuel Sander
Published 10 Sep 2012

If one were to draw up a table of numerical specifications, as is standardly done in comparing one computer with another, Homo sapiens sapiens would wind up in the recycling bin. Given all this, how can we explain the fact that, in terms of serious thought, machines lag woefully behind us? Why is machine translation so often inept and awkward? Why are robots so primitive? Why is computer vision restricted to the simplest kinds of tasks? Why is it that today’s search engines can instantly search billions of Web sites for passages containing the phrase “in good faith”, yet are incapable of spotting Web sites in which the idea of good faith (as opposed to the string of alphanumeric characters) is the central theme?