machine readable

244 results

Mastering Structured Data on the Semantic Web: From HTML5 Microdata to Linked Open Data
by Leslie Sikos
Published 10 Jul 2015

While machine-readable RDF files are useful, their primary application is data modeling, so the RDF files are separate from the markup of your web site. You can also add structured data directly to the markup, such as (X)HTML5, by using machine-readable annotations, which can be processed by semantic data extractors and, if needed, converted into RDF. Machine-Readable Annotations: There are four machine-readable annotation formats for web sites (in order of introduction):

• Microformats, which publish structured data about basic concepts, such as people, places, events, recipes, and audio, through core (X)HTML attributes
• RDFa, which expresses RDF in markup attributes that are not part of the core (X)HTML vocabularies
• HTML5 Microdata, which extends the HTML5 markup with structured metadata (an HTML5 Application Programming Interface)
• JSON-LD, which adds structured data to the markup as JavaScript code

RDFa and JSON-LD can be used in most markup language versions and variants, while HTML5 Microdata can be used in (X)HTML5 only.
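These annotations are designed to be trivially parseable by machines. As a minimal sketch (not from the book), the following Python fragment parses the kind of JSON-LD block a page might embed in a script element of type application/ld+json; the schema.org vocabulary is real, but the person described is invented.

    import json

    # A hypothetical JSON-LD annotation using schema.org terms.
    jsonld_text = '''
    {
      "@context": "https://schema.org",
      "@type": "Person",
      "name": "Jane Example",
      "jobTitle": "Data Engineer"
    }
    '''

    data = json.loads(jsonld_text)            # JSON-LD is plain JSON, so any JSON parser works
    print(data["@type"], "-", data["name"])   # Person - Jane Example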

Without context, the information provided by web sites can be ambiguous to search engines: traditional web site contents are meaningless to computers. The concept of machine-readable data is not new, and it is not limited to the Web. Think of credit cards or barcodes, both of which contain human-readable and machine-readable data. One person or product, however, has more than one identifier, which can cause ambiguity. Even well-formed XML documents, which follow rigorous syntax rules, have serious limitations when it comes to machine-processability.

To improve the automated processability of web sites, formal knowledge representation standards are required that can be used not only to annotate markup elements with simple machine-readable data but also to express complex statements and relationships in a machine-processable manner. Once you understand the structure of these statements and their serialization in the Resource Description Framework (RDF), structured data can be efficiently modeled, annotated in the markup, or written in separate, machine-readable metadata files. The formal definitions used for modeling and representing data make efficient data analysis and reuse possible. The three most common machine-readable annotations recognized and processed by search engines are RDFa (RDF in attributes), HTML5 Microdata, and JSON-LD, of which HTML5 Microdata is the recommended format.
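To make the statement structure concrete, here is a minimal sketch (not from the book) that models RDF-style subject-predicate-object statements as plain Python tuples; the subject URI and the data are invented, while the name property comes from the real FOAF vocabulary.

    # Each RDF statement is a triple: (subject, predicate, object).
    triples = [
        ("http://example.org/people/alice", "http://xmlns.com/foaf/0.1/name", "Alice"),
        ("http://example.org/people/alice", "http://xmlns.com/foaf/0.1/homepage", "http://example.org/alice"),
    ]

    # A program can answer simple questions by pattern-matching the triples.
    for s, p, o in triples:
        if p.endswith("/name"):
            print(s, "has the name", o)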

pages: 282 words: 28,394

Learn Descriptive Cataloging Second North American Edition
by Mary Mortimer
Published 1 Jan 1999

Joint Steering Committee for the Revision of AACR: The committee consisting of representatives of the American Library Association, the Australian Committee on Cataloguing, the British Library, the Canadian Committee on Cataloguing, the Library Association and the Library of Congress, established to review, advise on and promote the Anglo-American cataloguing rules
journal: A periodical issued by an institution, corporation or learned society containing current information and reports of activities or works in a particular field
JSC: See Joint Steering Committee for the Revision of AACR
key-title: The unique name given to a serial by the International Serials Data System (ISDS)
kit: An item containing more than one kind of material, none of which is predominant, e.g., a set of slides and an audiocassette
LCMARC: Library of Congress machine-readable cataloging format
leader: Top line of a MARC record that gives information about the record to the computer program that processes it
leaf: A sheet of paper consisting of two pages, one on each side
legend: Bytes 6-9 of the MARC leader
limited edition: An edition in which a restricted number of copies is printed, often more expensively produced than a regular edition
logical record length: The length of a self-contained MARC record
machine-readable: Needing a computer to process or interpret
Machine-Readable Bibliographic Information Committee: See MARBI
machine-readable cataloging: See MARC
main entry: The principal entry in a catalog that contains the complete record of an item
manuscript: A hand-written or typescript document
map: A representation, normally to scale, of an area of the earth’s surface or another celestial body
MARBI: Machine-Readable Bibliographic Information Committee. North American committee that revises and develops the MARC format
MARC: Machine-readable cataloging. A system developed by the Library of Congress in 1966 so that libraries can share machine-readable bibliographic data
MARC 21: The MARC format created by harmonizing USMARC and CANMARC; increasingly becoming the world standard for bibliographic data
mark of omission: Punctuation mark showing something has been left out
masthead: ...

What would you consult if you were unsure about the difference between a leaf, a plate and a page? Chapter 5 MARC (CatSkill—module 5) Introduction MARC stands for MAchine Readable Cataloging. The description and headings of all items in the catalog are created according to the Anglo-American cataloguing rules. Coding into MARC format is simply transcribing the description and headings into a form that a computer system can read and manipulate. In the 1960s, librarians at the Library of Congress began work on a system for distributing cataloging information in machine-readable form. The Library of Congress, in consultation with other libraries, developed a standard format for recording cataloging information on computer tape.
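As a rough sketch of what such machine-readable cataloging looks like in practice, the following Python fragment slices two fixed-position fields out of a MARC leader, per the glossary above (the legend sits in bytes 6-9); that the logical record length occupies the leader's first five characters is an assumption drawn from the MARC 21 format, and the sample leader value is fabricated.

    # A MARC leader is a fixed-length, 24-character string at the top of the record.
    # This sample value is invented; real leaders come from actual MARC records.
    leader = "00714cam a2200205 a 4500"

    record_length = int(leader[0:5])  # logical record length, in bytes (assumed convention)
    legend = leader[6:10]             # bytes 6-9 of the leader

    print("record length:", record_length)
    print("legend:", legend)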

pages: 283 words: 78,705

Principles of Web API Design: Delivering Value with APIs and Microservices
by James Higginbotham
Published 20 Dec 2021

It combines the idea of easy documentation generation using Markdown with a structure that makes it machine-readable for supporting code generation and other tooling needs. Since API Blueprint is based on Markdown, any tool capable of rendering and editing files using the Markdown format, including developer IDEs, is able to work with this format. While the ecosystem of tooling isn’t as vast as OAS, it does have considerable community support due to the pre-acquisition efforts of Apiary. As Listing 13.2 shows, it is easy to work with and therefore a popular choice for those seeking to combine Markdown-based documentation alongside a machine-readable API description format.

Improving API Discovery Using APIs.json: There are multiple API description formats that may be necessary to help developers consume the API using various tools. APIs.json is a description format that assists in API discovery through a machine-readable index file. It is similar to a site map for a website that helps direct search engine indexers to important areas of the website. A single APIs.json file may reference multiple APIs, making this format useful for bundling multiple, separate API description files into a single product or platform view. When combined with other machine-readable formats, APIs may be discovered, indexed, and made available within public or private API catalogs. As the name indicates, the default format is JSON, although the YAML-based format, shown in Listing 14.6, is also available.
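To give a feel for what such an index might contain, here is a minimal, hypothetical sketch in Python; the field names follow the general shape described above (a product-level name plus a list of API entries pointing at description files), not an authoritative rendering of the APIs.json specification.

    import json

    # A hypothetical discovery index referencing two API description files.
    catalog = {
        "name": "Example Bookstore Platform",
        "description": "Index of the APIs that make up the bookstore product",
        "apis": [
            {"name": "Orders API",
             "properties": [{"type": "OpenAPI", "url": "https://example.com/orders.oas.yaml"}]},
            {"name": "Catalog API",
             "properties": [{"type": "APIBlueprint", "url": "https://example.com/catalog.apib"}]},
        ],
    }

    print(json.dumps(catalog, indent=2))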

Using a tool that is only provisioned for a subset of the organization is not recommended. Figure 6.2 shows a template for capturing API profiles that is easy to read and fits both spreadsheet and document formats. What About Using the OpenAPI Specification (OAS)? The OpenAPI Specification (OAS) is a machine-readable format used to capture the description of REST-based and RPC-based APIs. The format was designed to aid in the generation of API reference documentation and boilerplate code. As such, the OAS structure is rooted in URL paths. Since API modeling happens before a complete API design that includes resource paths, OAS isn't an appropriate format for API profiles.
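For reference, here is a minimal sketch of an OAS document, expressed as a Python dictionary to show how the structure is rooted in URL paths; the /books path and its operation are invented examples, not content from the book.

    # A minimal OpenAPI 3.0 description as a Python dict.
    oas = {
        "openapi": "3.0.0",
        "info": {"title": "Bookstore API", "version": "1.0.0"},
        "paths": {
            "/books": {
                "get": {
                    "summary": "List books",
                    "responses": {"200": {"description": "A list of books"}},
                }
            }
        },
    }

    # The path-rooted structure is easy for documentation and codegen tools to walk:
    for path, operations in oas["paths"].items():
        for verb, details in operations.items():
            print(verb.upper(), path, "-", details["summary"])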

The Data Journalism Handbook
by Jonathan Gray , Lucy Chambers and Liliana Bounegru
Published 9 May 2012

With all those great technical options, don't forget the simple options: often it is worth spending some time searching for a file with machine-readable data or calling the institution that holds the data you want. In this chapter we walk through a very basic example of scraping data from an HTML web page. What Is Machine-Readable Data? The goal for most of these methods is to get access to machine-readable data. Machine-readable data is created for processing by a computer, rather than for presentation to a human user. The structure of such data reflects the information it contains, not the way it will eventually be displayed. Examples of easily machine-readable formats include CSV, XML, JSON, and Excel files, while formats like Word documents, HTML pages, and PDF files are more concerned with the visual layout of the information.
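To make the contrast concrete, here is a minimal Python sketch (the file name and fields are invented) that loads a machine-readable JSON file; the program addresses fields by name, with no visual layout to scrape.

    import json

    # records.json is a hypothetical machine-readable file such as:
    # [{"date": "2012-05-01", "place": "Main St", "value": 3}, ...]
    with open("records.json") as f:
        records = json.load(f)

    for rec in records:
        # Structure, not presentation: each field is directly addressable.
        print(rec["date"], rec["place"], rec["value"])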

Speaking to experienced data journalists and journalism scholars on Twitter it seems that one of the earliest formulations of what we now recognize as data journalism was in 2006 by Adrian Holovaty, founder of EveryBlock, an information service that enables users to find out what has been happening in their area, on their block. In his short essay “A fundamental way newspaper sites need to change”, he argues that journalists should publish structured, machine-readable data, alongside the traditional “big blob of text”: For example, say a newspaper has written a story about a local fire. Being able to read that story on a cell phone is fine and dandy. Hooray, technology! But what I really want to be able to do is explore the raw facts of that story, one by one, with layers of attribution, and an infrastructure for comparing the details of the fire with the details of previous fires: date, time, place, victims, fire station number, distance from fire department, names and years experience of firemen on the scene, time it took for firemen to arrive, and subsequent fires, whenever they happen.

But what makes this distinctive from other forms of journalism that use databases or computers? How, and to what extent is data journalism different from other forms of journalism from the past? Computer-Assisted Reporting and Precision Journalism Using data to improve reportage and delivering structured (if not machine-readable) information to the public has a long history. Perhaps most immediately relevant to what we now call data journalism is computer-assisted reporting, or CAR, which was the first organized, systematic approach to using computers to collect and analyze data to improve the news. CAR was first used in 1952 by CBS to predict the result of the presidential election.

The Card Catalog: Books, Cards, and Literary Treasures
by Library Of Congress and Carla Hayden
Published 3 Apr 2017

Avram, who joined the Library of Congress in 1965. Avram hastily evaluated the card catalog and devised the first automated cataloging system in the world, known as Machine-Readable Cataloging (MARC). Launched in January 1966, MARC attempted to both convert and manipulate the data stored on a catalog card. Representatives from sixteen libraries were invited to participate in its development, and the collaboration yielded approximately 50,000 machine-readable records containing information for English-language books. Stored on magnetic tape, these catalog records, which incorporated the standard classification scheme, could be searched at a computer terminal.

Anyone who ever thumbed through the cards at a local public library probably referred to many that came from the Library of Congress. The operation grew exponentially over the years and at its peak in 1969, approximately seventy-nine million cards were printed and distributed annually. Coincidentally, this was the same year the Library introduced Machine-Readable Cataloging (MARC), which eventually would supplant the card catalog. The card service lasted nearly a century, with the last cards produced and distributed in 1997. Although the Library of Congress is frequently—and erroneously—credited with the invention of the card catalog, it was ironically one of the last major libraries to embrace it.

pages: 318 words: 87,570

Broken Markets: How High Frequency Trading and Predatory Practices on Wall Street Are Destroying Investor Confidence and Your Portfolio
by Sal Arnuk and Joseph Saluzzi
Published 21 May 2012

This could easily create valuable latency arbitrage opportunities in the trading of ETFs and other tradable products linked to indexes (see Chapter 2, “The Curtain Pulled Back on High Frequency Trading”). Machine-Readable News: Knowing how valuable the combination of colocation and data fees is to HFT firms, the exchanges also have begun entering the business of delivering news in a high-speed, “machine-readable” format, which can be easily used by computer-driven trading programs. In December 2011, NASDAQ acquired RapiData, a leading provider of machine-readable economic news to trading firms and financial institutions. Two years earlier, Deutsche Börse acquired Need To Know News, another machine-readable news firm. Many of the major financial news services, including Thomson Reuters, Dow Jones, and Bloomberg, also have entered the business.

Many of the major financial news services, including Thomson Reuters, Dow Jones, and Bloomberg, also have entered the business. Machine-readable news data feeds enable HFT computers to react within microseconds to news events, beating out traditional institutional and retail investors. The instant a corporate news release runs on services such as PRNewswire or BusinessWire, it is parsed by machine-readable programs and sent to HFT firms’ subscribers, who incorporate it into their algorithms to make instantaneous trading decisions. Not only are corporate news releases disseminated this way, but major economic news releases from Government and industry organizations, such as jobs numbers from the U.S.

Bureau of Labor Statistics or the Chicago Purchasing Managers Index, are distributed in machine-readable format. Firms such as RapiData market themselves by highlighting their low latency and sophisticated design, which enables news feeds to be integrated easily into trading programs. They also tout that some data is sent directly from government lockups, which allows it to be parsed the instant an embargo is lifted. They station their “reporters” in government press rooms to enter the data into their computers. When the embargo is lifted, the machine-readable news firm releases the news to subscribers. The danger is that the machines can “interpret” the news incorrectly.

pages: 408 words: 63,990

Build Awesome Command-Line Applications in Ruby: Control Your Computer, Simplify Your Life
by David B. Copeland
Published 6 Apr 2012

Our app is definitely playing well with others. The only problem is that machine-readable formats tend not to be very human readable. This wasn’t a problem with ls, whose records (files) have only one field (the name of the file). For complex apps like todo, where there are several fields per record, the output is a bit difficult to read. A seasoned UNIX user would simply pipe our output into awk and format the list to their tastes. We can certainly leave it at that, but there’s a usability concern here. Our app is designed to be used by a user sitting at a terminal. We want to maintain the machine-readable format designed for interoperability with other apps but also want our app to interoperate with its users.

Provide a Pretty-Printing Option: The easiest way to provide both a machine-readable output format and a human-readable option is to create a command-line flag or switch to specify the format. We've seen how to do this before, but here's the code we'd use in todo to provide this (play_well/todo/bin/todo):

    desc 'List tasks'
    command :list do |c|
      c.desc 'Format of the output'
      c.arg_name 'csv|pretty'
      c.default_value 'pretty'
      c.flag :format

      c.action do |global_options,options,args|
        if options[:format] == 'pretty'
          # Use the pretty-print format
        elsif options[:format] == 'csv'
          # Use the machine-readable CSV format
        end
      end
    end

We've chosen to make the pretty-printed version the default since, as we've mentioned, our app is designed primarily for a human user.

If all you did was follow these rules and conventions, you’d be producing great command-line apps. But, we want to make awesome command-line apps, and these rules and conventions can take us only so far. There are still a lot of open questions about implementing your command-line app. Our discussion of “pretty-printed” vs. “machine readable” formats is just one example: how should you decide which default to use? What about files that our app creates or uses; where should they live? Should we always use a one-letter name for our command-line options? When should we use the long-form names? How do we choose the default values for flags?

pages: 245 words: 68,420

Content Everywhere: Strategy and Structure for Future-Ready Content
by Sara Wachter-Boettcher
Published 28 Nov 2012

Just like an editor marks up a book before it goes to print—adding in-line notes that dictate where a block quote should start and stop, or when to use bullets—markup is a way to add directions to your content about what different pieces of text are, allowing you to make automated decisions about what those pieces of text should do when they’re displayed. In other words, it’s the code that wraps around your content chunks and lends them machine-readable meaning. In this chapter, we’ll take a look at why markup matters for content, explore the types of markup you may hear mentioned in conversations with technical teams (as well as which ones are likely to be used by whom), and see why knowing just a bit about this oft-mystified m-word can help you make better decisions about how you plan, structure, write, and share content.

When you use presentational markup, that’s precisely what you’re doing: locking content into just one shade, one way to look...even when it’s being displayed in places where it looks silly. If you can’t be sure exactly where and how your content is going to be displayed, both now and in the future, then purely presentational markup simply doesn’t have enough power to pass muster. Semantic Semantic markup, on the other hand, is designed to reveal, in a machine-readable way, the intrinsic meaning in your content, and to provide the machines that read it information they can use to apply a style sheet that determines how it should be displayed. It gives your content information about itself—telling it things like “this is a headline,” rather than “this should be in large type.”

Microformats Microformats are an open data standard that builds on HTML to add metadata to pieces of content, identifying information as something specific, like a “person” or a “location.” Because microformats are an open standard, many industries and organizations have added new microformats for specialty data. They work by adding specific classes to snippets of HTML, with those classes defining what the content within the snippet is. Microformats can be used to lend machine-readable meaning to chunks of incredibly small pieces of content, like a date or time, even if it’s in the middle of a paragraph of other text. HTML5 Microdata New in the HTML5 spec is the microdata extension, which is built off earlier microformats work. Microdata goes beyond traditional presentational HTML tags and allows you to mark up content with standards-compliant, semantically rich HTML—for example, marking up content as an “event” or “organization.”
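Since microformats are ordinary HTML classes, extracting them requires nothing exotic. Here is a minimal sketch (not from the book) using Python's standard html.parser to pull out a value marked with the hCalendar class dtstart; the snippet itself is invented, and the parser is simplified to assume no nested tags inside the marked element.

    from html.parser import HTMLParser

    html = '<p>Meet us on <span class="dtstart">2012-11-28</span> at the office.</p>'

    class ClassValueExtractor(HTMLParser):
        """Collects text inside elements whose class attribute includes a target value."""
        def __init__(self, target):
            super().__init__()
            self.target = target
            self.inside = False
            self.values = []

        def handle_starttag(self, tag, attrs):
            classes = (dict(attrs).get("class") or "").split()
            if self.target in classes:
                self.inside = True

        def handle_endtag(self, tag):
            self.inside = False   # simplification: no nesting inside the marked element

        def handle_data(self, data):
            if self.inside:
                self.values.append(data)

    parser = ClassValueExtractor("dtstart")
    parser.feed(html)
    print(parser.values)  # ['2012-11-28']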

pages: 25 words: 5,789

Data for the Public Good
by Alex Howard
Published 21 Feb 2012

All are available to the public and media to download and embed as well. The combination of publishing maps and the open data that drives them simultaneously online is significantly evolved for any government agency, and it serves as a worthy bar for other efforts in the future to meet. USAID accomplished this by migrating its data to an open, machine-readable format. “In the past, we released our data in inaccessible formats — mostly PDFs — that are often unable to be used effectively,” said Van Dyck. “USAID is one of the premiere data collectors in the international development space. We want to start making that data open, making that data sharable, and using that data to tell stories about the crisis and the work we are doing on the ground in an interactive way.”

Park has focused on releasing data at Health.Data.Gov. In a speech to a Hacks and Hackers meetup in New York City in 2011, Park emphasized that HHS wasn’t just releasing new data: “[We’re] also making existing data truly accessible or usable,” he said, taking “stuff that’s in a book or on a website and turning it into machine-readable data or an API.” Park said it’s still quite early in the project and that the work isn’t just about data — it’s about how and where it’s used. “Data by itself isn’t useful. You don’t go and download data and slather data on yourself and get healed,” he said. “Data is useful when it’s integrated with other stuff that does useful jobs for doctors, patients and consumers.”

Smart Disclosure There are enormous economic and civic good opportunities in the “smart disclosure” of personal data, whereby a private company or government institution provides a person with access to his or her own data in open formats. Smart disclosure is defined by Cass Sunstein, Administrator of the White House Office for Information and Regulatory Affairs, as a process that “refers to the timely release of complex information and data in standardized, machine-readable formats in ways that enable consumers to make informed decisions.” For instance, the quarterly financial statements of the top public companies in the world are now available online through the Securities and Exchange Commission. Why does it matter? The interactions of citizens with companies or government entities generate a huge amount of economically valuable data.

Data Wrangling With Python: Tips and Tools to Make Your Life Easier
by Jacqueline Kazil
Published 4 Feb 2016

Microsoft Word documents are an example of the latter, while CSV, JSON, and XML are examples of the former. In this chapter, we will cover how to read files easily handled by machines, and in Chapters 4 and 5 we will cover files made for human consumption. File formats that store data in a way easily understood by machines are commonly referred to as machine readable. Common machine-readable formats include the following:

• Comma-Separated Values (CSV)
• JavaScript Object Notation (JSON)
• Extensible Markup Language (XML)

In spoken and written language, these data formats are typically referred to by their shorter names (e.g., CSV). We will be using these acronyms.

Go ahead and download the code examples from the book’s data repository and move them into your project’s folder. As you follow along in the chapter, we will assume the data from that repository is stored in the same folder where you are writing your Python code. This way, we don’t have to worry about locating the files and can focus instead on importing data with Python. CSV Data: The first machine-readable file type we will learn about is CSV. CSV files, or CSVs for short, are files that separate data columns with commas. The files themselves have a .csv extension. Another type of data, called tab-separated values (TSV) data, sometimes gets classified with CSVs. TSVs differ only in that they separate data columns with tabs and not commas.
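A minimal sketch of reading both variants with Python's csv module; the file names are placeholders rather than the book's actual sample files.

    import csv

    # Reading a comma-separated file: each row comes back as a list of strings.
    with open("data.csv", newline="") as f:
        for row in csv.reader(f):
            print(row)

    # A tab-separated file differs only in its delimiter.
    with open("data.tsv", newline="") as f:
        for row in csv.reader(f, delimiter="\t"):
            print(row)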

This isn’t necessary for our code, but it helps us preview the data and make sure it’s in the proper form:

    for item in data:
        print item

Once you are done writing your file you can save and run it. As you can see, opening and converting a JSON file to a list of dictionaries in Python is quite easy. In the next section, we will explore more customized file handling. XML Data: XML is often formatted to be both human and machine readable. However, the CSV and JSON examples were a lot easier to preview and understand than the XML file for this dataset. Luckily for us, the data is the same, so we are familiar with it. Download and save the XML version of the life expectancy rates data in the folder where you are saving content associated with this chapter.
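A minimal sketch of reading such a file with Python's standard library; the tag names here are invented placeholders, not the actual structure of the book's life expectancy dataset.

    import xml.etree.ElementTree as ET

    # data.xml is a hypothetical file such as:
    # <records><record><country>Ghana</country><year>2010</year></record></records>
    tree = ET.parse("data.xml")
    for record in tree.getroot().iter("record"):
        print(record.find("country").text, record.find("year").text)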

The Data Revolution: Big Data, Open Data, Data Infrastructures and Their Consequences
by Rob Kitchin
Published 25 Aug 2014

Feedback is also welcome via email (Rob.Kitchin@nuim.ie) or Twitter (@robkitchin). Some of the material in this book has been previously published as papers and blog posts, though it has been updated, reworked and extended: Dodge, M. and Kitchin, R. (2005) ‘Codes of life: identification codes and the machine-readable world’, Environment and Planning D: Society and Space, 23(6): 851–81. Kitchin, R. (2013) ‘Big data and human geography: opportunities, challenges and risks’, Dialogues in Human Geography, 3(3): 262–7. Kitchin, R. (2014) ‘The real-time city? Big data and smart urbanism’, GeoJournal 79(1): 1–14.

Given that by their nature open data generate no or little income to fund such service arrangements, nor indeed the costs of opening data, while it is easy to agree that open data should be delivered as a service, in practice it might be an aspiration unless effective funding models are developed (as discussed more fully below). Linked Data The idea of linked data is to transform the Internet from a ‘web of documents’ to a ‘web of data’ through the creation of a semantic web (Berners-Lee 2009; P. Miller, 2010), or what Goddard and Byrne (2010) term a ‘machine-readable web’. Such a vision recognises that all of the information shared on the Web contains a rich diversity of data – names, addresses, product details, facts, figures, and so on. However, these data are not necessarily formally identified as such, nor are they formally structured in such a way as to be easily harvested and used.

These include the production of mainframe computers in the 1950s and 60s; the nascent Internet in the 1970s and 80s that linked such computers together; the wide-scale roll-out of personal computers in the 1980s and 90s; the massive growth of the Internet in the 1990s and the development of Web-based industries, alongside a huge growth in mobile phones and digital devices such as games consoles and digital cameras; the development of mobile, distributed and cloud computing and Web 2.0 in the 2000s; the roll-out of ubiquitous and pervasive computing in the 2010s. Throughout this period a number of transformative effects took place: computational power grew exponentially; devices were networked together; more and more aspects and processes of everyday life became mediated by digital systems; data became ever more indexical and machine-readable; and data storage expanded and became distributed. Computation While the initial mainframe digital computers of the 1950s and 60s provided computation that was more efficient than that provided by people or the analogue devices they used (such as abacus, mechanical calculators, punch-card calculators, analogue computers, etc.), their processing power was limited, and thus the kinds of operations they performed constrained, and they were large and expensive.

pages: 229 words: 68,426

Everyware: The Dawning Age of Ubiquitous Computing
by Adam Greenfield
Published 14 Sep 2006

As a manifestation of the emerging culture of mass amateurization, such open codes would allow small-scale producers—from West Berkeley sculptors to Bangladeshi weaving collectives—to compete on something approaching the same ground as professional manufacturers and distributors. At the same time, though, and despite its designers' clear intentions, a free product identifier could be regarded as a harbinger of the insidious transformation of just about everything into machine-readable information. Such an identifier is not a technical system in the usual sense. It is intangible, nonmaterial. It's nothing more than a convention: a format and perhaps some protocols for handling information expressed in that format. But its shape and conception are strongly conditioned by the existence of parallel conventions—conventions that are embodied in specific technologies.

The whole notion of a Uniform Resource Identifier, for example, which was called into being by the Internet, or a Universal Product Code, which cannot be separated from the technics of bar-coding and its descendent, RFID. And though such conventions may be intangible, they nevertheless have power, in our minds and in the world. The existence of a machine-readable format for object identification, particularly, is a container waiting to be filled, and our awareness that such a thing exists will transform the way we understand the situations around us. Because once we've internalized the notion, any object that might once have had an independent existence—unobserved by anyone outside its immediate physical vicinity, unrecorded and certainly uncorrelated—can be captured and recast as a node.

And then suppose that—largely as a consequence of the automobile manufacturer's successful and public large-scale roll-out of the system—this identification system is adopted by a wide variety of other institutions, private and public. In fact, with minor modifications, it's embraced as the standard driver's license schema by a number of states. And because the various state DMVs collect such data, and the ID-generation system affords them the technical ability to do so, the new licenses wind up inscribed with machine-readable data about the bearer's sex, height, weight and other physical characteristics, ethnicity.... If you're having a hard time swallowing this set-up, consider that history is chock-full of situations where some convention originally developed for one application was adopted as a de facto standard elsewhere.

pages: 435 words: 62,013

HTML5 Cookbook
by Christopher Schmitt and Kyle Simpson
Published 13 Sep 2011

Solution: Wrap the human-friendly date-time information in the time element, and specify the machine-readable information via the datetime attribute:

    <p>Published: <time datetime="2011-01-15">January 15, 2011</time></p>

Depending on the date and time you are trying to specify, the following approaches are also valid:

Time only:

    <p>The class starts at <time datetime="08:00">8:00 am</time>.</p>

Date and time (requires time zone specification):

    <p>Registration opens on <time datetime="2011-01-15T08:00-07:00">January 15, 2011 at 8:00 am, Mountain Time</time>.</p>

Visible machine-readable format (no datetime attribute required):

    <p>Published: <time>2011-01-15</time></p>

Discussion: Before we discuss this new element, let’s first address machine readability. This is simply the notion of enriching your web content with additional, semantic information that machines (search engines, user agents, etc.) can parse to glean more meaning from your content. Note: We’re not aware of any machines that are currently parsing the datetime information from time, but we can imagine this soon to be on the horizon, particularly for search engines that want to accurately display time-sensitive results for news queries. Next, let’s talk about date-time information on the Web. For machine readability, date-time information has to be specified in the international standard format, ISO 8601.
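As a quick illustration of why ISO 8601 suits machines, this Python sketch (not from the book) parses the same date shown in the markup above using only the standard library.

    from datetime import date, datetime

    # The machine-readable value from a datetime attribute parses directly...
    published = date.fromisoformat("2011-01-15")
    print(published.year, published.month, published.day)  # 2011 1 15

    # ...while the human-friendly string needs an explicit format description.
    also_published = datetime.strptime("January 15, 2011", "%B %d, %Y").date()
    print(published == also_published)  # True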

Note One limitation of time that is worth mentioning is that it can’t be used for imprecise dates, such as “August 2011.” It does require, minimally, the day, month, and year for dates. What about microformats? By this point, you may be wondering about microformats (http://microformats.org), which are HTML-based design patterns for expressing machine-readable semantics. The hCalendar (http://microformats.org/hCalendar) microformat, for example, is used for indicating date-time information. Note Want to learn more about microformats? Check out Emily Lewis’s Microformats Made Simple (New Riders, http://microformatsmadesimple.com) for lots of practical examples and easy-to-understand explanations.

pages: 315 words: 70,044

Learning SPARQL
by Bob Ducharme
Published 15 Jul 2011

Berners-Lee came up with the idea of Linked Data as a set of best practices for sharing data across the web infrastructure so that applications can more easily retrieve data from public sites with no need for screen scraping—for example, to let your calendar program get flight information from multiple airline websites in a common, machine-readable format. These best practices recommend the use of URIs to name things and the use of standards such as RDF and SPARQL. They provide excellent guidelines for the creation of an infrastructure for the semantic web. and the semantics of that data The idea of “semantics” is often defined as “the meaning of words.”

RDFa’s supplemental role in XML and HTML documents makes it excellent for metadata about content in those documents, and utilities are available to pull the triples out of RDFa attributes in a format that lets you query them with SPARQL. Note RDFa’s ability to embed triples in HTML makes it great for sharing machine-readable data in web pages so that automated processes gathering that data don’t need to do screen scraping of those pages. Storing RDF in Databases If you need to store a very large number of triples, keeping them as Turtle or RDF/XML in one big text file may not be your best option, because a system that indexes data and decides which data to load into memory when—that is, a database management system—can be more efficient.

Of all the W3C semantic web standards, OWL is the key one for putting the “semantic” in “semantic web.” The term “semantics” is sometimes defined as the meaning behind words, and those who doubt the value of semantic web technology like to question the viability of storing all the meaning of a word in a machine-readable way. As we saw above, though, we don’t need to store all the meaning of a word to add value to a given set of data. For example, simply knowing that “spouse” is a symmetric term made it possible to find out the identity of Cindy’s spouse, even though this fact was not part of the dataset. Linked Data The idea of Linked Data is newer than that of the semantic web, but sometimes it’s easier to think of the semantic web as building on the ideas behind Linked Data.
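The spouse example can be made concrete with a minimal sketch (not from the book): plain Python triples plus one inference rule for a symmetric property. Cindy appears in the excerpt above; the other name and the URIs are invented.

    SPOUSE = "http://example.org/vocab#spouse"   # hypothetical property URI
    triples = {("http://example.org/pat", SPOUSE, "http://example.org/cindy")}

    # spouse is symmetric, so each (s, spouse, o) licenses (o, spouse, s).
    triples |= {(o, p, s) for (s, p, o) in triples if p == SPOUSE}

    for s, p, o in sorted(triples):
        print(s.rsplit("/", 1)[-1], "is the spouse of", o.rsplit("/", 1)[-1])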

pages: 255 words: 75,172

Sleeping Giant: How the New Working Class Will Transform America
by Tamara Draut
Published 4 Apr 2016

Department of Labor, Bureau of Labor Statistics, “Characteristics of Minimum Wage Workers, 2013,” BLS Reports, March 2014, at http://www.bls.gov/cps/minwage2013.pdf. 4. Author’s analysis of Current Population Survey Annual Social and Economic Supplement, U.S. Department of Labor, Bureau of Labor Statistics. Data retrieved from IPUMS-CPS: Steven Ruggles et al., Integrated Public Use Microdata Series: Version 5.0 [machine-readable database] (Minneapolis: University of Minnesota, 2010). 5. Author’s analysis of 1970–2000 Decennial Census, 2011–2012 U.S. Census American Community Survey, U.S. Census Bureau Public Use Microdata retrieved from Ruggles et al., Integrated Public Use Microdata Series: Version 5.0. 6. Author’s analysis of Current Population Survey Annual Social and Economic Supplement. 7.

Blair Bowie and Adam Lioz, “Billion-Dollar Democracy: The Unprecedented Role of Money in the 2012 Elections,” Demos, June 2013, at http://www.demos.org/sites/default/files/publications/billion.pdf. 15. Jack Metzger, “Politics and the American Class Vernacular,” in John Russo and Sherry Lee Linkon, eds., New Working-Class Studies (Ithaca, NY: ILR Press, 2005), p. 198. 16. Author’s analysis of General Social Surveys, 1972–2010 [machine-readable data file]/Principal Investigator, Tom W. Smith; Co-Principal Investigator, Peter V. Marsden; Co-Principal Investigator, Michael Hout; Sponsored by National Science Foundation—NORC ed.—Chicago: NORC at the University of Chicago [producer]; Storrs, Connecticut: The Roper Center for Public Opinion Research, University of Connecticut. 17.

Department of Labor, Bureau of Labor Statistics, “Data Tables for Overview of May 2012 Occupational Employment and Wages,” March 29, 2013, at http://www.bls.gov/oes/2012/may/featured_data.htm#largest. 14. Independent analysis of U.S. Census 2012 and 2011 American Community Survey. Data retrieved from IPUMS-USA: Steven Ruggles et al., Integrated Public Use Microdata Series: Version 5.0 [machine-readable database] (Minneapolis: University of Minnesota, 2010). 15. Bureau of Labor Statistics, “Data Tables for Overview of May 2012 Occupational Employment and Wages.” 16. Coca-Cola’s brand was worth $83.8 billion in 2015 according to Statista, making it fourth on Forbes’s annual list of the most valuable brands. 17.

Designing Web APIs: Building APIs That Developers Love
by Brenda Jin , Saurabh Sahni and Amir Shevat
Published 28 Aug 2018

Incorrect or unclear errors are frustrating and can negatively affect adoption of your APIs. Developers can get stuck and just give up. Meaningful errors are easy to understand, unambiguous, and actionable. They help developers to understand the problem and to address it. Providing these errors with details leads to a better devel‐ oper experience. Error codes that are machine-readable strings allow developers to programmatically handle errors in their code bases. In addition to these strings, it is useful to add longer-form errors, either in the documentation or somewhere else in the payload. These are sometimes referred to as human-readable errors. Even better, personalize these errors per developer.

Group errors into high-level categories:

Error category | Examples
System-level error | Database connection issue; backend service connection issue; fatal error
Business logic error | Rate-limited; request fulfilled, but no results were found; business-related reason to deny access to information
API request formatting error | Required request parameters are missing; combined request parameters are invalid together
Authorization error | OAuth credentials are invalid for request; token has expired

After grouping your error categories throughout your code path, think about what level of communication is meaningful for these errors. Some options include HTTP status codes and headers, as well as machine-readable “codes” or more verbose human-readable error messages returned in the response payload. Keep in mind that you’ll want to return an error response in a format consistent with your non-error responses. For example, if you return a JSON response on a successful request, you should ensure that the error is returned in the same format.

For example, you probably don’t want to bubble up your database errors to the outside world and reveal too much information about your database connections. Table 4-3 offers examples of how you might begin to organize your errors as you design your API.

Table 4-3. Organize your errors into status codes, headers, machine-readable codes, and human-readable strings

Error category | HTTP status | HTTP headers | Error code (machine-readable) | Error message (human-readable)
System-level error | 500 | -- | -- | --
Business logic error | 429 | Retry-After | rate_limit_exceeded | “You have been rate-limited. See Retry-After and try again.”
API request formatting error | 400 | -- | missing_required_parameter | “Your request was missing a {user} parameter.”
Auth error | 401 | -- | |
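A minimal sketch of such a consistent JSON error payload; the field names are illustrative choices, not a structure prescribed by the book.

    import json

    # A hypothetical error response pairing a machine-readable code (for programs)
    # with a human-readable message (for people), in the same JSON shape as success responses.
    error_response = {
        "error": {
            "code": "missing_required_parameter",
            "message": "Your request was missing a {user} parameter.",
        }
    }

    print(json.dumps(error_response, indent=2))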

Getting Started With Ledger
by Rolf Schröder

• Check for “unknown” (= not yet recognized) transactions; modify meta.txt to match these bank transactions with your Ledger accounts.
• Repeat until done.

Let’s go through these steps in greater detail. Getting the CSV data obviously depends on the financial institution. It’s handy to always save it to the same location under a machine-readable name (e.g., CSV/bankname_<month><year>.csv or CSV/bankname_latest.csv) because this allows for easier scripting. The utility script (ecosystem/convert.py) manipulates the CSV data to make Ledger’s convert command understand it. This is mainly replacing the header lines and providing some more info for Ledger, like the bank account’s currency.
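A minimal sketch (not from the book) of generating such a name in Python; the bank name is a placeholder.

    from datetime import date

    # Build a consistent, machine-readable CSV path, e.g., CSV/northbank_apr2042.csv.
    bank = "northbank"
    stamp = date.today().strftime("%b%Y").lower()  # e.g., "apr2042"
    print(f"CSV/{bank}_{stamp}.csv")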

Imagine you would now go to your NorthBank online banking site and download the CSV data for the last month (the sample repo already contains this file):

    $ mux start GSWL-private  # if not done yet
    # not really using wget obviously!
    $ wget https://banking.northbank.com/myaccount/latest.csv CSV/apr2042_northbank.csv

Note how I renamed the CSV file to a machine-readable name (i.e., added the date in a consistent manner). This now enables us to parse the ledger file with a simple alias:

    $ mux start GSWL-private  # if not done yet
    # jump to the window 'make'
    $ lmnorthbank  # lm =~ last month, see private/alias.local

You should see the data from CSV/apr2042_northbank.csv converted into Ledger’s format.

pages: 387 words: 120,155

Inside the Nudge Unit: How Small Changes Can Make a Big Difference
by David Halpern
Published 26 Aug 2015

It gave the Secretary of State for Business the power to require firms to allow their customers access to their own consumption data in ‘machine-readable form’. This last phrase was critical. In many countries, the UK included, there is legislation in place to enable consumers to get, in written or ‘legible’ form, the data that firms (or public services) hold on them. For example, in the UK you could write to the supermarket giant Tesco and ask for the data they hold on you through your loyalty card. For £10, and after a reasonable delay, Tesco would send you a printout of this data. For most people this pile of paper will be of little use. Imagine instead that you could get access to this same data in machine-readable form, and much more quickly.

As a result of a new consumer power pushed by BIT, energy companies were required to make it easier for customers to access information. In particular, they were required to print on bills a QR code that summarised the customers’ details, patterns of use and their current tariff (see Figure 26). In technical terms, this makes the customers’ data machine-readable. In everyday terms, it means that all customers need to do to save some money is to scan the QR code with their mobile phone, and a switching site app can search the market for the best tariff for them. Instead of switching being a task that would take a few hours, it can be done in a few seconds.

This means helping people find better deals and prices; avoid foods that they are allergic to; improve their diets; and take more direct control over their own behaviour. In their book Nudge, Richard Thaler and Cass Sunstein expressed a closely related idea in the acronym ‘RECAP’: Record, Evaluate, and Compare Alternative Prices. The basic idea was that companies should be required to give the prices and attributes of products in comparable, machine-readable form, so that consumers could make easier and more effective comparisons (see Chapter 3). In the USA, this approach was picked up under the more memorable phrase ‘smart disclosure’, with a push from Richard and Cass, and championed by the appointment of a director of Smart Disclosure in the US Treasury, Sophie Raseman.1 In the UK, we first had a go at pushing the approach to giving consumers access to data on a voluntary basis, since much of our work for the PM was about removing regulatory burdens on business, not adding new ones.

Service Design Patterns: Fundamental Design Solutions for SOAP/WSDL and RESTful Web Services
by Robert Daigneau
Published 14 Sep 2011

Traditional documentation and unit tests typically can’t be used as input to code generation tools that produce Service Connectors (168) or by workflow development tools, nor can this information be read by automated agents at runtime. Service owners could supplement these approaches by providing machine-readable service metadata. Service Descriptor: Produce a standardized and machine-readable description of related services that identifies URIs, logical operations, messages, server methods, and usage policies. [Class diagram: a Service Descriptor is associated with server methods, URIs, logical operations, messages, and usage policies.] Service Descriptors provide a consolidated, machine-readable listing that identifies a set of logical operations or resources that are managed by a single organization.

Quality of Service or QoS requirements) for matters such as client authentication, data privacy, service response time, hours of operation, and up-time (i.e., availability) should also be clarified. This information provides the context necessary to produce an API that meets the needs of its clients. Given this information, the developer can proceed to create the logic behind the external API. It is important to note that machine-readable contracts like WSDL can only capture the most basic information required to use a service. Indeed, information such as a detailed explanation of what the service really does, when to use it, how to prepare requests, and how to handle failures is often described through prose and in unit tests. • Autonomy: Consistent and reliable outcomes are more likely when the service controls its own execution and has few dependencies on outside forces.

Changes like those that are listed above can make the processing rules used by clients obsolete and cause them to break. The effect of some of these changes can, however, be mitigated by using the Tolerant Reader pattern (243). The service’s expectations regarding what information is required and what is optional can be explicitly defined through “machine-readable” meta-languages like XSD and JSON Schema. These expectations might also be described in text-based documents (e.g., Word, HTML, etc.) aimed at developers.
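A minimal sketch of expressing such expectations in JSON Schema; the fields are invented, and the validation call assumes the third-party jsonschema package rather than anything the book prescribes.

    from jsonschema import validate, ValidationError  # assumes: pip install jsonschema

    # A hypothetical contract: "id" is required, "note" is optional.
    schema = {
        "type": "object",
        "properties": {"id": {"type": "integer"}, "note": {"type": "string"}},
        "required": ["id"],
    }

    try:
        validate(instance={"note": "missing the required id"}, schema=schema)
    except ValidationError as e:
        print("client would break here:", e.message)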

Beautiful Data: The Stories Behind Elegant Data Solutions
by Toby Segaran and Jeff Hammerbacher
Published 1 Jul 2009

As with the choice of GoogleDocs as the primary representation of the data, the use of both human-readable and machine-readable representations is crucial to gaining the most benefit from the data set. The only piece of information that does not require two representations is the numerical representation of the solubility itself. As has been noted earlier, we made the decision to not remove the most questionable values from the primary data record. This poses a problem for machine readability, as there is no accepted standard approach for saying “this number is a bit dodgy.” For this work we have elected to mark records that are believed to be inaccurate after human curation and to give a reason for the marking.

Providing the raw data in as comprehensive a fashion as possible and a full description of all the processing and filtering that has taken place means that any user can dig down to the level of detail he or she requires. The raw data will often be difficult or impossible to present in a form that is naturally machine-readable and processable, so the filtering and refinement process also involves making choices about categorization and simplification to provide clear and clean datafiles that can be repurposed. Here we describe the approach we have taken in “beautifying” a set of crowdsourced data by filtering and representing the data in an open form that allows anyone to use it for his own purposes.

To realize the full promise of connected data (e.g., by supporting automated ingest into ChemSpider and other services), and to provide the data in the most general possible way to other researchers, it is necessary to provide a representation that adheres to a recognized standard in syntax as well as in descriptors. The Resource Description Framework, or RDF, provides a route toward exposing the data set in a recognized, machine-readable format. With this format, any information is transformed into statements made up of a “subject,” a “predicate,” and a “value.” For example, the fragment shown in the following code states that the object found in the spreadsheet called solute#59 is defined as the resource at the given URL. RDF uses “namespaces,” or sets of recognized concepts, to define relationships between “resources,” where a resource is any object that can be pointed at by a unique identifier.

pages: 73 words: 17,793

HTML5 for Web Designers
by Jeremy Keith
Published 2 Jan 2010

The only tricky bit in hCalendar is describing dates and times in a machine-readable way. Humans like to describe dates as “May 25th” or “next Wednesday” but parsers expect a nicely-formatted ISO date: YYYY-MM-DDThh:mm:ss. The microformats community came up with some clever solutions to this problem, such as using the abbr element:

    <abbr class="dtstart" title="1992-01-12">
    January 12th, 1992
    </abbr>

If using the abbr element in this way makes you feel a little queasy, there are plenty of other ways of marking up machine-readable dates and times in microformats using the class-value pattern.

pages: 356 words: 105,533

Dark Pools: The Rise of the Machine Traders and the Rigging of the U.S. Stock Market
by Scott Patterson
Published 11 Jun 2012

There was little question the computer revolution that made the AI dream a reality was irrevocably altering financial markets. Information about companies, currencies, bonds, and every other tradable instrument was digitized, fast as light. So-called machine-readable news was a hot new commodity. Breaking news about corporate events such as earnings reports was coded so that superfast algorithms could pick through it and react. Media outlets such as Reuters and Dow Jones published machine-readable news that pattern-recognition computers scanned and reacted to in the blink of an eye. High-tech trading firms gobbled up the information and gunned orders into the market at a rate faster than the beating wings of a hummingbird.

Not only were the best buy and sell orders on the Island system visible, all orders behind those orders were visible. If one trader was bidding $50 for two hundred shares of Intel, while another was bidding $50¼ for five hundred shares, and another was at $50½ for one hundred shares, all the orders were there on the screen to see. And the entire book was available in machine-readable form—meaning computers with the right code could instantly track the book and react at lightning speeds. This was all unheard-of. At the time, it was incredibly expensive for investors to get live stock market data, which was tightly controlled by Nasdaq, the NYSE, and the big trading firms.

The split-second precision demanded by dynamic AI algos that could instantly shift gears as the market changed wasn’t possible with people in the mix. Those quirky market makers were simply too human, prone to mistakes, delays, or pure, old-fashioned greed. Another factor that would fuel AI: flowing streams of digitized data. Through its ITCH feed, Island spit out far more machine-readable data—information coded in a way that computer programs could make sense of it—about stock transactions than Nasdaq and the NYSE combined. The latest trades, bids and offers, volumes, depth of book—it was all available in digital form. For a computer with the bandwidth to crunch it all, it was like seeing the market in 3-D Technicolor compared with an ancient black-and-white cathode-ray tube with bad reception.

pages: 502 words: 107,510

Natural Language Annotation for Machine Learning
by James Pustejovsky and Amber Stubbs
Published 14 Oct 2012

Timeline of Standardization

LAF didn’t emerge as an ISO standard from out of nowhere. Here’s a quick rundown of where the standards composing the LAF model originated:

1987: The Text Encoding Initiative (TEI) is founded “to develop guidelines for encoding machine-readable texts in the humanities and social sciences.” The TEI is still an active organization today. See http://www.tei-c.org.

1990: The TEI releases its first set of Guidelines for the Encoding and Interchange of Machine Readable Texts. It recommends that encoding be done using SGML (Standard Generalized Markup Language), the precursor to XML and HTML.

1993: The Expert Advisory Group on Language Engineering Standards (EAGLES) is formed to provide standards for large-scale language resources (e.g., corpora), as well as standards for manipulating and evaluating those resources.

A sampling of important corpora

Name of corpus | Year published | Size | Collection contents
British National Corpus (BNC) | 1991–1994 | 100 million words | Cross section of British English, spoken and written
American National Corpus (ANC) | 2003 | 22 million words | Spoken and written texts
Corpus of Contemporary American English (COCA) | 2008 | 425 million words | Spoken, fiction, popular magazine, and academic texts

What Is a Corpus?

A corpus is a collection of machine-readable texts that have been produced in a natural communicative setting. They have been sampled to be representative and balanced with respect to particular factors; for example, by genre—newspaper articles, literary fiction, spoken speech, blogs and diaries, and legal documents. A corpus is said to be “representative of a language variety” if the content of the corpus can be generalized to that variety (Leech 1991).

In order for annotation to provide statistically useful results, it must be done on a sufficiently large dataset, called a corpus. The study of language using corpora is corpus linguistics. Corpus linguistics began in the 1940s, but did not become a feasible way to study language until decades later, when the technology caught up to the demands of the theory. A corpus is a collection of machine-readable texts that are representative of natural human language. Good corpora are representative and balanced with respect to the genre or language that they seek to represent. The uses of computers with corpora have developed over the years from simple key-word-in-context (KWIC) indexes and concordances that allowed full-text documents to be searched easily, to modern, statistically based ML techniques.
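As a rough illustration of the key-word-in-context indexes mentioned above (a sketch of the general idea, not code from the book):

# A minimal key-word-in-context (KWIC) sketch: show each occurrence of a
# target word with a fixed window of surrounding words.
def kwic(text, keyword, window=3):
    words = text.split()
    lines = []
    for i, w in enumerate(words):
        if w.lower().strip(".,") == keyword.lower():
            left = " ".join(words[max(0, i - window):i])
            right = " ".join(words[i + 1:i + 1 + window])
            lines.append(f"{left:>35} | {w} | {right}")
    return lines

sample = "A corpus is a collection of machine-readable texts. A corpus is balanced."
for line in kwic(sample, "corpus"):
    print(line)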

pages: 628 words: 107,927

Node.js in Action
by Mike Cantelon , Marc Harter , Tj Holowaychuk and Nathan Rajlich
Published 27 Jul 2013

This is because the string was written to the Buffer using the default human-readable text-based encoding (UTF-8), where the string is represented with 1 byte per character. Node also includes helper functions for reading and writing binary (machine-readable) integer data. These are needed for implementing machine protocols that send raw data types (like ints, floats, doubles, and so on) over the wire. Because you want to store a number value in this example, it’s possible to be more efficient by utilizing the helper function writeInt32LE() to write the number 121234869 as a machine-readable binary integer (assuming a little-endian processor) into a 4-byte Buffer. There are other variations of the Buffer helper functions, as well:

writeInt16LE() for smaller integer values
writeUInt32LE() for unsigned values
writeInt32BE() for big-endian values

There are lots more, so be sure to check the Buffer API documentation page (http://nodejs.org/docs/latest/api/buffer.html) if you’re interested in them all.

Creating a public REST API

In this section, you’ll implement a RESTful public API for the shoutbox application, so that third-party applications can access and add to publication data. The idea of REST is that application data can be queried and changed using verbs and nouns, represented by HTTP methods and URLs, respectively. A REST request will typically return data in a machine-readable form, such as JSON or XML. To implement an API, you’ll do the following:

Design an API that allows users to show, list, remove, and post entries
Add Basic authentication
Implement routing
Provide JSON and XML responses

Various techniques can be used to authenticate and sign API requests, but implementing the more complex solutions is beyond the scope of this book.
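The book builds this API in Node; as a language-neutral sketch of the same verb-and-noun idea, here is a minimal Python/Flask version (routes and payloads are hypothetical, not the book's shoutbox code):

# A minimal REST sketch: nouns as URLs, verbs as HTTP methods,
# machine-readable JSON responses. Routes and data are hypothetical.
from flask import Flask, jsonify, request

app = Flask(__name__)
entries = []   # in-memory stand-in for the application's entry store

@app.route("/api/entries", methods=["GET"])
def list_entries():
    return jsonify(entries)

@app.route("/api/entries", methods=["POST"])
def add_entry():
    entries.append(request.get_json())
    return jsonify({"status": "created"}), 201

if __name__ == "__main__":
    app.run()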

In the following code snippet, the number is written using the writeInt32LE binary helper function:

var b = new Buffer(4);
b.writeInt32LE(121234869, 0);
console.log(b.length);
// 4
console.log(b);
// <Buffer b5 e5 39 07>

By storing the value as a binary integer instead of a text string in memory, the data size is decreased by half, from 9 bytes down to 4. Figure 13.2 shows the breakdown of these two buffers and essentially illustrates the difference between human-readable (text) protocols and machine-readable (binary) protocols.

Figure 13.2. The difference between representing 121234869 as a text string vs. a little-endian binary integer at the byte level

Regardless of what kind of protocol you’re working with, Node’s Buffer class will be able to handle the proper representation.

Byte endianness

The term endianness refers to the order of the bytes within a multibyte sequence.
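The same size comparison can be reproduced outside Node with Python's standard struct module (an illustration, not from the book):

import struct

n = 121234869
as_text = str(n).encode("utf-8")    # UTF-8 text: 1 byte per digit
as_binary = struct.pack("<i", n)    # little-endian 32-bit signed integer

print(len(as_text))     # 9 bytes
print(len(as_binary))   # 4 bytes
print(as_binary.hex())  # b5e53907, matching the Node Buffer contents above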

pages: 527 words: 147,690

Terms of Service: Social Media and the Price of Constant Connection
by Jacob Silverman
Published 17 Mar 2015

Thomson Reuters, the parent company of the Reuters newswire, has gotten into the sentiment analysis business, examining more than four million blogs and social-media feeds. They claim to be able to offer forecasts of how individual stocks will do, in addition to rating sources and offering big-picture analysis about the state (and sentiment) of the market. Their rival Dow Jones, parent company of the Wall Street Journal, also offers a “machine-readable news” feed that can be plugged into automated trading platforms. This kind of information is chum in the water for hedge funds, which will pay to get practically any information that their competitors don’t have—or pay to get it just a few milliseconds earlier than their peers. Derwent Capital Markets examined 250 million tweets daily and reportedly beat the market in its first month, earning 1.85 percent against a 2.2 percent drop in the S&P 500.

Google Glass could become a kind of roving emotion-meter, providing you with voice analysis of everyone you meet. On a more conceptual level, voice analysis and sentiment analysis are about finding out what you think and feel: your “mood graph.” Social-media companies really would like to know what you are thinking at all times, but they need the data to be machine-readable, which is why we’re prompted to structure our data by tagging emotions, companies, people, and places and why forms of computational analysis promise to automate this process. The data can then be mined and sold on to advertisers, market researchers, and other partners. At its most expansive, this process is an EKG for not only individual opinions but also those of whole demographics, cultures, and communities.

“It can easily be applied to large numbers of people without obtaining their individual consent and without them noticing.” This process will likely develop into a two-way system. As networks begin to understand how we think and feel, they will prompt us for more information or suggest emotional responses, all of which will be machine-readable. They may also allow companies such as Facebook to help us stop self-censoring by pushing us to reconsider deleted updates or to post something when they detect a change in our mood. The writer Nicholas Carr envisions a system that “automates the feels”: “Whenever you write a message or update, the camera in your smartphone or tablet will ‘read’ your eyes and your facial expression, precisely calculate your mood, and append the appropriate emoji.

Mastering Blockchain, Second Edition
by Imran Bashir
Published 28 Mar 2018

Transparency, auditability, and integrity are attributes of blockchain that can go a long way in effectively managing various government functions. Border control Automated border control systems have been in use for decades now to thwart illegal entry into countries and prevent terrorism and human trafficking. Machine-readable travel documents and specifically biometric passports have paved the way for automated border control; however current systems are limited to a certain extent and blockchain technology can provide solutions. A Machine Readable Travel Document (MRTD) standard is defined in document ICAO 9303 (https://www.icao.int/publications/pages/publication.aspx?docnum=9303) by the International Civil Aviation Organization (ICAO) and has been implemented by many countries around the world.

A Ricardian contract is a document that has several of the following properties:

A contract offered by an issuer to holders
A valuable right held by holders and managed by the issuer
Easily readable by people (like a contract on paper)
Readable by programs (parsable, like a database)
Digitally signed
Carries the keys and server information
Allied with a unique and secure identifier

The preceding information is based on the original definition by Ian Grigg at http://iang.org/papers/ricardian_contract.html. In practice, the contracts are implemented by producing a single document that contains the terms of the contract in legal prose and the required machine-readable tags. This document is digitally signed by the issuer using their private key. This document is then hashed using a message digest function to produce a hash by which the document can be identified. This hash is then further used and signed by parties during the performance of the contract to link each transaction, with the identifier hash thus serving as evidence of intent.

This is depicted in the next diagram, usually called a bowtie model. The diagram shows a number of elements: the World of Law on the left-hand side, from where the document originates. This document is a written contract in legal prose with some machine-readable tags. This document is then hashed. The resultant message digest is used as an identifier throughout the World of Accountancy, shown on the right-hand side of the diagram. The World of Accountancy element represents any accounting, trading, and information systems that are being used in the business to perform various business operations.
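A minimal sketch of the identifying-hash step described above, using Python's hashlib (the document contents are hypothetical, and a real Ricardian contract would also carry a digital signature):

import hashlib

# The contract document: legal prose plus machine-readable tags.
# Contents here are hypothetical.
document = b"""
name = Example Bond
issuer = Example Issuer Ltd
terms = The issuer promises to pay the holder ...
"""

# The message digest becomes the contract's identifier; later
# transactions reference this hash as evidence of intent.
contract_id = hashlib.sha256(document).hexdigest()
print(contract_id)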

pages: 570 words: 115,722

The Tangled Web: A Guide to Securing Modern Web Applications
by Michal Zalewski
Published 26 Nov 2011

A typical, delightfully baroque example of the resulting taxonomy may be this: Improper Enforcement of Message or Data Structure Failure to Sanitize Data into a Different Plane Improper Control of Resource Identifiers Insufficient Filtering of File and Other Resource Names for Executable Content Today, there are about 800 names in the CWE dictionary, most of which are as discourse-enabling as the one quoted here. A slightly different school of naturalist thought is manifested in projects such as the Common Vulnerability Scoring System (CVSS), a business-backed collaboration that aims to strictly quantify known security problems in terms of a set of basic, machine-readable parameters. A real-world example of the resulting vulnerability descriptor may be this: AV:LN / AC:L / Au:M / C:C / I:N / A:P / E:F / RL:T / RC:UR / CDP:MH / TD:H / CR:M / IR:L / AR:M Organizations and researchers are expected to transform this 14-dimensional vector in a carefully chosen, use-specific way in order to arrive at some sort of objective, verifiable, numerical conclusion about the significance of the underlying bug (say, “42”), precluding the need to judge the nature of security flaws in any more subjective fashion.
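What makes such a vector machine-readable is that it decomposes mechanically into metric/value pairs; a minimal parsing sketch (an illustration only, since real CVSS scoring then applies weighted formulas to these fields):

# Split a CVSS-style vector into metric/value pairs.
vector = ("AV:LN / AC:L / Au:M / C:C / I:N / A:P / E:F / RL:T / RC:UR / "
          "CDP:MH / TD:H / CR:M / IR:L / AR:M")
metrics = dict(part.strip().split(":") for part in vector.split("/"))
print(len(metrics))                   # 14 dimensions
print(metrics["AV"], metrics["C"])    # access vector, confidentiality impact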

Alas, any useful implementation of the design was out of reach at that time, so, beyond futuristic visions, nothing much happened until transistor-based computers took center stage. The next tangible milestone, in the 1960s, was the arrival of IBM’s Generalized Markup Language (GML), which allowed for the annotation of documents with machine-readable directives indicating the function of each block of text, effectively saying “this is a header,” “this is a numbered list of items,” and so on. Over the next 20 years or so, GML (originally used by only a handful of IBM text editors on bulky mainframe computers) became the foundation for Standard Generalized Markup Language (SGML), a more universal and flexible language that traded an awkward colon- and period-based syntax for a familiar angle-bracketed one.

The Battle over Semantics The low-level syntax of the language aside, HTML is also the subject of a fascinating conceptual struggle: a clash between the ideology and the reality of the online world. Tim Berners-Lee always championed the vision of a semantic web, an interconnected system of documents in which every functional block, such as a citation, a snippet of code, a mailing address, or a heading, has its meaning explained by an appropriate machine-readable tag (say, <cite>, <code>, <address>, or <h1> to <h6>). This approach, he and other proponents argued, would make it easier for machines to crawl, analyze, and index the content in a meaningful way, and in the near future, it would enable computers to reason using the sum of human knowledge.

pages: 233 words: 62,563

Zero: The Biography of a Dangerous Idea
by Charles Seife
Published 31 Aug 2000

New York: Simon and Schuster, 1988. Herodotus. The Histories. Trans. Aubrey de Selincourt. London: Penguin Books, 1954. Hesiod. Theogony. In The Homeric Hymns and Homerica with an English Translation by Hugh G. Evelyn-White. Cambridge, Mass.: Harvard University Press; London: William Heinemann, Ltd., 1914. (Machine readable text.) Hoffman, Paul. The Man Who Loved Only Numbers. New York: Hyperion, 1998. ———. “The Man Who Loves Only Numbers.” The Atlantic Monthly, November 1987: 60. Hooper, Alfred. Makers of Mathematics. New York: Random House, 1948. Horgan, John. The End of Science. Reading, Mass.: Addison-Wesley, 1996.

From Plato in Twelve Volumes, Vol. 1 translated by Harold North Fowler; introduction by W. R. M. Lamb (1966); Vol. 3 translated by W. R. M. Lamb (1967); Vol. 4 translated by Harold North Fowler (1977); Vol. 9 translated by Harold N. Fowler (1925). Cambridge, Mass.: Harvard University Press; London: William Heinemann, Ltd., 1966, 1967, 1977. (Machine readable text.) Plaut, Gunther. The Torah: A Modern Commentary. New York: The Union of American Hebrew Congregations, 1981. Plutarch. Makers of Rome. Trans. Ian Scott-Kilvert. London: Penguin Books, 1965. Plutarch on Sparta. Trans. Richard J. A. Talbert. London: Penguin Books, 1988. The Poetic Edda.

Cambridge, Mass.: The MIT Press, 1987. White, Michael. The Last Sorcerer. Reading, Mass.: Addison-Wesley, 1997. Xenophon. Hellenica. Trans. Carleton L. Brownson. Vols. 1–2, Xenophon in Seven Volumes. Cambridge, Mass.: Harvard University Press; London: William Heinemann, Ltd.; Vol. 1, 1985; Vol. 2, 1986. (Machine readable text.) Web Sites Papyrus of Ani; Egyptian Book of the Dead. Trans. E. A. Wallis Budge. http://www.sas.upenn.edu/African_Studies/Books/Papyrus_Ani.html Clement of Alexandria. The Stromata. http://www.webcom.com/~gnosis/library/strom4.htm “The Life of Hypatia.” http://www.cosmopolis.com/alexandria/ “Frequently Asked Questions in Mathematics.” http://www.cs.unb.ca/~alopez-o/math-faq/ Odenwald, Sten.

pages: 259 words: 67,456

The Mythical Man-Month
by Brooks, Jr. Frederick P.
Published 1 Jan 1975

The administrator and the editor will each need a secretary; the administrator's secretary will handle project correspondence and non-product files. The program clerk. He is responsible for maintaining all the technical records of the team in a programming-product library. The clerk is trained as a secretary and has responsibility for both machine-readable and human-readable files. All computer input goes to the clerk, who logs and keys it if required. The output listings go back to him to be filed and indexed. The most recent runs of any model are kept in a status notebook; all previous ones are filed in a chronological archive. Absolutely vital to Mills's concept is the transformation of programming "from private art to public practice" by making all the computer runs visible to all team members and identifying all programs and data as team property, not private property.

Self-Documenting Programs A basic principle of data processing teaches the folly of trying to maintain independent files in synchronism. It is far better to combine them into one file with each record containing all the information both files held concerning a given key. Yet our practice in programming documentation violates our own teaching. We typically attempt to maintain a machine-readable form of a program and an independent set of human-readable documentation, consisting of prose and flow charts. The results in fact confirm our teachings about the folly of separate files. Program documentation is notoriously poor, and its maintenance is worse. Changes made in the program do not promptly, accurately, and invariably appear in the paper.

Label statements in groups to show correspondences to the statements in the algorithm description in the literature. Use indenting to show structure and grouping. Add logical flow arrows to the listing by hand. They are very helpful in debugging and changing. They may be incorporated in the right margin of the comments space, and made part of the machine-readable text. Use line comments or remark anything that is not obvious. If the techniques above have been used, these will be short and fewer in number than is customary. Put multiple statements on one line, or one statement on several lines to match thought-grouping and to show correspondence to other algorithm description.
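A loose modern illustration of that layout advice (a Python sketch, not Brooks's own example):

# Step 1: normalize the input (grouped and labeled to mirror the
# algorithm description in the text).
raw_values = [" Alpha", "beta ", "ALPHA"]
values = [v.strip().lower() for v in raw_values]

# Step 2: accumulate counts. Indenting shows structure and grouping;
# the comment remarks only what is not obvious from the code itself.
counts = {}
for v in values:
    counts[v] = counts.get(v, 0) + 1
print(counts)   # {'alpha': 2, 'beta': 1}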

Pocket New York City Travel Guide
by Lonely Planet
Published 27 Sep 2012

You must obtain a visa from a US embassy or consulate in your home country if you:
› do not currently hold a passport from a VWP country
› are from a VWP country, but don’t have a machine-readable passport
› are from a VWP country, but currently hold a passport issued between October 26, 2005, and October 25, 2006, that does not have a digital photo on the information page or an integrated chip from the data page. (After October 25, 2006, the integrated chip is required on all machine-readable passports.)
› are planning to stay longer than 90 days
› are planning to work or study in the US.

For detailed information on subway and bus wheelchair accessibility, call the Accessible Line ( 718-596-8585) or visit www.mta.info/mta/ada for a list of subway stations with elevators or escalators. Also visit www.nycgo.com and search for ‘accessibility.’ Visas The USA Visa Waiver Program (VWP) allows nationals from 36 countries to enter the US without a visa, provided they are carrying a machine-readable passport. For the updated list of countries included in the program and current requirements, see the US Department of State (http://travel.state.gov/visa) website. Citizens of VWP countries need to register with the US Department of Homeland Security (http://esta.cbp.dhs.gov) three days before their visit.

pages: 321 words: 113,564

AI in Museums: Reflections, Perspectives and Applications
by Sonja Thiel and Johannes C. Bernhardt
Published 31 Dec 2023

This is why networks such as AI4LAM, The Museum + AI Network, or Europeana Tech are particularly helpful and should be expanded in future in order to ensure sustainable knowledge transfer for cultural heritage professionals.

3 https://github.com/LAION-AI.
4 https://dev-sabio.sudox.nl/about.

The Value of Cultural Heritage Data

Museums, like other businesses and institutions, produce large amounts of data, including images, text, audio, video, user data, metadata, and complementary research. This collection data is of great value to AI development, as generations of curators have worked on the quality of object descriptions and scholarly descriptions of context or related classification systems. Ideally, this information is stored in a machine-readable collection management system and includes quality-controlled metadata and standard data or authority files. The collection data is, moreover, linked to high-level ontologies, vocabularies, or thesauri systems such as AAT, GND, Geonames, Wikidata, or ICONCLASS, which ensure the correct use of terms and provide additional context.

For example, text recognition (in other words, optical character recognition [OCR] or handwritten text recognition) extracts the text from a scan to make it machine-readable, layout analysis can structure the various types of content on a scanned page into different sections like text, images, tables, et cetera, and methods from the domain of natural language processing (NLP) can be utilized to extract information (for instance, named entities) from the text or to enrich it semantically (for example, with links to a knowledge base), to name just a few applications. Altogether, the abovementioned processes can be useful in creating machine-readable corpora or datasets from digitized collections, which can in turn again help improve machine learning methods and models (Lee 2022). AI methods and models have provided significant improvements for all of the above applications, for instance, for recognizing text in historical prints (Wick/Reul/Puppe et al. 2018) or handwritten documents (Muehlberger/Seaward/Terras et al. 2019), for document layout analysis (Shen/Zhang/Dell et al. 2021 and Huang/Lv/Cui et al. 2022), for content-based retrieval (Brantl/Schweter 2022), or in the area of named entity recognition and linking (Ehrmann et al. 2020; 2022).
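For the text-recognition step specifically, a minimal sketch using the open-source Tesseract engine through the pytesseract bindings (assumes both are installed; the filename is hypothetical):

from PIL import Image
import pytesseract

# OCR: extract machine-readable text from a scanned page image.
scan = Image.open("scanned_page.png")   # hypothetical scan of a collection object
text = pytesseract.image_to_string(scan, lang="eng")
print(text)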

To guide museums in recording provenance, the American Alliance of Museums (AAM) and the International Foundation for Art Research (IFAR) have compiled guidelines on writing provenance texts (Yeide/Walsh/Akinsha 2001; IFAR 2023). These guidelines, with their allowances for variation, do not represent strict standards, nor do they anticipate machine readability. They do, however, introduce writing conventions that have found widespread adoption, especially in the English-speaking provenance world, for instance, organizing texts according to their chronology or using specific punctuation to convey meaning. We found this genre of provenance to be particularly suitable for automatic structuring.

Free as in Freedom
by Sam Williams
Published 16 Nov 2015

Show us your code. Show us it can be done.'" In true hacker fashion, Stallman began looking for existing programs and tools that could be converted into GNU programs and tools. One of the first was a compiler named VUCK, which converted programs written in the popular C programming language into machine-readable code. Translated from the Dutch, the program's acronym stood for the Free University Compiler Kit. Optimistic, Stallman asked the program's author if the program was free. When the author informed him that the words "Free University" were a reference to the Vrije Universiteit in Amsterdam, Stallman was chagrined.

The "Invariant Sections" are certain Secondary Sections whose titles are designated, as being those of Invariant Sections, in the notice that says that the Document is released under this License. The "Cover Texts" are certain short passages of text that are listed, as Front-Cover Texts or Back-Cover Texts, in the notice that says that the Document is released under this License. A "Transparent" copy of the Document means a machine-readable copy, represented in a format whose specification is available to the general public, whose contents can be viewed and edited directly and straightforwardly with generic text editors or (for images composed of pixels) generic paint programs or (for drawings) some widely available drawing editor, and that is suitable for input to text formatters or for automatic translation to a variety of formats suitable for input to text formatters.

If the required texts for either cover are too voluminous to fit legibly, you should put the first ones listed (as many as fit reasonably) on the actual cover, and continue the rest onto adjacent pages. If you publish or distribute Opaque copies of the Document numbering more than 100, you must either include a machine-readable Transparent copy along with each Opaque copy, or state in or with each Opaque copy a publicly-accessible computer-network location containing a complete Transparent copy of the Document, free of added material, which the general network-using public has access to download anonymously at no charge using public-standard network protocols.

pages: 402 words: 110,972

Nerds on Wall Street: Math, Machines and Wired Markets
by David J. Leinweber
Published 31 Dec 2008

They provided a picture of the data, the last trade, and the quote. It was the same information one would find by grabbing the end of a paper ticker tape and running it back to find the newest information for the stock of interest. This was literally what people did for a long time to get current quotes. With the advent of true machine-readable market data—the text frontier—it was not long before these basic, simple text “pictures of data” were replaced by screens with an increasing variety of market charts showing both common and customized analytics.

Figure 2.2 Market data systems: then and now. Courtesy of New York Stock Exchange and Thomson Reuters.

Summary (and Sermonette)

These dairy product and calendar examples are obviously contrived. They are not far removed from many ill-conceived quantitative investment and trading ideas. It is just as easy to fool yourself with ideas that are plausible-sounding and no more valid. Just because something appears plausible, that doesn’t mean that it is. The wide availability of machine-readable data, and the tools to analyze it, easily means that there are a lot more regressions going on than Legendre could ever have imagined back in 1805. If you look at 100 regressions that are significant at a level of 95 percent, five of them are there just by chance. Look at 100,000 models at 95 percent significance, and 5,000 are false positives.
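That false-positive arithmetic is easy to check by simulation; a small sketch (assuming NumPy and SciPy are available) that regresses pure noise many times:

# Regress pure noise 1,000 times; about 5% of the fits come out
# "significant" at the 95% level purely by chance.
import numpy as np
from scipy.stats import linregress

rng = np.random.default_rng(0)
trials = 1000
false_positives = 0
for _ in range(trials):
    x = rng.normal(size=100)
    y = rng.normal(size=100)   # no real relationship to x
    if linregress(x, y).pvalue < 0.05:
        false_positives += 1
print(false_positives / trials)   # roughly 0.05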

In designing MarketMind (and later QuantEx), the goal was not to use AI for its own sake, but rather to apply AI techniques where they could be used appropriately and within their limits to provide an advantage over conventional technologies. A clue to the question of how to apply AI in trading is found by looking at the many thousands of electronic trading support systems that were already in use. A new product must provide a readily perceptible advantage over those systems. Real Charting Soon after reliable machine-readable market data became available in the mid-1980s, there were many services available to traders that would let them put a chart on the screen. As desktop computers became more powerful, people would put more graphs on the screen, and go through them more rapidly. They would look for patterns or relationships and often take some action when they found them.

pages: 761 words: 80,914

Ansible: Up and Running: Automating Configuration Management and Deployment the Easy Way
by Lorin Hochstein
Published 8 Dec 2014

Let’s work through an example of creating a dynamic inventory script that retrieves the details about hosts from Vagrant.2 Our dynamic inventory script is going to need to invoke the vagrant status command. The output shown in Example 3-11 is designed for humans to read, rather than for machines to parse. We can get a list of running hosts in a format that is easier to parse with the --machine-readable flag, like so:

$ vagrant status --machine-readable

The output looks like this:

1410577818,vagrant1,provider-name,virtualbox
1410577818,vagrant1,state,running
1410577818,vagrant1,state-human-short,running
1410577818,vagrant1,state-human-long,The VM is running. To stop this VM%!(VAGRANT_COMMA) you can run `vagrant halt` to\nshut it down forcefully%!

#!/usr/bin/env python
# Adapted from Mark Mandel's implementation
# https://github.com/ansible/ansible/blob/devel/plugins/inventory/vagrant.py
# License: GNU General Public License, Version 3 <http://www.gnu.org/licenses/>

import argparse
import json
import paramiko
import subprocess
import sys


def parse_args():
    parser = argparse.ArgumentParser(description="Vagrant inventory script")
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument('--list', action='store_true')
    group.add_argument('--host')
    return parser.parse_args()


def list_running_hosts():
    cmd = "vagrant status --machine-readable"
    status = subprocess.check_output(cmd.split()).rstrip()
    hosts = []
    for line in status.split('\n'):
        (_, host, key, value) = line.split(',')
        if key == 'state' and value == 'running':
            hosts.append(host)
    return hosts


def get_host_details(host):
    cmd = "vagrant ssh-config {}".format(host)
    p = subprocess.Popen(cmd.split(), stdout=subprocess.PIPE)
    config = paramiko.SSHConfig()
    config.parse(p.stdout)
    c = config.lookup(host)
    return {'ansible_ssh_host': c['hostname'],
            'ansible_ssh_port': c['port'],
            'ansible_ssh_user': c['user'],
            'ansible_ssh_private_key_file': c['identityfile'][0]}


def main():
    args = parse_args()
    if args.list:
        hosts = list_running_hosts()
        json.dump({'vagrant': hosts}, sys.stdout)
    else:
        details = get_host_details(args.host)
        json.dump(details, sys.stdout)

if __name__ == '__main__':
    main()

Pre-Existing Inventory Scripts

Ansible ships with several dynamic inventory scripts that you can use.

pages: 394 words: 118,929

Dreaming in Code: Two Dozen Programmers, Three Years, 4,732 Bugs, and One Quest for Transcendent Software
by Scott Rosenberg
Published 2 Jan 2006

Which is all fine, except for the distressing fact that most of the human beings who use those machines habitually count from one. And so, down in the guts of the system, where data is stored and manipulated—representations of our money and our work lives and our imaginative creations all translated into machine-readable symbols—computer programs and programming languages often include little offsets, translations of “+1” or “-1,” to make sure that the list of stuff the computer is counting from zero stays in sync with the list of stuff a human user is counting from one. In the binary digital world of computers, all information is reduced to sequences of zeros and ones.
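The "+1" translation described here, in a tiny sketch:

items = ["alpha", "beta", "gamma"]   # the machine counts these from 0
for i, item in enumerate(items):
    print(f"{i + 1}. {item}")        # the human-facing list counts from 1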

(Monty Python’s form-smashing absurdism has always found some of its truest fans in computer labs; we call the flood of unsolicited and unwanted email “spam” thanks to the Internet pioneers who, looking to name the phenomenon, recalled a Python routine featuring a luncheonette menu offering nothing but variations on “eggs, sausage, spam, spam, spam, and spam.”) Python is an interpreted language. Where compiled languages run programmers’ source code through a compiler ahead of time to translate it into machine-readable binary code, interpreted languages perform that translation when you run the program. The source code gets translated line by line by the interpreter and fed to the processor for execution. This makes interpreted languages less efficient, since you’re always running two programs at once, the program you want to use and the interpreter.

Simonyi wants to give these subject matter experts a set of tools they can use to explain their intentions and needs in a structured way that the computer can understand. Intentional Software’s system will let the nonprogrammer experts define a set of problems—for a hospital administration program, say, they might catalog all the “actors,” their “roles,” tasks that need to be performed, and all other details—in a machine-readable format. That set of definitions, that model, is then fed into a generator program that spits out the end-product software. There is still work for programmers in building the tools for the subject matter experts and in writing the generator. But once that work is done, the nonprogrammers can tinker with their model and make changes in the software without ever needing to “make a humble request to the programmer.”

Data and the City
by Rob Kitchin,Tracey P. Lauriault,Gavin McArdle
Published 2 Aug 2017

According to this mapping: POST is used to create a resource on the server, GET is used to retrieve a resource, PUT is used to update or change the state of a resource, and DELETE is used for removing a resource. RESTful services are easier to implement, maintain and utilize, but they do not have powerful and standard support for features like standard contract (or machine-readable service description), distributed transactions, composition and security, which are needed in most enterprise applications, and that is why most enterprise applications implement Web Services (Daigneau 2011). In addition, there is no standard discovery mechanism for RESTful services other than HTTP OPTIONS, which only provides the list of available methods that are supported by a service.
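From the client side, that verb-to-operation mapping might look like this sketch using Python's requests library (the URL and resource are hypothetical):

import requests

base = "http://example.org/api/sensors"   # hypothetical resource URL

requests.post(base, json={"name": "s1"})                              # create
r = requests.get(base + "/s1")                                        # retrieve
requests.put(base + "/s1", json={"name": "s1", "status": "active"})   # update
requests.delete(base + "/s1")                                         # remove
print(r.status_code)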

A client (usually a service or software) can request an XML-encoded capabilities document (containing the names of feature types that can be accessed via WFS service, the spatial reference system(s), the spatial extent of the data and information about the operations that are supported) by sending the GetCapabilities request to the WFS service. The GetCapabilities operation is required for any OGC Web service. The purpose of the GetCapabilities operation is to obtain service metadata, which is a machine-readable (and also human-readable) description of the server’s information content and acceptable request parameter values. The purpose of the DescribeFeatureType operation in the WFS standard is to retrieve an XML schema document with a description of the data structure (or schema) of the feature types served by that WFS service.
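A minimal sketch of issuing those two requests against a hypothetical WFS endpoint (parameter spellings vary slightly between WFS versions):

import requests

wfs = "http://example.org/geoserver/wfs"   # hypothetical WFS endpoint

# GetCapabilities: service metadata -- feature types, reference systems, extents.
caps = requests.get(wfs, params={"service": "WFS", "version": "1.1.0",
                                 "request": "GetCapabilities"})

# DescribeFeatureType: the XML schema of one served feature type.
schema = requests.get(wfs, params={"service": "WFS", "version": "1.1.0",
                                   "request": "DescribeFeatureType",
                                   "typeName": "city:buildings"})   # hypothetical type

print(caps.status_code, schema.status_code)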

For users inside the organization the WS-T provides the fastest possible communication speed. Also WS-T can be consumed by other organizations in a city if they have an appropriate service level agreement. Developing applications using WS-W and WS-T bindings is easier for professional developers (or enterprise developers) because the services have a machine-readable contract and creating consumer applications (proxy classes) is almost automatic using integrated development environments. REST endpoints can be used for accessing data and analysis which should be publicly available. The REST services are usually consumed for developing Web 2.0 applications and connected mobile applications (mobile apps that need to be always connected to the internet).

pages: 519 words: 142,646

Track Changes
by Matthew G. Kirschenbaum
Published 1 May 2016

Unsurprisingly, Continental philosophy and theory tried its best to account for these newfound qualities of writing. Friedrich Kittler used the example of word processing and specifically WordPerfect to launch the argument in his most famous refutation of the myth of digital transcendence, drawing attention to the difference between human- and machine-readable writing by noting that the actual word “WordPerfect” is too long to be faithfully rendered under the software’s own native DOS regimen of eight-character file names.76 For Kittler, word processing marked a definitive break with prior writing technologies because words stopped being mere signifiers and become executables instead: “Surely tapping the letter sequence of W, P, and Enter on [a] keyboard does not make the Word perfect, but this simple writing act starts the execution of WordPerfect.”77 And it does so without ambiguity, perfectly predictable each and every time.

The first indicates the actual system components that are to be involved—display, keyboard, printer, speakers, and so forth. The second presents the logical operations of the processor in schematic form. The third column, called the REROUTE, imposes conditionals or branching logic. The fourth and final column was for descriptive notes, human- rather than machine-readable text. Herbert and Barnard believed that a universal visual vocabulary for representing programming concepts would allow a larger number of people to not only use but in fact program computers. “If you can type a letter on a typewriter and tell a stranger how to get to your house, you can write your own programs,” they promised.63 Nonetheless, they understood at the time that the user would still have to manually transliterate the PROGRAMAP to an actual executable language like BASIC.

Users can choose the Hanx Writer in lieu of iOS’s default text editor, and have the option of outputting their creations (overstrikes and all) to a variety of social media platforms. At the same time, the stylus, together with so-called smart pens, is presenting what is surely the most serious challenge to the keyboard in some time.32 Smart pens, which sync to a nearby digital storage device where the handwriting is interpreted and reproduced as machine-readable text, effectively replicate the input logic of the MT/ST and Redactron, allowing what happens on a sheet of paper to leave a symbolically stored record subject to ongoing manipulation and revision. One researcher at Microsoft is outfitting pens and tablet styluses with sensors to differentiate between different kinds of grips; the basic insight is that people hold the pen differently depending on what they intend to do with it—write, draw, poke at an icon on a screen.

Digital Transformation at Scale: Why the Strategy Is Delivery
by Andrew Greenway,Ben Terrett,Mike Bracken,Tom Loosemore
Published 18 Jun 2018

The chances of it delivering artful machine-learning-led services without fundamentally changing the institution itself are slim to none too. One indicator of an organisation’s maturity and readiness for this next wave of technologies – assuming that it already has a digital working culture in place – is how it looks after its data. If an institution knows what data it owns, makes it machine readable, and has considered the data protection and privacy issues that come with the responsibility of looking after it, it might have a fighting chance. Without those things, forget it. Whatever the hype may be, new technologies like machine learning are forcing the right questions into the open.

The good news is that you can start small. Until very recently, the UK government didn’t have a single agreed list of countries. Instead, there were scores of lists, some out of date, some incomplete, some with alternative names. The lack of consistency is maddening enough for people, but more crucially, makes the reliable use of machine-readable data near impossible. A register of single countries is now available for every department to use. It’s a start. Central power The role of the centre in a platform government is up for debate. Most of the argument centres on what role a central department or institution should be responsible for designing and running platforms, versus playing a convener, standard-setting role, versus butting out and gently encouraging departments to play nicely.

pages: 39 words: 4,665

Data Source Handbook
by Pete Warden
Published 15 Feb 2011

<toplevel>
<CompleteSuggestion><suggestion data="san francisco is in what county"/><num_queries int="77100000"/></CompleteSuggestion>
<CompleteSuggestion><suggestion data="san francisco is full of characters"/><num_queries int="20700000"/></CompleteSuggestion>
<CompleteSuggestion><suggestion data="san francisco is known for"/><num_queries int="122000000"/></CompleteSuggestion>
<CompleteSuggestion><suggestion data="san francisco is weird"/><num_queries int="6830000"/></CompleteSuggestion>
<CompleteSuggestion><suggestion data="san francisco is for carnivores"/><num_queries int="103000"/></CompleteSuggestion>
<CompleteSuggestion><suggestion data="san francisco is boring"/><num_queries int="3330000"/></CompleteSuggestion>
<CompleteSuggestion><suggestion data="san francisco is the best city in the world"/><num_queries int="63800000"/></CompleteSuggestion>
<CompleteSuggestion><suggestion data="san francisco is gay"/><num_queries int="24100000"/></CompleteSuggestion>
<CompleteSuggestion><suggestion data="san francisco is burning"/><num_queries int="11200000"/></CompleteSuggestion>
<CompleteSuggestion><suggestion data="san francisco is overrated"/><num_queries int="409000"/></CompleteSuggestion>
</toplevel>

Wolfram Alpha

The Wolfram Alpha platform pulls together a very broad range of facts and figures on everything from chemistry to finance. The REST API takes in some search terms as input, and returns an XML document containing the results. The output is a series of sections called pods, each containing text and images ready to display to users. Unfortunately there’s no easy way to get a machine-readable version of this information, so you can’t do further processing on the data within your application. It’s still a rich source of supplemental data to add into your own search results, though, which is how Bing is using the service. If you’re a noncommercial user, you can make up to 2,000 queries a month for free, and you can experiment with the interactive API console if you want to explore the service.

pages: 484 words: 104,873

Rise of the Robots: Technology and the Threat of a Jobless Future
by Martin Ford
Published 4 May 2015

As quoted in Kadim Shubber, “Artificial Artists: When Computers Become Creative,” Wired Magazine–UK, August 13, 2007, http://www.wired.co.uk/news/archive/2013-08/07/can-computers-be-creative/viewgallery/306906.
41. Shubber, “Artificial Artists: When Computers Become Creative.”
42. “Bloomberg Bolsters Machine-Readable News Offering,” The Trade, February 19, 2010, http://www.thetradenews.com/News/Operations_Technology/Market_data/Bloomberg_bolsters_machine-readable_news_offering.aspx.
43. Neil Johnson, Guannan Zhao, Eric Hunsader, Hong Qi, Nicholas Johnson, Jing Meng, and Brian Tivnan, “Abrupt Rise of New Machine Ecology Beyond Human Response Time,” Nature, September 11, 2013, http://www.nature.com/srep/2013/130911/srep02627/full/srep02627.html.
44.

They attempt to profit by detecting and then snapping up shares in front of huge transactions initiated by mutual funds and pension managers. They seek to deceive other algorithms by inundating the system with decoy bids that are then withdrawn within tiny fractions of a second. Both Bloomberg and Dow News Service offer special machine-readable products designed to feed the algorithms’ voracious appetites for financial news that they can—perhaps within milliseconds—turn into profitable trades. The news services also provide real-time metrics that let the machines see which items are attracting the most attention.42 Twitter, Facebook, and the blogosphere are likewise all fodder for these competing algorithms.

pages: 408 words: 105,715

Kingdom of Characters: The Language Revolution That Made China Modern
by Jing Tsu
Published 18 Jan 2022

* * *

In the early 1960s, the Library of Congress decided to embark on a massive automation project. It would build a universal catalog system, the kind that the Chinese librarian Bismarck Doo had dreamt of decades earlier but powered by computers. The Library of Congress began converting its paper catalog to a searchable digital index. With this machine-readable system, a user would be able to look up a title in a library thousands of miles away and at any time of the day, as long as he or she had access to a computer. Millions of catalog cards were made obsolete. The project turned the bookish craft of library cataloging into the sleek discipline of information science.

Driven by an urgent mission to contain the global spread of communism, academics, librarians, and the government worked together to build up a knowledge base about China. In the 1960s alone, East Asian libraries acquired as many new holdings as they had during the course of the entire previous century. Digitizing these collections would be a challenge, because machine-readable databases could not accommodate non-Roman script languages like Hebrew, Arabic, Persian, or any of the East Asian languages. A number of American foundations finally took action. In November 1979, the American Council of Learned Societies (ACLS) sponsored a conference titled East Asian Character Processing in Automated Bibliographic Systems at Stanford University.

If two characters look alike, but are semantically different, however, they get separate coding. Whether a character should be unified with another can be a hard call to make, but someone has to make it. The Chinese script so far has defied most attempts to systematize it in a complete way. Should there be a future technology for writing even more exacting than machine-readable codes, any judgment call made by a human user today might very well be seen as an inconsistency later, which would entail more corrections. But no matter, as Lu urges whenever the IRG delegates are locked in stalemates and heated arguments, “we must go on.” And they do. One could see the IRG’s workload as essentially a denial-of-service attack by an unwieldy language: the Chinese script having its revenge on Western technology.

pages: 292 words: 62,575

97 Things Every Programmer Should Know
by Kevlin Henney
Published 5 Feb 2010

Learn Foreign Languages
Klaus Marquardt

PROGRAMMERS NEED TO COMMUNICATE. A lot. There are periods in a programmer's life when most communication seems to be with the computer—more precisely, with the programs running on that computer. This communication is about expressing ideas in a machine-readable way. This remains an exhilarating prospect: programs are ideas turned into reality, with virtually no physical substance involved. Programmers need to be fluent in the language of the machine, whether real or virtual, and in the abstractions that can be related to that language via development tools.

Good programmers need to be able to stand outside their daily routine, to be aware of other languages that are expressive for other purposes. The time always comes when this pays off. Beyond communication with machines, programmers need to communicate with their peers. Today's large projects are more social endeavors than simply the applied art of programming. It is important to understand and express more than the machine-readable abstractions can. Most of the best programmers I know are also very fluent in their mother tongue, and typically in other languages as well. This is not just about communication with others: speaking a language well also leads to a clarity of thought that is indispensable when abstracting a problem.

pages: 58 words: 12,386

Big Data Glossary
by Pete Warden
Published 20 Sep 2011

Natural Language Toolkit

The NLTK is a collection of Python modules and datasets that implement common natural language processing techniques. It offers the building blocks that you need to build more complex algorithms for specific problems. For example, you can use it to break up texts into sentences, break sentences into words, stem words by removing common suffixes (like -ing from English verbs), or use machine-readable dictionaries to spot synonyms. The framework is used by most researchers in the field, so you’ll often find cutting-edge approaches included as modules or as algorithms built from its modules. There are also a large number of compatible datasets available, as well as ample documentation. NLTK isn’t aimed at developers looking for an off-the-shelf solution to domain-specific problems.
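A few of those building blocks in a short sketch (assumes the relevant NLTK data packages, such as punkt and wordnet, have been downloaded):

from nltk.tokenize import sent_tokenize, word_tokenize
from nltk.stem import PorterStemmer
from nltk.corpus import wordnet

text = "Machine-readable dictionaries help in spotting synonyms. Stemming strips suffixes."

sentences = sent_tokenize(text)          # break the text into sentences
words = word_tokenize(sentences[0])      # break a sentence into words
print(words[:5])
print(PorterStemmer().stem("spotting"))  # 'spot': common suffix removed

# Use a machine-readable dictionary (WordNet) to spot synonyms.
print({lemma.name() for s in wordnet.synsets("help") for lemma in s.lemmas()})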

pages: 252 words: 72,473

Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy
by Cathy O'Neil
Published 5 Sep 2016

Replacing a worker earning $50,000 a year: Heather Boushey and Sarah Jane Glynn, “There Are Significant Business Costs to Replacing Employees,” American Progress, November 16, 2012, www.americanprogress.org/issues/labor/report/2012/11/16/44464/there-are-significant-business-costs-to-replacing-employees/.
Evolv: Jessica Leber, “The Machine-Readable Workforce: Companies Are Analyzing More Data to Guide How They Hire, Recruit, and Promote Their Employees,” MIT Technology Review, May 27, 2013, www.technologyreview.com/news/514901/the-machine-readable-workforce/.
A pioneer in this field is Gild: Jeanne Meister, “2015: Social HR Becomes A Reality,” Forbes, January 5, 2015, www.forbes.com/sites/jeannemeister/2015/01/05/2015-social-hr-becomes-a-reality/.

pages: 434 words: 77,974

Mastering Blockchain: Unlocking the Power of Cryptocurrencies and Smart Contracts
by Lorne Lantz and Daniel Cawrey
Published 8 Dec 2020

Use Cases: ICOs There are a number of applications for a computerized transaction protocol using smart contracts. The concept of Ricardian contracts as proposed by Ian Grigg in 1996 provides insight into the realm of use cases for this technology. Innovations include using a cryptographic hash function for identification and defining legal elements as machine-readable by a computer. By being able to execute a set of instructions (via a smart contract) and associate it with an accounting system (via a blockchain), the Ethereum platform can be used to run a number of different dapps. During the early years after Ethereum’s release, it took time for a developer ecosystem to grow.

To read data from the contract you just ping the network directly, like making a call to a public API. However, to write data to the contract, you must send a transaction to the contract address. All read/write interactions with a smart contract require a reference to the contract’s application binary interface (ABI). The ABI is like an API for a smart contract. ABIs are machine-readable, meaning they are easy to parse by client software to understand how to interact with the contract code. An ABI documents all the functions and their attributes. Here is the ABI for the Guestbook smart contract:

[
 {"constant":true,"inputs":[{"name":"_bookentrynumber","type":"uint256"}],
  "name":"getmessagefromreader","outputs":[{"name":"_messagefromreader","type":"string"}],
  "payable":false,"stateMutability":"view","type":"function"},
 {"constant":true,"inputs":[],"name":"getnumberofmessagesfromreaders",
  "outputs":[{"name":"_numberofmessages","type":"uint256"}],
  "payable":false,"stateMutability":"view","type":"function"},
 {"constant":true,"inputs":[],"name":"getmessagefromauthors",
  "outputs":[{"name":"_name","type":"string"}],
  "payable":false,"stateMutability":"view","type":"function"},
 {"constant":false,"inputs":[{"name":"_messagefromreader","type":"string"}],
  "name":"setmessagefromreader","outputs":[],
  "payable":false,"stateMutability":"nonpayable","type":"function"},
 {"constant":false,"inputs":[{"name":"_messagefromauthors","type":"string"}],
  "name":"setmessagefromauthors","outputs":[],
  "payable":false,"stateMutability":"nonpayable","type":"function"},
 {"inputs":[],"payable":false,"stateMutability":"nonpayable","type":"constructor"}
]

Reading a smart contract

Let’s read the data in the Guestbook smart contract.
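Reading from this contract might look like the following sketch using the Python web3 library, web3.py (the node URL and deployed address are hypothetical; writing would additionally require building and signing a transaction):

from web3 import Web3

w3 = Web3(Web3.HTTPProvider("http://localhost:8545"))    # hypothetical node endpoint

abi = [...]   # paste the Guestbook ABI shown above
address = "0x0000000000000000000000000000000000000000"   # hypothetical deployed address

guestbook = w3.eth.contract(address=address, abi=abi)

# Reading is a direct call against the network; no transaction is needed.
print(guestbook.functions.getmessagefromauthors().call())
print(guestbook.functions.getnumberofmessagesfromreaders().call())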

No Slack: The Financial Lives of Low-Income Americans
by Michael S. Barr
Published 20 Mar 2012

Credit Card Accountability Responsibility and Disclosure (CARD) Act of 2009, Pub. L. No. 111-24, 123 Stat. 1734.

shopping by requiring the public posting to the Federal Reserve of credit-card contracts in machine-readable formats; private firms or nonprofits can develop tools for experts and consumers to use to evaluate these various contracts. The new Consumer Financial Protection Bureau will undoubtedly have occasion to review these and other requirements for credit cards in the future.

Overdraft

Traditionally, customer payment orders in excess of the funds available in the customer’s bank account, whether by check or by debit card, were declined.

In doing so, it will rely on consumer testing, can issue model disclosures that provide a safe harbor for compliance, and may permit financial institutions to use trial disclosure programs to test out the effectiveness of alternative disclosures to those provided for in the CFPB model form. The bureau is mandated to merge conflicting Real Estate Settlement Procedures Act and TILA mortgage disclosures into a simple form. Consumers are provided with rights to access information about their own product usage in standard, machine-readable formats. Over time, the CFPB may generate research and experimentation that will improve our understanding of consumer financial decisionmaking and in turn will support the bureau’s supervision, rule writing, and enforcement. Behaviorally Informed Credit-Card Regulation Credit-card companies have fine-tuned product offerings and disclosures in a manner that appears to be systematically designed to prey on common psychological biases—biases that limit consumer ability to make rational choices regarding credit-card borrowing (Bar-Gill 2004).

Based on the same understanding that consumers do not shop for penalty fees and that they often misforecast their own behavior, it requires that late fees or other penalty fees must be “reasonable and proportionate,” as determined by implementing rules; that in any event the fees not be larger than the amount charged that is over the limit or late; and that a late fee or other penalty fee cannot be assessed more than once for the same transaction or event. Furthermore, the act takes steps to make it easier for the market to develop mechanisms for consumer comparison shopping by requiring the public posting to the Federal Reserve of credit-card contracts in machine-readable formats; private firms or nonprofits can develop tools for experts and consumers to use to evaluate these various contracts. The Consumer Financial Protection Bureau will undoubtedly have occasion to review these and other requirements for credit cards in the future. Increasing Saving by Low- and Moderate-Income Households We have focused thus far in this chapter on improving outcomes in the credit markets using insights from behavioral economics and industrial organization.

Advanced Software Testing—Vol. 3, 2nd Edition
by Jamie L. Mitchell and Rex Black
Published 15 Feb 2015

A graphical view of how the elements of the tool all fit together is shown in Figure 6–5.

Figure 6–5 ePRO-LOG and the test tool’s subsystem interactions

The aspect that transformed the tool from a dumb monkey to a model-based testing tool, however, was the ability to read machine-readable requirements specifications to determine the correct appearance of each screen. These requirements were machine readable to support another element of FDA auditing of the software development (rather than testing) process, so there was no extra work to create them. The tool had a parser that allowed it to read those specifications as it walked through the application.

We started with the basic idea of a dumb monkey tool, an unscripted automated test tool that gives input at random. The dumb monkey was implemented in Perl under Cygwin running on a Windows PC. This tool went beyond the typical dumb monkey, though, as it had the ability to check against a model of the system’s behavior, encapsulated in machine-readable requirements specifications. Since we were building a tool we called the monkey, as you can imagine elements of humor entered the project, as illustrated by the terminology shown in Table 6–6.

Table 6–6 Terms used for the model-based testing tool

The monkey started off working much the same as any dumb monkey tool, sending random inputs to the user interface, forcing a random walk through the diary implemented in the ePRO-LOG application.

pages: 238 words: 93,680

The C Programming Language
by Brian W. Kernighan and Dennis M. Ritchie
Published 15 Feb 1988

We have refined the original examples, and have added new examples in several chapters. For instance, the treatment of complicated declarations is augmented by programs that convert declarations into words and vice versa. As before, all examples have been tested directly from the text, which is in machine-readable form. Appendix A, the reference manual, is not the standard, but our attempt to convey the essentials of the standard in a smaller space. It is meant for easy comprehension by programmers, but not as a definition for compiler writers -- that role properly belongs to the standard itself. Appendix B is a summary of the facilities of the standard library.
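The declaration-to-words idea is easy to sketch. A toy version in Python, handling only a small subset (pointers and arrays, with no parenthesized declarators), hints at what the book's full dcl program does:

```python
import re

def declaration_to_words(decl):
    """Read a simple C declaration aloud (toy subset of the book's dcl)."""
    m = re.match(r"(\w+)\s+(\**)(\w+)((?:\[\d*\])*)\s*;?$", decl)
    if not m:
        raise ValueError("unsupported declaration: " + decl)
    base, stars, name, arrays = m.groups()
    words = [name + ":"]
    words += ["array of"] * arrays.count("[")
    words += ["pointer to"] * len(stars)
    words.append(base)
    return " ".join(words)

print(declaration_to_words("char **argv;"))    # argv: pointer to pointer to char
print(declaration_to_words("char *line[10];")) # line: array of pointer to char
```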

It contains a tutorial introduction to get new users started as soon as possible, separate chapters on each major feature, and a reference manual. Most of the treatment is based on reading, writing and revising examples, rather than on mere statements of rules. For the most part, the examples are complete, real programs rather than isolated fragments. All examples have been tested directly from the text, which is in machine-readable form. Besides showing how to make effective use of the language, we have also tried where possible to illustrate useful algorithms and principles of good style and sound design. The book is not an introductory programming manual; it assumes some familiarity with basic programming concepts like variables, assignment statements, loops, and functions.

pages: 270 words: 79,992

The End of Big: How the Internet Makes David the New Goliath
by Nicco Mele
Published 14 Apr 2013

Sometimes people utter the catchall term “digital,” but it’s not clear what that means, either; remember the digital watches of the 1980s? “Open” sounds good: open government, open-source politics, open-source policy. But WikiLeaks brings severe diplomatic and political consequences that “open” doesn’t capture. Just because something is machine readable and online doesn’t necessarily mean it is open. Also, “openness” describes the end result of technology, but it ignores the closed cabal of nerds (of which I’m one) that came up with this technology and defined its political implications. Not to mention that the control a handful of companies exert over our technology is far from open—companies like Apple, Google, and Facebook.

Rather than spend twenty-two hours a week watching television, some Americans might put some of that time into building useful applications for their fellow citizens using the raw material provided by Data.gov. Kundra had good reasons to believe Americans would take advantage of any data the government put online—as long as it went online in a useful, machine-readable format. Before being CIO of the U.S. government, he had been the CIO of the District of Columbia. There, he started a contest encouraging citizens to develop apps for the city. First, he made more than 400 data sets available, many of them in real time. Then he gave away $25,000 in prizes to people who created the best applications for citizens to use.

San Francisco
by Lonely Planet

Visas Required You must obtain a visa from a US embassy or consulate in your home country if you: ➡ Do not currently hold a passport from a VWP country. ➡ Are from a VWP country, but don’t have a machine-readable passport. ➡ Are from a VWP country, but currently hold a passport issued between October 26, 2005, and October 25, 2006, that does not have a digital photo on the information page or an integrated chip from the data page. (After October 25, 2006, the integrated chip is required on all machine-readable passports.) ➡ Are planning to stay longer than 90 days. ➡ Are planning to work or study in the US. Work Visas Foreign visitors are not legally allowed to work in the USA without the appropriate working visa.

Visas Canadians Canadian citizens currently only need proof of identity and citizenship to enter the US – but check the US Department of State for updates, as requirements may change. Visa Waiver Program USA Visa Waiver Program (VWP) allows nationals from 36 countries to enter the US without a visa, provided they are carrying a machine-readable passport. For the updated list of countries included in the program and current requirements, see the US Department of State (http://travel.state.gov/visa) website. Citizens of VWP countries need to register with the US Department of Homeland Security (http://esta.cbp.dhs.gov) three days before their visit.

pages: 330 words: 91,805

Peers Inc: How People and Platforms Are Inventing the Collaborative Economy and Reinventing Capitalism
by Robin Chase
Published 14 May 2015

So he wondered: “Is there any way in which we can use this effort for something that is good for humanity?”12 And so reCAPTCHA was born in 2007. reCAPTCHA takes the effort of typing the characters in a CAPTCHA and repurposes it to solve an entirely different problem. In order to make old newspapers or books useful online, they have to be scanned and the resulting images turned into machine-readable text to be usefully searchable. Sometimes the scanned or photographed image results in words that can’t be decoded using optical character recognition (OCR). This is a problem. When the CAPTCHAs are constructed using words tagged by OCR programs as unreadable, we smart humans do what computers can’t: We easily decode them!
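The pairing-and-voting scheme is simple to sketch (a hedged illustration; reCAPTCHA's actual thresholds and scoring are not public in this detail). One word whose answer is already known authenticates the human; answers to the unknown word are collected as votes until enough humans agree:

```python
from collections import Counter

votes = Counter()   # transcriptions offered for the word OCR could not read
CONSENSUS = 3       # assumed: how many matching answers we require

def grade_captcha(known_answer, known_truth, unknown_answer):
    if known_answer.strip().lower() != known_truth.lower():
        return False                       # failed the control word: reject
    votes[unknown_answer.strip().lower()] += 1
    return True                            # human verified; vote recorded

def transcription():
    word, count = votes.most_common(1)[0]
    return word if count >= CONSENSUS else None

for answer in ["morpheus", "morpheus", "morpheus"]:
    grade_captcha("upon", "upon", answer)
print(transcription())  # morpheus
```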

But our experience has been that open data is fueling growth in a variety of economic sectors. To foster this economic growth, the U.S. government needs to do three things. First, we need to make it easy for entrepreneurs to find and use any public government dataset. To the extent possible, we need to put our vast trove of government data online, in machine-readable and ‘liquid’ form, while continuing to protect privacy. Second, supplying data isn’t enough—data doesn’t do anything by itself. Data is only useful if you apply it. We need to engage external and internal users of our data, in person and online, to prioritize releasing the most useful data sets first.”17 Nick’s third point, about problem prioritization, is one I will develop later in Chapter 8, on evolving legacy institutions.

pages: 81 words: 28,090

The Story of the Pony Express
by Glenn D. Bradley
Published 1 Jan 1913

Information about Donations to the Project Gutenberg Literary Archive Foundation Project Gutenberg-tm depends upon and cannot survive without widespread public support and donations to carry out its mission of increasing the number of public domain and licensed works that can be freely distributed in machine readable form accessible by the widest array of equipment including outdated equipment. Many small donations ($1 to $5,000) are particularly important to maintaining tax exempt status with the IRS. The Foundation is committed to complying with the laws regulating charities and charitable donations in all 50 states of the United States.

pages: 561 words: 157,589

WTF?: What's the Future and Why It's Up to Us
by Tim O'Reilly
Published 9 Oct 2017

This idea also echoes one of the “Eight Principles of Open Government Data” that Carl Malamud, Harvard law professor Larry Lessig, and I, together with a group of about thirty other open data activists, had published after a working group meeting in December 2007. One of those principles is that data should be published in formats that are not just machine readable but machine processable, so that the data could be reused for purposes not envisioned by its original producers. Open data had become a key talking point of the new administration, but most people only thought of it as a tool of government transparency and accountability. A handful saw that there was a real opportunity to make data much more useful to citizens and society.

Data aggregators, who collect data not in order to provide services directly to consumers, but to other businesses, should come in for particular scrutiny, since the data transaction between the consumer and the service provider has been erased, and it is far more likely that the data is being used not for the benefit of the consumer who originally provided it but for the benefit of the purchaser. Disclosure and consent as currently practiced are extraordinarily weak regulatory tools. They allow providers to cloak malicious intent in complex legal language that is rarely read, and if read, impossible to understand. Machine-readable disclosure similar to those designed by Creative Commons for expressing copyright intent would be a good step forward in building privacy-compliant services. A Creative Commons license allows those publishing content to express their intent clearly and simply, ranging from the “All Rights Reserved” of traditional copyright to a license like CC BY-NC-ND (which requires attribution, but allows the content to be shared freely for noncommercial purposes, and does not allow derivative works).

Through a mix of four or five carefully crafted assertions, which are designed to be both machine and human readable, Creative Commons allows users of a photo-sharing site like Flickr or a video-sharing site like YouTube to search only for content matching certain licenses. An equivalent framework for privacy would be very helpful. During the Obama administration, there was a concerted effort toward what is called “Smart Disclosure,” defined as “the timely release of complex information and data in standardized, machine readable formats in ways that enable consumers to make informed decisions.” New technology like the blockchain can also encode contracts and rules, creating new kinds of “smart contracts.” A smart contracts approach to data privacy could be very powerful. Rather than using brute force “Do Not Track” tools in their browser, users could provide nuanced limits to the use of their data.
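No such standard privacy vocabulary exists yet, but by analogy with the four or five Creative Commons assertions, a machine-readable set of user assertions might look like the following Python sketch (every flag name here is invented for illustration):

```python
import json

# Invented vocabulary, analogous to CC's BY / NC / ND assertions: each
# flag is a simple, machine-checkable statement of the user's intent.
preferences = {
    "subject": "user:example-123",
    "allow_service_use": True,    # data may be used to provide me the service
    "allow_aggregation": False,   # no resale to data aggregators
    "allow_ad_targeting": False,  # no behavioral ad targeting
    "retention_days": 30,         # delete raw data after 30 days
}

def permits(prefs, use):
    """A service checks an intended use against the user's assertions."""
    return bool(prefs.get("allow_" + use, False))

print(json.dumps(preferences))
print(permits(preferences, "ad_targeting"))  # False
```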

pages: 134 words: 29,488

Python Requests Essentials
by Rakesh Vidya Chandra and Bala Subrahmanyam Varanasi
Published 16 Jun 2015

Types of data In most cases, we deal with three types of data when working with web sources. They are as follows: • Structured data • Unstructured data • Semistructured data Structured data Structured data is a type of data that exists in an organized form. Normally, structured data has a predefined format and is machine readable. Each piece of structured data is related to every other piece, because a specific format is imposed on it. This makes it easier and faster to access different parts of the data. The structured data type also helps in mitigating redundancy when dealing with huge amounts of data.
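A small Python example makes the point: because every record in a structured source follows the same predefined format, a program can rely on the field names without any interpretation (the sample data is invented):

```python
import csv, io

# Structured data: every record follows the same predefined format,
# so a program can depend on the field names and types.
raw = io.StringIO("id,name,price\n1,washer,499.00\n2,dryer,449.00\n")
for row in csv.DictReader(raw):
    print(row["name"], float(row["price"]))
```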

pages: 471 words: 94,519

Managing Projects With GNU Make
by Robert Mecklenburg and Andrew Oram
Published 19 Nov 2004

The "Cover Texts" are certain short passages of text that are listed, as Front-Cover Texts or Back-Cover Texts, in the notice that says that the Document is released under this License. A Front-Cover Text may be at most 5 words, and a Back-Cover Text may be at most 25 words. A "Transparent" copy of the Document means a machine-readable copy, represented in a format whose specification is available to the general public, that is suitable for revising the document straightforwardly with generic text editors or (for images composed of pixels) generic paint programs or (for drawings) some widely available drawing editor, and that is suitable for input to text formatters or for automatic translation to a variety of formats suitable for input to text formatters.

If the required texts for either cover are too voluminous to fit legibly, you should put the first ones listed (as many as fit reasonably) on the actual cover, and continue the rest onto adjacent pages. If you publish or distribute Opaque copies of the Document numbering more than 100, you must either include a machine-readable Transparent copy along with each Opaque copy, or state in or with each Opaque copy a computer-network location from which the general network-using public has access to download using public-standard network protocols a complete Transparent copy of the Document, free of added material. If you use the latter option, you must take reasonably prudent steps, when you begin distribution of Opaque copies in quantity, to ensure that this Transparent copy will remain thus accessible at the stated location until at least one year after the last time you distribute an Opaque copy (directly or through your agents or retailers) of that edition to the public.

pages: 297 words: 103,910

Free culture: how big media uses technology and the law to lock down culture and control creativity
by Lawrence Lessig
Published 15 Nov 2004

By developing a free set of licenses that people can attach to their content, Creative Commons aims to mark a range of content that can easily, and reliably, be built upon. These tags are then linked to machine-readable versions of the license that enable computers automatically to identify content that can easily be shared. These three expressions together—a legal license, a human-readable description, and machine-readable tags—constitute a Creative Commons license. A Creative Commons license constitutes a grant of freedom to anyone who accesses the license, and more importantly, an expression of the ideal that the person associated with the license believes in something different than the "All" or "No" extremes.

pages: 135 words: 31,098

ClojureScript: Up and Running
by Stuart Sierra and Luke Vanderhart
Published 24 Oct 2012

As the name suggests, it takes a string argument and returns a single data structure read from that string:

(ns example
  (:require [cljs.reader :as reader]))

(reader/read-string "{:a 1 :b 2}")
;;=> {:a 1, :b 2}

The opposite of read-string is the built-in ClojureScript function pr-str, or “print to string,” which takes a data structure and returns its string representation:

(pr-str {:language "ClojureScript"})
;;=> "{:language \"ClojureScript\"}"

Notice that pr-str automatically escapes special characters and places strings in double quotes, which the print and println functions do not:

(println {:language "ClojureScript"})
;; {:language ClojureScript}
;;=> nil

In general, the print, println, and str functions are used for human-readable output, whereas the pr, prn, and pr-str functions are used for machine-readable output. Example Client-Server Application Building a complete client-server application in Clojure and ClojureScript requires some knowledge of Clojure web libraries, which are outside the scope of this book. But the following example should give you an idea of how easy it is to communicate between the two languages.

Understanding search engines: mathematical modeling and text retrieval
by Michael W. Berry and Murray Browne
Published 15 Jan 2005

For over a decade, the UMLS has been working on enabling computer systems to understand medical meaning [64]. The Metathesaurus is one of the components of the UMLS and contains half a million biomedical concepts with over a million different concept names. Obviously, to do this, automated processing of the machine-readable versions of its 40 source vocabularies is necessary, but it also requires review and editing by subject experts. The next step in item normalization is applying stop lists to the collection of processing tokens. Stop lists are lists of words that have little or no value as a search term. A good example of a stop list is the list of stop words from the SMART system at Cornell University (see ftp://ftp.cs.
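The stop-list step itself is trivial to implement; a minimal Python sketch (with a toy word list rather than the full SMART list):

```python
# Stop-list step in item normalization: tokens that carry little or no
# search value are dropped before indexing.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is"}

def remove_stop_words(tokens):
    return [t for t in tokens if t.lower() not in STOP_WORDS]

tokens = "the automated processing of the source vocabularies".split()
print(remove_stop_words(tokens))
# ['automated', 'processing', 'source', 'vocabularies']
```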

pages: 391 words: 105,382

Utopia Is Creepy: And Other Provocations
by Nicholas Carr
Published 5 Sep 2016

A computerized search engine can swiftly parse explicit connections like citations, quotations, and hyperlinks—it feeds on them as a whale feeds on plankton—but it has little sensitivity to more delicate connections, to the implicit, the playful, the covert, the slant. Search engines are literal-minded, not literary-minded. Google’s overarching goal is to make culture machine-readable. We’ve all benefited from its pursuit of that goal, but Google’s vast field of vision has a very large blind spot. Much of what’s most subtle and valuable in culture—and the allusions of artists fall into this category—is too blurry to be read by machines. Kirsch says that T. S. Eliot had to append notes to “The Waste Land” in order to enable readers to track down its many allusions.

The goal is to create a system of “perfectly predictable interaction between individual and environment, in which nothing needs to be said along the way.” Beyond the efficiency gains, Silicon Valley would stand to profit from such a system. By developing a proprietary brain-computer network that renders human cogitation fully machine-readable, the tech industry would be able to transmit, store, parse, and hence to own, the entirety of our thoughts. “Industrial capitalism privatized the means of production,” Davies observed. “Digital capitalism seeks to privatize the means of communication.” That’s already happening. In digitizing human expression, the protocols of social networks are beginning to alter speech to make it more amenable to machine transmission and interpretation.

pages: 371 words: 108,317

The Inevitable: Understanding the 12 Technological Forces That Will Shape Our Future
by Kevin Kelly
Published 6 Jun 2016

Over the next 30 years, the great work will be parsing all the information we track and create—all the information of business, education, entertainment, science, sport, and social relations—into their most primeval elements. The scale of this undertaking requires massive cycles of cognition. Data scientists call this stage “machine readable” information, because it is AIs and not humans who will do this work in the zillions. When you hear a term like “big data,” this is what it is about. Out of this new chemistry of information will arise thousands of new compounds and informational building materials. Ceaseless tracking is inevitable, but it is only the start.

See also Flows and flowing literacy, 86, 89, 90, 200–202 Local Motors, 160–61 location tracking, 226, 238, 243 Lost (series), 206, 282 Lucas, George, 198 luxury entertainment, 190 Lyft, 62, 252 machine intelligence, 266, 291. See also artificial intelligence “machine readable” information, 267 Magic Leap, 216 malaria, 241 Malthus, Thomas, 243 Mann, Steve, 247 Manovich, Lev, 200 manufacturing, robots in, 52–53, 55 maps, 272 mathematics, 47, 239, 242–43 The Matrix (1999), 211 maximum likelihood estimation (MLE), 265 McDonalds, 25–26 McLuhan, Marshall, 63, 127 media fluency, 201 media genres, 194–95 medical technology and field AI applications in, 31, 55 and crowdfunding, 157 and diagnoses, 31 future flows of, 80 interpretation services in field of, 69 and lifelogging, 250 new jobs related to automation in, 58 paperwork in, 51 personalization of, 69 and personalized pharmaceuticals, 173 and pooling patient data, 145 and tracking technology, 173, 237, 238–40, 241–42, 243–44, 250 Meerkat, 76 memory, 245–46, 249 messaging, 239–40 metadata, 258–59, 267 microphones, 221 Microsoft, 122–23, 124, 216, 247 minds, variety of, 44–46 Minecraft, 218 miniaturization, 237 Minority Report (2002), 221–22, 255 MIT Media Lab, 219, 220, 222 money, 4, 65, 119–21 monopolies, 209 mood tracking, 238 Moore’s Law, 257 movies, 77–78, 81–82, 168, 204–7 Mozilla, 151 MP3 compression, 165–66 music and musicians AI applications in, 35 creation of, 73–76, 77 and crowdfunding, 157 and free/ubiquitous copies, 66–67 and intellectual property issues, 208–9 and interactivity, 221 liquidity of, 66–67, 73–78 and live performances, 71 low-cost reproduction of, 87 of nonprofessionals, 75–76 and patronage, 72 sales of, 75 soundtracks for content, 76 total volume of recorded music, 165–66 Musk, Elon, 44 mutual surveillance (“coveillance”), 259–64 MyLifeBits, 247 Nabokov, Vladimir, 204 Napster, 66 The Narrative, 248–49, 251 National Geographic, 278 National Science Foundation, 17–18 National Security Agency (NSA), 261 Nature, 32 Negroponte, Nicholas, 16, 219 Nelson, Ted, 18–19, 21, 247 Nest smart thermostat, 253, 283 Netflix and accessibility vs. ownership, 109 and crowdsourcing programming, 160 and on-demand access, 64 and recommendation engines, 39, 154, 169 and reviews, 73, 154 and sharing economy, 138 and tracking technology, 254 Netscape browser, 15 network effect, 40 neural networks, 38–40 newbies, 10–11, 15 new media forms, 194–95 newspapers, 177 Ng, Andrew, 38, 39 niche interests, 155–56 nicknames, 263 nondestructive editing, 206 nonprofits, 157 noosphere, 292 Northwestern University, 225 numeracy, 242–43 Nupedia, 270 OBD chips, 251, 252 obscure or niche interests, 155–56 office settings, 222.

pages: 396 words: 107,814

Is That a Fish in Your Ear?: Translation and the Meaning of Everything
by David Bellos
Published 10 Oct 2011

The “translator’s invisibility,” eloquently denounced by Lawrence Venuti as a symptom of the anti-intellectual, antiforeign bias of Britain and America,1 is also the unintended result of the unbounded nature of the English language itself. The suspicion that the language of translated works is not quite the same as the language the translations purport to be in has given rise to scholarly work based not on anecdotes and intuition but on the automated analysis of quite large bodies of translated texts in machine-readable form. These techniques allow insights into what is now called the “third code”—the language of translations seen as a dialect that can be distinguished from the regular features of the target language.2 In one such investigation, it’s been found that English novels in French translation have at least one language feature that seems quite at variance with novels originally written in French.

Computer-aided human translation and human-aided computer translation are both substantial achievements, and without them the global flows of trade and information of the past few decades would not have been nearly so smooth. Until recently, they remained the preserve of language professionals. What they also did, of course, was to put huge quantities of translation products (translated texts paired with their source texts) in machine-readable form. The invention and the explosive growth of the Internet since the 1990s has made this huge corpus available for free to everyone with a terminal. And then Google stepped in. Using software built on mathematical frameworks originally developed in the 1980s by researchers at IBM, Google has created an automatic-translation tool that is unlike all others.

pages: 363 words: 105,039

Sandworm: A New Era of Cyberwar and the Hunt for the Kremlin's Most Dangerous Hackers
by Andy Greenberg
Published 5 Nov 2019

When Robinson finally cracked those layers of obfuscation after a week of trial and error, he was rewarded with a view of the BlackEnergy sample’s millions of ones and zeros—a collection of data that was, at a glance, still entirely meaningless. This was, after all, the program in its compiled form, translated into machine-readable binary rather than any human-readable programming language. To understand the binary, Robinson would have to watch it execute step-by-step on his computer, unraveling it in real time with a common reverse-engineering tool called IDA Pro that translated the function of its commands into code as they ran.

A pre-internet-era detective might start a rudimentary search for a person by consulting phone books. Matonis started digging into the online equivalent, the directory of the web’s global network known as the domain name system, or DNS. DNS servers translate human-readable domains like “facebook.com” into the machine-readable IP addresses that actually describe the location of a networked computer that runs that site or service, like 69.63.176.13. Matonis began painstakingly checking every IP address his hackers had used as a command-and-control server in the campaign of malicious Word documents he’d just uncovered, translating those domains into any IP addresses that had ever hosted them.
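The forward translation is a one-liner in most languages; a Python sketch of both directions using the standard library (results depend on the live DNS records at the time you run it):

```python
import socket

# Forward: human-readable domain to machine-readable IP address.
print(socket.gethostbyname("facebook.com"))  # e.g. 157.240.201.35

# Reverse: IP address back to a hostname, where a PTR record exists.
print(socket.gethostbyaddr("8.8.8.8")[0])    # dns.google
```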

pages: 482 words: 117,962

Exceptional People: How Migration Shaped Our World and Will Define Our Future
by Ian Goldin , Geoffrey Cameron and Meera Balarajan
Published 20 Dec 2010

Technology and Surveillance Technologies used for border control have sought to take advantage of breakthroughs in medicine and electronics. While there is a move to increase accuracy through the use of biometric and other data, there is also a greater reliance on imprecise methods, such as statistical risk analysis. Biometric data is carried on machine-readable passports or identity documents that can be scanned to compare the individual's identity with electronic databases of “watch lists” (often developed through international collaboration).115 The degree of scrutiny applied to potential migrants often relies on generalized “risk” factors based primarily on statistics and sociology.

One or more of these is increasingly required on passports and/or visas for potential migrants to European and North American countries. Following the events of 11 September 2001, the United States introduced the Visitor and Immigrant Status Indicator Technology (US-VISIT) program to use digital photos, machine-readable passports, and electronic monitoring systems at its borders. A digital photograph and inkless fingerprints are taken at the points of arrival and departure to track the entry and exit of visitors and migrants to the United States. Initially, the program was applied only to visitors who require a visa to enter the United States, but in 2004, it was extended to the 27 countries included in the U.S. visa-waiver program (most of Europe, Japan, New Zealand, Australia, and Singapore).

pages: 541 words: 109,698

Mining the Social Web: Finding Needles in the Social Haystack
by Matthew A. Russell
Published 15 Jan 2011

Example 9-2 is the canonical example from the documentation that demonstrates how to turn the IMDB’s page on The Rock into an object in the Open Graph protocol as part of an XHTML document that uses namespaces. These bits of metadata have great potential once realized at a massive scale, because they enable a URI like http://www.imdb.com/title/tt0117500 to unambiguously represent any web page—whether it’s for a person, company, product, etc.—in a machine-readable way and furthers the vision for a semantic web. Example 9-2. Sample RDFa for the Open Graph protocol

<html xmlns:og="http://ogp.me/ns#">
  <head>
    <title>The Rock (1996)</title>
    <meta property="og:title" content="The Rock" />
    <meta property="og:type" content="movie" />
    <meta property="og:url" content="http://www.imdb.com/title/tt0117500/" />
    <meta property="og:image" content="http://ia.media-imdb.com/images/rock.jpg" />
    ...

Whether you decide to dive into this topic right now, keep in mind that the data that’s available on the Web is incomplete, and that making a closed-world assumption (i.e., considering all unknown information emphatically false) will entail severe consequences sooner rather than later. Inferencing About an Open World with FuXi Foundational languages such as RDF Schema and OWL are designed so that precise vocabularies can be used to express facts such as the triple (Mr. Green, killed, Colonel Mustard) in a machine-readable way, and this is a necessary but not sufficient condition for the semantic web to be fully realized. Generally speaking, once you have a set of facts, the next step is to perform inference over the facts and draw conclusions that follow from the facts. The concept of formal inference dates back to at least ancient Greece with Aristotle’s syllogisms, and the obvious connection to how machines can take advantage of it has not gone unnoticed by researchers interested in artificial intelligence for the past 50 or so years.
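The shape of such inference is easy to show without any RDF machinery. Here is a minimal forward-chaining sketch in Python over bare triples, with a single toy rule (real systems like FuXi consume rule sets expressed in RDF vocabularies):

```python
# Toy rule: if X killed Y, then X is a suspect.
facts = {("Mr. Green", "killed", "Colonel Mustard")}

def infer(facts):
    inferred = set(facts)
    for s, p, o in facts:
        if p == "killed":
            inferred.add((s, "is", "suspect"))  # conclusion drawn from the fact
    return inferred

for triple in sorted(infer(facts)):
    print(triple)
# ('Mr. Green', 'is', 'suspect')
# ('Mr. Green', 'killed', 'Colonel Mustard')
```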

pages: 492 words: 118,882

The Blockchain Alternative: Rethinking Macroeconomic Policy and Economic Theory
by Kariappa Bheemaiah
Published 26 Feb 2017

These results are then analysed by an AI and used to create hypotheses, develop trading strategies, and automatically model investment scenarios. As the operating processes of these companies are almost completely digital in nature, firms such as FundApps have arisen to provide regulatory information via a cloud-based managed service in machine readable language. This allows these firms to adapt faster to changing regulations in the sector. We will touch upon this topic again when discussing smart contracts. Previously, portfolio management and scenario generation required the employment of senior-level executives. However, the automation of these tasks has encouraged incumbents to switch to these services, raising questions on the future landscape of the sector.

Using the data from the Blockchain, the auditing of the institution can be done in real time with a smart contract supplying the information to the auditors’ reporting instruments at predetermined periods. This allows for faster assessment, precise tax determination, quicker detection of discrepancies, and easier enforcement of new regulations. As startups like FundApps provide regulatory information in machine-readable language, the next step in this direction would be to allow a smart contract to make changes in regulation as an input, and adjust the investment and reporting procedures based on these changes. This would reduce time delays in the execution of new regulations, provide greater transparency of financial records, and allow sovereign regulatory bodies with automated capital analysis.
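In outline, the idea is that the regulation itself becomes data the reporting code consumes. A hedged Python sketch (the rule names and values are invented):

```python
# A regulation published as machine-readable data, consumed directly
# by an automated compliance check.
regulation = {"max_position_pct": 0.10, "report_every_days": 30}

def compliant(portfolio, total_assets, rules):
    """Flag any position exceeding the regulator's concentration limit."""
    limit = rules["max_position_pct"] * total_assets
    return {name: value <= limit for name, value in portfolio.items()}

print(compliant({"ACME": 9_000, "GLOBEX": 15_000}, 100_000, regulation))
# {'ACME': True, 'GLOBEX': False}
```

When the regulator publishes a new max_position_pct, the check picks it up with no code change, which is the point of making the rules machine readable.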

pages: 458 words: 116,832

The Costs of Connection: How Data Is Colonizing Human Life and Appropriating It for Capitalism
by Nick Couldry and Ulises A. Mejias
Published 19 Aug 2019

Government and commercial actors intersect around what French legal scholar Annette Rouvroy has called “data behaviourism.”70 Social knowledge becomes whatever works to enable private or public actors to modulate others’ behavior in their own interests (not disinterested social knowledge but social capture). This, in turn, affects what counts as social knowledge. Potential inputs that are not machine readable become irrelevant, since the goal is always to increase N, the aggregate of what can be counted as data. Meanwhile, data subjects are generating material for machine reading all the time. We are constantly encouraged to act in ways that stimulate further counting, gain us more followers, and achieve better analytics.71 A certain vision of social truth emerges here, expressed by Jeff Malmad, a marketing executive, who says, “The Truth will be present in everything.

Some uses of social data, perhaps by civic or public institutions, may be responsible, modest, and oriented to organizational self-reflection.114 At the same time, the normalization of data extraction as the general means to monitor the world reinforces the “desire for numbers” (provided they are machine readable). New forms of professionalism and expertise around data capture are reshaping the world of work under data colonialism.115 Before the internet, however, as Bruce Schneier notes,116 data sources about social life were limited to company customer records, responses to direct marketing, credit bureau data, and government public records (and, we might add, insurance company data on their insured).

pages: 444 words: 118,393

The Nature of Software Development: Keep It Simple, Make It Valuable, Build It Piece by Piece
by Ron Jeffries
Published 14 Aug 2015

Is it possible to create a platform that allows safe, autonomous delivery into a shared SQL database? Yes, but it requires accommodation from both developers and DBAs. In particular, the difficulty of parsing SQL to do automated sanity checking is too high. Developers and DBAs have to agree on a simpler, machine-readable format that can be scripted against. Many migration frameworks offer XML, JSON, or YAML formats that suffice. Keep in mind that the goal for the platform team is to enable their customers. The team should be trying to take themselves out of the loop on every day-to-day process and focus on building safety and performance into the platform itself.
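A hedged sketch of what such a machine-readable migration might look like, with the platform scripting against it (field names invented; real migration frameworks define their own schemas):

```python
# A migration described as data rather than raw SQL, so the platform
# can sanity-check it automatically before autonomous delivery.
migration = {
    "version": 42,
    "changes": [
        {"op": "add_column", "table": "orders", "column": "note",
         "type": "varchar(255)", "nullable": True},
    ],
}

SAFE_OPS = {"add_column", "create_table", "create_index"}

def is_safe(migration):
    """Allow autonomous delivery only for additive, low-risk changes."""
    return all(change["op"] in SAFE_OPS for change in migration["changes"])

print(is_safe(migration))  # True
```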

The following changes are always safe:
• Require a subset of the previously required parameters
• Accept a superset of the previously accepted parameters
• Return a superset of the previously returned values
• Enforce a subset of the previously required constraints on the parameters
If you have machine-readable specifications for your message formats, you should be able to verify these properties by analyzing the new specification relative to the old spec. A tough problem arises that we need to address when applying the Robustness Principle, though. There may be a gap between what we say our service accepts and what it really accepts.
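The first two properties reduce to subset and superset tests once the specs are data; a minimal Python sketch (the spec format here is invented for illustration):

```python
# Old and new message specs as plain data (format invented).
old = {"required": {"id", "amount"}, "accepted": {"id", "amount", "memo"}}
new = {"required": {"id"}, "accepted": {"id", "amount", "memo", "tag"}}

def backward_compatible(old, new):
    return (new["required"] <= old["required"]       # require a subset
            and new["accepted"] >= old["accepted"])  # accept a superset

print(backward_compatible(old, new))  # True
```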

pages: 210 words: 42,271

Programming HTML5 Applications
by Zachary Kessin
Published 9 May 2011

<!DOCTYPE html>
<html>
  <head>
    <title>Meter</title>
  </head>
  <body>
    <ul id="tree1" role="tree" tabindex="0" aria-labelledby="label_1">
      <li role="treeitem" tabindex="-1" aria-expanded="true">Fruits</li>
      <li role="group">
        <ul>
          <li role="treeitem" tabindex="-1">Oranges</li>
          <li role="treeitem" tabindex="-1">Pineapples</li>
          ...
        </ul>
      </li>
    </ul>
  </body>
</html>

Microdata Sometimes it is useful to add machine-readable data to a set of HTML tags. For example, you can use this procedure in a template to encode data into a page that can later be read by JavaScript. To enable such procedures in a standardized way, HTML5 created the concept of microdata, which can be added to HTML5 tags. Traditionally, HTML tags give data about how information should be formatted on-screen, but not about the data itself.
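Reading microdata back out is straightforward; a sketch using Python's standard-library HTML parser to collect itemprop values (the sample markup is invented):

```python
from html.parser import HTMLParser

class MicrodataParser(HTMLParser):
    """Collect machine-readable itemprop name/value pairs from HTML5."""
    def __init__(self):
        super().__init__()
        self.prop = None
        self.items = {}

    def handle_starttag(self, tag, attrs):
        self.prop = dict(attrs).get("itemprop")

    def handle_data(self, data):
        if self.prop and data.strip():
            self.items[self.prop] = data.strip()
            self.prop = None

p = MicrodataParser()
p.feed('<div itemscope><span itemprop="name">Widget</span>'
       '<span itemprop="price">9.99</span></div>')
print(p.items)  # {'name': 'Widget', 'price': '9.99'}
```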

pages: 133 words: 42,254

Big Data Analytics: Turning Big Data Into Big Money
by Frank J. Ohlhorst
Published 28 Nov 2012

Data analysis is considerably more challenging than simply locating, identifying, understanding, and citing data. For effective large-scale analysis, all of this has to happen in a completely automated manner. This requires differences in data structure and semantics to be expressed in forms that are machine readable and then computer resolvable. It may take a significant amount of work to achieve automated error-free difference resolution. The data preparation challenge even extends to analysis that uses only a single data set. Here there is still the issue of suitable database design, further complicated by the many alternative ways in which to store the information.

pages: 143 words: 43,096

Tel Aviv 2015: The Retro Travel Guide
by Claudia Stein
Published 30 Mar 2015

2.4.1 Immigration Tourists need a passport that is still valid for at least six months from the date of their departure. For tourists from many countries, there is no need to apply for a prearranged visa, and visitors are allowed to stay in the country for up to three months. Please consult with the Israeli Embassy to determine if your country participates in the Visa Waiver Program. Holders of non-machine-readable passports will always need a prearranged visa, however. If you want to stay longer in Israel, you can apply for an extension at the Israeli Ministry of Internal Affairs. Information is available on their website: http://mfa.gov.il/mfa/Pages/default.aspx 2.4.2 Departure Please make sure you are at the airport at least 3 hours before departure.

pages: 481 words: 121,669

The Invisible Web: Uncovering Information Sources Search Engines Can't See
by Gary Price , Chris Sherman and Danny Sullivan
Published 2 Jan 2003

With so much attention being paid to both the visible and Invisible Web these days it is important to remember that a massive amount of material is not accessible on the Web, via the Web, or in any electronic format. It only exists in its original format or some other offline archive format like microfilm or microfiche. Will everything in print ultimately be digitized and made available online? Not likely. The expense of converting materials from printed text to machine-readable format is often prohibitive. Much of what we have on the Web we owe to the generosity of venture capitalists willing to back experimental ventures in delivering online content. High-quality offline content will continue to migrate to the Web, but likely at a much slower pace than we’ve seen in the past. CHAPTER 8 The Future: Revealing the Invisible Web The future Invisible Web will be both larger and smaller than today’s Invisible Web.

A primitive form of metadata (“keywords” and “description” meta tags) has been recognized since 1996, but it has been so widely abused by spammers that most search engines either ignore it or use it only peripherally in calculating relevance for a document. Proposals for metadata standards abound. The standard that seems most likely to achieve something close to universal adoption is RDF (Resource Description Framework), which uses the syntax of XML (Extensible Markup Language). The goal of all metadata standards proposals is to go beyond machine-readable data and create machineunderstandable data on the Web. Among other things, they provide the capability to introduce controlled vocabulary (often organized in thesaurus form) into the search equation. A controlled vocabulary can bring different terms, jargon, and concepts together. Though the standard will provide a structure for describing, classifying, and managing Web documents, it has its own set of vulnerabilities, and not everyone is sanguine about its prospects.
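A controlled vocabulary is, at its simplest, a mapping from variant terms onto one preferred term, as in this Python sketch (toy thesaurus):

```python
# Variant terms and jargon collapse onto a single preferred term, so
# differently worded documents and queries can meet in one search.
THESAURUS = {"automobile": "car", "auto": "car", "motorcar": "car", "cars": "car"}

def normalize(query_terms):
    return [THESAURUS.get(t.lower(), t.lower()) for t in query_terms]

print(normalize(["Automobile", "repair"]))  # ['car', 'repair']
```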

pages: 459 words: 123,220

Our Kids: The American Dream in Crisis
by Robert D. Putnam
Published 10 Mar 2015

From 1970 to 2010, the percentage of black families in Atlanta subsisting on less than $25,000 (in inflation-adjusted 2010 dollars) barely changed, slipping from 31 percent to 30 percent, whereas the percentage of black families with incomes over $100,000 more than doubled, rising from 6 percent to 13 percent. Data are from the author’s analysis of Steven Ruggles, J. Trent Alexander, Katie Genadek, Ronald Goeken, Matthew B. Schroeder, and Matthew Sobek, “Integrated Public Use Microdata Series: Version 5.0 [Machine-readable database],” (Minneapolis: University of Minnesota, 2010). 8. Raj Chetty, Nathaniel Hendren, Patrick Kline, and Emmanuel Saez, “Where Is the Land of Opportunity? The Geography of Intergenerational Mobility in the United States,” NBER Working Paper No. 19843 (Cambridge: National Bureau of Economic Research, January 2014). 9.

“Street Gangs in Santa Ana, CA,” Streetgangs.com, accessed June 16, 2014, http://www.streetgangs.com/cities/santaana#sthash.rnESeLn4.dpbs. 5. U.S. Census Bureau, from Steven Ruggles, J. Trent Alexander, Katie Genadek, Ronald Goeken, Matthew B. Schroeder, and Matthew Sobek. Integrated Public Use Microdata Series: Version 5.0 [Machine-readable database] (Minneapolis: University of Minnesota, 2010). 6. All members of these two families are American citizens and all the children are native-born. Undocumented immigrants and their children obviously face additional challenges. 7. Fermin Leal and Scott Martindale, “OC’s Best Public High Schools, 2012,” Orange County Register, May 25, 2014, database accessed February 24, 2014, http://www.ocregister.com/articles/high-331705-college-schools.html?

pages: 165 words: 45,397

Speculative Everything: Design, Fiction, and Social Dreaming
by Anthony Dunne and Fiona Raby
Published 22 Nov 2013

It is more about the positive use of negativity, not negativity for its own sake but to draw attention to a scary possibility in the form of a cautionary tale. A good example of this is Bernd Hopfengaertner's Belief Systems (2009). Hopfengaertner asks what would happen if one of the tech industry's many dreams comes true, if all the research being done by separate companies into making humans machine readable were to combine and move from laboratory to everyday life: combined algorithms and camera systems that can read emotions from faces, gait, and demeanor; neurotechnologies that cannot exactly read minds but can make a good guess at what people are thinking; profiling software that tracks and traces our every click and purchase; and so on.

pages: 168 words: 49,067

Becoming Data Literate: Building a great business, culture and leadership through data and analytics
by David Reed
Published 31 Aug 2021

It is also the case that not every graduate wants to work in those global or start-up environments. At one law firm, part of its pitch to recruit PhDs has been the challenge of transforming the huge amounts of data it holds on dispute resolution in paper and electronic documents into codified and machine-readable data sets that can be used to automate processes. Major organisations tend to have well-established graduate programmes. Recently, some of these have adapted to new skills requirements in the data department, such as the data science graduate scheme at a global bank which is now in its second year and offers a combination of experience across customer science, ML and data engineering.

pages: 190 words: 53,970

Eastern standard tribe
by Cory Doctorow
Published 17 Feb 2004

He's a journalist, editorialist and blogger. Boing Boing (boingboing.net), the weblog he co-edits, is the most linked-to blog on the Net, according to Technorati. He won the John W. Campbell Award for Best New Writer at the 2000 Hugos. You can download this book for free from craphound.com/est. -- * * * Machine-readable metadata: * * *

<rdf:RDF xmlns="http://web.resource.org/cc/"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <Work rdf:about="http://craphound.com/est">
    <dc:title>Eastern Standard Tribe</dc:title>
    <dc:date>2004-2-9</dc:date>
    <dc:description>A novel by Cory Doctorow</dc:description>
    <dc:creator><Agent>
      <dc:title>Cory Doctorow</dc:title>
    </Agent></dc:creator>
    <dc:rights><Agent>
      <dc:title>Cory Doctorow</dc:title>
    </Agent></dc:rights>
    <dc:type rdf:resource="http://purl.org/dc/dcmitype/Text" />
    <license rdf:resource="http://creativecommons.org/licenses/by-nd-nc/1.0" />
  </Work>
  <License rdf:about="http://creativecommons.org/licenses/by-nd-nc/1.0">
    <requires rdf:resource="http://web.resource.org/cc/Attribution" />
    <permits rdf:resource="http://web.resource.org/cc/Reproduction" />
    <permits rdf:resource="http://web.resource.org/cc/Distribution" />
    <prohibits rdf:resource="http://web.resource.org/cc/CommercialUse" />
    <requires rdf:resource="http://web.resource.org/cc/Notice" />
  </License>
</rdf:RDF>

eof

pages: 234 words: 57,267

Python Network Programming Cookbook
by M. Omar Faruque Sarker
Published 15 Feb 2014

Working with Web Services – XML-RPC, SOAP, and REST In this chapter, we will cover the following recipes: Querying a local XML-RPC server Writing a multithreaded, multicall XML-RPC server Running an XML-RPC server with a basic HTTP authentication Collecting some photo information from Flickr using REST Searching for SOAP methods from an Amazon S3 web service Searching Google for custom information Searching Amazon for books through product search API Introduction This chapter presents some interesting Python recipes on web services using three different approaches, namely, XML Remote Procedure Call (XML-RPC), Simple Object Access Protocol (SOAP), and Representational State Transfer (REST). The idea behind web services is to enable interaction between two software components over the Web through a carefully designed protocol, with an interface that is machine readable. Various protocols facilitate web services; here, we bring examples from three commonly used ones. XML-RPC uses HTTP as the transport medium, and communication is done using XML content. A server that implements XML-RPC waits for a call from a suitable client. The client calls that server to execute remote procedures with different parameters.
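Python's standard library covers both ends of that exchange; a minimal sketch (server and client would normally run in separate processes, so the blocking calls are left commented):

```python
# Server side: expose a procedure over XML-RPC.
from xmlrpc.server import SimpleXMLRPCServer

def is_even(n):
    return n % 2 == 0

server = SimpleXMLRPCServer(("localhost", 8000), logRequests=False)
server.register_function(is_even, "is_even")
# server.serve_forever()  # uncomment to actually serve requests

# Client side: the call and its parameters travel as XML over HTTP.
import xmlrpc.client

proxy = xmlrpc.client.ServerProxy("http://localhost:8000/")
# print(proxy.is_even(4))  # True, once the server is running
```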

Lonely Planet Pocket San Francisco
by Lonely Planet and Alison Bing
Published 31 Aug 2012

Visas Top Tip Check the US Department of State (http://travel.state.gov/visa) for updates and details on the following requirements. Canadians Proof of identity and citizenship required. Visa Waiver Program (VWP) Allows nationals from 36 countries to enter the US without a visa. It requires a machine-readable passport issued after November 2006. Citizens of VWP countries need to register with the US Department of Homeland Security (http://esta.cbp.dhs.gov/) three days before their visit. There is a $14 fee for registration application; when approved, the registration is valid for two years. Visa required For anyone staying longer than 90 days, or with plans to work or study in the US.

pages: 170 words: 51,205

Information Doesn't Want to Be Free: Laws for the Internet Age
by Cory Doctorow , Amanda Palmer and Neil Gaiman
Published 18 Nov 2014

Some rightsholders make the argument that even this isn’t nearly enough: in lobbying for SOPA, industry representatives argued that they also needed the right to censor DNS records, as well as a ban on tools that might defeat any of this censorship. DNS, remember, is the service that converts human-friendly Internet addresses (like thepiratebay.se) into machine-readable numeric addresses (like 194.71.107.50). You can think of this as being akin to the way your GPS works. You tell your GPS you want to go to 1600 Pennsylvania Avenue, and it converts that to a latitude and longitude like 38.89859, -77.035971, and promptly supplies driving directions to the White House.

pages: 157 words: 53,125

The Fifth Risk
by Michael Lewis
Published 1 Oct 2018

“There’s so much gold in there. People just don’t know how to get to it.” DJ Patil had gone to Washington in 2014 to help people find that gold. He was the human expression of an executive order Obama had signed the year before, insisting that all unclassified government data be made publicly available and that it be machine-readable. DJ assumed he’d need to leave when the man who hired him left office, so that gave him just two years. “We did not have time to collect new data,” he said. “We were just trying to open up what we had.” He set out to make as many connections as possible between the information and the people who could make new sense of it—to encourage them to use the data in novel and interesting ways.

pages: 174 words: 56,405

Machine Translation
by Thierry Poibeau
Published 14 Sep 2017

The problem is not to meet some nonexistent need through nonexistent machine translation.” [Alpac Report, 1966] The report then addressed the more general question of funding machine translation. The report began with a fairly standard definition: machine translation “presumably means going by algorithm from machine-readable source text to useful target text, without recourse to human translation or editing.” The report immediately concluded that no type of automated system existed at the time of drafting the report and that no such system was conceivable in the near future.4 Georgetown’s system was specifically mentioned: after eight years of funding, the system was still unable to produce a proper translation.

pages: 443 words: 51,804

Handbook of Modeling High-Frequency Data in Finance
by Frederi G. Viens , Maria C. Mariani and Ionut Florescu
Published 20 Dec 2011

Prices are determined by the aggregation of traders’ positions. Bornholdt (2001) modified this model, introducing an antiferromagnetic coupling between the global magnetization and each spin, as well as a ferromagnetic coupling between the local neighborhood and each spin. In recent years, financial data providers such as Reuters and Bloomberg have been offering machine-readable news that can be used by trading systems. In this line of research, Seo et al. (2004) and Decker et al. (1996) describe a multiagent portfolio management system that automatically classifies financial news. Thomas (2003) combines news classification with technical analysis indicators in order to generate new trading rules.

See also Generalized autoregressive conditionally heteroskedastic (GARCH) methodology LP spaces, 386 432 LRT failure, 196. See also Likelihood ratio test (LRT) LRT p-values, 192–195 Lunch-time trader activity, 42 Machine learning methods, 48, 64–65 calibration of, 68 Machine learning perspective, 62 Machine-readable news, 64 Major financial events observations centered on, 107 probability curves for, 108 Mancino, Maria Elvira, xiv, 243 Marginal utility function, 299 Mariani, Maria C., xiv, 347, 383 Market capitalization index, 128 Market completeness assumption, 302 Market complexity, modeling of, 99 Market crash, 346 2008, 136 Market index (indices) exponents calculated for, 345 squared returns of, 220 technique for producing, 110 Market index decrease, spread and, 105 Market inefficiencies, for small-space and mid-volume classes, 44 Market microstructure effects, 263 Market microstructure, effects on Fourier estimator, 245 Market microstructure contaminations, 273 Market microstructure model, of ultra high frequency trading, 235–242 Market model, 296–297 Market movement, indicators of, 110 Market reaction, to abnormal price movements, 45 Market-traded option prices, 219 Markov chain, stochastic volatility process with, 401 Markowitz-type optimization, 286 Martingale-difference process, 178.

pages: 717 words: 150,288

Cities Under Siege: The New Military Urbanism
by Stephen Graham
Published 30 Oct 2009

Castles, walled cities, and extensive border battlements have been replaced by gated communities, expansive border zones, and management by “remote control”’.46 The point here is simple: if contemporary power in the cities of both ‘homeland’ and ‘war zone’ is about attempting to separate the spaces, zones, privileges and mobility of the risk-free (who need protection) from risky surrounding populations and infiltrations, then the only possible way to do this is pre-emptively, digitally and with a high degree of technological automation. As a result, militarized targeting becomes crucial, and the software algorithms that continually police the ‘data-sphere’ of machine-readable information, searching for potentially hazardous behaviours, circulations, people, or presences, assume political and sovereign power. This process ‘reinscribe[s] the imaginative geography of the deviant, atypical, abnormal “other” inside the spaces of daily life’, writes Amoore.47 Here, in an intensification of the logic of militarized control, imagined enmity enters the code which drives computerized simulations of normality, threat, and securocratic war.

Thus, ‘national security, at least in the ports, is conceptualized as almost interchangeable with the security of international trade flows’.171 GLOBAL BIOMETRIC REGIME The globe shrinks for those who own it; for the displaced or the dispossessed, the migrant or refugee, no distance is more awesome than the few feet across borders or frontiers.172 In the airline and airport sectors, US homeland security efforts are meant to ensure that the ‘border guard [is] the last line of defense, not the first, in identifying potential threats’.173 The dream system features interoperable ‘smart’ borders, globalized border control, and pre-emptive risk management.174 To this end, the US has developed the US-VISIT programme – US Visitor and Immigrant Status Indicator Technology – for air travel, another application of biometric attempts to ‘objectively’ fix bodies and identities while coercing key US partner nations to adjust their passport systems to biometric standards defined by the US.175 In the Enhanced Border Security and Visa Act of 2002, for example, the US Congress imposed a requirement that the twenty-seven countries within the US Visa Waiver Program (VWP) begin using machine-readable passports that incorporate both biometric and radio-frequency tag (RFID) technology. Nations or blocs that fail to undertake these radical shifts are threatened with losing their coveted status within the VWP. ‘Our leveraging of America’s visa waiver partners, in order to promote the use of the new ID technologies for purposes of national security’, Richard Pruett and Michael Longarzo of the US Army War College write, ‘may prove to be a paradigm for the coming age’.176 The passage-point architectures of overseas airports thus now display symbols of both US and domestic sovereignty (Figure 4.20). 4.20 The ‘global homeland’ orchestrated through the extension of US sovereignty as part of the US visit initiative: Frankfurt airport, Germany.

pages: 189 words: 57,632

Content: Selected Essays on Technology, Creativity, Copyright, and the Future of the Future
by Cory Doctorow
Published 15 Sep 2008

A typical scenario goes like this: a number of suppliers get together and agree on a metadata standard — a Document Type Definition or scheme — for a given subject area, say washing machines. They agree to a common vocabulary for describing washing machines: size, capacity, energy consumption, water consumption, price. They create machine-readable databases of their inventory, which are available in whole or part to search agents and other databases, so that a consumer can enter the parameters of the washing machine he's seeking and query multiple sites simultaneously for an exhaustive list of the available washing machines that meet his criteria.
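The scenario reduces to a shared schema plus a filter, as in this Python sketch (supplier names and fields invented to match the washing-machine example):

```python
# Each supplier publishes inventory against the agreed vocabulary.
SUPPLIERS = {
    "acme":   [{"model": "A1", "capacity_kg": 7, "price": 399}],
    "globex": [{"model": "G9", "capacity_kg": 8, "price": 449},
               {"model": "G3", "capacity_kg": 5, "price": 299}],
}

def search(min_capacity, max_price):
    """One query fans out across every supplier's machine-readable list."""
    return [(s, m["model"])
            for s, machines in SUPPLIERS.items()
            for m in machines
            if m["capacity_kg"] >= min_capacity and m["price"] <= max_price]

print(search(min_capacity=7, max_price=450))  # [('acme', 'A1'), ('globex', 'G9')]
```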

pages: 196 words: 55,862

Riding for Deliveroo: Resistance in the New Economy
by Callum Cant
Published 11 Nov 2019

A different worker, artificially separated from the shop floor, would convert the information collected from the action of the machinist into a blueprint program for the machine via a process of calculation. These ‘part programmers’ would have as their sole job the conversion of shop-floor information into generic blueprints, a task that required quite limited skill. Another worker, often a woman on half the pay of the part programmer and the machinist, then typed up these blueprints in machine-readable form on paper tape using a coding machine. This tape was then sent to the shop floor and fed into the machine tool by the machinist, who then supervised its operation. The cooperation of these three workers eliminated the need for a skilled machinist, reducing the power of workers in the process.

Western USA
by Lonely Planet

For details, check www.getyouhome.gov. » All foreign passports must meet current US standards and be valid for at least six months longer than your intended stay. » MRP passports issued or renewed after October 26, 2006 must be e-passports (ie have a digital photo and integrated chip with biometric data). If your passport was issued before October 26, 2005, it must be ‘machine readable’ (with two lines of letters, numbers and <<< at the bottom); if it was issued between October 26, 2005, and October 25, 2006, it must be machine readable and include a digital photo. » For more information, consult www.cbp.gov/travel. CLIMATE CHANGE & TRAVEL Every form of transport that relies on carbon-based fuel generates CO2, the main cause of human-induced climate change.

For more information on visa requirements for visiting the USA, including the Electronic System for Travel Authorization (ESTA) now required before arrival for citizens of Visa Waiver Program (VWP) countries, Click here. Passports » Under the Western Hemisphere Travel Initiative (WHTI), all travelers must have a valid machine-readable (MRP) passport when entering the USA by air, land or sea. » The only exceptions are for most US citizens and some Canadian and Mexican citizens traveling by land who can present other WHTI-compliant documents (eg pre-approved ‘trusted traveler’ cards). For details, check www.getyouhome.gov. » All foreign passports must meet current US standards and be valid for at least six months longer than your intended stay. » MRP passports issued or renewed after October 26, 2006 must be e-passports (ie have a digital photo and integrated chip with biometric data).

pages: 162 words: 61,105

Eyewitness Top 10 Los Angeles
by Catherine Gerber
Published 29 Mar 2010

What to Pack
Californians dress casually, but LA can be chilly in winter, and even in summer you’ll need a jacket or sweater in the evenings, especially near the coast. Sunglasses and hats are must-haves, and are easily available anywhere in LA.

Visas & Passports
Citizens of the UK, Canada, Australia, New Zealand, Ireland, and several other countries need a valid machine-readable passport to visit the US for a period of up to 90 days. US citizens need a passport for re-entry. If you arrive by air or sea, you must present a round-trip ticket. Nationals of some countries must also complete an Electronic System for Travel Authorization (ESTA) application prior to travel.

pages: 144 words: 55,142

Interlibrary Loan Practices Handbook
by Cherie L. Weible and Karen L. Janke
Published 15 Apr 2011

See IFLA (International Federation of Library Associations and Institutions) international lending custom e-mail requests in ILLiad, 96 management of charges for, 14, 30, 45–46, 75 when domestic sources are exhausted, 23 International Lending and Document Delivery, 14 international library cooperation, 14 international locations, shipping to, 43–44 international publications, web-based finding aids for, 102 Internet document delivery by, 10 full-text documents on, 111–112 Internet Archives, 103 invoicing and billing charges, 29–30, 45–46 ISO ILL Protocol, 10, 13 L labels, shipping, 43 labels for borrowed materials, 26, 81 lenders, selection of, 24 lending libraries identification of, 19–23 selection of, 24 lending policies, 74–75 lending workflow, 37–48 policies, 37–39 procedures, 39–48 statistics, 48–49 See also workflow Liblicense Model License Agreement (LMLA), 58 LibQUAL+ surveys, 82 Libraries Very Interested in Sharing (LVIS), 34, 46 Library of Congress as de facto national library, 3 digital information in, 103–104 library website, use of, 74–75, 104 licensed electronic resources and copyright restrictions, 57–58 procedures for verification, 41–42 limits on requests, 18, 38 limits on services, policies on, 73 LMLA (Liblicense Model License Agreement), 58 load leveling, 4, 6 loan periods, 7, 18, 39, 75 Loansome Doc service, 99 Local Holdings Records (LHRs), use of, 87, 89 locally available materials and fill rate, 78 lost or damaged items policies on, 18, 48 removal from WorldCat records, 86 LVIS (Libraries Very Interested in Sharing), 34, 46 M mailroom practices, 25–26 management of ILL, 69–92 assessment, 77–82 environment, 70–71 policy development, 71–75 staffing and human resources, 82–85 working with other library units, 85–90 MARC (MAchine-Readable Cataloging), development of, 3, 8 materials lent, policies, 38–39 McNaughton Books service, 111, 113 microfilming for preservation, 6, 8, 10 Microsoft Office Document Image Writer, 98 missing items, rechecking for, 80, 88 mission of library and ILL policy, 72 Model Interlibrary Loan Code, 7, 9 modems, invention of, 10–11 monasteries as copy machines, 2 multitype library cooperatives, 8, 70 MyMorph file conversion, 98 N National Agricultural Library (NAL), 3 National Archives and Records Administration (NARA), 104 National Commission on Libraries and Information Science (NCLIS), 9 National Commission on New Technological Uses of Copyrighted Works (CONTU), 11, 53–56 index National Interlibrary Loan Code (1968), 7 National Interlibrary Loan Code (1980 revision), 9 National Interlibrary Loan Code (1993 revision), 9 national libraries, development of, 2–3 National Library of Medicine (NLM) DOCLINE service, 21, 99 file conversion options, 98 origin of, 3 National Union Catalog of Manuscript Collections, 21 National Union Catalog of Pre-1956 Imprints, 7 NCLIS (National Commission on Libraries and Information Science), 9 NDLTD (Networked Digital Library of Theses and Dissertations), 103 “need by” date, 19 Netflix, 111, 113 Networked Digital Library of Theses and Dissertations (NDLTD), 103 New England Depository Library (NEDL), 6 New England Library and Information Network (NELINET), 8 NISO ILL standard and request forms, 19 SERU (Shared E-Resource Understanding), 58 NLM.

Digital Accounting: The Effects of the Internet and Erp on Accounting
by Ashutosh Deshmukh
Published 13 Dec 2005

For example, descriptive markup <para> tells us that either the following item is a paragraph or it is the end of the previous paragraph. These markups can also be used for presentation of content in different data formats such as HTML, Portable Document Format (PDF), relational data tables and so forth. Additionally, descriptive markups are human and machine-readable and are in the public domain; some procedural markups also are human readable. However, human readability does not ensure complete understanding of the markups — descriptive or procedural; some markups may make sense only to machines. Descriptive markups form the basis of markup languages. As Charles Goldfarb, one of the original inventors of markup languages, said: “Markups should describe a document’s structure and other attributes and should completely divorce structure from appearance while facilitating indexing and generation of selective views.”
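
To make the divorce of structure from appearance concrete, here is a small sketch (mine, not from the book) in Python: a renderer that knows nothing about the document's appearance turns descriptive <para> markup into one possible presentation:

import xml.etree.ElementTree as ET

doc = "<report><para>Revenue rose in Q1.</para><para>Costs fell.</para></report>"

# The markup only describes structure; this program chooses the presentation.
root = ET.fromstring(doc)
html = "".join(f"<p>{para.text}</p>" for para in root.iter("para"))
print(html)  # -> <p>Revenue rose in Q1.</p><p>Costs fell.</p>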

• Label integrity checking (e.g. maintenance of sensitivity labels when data is exported).
• Auditing of labeled objects.
• Mandatory access control for all operations.
• Ability to specify security level printed on human-readable output (e.g. printers).
• Ability to specify security level on any machine-readable output.
• Enhanced auditing.
• Enhanced protection of Operating System.
• Improved documentation.

Structured Protection
• Notification of security level changes affecting interactive users.
• Hierarchical device labels.
• Mandatory access over all objects and devices.
• Trusted path communications between user and system.
• Tracking down of covert storage channels.
• Tighter system operations mode into multilevel independent units.
• Covert channel analysis.
• Improved security testing.
• Formal models of TCB.
• Version, update and patch analysis and auditing.

pages: 725 words: 168,262

API Design Patterns
by Jj Geewax
Published 19 Jul 2021

For instance, if the response to GetOperation yields the error of the underlying operation, how can we tell the difference between a 500 Internal Server Error that was the result of the operation and the same error that is actually just the fault of the code that handles the GetOperation method? Clearly this means we’ll need an alternative. As we saw previously, the way we’ll handle this involves allowing either a result (of the indicated ResultT type) or an OperationError type, which includes a machine-readable code, a friendly description for human consumption, and an arbitrary structure for storing additional error details. This means that the GetOperation method should only throw an exception (by returning an HTTP error) if there was an actual error in retrieving the Operation resource. If the Operation resource is retrieved successfully, the error response is simply a part of the result of the LRO.

As we’ll learn in chapter 24, this should always be avoided at all costs, and the best way to do that is to have error codes that are far more useful in code than error messages themselves. In the cases where users need not only to know the error type, but also some additional information about the error, the details field is the perfect place to put this structured, machine-readable information. While this additional information is not strictly prohibited from being in the error message, it’s far more useful in the error details and is certainly required. In general, if the error message was missing, we should not lose any information that can’t be looked up in the API documentation.
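
A sketch of the shape being described, rendered in Python; the ResultT/OperationError names follow the text, but the exact field layout here is my assumption rather than the book's definition:

from dataclasses import dataclass, field
from typing import Any, Generic, Optional, TypeVar

ResultT = TypeVar("ResultT")

@dataclass
class OperationError:
    code: str       # machine-readable, e.g. "RESOURCE_EXHAUSTED"
    message: str    # friendly description for human consumption
    details: dict[str, Any] = field(default_factory=dict)  # structured extras

@dataclass
class Operation(Generic[ResultT]):
    id: str
    done: bool = False
    result: Optional[ResultT] = None        # populated on success...
    error: Optional[OperationError] = None  # ...or this on failure

# GetOperation returns this body with HTTP 200 even when `error` is set;
# an HTTP error is reserved for failures of the retrieval itself.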

pages: 265 words: 69,310

What's Yours Is Mine: Against the Sharing Economy
by Tom Slee
Published 18 Nov 2015

The additional information and an additional communications channel thus has the effect of reinforcing patterns of opportunity that are already there rather than widening the base of participation and influence.45 Kentaro Toyama, an expert in the use of information technology for development, argues that “in contexts where literacy and social capital are unevenly distributed, technology tends to amplify inequalities rather than reduce them.” 46 An email account cannot make you more connected unless you have some existing social network to build on. Development studies scholar Kevin Donovan sees similarities between open data efforts and James Scott’s book Seeing Like a State.47 Open standards and structured, machine-readable data are key parts of the open data program, and for Donovan this formalization and standardization is “far more value-laden than typically considered.” Open data programs, like the state, seek to “make society legible through simplification.” Standardized data, like the state, “operate[s] over a multitude of communities and attempt[s] to eliminate cultural norms through standardization.”

pages: 243 words: 65,374

How We Got to Now: Six Innovations That Made the Modern World
by Steven Johnson
Published 28 Sep 2014

Eventually, as we have already seen, laser technology would prove crucial to digital communications, thanks to its role in fiber optics. But the laser’s first critical application would appear at the checkout counter, with the emergence of bar-code scanners in the mid-1970s. The idea of creating some kind of machine-readable code to identify products and prices had been floating around for nearly half a century. Inspired by the dashes and dots of Morse code, an inventor named Norman Joseph Woodland designed a visual code that resembled a bull’s-eye in the 1950s, but it required a five-hundred-watt bulb—almost ten times brighter than your average lightbulb—to read the code, and even then it wasn’t very accurate.

pages: 430 words: 68,225

Blockchain Basics: A Non-Technical Introduction in 25 Steps
by Daniel Drescher
Published 16 Mar 2017

It turns out that transactions are actually tiny self-contained contracts. They contain all of the necessary information to make a transfer of ownership happen. That insight led to the development of smart contracts that are executed by the blockchain. Similar to transaction data, smart contracts are machine-readable descriptions of the will of the involved parties. But unlike simple transaction data, smart contracts are much more flexible regarding the objects, subjects, actions, and conditions that can be used to describe the desired transfer of ownership. From a technical point of view, smart contracts are self-contained computer programs written in a blockchain-specific programming language.
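
As a toy illustration (mine, in Python rather than a real blockchain language such as Solidity), the "will of the involved parties" can be reduced to data plus a self-executing condition:

from dataclasses import dataclass

@dataclass
class EscrowContract:
    """Toy stand-in for a smart contract: ownership of an asset moves
    to the buyer only once the agreed price has been paid in full."""
    seller: str
    buyer: str
    asset: str
    price: int
    paid: int = 0

    def pay(self, amount: int) -> None:
        self.paid += amount

    def owner(self) -> str:
        # The transfer condition is machine-readable and self-executing:
        # no third party has to interpret the parties' intent.
        return self.buyer if self.paid >= self.price else self.seller

contract = EscrowContract("alice", "bob", "car-123", price=100)
contract.pay(100)
print(contract.owner())  # -> bob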

pages: 296 words: 66,815

The AI-First Company
by Ash Fontana
Published 4 May 2021

GLOSSARY A/B TEST: testing for user preferences by randomly showing two different variants of a product (i.e., variants A and B) to different groups of users; also known as a split test ACYCLIC: jumping between points rather than going through points in the same pattern each time AGENT-BASED MODEL: model that generates the actions of agents and interactions with other agents given the agent’s properties, incentives, and environmental constraints AGGREGATION THEORY: the theory that new entrants in a market can aggregate existing quantities in that market; for example, data points, to create new and valuable products APPLICATION PROGRAMMING INTERFACE: a set of functions that allows applications to communicate with other applications, either to use a feature or fetch data; effectively, a structured way for software to communicate with other pieces of software AREA UNDERNEATH THE CURVE (AUC): the integral of the ROC curve BLOCKCHAIN: decentralized and distributed public ledger of transactions CLUSTERING TOOL: using unsupervised machine learning to group similar objects COMPLEMENTARY DATA: new data that increases the value of existing data CONCAVE PAYOFF: decreasing dividends from using a product CONCEPT DRIFT: when the idea behind the subject of a prediction changes based on observations CONSUMER APP: software application primarily used by individuals (rather than businesses) CONTRIBUTION MARGIN: average price per unit minus labor and quality control costs associated with that unit CONVEX PAYOFF: increasing dividends from using a product COST LEADERSHIP: a form of competitive advantage that comes from having the lowest cost of production with respect to competitors in a given industry CRYPTOGRAPHY: writing and solving codes CRYPTO TOKEN: representation of an asset that is kept on a blockchain CUSTOMER RELATIONSHIP MANAGEMENT SOFTWARE: software that stores and manipulates data about customers CUSTOMER SUPPORT AGENT: employee that is paid to respond to customer support tickets CUSTOMER SUPPORT TICKET: message from user of a product requesting help in using that product CYBERNETICS: the science of control and communication in machines and living things DATA: facts and statistics collected together for reference or analysis DATA ANALYST: person who sets up dashboards, visualizes data, and interprets model outputs DATA DRIFT: (1) when the distribution on which a prediction is based changes such that it no longer represents observed reality; or (2) when the data on which a prediction is based changes such that some of it is no longer available or properly formed DATA ENGINEER: person who cleans data, creates automated data management tools, maintains the data catalogue, consolidates data assets, incorporates new data sources, maintains data pipelines, sets up links to external data sources, and more DATA EXHAUST: data collected when users perform operations in an application, for example clicking buttons and changing values DATA INFRASTRUCTURE ENGINEER: person who chooses the right database, sets up databases, moves data between them, and manages infrastructure cost and more DATA LABELING: adding a piece of information to a piece of data DATA LEARNING EFFECT: the automatic compounding of information DATA LEARNING LOOP: the endogenous and continuous generation of proprietary data from an intelligent system that provides the basis of the next generation of that intelligent system DATA NETWORK: set of data that is built by a group of otherwise unrelated entities, rather than a single entity DATA 
NETWORK EFFECT: the increase in marginal benefit that comes from adding a new data point to an existing collection of data; the marginal benefit is defined in terms of informational value DATA PRODUCT MANAGER: person who incorporates the data needs of the model with the usability intentions of the product designers and preferences of users in order to prioritize product features that collect proprietary data DATA SCIENTIST: person who sets up and runs data science experiments DATA STEWARD: person responsible for ensuring compliance to data storage standards DEEP LEARNING: artificial neural network with multiple layers DEFENSIBILITY: the relative ability to protect a source of income; for example, an income-generating asset DIFFERENTIAL PRIVACY: system for sharing datasets without revealing the individual data points DIMENSIONALITY REDUCTION: transforming data (using a method such as principal component analysis) to reduce the measures associated with each data point DISRUPTION THEORY: the theory that new entrants in a market can appropriate customers from incumbent suppliers by selling a specialized product to a niche segment of customers at a fraction of the cost of the incumbent’s product DRIFT: when a model’s concept or data diverges from reality EDGE: connections between nodes; also called a link or a line ENTERPRISE RESOURCE PLANNING PRODUCT: software that collects and thus contains data about product inventory ENTRY-LEVEL DATA NETWORK EFFECT: the compounding marginal benefit that comes from adding new data to an existing collection of data; the marginal benefit is defined in terms of informational value to the model computed over that data EPOCH: completed pass of the entire training dataset by the machine learning model ETL (EXTRACT, TRANSFORM, AND LOAD): the three, main steps in moving data from one place to another EXAMPLE: a single input and output pair from which a machine learning model learns FEATURE: set of mathematical functions that are fed data to output a prediction FEDERATED LEARNING: method for training machine learning models across different computers without exchanging data between those computers FIRST-MOVER: company that collects scarce assets, builds technological leadership, and creates switching costs in a market by virtue of entering that market before other companies GAUSSIAN MIXTURE MODEL: probabilistic model representing a subset within a set, assuming a normal distribution, without requiring the observed data match the subset GLOBAL, MULTIUSER MODEL: model that makes predictions about something common to all customers of a given company; this is generally trained on data aggregated across all customers GRADIENT BOOSTED TREE: method for producing a decision tree in multiple stages according to a loss function GRAPH: mathematical structure made up of nodes and edges that is typically used to model interactions between objects HEURISTICS: knowledge acquired by doing HISTOGRAM: diagram consisting of rectangles whose area is proportional to the frequency of a variable and whose width is equal to the class interval HORIZONTAL INTEGRATION: the combination in one product of multiple industry-specific functions normally operated by separate products HUMAN-IN-THE-LOOP SYSTEM: machine learning system that requires human input to generate an output HYPERPARAMETER: parameter that is used to control the machine learning model HYPERTEXT MARKUP LANGUAGE: programming language specifically for writing documents to be displayed in a web browser INCUMBENT: existing market 
leader INDEPENDENT SOFTWARE VENDOR: a company that publishes software INFORMATION: data that resolves uncertainty for the receiver INSOURCING: finding the resources to complete a task within an existing organization such that it’s not necessary to contract new resources to complete that task INTEGRATOR: software company that builds tools to connect data sources, normalizes data across sources, and updates connections as these sources change INTELLIGENT APPLICATION: application that runs machine learning algorithms over data to make predictions or decisions INTERACTIVE MACHINE LEARNING: machine learning models that collect data from a user, put that data into a model, present the model output back to the user, and so on K-MEANS: unsupervised machine learning method to group objects in a number of clusters based on the cluster with the center point, or average, that’s closest to the object LABEL: the output of a machine learning system based on learning from examples LAYER: aggregation of neurons; layers can be connected to other layers LEAN AI: the process of building a small but complete and accurate AI to solve a specific problem LEARNING EFFECT: the process through which knowledge accumulation leads to an economic benefit LEGACY APPLICATION: application already in use LOSS: the quantum of how right or wrong a model was in making a given prediction LOSS FUNCTION: mathematical function that determines the degree to which the output of a model is incorrect MACHINE LEARNING: computable algorithms that automatically improve with the addition of novel data MACHINE LEARNING ENGINEER: person who implements, trains, monitors, and fixes machine learning models MACHINE LEARNING MANAGEMENT LOOP: automated system for continuous incorporation of real-world data into machine learning models MACHINE LEARNING RESEARCHER: person who sets up and runs machine learning experiments MARKETING SEGMENTATION: dividing customers into groups based on similarity MINIMUM VIABLE PRODUCT: the minimum set of product features that a customer needs for a product to be useful MOAT: accumulation of assets that form a barrier to other parties that may reduce the income-generating potential of those assets MONITORING: observing a product to ensure both quality and reliability NETWORK EFFECT: the increase in marginal benefit that comes from adding a new node to an existing collection of nodes; the marginal benefit is defined in terms of utility to the user of the collection NEURAL NETWORK: collection of nodes that are connected to each other such that they can transmit signals to each other across the edges of the network, with the strength of the signal depending on the weights on each node and edge NEXT-LEVEL DATA NETWORK EFFECT: the compounding marginal benefit that comes from adding new data to an existing collection of data; the marginal benefit is defined in terms of the rate of automatic data creation by the model computed over that data NODE: discrete part of a network that can receive, process, and send signals; also called a vertex or a point OPTICAL CHARACTER RECOGNITION SOFTWARE: software that turns images into machine-readable text PARETO OPTIMAL SOLUTION: achieving 80 percent of the optimal solution for 20 percent of the work PARTIAL PLOT: graph that shows the effect of adding an incremental variable to a function PERSONALLY IDENTIFIABLE INFORMATION: information that can be linked to a real person PERTURBATION: deliberately modifying an object, e.g., data POWER GENERATOR: user that contributes an inordinate 
amount of data with respect to other users POWER USER: user that uses a product an inordinate amount with respect to other users PRECISION: the number of relevant data points retrieved by the model over the total number of data points PREDICTION USABILITY THRESHOLD: the point at which a prediction becomes useful to a customer PRICING: product usage; for example, hours spent using a product PROOF OF CONCEPT: a project jointly conducted by potential customers and vendors of a software product to prove the value theoretically provided by that, in practice PROPRIETARY INFORMATION: information that is owned by a specific entity and not in the public domain QUERY LANGUAGE: programming language used to retrieve data from a database RANDOM FOREST: method for analyzing data that constructs multiple decision trees and outputs the class of objects that occurs most often among all the objects or the average prediction across all of the decision trees RECALL: the number of relevant data points retrieved by the model over the total number of relevant data points RECEIVER OPERATING CHARACTERISTIC (ROC) CURVE: plot that shows how well the model performed at different discrimination thresholds, e.g., true and false positive rates RECURSION: repeated application of a method REINFORCEMENT LEARNING: ML that learns from objectives RETURN ON INVESTMENT (ROI): calculated by dividing the return from using an asset by the investment in that asset ROI-BASED PRICING: pricing that is directly correlated with a rate of return on an investment SCALE EFFECT: the increase in marginal benefit or reduction in marginal cost that comes from having a higher quantity of the asset or capability that generates the benefit SCATTER PLOT: graph in which the values of two variables are plotted along two axes, the pattern of the resulting points revealing any correlation present SCHEME: the form common to all values in a particular table in a database SECURE MULTIPARTY COMPUTATION: method for jointly computing inputs while keeping the inputs private from the participating computers SENSOR NETWORK: a collection of devices that collect data from the real world SIMULATION: method that generates inputs to put through a program to see if that program fails to execute or generates inaccurate outputs SOFTWARE-AS-A-SERVICE (SAAS): method of delivering software online and licensing that software on a subscription basis SOFTWARE DEVELOPMENT KIT: tool made by software developers to allow other software developers to build on top of or alongside their software STATISTICAL PROCESS CONTROL: quality control process that is based on statistical methods SUPERVISED MACHINE LEARNING: ML that learns from inputs given outputs SUPPORT VECTOR MACHINE: supervised learning model that classifies new data points by category SYSTEM OF ENGAGEMENT: system that actively (e.g., through user input) aggregates information about a particular business function SYSTEM OF RECORD: system that passively aggregates information about a particular business function SYSTEMS INTEGRATOR: an entity that installs new software systems such that they function with customers’ existing systems TALENT LOOP: the compounding competitive advantage in attracting high-quality personnel that comes from having more high quality data than competitors TRANSACTIONAL PRICING: pricing that is directly correlated with the quantum of units transacted through a product, for example, number of processed data points or computation cycles UNSUPERVISED MACHINE LEARNING: ML that learns from inputs without 
outputs USAGE-BASED PRICING: pricing that is directly correlated with the quantum of product usage; for example, hours of time spent using a product USER INTERFACE: set of objects that exist in software that are manipulated to initiate a function in that software VALUE CHAIN: the process by which a company adds value to an asset; for example, adding value to a data point by processing that data into information, and that information into a prediction VARIABLE IMPORTANCE PLOT: list of the most important variables in a function in terms of their contribution to a given prediction, or predictive power VERSIONING: keeping a copy of every form of a model, program, or dataset VERTICAL INTEGRATION: the combination in one company of multiple stages of production (or points on a value chain) normally operated by separate firms VERTICAL PRODUCT: software product that is only relevant to users in a particular industry WATERFALL CHART: data visualization that shows the result of adding or subtracting sequential values in adjacent columns WEB CRAWLER: program that systematically queries webpages or other documents on the internet; strips out the unnecessary content on those pages, such as formatting; grabs salient data, puts it in a standard document format (e.g., JSON), and puts it in a private data repository WEIGHT: the relative measure of strength ascribed to nodes and edges in a network; this can be automatically or manually adjusted after learning of a more optimal weight WORKFLOW APPLICATION: software that takes a sequence of things that someone does in the real world and puts those steps into an interface that allows for data input at each step ZETTABYTE: 10^21 bytes or 1 trillion gigabytes

pages: 757 words: 193,541

The Practice of Cloud System Administration: DevOps and SRE Practices for Web Services, Volume 2
by Thomas A. Limoncelli , Strata R. Chalup and Christina J. Hogan
Published 27 Aug 2014

This step provides us with an opportunity to perform aggressive testing early in the process to avoid wasted effort later. 9.3.4 Package During the package step, the files left behind from the previous step are used to create the installation packages. A package is a single file that encodes all the files to be installed plus the machine-readable instructions for how to perform the installation. Because it is a single file, it is more convenient to transport. This step is gated based on whether package creation happened successfully. A simple package format would be a Zip or tar file that contains all the files that will be installed plus an installation script.
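
A minimal sketch of the "archive plus installation script" package described above; the file layout and script contents are my invented example, in Python:

import io
import tarfile

# Files left behind by the build step, plus machine-readable install steps.
files = {
    "bin/myservice": b"\x7fELF...",  # placeholder bytes for the built binary
    "etc/myservice.conf": b"port = 8080\n",
    "install.sh": b"#!/bin/sh\ncp -r bin etc /opt/myservice\n",
}

with tarfile.open("myservice-1.0.tar.gz", "w:gz") as pkg:
    for name, data in files.items():
        info = tarfile.TarInfo(name)
        info.size = len(data)
        pkg.addfile(info, io.BytesIO(data))

# The single resulting file is the package: convenient to transport, and it
# carries both the payload and the instructions for installing it.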

• The handoff between teams is written and agreed upon.
• Each team has a playbook for tasks related to its involvement with NPR/NPI.
• Equipment erasure and disposal is documented and verified.

Level 3: Defined
• Expectations for how long NPI/NPR will take are defined.
• The handoff between teams is encoded in a machine-readable format.
• Members of all teams understand their role as it fits into the larger, overall process.
• The maximum number of products supported by each team is defined.
• The list of each team’s currently supported products is available to all teams.

Level 4: Managed
• There are dashboards for observing NPI and NPR progress

pages: 280 words: 73,420

Crapshoot Investing: How Tech-Savvy Traders and Clueless Regulators Turned the Stock Market Into a Casino
by Jim McTague
Published 1 Mar 2011

But truth be told, the machines didn’t perform so well that day either. The NYSE had a rudimentary automated system called Designated Order Turnaround (DOT) for trading orders of up to 2,099 shares. Orders sent electronically over the system from brokerage firms to the floor of the NYSE were printed on machine-readable cards and then taken by a clerk to a specialist who in turn executed the order. When the trade was executed, the specialists marked the card and returned it to the clerk, who in turn put it into a reader that transmitted the order-confirmation back to the brokerage firm. There were 128 printers on the exchange floor.

pages: 220 words: 73,451

Democratizing innovation
by Eric von Hippel
Published 1 Apr 2005

He simply expressed his private motivation in a message he posted on July 3, 1991, to the USENET newsgroup comp.os.minix (Wayner 2000): Hello netlanders, Due to a project I’m working on (in minix), I’m interested in the posix standard definition. [Posix is a standard for UNIX designers. A software using POSIX is compatible with other UNIX-based software.] Could somebody please point me to a (preferably) machine-readable format of the latest posix-rules? Ftp-sites would be nice. In response, Torvalds got several return messages with Posix rules and people expressing a general interest in the project. By early 1992, several skilled programmers were contributing to Linux and the number of users increased by the day.

pages: 255 words: 78,207

Web Scraping With Python: Collecting Data From the Modern Web
by Ryan Mitchell
Published 14 Jun 2015

Like its predecessor, Pillow allows you to easily import and manipulate images with a variety of filters, masks, and even pixel-specific transformations:

from PIL import Image, ImageFilter

kitten = Image.open("kitten.jpg")
blurryKitten = kitten.filter(ImageFilter.GaussianBlur)
blurryKitten.save("kitten_blurred.jpg")
blurryKitten.show()

In the preceding example, the image kitten.jpg will open in your default image viewer with a blur added to it and will also be saved in its blurrier state as kitten_blurred.jpg in the same directory. We will use Pillow to perform preprocessing on images to make them more machine readable, but as mentioned before, there are many other things you can do with the library aside from these simple manipulations. For more information, check out the Pillow documentation.

Tesseract
Tesseract is an OCR library. Sponsored by Google (a company obviously well known for its OCR and machine learning technologies), Tesseract is widely regarded to be the best, most accurate, open source OCR system available.
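
A sketch of how the two libraries fit together (my example, not the book's; the threshold value and filename are arbitrary, and it assumes the pytesseract wrapper and the Tesseract binary are installed):

from PIL import Image
import pytesseract

# Preprocess with Pillow to make the scan more machine readable:
# grayscale, then a hard black/white threshold to sharpen the text.
image = Image.open("scanned_page.jpg").convert("L")
image = image.point(lambda px: 0 if px < 143 else 255)

# Hand the cleaned-up image to Tesseract and print the recognized text.
print(pytesseract.image_to_string(image))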

pages: 268 words: 75,850

The Formula: How Algorithms Solve All Our Problems-And Create More
by Luke Dormehl
Published 4 Nov 2014

He points to the likes of hedge funds that make large bets whose outcomes are predicated on legal results. Using sophisticated legal modeling algorithms to predict verdicts in such a scenario could directly result in vast quantities of money being made. Other legal scholars have also suggested that data mining could be used to reveal a granularity of wrongfulness in criminal trials, using machine-readable criteria to calculate the exact extent of a person’s guilt. One imagines that it will not be too long before neuroscience (currently in the throes of an algorithmic turn) seeks to establish the exact degree to which a person is free to comply with the law, arguing over questions like determinism versus voluntarism.

pages: 491 words: 77,650

Humans as a Service: The Promise and Perils of Work in the Gig Economy
by Jeremias Prassl
Published 7 May 2018

Terms and conditions of employment were not to be changed as a result of a business transfer, and workers’ acquired rights transferred from one employer to the next.58 In terms of system design, a helpful starting point here could be the General Data Protection Regulation (GDPR), adopted in April 2016. One of the key consumer-protective elements of the legislation is so-called data portability: an individual’s ‘right to receive the personal data concerning him or her . . . in a structured, commonly used and machine-readable format’, including ‘the right to transmit those data to another controller without hindrance from the controller to which the personal data have been provided’.59 Portable ratings could operate along similar lines, with standardized metrics accounting for experience, customer friendliness, and work quality, and additional task-specific ratings taking account of task-specific skills, from driving to assembling flatpack furniture.

pages: 305 words: 75,697

Cogs and Monsters: What Economics Is, and What It Should Be
by Diane Coyle
Published 11 Oct 2021

There is some reason to believe that the monster is still running amok in the financial markets, thanks to ‘algos’ (short for algorithm) carrying out ultra-high-frequency trading (HFT). Ultra-high frequency means transactions at intervals of 650 milliseconds or less. This activity has a cluster of support services such as businesses selling ‘the fastest machine-readable economic data and corporate news’, and ‘global proximity hosting’. The latter term refers to traders’ need to locate their computer servers close to the computer servers of the exchanges on which they are trading. The reason is that at a nano-second (millionth of a millisecond) timescale, the speed of light becomes an important physical obstacle.

pages: 196 words: 71,157

A l'ombre des jeunes filles en fleurs - Première partie
by Marcel Proust
Published 1 Dec 2001

Information about Donations to the Project Gutenberg Literary Archive Foundation Project Gutenberg-tm depends upon and cannot survive without wide spread public support and donations to carry out its mission of increasing the number of public domain and licensed works that can be freely distributed in machine readable form accessible by the widest array of equipment including outdated equipment. Many small donations ($1 to $5,000) are particularly important to maintaining tax exempt status with the IRS. The Foundation is committed to complying with the laws regulating charities and charitable donations in all 50 states of the United States.

pages: 619 words: 210,746

Reversing: Secrets of Reverse Engineering
by Eldad Eilam
Published 15 Feb 2005

Depending on the high-level language, this machine code can either be a standard platform-specific object code that is decoded directly by the CPU or it can be encoded in a special platform-independent format called bytecode (see the following section on bytecodes). Compilers of traditional (non-bytecode-based) programming languages such as C and C++ directly generate machine-readable object code from the textual source code. What this means is that the resulting object code, when translated to assembly language by a disassembler, is essentially a machine-generated assembly language program. Of course, it is not entirely machine-generated, because the software developer described to the compiler what needed to be done in the high-level language.
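
The same translation step can be seen in miniature with Python's standard dis module (my illustration, not the book's): a human-readable statement is compiled into the machine-readable bytecode that the Python virtual machine executes.

import dis

# Compile a human-readable statement, then disassemble the resulting bytecode.
code = compile("result = 5 * first", "<example>", "exec")
dis.dis(code)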

The output representation is usually a lower-level translation of the same program. Such lower-level representation is usually read by hardware or software, and rarely by people. The bottom line is usually that compilers transform programs from their high-level, human-readable form into a lower-level, machine-readable form. During the translation process, compilers usually go through numerous improvement or optimization steps that take advantage of the compiler’s “understanding” of the program and employ various algorithms to improve the code’s efficiency. As I have already mentioned, these optimizations tend to have a strong “side effect”: they seriously degrade the emitted code’s readability.

pages: 792 words: 48,468

Tcl/Tk, Second Edition: A Developer's Guide
by Clif Flynt
Published 18 May 2003

If not caught, display the informationalString and stack trace and abort the script evaluation.

informationalString
Information about the error condition.

Info
A string to initialize the errorInfo string. Note that the Tcl interpreter may append more information about the error to this string.

Code
A machine-readable description of the error that occurred. This will be saved in the global errorCode variable.

The next example shows some ways of using the catch and error commands.

Example 6.3
Script Example

proc errorProc {first second} {
    global errorInfo
    # $fail will be non-zero if $first is non-numeric.
    set fail [catch {expr 5 * $first} result]
    # if $fail is set, generate an error
    if {$fail} {
        error "Bad first argument"
    }
    # This will fail if $second is non-numeric or 0
    set fail [catch {expr $first/$second} dummy]
    if {$fail} {
        error "Bad second argument" \
            "second argument fails math test\n$errorInfo"
    }
    error "errorProc always fails" "evaluating error" \
        [list USER {123} {Non-Standard User-Defined Error}]
}

# Example Script
puts "call errorProc with a bad first argument"
set fail [catch {errorProc X 0} returnString]
if {$fail} {
    puts "Failed in errorProc"
    puts "Return string: $returnString"
    puts "Error Info: $errorInfo\n"
}

puts "call errorProc with a 0 second argument"
if {[catch {errorProc 1 0} returnString]} {
    puts "Failed in errorProc"
    puts "Return string: $returnString"
    puts "Error Info: $errorInfo\n"
}

puts "call errorProc with valid arguments"
set fail [catch {errorProc 1 1} returnString]
if {$fail} {
    if {[string first USER $errorCode] == 0} {
        puts "errorProc failed as expected"
        puts "returnString is: $returnString"
        puts "errorInfo: $errorInfo"
    } else {
        puts "errorProc failed for an unknown reason"
    }
}

Script Output

call errorProc with a bad first argument
Failed in errorProc
Return string: Bad first argument
Error Info: Bad first argument
    while executing
"error "Bad first argument""
    (procedure "errorProc" line 10)
    invoked from within
"errorProc X 0"

call errorProc with a 0 second argument
Failed in errorProc
Return string: Bad second argument
Error Info: second argument fails math test
divide by zero
    while executing
"expr $first/$second"
    (procedure "errorProc" line 15)
    invoked from within
"errorProc 1 0"

call errorProc with valid arguments
errorProc failed as expected
returnString is: errorProc always fails
errorInfo: evaluating error
    (procedure "errorProc" line 1)
    invoked from within
"errorProc 1 1"

Note the differences in the stack trace returned in errorInfo in the error returns.

If your application needs to include information that is already in the errorInfo variable, you can append that information by including $errorInfo in your message, as done with the second test. The errorInfo variable contains what should be human-readable text to help a developer debug a program. The errorCode variable contains a machine-readable description to enable a script to handle exceptions intelligently. The errorCode data is a list in which the first field identifies the class of error (ARITH, CHILDKILLED, POSIX, and so on), and the other fields contain data related to this error. The gory details are in the on-line manual/help pages under tclvars.

Cultural Backlash: Trump, Brexit, and Authoritarian Populism
by Pippa Norris and Ronald Inglehart
Published 31 Dec 2018

In these cases, other standard reference sources were used to classify populist parties, including Tim Immerzeel, Marcel Lubbers, and Hilde Coffé. 2011. Expert Judgment Survey of European Political Parties. Utrecht: Utrecht University; Marcel Part III From Values to Votes 253 Lubbers. 2000 [principal investigator]. Expert Judgment Survey of Western-­ European Political Parties 2000 [machine readable data set]. Nijmegen, the Netherlands: NWO, Department of Sociology, University of Nijmegen; Tim Immerzeel, Marcel Lubbers, and Hilde Coffé. 2016. ‘Competing with the radical right: Distances between the European radical right and other parties on typical radical right issues.’ Party Politics 22 (6): 823–834. 47.

The Politics of Unreason: Rightwing Extremism in America 1790–1977. 2nd edn. Chicago: University of Chicago Press. Lipset, Seymour Martin and Stein Rokkan. 1967. Party Systems and Voter Alignments. New York: Free Press. Lubbers, Marcel. 2000. [principal investigator] Expert Judgment Survey of Western-European Political Parties 2000 [machine readable data set]. Nijmegen, the Netherlands: NWO, Department of Sociology, University of Nijmegen. Lubbers, Marcel, Merove Gijsberts, and Peer Scheepers. 2002. ‘Extreme rightwing voting in Western Europe.’ European Journal of Political Research, 41(3): 345–378. Lubbers, Marcel and Peer Scheepers. 2000.

pages: 371 words: 78,103

Webbots, Spiders, and Screen Scrapers
by Michael Schrenk
Published 19 Aug 2009

For example, if you rearranged the data in Listing 26-11, the webbot would still interpret it correctly. The same could not be said for the XML data. And while the protocol is slightly less platform independent than XML, most computer programs are still capable of interpreting the data, as done in the example PHP script in Listing 26-12. SOAP No discussion of machine-readable interfaces is complete without mentioning the Simple Object Access Protocol (SOAP). SOAP is designed to pass instructions and data between specific types of web pages (known as web services) and scripts run by webbots, webservers, or desktop applications. SOAP is the successor of earlier protocols that make remote application calls, like Remote Procedure Call (RPC), Distributed Component Object Model (DCOM), and Common Object Request Broker Architecture (CORBA).

pages: 261 words: 78,884

$2.00 A Day: Living on Almost Nothing in America
by Kathryn Edin and H. Luke Shaefer
Published 31 Aug 2015

.” [>] was white: Our student research collaborator Vincent Fusaro, a doctoral student in social work and political science at the University of Michigan, analyzed data from the harmonized version of the March Current Population Survey (CPS) Supplement from 1968 to 1995 (Miriam King, Steven Ruggles, J. Trent Alexander, Sarah Flood, Katie Genadek, Matthew B. Schroeder, Brandon Trampe, and Rebecca Vick, Integrated Public Use Microdata Series, Current Population Survey: Version 3.0 [machine-readable database] [Minneapolis: University of Minnesota, 2010]). He produced annual estimates of the demographic characteristics of AFDC recipients. In only one year was the proportion of household heads receiving AFDC who were black slightly above 40 percent. In every year between 1968 and 1995, a majority of the recipients were white. [>] increased long-term poverty: Murray, Losing Ground. [>] “spider’s web of dependency”: Ronald Reagan, State of the Union address, February 4, 1986, http://www.presidency.ucsb.edu/ws/?

pages: 291 words: 81,703

Average Is Over: Powering America Beyond the Age of the Great Stagnation
by Tyler Cowen
Published 11 Sep 2013

Rather than “reading articles,” we will consult the programs to spit out the results of their meta-studies, summarizing the research work to date, much as Rybka spits out an evaluation of a chess position. What used to be an individual journal article will become an input into the programs. The “expert” might be someone trained in making sense of the machine’s output, or turning the data into machine-readable form, rather than someone who does the actual work generating the estimates. That’s the single biggest change in economic science we can expect over the next fifty years. When it comes to “the new paradigm,” a lot of people are expecting the next Marx, Keynes, or Hayek. The changes to come will be more radical than that and they will challenge the very relationship that the scientist has to his or her craft of study.

pages: 308 words: 84,713

The Glass Cage: Automation and Us
by Nicholas Carr
Published 28 Sep 2014

“If the computational system is invisible as well as extensive,” the lab’s chief technologist, Mark Weiser, wrote in a 1999 article in IBM Systems Journal, “it becomes hard to know what is controlling what, what is connected to what, where information is flowing, [and] how it is being used.”13 We’d have to place a whole lot of trust in the people and companies running the system. The excitement about ubiquitous computing proved premature, as did the anxiety. The technology of the 1990s was not up to making the world machine-readable, and after the dot-com crash, investors were in no mood to bankroll the installation of expensive microchips and sensors everywhere. But much has changed in the succeeding fifteen years. The economic equations are different now. The price of computing gear has fallen sharply, as has the cost of high-speed data transmission.

pages: 304 words: 82,395

Big Data: A Revolution That Will Transform How We Live, Work, and Think
by Viktor Mayer-Schonberger and Kenneth Cukier
Published 5 Mar 2013

Arguing that governments are only custodians of the information they collect, and that the private sector and society will be more innovative, advocates of open data call on official bodies to publicly release data for purposes both civic and commercial. To work, of course, the data must be in a standardized, machine-readable form so it can be easily processed. Otherwise, the information might be considered public only in name. The idea of open government data got a big boost when President Barack Obama, on his first full day in office on January 21, 2009, issued a presidential memorandum ordering the heads of federal agencies to release as much data as possible.

pages: 302 words: 85,877

Cult of the Dead Cow: How the Original Hacking Supergroup Might Just Save the World
by Joseph Menn
Published 3 Jun 2019

The best known put a secure operating system on a memory card; the software would function properly even if the overall computer were compromised. Its features included an unchangeable logging system. The software would have been among the best possible defenses to the mass surveillance revealed by Snowden. Google did not release a finished version before Mudge left for a new venture: a nonprofit to examine code from binaries, the machine-readable instructions that programs give to computers, and score them based on standard safety features. Mudge and his wife Sarah’s Cyber Independent Testing Lab functioned like the labs at Consumer Reports, scanning for the digital equivalent of automatic brakes and seat belts, all without needing access to the source code.

pages: 254 words: 81,009

Busy
by Tony Crabbe
Published 7 Jul 2015

Ann Burnett, Denise Gorsline, Julie Semlak, and Adam Tyma, “Earning the Badge of Honor: The Social Construction of Time and Pace of Life.” Paper presented at the Annual Meeting of the NCA 93rd Annual Convention, TBA, Chicago, IL, November 14, 2007. 5. Tom W. Smith et al., General Social Surveys, 1972–2010 (machine-readable data file) (Chicago: National Opinion Research Center, 2011), in Brigid Schulte, Overwhelmed: Work, Love, and Play When No One Has the Time (New York: Sarah Crichton Books, 2014). 6. B. S. McEwen, “Allostasis and allostatic load: implications for neuropsychopharmacology,” Neuropsychopharmacology 22 (2000): 108–24. 7.

pages: 321

Finding Alphas: A Quantitative Approach to Building Trading Strategies
by Igor Tulchinsky
Published 30 Sep 2019

Sometimes vendors do parsing and processing before providing data to their clients; fundamental data is an example. For unstructured yet sophisticated data, such as news, Twitter posts, and so on, vendors typically apply natural language processing techniques to analyze the content of the raw data. They provide machine-readable data to their clients instead of raw data that is only human-readable. Some vendors even sell alpha models directly – this means the data itself is the output of alpha models. The clients need only to load the data and trade according to it. Such alpha models are risky, however, because they may be overfit and/or overcrowded, with many clients of the same vendor trading the same model.

pages: 326 words: 84,180

Dark Matters: On the Surveillance of Blackness
by Simone Browne
Published 1 Oct 2015

So when Mayer instructs her viewers that when it comes to facial recognition, it “isn’t about blending in” but rather “sticking out, yet remaining undetected” and that “black lipstick is a great way to cover lots of surface on your face quickly,” she points out the productive possibilities that come from being unseen, where blackness, in this case applying black makeup, could be subversive in its capacity to distort and interfere when it comes to machine readability and standard algorithms. For this digital camouflage technique to be most effective, meaning for cv Dazzle to render the subject either unrecognized or a false match, it is often a matter of contrast. Adam Harvey’s cv Dazzle Look Book features mainly white-looking women with hair styled in dissymmetrical ways that work to partially conceal their facial features and certain facial landmarks, like the space between the eyes.

pages: 288 words: 86,995

Rule of the Robots: How Artificial Intelligence Will Transform Everything
by Martin Ford
Published 13 Sep 2021

As far back as 2013, a group of physicists studied financial markets and published a paper in the journal Nature declaring that “an emerging ecology of competitive machines featuring ‘crowds’ of predatory algorithms” existed and that algorithmic trading had perhaps already progressed beyond the control—and even comprehension—of the humans who designed the systems.20 Those algorithms now incorporate the latest advances in AI, their influence on markets has increased dramatically, and the ways in which they interact have grown even more incomprehensible. Many algorithms, for example, have the ability to tap directly into machine-readable news sources provided by companies like Bloomberg and Reuters and then trade on that information in tiny fractions of a second. When it comes to short-term moment-by-moment trading, no human being can begin to comprehend the details of what is unfolding, let alone attempt to outsmart the algorithms.

pages: 453 words: 79,218

Lonely Planet Best of Hawaii
by Lonely Planet

If you’re not sure whether something’s sacred, consider that in Hawaiian thinking, everything is sacred, especially in nature.
• Don’t stack rocks or wrap them in ti leaves at waterfalls, heiau (temples) etc. This bastardization of the ancient Hawaiian practice of leaving hoʻokupu (offerings) at sacred sites is littering.

Passports
• A machine-readable passport (MRP) is required for all foreign citizens to enter the USA.
• Your passport must be valid for six months beyond your expected dates of stay in the USA.
• If your passport was issued/renewed after October 26, 2006, you need an ‘e-passport’ with a digital photo and an integrated chip containing biometric data.

pages: 422 words: 86,414

Hands-On RESTful API Design Patterns and Best Practices
by Harihara Subramanian
Published 31 Jan 2019

username=my_username --header 'Content-Type: application/json' --header 'Accept: application/json'

We can conclude the following:

Support media type selection using a query parameter: To support clients with simple links and debugging, REST APIs should support media type selection through a query parameter named accept, with a value format that mirrors that of the Accept HTTP request header. An example is GET https://swapi.co/api/planets/1/?format=json. REST APIs should prefer the more precise and generic media-type negotiation over query-parameter identification and the other alternatives.

Windows OS users can use MobaXterm (https://mobaxterm.mobatek.net/) or any SSH client that supports Unix commands.

Representations
As we know, a machine-readable description of a resource's current state within a request or response is a representation, and it can be in different formats. The following section discusses the rules for the most common resource formats, such as JSON and hypermedia, and error types in brief.

Message body format
REST API communications in the distributed environment are most often in a text-based format, and we will discuss the JSON text-format representation rules as follows:
Use JSON for resource representation and it should be well-formed
You may use XML and other formats as well
Don't create additional envelopes or any custom transport wrappers and leverage only HTTP envelopes

Hypermedia representation
As we have understood from Chapter 1, Introduction to the Basics of RESTful Architecture, REST API clients can programmatically navigate using a uniform link structure as a HATEOAS response, and following are a few rules related to hypermedia representations.
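
A short sketch of such a representation (mine, not the book's; the resource fields and link relations are invented): a well-formed JSON body carrying hypermedia links, with no extra envelope beyond HTTP itself.

import json

planet = {
    "name": "Tatooine",
    "climate": "arid",
    # HATEOAS links the client can navigate programmatically
    "links": [
        {"rel": "self", "href": "https://swapi.co/api/planets/1/"},
        {"rel": "residents", "href": "https://swapi.co/api/planets/1/residents/"},
    ],
}

body = json.dumps(planet)  # the HTTP message body, nothing more
assert json.loads(body)["name"] == "Tatooine"  # well-formed round trip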

pages: 708 words: 223,211

The Friendly Orange Glow: The Untold Story of the PLATO System and the Dawn of Cyberculture
by Brian Dear
Published 14 Jun 2017

And I think that’s why he and Don would get along so extraordinarily well.” The “turn ’em loose” recruiting style led to great things. Steve Singer, Hanson says, “wrote an entire compiler from scratch, he was quite amazing.” The compiler (a system program that takes a human-readable programming language and converts it into machine-readable form for execution by the central processor) was called CATO, standing for, appropriately enough, “Compiler for Automatic Teaching Operations.” Frampton, like so many Uni kids would do in coming years, had one day wandered over to CSL and marveled at the work being done in the lab. He was particularly impressed with the Cornfield project.

Joseph’s College in Indiana, and went on to Ohio State University for a master’s and a PhD in music theory. The research for both graduate degrees involved extensive use of computers. For his master’s thesis, he wrote a program that enabled a person to type a musical score into a keypunch machine using a special, machine-readable code. The system then analyzed the music. For his 1973 doctoral dissertation he constructed a computerized, musical equivalent of Professor Henry Higgins: given a piece of music, the system could determine what country it was written in by analyzing its stylistic traits. He took sixteen string quartets: four from Germany, four from France, four from Czechoslovakia, and four from Russia.

The rough guide to the Grand Canyon
by Greg Ward and Rough Guides
Published 27 May 2003

AIRLINES … BASICS: Under the visa waiver program, passport holders from Britain, Ireland, Australia, New Zealand, and most European countries do not require visas for trips to the US, including Hawaii, so long as they stay less than ninety days in the US and have an onward or return ticket. Simply fill in the visa waiver form distributed on your incoming plane, and a US immigration officer will check the completed form at your point of arrival. For you to be eligible for a visa waiver, your passport must be MACHINE READABLE, and all children must have individual passports. Holders of older, unreadable passports must either obtain new ones or apply for visas prior to travel. For full details, visit http://travel.state.gov/vwp. TOWN ALONG THE INTERSTATE: The final … miles south from Jacob Lake to the North Rim are on…

pages: 398 words: 86,855

Bad Data Handbook
by Q. Ethan McCallum
Published 14 Nov 2012

RPM
Revenue Per 1,000 Impressions (usually ad impressions).

CTR
Click Through Rate—Ratio of Clicks to Impressions. Used as a measure of the success of an advertising campaign or content recommendation.

XML
Extensible Markup Language—Text-based markup language designed to be both human and machine-readable.

JSON
JavaScript Object Notation—Lightweight text-based open standard designed for human-readable data interchange. Natively supported by JavaScript, so often used by JavaScript widgets on websites to communicate with back-end servers.

CSV
Comma Separated Value—Text file containing one record per row, with fields separated by commas.

pages: 389 words: 87,758

No Ordinary Disruption: The Four Global Forces Breaking All the Trends
by Richard Dobbs and James Manyika
Published 12 May 2015

More than 90 percent of eBay commercial sellers export to other countries, compared with an average of less than 25 percent of traditional small businesses.28

12 The Disruptive Dozen
Twelve technologies have massive potential for disruption in the coming decade

CHANGING THE BUILDING BLOCKS OF EVERYTHING
1. Next-generation genomics: Fast, low-cost gene sequencing, advanced big data analytics, and synthetic biology (“writing” DNA)
2. Advanced materials: Materials designed to have superior characteristics (e.g., strength, weight, conductivity) or functionality

RETHINKING ENERGY COMES OF AGE
3. Energy storage: Devices or systems that store energy for later use, including batteries
4. Advanced oil and gas exploration and recovery: Exploration and recovery techniques that make extraction of unconventional oil and gas economical
5. Renewable energy: Generation of electricity from renewable sources with reduced harmful climate impact

MACHINES WORKING FOR US
6. Advanced robotics: Increasingly capable robots with enhanced senses, dexterity, and intelligence used to automate tasks or augment humans
7. Autonomous and near-autonomous vehicles: Vehicles that can navigate and operate with reduced or no human intervention
8. 3-D printing: Additive manufacturing techniques to create objects by printing layers of material based on digital models

IT AND HOW WE USE IT
9. Mobile Internet: Increasingly inexpensive and capable mobile computing devices and Internet connectivity
10. Internet of things: Networks of low-cost sensors and actuators for data collection, monitoring, decision making, and process optimization
11. Cloud technology: Use of computer hardware and software resources delivered over a network or the Internet, often as a service
12. Automation of knowledge work: Intelligent software systems that can perform knowledge work tasks involving unstructured commands and subtle judgments

The data avalanche is set to become only more powerful because of a movement toward “open data,” in which data are freely shared beyond their originating organizations—including governments and businesses—in a machine-readable format at low cost. More than forty nations, including Canada, India, and Singapore, have committed to opening up their electronic data, everything from weather records and crime statistics to transport data. The excitement about open data has largely revolved around the potential to empower citizens and improve the delivery of public services, ranging from urban transportation to personalized health care.

pages: 285 words: 86,853

What Algorithms Want: Imagination in the Age of Computing
by Ed Finn
Published 10 Mar 2017

Their variations of pragmatism then inspire elaborate responses and counter-solutions, or what communication researcher Tarleton Gillespie calls the “tacit negotiation” we perform to adapt ourselves to algorithmic systems: we enunciate differently when speaking to machines, use hashtags to make updates more machine-readable, and describe our work in search engine-friendly terms.13 The tacit assumptions lurking beneath the pragmatist’s definition are becoming harder and harder to ignore. The apparent transparency and simplicity of computational systems are leading many to see them as vehicles for unbiased decision-making.

pages: 371 words: 93,570

Broad Band: The Untold Story of the Women Who Made the Internet
by Claire L. Evans
Published 6 Mar 2018

“We are now, twenty-seven years after the Web, living in a world that is driven by data,” Wendy reminds me. How and why that data are linked is becoming increasingly important, especially as we teach machines to interpret connections for us—in order for artificial intelligence to understand the Web, it will need an additional layer of machine-readable information on top of our documents, a kind of meta-Web that proponents call the Semantic Web. While humans might understand connections intuitively, and are willing to ignore when links rot or lead nowhere, computers require more consistent information about the source, the destination, and the meaning of every link.

pages: 692 words: 95,244

Speaking JavaScript: An In-Depth Guide for Programmers
by Axel Rauschmayer
Published 25 Feb 2014

This method is provided by the ECMAScript Internationalization API (see The ECMAScript Internationalization API) and does not make much sense without it.

Date.prototype.toUTCString(): Tue, 30 Oct 2001 16:43:07 GMT
Date and time, in UTC.

Date.prototype.toGMTString(): Deprecated; use toUTCString() instead.

Date and time (machine-readable)

Date.prototype.toISOString(): 2001-10-30T16:43:07.856Z
All internal properties show up in the returned string. The format is in accordance with Date Time Formats; the time zone is always Z.

Date.prototype.toJSON(): This method internally calls toISOString(). It is used by JSON.stringify() (see JSON.stringify(value, replacer?

pages: 326 words: 91,532

The Pay Off: How Changing the Way We Pay Changes Everything
by Gottfried Leibbrandt and Natasha de Teran
Published 14 Jul 2021

In 1973 alone, credit card losses were estimated to be almost $300 million, or 1.15 per cent of sales. It turned out the answer had already been invented. One day in the early 1960s, Dorothea Parry had been busy with the housework when her husband Forrest, an engineer at IBM, came home to regale her with the problems he was having at work. He had been tasked with designing a machine-readable identity card for CIA officials. His plan was to attach a strip of magnetised tape to a plastic card, but glue warped the tape, making it unreadable. The resourceful Mrs Parry suggested he use an iron to melt the strip onto the card. It worked, and the magnetised data strip, or ‘magstripe’, was born.

The Myth of Artificial Intelligence: Why Computers Can't Think the Way We Do
by Erik J. Larson
Published 5 Apr 2021

RDF helped knowledge bases become, in essence, computational encyclopedias (as the late AI researcher John Haugeland once put it) with larger projects’ knowledge bases having thousands of triples. AI researchers hoped that the ease of use would encourage even non-experts to make triples—a dream articulated by Tim Berners-Lee, the creator of HTML. Berners-Lee called it the Semantic Web, because with web pages converted into machine-readable RDF statements, computers would know what everything meant. The web would be intelligently readable by computers. AI researchers touted knowledge bases as the end of brittle systems using only statistics—because, after all, statistics aren’t sufficient for understanding. The Semantic Web and other knowledge base–centered projects in AI could finally “know” about the world, and do more than just number-crunch.
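The triple idea itself fits in a few lines. A toy Python sketch, with invented triples and a naive pattern matcher (None acting as a wildcard), of how a knowledge base of subject-predicate-object statements can be queried:

    # Invented example triples: (subject, predicate, object)
    triples = [
        ("TimBernersLee", "created", "HTML"),
        ("HTML", "isA", "MarkupLanguage"),
        ("SemanticWeb", "proposedBy", "TimBernersLee"),
    ]

    def match(pattern, store):
        """Return triples matching an (s, p, o) pattern; None is a wildcard."""
        return [t for t in store
                if all(p is None or p == v for p, v in zip(pattern, t))]

    # Every statement with TimBernersLee as its subject:
    print(match(("TimBernersLee", None, None), triples))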

pages: 1,065 words: 229,099

Real World Haskell
by Bryan O'Sullivan , John Goerzen , Donald Stewart and Donald Bruce Stewart
Published 2 Dec 2008

That would fail, because we haven't written a main function, which GHC calls to start the execution of a standalone program. After ghc completes, if we list the contents of the directory, it should contain two new files: SimpleJSON.hi and SimpleJSON.o. The former is an interface file, in which ghc stores information about the names exported from our module in machine-readable form. The latter is an object file, which contains the generated machine code.

Generating a Haskell Program and Importing Modules

Now that we've successfully compiled our minimal library, we'll write a tiny program to exercise it. Create the following file in your text editor and save it as Main.hs:

    -- file: ch05/Main.hs
    module Main () where

    import SimpleJSON

    main = print (JObject [("foo", JNumber 1), ("bar", JBool False)])

Notice the import directive that follows the module declaration.

Serialization with read and show

You may often have a data structure in memory that you need to store on disk for later retrieval or to send across the network. The process of converting data in memory to a flat series of bits for storage is called serialization. It turns out that read and show make excellent tools for serialization. show produces output that is both human- and machine-readable. Most show output is also syntactically valid Haskell, though it is up to people that write Show instances to make it so.

Parsing large strings

String handling in Haskell is normally lazy, so read and show can be used on quite large data structures without incident. The built-in read and show instances in Haskell are efficient and implemented in pure Haskell.

Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems
by Martin Kleppmann
Published 17 Apr 2017

The triple-store data model is completely independent of the semantic web—for example, Datomic [40] is a triple-store that does not claim to have anything to do with it.vii But since the two are so closely linked in many people's minds, we should discuss them briefly. The semantic web is fundamentally a simple and reasonable idea: websites already publish information as text and pictures for humans to read, so why don't they also publish information as machine-readable data for computers to read? The Resource Description Framework (RDF) [41] was intended as a mechanism for different websites to publish data in a consistent format, allowing data from different websites to be automatically combined into a web of data—a kind of internet-wide “database of everything.”

It was started in 2009 as a subproject of Hadoop, as a result of Thrift not being a good fit for Hadoop's use cases [21]. Avro also uses a schema to specify the structure of the data being encoded. It has two schema languages: one (Avro IDL) intended for human editing, and one (based on JSON) that is more easily machine-readable. Our example schema, written in Avro IDL, might look like this:

    record Person {
        string userName;
        union { null, long } favoriteNumber = null;
        array<string> interests;
    }

The equivalent JSON representation of that schema is as follows:

    {
        "type": "record",
        "name": "Person",
        "fields": [
            {"name": "userName", "type": "string"},
            {"name": "favoriteNumber", "type": ["null", "long"], "default": null},
            {"name": "interests", "type": {"type": "array", "items": "string"}}
        ]
    }

First of all, notice that there are no tag numbers in the schema.
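To see the round trip in action, here is a small sketch using the third-party fastavro package (a choice made for illustration; the chapter does not prescribe a library) to encode and decode a sample record against the JSON form of the schema:

    import io
    from fastavro import parse_schema, schemaless_reader, schemaless_writer

    schema = parse_schema({
        "type": "record",
        "name": "Person",
        "fields": [
            {"name": "userName", "type": "string"},
            {"name": "favoriteNumber", "type": ["null", "long"], "default": None},
            {"name": "interests", "type": {"type": "array", "items": "string"}},
        ],
    })

    record = {"userName": "Martin", "favoriteNumber": 1337,
              "interests": ["daydreaming", "hacking"]}

    buf = io.BytesIO()
    schemaless_writer(buf, schema, record)   # compact binary: no field tags
    buf.seek(0)
    print(schemaless_reader(buf, schema))    # decoding requires the writer's schema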

pages: 1,237 words: 227,370

Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems
by Martin Kleppmann
Published 16 Mar 2017

The triple-store data model is completely independent of the semantic web—for example, Datomic [40] is a triple-store that does not claim to have anything to do with it.vii But since the two are so closely linked in many people’s minds, we should discuss them briefly. The semantic web is fundamentally a simple and reasonable idea: websites already publish information as text and pictures for humans to read, so why don’t they also publish information as machine-readable data for computers to read? The Resource Description Framework (RDF) [41] was intended as a mechanism for different websites to publish data in a consistent format, allowing data from different websites to be automatically combined into a web of data—a kind of internet-wide “database of everything.”

It was started in 2009 as a subproject of Hadoop, as a result of Thrift not being a good fit for Hadoop's use cases [21]. Avro also uses a schema to specify the structure of the data being encoded. It has two schema languages: one (Avro IDL) intended for human editing, and one (based on JSON) that is more easily machine-readable. Our example schema, written in Avro IDL, might look like this:

    record Person {
        string userName;
        union { null, long } favoriteNumber = null;
        array<string> interests;
    }

The equivalent JSON representation of that schema is as follows:

    {
        "type": "record",
        "name": "Person",
        "fields": [
            {"name": "userName", "type": "string"},
            {"name": "favoriteNumber", "type": ["null", "long"], "default": null},
            {"name": "interests", "type": {"type": "array", "items": "string"}}
        ]
    }

First of all, notice that there are no tag numbers in the schema.

pages: 400 words: 94,847

Reinventing Discovery: The New Era of Networked Science
by Michael Nielsen
Published 2 Oct 2011

This doesn’t just mean the information conventionally shared in scientific papers, but all information of scientific value, from raw experimental data and computer code to all the questions, ideas, folk knowledge, and speculations that are currently locked up inside the heads of individual scientists. Information not on the network can’t do any good. In an ideal world, we’d achieve a kind of extreme openness. That means expressing all our scientific knowledge in forms that are not just human-readable, but also machine-readable, as part of a data web, so computers can help us find meaning in our collective knowledge. It means opening the scientific community up to the rest of society, in a two-way exchange of information and ideas. It means an ethic of sharing, in which all information of scientific value is put on the network.

pages: 322 words: 99,066

The End of Secrecy: The Rise and Fall of WikiLeaks
by The "Guardian" , David Leigh and Luke Harding
Published 1 Feb 2011

The small company that licensed it, Tableau Software, removed the graphic from its public site – also feeling the pressure (though there was no direct contact) from Lieberman’s office. The dominoes then started to fall. The company EveryDNS, which provides free routing services (translating human-readable addresses such as wikileaks.org into machine readable internet addresses such as 64.64.12.170) terminated the wikileaks.org domain name. It also deleted all email addresses associated with it. Justifying the move, EveryDNS said the constant hacker attacks on WikiLeaks were inconveniencing other customers. In effect, WikiLeaks had now vanished from the web for anyone who couldn’t work out how to discover a numeric address for the site.
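The name-to-number translation the passage describes is an ordinary DNS lookup, easy to reproduce. A one-call Python illustration (the hostname is a stand-in; the numeric answer will vary):

    import socket

    # Resolve a human-readable name to a machine-readable IPv4 address,
    # the service EveryDNS stopped providing for wikileaks.org
    print(socket.gethostbyname("example.org"))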

pages: 304 words: 22,886

Nudge: Improving Decisions About Health, Wealth, and Happiness
by Richard H. Thaler and Cass R. Sunstein
Published 7 Apr 2008

This would ensure that borrowers at least know what their payments will be when the teaser rate ends. It would be a good idea to add some kind of worst-case scenario information so that borrowers can see how much their payments could go up in the future. Lenders would also have to provide a machine-readable detailed RECAP report, one that incorporates all the fees and interest rate provisions, including teaser rates, what the variable-rate changes are linked to, caps on the changes per year, and so forth. This information would allow independent third parties to offer much better advice. Our strong hunch is that if the RECAP data were made available, third-party services would emerge to compare lenders.
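A machine-readable RECAP report could be as plain as a structured record of the loan's terms. The sketch below is purely hypothetical; every field name and value is invented here, not drawn from the authors' proposal:

    import json

    # Hypothetical RECAP record for one mortgage (all fields invented)
    recap_report = {
        "teaser_rate_pct": 3.5,
        "teaser_period_months": 24,
        "rate_index": "LIBOR_12M",     # what the variable rate is linked to
        "margin_pct": 2.75,
        "annual_rate_cap_pct": 2.0,    # cap on changes per year
        "lifetime_rate_cap_pct": 9.5,
        "fees": {"origination": 1200.00, "appraisal": 450.00},
    }

    # Published in this form, third-party services could parse and compare lenders
    print(json.dumps(recap_report, indent=2))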

pages: 349 words: 95,972

Messy: The Power of Disorder to Transform Our Lives
by Tim Harford
Published 3 Oct 2016

It seems as though actively encouraging staff to take control is far scarcer than the benign neglect of Building 20. Why is creativity something that happens only when the boss isn’t looking? Some clues come from the remarkable career of Robert Propst. Propst was a sculptor, painter, art teacher, and inventor of devices as varied as a vertical timber harvester and a machine-readable livestock tag. A trained chemical engineer, he’d spent the Second World War managing beachhead logistics in the South Pacific. In 1958, he was hired by the Herman Miller company, a manufacturer of office furniture. Herman Miller’s managers thought Propst was a genius.28 He was certainly an independent spirit: he stayed in Ann Arbor, Michigan, 150 miles away from Herman Miller’s headquarters in Zeeland, later persuading the company to set up a research division there to accommodate him.

pages: 364 words: 102,926

What the F: What Swearing Reveals About Our Language, Our Brains, and Ourselves
by Benjamin K. Bergen
Published 12 Sep 2016

Retrieved from http://www.livescience.com/16570-profanity-tv-video-games-teen-aggression.html. Williams, J. N. (1992). Processing polysemous words in context: Evidence for interrelated meanings. Journal of Psycholinguistic Research, 21(3), 193–218. Wilson, M. D. (1988). The MRC psycholinguistic database: Machine readable dictionary, version 2. Behavioural Research Methods, Instruments and Computers, 20(1), 6–11. Winslow, A. G. (1900 [1772]). Diary of Anna Green Winslow: A Boston school girl of 1771. Boston: Houghton, Mifflin and Company. Xu, J., Gannon, P. J., Emmorey, K., Smith, J. F., and Braun, A. R. (2009).

Cataloging the World: Paul Otlet and the Birth of the Information Age
by Alex Wright
Published 6 Jun 2014

In Paul Otlet et La Bibliologie, Fondateur du Mundaneum (1868–1944). Architecte du savoir, artisan de paix, 159–175. Bruxelles: Les Impressions Nouvelles, 2010. ———. “Web 2.0 and the Semantic Web in Research from a Historical Perspective. The Designs of Paul Otlet (1868–1944) for Telecommunication and Machine Readable Documentation to Organize Research and Society.” Knowledge Organization 36, no. 4 (2009): 214–26. Visite d’Andrew Carnegie au Palais Mondial-Mundaneum/1913 (BXLS). Le Mundaneum, 1913. http://www.youtube.com/watch?v=Q2T7mk16zqs&list=PLjMHaWxVFdUi4-133HuZ1dllZAvJ-3Nu&index=9. Von Bethmann-Hollweg, Chancellor Theobald.

pages: 332 words: 100,601

Rebooting India: Realizing a Billion Aspirations
by Nandan Nilekani
Published 4 Feb 2016

When we already have a centralized system capable of storing the Aadhaar data of 1.2 billion Indians, there is no reason why we can’t build a similar system for voter ID data as well. Transparency is a second, equally important attribute. As part of the government’s open-data initiative, voter lists should be made available for anyone to inspect in a machine-readable format, one that any computer can recognize and process. That is not the case today; voter rolls are available only in formats that computers cannot read, which means that you have to download the data manually and build your own software to read and analyse them, unless you are up to the task of sifting through hundreds upon thousands of entries yourself.

RDF Database Systems: Triples Storage and SPARQL Query Processing
by Olivier Cure and Guillaume Blin
Published 10 Dec 2014

For example, as a semi-structured data model, RDF data sets can be described with expressive schema languages, such as RDF Schema (RDFS) or Web Ontology Language (OWL), and can be linked to other documents present on the Web, forming the Linked Data movement. With the emergence of Linked Data, a pattern for hyperlinking machine-readable data sets that extensively uses RDF, URIs, and HTTP, we can consider that more and more data will be directly produced in or transformed into RDF. In 2013, the linked open data (LOD), a set of RDF data produced from open data sources, is considered to contain over 50 billion triples on domains as diverse as medicine, culture, and science, just to name a few.
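At exchange time, an RDF data set is often nothing more than lines of triples. A small sketch that emits N-Triples, one of the simplest machine-readable RDF serializations (the subject URI is an invented example, and only string literals are handled):

    # One invented triple: (subject URI, predicate URI, literal object)
    triples = [
        ("http://example.org/book/1",
         "http://purl.org/dc/elements/1.1/title",
         "RDF Database Systems"),
    ]

    def to_ntriple(s, p, o):
        # URIs go in angle brackets, literals in quotes, statements end with " ."
        return '<%s> <%s> "%s" .' % (s, p, o)

    for t in triples:
        print(to_ntriple(*t))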

pages: 268 words: 109,447

The Cultural Logic of Computation
by David Golumbia
Published 31 Mar 2009

The Emperor’s New Mind: Concerning Computers, Minds, and the Laws of Physics. New York: Oxford University Press. ———. 1994. Shadows of the Mind: A Search for the Missing Science of Consciousness. New York: Oxford University Press. Phillipson, Robert. 1992. Linguistic Imperialism. Oxford: Oxford University Press. Pichler, Alois. 1995. “Advantages of a Machine-Readable Version of Wittgenstein’s Nachlass.” In Johannessen and Nordenstam, eds., Culture and Value: Philosophy and the Cultural Sciences. The Austrian Ludwig Wittgenstein Society, Vienna. Pierrehumbert, Janet. 1993. “Prosody, Intonation, and Speech Technology.” In Bates and Weischedel (1993), 257–282.

The Paths Between Worlds: This Alien Earth Book One
by Paul Antony Jones
Published 19 Mar 2019

“What is it?” I asked, wondering at how perfectly the lines had been etched into the slate’s surface. “In anticipation of the memory loss I will undoubtedly experience when my battery is exhausted, I have inscribed all pertinent information of the events leading up to today onto the slate in machine-readable code. It includes your names and everything you have told me, and that I should trust what you tell me without question. It will save us all time, I believe. I will update the slate on a daily basis until I am able to find a more permanent solution to my memory loss.” “Wow!” I said, honestly impressed.

pages: 420 words: 100,811

We Are Data: Algorithms and the Making of Our Digital Selves
by John Cheney-Lippold
Published 1 May 2017

More generally, in this type of algorithmic regulation, we see how the current, almost obsessive push toward datafying everything and anything (like our fitness routines, ovulation cycles, sleep patterns, even the quality of our posture) becomes much more than a simple transcoding of life into machine-readable information. This process of datafication is also the process of materially connecting these data “points” with power. By checking up on us each and every time we make a datafied step, rather than only when we stand before a judge or a police officer detains us, power becomes exceptionally intimate and efficient.

Artificial Whiteness
by Yarden Katz

Department of Justice that recognized the “growing ecosystem of third-party intermediaries”—like “businesses, nonprofits, and news agencies”—that “are making use of open criminal justice data, investing time, money, and resources into processing data before use, prepping data through cleaning, standardizing and organizing, and linking and aggregating different data sets together.” It was recommended that “to encourage such reuse, data should be machine-readable and structured for interoperability.”25 This is essentially a call for more data collection, as well as the very “interoperability” that anticarceral activists argued enables mass deportation and incarceration. Through these prescriptions, critical experts normalize the prison-industrial complex.

pages: 358 words: 106,729

Fault Lines: How Hidden Fractures Still Threaten the World Economy
by Raghuram Rajan
Published 24 May 2010

So did the loan officer’s knowledge that his client would be back to haunt his conscience if he put him in an unaffordable house. But as investment banks put together gigantic packages of mortgages, the judgment calls became less and less important in credit assessments: after all, there was no way to code the borrower’s capacity to hold a job in an objective, machine-readable way.9 Indeed, recording judgment calls in a way that could not be supported by hard facts might have opened the mortgage lender to lawsuits alleging discrimination. All that seemed to matter to the investment banks and the rating agencies were the numerical credit score of the borrower and the amount of the loan relative to house value.

pages: 502 words: 107,657

Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die
by Eric Siegel
Published 19 Feb 2013

Following the open data movement, often embracing a not-for-profit philosophy, many data sets are available online from fields like biodiversity, business, cartography, chemistry, genomics, and medicine. Look at one central index, www.kdnuggets.com/datasets, and you’ll see what amounts to lists of lists of data resources. The Federal Chief Information Officer of the United States launched Data.gov “to increase public access to high value, machine readable datasets generated by . . . the Government.” Data.gov sports over 390,000 data sets, including data about marine casualties, pollution, active mines, earthquakes, and commercial flights. Its growth is prescribed: a directive in 2009 obliged all U.S. federal agencies to post at least three “high-value” data sets.

pages: 354 words: 105,322

The Road to Ruin: The Global Elites' Secret Plan for the Next Financial Crisis
by James Rickards
Published 15 Nov 2016

After observing events from July through August 2007, I was convinced that this crisis dynamic really was unstoppable and would spread widely. Treasury needed to know and act soon. “You should issue an order requiring all banks and hedge funds to report their derivatives positions to you, in detail, with counterparty names, underlying instruments, payment flows, and termination dates. The information should be in standardized machine-readable form delivered one week from the date the order goes out. Anyone who can’t deliver should go to the top of your problem list. Once you get the information, hire IBM Global Services to process it for you in a secure environment so there’s no leakage. Build a matrix and find out who owes what to whom.
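The matrix the author asks for is, at bottom, a netting computation over the reported positions. A toy Python sketch with invented records standing in for the standardized filings:

    from collections import defaultdict

    # Invented standardized reports: (payer, counterparty, amount owed)
    reports = [
        ("BankA", "FundB", 150.0),
        ("FundB", "BankA", 40.0),
        ("BankA", "BankC", 75.0),
    ]

    net = defaultdict(float)
    for payer, counterparty, amount in reports:
        net[(payer, counterparty)] += amount
        net[(counterparty, payer)] -= amount

    # Positive entries: the first party is a net debtor to the second
    for (a, b), amount in sorted(net.items()):
        if amount > 0:
            print(f"{a} owes {b}: {amount}")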

pages: 398 words: 107,788

Coding Freedom: The Ethics and Aesthetics of Hacking
by E. Gabriella Coleman
Published 25 Nov 2012

Sharkie’s and Matthew’s accounts were typical of developers who first learned about free software in the 1980s and early to mid-1990s, especially those who were young or students, without a steady income to pay for expensive software like compilers (a tool that transforms source code, written in a programming language, into machine readable binary code). Early in their relationship with this technology, most hackers developed a strong pragmatic and utilitarian commitment to free software. But the underlying philosophy underwent change as more developers started to attach and make their own meanings. Access to source code and the model of open development represented by Linux, they said, was a superior technical methodology.

pages: 345 words: 105,722

The Hacker Crackdown
by Bruce Sterling
Published 15 Mar 1992

Information about Donations to the Project Gutenberg Literary Archive Foundation Project Gutenberg-tm depends upon and cannot survive without wide spread public support and donations to carry out its mission of increasing the number of public domain and licensed works that can be freely distributed in machine readable form accessible by the widest array of equipment including outdated equipment. Many small donations ($1 to $5,000) are particularly important to maintaining tax exempt status with the IRS. The Foundation is committed to complying with the laws regulating charities and charitable donations in all 50 states of the United States.

pages: 397 words: 102,910

The Idealist: Aaron Swartz and the Rise of Free Culture on the Internet
by Justin Peters
Published 11 Feb 2013

“I know I will never be happy without trying to see if I can change the world for the better in a major manner,” he wrote in 1990.59 That January, Hart traveled to the American Library Association’s midwinter meeting to proselytize for e-books and Project Gutenberg. There, he vowed, “There will be 10,000 Machine-Readable-Texts available by Dec. 31, 2000, even if I had to make them all myself.”60 * * * IN 1990, a British computer scientist named Tim Berners-Lee wrote an article for a house newsletter at CERN, a particle-physics laboratory in Switzerland. Berners-Lee programmed software at CERN, and, like many idealistic coders before him, he had become enamored of the Gospel of Richard Stallman.

pages: 370 words: 105,085

Joel on Software
by Joel Spolsky
Published 1 Aug 2004

Raymond has just written a long book about UNIX programming called The Art of UNIX Programming exploring his own culture in great detail.1 You can buy the book and read it on paper, or, if Raymond's politics are just too anti-idiotarian2 for you to consider giving him money, you can even read it online3 for free and rest assured that the author will not receive a penny for his hard work.4 Let's look at a small example. The UNIX programming culture holds in high esteem programs that can be called from the command line, that take arguments that control every aspect of their behavior, and whose output can be captured as regularly formatted, machine-readable plain text. Such programs are valued because they can be easily incorporated into other programs or larger software systems by programmers. To take one miniscule example, there is a core value in the UNIX culture, which Raymond calls "Silence is Golden," that a program that has done exactly what you told it to do successfully should provide no output whatsoever.5 It doesn't matter if you've just typed a 300-character command line to create a file system, or built and installed a complicated piece of software, or sent a manned rocket to the moon.
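Both values, machine-readable plain-text output and silence on success, show up in even the smallest UNIX-style filter. A Python sketch (the tool and its two-field input format are invented):

    import sys

    def main():
        status = 0
        for line in sys.stdin:
            try:
                name, value = line.split(None, 1)  # invented format: name, value
            except ValueError:
                print(f"bad record: {line!r}", file=sys.stderr)
                status = 1
                continue
            print(f"{name}\t{value.strip()}")  # regular, machine-parseable output
        return status  # on success nothing but the data was printed

    if __name__ == "__main__":
        sys.exit(main())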

pages: 354 words: 26,550

High-Frequency Trading: A Practical Guide to Algorithmic Strategies and Trading Systems
by Irene Aldridge
Published 1 Dec 2009

MatLab and R have emerged as the industry’s most popular quantitative modeling choices. Internet-wide information-gathering software facilitates highfrequency fundamental pricing of securities. Promptly capturing rumors and news announcements enhances forecasts of short-term price moves. Thomson/Reuters has a range of products that deliver real-time news in a machine-readable format. Trading software incorporates optimal execution algorithms for achieving the best execution price within a given time interval through timing of trades, decisions on market aggressiveness, and sizing orders into optimal lots. New York–based MarketFactory provides a suite of software tools to help automated traders get an extra edge in the market, help their models scale, increase their fill ratios, reduce slippage, and thereby improve profitability (P&L).

pages: 461 words: 106,027

Zero to Sold: How to Start, Run, and Sell a Bootstrapped Business
by Arvid Kahl
Published 24 Jun 2020

Your customers may not need it all the time, but this data format is surprisingly relevant for retention. It's commonly used when your customers want to experiment with what using your data with another tool would be like. If you allow them to do this easily, it can further consolidate your product as an essential part of completing tasks. Produce machine-readable data like JSON. A service that can be accessed by humans and machines is twice as valuable as one that can only be used by people. Allow your more technical customers to use your product in a programmatical way, and the impact of your solution will increase significantly. After all, the types of solutions that other businesses can create by integrating your service as one of many are thousand-fold and extraordinary!

pages: 432 words: 106,612

Trillions: How a Band of Wall Street Renegades Invented the Index Fund and Changed Finance Forever
by Robin Wigglesworth
Published 11 Oct 2021

HBS itself had no computers available to students, so he would have to trek over the Charles River to MIT to get his regular fix. There, he met an MIT professor who was attempting to see if he could predict future stock market prices from past trading volumes and patterns. McQuown became his “data dog,” collecting the raw numbers of stock market prices from the magazine Barron’s, converting them into a machine-readable form, and testing the professor’s hypotheses through MIT’s IBM mainframe. McQuown graduated in 1961, but his background and interests raised eyebrows among some of the investment banks he interviewed with. “What the hell would an engineer want to do on Wall Street?” one interviewer inquired.7 These days, degrees in engineering, physics, and mathematics are de rigueur in finance.

pages: 416 words: 112,268

Human Compatible: Artificial Intelligence and the Problem of Control
by Stuart Russell
Published 7 Oct 2019

As with assistants for daily life, the up-front cost of creating the necessary general knowledge in each of these three areas amortizes across billions of users. In the case of health, for example, we all have roughly the same physiology, and detailed knowledge of how it works has already been encoded in machine-readable form.11 Systems will adapt to your individual characteristics and lifestyle, providing preventive suggestions and early warning of problems. In the area of education, the promise of intelligent tutoring systems was recognized even in the 1960s,12 but real progress has been a long time coming.

The Deep Learning Revolution (The MIT Press)
by Terrence J. Sejnowski
Published 27 Sep 2018

Barbara Oakley and I developed a popular MOOC called “Learning How to Learn” that teaches you how to become a better learner (figure 1.11) and a follow-up MOOC called “Mindshift” that teaches you how to reinvent yourself and change your lifestyle (both MOOCs will be described in chapter 12). As you interact with the Internet, you are generating big data about yourself that is machine readable. You are being targeted by ads generated…

[Figure 1.11: “Learning How to Learn,” a massive open online course (MOOC) that teaches you how to become a better learner, is the most popular MOOC on the Internet, with over 3 million learners. Courtesy of Terrence Sejnowski and Barbara Oakley.]

pages: 382 words: 105,819

Zucked: Waking Up to the Facebook Catastrophe
by Roger McNamee
Published 1 Jan 2019

Accordingly, you have a right in a clear and transparent manner to: (1) consent or opt in when personal information is collected or shared with a third party and to limit the use of personal information not necessary to provide the requested service. (2) obtain, correct or delete personal data held by a company. (3) be notified immediately when a security breach is discovered. (4) be able to move data in useable, machine-readable format. I loved the simplicity and directness of this data bill of rights. Similar in objectives and approach to a bill introduced by Senators Klobuchar and Kennedy, as well as to Europe’s GDPR, Representative Lofgren’s proposal could be a valuable first step in an effort to regulate the platforms.

pages: 406 words: 105,602

The Startup Way: Making Entrepreneurship a Fundamental Discipline of Every Enterprise
by Eric Ries
Published 15 Mar 2017

A federal job guarantee might achieve similar outcomes: jacobinmag.com/2017/02/federal-job-guarantee-universal-basic-income-investment-jobs-unemployment/.
21. nytimes.com/2016/12/17/business/economy/universal-basic-income-finland.html.
22. qz.com/696377/y-combinator-is-running-a-basic-income-experiment-with-100-oakland-families.
23. kauffman.org/what-we-do/resources/entrepreneurship-policy-digest/can-social-insurance-unlock-entrepreneurial-opportunities.
24. theatlantic.com/business/archive/2016/06/netherlands-utrecht-universal-basic-income-experiment/487883/; theguardian.com/world/2016/oct/28/universal-basic-income-ontario-poverty-pilot-project-canada.
25. vox.com/new-money/2017/2/13/14580874/google-self-driving-noncompetes.
26. kauffman.org/what-we-do/resources/entrepreneurship-policy-digest/how-intellectual-property-can-help-or-hinder-innovation.
27. forbes.com/2009/08/10/government-internet-software-technology-breakthroughs-oreilly.html.
28. obamawhitehouse.archives.gov/the-press-office/2013/05/09/executive-order-making-open-and-machine-readable-new-default-government-.
29. Chopra, Innovative State, pp. 121–22.
30. hbr.org/2017/02/a-few-unicorns-are-no-substitute-for-a-competitive-innovative-economy.
31. site.warrington.ufl.edu/ritter/files/2017/06/IPOs2016Statistics.pdf.
32. jstor.org/stable/1806983?seq=1#page_scan_tab_contents; larrysummers.com/2017/06/01/secular-stagnation-even-truer-today.
33. techcrunch.com/2017/06/28/a-look-back-at-amazons-1997-ipo.
34. niskanencenter.org/blog/future-liberalism-politicization-everything/.

pages: 392 words: 108,745

Talk to Me: How Voice Computing Will Transform the Way We Live, Work, and Think
by James Vlahos
Published 1 Mar 2019

Some of the latest generative techniques were derived from advances in machine translation, so let’s detour briefly to explain those. The classic technique is for computers to start by analyzing sentences in a source language. The sentences are then transformed phrase by phrase into an interlingua, a machine-readable digital halfway house that encodes the linguistic information. Finally, sentences are converted from the interlingua into the target human language following all of the definitions and grammatical rules of that language. This process, which is known as “phrase-based statistical machine translation,” is every bit as onerous as it sounds.
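Stripped to a toy, the phrase-by-phrase step is a lookup against a phrase table. A greedy Python sketch with an invented two-entry table (real systems score many competing segmentations statistically instead of taking the first match):

    # Invented phrase table: source-language phrases to target-language phrases
    phrase_table = {
        ("guten", "tag"): ["good", "day"],
        ("wie", "geht", "es"): ["how", "is", "it", "going"],
    }

    def translate(words, table, max_len=3):
        out, i = [], 0
        while i < len(words):
            # Greedily try the longest phrase starting at position i
            for n in range(min(max_len, len(words) - i), 0, -1):
                chunk = tuple(words[i:i + n])
                if chunk in table:
                    out.extend(table[chunk])
                    i += n
                    break
            else:
                out.append(words[i])  # unknown word: pass it through
                i += 1
        return out

    print(translate("guten tag".split(), phrase_table))  # ['good', 'day']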

pages: 405 words: 105,395

Empire of the Sum: The Rise and Reign of the Pocket Calculator
by Keith Houston
Published 22 Aug 2023

Founded in 1878 to make ticket punches for tram and bus conductors, the Sumlock Company* of postwar Britain was a parochial echo of the corporate giants on the other side of the Atlantic.1 Where IBM built gleaming mainframes, Sumlock peddled comptometers first developed in the nineteenth century.2 Where AT&T’s Bell Labs inhabited a neoclassical tower in Manhattan, Sumlock’s engineers toiled in a converted air-raid shelter in London’s suburban hinterlands.3 And where Burroughs, NCR, and General Electric worked with the Federal Reserve to develop a nationwide system of machine-readable checks, Sumlock fretted about competition from a chain of tea shops.4 On the flip side, the British firm was nothing if not adaptable. Sumlock had rebounded from a fire that leveled its factory at the end of the nineteenth century to become Britain’s premier manufacturer of taximeters and racecourse betting boards, or “totalizators.”

pages: 982 words: 221,145

Ajax: The Definitive Guide
by Anthony T. Holdener
Published 25 Jan 2008

XML, the eXtensible Markup Language, is an Internet-friendly format for data and documents, invented by the World Wide Web Consortium (W3C). The word Markup in the term denotes a way to express a document's structure within the document itself. XML has its roots in the Standard Generalized Markup Language (SGML), which is used in publishing. HTML was an application of SGML to web publishing. XML was created to do for machine-readable documents on the Web what HTML did for human-readable documents: provide a commonly agreed-upon syntax so that processing the underlying format becomes commonplace and documents are made accessible to all users. The current version of the W3C Recommendation is the XML 1.1 (Second Edition), published on September 29, 2006 and available at http://www.w3.org/TR/xml11/.

If your friends had used number as you did to denote the phone number, and not phone, there would not have been a problem. However, as it is, this second file probably will not be usable by programs set up to work with the first file; from the program's perspective, it is not valid. For validity to be a useful general concept, we need a machine-readable way to say what a valid document is; that is, which elements and attributes must be present and in what order. XML 1.0 achieves this by introducing document type definitions (DTDs).

DTDs

The purpose of a DTD is to express which elements and attributes are allowed in a certain document type and to constrain the order in which elements must appear within that document type.
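The phone-versus-number disagreement above becomes mechanically checkable once a DTD exists. A sketch using the third-party lxml package (an illustration of the idea; the chapter itself does not use this library), validating a document that uses phone and one that uses number:

    import io
    from lxml import etree

    # A DTD requiring a contact to contain a name and a phone, in that order
    dtd = etree.DTD(io.StringIO("""
    <!ELEMENT contact (name, phone)>
    <!ELEMENT name  (#PCDATA)>
    <!ELEMENT phone (#PCDATA)>
    """))

    ok  = etree.fromstring("<contact><name>Ann</name><phone>555-0100</phone></contact>")
    bad = etree.fromstring("<contact><name>Ann</name><number>555-0100</number></contact>")

    print(dtd.validate(ok))   # True
    print(dtd.validate(bad))  # False: number is not declared where phone must appear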

Coastal California Travel Guide
by Lonely Planet

Depending on your country of origin, the rules for entering the USA keep changing. Double-check current visa requirements before arriving.

- Currently, under the US Visa Waiver Program (VWP), visas are not required for citizens of 38 countries for stays up to 90 days (no extensions) if you have a machine-readable passport (MRP) that meets current US standards and is valid for six months beyond your intended stay.
- Citizens of VWP countries must still register online with the Electronic System for Travel Authorization (ESTA; https://esta.cbp.dhs.gov) at least 72 hours before travel. Once approved, ESTA registration ($14) is valid for up to two years or until your passport expires, whichever comes first.

If you drive into California across the border from Mexico or from the neighboring states of Oregon, Nevada or Arizona, you may have to stop for a quick questioning and inspection by California Department of Food & Agriculture (www.cdfa.ca.gov) agents.

Passport

- Under the Western Hemisphere Travel Initiative (WHTI), all travelers must have a valid machine-readable passport (MRP) when entering the USA by air, land or sea.
- The only exceptions are for some US, Canadian and Mexican citizens traveling by land who can present other WHTI-compliant documents (eg pre-approved ‘trusted traveler’ cards). A regular driver's license is not sufficient.
- All foreign passports must meet current US standards and be valid for at least six months longer than your intended stay.

pages: 918 words: 260,504

Nature's Metropolis: Chicago and the Great West
by William Cronon
Published 2 Nov 2009

George Smith’s famous Chicago Marine and Fire Insurance Company was so successful that it supplied a sizable portion of Chicago’s circulating currency during the 1840s and early 1850s, behaving as much like a bank as an insurance company. It was able to do this partly because the state legislature had imposed steep restrictions on the ability of Illinois banks to issue notes. 37.Burrows, Fifty Years in Iowa, 183. 38.ICPSR, “Historical Demographic, Economic and Social Data: The United States, 1790–1970” (machine-readable dataset of census statistics). Statistical work for this book is based on an eleven-state subset of the master series, containing economic and demographic statistics for Michigan, Indiana, Illinois, Wisconsin, Minnesota, Iowa, Missouri, Kansas, Nebraska, North Dakota, and South Dakota between 1840 and 1900.

Peoria Board of Trade. Annual Reports. 1871–75. Poor’s Manual of Railroads. Prairie Farmer. 1848–1900. St. Louis Chamber of Commerce, Annual Reports. Sears, Roebuck and Company. Catalogs. U.S. Censuses of Population, Agriculture, and Manufactures. 1840–1900. (In original printed reports and in machine-readable datasets of the ICPSR.) U.S. Department of Agriculture. Yearbooks. 1885–1960. Western Monthly. Western Rural. 1866–80. Wisconsin Lumberman. 1870–80. Wisconsin State Agricultural Society. Transactions. 1870–90. Wisconsin State Grange. Proceedings and Bulletins of the Executive Committee. 1872–90.

pages: 390 words: 113,737

Someone comes to town, someone leaves town
by Cory Doctorow
Published 1 Jul 2005

Campbell Award for Best New Writer at the 2000 Hugo awards and his novel Down and Out in the Magic Kingdom (http://craphound.com/down/) won the Locus Award for Best First Novel the same year that his short story collection A Place So Foreign and Eight More (http://craphound.com/place/) won the Sunburst Award for best Canadian science fiction book. His other books include Eastern Standard Tribe (http://craphound.com/est/) and Rapture of the Nerds (with Charles Stross). Join my mailing list for infrequent notices of books, articles, stories and appearances. http://www.ctyme.com/mailman/listinfo/doctorow

* * *

Machine-readable metadata

* * *

    <rdf:RDF xmlns="http://web.resource.org/cc/"
             xmlns:dc="http://purl.org/dc/elements/1.1/"
             xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
      <Work rdf:about="http://craphound.com/someone">
        <dc:title>Someone Comes to Town, Someone Leaves Town</dc:title>
        <dc:date>2005-7-1</dc:date>
        <dc:description>A novel by Cory Doctorow</dc:description>
        <dc:creator><Agent><dc:title>Cory Doctorow</dc:title></Agent></dc:creator>
        <dc:rights><Agent><dc:title>Cory Doctorow</dc:title></Agent></dc:rights>
        <dc:type rdf:resource="http://purl.org/dc/dcmitype/Text" />
        <license rdf:resource="http://creativecommons.org/licenses/by-nd-nc/1.0" />
      </Work>
      <License rdf:about="http://creativecommons.org/licenses/by-nd-nc/1.0">
        <requires rdf:resource="http://web.resource.org/cc/Attribution" />
        <permits rdf:resource="http://web.resource.org/cc/Reproduction" />
        <permits rdf:resource="http://web.resource.org/cc/Distribution" />
        <prohibits rdf:resource="http://web.resource.org/cc/CommercialUse" />
        <requires rdf:resource="http://web.resource.org/cc/Notice" />
      </License>
      <rdf:RDF xmlns="http://web.resource.org/cc/"
               xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
        <License rdf:about="http://creativecommons.org/licenses/devnations/2.0/">
          <permits rdf:resource="http://web.resource.org/cc/Reproduction" />
          <permits rdf:resource="http://web.resource.org/cc/Distribution" />
          <requires rdf:resource="http://web.resource.org/cc/Notice" />
          <requires rdf:resource="http://web.resource.org/cc/Attribution" />
          <permits rdf:resource="http://web.resource.org/cc/DerivativeWorks" />
          <prohibits rdf:resource="http://web.resource.org/cc/HighIncomeNationUse" />
        </License>
      </rdf:RDF>

eof

pages: 463 words: 118,936

Darwin Among the Machines
by George Dyson
Published 28 Mar 2012

“It is the Elixir or Philosophers Stone, to which all Nations, and every thing within those Nations must be subservient, either by faire meanes or by foule.”31 An economy is a system that assigns numerical values to tangible and intangible things. These numbers, having a peculiar tendency, common to all numbers, of lending themselves to intelligent processing, start moving the things around. The history of money has been a step-by-step progression from things to numbers: numbers stamped on coins, numbers printed on banknotes, machine-readable codes on checks, coded electronic transfers between numbered accounts, credit-card numbers transferred over the phone, and now a host of competing forms of digital currency, represented by numbers alone. The relations between money and information go both ways: the flow of information conveys and represents money, and the flow of money conveys and represents information.

pages: 429 words: 114,726

The Computer Boys Take Over: Computers, Programmers, and the Politics of Technical Expertise
by Nathan L. Ensmenger
Published 31 Jul 2010

Backus spoke their language, published in their journals, and shared their disdain for coders and other “technicians.” Second, FORTRAN was designed specifically to solve the kinds of problems that interested academics. Its use of algebraic expressions greatly simplified the process of defining mathematical problems in machine-readable syntax. Finally, and perhaps most significantly, FORTRAN provided them more direct access to the computer. Its introduction “caused a partial revolution in the way in which computer installations were run because it became not only possible but quite practical to have engineers, scientists, and other people actually programming their own problems without the intermediary of a professional programmer.”27 The use of FORTRAN actually became the centerpiece of an ongoing debate about “open” versus “closed” programming “shops.”

pages: 352 words: 120,202

Tools for Thought: The History and Future of Mind-Expanding Technology
by Howard Rheingold
Published 14 May 2000

One innovation of Turing's stemmed from the fact that computers based on Boolean logic operate only on input that is in the form of binary numbers (i.e., numbers expressed in powers of two, using only two symbols), while humans are used to writing numbers in the decimal system (in which numbers are expressed in powers of ten, using ten symbols. Turing was involved in the writing of instruction tables that automatically converted human-written decimals to machine-readable binary digits. If basic operations like addition, multiplication, and decimal-to-binary conversion could be fed to the machine in terms of instruction tables, Turing saw that it would be possible to build up heirarchies of such tables. The programmer would no longer have to worry about writing each and every operational instruction, step by repetitive step, and would thus be freed to write programs for more complex operations.
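The decimal-to-binary step that Turing's instruction tables automated is the familiar repeated-division routine; a minimal sketch:

    def to_binary(n):
        """Convert a non-negative decimal integer to a string of binary digits."""
        if n == 0:
            return "0"
        bits = []
        while n > 0:
            bits.append(str(n % 2))  # the remainder is the next lowest-order bit
            n //= 2
        return "".join(reversed(bits))

    print(to_binary(19))  # '10011'
    print(bin(19)[2:])    # the built-in conversion agrees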

pages: 409 words: 112,055

The Fifth Domain: Defending Our Country, Our Companies, and Ourselves in the Age of Cyber Threats
by Richard A. Clarke and Robert K. Knake
Published 15 Jul 2019

This software learns not just from what it sees on its endpoint, not just from what happens on other endpoints on the network, but, in a classic example of Metcalfe’s law, they learn from every endpoint on every network on which they are deployed. A second widespread use of AI today is in applications known as vulnerability managers. AI can intake machine-readable intelligence reports on new threats and can automatically prioritize those threats based upon what it already knows or can quickly find out about your network. For example, an AI-driven vulnerability manager could check to see whether you have already addressed the weakness the new attack exploits.

pages: 385 words: 112,842

Arriving Today: From Factory to Front Door -- Why Everything Has Changed About How and What We Buy
by Christopher Mims
Published 13 Sep 2021

Darin is leading me through a maze of conveyors—here he pauses to unlatch a three-foot section of one and lift it out of our path, like a raised drawbridge—to the heart of a fulfillment center that serves the whole of the U.S. Northeast. We pass people walking up and down aisles between walls of shelving stuffed, floor to ceiling, with blue bins. We pause to watch one employee, a middle-aged woman, work. She uses a bar code scanner to light up the machine-readable labels on the front of a cubby, plucks an item, then drops it in a plastic tote riding on a cart. Periodically, she wheels those carts to a nearby conveyor and places the blue tote into a stream of others, all flowing in the same direction. At the end of the conveyor, they disappear into some other part of the Byzantine clockwork that surrounds us.

pages: 1,172 words: 114,305

New Laws of Robotics: Defending Human Expertise in the Age of AI
by Frank Pasquale
Published 14 May 2020

They need to be addressed squarely, however, so that roboticists do not adopt emotional simulation as simply one more obvious entailment of well-established AI research. Even if the prospect of AI claiming rights, resources, and respect from humans seems far-fetched, the first steps toward it are degrading the quality of many of our current interactions. There is increasing pressure to simplify emotion’s measure or monitoring so it can become machine-readable and machine-expressible. Supposedly, this is done simply to serve us better. It is far easier, for example, to cross-correlate six Facebook reaction buttons with user behavior than the hundreds or thousands of subtle forms of disapproval, curiosity, intrigue, resignation, and so on that pass across users’ faces while scrolling through a newsfeed.

pages: 370 words: 112,809

The Equality Machine: Harnessing Digital Technology for a Brighter, More Inclusive Future
by Orly Lobel
Published 17 Oct 2022

The mainstream books were the winners of Newbery and Caldecott Medals from 1923 to 2019; diversity books were a set of books identified by the Association for Library Service to Children as highlighting diverse communities. The research used Google’s machine learning vision platform consisting of facial recognition, evaluation of skin color, and classification of race, gender, and age on the illustrations in these books; a “text-to-data pipeline” scanned pages to machine-readable text and searched for words expressing gender, nationality, and color. The researchers found that over the past century, mainstream books—those that children are more likely to be exposed to—still contain racial and gender biases. Most of the images in these mainstream children’s books are of white male characters; although images of female characters still appear more than text about female characters, women and girls continue to be underrepresented in mainstream children’s books.

pages: 444 words: 117,770

The Coming Wave: Technology, Power, and the Twenty-First Century's Greatest Dilemma
by Mustafa Suleyman
Published 4 Sep 2023

This was the first time I had encountered evidence that language modeling was making real progress, and I quickly became fixated, reading hundreds of papers, deeply immersing myself in the burgeoning field. By the summer of 2020, I was convinced that the future of computing was conversational. Every interaction with a computer is already a conversation of sorts, just using buttons, keys, and pixels to translate human thoughts to machine-readable code. Now that barrier was starting to break down. Machines would soon understand our language. It was, and still is, a thrilling prospect. Long before the much-publicized launch of ChatGPT, I was part of the team at Google working on a new large language model that we called LaMDA, short for Language Model for Dialogue Applications.

California
by Sara Benson
Published 15 Oct 2010

If your passport does not meet current US standards, you’ll be turned back at the border, even if you’re from a VWP country and have travel authorization. If your passport was issued before October 26, 2005, it must be ‘machine readable’ (with two lines of letters, numbers and <<< at the bottom); if it was issued between October 26, 2005, and October 25, 2006, it must be machine-readable and include a digital photo; and if it was issued on or after October 26, 2006, it must be an ePassport containing a digital photo and an integrated RFID chip containing biometric data. Citizens from all non-VWP countries, as well as those whose passports are not machine-readable or otherwise don’t meet the current US standards, will need to wrangle a nonimmigrant visa from a US consulate or embassy abroad.

pages: 597 words: 119,204

Website Optimization
by Andrew B. King
Published 15 Mar 2008

announced that its web crawler would begin to process microformats and other forms of structured metadata, making it available to developers as a way to present richer search results. [34] For example, instead of a blue link and a plain text abstract, a search result for an electronic gizmo could contain a thumbnail of the device, its price, its availability, reviews, and perhaps a link to buy it immediately. The greater the percentage of search results that take advantage of metadata (from any search engine), the greater interest site owners have in structuring their sites accordingly. Metadata generally means machine-readable "data about data," which can take many forms. Perhaps the simplest form falls under the classification of microformats, [35] which can be as simple as a single attribute value such as nofollow, described in more detail in "Step 10: Build Inbound Links with Online Promotion," earlier in this chapter.

pages: 593 words: 118,995

Relevant Search: With Examples Using Elasticsearch and Solr
by Doug Turnbull and John Berryman
Published 30 Apr 2016

And you can identify feasibility risks that may arise as a result of hard-to-get information or information that doesn’t quite answer your users’ questions. Let’s first consider the content being searched—restaurants. At a minimum, a restaurant is represented as a name and an address. Usually this information is publicly available. Restaurant names and addresses that are more accurate and machine readable can be purchased from a business database provider. Restaurant names will be of obvious use, both in search and in the presentation to the user on the details page. You need to use the address for location information. You’ll use a geolocation service to convert restaurant addresses into their corresponding latitude-longitude coordinates.
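Once addresses have become latitude-longitude pairs, ranking restaurants by distance is a single formula. A sketch of the great-circle (haversine) distance such a search would sort by (the coordinates are invented):

    from math import asin, cos, radians, sin, sqrt

    def haversine_km(lat1, lon1, lat2, lon2):
        """Great-circle distance between two lat/lon points, in kilometers."""
        lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
        a = (sin((lat2 - lat1) / 2) ** 2
             + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
        return 2 * 6371 * asin(sqrt(a))  # 6371 km: mean Earth radius

    user = (35.99, -78.90)  # invented geolocated searcher
    restaurant = {"name": "Example Diner", "lat": 36.00, "lon": -78.94}
    print(haversine_km(*user, restaurant["lat"], restaurant["lon"]))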

pages: 561 words: 120,899

The Theory That Would Not Die: How Bayes' Rule Cracked the Enigma Code, Hunted Down Russian Submarines, and Emerged Triumphant From Two Centuries of Controversy
by Sharon Bertsch McGrayne
Published 16 May 2011

During the Second World War, Warren Weaver of the Rockefeller Foundation was impressed with how “a multiplicity of languages impedes cultural interchange between the peoples of the earth and is a serious deterrent to international understanding.”6 Struck by the power of mechanized cryptography and by Claude Shannon’s new information theory, Weaver suggested that computerized statistical methods could treat translation as a cryptography problem. In the absence of computer power and a wealth of machine-readable text, Weaver’s idea lay fallow for decades. Ever since, the holy grail of translators has been a universal machine that can transform written and spoken words from one language into any other. As part of this endeavor, linguists like Noam Chomsky developed structural rules for English sentences, subjects, verbs, adjectives, and grammar but failed to produce an algorithm that could explain why one string of words makes an English sentence while another string does not.

pages: 320 words: 87,853

The Black Box Society: The Secret Algorithms That Control Money and Information
by Frank Pasquale
Published 17 Nov 2014

“Explanations” like “too many revolving accounts” or “time since last account opened too short” are reason codes;25 rather than explain what happened in a straightforward way, they simply name a factor in the decision. We know it was more important than other, unnamed factors, but we have little sense of how the weighing went. While the term code can connote law (as in the Internal Revenue Code) or software (which involves the “coding” of instructions into machine-readable formats), it can also suggest a deliberately hidden meaning.26 Someone sends a “coded message” in order to avoid detection, to keep third parties from understanding exactly what is going on. In algorithmic decision making, this third, mysterious aspect of code too often predominates. For example, with credit decisions, there are so many vague or conflicting reason codes that it is possible to rationalize virtually any decision.27 Maybe you have too many accounts open, maybe you have too few—either could contribute, at any given time, to a decision to reduce a credit score or reject an application. But what we really care about is that the data at the heart of the decision was right, and that it didn't include illicit or unfair considerations.

Text Analytics With Python: A Practical Real-World Approach to Gaining Actionable Insights From Your Data
by Dipanjan Sarkar
Published 1 Dec 2016

By definition, a body of text under analysis is often a document, and by applying various techniques we usually convert this document to a vector of words, which is a numeric array whose values are specific weights for each word that could either be its frequency, its occurrence, or various other depictions—some of which we will explore in Chapter 3. Often the text needs to be cleaned and processed to remove noisy terms and data, called text pre-processing. Once we have the data in a machine-readable and understandable format, we can apply relevant algorithms based on the problem to be solved at hand. The applications of text analytics are manifold. Some of the most popular ones include the following:

- Spam detection
- News articles categorization
- Social media analysis and monitoring
- Bio-medical
- Security intelligence
- Marketing and CRM
- Sentiment analysis
- Ad placements
- Chatbots
- Virtual assistants

Summary

Congratulations on sticking it out till the end of this long chapter!
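The vector-of-words conversion described at the top of this excerpt can be as simple as term counting. A minimal sketch of that frequency weighting, with crude pre-processing and an invented sample sentence:

    import re
    from collections import Counter

    def to_vector(document):
        """Lowercase, keep word characters (crude pre-processing), count words."""
        words = re.findall(r"[a-z']+", document.lower())
        return Counter(words)

    doc = "Text analytics turns text into numbers; the numbers are weights."
    print(to_vector(doc))  # Counter({'text': 2, 'numbers': 2, ...})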

pages: 960 words: 125,049

Mastering Ethereum: Building Smart Contracts and DApps
by Andreas M. Antonopoulos and Gavin Wood Ph. D.
Published 23 Dec 2018

. */
jumpi(tag_1, iszero(callvalue))
0x0
dup1
revert
tag_1:
  /* "Example.sol":115:125  msg.sender */
  caller
  /* "Example.sol":99:112  contractOwner */
  0x0
  dup1
  /* "Example.sol":99:125  contractOwner = msg.sender */
  0x100
  exp
  dup2
  sload
  dup2
  0xffffffffffffffffffffffffffffffffffffffff
  mul
  not
  and
  swap1
  dup4
  0xffffffffffffffffffffffffffffffffffffffff
  and
  mul
  or
  swap1
  sstore
  pop
  /* "Example.sol":26:132  contract example {... */
  dataSize(sub_0)
  dup1
  dataOffset(sub_0)
  0x0
  codecopy
  0x0
  return
stop

sub_0: assembly {
    /* "Example.sol":26:132  contract example {... */
    mstore(0x40, 0x60)
    0x0
    dup1
    revert

auxdata: 0xa165627a7a7230582056b99dcb1edd3eece01f27c9649c5abcc14a435efe3b...
}

The --bin-runtime option produces the machine-readable hexadecimal bytecode:

60606040523415600e57600080fd5b336000806101000a81548173ffffffffffffffffffffffffffffffffffffffff021916908373ffffffffffffffffffffffffffffffffffffffff160217905550603580605b6000396000f3006060604052600080fd00a165627a7a7230582056b...

You can investigate what's going on here in detail using the opcode list given in “The EVM Instruction Set (Bytecode Operations)”.

pages: 482 words: 121,173

Tools and Weapons: The Promise and the Peril of the Digital Age
by Brad Smith and Carol Ann Browne
Published 9 Sep 2019

The three companies announced the Open Data Initiative, launched a month later, designed to provide a technology platform and tools to enable organizations to federate data while continuing to own and maintain control of the data they share. It will include tech tools that organizations can use to identify and assess the useful data they already possess and put it into a machine-readable and structured format suitable for sharing. Perhaps as much as anything else, an open-data revolution will require experimentation to get this right. Before our dinner ended, I pulled up a chair next to Trunnell and asked what we might do together. I was especially intrigued by the opportunity to advance work that we at Microsoft were already pursuing with other cancer institutes in our corner of North America, including with leading organizations in Vancouver, British Columbia.

pages: 400 words: 121,988

Trading at the Speed of Light: How Ultrafast Algorithms Are Transforming Financial Markets
by Donald MacKenzie
Published 24 May 2021

Fragmentation: transactions in or changes in the order books for the same shares on different trading venues. 4. Related shares and other instruments: changes in the market for, e.g., shares whose price is correlated with that of the shares being traded. Note: A signal is a data pattern that informs an algorithm’s trading. A number of other classes of signal are in more specialized use, such as machine-readable corporate or macroeconomic news releases. There are many other sources of information used in trading, including automated analysis of social media “sentiment,” satellite data on, e.g., oil-tanker movements, etc. (although such data are often more useful to trading firms with longer time horizons than those of HFT).

pages: 580 words: 125,129

Androids: The Team That Built the Android Operating System
by Chet Haase
Published 12 Aug 2021

Instead, the original source code needs to be compiled into a different binary version for each different type of hardware you want to run it on. Separate compilers create unique executables for every type of machine on which the code will be run. Along comes Java. The Java compiler translates source code not into machine-readable code, but into an intermediate representation called bytecode. This code can be executed on any computer platform that has an additional piece of software running on it called a runtime. The runtime interprets the bytecode and translates it into the binary representation of that computer, essentially compiling it on the fly.
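
Python happens to use the same compile-to-bytecode-and-interpret model, so its standard library makes the idea easy to see; a small sketch with the dis module (an analogy of ours, not Android's toolchain):

    # Python, like Java, compiles source into bytecode that a runtime interprets.
    # dis shows the bytecode the CPython runtime will execute for a function.
    import dis

    def add(a, b):
        return a + b

    dis.dis(add)  # prints opcodes such as LOAD_FAST (exact names vary by Python version)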

pages: 494 words: 121,217

Tracers in the Dark: The Global Hunt for the Crime Lords of Cryptocurrency
by Andy Greenberg
Published 15 Nov 2022

When his parents saw his interest in the machine and bought him a Commodore 64, his life was transformed. Soon, frustrated by the inelegant way the machine displayed text on the screen, he was coding his own word processor. The Commodore didn’t have a real compiler, the software that turns human-readable computer commands in a language like BASIC into machine-readable instructions. Instead, it interpreted the commands in his program one by one as it ran, which Gronager found far too slow and inefficient. Undaunted—and unaware that he was taking on an absurdly technical challenge for a self-taught middle schooler—Gronager began writing his programs directly in Assembly, the near-native language of the Commodore’s processor, the better to get his hands as close as possible to the computer’s real mechanics.

pages: 487 words: 124,008

Your Face Belongs to Us: A Secretive Startup's Quest to End Privacy as We Know It
by Kashmir Hill
Published 19 Sep 2023

On its home page, Venmo had an image of an iPhone, and on its screen was the Venmo newsfeed, showing users who publicly paid each other for things: “Jenny P charged Stephen H, ‘Lyft.’ ” “Raymond A charged Jessica J, ‘Power and Internet.’ ” “Steve M paid Thomas V, ‘nom nom nom.’ ”[*] This wasn’t just a little made-up product demo; it was an actual live feed of transactions in real time, written in a machine-readable format, and it included the users’ full names and links to their profile photos. “You could hit this one URL with a web script and get a hundred transactions,” Ton-That said of the scraper he built for the site. “And you could just keep hitting it all day. And each transaction was like ‘Here’s a photo you can download.’ ” Ton-That designed his computer program to visit the URL every two seconds.
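
A sketch of the kind of polling scraper described, with a hypothetical URL and field names standing in for the real feed (and a reminder that scraping live services can violate their terms):

    # Illustrative polling scraper: fetch a public JSON feed every two seconds.
    # FEED_URL and the field names are hypothetical stand-ins, not Venmo's real API.
    import time
    import requests

    FEED_URL = "https://example.com/public-feed.json"

    seen = set()
    while True:
        for txn in requests.get(FEED_URL, timeout=10).json():
            if txn["id"] not in seen:                 # only handle transactions not yet stored
                seen.add(txn["id"])
                print(txn["name"], txn["photo_url"])  # e.g. queue the profile photo for download
        time.sleep(2)                                 # hit the URL every two seconds, as described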

pages: 945 words: 292,893

Seveneves
by Neal Stephenson
Published 19 May 2015

“Take the first one that’s unlabeled,” Moira said. “Leave the door open, please.” Dr. Andrada coughed as the chilly air made his throat spasm. He opened one of the small hatches and slid the sample rack into it. In the meantime Moira was using a handheld printer to generate a sticker identifying the sample in English, in Filipino, and in a machine-readable bar code language. Once Dr. Andrada had returned to the central module, she went up to the open hatch, verified that the sample rack was properly seated in the tubular cavity beyond, then closed the hatch and affixed the sticker to its front. Printed on the hatch was a unique identification number and a bar code conveying the same thing, which she zapped and then double-checked.

A yellow-striped tube stretched away into the part of Quarantine set aside for people who, like them, were returning from the surface of Earth. A few meters in, they were confronted by a one-way door, constructed so that only one person at a time could pass through it. Hanging on a rack nearby were a number of hard bracelets, color-coded to indicate that they were intended for Survey personnel, also striped with machine-readable glyphs. Kath Two selected one and ratcheted it around her wrist. After a moment, a red diode began to blink on its back and digits began to count time. She waved it at the door, which unlocked itself and allowed her to pass into the tube beyond. This part of the Q consisted essentially of plumbing: a snarl of human-sized pipes that drained people away from incoming ships and let them pool in separate reservoirs until they had passed muster.

pages: 598 words: 140,612

Triumph of the City: How Our Greatest Invention Makes Us Richer, Smarter, Greener, Healthier, and Happier
by Edward L. Glaeser
Published 1 Jan 2011

Richard Wright: The Life and Times. New York: Holt, 2001. Rucker, Walter C., and James N. Upton. Encyclopedia of American Race Riots. Westport, CT: Greenwood, 2007. Ruggles, Steven, J. Trent Alexander, Katie Genadek, Ronald Goeken, Matthew B. Schroeder, and Matthew Sobek. Integrated Public Use Microdata Series, ver. 5.0 (machine-readable database). Minneapolis: University of Minnesota, 2010. Ruskin, John. The Genius of John Ruskin: Selections from His Writings, ed. John D. Rosenberg. New York: Routledge, 1980; Charlottesville: University Press of Virginia, 1997. ———. The Works of John Ruskin. London: G. Allen, 1903. Russell, Josiah C.

Frommer's Denver, Boulder & Colorado Springs
by Eric Peterson
Published 1 Jan 2005

E-Passports contain computer chips capable of storing biometric information, such as the required digital photograph of the holder. (You can identify an e-Passport by the symbol on the bottom center cover of your passport.) If your passport doesn’t have this feature, you can still travel without a visa if it is a valid passport issued before October 26, 2005, and includes a machine-readable zone, or between October 26, 2005, and October 25, 2006, and includes a digital photograph. For more information, go to www.travel.state.gov/visa. Citizens of all other countries must have (1) a valid passport that expires at least 6 months later than the scheduled end of their visit to the U.S., and (2) a tourist visa, which may be obtained without charge from any U.S. consulate.

pages: 666 words: 131,148

Frommer's Seattle 2010
by Karl Samson
Published 10 Mar 2010

Citizens of these nations also need to present a round-trip air or cruise ticket upon arrival. E-Passports contain computer chips capable of storing biometric information, such as the required digital photograph of the holder. If your passport doesn’t have this feature, you can still travel without a visa if it is a valid passport issued before October 26, 2005, and includes a machine-readable zone, or between October 26, 2005, and October 25, 2006, and includes a digital photograph. For more information, go to http://travel.state.gov/visa. Canadian citizens may enter the United States without visas; they will need to show passports (if traveling by air) and proof of residence, however.

From Airline Reservations to Sonic the Hedgehog: A History of the Software Industry
by Martin Campbell-Kelly
Published 15 Jan 2003

Disclosing the code would make it relatively easy to re-engineer the software and produce a functional replica perfectly legally. For this reason, Informatics decided to rely on trade secret law for protection. It required its customers and its employees to sign non-disclosure agreements. The programs were kept secret, only machine-readable binary code being supplied to the customer. The product was supplied by a perpetual license, which could be revoked should the user make unauthorized copies for other installations or for non-purchasers. Informatics did, however, fully utilize copyright law to protect the supporting documentation for Mark IV as conventional literary works.

pages: 260 words: 130,109

Frommer's Kauai
by Jeanette Foster
Published 27 Feb 2004

Citizens of these nations also need to present a round-trip air or cruise ticket upon arrival. E-Passports contain computer chips capable of storing biometric information, such as the required digital photograph of the holder. If your passport doesn’t have this feature, you can still travel without a visa if it is a valid passport issued before October 26, 2005, and includes a machine-readable zone, or between October 26, 2005, and October 25, 2006, and includes a digital photograph. For more information, go to http://travel.state.gov/visa. Canadian citizens may enter the United States without visas; they will need to show passports and proof of residence, however. Citizens of all other countries must have (1) a valid passport that expires at least 6 months later than the scheduled end of their visit to the U.S., and (2) a tourist visa.

pages: 452 words: 134,502

Hacking Politics: How Geeks, Progressives, the Tea Party, Gamers, Anarchists and Suits Teamed Up to Defeat SOPA and Save the Internet
by David Moon , Patrick Ruffini , David Segal , Aaron Swartz , Lawrence Lessig , Cory Doctorow , Zoe Lofgren , Jamie Laurie , Ron Paul , Mike Masnick , Kim Dotcom , Tiffiniy Cheng , Alexis Ohanian , Nicole Powers and Josh Levy
Published 30 Apr 2013

And since digital locks don’t work against determined attackers, the only way to keep files, programs, and keys out of wide circulation is to give rights holders the legal authority to demand that files be removed without court orders, to establish national censor walls that monitor Internet traffic and interdict requests for sites that rights holders have added to blacklists, and to ban tools that defeat any of this censorship. The Stop Online Piracy Act (SOPA) and the Protect Intellectual Property Act (PIPA), as well as related proposals, would ban the circumvention of Domain Name System (DNS) blocks and allow for IP blocking. DNS converts human-friendly Internet addresses (like ThePirateBay.se) into machine-readable numeric addresses (like 194.71.107.50). Efforts, like DNSSEC, to add a layer of security to DNS and detect and evade shenanigans at DNS servers would be illegal under SOPA and PIPA, as DNSSEC can’t (and shouldn’t be expected to) distinguish between the false DNS records doctored by a criminal, an oppressive government, and a record label.
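
The name-to-number step is one standard-library call; a minimal sketch (the address returned will vary):

    # DNS in miniature: resolve a human-friendly name to a machine-readable address.
    import socket

    print(socket.gethostbyname("example.com"))  # e.g. '93.184.216.34'
    # Blocking this lookup, or falsifying its answer, is exactly what DNS-based
    # site blocking does, and what DNSSEC tries to detect.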

pages: 752 words: 131,533

Python for Data Analysis
by Wes McKinney
Published 30 Dec 2011

XML and HTML: Web Scraping. Python has many libraries for reading and writing data in the ubiquitous HTML and XML formats. lxml (http://lxml.de) is one that has consistently strong performance in parsing very large files. lxml has multiple programmer interfaces; first I’ll show using lxml.html for HTML, then parse some XML using lxml.objectify. Many websites make data available in HTML tables for viewing in a browser, but not downloadable as an easily machine-readable format like JSON, CSV, or XML. I noticed that this was the case with Yahoo! Finance’s stock options data. If you aren’t familiar with this data: options are derivative contracts giving you the right to buy (call option) or sell (put option) a company’s stock at some particular price (the strike) between now and some fixed point in the future (the expiry).
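
A minimal sketch of the lxml.html table-scraping workflow the passage introduces (the URL is a placeholder; the old Yahoo! Finance pages are gone):

    # Scrape an HTML table into rows of cell text with lxml.html.
    # The URL is a placeholder; point it at any page containing a <table>.
    from urllib.request import urlopen
    from lxml.html import parse

    doc = parse(urlopen("https://example.com/table.html")).getroot()
    table = doc.findall(".//table")[0]                # first table on the page

    for tr in table.findall(".//tr"):
        cells = [td.text_content().strip() for td in tr.findall("td")]
        if cells:                                     # skip header-only rows
            print(cells)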

pages: 464 words: 127,283

Smart Cities: Big Data, Civic Hackers, and the Quest for a New Utopia
by Anthony M. Townsend
Published 29 Sep 2013

All transit operators face the thorny problem of communicating schedules, delays, and arrival information to millions of riders. Apps provide a quick, cheap, flexible, intuitive, and convenient way to push both schedules and real-time updates to anyone with a smartphone. As of early 2012, over two hundred transit agencies in North America were publishing some form of schedule information using a machine-readable format called General Transit Feed Specification, developed in 2005 by Google engineer Chris Harrelson and Bibiana McHugh, a technology manager at Portland, Oregon’s Tri-Met transit authority.21 Unlike most contest-generated apps, transit apps have a huge preexisting market, making it possible to build viable businesses that leverage open government data.
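
A GTFS feed is just a zip of CSV text files, which is why it is so easy for app developers to consume; a minimal sketch reading the spec-required stops.txt (the feed path is a placeholder):

    # List stop locations from a GTFS feed: a zip of CSVs, where stops.txt
    # carries stop_id, stop_name, stop_lat, and stop_lon per the GTFS spec.
    import csv
    import io
    import zipfile

    with zipfile.ZipFile("gtfs-feed.zip") as feed:    # placeholder path to a downloaded feed
        with feed.open("stops.txt") as f:
            reader = csv.DictReader(io.TextIOWrapper(f, encoding="utf-8-sig"))
            for stop in reader:
                print(stop["stop_id"], stop["stop_name"], stop["stop_lat"], stop["stop_lon"])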

Stocks for the Long Run, 4th Edition: The Definitive Guide to Financial Market Returns & Long Term Investment Strategies
by Jeremy J. Siegel
Published 18 Dec 2007

The firm wanted to investigate how well people had done investing in common stock and could not find reliable historical data. Professor Lorie teamed up with colleague Lawrence Fisher to build a database of securities data that could answer that question. With computer technology in its infancy, Lorie and Fisher created the Center for Research in Security Prices (CRSP, pronounced “crisp”) that compiled the first machine-readable file of stock prices dating from 1926 that was to become the accepted database for academic and professional research. The database currently contains all stocks traded on the New York and American Stock Exchanges and the Nasdaq. At the end of 2006, the market value of the 6,744 stocks was $18 trillion.

How I Became a Quant: Insights From 25 of Wall Street's Elite
by Richard R. Lindsey and Barry Schachter
Published 30 Jun 2007

A brief sermonette on how to avoid fooling yourself too badly is found in a talk I gave to a convention of computer scientists in 2002.13 The label quantitative suggests that we are talking about numerically driven strategies. In the Internet era, we find ourselves drinking from an information fire hose that includes prodigious amounts of text as well. The original quants were the first to exploit the machine-readable numerical data. Now, many are using computational language approaches to analyze text. The original customers for these technologies, again, were the military and civilian intelligence agencies. Their sources were clandestine intercepts, and later, Web content. Financial textual sources of interest include the usual news suspects, both specialized and general, and many sources of pre-news such as the SEC, the courts, and government agencies.

pages: 517 words: 139,477

Stocks for the Long Run 5/E: the Definitive Guide to Financial Market Returns & Long-Term Investment Strategies
by Jeremy Siegel
Published 7 Jan 2014

The firm wanted to investigate how well people had done investing in common stock, and it could not find reliable historical data. Professor Lorie teamed up with colleague Lawrence Fisher to build a database of securities data that could answer that question. With computer technology in its infancy, Lorie and Fisher created the Center for Research in Security Prices (CRSP, pronounced “crisp”) that compiled the first machine-readable file of stock prices dating from 1926 that was to become the accepted database for academic and professional research. The database currently contains all stocks traded on the New York and American Stock Exchanges and the Nasdaq. At the end of 2012, the market value of the nearly 5,000 stocks in the database was near $19 trillion.

pages: 455 words: 133,719

Overwhelmed: Work, Love, and Play When No One Has the Time
by Brigid Schulte
Published 11 Mar 2014

The institute found an increasing number of men in dual-earning couples experiencing conflict between the pressures of work and home, from 35 percent in 1977 to 60 percent in 2008, higher even than women, whose stress rose from 41 to 47 percent in the same period. http://familiesandwork.org/site/research/reports/newmalemystique.pdf. See also Brad Harrington, Fred Van Deusen, and Beth Humberd, The New Dad: Caring, Committed, and Conflicted (Boston: Boston College Center for Work & Family, 2011), www.bc.edu/content/dam/files/centers/cwf/pdf/FH-Study-Web-2.pdf. 8. Tom W. Smith et al., General Social Surveys, 1972–2010 (machine-readable data file) (Chicago: National Opinion Research Center, 2011), www3.norc.org/GSS+Website/. I ran a table on the question, “In general, how do you feel about your time, would you say you always feel rushed, sometimes or almost never?” and sorted the data by sex and by number of children. The question was asked in 1982, 1996, and 2004. 9.

pages: 474 words: 130,575

Surveillance Valley: The Rise of the Military-Digital Complex
by Yasha Levine
Published 6 Feb 2018

.… The very existence of a National Data Center may encourage certain federal officials to engage in questionable surveillance tactics. For example, optical scanners—devices with the capacity to read a variety of type fonts or handwriting at fantastic rates of speed—could be used to monitor our mail. By linking scanners with a computer system, the information drawn in by the scanner would be converted into machine-readable form and transferred into the subject’s file in the National Data Center. Then, with sophisticated programming, the dossiers of all of the surveillance subject’s correspondents could be produced at the touch of a button, and an appropriate entry—perhaps “associates with known criminals”—could be added to all of them.

pages: 505 words: 133,661

Who Owns England?: How We Lost Our Green and Pleasant Land, and How to Take It Back
by Guy Shrubsole
Published 1 May 2019

Since then, the growth of GIS (Geographic Information System) mapping tools has transformed how maps can be made and shared. An EU directive called INSPIRE has forced the Land Registry and Ordnance Survey to publish digital maps showing the outlines of all land parcels in England and Wales – but not who owns them, and with licensing restrictions in place on reproducing the maps. Machine-readable datasets and open-source software have made it easier to analyse complex datasets detailing who owns land, while modern web mapping allows us to create powerful online maps. The Open Data movement has also sought to shift culture, both within government and wider civil society, so that previously closed data is made open and easily accessible.

pages: 520 words: 134,627

Unacceptable: Privilege, Deceit & the Making of the College Admissions Scandal
by Melissa Korn and Jennifer Levitz
Published 20 Jul 2020

Early on a Saturday, the students would enter either Jack Yates High or West Hollywood Prep, where Riddell would meet them. Armed with their No. 2 pencils, the students would begin to take their tests, following special instructions. Singer’s game plan even included a way to deal with the Scantron sheets, the machine-readable paper on which students typically fill in bubbles to test questions: He told kids to write their answers on separate sheets of paper, to avoid mistakes on the grid of dots. He really had them do it so Riddell could bubble in the correct answers later, without having to first erase all those wrong bubbles.

Southwest USA Travel Guide
by Lonely Planet

In most cases, your passport must be valid for at least another six months after you are due to leave the USA. If your passport doesn’t meet current US standards you’ll be turned back at the border. If your passport was issued before October 26, 2005, it must be machine readable (with two lines of letters, numbers and the repeated symbol <<< at the bottom); if it was issued between October 26, 2005 and October 25, 2006, it must be machine readable and include a digital photo on the data page or integrated chip with information from the data page; and if it was issued on or after October 26, 2006, it must be an e-Passport with a digital photo and an integrated chip containing information from the data page.

pages: 528 words: 146,459

Computer: A History of the Information Machine
by Martin Campbell-Kelly and Nathan Ensmenger
Published 29 Jul 2013

The first flowcharts were developed for use in industrial engineering in the 1920s, and von Neumann (who had first trained as a chemical engineer) began applying them to computer programs in the 1950s. Flowcharts were intended to serve as the blueprint for computer programmers, an intermediate step between the analysis of a system and its implementation as a software application. The second step in the translation process involved the coding of the algorithm in machine-readable form and had already been partially addressed by Wilkes and Wheeler. By about 1953, however, the center of programming research had moved from England to the United States, where there was heavy investing in computers with large memories, backed up with magnetic tapes and drums. Computers now had perhaps ten times the memory of the first prototypes.

pages: 550 words: 154,725

The Idea Factory: Bell Labs and the Great Age of American Innovation
by Jon Gertner
Published 15 Mar 2012

A visitor could also try something called a portable “pager,” a big, blocky device that could alert doctors and other busy professionals when they received urgent calls.2 New York’s fair would dwarf Seattle’s. The crowds were expected to be immense—probably somewhere around 50 or 60 million people in total. Pierce and David’s 1961 memo recommended a number of exhibits: “personal hand-carried telephones,” “business letters in machine-readable form, transmitted by wire,” “information retrieval from a distant computer-automated library,” and “satellite and space communications.” By the time the fair opened in April 1964, though, the Bell System exhibits, housed in a huge white cantilevered building nicknamed the “floating wing,” described a more conservative future than the one Pierce and David had envisioned.

pages: 492 words: 153,565

Countdown to Zero Day: Stuxnet and the Launch of the World's First Digital Weapon
by Kim Zetter
Published 11 Nov 2014

File paths show the folder and subfolders where a file or document is stored on a computer. The file path for a document called “my résumé” stored in a computer’s Documents folder on the C: drive would look like this—c:\documents\myresume.doc. Sometimes when programmers run source code through a compiler—a tool that translates human-readable programming language into machine-readable binary code—the file path indicating where the programmer had stored the code on his computer gets placed in the compiled binary file. Most malware writers configure their compilers to eliminate the file path, but Stuxnet’s attackers didn’t do this, either by accident or not. The path showed up as b:\myrtus\src\objfre_w2k_x86\i386\guava.pdb in the driver file, indicating that the driver was part of a project the programmer had called “guava,” which was stored on his computer in a directory named “myrtus.”
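
Hunting for such leftovers is straightforward; a minimal sketch (our illustration) that scans a binary for embedded .pdb paths of the kind that betrayed the “myrtus” project directory:

    # Scan a compiled binary for embedded PDB file paths (compiler leftovers).
    # "driver.sys" is a placeholder for whatever binary is under analysis.
    import re

    data = open("driver.sys", "rb").read()
    for match in re.finditer(rb"[a-zA-Z]:\\[ -~]{4,}?\.pdb", data):
        print(match.group().decode("ascii", "replace"))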

pages: 595 words: 143,394

Rigged: How the Media, Big Tech, and the Democrats Seized Our Elections
by Mollie Hemingway
Published 11 Oct 2021

Poll workers, many of whom had no hands-on training because of the pandemic, were often befuddled by the new technology,” O’Brien said.24 J. Alex Halderman, a University of Michigan professor of computer science, said he’d analyzed the voting system, which issues a “QR” code for each ballot—a machine-readable optical label that contains information about the item to which it is attached—and found that “there’s nothing that stops an attacker from just duplicating one, and the duplicate would count the same as the original bar code.”25 Halderman had also tested whether voters could catch deliberately placed errors on their ballot, and only 7 percent did.

pages: 688 words: 147,571

Robot Rules: Regulating Artificial Intelligence
by Jacob Turner
Published 29 Oct 2018

The Internet is not the only mass data set which might be prone to similar problems of inherent bias . It is possible that Google ’s TensorFlow as well as Amazon and Microsoft ’s Gluon libraries of machine learning software might have similar latent defects. 3.2.4 Data Available Is Insufficiently Detailed Sometimes the entire universe of data available in a machine-readable format is insufficiently detailed to achieve unbiased results. For example, AI might be asked to determine which candidates are best suited to jobs as labourers on a building site based on data from successful incumbent workers. If the only data made available to the AI are age and gender, then it is most likely that the AI will select younger men for the job.

pages: 550 words: 160,356

Snow Crash
by Neal Stephenson
Published 15 Jul 2003

He uploads it to the CIC database—the Library, formerly the Library of Congress, but no one calls it that anymore. Most people are not entirely clear on what the word "congress" means. And even the word "library" is getting hazy. It used to be a place full of books, mostly old ones. Then they began to include videotapes, records, and magazines. Then all of the information got converted into machine-readable form, which is to say, ones and zeroes. And as the number of media grew, the material became more up to date, and the methods for searching the Library became more and more sophisticated, it approached the point where there was no substantive difference between the Library of Congress and the Central Intelligence Agency.

Ghost in the Wires: My Adventures as the World's Most Wanted Hacker
by Kevin Mitnick
Published 14 Aug 2011

Once again I couldn’t believe how easy it was, with no roadblocks being thrown up in front of me. I felt a great sense of accomplishment and the kind of satisfaction I had known as a kid in Little League when I hit a home run. But later that day, I realized, Damn! I had never thought to grab the compiler—the program that translates the source code written by a programmer into “machine-readable” code, the ones and zeros that a computer, or the processor in a cell phone, can understand. So that became my next challenge. Did Motorola develop their own compiler for the 68HC11 processor used in the MicroTac, or did they purchase it from another software vendor? And how was I going to get it?

Frommer's San Diego 2011
by Mark Hiss
Published 2 Jan 2007

Citizens of these nations also need to present a round-trip air or cruise ticket upon arrival. E-Passports contain computer chips capable of storing biometric information, such as the required digital photograph of the holder. If your passport doesn’t have this feature, you can still travel without a visa if the valid passport was issued before October 26, 2005, and includes a machine-readable zone; or if the valid passport was issued between October 26, 2005, and October 25, 2006, and includes a digital photograph. For more information, go to http://travel.state.gov/visa. Canadian citizens may enter the United States without visas, but will need to show passports and proof of residence.

pages: 552 words: 168,518

MacroWikinomics: Rebooting Business and the World
by Don Tapscott and Anthony D. Williams
Published 28 Sep 2010

If other companies were to follow suit, a patent pool for neglected diseases would provide a significant boost to researchers who have been working on treatments for diseases such as TB, malaria, and river blindness. In the near future, intelligent machines will accelerate biomedical research even further. Bradley’s team already translates their “human readable” logs into a machine-readable format. “We really want to get to the point where machines can design experiments, can execute experiments, and can analyze them,” he says. Soon it could become difficult to tell if you’re interacting with a machine or a human. “We’re not actually that far from that point,” he says. “And I think that’s when things will really accelerate.”

Turing's Cathedral
by George Dyson
Published 6 Mar 2012

Hedi Selberg transferred her expertise to the Princeton Plasma Physics Laboratory, and Ralph Slutz became director of computing at the National Center for Atmospheric Research, in Boulder, Colorado. Richard Melville and Hewitt Crane went to the Stanford Research Institute, developing, among other things, the ERMA system for electronic clearing of machine-readable checks between banks. Dick Snyder returned to RCA, working on magnetic-core memory but unable to persuade RCA, as Zworykin had managed to with television, to take the lead. Morris Rubinoff returned to the University of Pennsylvania, and for an interval to Philco, where he supervised the design of the Philco 2000, the first fully transistorized computer, with asynchronous arithmetic, a feature that had been developed at IAS.

pages: 606 words: 157,120

To Save Everything, Click Here: The Folly of Technological Solutionism
by Evgeny Morozov
Published 15 Nov 2013

For all we know, since the Nazis had an enviable train system, they’d be all for making their train data universally accessible. As Yu and Robinson argue, “A government can provide ‘open data’ on politically neutral topics even as it remains deeply opaque and unaccountable. The Hungarian cities of Budapest and Szeged, for example, both provide online, machine-readable transit schedules, allowing Google Maps to route users on local trips.” Isn’t such data both open and governmental? It surely is. But it may not make Hungary any more democratic. In fact, while the country has been nudging ever closer to authoritarian rule, it might have also emerged as one of the successes of “open government.”

pages: 625 words: 167,097

Kiln People
by David Brin
Published 15 Jan 2002

I sure would hate to go through stuff like that. But how else can anything be retrieved? Only the original human template can inload a duplicate's full memory. No other person or computer can substitute. If the template's missing or dead, all you can do is physically sift the copy's brain for crude sepia images -- the only data that's machine-readable from golemflesh. The rest -- your consciousness Standing Wave, the core sense of self that some call the soul -- is little more than useless static. There used to be an old riddle. Are the colors you see the same as the ones I see? When you smell a rose, are you experiencing the same heady sensations that I do, when I sniff the same flower?

Smart Grid Standards
by Takuro Sato
Published 17 Nov 2015

Thirdly, all other specifics of the hardware environment shall be shielded by the other layers in the interface profile. 3.4.2.4 Information Exchange Model The IEC 61968 series requires the following of a compliant utility inter-application infrastructure. Firstly, it shall have one logical IEC 61968 IEM, whose implementation may be physically distributed. This facility allows information exchanged among components to be declared in a publicly accessible manner. Secondly, the IEM shall be accessible in machine-readable and platform-independent form. Thirdly, information is exchanged between components via one or more events whose types are defined in the IEM. Fourthly, the IEM shall maintain descriptions of the contents, syntax, and semantics (i.e., meaning) of the information exchanged between components.

pages: 496 words: 162,951

We Were Soldiers Once...and Young: Ia Drang - the Battle That Changed the War in Vietnam
by Harold G. Moore and Joseph L. Galloway
Published 19 Oct 1991

It also focuses on interrogation reports and "knowledgeability briefs" concerning 35 North Vietnamese prisoners--officers, NCOs, and line soldiers--most of whom were captured by the 1st Cavalry Division (Airmobile) during October and November 1965. In the matter of dates of death, homes of record, birthdates, and, in some cases, places of death, three books, one document, and eyewitness accounts were all utilized. The basic reference was the Casualty Information System, 1961-1981 (machine-readable record), Records of the Adjutant General's Office, Record Group 407, National Archives Building. Next, the book Vietnam Veterans Memorial: Directory of Names, published by the Vietnam Veterans Memorial Fund, Inc., Washington, D.C. (May 1991), was most helpful in crosschecking ranks, birth and death dates, and homes of record.

Frommer's San Francisco 2012
by Matthew Poole , Erika Lenkert and Kristin Luna
Published 4 Oct 2011

Citizens of these nations also need to present a round-trip air or cruise ticket upon arrival. E-Passports contain computer chips capable of storing biometric information, such as the required digital photograph of the holder. If your passport doesn’t have this feature, you can still travel without a visa if the valid passport was issued before October 26, 2005, and includes a machine-readable zone; or if the valid passport was issued between October 26, 2005, and October 25, 2006, and includes a digital photograph. For more information, go to http://travel.state.gov/visa. Canadian citizens may enter the United States without visas, but will need to show passports and proof of residence.

pages: 719 words: 181,090

Site Reliability Engineering: How Google Runs Production Systems
by Betsy Beyer , Chris Jones , Jennifer Petoff and Niall Richard Murphy
Published 15 Apr 2016

The config ultimately acts as a configuration layer that allows all the other components to be wired together. It’s designed to be human-readable and configurable. Auxon Configuration Language Engine acts based upon the information it receives from the Intent Config. This component formulates a machine-readable request (a protocol buffer) that can be understood by the Auxon Solver. It applies light sanity checking to the configuration, and is designed to act as the gateway between the human-configurable intent definition and the machine-parseable optimization request. Auxon Solver is the brain of the tool.

pages: 612 words: 187,431

The Art of UNIX Programming
by Eric S. Raymond
Published 22 Sep 2003

To resolve this conflict, notice that it's the server's job to use the registry data, but the task of carefully error-checking that data could be handed off to another program to be run by human editors each time the registry is modified. One Unix solution would be a separate auditing program that analyzes either a machine-readable specification of the ruleset format or the source of the server code to determine the set of properties it uses, parses the Freeciv registry to determine the set of properties it provides, and prepares a difference report.[62] The aggregate of all Freeciv data files is functionally similar to a Windows registry, and even uses a syntax resembling the textual portions of registries.
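
Once both sides are parsed, the proposed difference report is a pair of set subtractions; a schematic sketch with the Freeciv-specific parsing stubbed out:

    # Schematic difference report: properties the server uses vs. properties the
    # registry provides. The extractors are stubs; real ones would parse the
    # Freeciv registry and the server source (or its machine-readable spec).
    def properties_used_by_server():
        return {"food_cost", "shield_cost", "trade_bonus"}      # stub data

    def properties_in_registry():
        return {"food_cost", "shield_cost", "gold_upkeep"}      # stub data

    used, provided = properties_used_by_server(), properties_in_registry()
    print("missing from registry:", sorted(used - provided))    # reads that will fail
    print("unused by server:", sorted(provided - used))         # likely typos or dead entries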

pages: 661 words: 185,701

The Future of Money: How the Digital Revolution Is Transforming Currencies and Finance
by Eswar S. Prasad
Published 27 Sep 2021

Alipay’s success in resolving the lack of trust between transacting parties resulted in its rapid and tremendous growth, which soon led to its adoption even on other platforms outside the Alibaba ecosystem. Alipay has played a key role in innovations such as the QR code–based payment technology that has put the means of payment in the hands of the customer (in the form of a mobile phone) and requires the merchant only to have a QR reader (QR code stands for “Quick Response code,” a machine-readable matrix bar code). This allows merchants to process payments even if they are off-line or lack a stable internet or mobile phone connection. QR readers are also markedly cheaper to set up and maintain than the point-of-sale processors associated with debit and credit cards. Alipay’s success has inspired competition.
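
Producing such a code takes one call with common tooling; a sketch assuming the third-party qrcode package (with Pillow) is installed, and an illustrative payload format:

    # Encode a payment string as a QR code image.
    # Assumes the third-party 'qrcode' package (with Pillow); the payload is illustrative.
    import qrcode

    img = qrcode.make("pay:merchant-12345?amount=9.99")
    img.save("pay.png")  # the merchant displays this; the customer's phone scans it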

pages: 894 words: 190,485

Write Great Code, Volume 1
by Randall Hyde
Published 6 Aug 2012

Character Representation Although computers are famous for their “number-crunching” capabilities, the truth is that most computer systems process character data far more often than numbers. Given the importance of character manipulation in modern software, a thorough understanding of character and string data is necessary if you’re going to write great code. The term character refers to a human or machine-readable symbol that is typically a nonnumeric entity. In general, a character is any symbol that you can type on a keyboard or display on a video display. Note that in addition to alphabetic characters, character data includes punctuation symbols, numeric digits, spaces, tabs, carriage returns (the enter key), other control characters, and other special symbols.
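
An interpreter makes the point that characters are just small numbers; a short sketch:

    # Characters are numeric codes under the hood.
    print(ord("A"))     # 65: the ASCII/Unicode code point for 'A'
    print(chr(66))      # 'B': the character for code point 66
    print(ord("\t"))    # 9: control characters such as tab are characters too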

pages: 743 words: 201,651

Free Speech: Ten Principles for a Connected World
by Timothy Garton Ash
Published 23 May 2016

As your supermarket food packet is supposed to tell you what’s in it, so the metadata tag should tell you who originated, first published and subsequently changed the story you’re reading online, and the journalistic code of practice (if any) the originating or republishing platform adheres to. In 2012, the New York Times introduced rNews, a new standard for embedding machine-readable publishing metadata into HTML documents, and similar schema.org tags have been developed with the backing of major internet companies.33 Then there is the idea of a more general sort of kitemarking, so that you know what kind of speech store you are visiting. Is it an organic foods delicatessen, a supermarket, a corner store or an unauthorised street vendor’s stall?
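
The schema.org descendants of that idea are easy to sketch; a minimal snippet emitting the sort of machine-readable NewsArticle block such tags carry (all values are placeholders):

    # Emit a schema.org NewsArticle JSON-LD block, the kind of machine-readable
    # provenance tag described above. Every value here is a placeholder.
    import json

    metadata = {
        "@context": "https://schema.org",
        "@type": "NewsArticle",
        "headline": "Example headline",
        "datePublished": "2015-01-01",
        "author": {"@type": "Person", "name": "A. Reporter"},
        "publisher": {"@type": "Organization", "name": "Example Times"},
    }

    print('<script type="application/ld+json">')
    print(json.dumps(metadata, indent=2))
    print('</script>')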

pages: 420 words: 219,075

Frommer's New Mexico
by Lesley S. King
Published 2 Jan 1999

Citizens of these nations also need to present a round-trip air or cruise ticket upon arrival. E-Passports contain computer chips capable of storing biometric information, such as the required digital photograph of the holder. If your passport doesn’t have this feature, you can still travel without a visa if the valid passport was issued before October 26, 2005, and includes a machine-readable zone; or if the valid passport was issued between October 26, 2005, and October 25, 2006, and includes a digital photograph. For more information, go to www.travel.state.gov/visa. Canadian citizens may enter the United States without visas, but will need to show passports and proof of residence.

pages: 968 words: 224,513

The Art of Assembly Language
by Randall Hyde
Published 8 Sep 2003

This generally produces more accurate results and requires far less silicon than having a separate coprocessor that supports decimal arithmetic. 2.14 Characters Perhaps the most important data type on a personal computer is the character data type. The term character refers to a human or machine-readable symbol that is typically a nonnumeric entity. In general, the term character refers to any symbol that you can normally type on a keyboard (including some symbols that may require multiple key presses to produce) or display on a video display. Many beginners often confuse the terms character and alphabetic character.

pages: 388 words: 211,314

Frommer's Washington State
by Karl Samson
Published 2 Nov 2010

Citizens of these nations also need to present a round-trip air or cruise ticket upon arrival. E-Passports contain computer chips capable of storing biometric information, such as the required digital photograph of the holder. If your passport doesn’t have this feature, you can still travel without a visa if the valid passport was issued before October 26, 2005, and includes a machine-readable zone; or if the valid passport was issued between October 26, 2005, and October 25, 2006, and includes a digital photograph. For more information, go to http://travel.state.gov/visa. Canadian citizens may enter the United States without visas, but will need to show passports and proof of residence.

pages: 1,085 words: 219,144

Solr in Action
by Trey Grainger and Timothy Potter
Published 14 Sep 2014

The only difference between the Term and the Raw query parsers is that the Raw query parser searches for the exact token in the Solr index, but the Term query parser searches for the readable version of the term. In certain kinds of fields, such as numeric fields that internally store values in a trie structure for greater search efficiency, the Term query parser will accept the readable version of the number (1.5), whereas the Raw query parser will accept the machine-readable version (the internal representation of the field in the index). The number 1 in an integer field may be represented by a trie structure in the Solr index with a token such as `&#8;&#0;&#0;&#0;&#1;. The following queries would both return a document containing the integer value 1. {!term f=myintfield}1 {!

pages: 678 words: 216,204

The Wealth of Networks: How Social Production Transforms Markets and Freedom
by Yochai Benkler
Published 14 May 2006

The person or small group starts by developing a part of this project, up to a point where the whole utility--if it is simple enough--or some important part of it, is functional, though it might have much room for improvement. At this point, the person makes the program freely available to others, with its source code--instructions in a human-readable language that explain how the software does whatever it does when compiled into a machine-readable language. When others begin to use it, they may find bugs, or related utilities that they want to add (e.g., the photo-retouching software only increases size and sharpness, and one of its users wants it to allow changing colors as well). The person who has found the bug or is interested in how to add functions to the software may or may not be the best person in the world to actually write the software fix.

Seeking SRE: Conversations About Running Production Systems at Scale
by David N. Blank-Edelman
Published 16 Sep 2018

At our size, it is possible for ProdEng to review all incidents company-wide and identify issues that occur across teams or that might be solved by a cross-team effort. Postmortem documentation follows a specific format guided by a template in the internal wiki. We prioritize ease of use for humans over machine readability, but do also extract some automated reports. A weekly meeting is the forum for discussing recent incidents, resolutions, and future improvements. The meetings are attended by at least one representative for each incident on the agenda, ProdEng, and anyone else who wants to join. The majority of attendees are engineers directly involved in the development of the affected services.

pages: 389 words: 210,632

Frommer's Oregon
by Karl Samson
Published 26 Apr 2010

Citizens of these nations also need to present a round-trip air or cruise ticket upon arrival. E-Passports contain computer chips capable of storing biometric information, such as the required digital photograph of the holder. If your passport doesn’t have this feature, you can still travel without a visa if it is a valid passport issued before October 26, 2005, and includes a machine-readable zone, or between October 26, 2005, and October 25, 2006, and includes a digital photograph. For more information, go to http://travel.state.gov/visa. Canadian citizens may enter the United States without visas; they will need to show passports (if traveling by air) and proof of residence, however.

pages: 496 words: 174,084

Masterminds of Programming: Conversations With the Creators of Major Programming Languages
by Federico Biancuzzi and Shane Warden
Published 21 Mar 2009

The stuff we know how to do is giving guesses as to what the completions might be and show you what the parameters are, and it can show you other references if it’s good, and find the definitions. It’s harder of course to find a function you don’t know the name of that does something, I mean you’re sure somewhere in this mess is a function that formats numbers in a machine-readable way and puts commas in or something like that, right? But how do you remember the name? How do you find the names of these functions? And so when people write these libraries, they try naming conventions, informal naming conventions and things like that. But a lot of that stuff just doesn’t scale very well.

The Art of Scalability: Scalable Web Architecture, Processes, and Organizations for the Modern Enterprise
by Martin L. Abbott and Michael T. Fisher
Published 1 Dec 2009

DIFFERENT USES FOR GRID COMPUTING Build Steps There are many different types of compilers and many different processes that source code goes through to become code that can be executed by a machine. At a high level, languages are either compiled or interpreted. Forget about just-in-time (JIT) compilers and bytecode interpreters; compiled languages are those in which the code written by engineers is reduced to machine-readable code ahead of time by a compiler. Interpreted languages use an interpreter to read the code from the source file and execute it at runtime. Here are the rudimentary steps followed by most compilation processes and the corresponding input/output: In: source code. 1. Preprocessing.

pages: 976 words: 235,576

The Meritocracy Trap: How America's Foundational Myth Feeds Inequality, Dismantles the Middle Class, and Devours the Elite
by Daniel Markovits
Published 14 Sep 2019

(S.D.N.Y. 2012) (“To further ensure that loans would proceed as quickly as possible to closing, Countrywide revamped the compensation structure of those involved in loan origination, basing performance bonuses solely on volume.”). the model of an assembly line: Rajan, Fault Lines, 128 (“But as investment banks put together gigantic packages of mortgages, the judgment calls became less and less important in credit assessments: after all, there was no way to code the borrower’s capacity to hold a job in an objective, machine-readable way. Indeed, recording judgment calls in a way that could not be supported by hard facts might have opened the mortgage lender to lawsuits alleging discrimination. All that seemed to matter to the investment banks and the rating agencies were the numerical credit score of the borrower and the amount of the loan relative to house value.

The Rough Guide to New York City
by Martin Dunford
Published 2 Jan 2009

Be warned, some converters may not be able to handle certain high-wattage items, especially those with heated elements. Entry requirements Under the Visa Waiver Program, citizens of Australia, Ireland, New Zealand, and the UK do not require visas for visits to the US of ninety days or less. You will, however, need to present a machine-readable passport and a completed visa waiver form to Immigration upon arrival; the latter will be provided by your travel agent or by the airline. Canadians now require a passport to cross the border, but can travel in the US for an unlimited amount of time without a visa. For visa information, visit www.travel.state.gov.

Rough Guide to San Francisco and the Bay Area
by Nick Edwards and Mark Ellwood
Published 2 Jan 2009

Although regulations have been continually tightening up since 9/11, citizens of 27 countries, including the UK, Ireland, Australia, New Zealand, and most Western European countries, visiting the United States for a period of less than ninety days can still enter the country on the Visa Waiver Scheme. The requisite visa waiver form (I-94W) is provided by the airline during check-in or on the plane, and presented to an immigration official on arrival. However, all passports accompanying an I-94W must now be machine readable and any issued after October 2006 must include a digital chip containing biometric data (these are now automatically issued by most countries, but check). Anybody whose passport does not meet these requirements will require some sort of visa for even a short stay in America, as will anybody planning to stay over three months: check www.dhs.gov for updates and the list of Visa Waiver Scheme countries.

The Rough Guide to New York City
by Rough Guides
Published 21 May 2018

Once given, authorizations are valid for multiple entries into the US for around two years – it’s recommended that you submit an ESTA application as soon as you begin making travel plans (in most cases the ESTA will be granted immediately, but it can sometimes take up to 72 hours to get a response). You’ll need to present a machine-readable passport to Immigration upon arrival. Note that ESTA currently only applies to visitors arriving by air or cruise ship: crossing the land border from Canada or Mexico, those qualifying for the Visa Waiver Program do not need to apply for ESTA – instead you must fill in an I-94W form, though this may change in future.

Data Mining: Concepts and Techniques: Concepts and Techniques
by Jiawei Han , Micheline Kamber and Jian Pei
Published 21 Jun 2011

Here, we illustrate classic problems in machine learning that are highly related to data mining. ■ Supervised learning is basically a synonym for classification. The supervision in the learning comes from the labeled examples in the training data set. For example, in the postal code recognition problem, a set of handwritten postal code images and their corresponding machine-readable translations are used as the training examples, which supervise the learning of the classification model. ■ Unsupervised learning is essentially a synonym for clustering. The learning process is unsupervised since the input examples are not class labeled. Typically, we may use clustering to discover classes within the data.
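
The contrast fits in a few lines of code; a sketch using scikit-learn (our choice of library) on toy one-dimensional data:

    # Supervised vs. unsupervised learning on the same toy data (scikit-learn assumed).
    from sklearn.cluster import KMeans
    from sklearn.neighbors import KNeighborsClassifier

    X = [[0.1], [0.2], [0.9], [1.0]]    # four one-dimensional examples

    # Supervised (classification): labeled examples guide the learning.
    clf = KNeighborsClassifier(n_neighbors=1).fit(X, ["low", "low", "high", "high"])
    print(clf.predict([[0.15]]))        # -> ['low']

    # Unsupervised (clustering): no labels; the two groups are discovered.
    km = KMeans(n_clusters=2, n_init=10).fit(X)
    print(km.labels_)                   # e.g. [0 0 1 1]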

Coastal California
by Lonely Planet

Under the US Department of Homeland Security (DHS) registration program, US-VISIT (www.dhs.gov/us-visit), almost all visitors (excluding, for now, many Canadian, some Mexican citizens and also children under age 14) will be digitally photographed and have their electronic (inkless) fingerprints scanned upon arrival; the process typically takes just a minute. Passport
»Under the Western Hemisphere Travel Initiative (WHTI), all travelers must have a valid machine-readable (MRP) passport when entering the USA by air, land or sea.
»The only exceptions are for most US citizens and some Canadian and Mexican citizens traveling by land who can present other WHTI-compliant documents (eg pre-approved ‘trusted traveler’ cards). For details, check www.getyouhome.gov.
»All foreign passports must meet current US standards and be valid for at least six months longer than your intended stay.
»MRP passports issued or renewed after October 26, 2006 must be e-passports (ie have a digital photo and integrated chip with biometric data).

pages: 2,466 words: 668,761

Artificial Intelligence: A Modern Approach
by Stuart Russell and Peter Norvig
Published 14 Jul 2019

We have seen throughout this book that factored or structured models allow for more expressive power and better generalization. We will see in Section 25.1 that a factored model called word embeddings gives a better ability to generalize. One type of structured word model is a dictionary, usually constructed through manual labor. For example, WordNet is an open-source, hand-curated dictionary in machine-readable format that has proven useful for many natural language applications.1 Below is the WordNet entry for “kitten”:
“kitten” <noun.animal> (“young domestic cat”) IS A: young_mammal
“kitten” <verb.body> (“give birth to kittens”) EXAMPLE: “our cat kittened again this year”
WordNet will help you separate the nouns from the verbs, and get the basic categories (a kitten is a young mammal, which is a mammal, which is an animal), but it won’t tell you the details of what a kitten looks like or acts like.
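
WordNet is also scriptable; a minimal sketch through NLTK's interface (assumes nltk is installed and its wordnet corpus has been downloaded):

    # Query WordNet programmatically. Assumes: pip install nltk, then a one-time
    # nltk.download('wordnet').
    from nltk.corpus import wordnet as wn

    for synset in wn.synsets("kitten"):
        print(synset.name(), "-", synset.definition())
        print("  IS A:", [h.name() for h in synset.hypernyms()])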

Chapter 24 explained the key elements of natural language, including grammar and semantics. Systems based on parsing and semantic analysis have demonstrated success on many tasks, but their performance is limited by the endless complexity of linguistic phenomena in real text. Given the vast amount of text available in machine-readable form, it makes sense to consider whether approaches based on data-driven machine learning can be more effective. We explore this hypothesis using the tools provided by deep learning systems (Chapter 22). We begin in Section 25.1 by showing how learning can be improved by representing words as points in a high-dimensional space, rather than as atomic values.

pages: 1,302 words: 289,469

The Web Application Hacker's Handbook: Finding and Exploiting Security Flaws
by Dafydd Stuttard and Marcus Pinto
Published 30 Sep 2007

If this process is not carried out safely, attackers may be able to submit malicious input to interfere with the database and potentially read and write sensitive data. These attacks are described in Chapter 9, along with detailed explanations of the SQL language and how it can be used. XML Extensible Markup Language (XML) is a specification for encoding data in a machine-readable form. Like any markup language, the XML format separates a document into content (which is data) and markup (which annotates the data). Markup is primarily represented using tags, which may be start tags, end tags, or empty-element tags: <tagname> </tagname> <tagname/> Start and end tags are paired into elements and may encapsulate document content or child elements: <pet>ginger</pet> <pets><dog>spot</dog><cat>paws</cat></pets> Tags may include attributes, which are name/value pairs: <data version="2.1"><pets> . . .

USA Travel Guide
by Lonely, Planet

Your passport must be valid for at least six months longer than your intended stay in the USA. Also, if your passport does not meet current US standards, you’ll be turned back at the border. If your passport was issued before October 26, 2005, it must be ‘machine readable’ (with two lines of letters, numbers and <<< at the bottom); if it was issued between October 26, 2005, and October 25, 2006, it must be machine readable and include a digital photo; and if it was issued on or after October 26, 2006, it must be an e-Passport with a digital photo and an integrated RFID chip containing biometric data. Air Airports The USA has more than 375 domestic airports, but only a baker’s dozen are the main international gateways.

Hawaii
by Jeff Campbell
Published 4 Nov 2009

There are 35 countries currently participating; they are Andorra, Australia, Austria, Belgium, Brunei, Czech Republic, Denmark, Estonia, Finland, France, Germany, Hungary, Iceland, Ireland, Italy, Japan, South Korea, Latvia, Liechtenstein, Lithuania, Luxembourg, Malta, Monaco, the Netherlands, New Zealand, Norway, Portugal, San Marino, Singapore, Slovakia, Slovenia, Spain, Sweden, Switzerland and the UK. Under this program you must have a return ticket (or onward ticket to any foreign destination) that is nonrefundable in the USA. If your passport was issued/renewed after October 26, 2006, you need an ‘e-passport’ with digital chip; otherwise, you need a machine-readable passport. Visitors who don’t qualify for the Visa Waiver Program need a visa. Basic requirements are a valid passport, recent photo, travel details and often proof of financial stability. Students and adult males also must fill out supplemental travel documents. Those planning to travel through other countries before arriving in the USA are better off applying for their US visa in their home country rather than while on the road.

Hawaii Travel Guide
by Lonely Planet

Upon arrival in the USA, most foreign citizens (excluding for now, many Canadians, some Mexicans, all children under age 14 and seniors over age 79) must register with the Department of Homeland Security (DHS; www.dhs.gov), which entails having electronic (inkless) fingerprints and a digital photo taken. PASSPORTS A machine-readable passport (MRP) is required for all foreign citizens to enter the USA. Your passport must be valid for six months beyond your expected dates of stay in the USA. If your passport was issued/renewed after October 26, 2006, you need an 'e-passport' with a digital photo and an integrated chip containing biometric data.

pages: 675 words: 344,555

Frommer's Hawaii 2009
by Jeanette Foster
Published 2 Jan 2008

E-Passports contain computer chips capable of storing biometric information, such as the required digital photograph of the holder. (You can identify an e-Passport by the symbol on the bottom center cover of your passport.) If your passport doesn’t have this feature, you can still travel without a visa if it is a valid passport issued before October 26, 2005, that includes a machine-readable zone, or one issued between October 26, 2005, and October 25, 2006, that includes a digital photograph. For more information, go to www.travel.state.gov/visa. Citizens of all other countries must have (1) a valid passport that expires at least 6 months later than the scheduled end of their visit to the U.S., and (2) a tourist visa, which may be obtained without charge from any U.S. consulate.

Eastern USA
by Lonely Planet

The US State Department (www.travel.state.gov/visa) has the latest information, or you can check with a US consulate in your home country. Visa Waiver Program & ESTA Under the US visa-waiver program, visas are not required for citizens of 36 countries – including most EU members, Japan, Australia, New Zealand and the UK – for visits of up to 90 days (no extensions allowed), as long as you can present a machine-readable passport and are approved under the Electronic System for Travel Authorization (ESTA; www.cbp.gov/esta). Note you must register at least 72 hours before arrival and there’s a $14 fee for processing and authorization. In essence, ESTA requires that you register specific information online (name, address, passport info, etc) prior to entering the US.

Caribbean Islands
by Lonely Planet

Dial-A-Ride (776-1277) helps with transportation needs on the island. On St John, Concordia Eco-Tents (www.maho.org) provides well-regarded accessible lodging. Visas Visitors from most Western countries do not need a visa to enter the USVI if they are staying less than 90 days. This holds true as long as you can present a machine-readable passport and are approved under the Electronic System for Travel Authorization (ESTA; www.cbp.gov/esta). Note that you must register for ESTA at least 72 hours before arrival, and there’s a US$14 fee for processing and authorization. If you do need a visa, contact your local embassy. The US State Department (www.travel.state.gov) has the latest information on admission requirements.

Frommer's California 2009
by Matthew Poole, Harry Basch, Mark Hiss and Erika Lenkert
Published 2 Jan 2009

E-Passports contain computer chips capable of storing biometric information, such as the required digital photograph of the holder. (You can identify an e-Passport by the symbol on the bottom center cover of your passport.) If your passport doesn’t have this feature, you can still travel without a visa if it is a valid passport issued before October 26, 2005, and includes a machine-readable zone, or if it was issued between October 26, 2005, and October 25, 2006, and includes a digital photograph. For more information, go to www.travel.state.gov/visa. Citizens of all other countries must have (1) a valid passport that expires at least 6 months later than the scheduled end of their visit to the U.S., and (2) a tourist visa, which may be obtained without charge from any U.S. consulate.

Programming Python
by Mark Lutz
Published 5 Jan 2011

"pp 4e" <pp4e@learning-python.com>, "lu,tz" <lutz@learning-python.com>, lutz@rmi.net Finally, if you are running this live, you will also find the mail save file on your machine, containing the one message we asked to be saved in the prior session; it’s simply the raw text of saved emails, with separator lines. This is both human and machine-readable—in principle, another script could load saved mail from this file into a Python list by calling the string object’s split method on the file’s text with the separator line as a delimiter. As shown in this book, it shows up in file C:\temp\savemail.txt, but you can configure this as you like in the mailconfig module