iterative process


description: numerical method in which the n-th approximation of the solution is obtained on the basis of the (n-1) previous approximations

207 results

pages: 893 words: 199,542

Structure and interpretation of computer programs
by Harold Abelson, Gerald Jay Sussman and Julie Sussman
Published 25 Jul 1996

At each step, all we need to keep track of, for any n, are the current values of the variables product, counter, and max-count. We call this an iterative process. In general, an iterative process is one whose state can be summarized by a fixed number of state variables, together with a fixed rule that describes how the state variables should be updated as the process moves from state to state and an (optional) end test that specifies conditions under which the process should terminate. In computing n!, the number of steps required grows linearly with n. Such a process is called a linear iterative process. The contrast between the two processes can be seen in another way.
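As a minimal sketch of the definition above, the factorial computation can be written with exactly the three state variables the excerpt names. Python is used here only for illustration (the book itself uses Scheme), and the function name `factorial` is an assumption.

```python
def factorial(n):
    """Compute n! as a linear iterative process."""
    # The whole state of the process is these three variables.
    product, counter, max_count = 1, 1, n
    # End test: stop once the counter has passed max_count.
    while counter <= max_count:
        # Fixed update rule applied at every step.
        product, counter = counter * product, counter + 1
    return product


print(factorial(6))
```

The loop body runs once per value of the counter, so the number of steps grows linearly with n while the amount of state stays fixed.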

One reason that the distinction between process and procedure may be confusing is that most implementations of common languages (including Ada, Pascal, and C) are designed in such a way that the interpretation of any recursive procedure consumes an amount of memory that grows with the number of procedure calls, even when the process described is, in principle, iterative. As a consequence, these languages can describe iterative processes only by resorting to special-purpose “looping constructs” such as do, repeat, until, for, and while. The implementation of Scheme we shall consider in chapter 5 does not share this defect. It will execute an iterative process in constant space, even if the iterative process is described by a recursive procedure. An implementation with this property is called tail-recursive. With a tail-recursive implementation, iteration can be expressed using the ordinary procedure call mechanism, so that special iteration constructs are useful only as syntactic sugar.31 Exercise 1.9.
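The point about memory can be illustrated in Python, chosen here as an assumption because CPython, like the languages named above, does not perform tail-call optimization. In the sketch below, `fact_iter` describes an iterative process with a recursive procedure, yet each call still adds a stack frame, so a large input is expected to exceed the interpreter's default recursion limit. The helper names are hypothetical.

```python
def fact_iter(product, counter, max_count):
    """Recursive procedure describing an iterative process (tail call)."""
    if counter > max_count:
        return product
    # Nothing remains to be done after this call returns,
    # but CPython still keeps the caller's frame alive.
    return fact_iter(counter * product, counter + 1, max_count)


def fact_loop(n):
    """The same process written with a looping construct."""
    product, counter = 1, 1
    while counter <= n:
        product, counter = counter * product, counter + 1
    return product


def recursion_overflows(n):
    """Return True if the recursive version runs out of stack for n."""
    try:
        fact_iter(1, 1, n)
    except RecursionError:
        return True
    return False


print(fact_iter(1, 1, 10) == fact_loop(10))
print(recursion_overflows(100_000))
```

A tail-recursive implementation of the kind the excerpt describes would run `fact_iter` in constant space, making the explicit loop unnecessary.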

Using these, design a multiplication procedure analogous to fast-expt that uses a logarithmic number of steps. Exercise 1.18. Using the results of exercises 1.16 and 1.17, devise a procedure that generates an iterative process for multiplying two integers in terms of adding, doubling, and halving and uses a logarithmic number of steps.40 Exercise 1.19. There is a clever algorithm for computing the Fibonacci numbers in a logarithmic number of steps. Recall the transformation of the state variables a and b in the fib-iter process of section 1.2.2: a ⟵ a + b and b ⟵ a. Call this transformation T, and observe that applying T over and over again n times, starting with 1 and 0, produces the pair Fib(n + 1) and Fib(n).
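The claim about the transformation T can be checked with a short sketch. This version applies T one step at a time, so it takes a linear number of steps; the logarithmic algorithm the exercise asks for is deliberately not shown. The names `apply_t` and `fib_pair` are assumptions made for illustration.

```python
def apply_t(a, b):
    """One application of T: a <- a + b, b <- a."""
    return a + b, a


def fib_pair(n):
    """Apply T n times starting from (1, 0); returns (Fib(n + 1), Fib(n))."""
    a, b = 1, 0
    for _ in range(n):
        a, b = apply_t(a, b)
    return a, b


print(fib_pair(10))  # (89, 55)
```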

pages: 1,387 words: 202,295

Structure and Interpretation of Computer Programs, Second Edition
by Harold Abelson, Gerald Jay Sussman and Julie Sussman
Published 1 Jan 1984

At each step, all we need to keep track of, for any n, are the current values of the variables product, counter, and max-count. We call this an iterative process. In general, an iterative process is one whose state can be summarized by a fixed number of state variables, together with a fixed rule that describes how the state variables should be updated as the process moves from state to state and an (optional) end test that specifies conditions under which the process should terminate. In computing n!, the number of steps required grows linearly with n. Such a process is called a linear iterative process. The contrast between the two processes can be seen in another way.

One reason that the distinction between process and procedure may be confusing is that most implementations of common languages (including Ada, Pascal, and C) are designed in such a way that the interpretation of any recursive procedure consumes an amount of memory that grows with the number of procedure calls, even when the process described is, in principle, iterative. As a consequence, these languages can describe iterative processes only by resorting to special-purpose “looping constructs” such as do, repeat, until, for, and while. The implementation of Scheme we shall consider in Chapter 5 does not share this defect. It will execute an iterative process in constant space, even if the iterative process is described by a recursive procedure. An implementation with this property is called tail-recursive. With a tail-recursive implementation, iteration can be expressed using the ordinary procedure call mechanism, so that special iteration constructs are useful only as syntactic sugar.31 Exercise 1.9: Each of the following two procedures defines a method for adding two positive integers in terms of the procedures inc, which increments its argument by 1, and dec, which decrements its argument by 1.

Using these, design a multiplication procedure analogous to fast-expt that uses a logarithmic number of steps. Exercise 1.18: Using the results of Exercise 1.16 and Exercise 1.17, devise a procedure that generates an iterative process for multiplying two integers in terms of adding, doubling, and halving and uses a logarithmic number of steps.40 Exercise 1.19: There is a clever algorithm for computing the Fibonacci numbers in a logarithmic number of steps. Recall the transformation of the state variables a and b in the fib-iter process of 1.2.2: a ⟵ a + b and b ⟵ a. Call this transformation T, and observe that applying T over and over again n times, starting with 1 and 0, produces the pair Fib(n + 1) and Fib(n).
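One possible shape for the iterative multiplication described in Exercise 1.18 is sketched below. It is not the book's solution: the names `double`, `halve`, and `fast_mult` are assumptions, and the sketch assumes b is a non-negative integer. The state variables are chosen so that acc + a * b stays constant from step to step.

```python
def double(x):
    """Double an integer using addition only."""
    return x + x


def halve(x):
    """Halve an even integer."""
    return x // 2


def fast_mult(a, b):
    """Multiply a by b (b >= 0) in a logarithmic number of steps."""
    acc = 0
    # Invariant: acc + a * b equals the original product.
    while b > 0:
        if b % 2 == 0:
            a, b = double(a), halve(b)
        else:
            acc, b = acc + a, b - 1
    return acc


print(fast_mult(7, 13))  # 91
```

Every odd step is followed by an even step that halves b, so the loop runs at most about twice the number of binary digits of b.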

pages: 704 words: 182,312

This Is Service Design Doing: Applying Service Design Thinking in the Real World: A Practitioners' Handbook
by Marc Stickdorn, Markus Edgar Hormess, Adam Lawrence and Jakob Schneider
Published 12 Jan 2018

Just like building a house shouldn’t end with an architect’s plan, a service design project shouldn’t end with ideas on paper. 33 PLAN FOR ITERATION; THEN ADAPT Service design is explorative, so you can never plan exactly what you will be doing each day. But you will need to plan your time investment and financial budget – so make plans that are flexible enough to allow you an adaptive and iterative process in the time you have. 34 ZOOM IN AND ZOOM OUT As you iterate, keep switching your focus between small details or momentary exchanges, and the holistic service experience. IT’S ALL SERVICES You can apply service design to anything – services, digital and physical products, internal processes, government offerings, employee or stakeholder experience … It’s not just about making “customers” happy. 35 1 Moritz, S. (2005).

Faced with an operational context that is increasingly VUCA (volatile, uncertain, complex, and ambiguous), the US Army adopted a new approach in 2010 with its field manual FM 5-0. The manual “is a set of guidelines to be adhered to by military commanders when planning and decision-making for the battlefield. FM 5-0 is unique, as it is the first time that design – and specifically ‘design thinking,’ the iterative process of problem-solving that is considered by some typical of design – was introduced into the vocabulary of the military field manual.” 1 As our world is cranking out innovations at an unprecedented rate, more and more industries are being shaken up by disruptive shifts. The business world is increasingly described as VUCA – volatile, uncertain, complex, and ambiguous. 2 With that much pressure, you also need to be able to quickly adapt your problem-solving, innovation, and design skills.

As this book is about doing, this chapter presents an actionable framework for service design research, based on common academic standards. THE BASIC PROCESS OF SERVICE DESIGN RESEARCH Figure 5-1. Research activities are embedded in an iterative sequence with other activities of ideation, prototyping, and implementation. Iterations and research loops Design research is an iterative process – a sequence of research loops within and between activities. Starting point Usually research starts with a brief from an internal or external client. Based on some preparatory research, you define research questions and start research planning. Output There are various potential outputs of design research, from informal inspirations to formal research reports.

pages: 270 words: 75,626

User Stories Applied: For Agile Software Development
by Mike Cohn
Published 1 Mar 2004

Stories are often and legitimately split because they were intentionally written as epics to start with, or because they are too big to fit into an iteration. If you find yourself frequently splitting stories for other reasons, you may be doing it too often. Chapter 15, Using Stories with Scrum 15.1 Describe the differences between an incremental and an iterative process. Answer: An iterative process is one that makes progress through successive refinement. An incremental process is one in which software is built and delivered in pieces. 15.2 What is the relationship between the product backlog and the sprint backlog? Answer: Items are moved from the product backlog to the sprint backlog at the start of a sprint. 15.3 What is meant by a potentially shippable product increment?

Each user story represents a discrete piece of functionality; that is, something a user would be likely to do in a single setting. This makes user stories appropriate as a planning tool. You can assess the value of shifting stories between releases far better than you can assess the impact of leaving out one or more “The system shall…” statements. An iterative process is one that makes progress through successive refinement. A development team takes a first cut at a system, knowing it is incomplete or weak in some (perhaps many) areas. They then successively refine those areas until the product is satisfactory. With each iteration the software is improved through the addition of greater detail.

In this chapter we’ll look at Scrum, another agile process, and will see how stories can be integrated as an important part of Scrum.[1] Terms that are part of the Scrum lexicon will be italicized when first used. Scrum Is Iterative and Incremental Like XP, Scrum is both an iterative and an incremental process. Since these words are used so frequently without definition, we’ll define them. An iterative process is one that makes progress through successive refinement. A development team takes a first cut at a system, knowing it is incomplete or weak in some (perhaps many) areas. They then iteratively refine those areas until the product is satisfactory. With each iteration the software is improved through the addition of greater detail.

pages: 1,237 words: 227,370

Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems
by Martin Kleppmann
Published 16 Mar 2017

batch processing, Relational Model Versus Document Model, Batch Processing-Summary, Glossarycombining with stream processinglambda architecture, The lambda architecture unifying technologies, Unifying batch and stream processing comparison to MPP databases, Comparing Hadoop to Distributed Databases-Designing for frequent faults comparison to stream processing, Processing Streams comparison to Unix, Philosophy of batch process outputs-Philosophy of batch process outputs dataflow engines, Dataflow engines-Discussion of materialization fault tolerance, Bringing related data together in the same place, Philosophy of batch process outputs, Fault tolerance, Messaging Systems for data integration, Batch and Stream Processing-Unifying batch and stream processing graphs and iterative processing, Graphs and Iterative Processing-Parallel execution high-level APIs and languages, MapReduce workflows, High-Level APIs and Languages-Specialization for different domains log-based messaging and, Replaying old messages maintaining derived state, Maintaining derived state MapReduce and distributed filesystems, MapReduce and Distributed Filesystems-Key-value stores as batch process output(see also MapReduce) measuring performance, Describing Performance, Batch Processing outputs, The Output of Batch Workflows-Key-value stores as batch process outputkey-value stores, Key-value stores as batch process output search indexes, Building search indexes using Unix tools (example), Batch Processing with Unix Tools-Sorting versus in-memory aggregation Bayou (database), Uniqueness in log-based messaging Beam (dataflow library), Unifying batch and stream processing bias, Bias and discrimination big ball of mud, Simplicity: Managing Complexity Bigtable data model, Data locality for queries, Column Compression binary data encodings, Binary encoding-The Merits of SchemasAvro, Avro-Code generation and dynamically typed languages MessagePack, Binary encoding-Binary encoding Thrift and Protocol Buffers, 
Thrift and Protocol Buffers-Datatypes and schema evolution binary encodingbased on schemas, The Merits of Schemas by network drivers, The Merits of Schemas binary strings, lack of support in JSON and XML, JSON, XML, and Binary Variants BinaryProtocol encoding (Thrift), Thrift and Protocol Buffers Bitcask (storage engine), Hash Indexescrash recovery, Hash Indexes Bitcoin (cryptocurrency), Tools for auditable data systemsByzantine fault tolerance, Byzantine Faults concurrency bugs in exchanges, Weak Isolation Levels bitmap indexes, Column Compression blockchains, Tools for auditable data systemsByzantine fault tolerance, Byzantine Faults blocking atomic commit, Three-phase commit Bloom (programming language), Designing Applications Around Dataflow Bloom filter (algorithm), Performance optimizations, Stream analytics BookKeeper (replicated log), Allocating work to nodes Bottled Water (change data capture), Implementing change data capture bounded datasets, Summary, Stream Processing, Glossary(see also batch processing) bounded delays, Glossaryin networks, Synchronous Versus Asynchronous Networks process pauses, Response time guarantees broadcast hash joins, Broadcast hash joins brokerless messaging, Direct messaging from producers to consumers Brubeck (metrics aggregator), Direct messaging from producers to consumers BTM (transaction coordinator), Introduction to two-phase commit bulk synchronous parallel (BSP) model, The Pregel processing model bursty network traffic patterns, Can we not simply make network delays predictable?

InfiniteGraph (database), Graph-Like Data Models InnoDB (storage engine)clustered index on primary key, Storing values within the index not preventing lost updates, Automatically detecting lost updates preventing write skew, Characterizing write skew, Implementation of two-phase locking serializable isolation, Implementation of two-phase locking snapshot isolation support, Snapshot Isolation and Repeatable Read inside-out databases, Designing Applications Around Dataflow(see also unbundling databases) integrating different data systems (see data integration) integrity, Timeliness and Integritycoordination-avoiding data systems, Coordination-avoiding data systems correctness of dataflow systems, Correctness of dataflow systems in consensus formalization, Fault-Tolerant Consensus integrity checks, Don’t just blindly trust what they promise(see also auditing) end-to-end, The end-to-end argument, The end-to-end argument again use of snapshot isolation, Snapshot Isolation and Repeatable Read maintaining despite software bugs, Maintaining integrity in the face of software bugs Interface Definition Language (IDL), Thrift and Protocol Buffers, Avro intermediate state, materialization of, Materialization of Intermediate State-Discussion of materialization internet services, systems for implementing, Cloud Computing and Supercomputing invariants, Consistency(see also constraints) inversion of control, Separation of logic and wiring IP (Internet Protocol)unreliability of, Cloud Computing and Supercomputing ISDN (Integrated Services Digital Network), Synchronous Versus Asynchronous Networks isolation (in transactions), Isolation, Single-Object and Multi-Object Operations, Glossarycorrectness and, Aiming for Correctness for single-object writes, Single-object writes serializability, Serializability-Performance of serializable snapshot isolationactual serial execution, Actual Serial Execution-Summary of serial execution serializable snapshot isolation (SSI), Serializable 
Snapshot Isolation (SSI)-Performance of serializable snapshot isolation two-phase locking (2PL), Two-Phase Locking (2PL)-Index-range locks violating, Single-Object and Multi-Object Operations weak isolation levels, Weak Isolation Levels-Materializing conflictspreventing lost updates, Preventing Lost Updates-Conflict resolution and replication read committed, Read Committed-Implementing read committed snapshot isolation, Snapshot Isolation and Repeatable Read-Repeatable read and naming confusion iterative processing, Graphs and Iterative Processing-Parallel execution J Java Database Connectivity (JDBC)distributed transaction support, XA transactions network drivers, The Merits of Schemas Java Enterprise Edition (EE), The problems with remote procedure calls (RPCs), Introduction to two-phase commit, XA transactions Java Message Service (JMS), Message brokers compared to databases(see also messaging systems) comparison to log-based messaging, Logs compared to traditional messaging, Replaying old messages distributed transaction support, XA transactions message ordering, Acknowledgments and redelivery Java Transaction API (JTA), Introduction to two-phase commit, XA transactions Java Virtual Machine (JVM)bytecode generation, The move toward declarative query languages garbage collection pauses, Process Pauses process reuse in batch processors, Dataflow engines JavaScriptin MapReduce querying, MapReduce Querying setting element styles (example), Declarative Queries on the Web use in advanced queries, MapReduce Querying Jena (RDF framework), The RDF data model Jepsen (fault tolerance testing), Aiming for Correctness jitter (network delay), Network congestion and queueing joins, Glossaryby index lookup, Reduce-Side Joins and Grouping expressing as relational operators, The move toward declarative query languages in relational and document databases, Many-to-One and Many-to-Many Relationships MapReduce map-side joins, Map-Side Joins-MapReduce workflows with map-side 
joinsbroadcast hash joins, Broadcast hash joins merge joins, Map-side merge joins partitioned hash joins, Partitioned hash joins MapReduce reduce-side joins, Reduce-Side Joins and Grouping-Handling skewhandling skew, Handling skew sort-merge joins, Sort-merge joins parallel execution of, Comparing Hadoop to Distributed Databases secondary indexes and, Other Indexing Structures stream joins, Stream Joins-Time-dependence of joinsstream-stream join, Stream-stream join (window join) stream-table join, Stream-table join (stream enrichment) table-table join, Table-table join (materialized view maintenance) time-dependence of, Time-dependence of joins support in document databases, Convergence of document and relational databases JOTM (transaction coordinator), Introduction to two-phase commit JSONAvro schema representation, Avro binary variants, Binary encoding for application data, issues with, JSON, XML, and Binary Variants in relational databases, The Object-Relational Mismatch, Convergence of document and relational databases representing a résumé (example), The Object-Relational Mismatch Juttle (query language), Designing Applications Around Dataflow K k-nearest neighbors, Specialization for different domains Kafka (messaging), Message brokers, Using logs for message storageKafka Connect (database integration), API support for change streams, Deriving several views from the same event log Kafka Streams (stream processor), Stream analytics, Maintaining materialized viewsfault tolerance, Rebuilding state after a failure leader-based replication, Leaders and Followers log compaction, Log compaction, Maintaining materialized views message offsets, Using logs for message storage, Idempotence request routing, Request Routing transaction support, Atomic commit revisited usage example, Thinking About Data Systems Ketama (partitioning library), Partitioning proportionally to nodes key-value stores, Data Structures That Power Your Databaseas batch process output, Key-value 
stores as batch process output hash indexes, Hash Indexes-Hash Indexes in-memory, Keeping everything in memory partitioning, Partitioning of Key-Value Data-Skewed Workloads and Relieving Hot Spotsby hash of key, Partitioning by Hash of Key, Summary by key range, Partitioning by Key Range, Summary dynamic partitioning, Dynamic partitioning skew and hot spots, Skewed Workloads and Relieving Hot Spots Kryo (Java), Language-Specific Formats Kubernetes (cluster manager), Designing for frequent faults, Separation of application code and state L lambda architecture, The lambda architecture Lamport timestamps, Lamport timestamps Large Hadron Collider (LHC), Summary last write wins (LWW), Converging toward a consistent state, Implementing Linearizable Systemsdiscarding concurrent writes, Last write wins (discarding concurrent writes) problems with, Timestamps for ordering events prone to lost updates, Conflict resolution and replication late binding, Separation of logic and wiring latencyinstability under two-phase locking, Performance of two-phase locking network latency and resource utilization, Can we not simply make network delays predictable?

, Different values written at different times, Leaders and Followers, Request Routing Helix (cluster manager) (see Helix) profile (example), The Object-Relational Mismatch reference to company entity (example), Many-to-One and Many-to-Many Relationships Rest.li (RPC framework), Current directions for RPC Voldemort (database) (see Voldemort) Linux, leap second bug, Software Errors, Clock Synchronization and Accuracy liveness properties, Safety and liveness LMDB (storage engine), B-tree optimizations, Indexes and snapshot isolation loadapproaches to coping with, Approaches for Coping with Load describing, Describing Load load testing, Describing Performance load balancing (messaging), Multiple consumers local indexes (see document-partitioned indexes) locality (data access), The Object-Relational Mismatch, Data locality for queries, Glossaryin batch processing, Distributed execution of MapReduce, Example: analysis of user activity events, Dataflow engines in stateful clients, Clients with offline operation, Stateful, offline-capable clients in stream processing, Stream-table join (stream enrichment), Rebuilding state after a failure, Stream processors and services, Uniqueness in log-based messaging location transparency, The problems with remote procedure calls (RPCs)in the actor model, Distributed actor frameworks locks, Glossarydeadlock, Implementation of two-phase locking distributed locking, The leader and the lock-Fencing tokens, Locking and leader electionfencing tokens, Fencing tokens implementation with ZooKeeper, Membership and Coordination Services relation to consensus, Summary for transaction isolationin snapshot isolation, Implementing snapshot isolation in two-phase locking (2PL), Two-Phase Locking (2PL)-Index-range locks making operations atomic, Atomic write operations performance, Performance of two-phase locking preventing dirty writes, Implementing read committed preventing phantoms with index-range locks, Index-range locks, Detecting writes that 
affect prior reads read locks (shared mode), Implementing read committed, Implementation of two-phase locking shared mode and exclusive mode, Implementation of two-phase locking in two-phase commit (2PC)deadlock detection, Limitations of distributed transactions in-doubt transactions holding locks, Holding locks while in doubt materializing conflicts with, Materializing conflicts preventing lost updates by explicit locking, Explicit locking log sequence number, Setting Up New Followers, Consumer offsets logic programming languages, Designing Applications Around Dataflow logical clocks, Timestamps for ordering events, Sequence Number Ordering, Ordering events to capture causalityfor read-after-write consistency, Reading Your Own Writes logical logs, Logical (row-based) log replication logs (data structure), Data Structures That Power Your Database, Glossaryadvantages of immutability, Advantages of immutable events compaction, Hash Indexes, Performance optimizations, Log compaction, State, Streams, and Immutabilityfor stream operator state, Rebuilding state after a failure creating using total order broadcast, Using total order broadcast implementing uniqueness constraints, Uniqueness in log-based messaging log-based messaging, Partitioned Logs-Replaying old messagescomparison to traditional messaging, Logs compared to traditional messaging, Replaying old messages consumer offsets, Consumer offsets disk space usage, Disk space usage replaying old messages, Replaying old messages, Reprocessing data for application evolution, Unifying batch and stream processing slow consumers, When consumers cannot keep up with producers using logs for message storage, Using logs for message storage log-structured storage, Data Structures That Power Your Database-Performance optimizationslog-structured merge tree (see LSM-trees) replication, Leaders and Followers, Implementation of Replication Logs-Trigger-based replicationchange data capture, Change Data Capture-API support for 
change streams(see also changelogs) coordination with snapshot, Setting Up New Followers logical (row-based) replication, Logical (row-based) log replication statement-based replication, Statement-based replication trigger-based replication, Trigger-based replication write-ahead log (WAL) shipping, Write-ahead log (WAL) shipping scalability limits, The limits of total ordering loose coupling, Separation of logic and wiring, Materialization of Intermediate State, Making unbundling work lost updates (see updates) LSM-trees (indexes), Making an LSM-tree out of SSTables-Performance optimizationscomparison to B-trees, Comparing B-Trees and LSM-Trees-Downsides of LSM-trees Lucene (storage engine), Making an LSM-tree out of SSTablesbuilding indexes in batch processes, Building search indexes similarity search, Full-text search and fuzzy indexes Luigi (workflow scheduler), MapReduce workflows LWW (see last write wins) M machine learningethical considerations, Bias and discrimination(see also ethics) iterative processing, Graphs and Iterative Processing models derived from training data, Application code as a derivation function statistical and numerical algorithms, Specialization for different domains MADlib (machine learning toolkit), Specialization for different domains magic scaling sauce, Approaches for Coping with Load Mahout (machine learning toolkit), Specialization for different domains maintainability, Maintainability-Evolvability: Making Change Easy, The Future of Data Systemsdefined, Summary design principles for software systems, Maintainability evolvability (see evolvability) operability, Operability: Making Life Easy for Operations simplicity and managing complexity, Simplicity: Managing Complexity many-to-many relationshipsin document model versus relational model, Which data model leads to simpler application code?

pages: 361 words: 107,461

How I Built This: The Unexpected Paths to Success From the World's Most Inspiring Entrepreneurs
by Guy Raz
Published 14 Sep 2020

It’s called iteration—the incremental evolution of a product or service. It is a phenomenon that is natural to innovation and foundational to the development of products as they come to market and vie for the attention of discerning (and often distracted) consumers. Typically, there are two phases to the iterative process prior to launch. The first involves tinkering with your idea until it works and you, as its creator, are satisfied with what you have. The second entails exposing the working idea to the public and tweaking the product based on their feedback until it catches on—either with a buyer, a major investor, a retail partner, or a critical mass of your customers.

The exact amount of time you spend in the first phase of development isn’t as important as making sure you don’t get stuck there for too long. Every idea, no matter how great, has a shelf life. If you don’t get it off that shelf and out into the world in time, no amount of feedback you get during the second phase of the iterative process can overcome a lack of interest or mitigate first-mover advantage if someone beats you to the punch. Moving to phase two can be tough for people who don’t handle criticism well, or who are dogged by that familiar yet unattainable form of perfectionism that has trapped the next great American novel on the desks or hard drives of countless aspiring writers since forever.

They actively seek it out, in fact. Because while they know what they want to do, and they know why and how they want to do it, they also know that they have no idea if anyone will actually like what they’re making. And that’s always essential to keep in mind. Nowhere is this aspect of the iterative process more evident than in the energy bar business. Somehow, over the years, I’ve managed to interview the creators of three of them—Gary Erickson of Clif Bar, Peter Rahal of RXBar, and Lara Merriken of Lärabar. They are unique characters with similar entrepreneurial journeys, and I think what attracted me to their stories, as someone who loves to cook for other people myself, is just how fraught it can be to come up with a new recipe and then try to get it exactly right so people won’t just eat it, but they’ll love it, too.

pages: 353 words: 97,029

How Big Things Get Done: The Surprising Factors Behind Every Successful Project, From Home Renovations to Space Exploration
by Bent Flyvbjerg and Dan Gardner
Published 16 Feb 2023

All the elements are pulled together, and the actual movie that will fill theaters and be seen on televisions the world over is finally created. “By the time you see the film,” Docter said, “it’s about the ninth version of the movie that we’ve put up.” WHY ITERATION WORKS This process involves “an insane amount of work,” Docter acknowledged. But a highly iterative process such as Pixar’s is worth the extraordinary work it entails, for four reasons. First, iteration frees people to experiment, as Edison did with such success. “I need the freedom to just try a bunch of crap out. And a lot of times it doesn’t work,” Docter told me. With this process, that’s fine.

Theranos, a company founded by its charismatic nineteen-year-old CEO, Elizabeth Holmes—with former secretaries of state George Shultz and Henry Kissinger as board members—raised $1.3 billion from investors after it claimed to have developed a spectacular new blood-testing technology.22 It was a mirage, and Theranos collapsed amid a hailstorm of fraud charges and lawsuits.23 Third, an iterative process such as Pixar’s corrects for a basic cognitive bias that psychologists call the “illusion of explanatory depth.” Do you know how a bicycle works? Most people are sure they do, yet they are unable to complete a simple line drawing that shows how a bicycle works. Even when much of the bicycle is already drawn for them, they can’t do it.

By requiring Pixar film directors to walk through every step from the big to the small and show exactly what they will do, Pixar’s process forces them to explain. Illusions evaporate long before production begins, which is when they would become dangerous and expensive.24 That brings us to the fourth reason why iterative processes work, which I touched on in chapter 1: Planning is cheap. Not in absolute terms, perhaps. The rough videos Pixar produces require a director leading a small team of writers and artists. Keeping them all working for years is a significant cost. But compared to the cost of producing digital animation ready for theaters, which requires hundreds of highly skilled people using the most advanced technology in the world, movie stars doing voices, and leading composers creating the score, it is so minor that even making experimental videos over and over again is relatively inexpensive.

pages: 721 words: 197,134

Data Mining: Concepts, Models, Methods, and Algorithms
by Mehmed Kantardzić
Published 2 Jan 2003

The ability to extract useful knowledge hidden in these data and to act on that knowledge is becoming increasingly important in today’s competitive world. The entire process of applying a computer-based methodology, including new techniques, for discovering knowledge from data is called data mining. Data mining is an iterative process within which progress is defined by discovery, through either automatic or manual methods. Data mining is most useful in an exploratory analysis scenario in which there are no predetermined notions about what will constitute an “interesting” outcome. Data mining is the search for new, valuable, and nontrivial information in large volumes of data.

In the second step, when the structure of the model is known, all we need to do is apply optimization techniques to determine parameter vector t such that the resulting model y* = f(u,t*) can describe the system appropriately. In general, system identification is not a one-pass process: Both structure and parameter identification need to be done repeatedly until a satisfactory model is found. This iterative process is represented graphically in Figure 1.1. Typical steps in every iteration are as follows: 1. Specify and parameterize a class of formalized (mathematical) models, y* = f(u,t*), representing the system to be identified. 2. Perform parameter identification to choose the parameters that best fit the available data set (the difference y − y* is minimal). 3.
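The identification loop described above can be sketched in a few lines. This is only an illustration under stated assumptions: the candidate structures, the one-parameter form y* = t * phi(u), the tolerance, and the function name `identify` are all hypothetical and do not come from the book.

```python
def identify(us, ys, candidates, tol):
    """Try candidate model structures in turn until one fits well enough.

    Each candidate is (name, phi) and stands for the model y* = t * phi(u).
    Returns (name, t, sse) for the first candidate whose sum of squared
    errors is <= tol, or for the best candidate seen if none qualifies.
    """
    best = None
    for name, phi in candidates:
        # Parameter identification: closed-form least squares for a single t.
        num = sum(y * phi(u) for u, y in zip(us, ys))
        den = sum(phi(u) ** 2 for u in us)
        t = num / den
        # Measure how well this structure describes the data.
        sse = sum((y - t * phi(u)) ** 2 for u, y in zip(us, ys))
        if best is None or sse < best[2]:
            best = (name, t, sse)
        if sse <= tol:
            break  # satisfactory model found, stop iterating
    return best


# Hypothetical data generated from y = 3 * u**2.
candidates = [("linear", lambda u: u), ("quadratic", lambda u: u * u)]
us = [1.0, 2.0, 3.0, 4.0]
ys = [3.0 * u * u for u in us]
name, t, sse = identify(us, ys, candidates, tol=1e-9)
```

Here the linear structure is tried first and rejected because its residual is too large. The quadratic structure is then accepted, which mirrors the repeat-until-satisfactory loop in the text.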

One reason is that data mining is not simply a collection of isolated tools, each completely different from the other and waiting to be matched to the problem. A second reason lies in the notion of matching a problem to a technique. Only very rarely is a research question stated sufficiently precisely that a single and simple application of the method will suffice. In fact, what happens in practice is that data mining becomes an iterative process. One studies the data, examines it using some analytic technique, decides to look at it another way, perhaps modifying it, and then goes back to the beginning and applies another data-analysis tool, reaching either better or different results. This can go around many times; each technique is used to probe slightly different aspects of data—to ask a slightly different question of the data.

pages: 223 words: 60,936

Remote Work Revolution: Succeeding From Anywhere
by Tsedal Neeley
Published 14 Oct 2021

Last and most important of the vectors was people. People consume the many products that Unilever provides—whether it is experiencing a soap’s scent, tasting ice cream, or drinking tea—and people were behind the work of taking each of these goods from an idea to reality. Agile methodology—with its emphasis on digital technology, iterative processes, and close collaboration—informed all three vectors. Unilever’s transformation to digital technology was what propelled its adoption of agile teams. Most astonishingly, the synergy between both these innovations enabled a sprawling, legacy multinational to bridge the global-local divide. AppFolio: Born Agile Unlike Unilever, AppFolio was born digital.

Hawkins characterized the collaborative atmosphere at the AppFolio office as “interrupt-driven,” where his role as a leader was to be present but not overbearing—available whenever his team members needed guidance, but otherwise hands-off. Hawkins valued the opportunity to see his teammates at the office. He espoused an open-door policy where anyone could drop by his desk to pose a question or ask for help. He viewed his job as an iterative process of responding to his team’s needs as they came up in real time. AppFolio Goes Remote Like so many companies across the world, AppFolio’s entire existence abruptly and dramatically changed in 2020. When COVID-19 put the United States on lockdown overnight, AppFolio’s sudden shift to remote work posed a direct challenge to the company’s agile in-person practices of teamwork.

Once they felt familiar with one another, a sense of mutual trust would make it easier to disagree without fear of causing offense. She also found it useful to hire an outside consultant to work with two people who came from a military background and were used to a command-and-control style of decision-making instead of a collaborative, iterative process. Whereas the previous team’s relative sameness had created dynamics of cognitive agreement and efficiency, the new team was eventually able to create a healthy team dynamic of open debate, discussion, and friction—the qualities that ultimately led to innovative solutions. Molinas likened her team to the United Nations because of its broad array of perspectives, and laughed when she reported, “I believe I am seeing a much healthier business and healthier team dynamic.

pages: 204 words: 66,619

Think Like an Engineer: Use Systematic Thinking to Solve Everyday Challenges & Unlock the Inherent Values in Them
by Mushtak Al-Atabi
Published 26 Aug 2014

Steve Jobs Design can mean different things in different contexts, from art to engineering to product design. In general, design refers to the clever arrangement of different components to work harmoniously to deliver a value or perform a task. The Design stage of the CDIO process refers to the cumulative and iterative process of bringing imagination a step closer to manifestation through the use of science, mathematics and common sense to convert resources and achieve prescribed objectives. It is important to stress here that the Conceiving and Designing processes are not separated but integrated and working on the Designing stage may require going back to tweak or even totally change the ideas developed at the Conceiving stage. 5.1 Function and Form Design refers to creating the marriage between the form (the look) and function of products and objects.

Likewise, a slot affords inserting, and a button affords pushing. Good designs ensure that affordance is made visible and mapped clearly towards the desired functions so that the user can easily figure out how the product can be used. More on this is discussed later in the book. 5.2 The Design Process The Design process is a very evolutionary and iterative process by nature. At different stages of the design process we may need to go back to experiments and even to repeat the Conceiving process. The Design process is used to design super-systems as well as subsystems and sub-subsystems. For example, a design team may be designing a car (super-system) while other teams are designing subsystems such as the engine or the exhaust muffler for the car.

Christopher Chew Chapter 8 Ergonomics: Human-Centred Design “Human-centred design (is) meeting people where they are and really taking their needs and feedback into account. When you let people participate in the design process, you find that they often have ingenious ideas about what would really help them. And it’s not a onetime thing; it’s an iterative process.” Melinda Gates, Gates Foundation “Others approach a challenge from the point of view that says, 'We have the smartest people in the world; therefore, we can think this through.' We approach it from the point of view that the answer is out there, hidden in plain sight, so let's go observe human behaviour and see where the opportunities are.”

pages: 247 words: 69,593

The Creative Curve: How to Develop the Right Idea, at the Right Time
by Allen Gannett
Published 11 Jun 2018

Across the various fields I studied, creative people all had their own methods of refining ideas in order to end up with a shortlist of those with the highest probability of success. While I don’t have a cutesy acronym for this process, in every field creators used the four steps I outlined at Ben & Jerry’s: Conceptualization, Reduction, Curation, and Feedback. This iteration process allows anyone to refine their work to find the ideal spot on the creative curve. What does this look like in other fields? Is making ice cream truly the same as, say, making movies? The Data of Film One of the most surprising things I learned over the course of my research into creative success was how similar creative processes are across different fields.

In short, Jacobson was absorbing popular taste in the same way our former video store clerk turned Netflix head, Ted Sarandos, learned about movies back in Arizona. Recognized for her hard work and keen insights, her career started to take off, and by the time she was thirty-six she had taken the helm of Walt Disney Motion Pictures Group. In 2007, she founded Color Force. Intrigued by how films, and the film industry, use iterative processes and data, I called her to find out how studios try to craft the perfect blockbuster. Screenwriting comes first. Jacobson explained that the process of screenwriting is far different from a writer sealing themselves up in a remote escape in the woods, emerging after they have typed the words “The End.”

The dial test is now often replaced by an online equivalent, the goal being to reach a larger, more representative audience. What filmmakers and studio stakeholders are really after is maximizing the odds that undecided potential audience members—the equivalent of election swing voters—will buy tickets to the movie. Penn explained, “Trailer testing is an iterative process. You’re going to go into the lab, if you will, which is talking to consumers either in person or online and you’re going to try out different creative explorations of what you think is the most marketable premise to sell the movie.” Along the way, testing uncovers key elements to which the audience responds.

pages: 132 words: 31,976

Getting Real
by Jason Fried , David Heinemeier Hansson , Matthew Linderman and 37 Signals
Published 1 Jan 2006

—Matt Hamer, developer and product manager, Kinja Table of contents | Essay list for this chapter | Next essay Rinse and Repeat Work in iterations Don't expect to get it right the first time. Let the app grow and speak to you. Let it morph and evolve. With web-based software there's no need to ship perfection. Design screens, use them, analyze them, and then start over again. Instead of banking on getting everything right upfront, the iterative process lets you continue to make informed decisions as you go along. Plus, you'll get an active app up and running quicker since you're not striving for perfection right out the gate. The result is real feedback and real guidance on what requires your attention. Iterations lead to liberation You don't need to aim for perfection on the first try if you know it's just going to be done again later anyway.

No One's Going to Read It I can't even count how many multi-page product specifications or business requirement documents that have languished, unread, gathering dust nearby my dev team while we coded away, discussing problems, asking questions and user testing as we went. I've even worked with developers who've spent hours writing long, descriptive emails or coding standards documents that also went unread. Webapps don't move forward with copious documentation. Software development is a constantly shifting, iterative process that involves interaction, snap decisions, and impossible-to-predict issues that crop up along the way. None of this can or should be captured on paper. Don't waste your time typing up that long visionary tome; no one's going to read it. Take consolation in the fact that if you give your product enough room to grow itself, in the end it won't resemble anything you wrote about anyway.

pages: 516 words: 157,437

Principles: Life and Work
by Ray Dalio
Published 18 Sep 2017

Design in such a way that you produce good results even when people make mistakes. 13.4 Recognize that design is an iterative process. Between a bad “now” and a good “then” is a “working through it” period. That “working through it” period is when you try out different processes and people, seeing what goes well or poorly, learning from the iterations, and moving toward the ideal systematic design. Even with a good future design picture in mind, it will naturally take some mistakes and learning to get to a good “then” state. People frequently complain about this kind of iterative process because it tends to be true that people are happier with nothing at all than with something imperfect, even though it would be more logical to have the imperfect thing.

Visualize alternative machines and their outcomes, and then choose. c. Consider second- and third-order consequences, not just first-order ones. d. Use standing meetings to help your organization run like a Swiss clock. e. Remember that a good machine takes into account the fact that people are imperfect. 13.4 Recognize that design is an iterative process. Between a bad “now” and a good “then” is a “working through it” period. a. Understand the power of the “cleansing storm.” 13.5 Build the organization around goals rather than tasks. a. Build your organization from the top down. b. Remember that everyone must be overseen by a believable person who has high standards.

Imagine how silly and unproductive it would be to respond to your ski instructor as if he were blaming you when he told you that you fell because you didn’t shift your weight properly. It’s no different if a supervisor points out a flaw in your work process. Fix it and move on. a. Get over “blame” and “credit” and get on with “accurate” and “inaccurate.” Worrying about “blame” and “credit” or “positive” and “negative” feedback impedes the iterative process that is essential to learning. Remember that what has already happened lies in the past and no longer matters except as a lesson for the future. The need for phony praise needs to be unlearned. 3.3 Observe the patterns of mistakes to see if they are products of weaknesses. Everyone has weaknesses and they are generally revealed in the patterns of mistakes they make.

pages: 410 words: 114,005

Black Box Thinking: Why Most People Never Learn From Their Mistakes--But Some Do
by Matthew Syed
Published 3 Nov 2015

.* Vowles said: The secret to modern F1 is not really to do with big ticket items; it is about hundreds of thousands of small items, optimized to the nth degree. People think that things like engines are based upon high-level strategic decisions, but they are not. What is an engine except many iterations of small components? You start with a sensible design, but it is the iterative process that guides you to the best solution. Success is about creating the most effective optimization loop. I also spoke to Andy Cowell, the leader of the team that devised the engine. His attitude was a carbon copy of that of Vowles. We got our development engine up and running in late December [2012].

“A cyclone has a number of variables: size of entry, exit, angle, diameter, length: and the trying thing is that if you change one dimension, it affects all the others.” His discipline was astonishing. “I couldn’t afford a computer, so I would hand-write the results into a book,” he recalls. “In the first year alone, I conducted literally hundreds of experiments. It was a very, very thick book.” But as the intensive, iterative process gradually solved the problem of separating ultra-fine dust, Dyson came up against another problem: long pieces of hair and fluff. These were not being separated from the airflow by the cyclone dynamics. “They were just coming out of the top along with the air,” he says. “It was another huge problem and it didn’t seem as if a conventional cyclone could solve it.”

But now consider what happens next. The story line is pulled apart. As the animation gets into operation, each frame, each strand of the story, each scene is subject to debate, dissent, and testing. All told, it takes around twelve thousand storyboard drawings to make one ninety-minute feature, and because of the iterative process, story teams often create more than 125,000 storyboards by the time the film is actually delivered. Monsters, Inc. is a perfect illustration of a creative idea adapted in the light of criticism. It started off with a plot centered on a middle-aged accountant who hates his job and who is given a sketchbook by his mother.

pages: 2,054 words: 359,149

The Art of Software Security Assessment: Identifying and Preventing Software Vulnerabilities
by Justin Schuh
Published 20 Nov 2006

However, the most important thing learned from that experience is that it’s best to use several techniques and switch between them periodically for the following reasons: You can concentrate intensely for only a limited time. Different vulnerabilities are easier to find from different perspectives. Variety helps you maintain discipline and motivation. Different people think in different ways. Iterative Process The method for performing the review is a simple, iterative process. It’s intended to be used two or three times over the course of a work day. Generally, this method works well because you can switch to a less taxing auditing activity when you start to feel as though you’re losing focus. Of course, your work day, constitution, and preferred schedule might prompt you to adapt the process further, but this method should be a reasonable starting point.

OPERATIONAL REVIEW Introduction Exposure Attack Surface Insecure Defaults Access Control Unnecessary Services Secure Channels Spoofing and Identification Network Profiles Web-Specific Considerations HTTP Request Methods Directory Indexing File Handlers Authentication Default Site Installations Overly Verbose Error Messages Public-Facing Administrative Interfaces Protective Measures Development Measures Host-Based Measures Network-Based Measures Summary 4. APPLICATION REVIEW PROCESS Introduction Overview of the Application Review Process Rationale Process Outline Preassessment Scoping Application Access Information Collection Application Review Avoid Drowning Iterative Process Initial Preparation Plan Work Reflect Documentation and Analysis Reporting and Remediation Support Code Navigation External Flow Sensitivity Tracing Direction Code-Auditing Strategies Code Comprehension Strategies Candidate Point Strategies Design Generalization Strategies Code-Auditing Techniques Internal Flow Analysis Subsystem and Dependency Analysis Rereading Code Desk-Checking Test Cases Code Auditor’s Toolbox Source Code Navigators Debuggers Binary Navigation Tools Fuzz-Testing Tools Case Study: OpenSSH Preassessment Implementation Analysis High-Level Attack Vectors Documentation of Findings Summary II.

Because you’re working entirely from the implementation first, you could end up reviewing a lot of code that isn’t security relevant. However, you won’t know that until you develop a higher-level understanding of the application. As part of a bottom-up review, maintaining a design model of the system throughout the assessment is valuable. If you update it after each pass through the iterative process, you can quickly piece together the higher-level organization. This design model doesn’t have to be formal. On the contrary, it’s best to use a format that’s easy to update and can capture potentially incomplete or incorrect information. You can opt for DFD sketches and class diagrams, or you can use a simple text file for recording your design notes.

pages: 416 words: 39,022

Asset and Risk Management: Risk Oriented Finance
by Louis Esch , Robert Kieffer and Thierry Lopez
Published 28 Nov 2005

Let us now assume that we wish to determine a solution $a$ with a degree of precision $\varepsilon$. We could stop the iterative process on the basis of the error estimation formula. These formulae, however, require a certain level of information on the derivative $f'(x)$, information that is not easy to obtain. On the other hand, the limit specification $\varepsilon_a$ will not generally be known beforehand.3 Consequently, we are running the risk of $\varepsilon$, the accuracy level sought, never being reached, as it is better than the limit precision $\varepsilon_a$ ($\varepsilon < \varepsilon_a$). In this case, the iterative process will carry on indefinitely. This leads us to accept the following stop criterion: $|x_n - x_{n-1}| < \varepsilon$ and $|x_{n+1} - x_n| \ge |x_n - x_{n-1}|$. This means that the iteration process will be stopped when the iteration $n$ produces a variation in value less than that of the iteration $n + 1$.

This leads us to accept the following stop criterion: $|x_n - x_{n-1}| < \varepsilon$ and $|x_{n+1} - x_n| \ge |x_n - x_{n-1}|$. This means that the iteration process will be stopped when the iteration $n$ produces a variation in value less than that of the iteration $n + 1$. The value of $\varepsilon$ will be chosen in a way that prevents the iteration from stopping too soon. 8.2 PRINCIPAL METHODS Defining an iterative method is based ultimately on defining the function $h(x)$ of the equation $x = g(x) \equiv x - h(x) f(x)$. The choice of this function will determine the order of the method. 8.2.1 First order methods The simplest choice consists of taking $h(x) = m = \text{constant} \neq 0$. 8.2.1.1 Chord method This defines the chord method (Figure A8.1), for which the iteration is $x_{n+1} = x_n - m f(x_n)$.
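The chord iteration and the two-part stop criterion can be combined in a short sketch. The function name `chord_method`, the test function, the value of m, and the iteration budget are assumptions chosen for illustration only; the book does not prescribe them.

```python
def chord_method(f, x0, m, eps, max_iter=10_000):
    """Iterate x_{n+1} = x_n - m * f(x_n) with the stop criterion above.

    Stops at iteration n when |x_n - x_{n-1}| < eps and the next step
    |x_{n+1} - x_n| is no smaller than the current one.
    """
    x_prev = x0
    x_curr = x_prev - m * f(x_prev)
    for _ in range(max_iter):
        x_next = x_curr - m * f(x_curr)
        step_now = abs(x_curr - x_prev)
        step_next = abs(x_next - x_curr)
        if step_now < eps and step_next >= step_now:
            return x_curr
        x_prev, x_curr = x_curr, x_next
    # Iteration budget exhausted: return the latest iterate as a best effort.
    return x_curr


# Hypothetical example: root of f(x) = x**2 - 2, i.e. the square root of 2.
root = chord_method(lambda x: x * x - 2.0, x0=1.0, m=0.3, eps=1e-10)
```

Because the second condition waits until the steps stop shrinking, the loop runs on to the limit precision of floating-point arithmetic. It does not halt at the first step that is smaller than eps.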

In addition, if the Jacobian matrix $J(x)$, defined by $[J(x)]_{ij} = \partial g_j(x) / \partial x_i$, is such that for every $x \in I$, $\|J(x)\| \le m$ for a norm compatible with $m < 1$, Lipschitz’s condition is satisfied. The order of convergence $p$ is defined by $\lim_{k \to \infty} \|e_{k+1}\| / \|e_k\|^p = C$, where $C$ is the constant for the asymptotic error. 8.3.2 Principal methods If one chooses a constant matrix $A$ as the value for $A(x)$, the iterative process is the generalisation in n dimensions of the chord method. If the inverse of the Jacobian matrix of $f$ is chosen as the value of $A(x)$, we will obtain the generalisation in n dimensions of the Newton–Raphson method. Another approach to solving the equation $f(x) = 0$ involves using the $i$th equation to determine the $(i + 1)$th component.
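As a rough illustration of the Newton–Raphson generalisation, the sketch below handles the two-dimensional case in plain Python. The system being solved, the starting point, and the name `newton_2d` are hypothetical. Instead of forming the inverse Jacobian explicitly, the code solves the 2x2 linear system for the step by Cramer's rule, which yields the same update.

```python
def newton_2d(f, jac, x, y, tol=1e-12, max_iter=50):
    """Newton-Raphson for two equations in two unknowns.

    f(x, y) returns (f1, f2); jac(x, y) returns ((a, b), (c, d)),
    the Jacobian of f. Each pass solves J * (dx, dy) = f and subtracts.
    """
    for _ in range(max_iter):
        f1, f2 = f(x, y)
        (a, b), (c, d) = jac(x, y)
        det = a * d - b * c
        if det == 0.0:
            raise ZeroDivisionError("singular Jacobian")
        # Cramer's rule for the Newton step.
        dx = (d * f1 - b * f2) / det
        dy = (a * f2 - c * f1) / det
        x, y = x - dx, y - dy
        if max(abs(dx), abs(dy)) < tol:
            break
    return x, y


# Hypothetical system: x**2 + y**2 = 4 and x * y = 1.
def system(x, y):
    return (x * x + y * y - 4.0, x * y - 1.0)


def jacobian(x, y):
    return ((2.0 * x, 2.0 * y), (y, x))


x, y = newton_2d(system, jacobian, 2.0, 0.5)
```

Replacing `jac` with a function that always returns the same fixed matrix would give the n-dimensional chord method mentioned in the text, at the price of slower convergence.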

pages: 398 words: 31,161

Gnuplot in Action: Understanding Data With Graphs
by Philipp Janert
Published 2 Jan 2010

You may also want to take a look at the gnuplot reference documentation for further discussion and additional features. Since the fitting algorithm is an iterative process, it’s not guaranteed to converge. If the iteration doesn’t converge, or converges to an obviously wrong solution, try to initialize the fitting parameters with better starting values. Unless the variables have been initialized explicitly, they’ll be equal to zero, which is often a particularly bad starting value. In special situations, you may also want to try hand-tuning the iteration process itself by fiddling with values of FIT_START_LAMBDA and FIT_LAMBDA_FACTOR. All fitting parameters should be of roughly equal scale.

It takes only one line to read and plot a data file, and most of the command syntax is straightforward and quite intuitive. Gnuplot does not require programming or any deeper understanding of its command syntax to get started. So this is the fundamental workflow of all work with gnuplot: plot, examine, repeat—until you have found out whatever you wanted to learn from the data. Gnuplot supports the iterative process model required for exploratory work perfectly. 1.3.1 Gnuplot isn’t GNU To dispel one common confusion right away: gnuplot isn’t GNU software, has nothing to do with the GNU project, and isn’t released under the GNU Public License (GPL). Gnuplot is released under a permissive open source license.

In general, the colors are distributed rather uniformly over the entire spectrum, because this matches up with the regularly varying function in this plot. 9.4.2 A complex figure As an example of a graph that includes a lot of fine detail, I’ve chosen a section from the edge of the Mandelbrot set. The Mandelbrot set is the set of all points in the complex plane for which a certain simple iteration process stays bounded. What’s noteworthy here is that the border between points inside the set and outside of it isn’t smooth—in fact the border is “infinitely” complicated, showing details at all levels of magnification.6 For points far from the Mandelbrot set, the iteration will diverge quickly (after just a few steps).
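The excerpt does not spell out the iteration, but the standard definition of the Mandelbrot set uses z -> z*z + c starting from z = 0. A minimal escape-time sketch follows, assuming the usual escape radius of 2; the name `escape_count` and the iteration cap are illustrative choices, not taken from the book.

```python
def escape_count(c, max_iter=100):
    """Count iterations of z -> z*z + c before |z| exceeds 2.

    Returns max_iter if the orbit stays bounded for the whole budget,
    which is taken here as evidence that c belongs to the set.
    """
    z = 0j
    for n in range(max_iter):
        if abs(z) > 2.0:
            return n  # orbit has escaped, so c is outside the set
        z = z * z + c
    return max_iter


inside = escape_count(0j)           # orbit stays at 0
periodic = escape_count(-1 + 0j)    # orbit alternates between -1 and 0
outside = escape_count(2 + 2j)      # far from the set, escapes at once
```

Points near the border need many iterations before they escape. That slow escape is what produces the fine detail the text describes when the counts are mapped to colors.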

pages: 287 words: 44,739

Guide to business modelling
by John Tennent , Graham Friend and Economist Group
Published 15 Dec 2005

The graph in Chart 15.11 shows the NPV for a range of discount rates. (Chart 15.11: The NPV of a project with a range of discount rates.) The graph is a curved shape so the IRR has to be found by trial and error or interpolation between two known points. Even spreadsheets use a trial and error iterative process to find the breakeven point. In the example it was possible to find the point almost exactly. In spreadsheets the IRR function can be used to find the breakeven interest rate. The syntax is: =IRR(range,guess) The range is the cash flows in the model and the guess is the point near where the IRR is expected to be found.
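One way to picture the trial-and-error search is bisection on the NPV curve. The sketch below is an assumption about how such a search could work, not a description of how any particular spreadsheet implements its IRR function. The bracketing interval, tolerance, and cash flows are hypothetical.

```python
def npv(rate, cash_flows):
    """Net present value, with cash_flows[0] occurring at time 0."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows))


def irr(cash_flows, lo=0.0, hi=1.0, tol=1e-9):
    """Find the rate where NPV crosses zero by repeated halving of [lo, hi]."""
    f_lo, f_hi = npv(lo, cash_flows), npv(hi, cash_flows)
    if f_lo * f_hi > 0:
        raise ValueError("NPV does not change sign on [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = npv(mid, cash_flows)
        if f_lo * f_mid <= 0:
            hi = mid               # the sign change lies in the lower half
        else:
            lo, f_lo = mid, f_mid  # the sign change lies in the upper half
    return (lo + hi) / 2


# Hypothetical project: invest 100 now, receive 110 one year later.
rate = irr([-100.0, 110.0])
```

Each pass halves the interval that contains the breakeven rate, so the bracket plays a role similar to the guess argument in narrowing down where to look.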

Conceptual errors These constitute a flaw in the logic, the rationale or the mechanisms depicted in the model. As the business modelling process map in Chapter 6 (Chart 6.1, page 34) indicated, developing an understanding of the logical flows and the relationships within the environment is an iterative process. The testing phase offers the modeller another opportunity to increase and test his or her understanding of the business. User errors These occur in poorly structured and badly documented models with limited checks on user inputs and inadequately trained users. The problems may arise as a result of human error, but the fundamental problem often lies with the design of the model.

However, if a circular reference is an integral part of the design, such as in the case of interest calculations, then the spreadsheet package must be instructed to find a set of values that satisfy the circularity. The values represent an equilibrium that is effectively the solution to a set of simultaneous equations. To allow the spreadsheet to solve the circular reference, the modeller should select tools➞options➞calculation tab➞iteration. The model uses an iterative process where a range of values is used until a consistent set of results is found. Additional error handling may be required in the presence of circular references because if, for example, a #DIV/0!, #N/A! or #REF! occurs, the model will be unable to find a solution and the errors become compounded by the circularity.
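The kind of circularity described here can be sketched as a fixed-point iteration. The model is a hypothetical one in which interest is earned on the average of the opening and closing balances, while the closing balance itself includes that interest. The function name, figures, and tolerance are illustrative assumptions.

```python
def solve_circular_interest(opening, cash_flow, rate, tol=1e-9, max_iter=100):
    """Iterate until interest and closing balance are mutually consistent.

    closing  = opening + cash_flow + interest
    interest = rate * (opening + closing) / 2
    """
    interest = 0.0  # initial guess, like an empty cell in the sheet
    for _ in range(max_iter):
        closing = opening + cash_flow + interest
        new_interest = rate * (opening + closing) / 2
        converged = abs(new_interest - interest) < tol
        interest = new_interest
        if converged:
            return opening + cash_flow + interest, interest
    raise RuntimeError("no consistent set of values found")


closing, interest = solve_circular_interest(1000.0, 200.0, 0.05)
```

The result matches the closed-form solution of the two simultaneous equations, closing = (opening * (1 + rate / 2) + cash_flow) / (1 - rate / 2). This is the equilibrium the text refers to.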

pages: 343 words: 93,544

vN: The First Machine Dynasty (The Machine Dynasty Book 1)
by Madeline Ashby
Published 28 Jul 2012

So you'd better fuel up now." Amy re-examined the plates. They were the smart kind; if she'd asked, they would have told her how many ounces she was eating from each. But she didn't need to ask. "There's too much here," she said. "If I eat all this without having to repair myself, it'll trigger the iteration process." She leaned as far forward as the Cuddlebug would allow. "Will I have to repair myself?" "No. It'll just wear you out, that's all." "How do you know?" "I've seen it happen." Dr Singh stood. "I thought you'd be happy with the spread. Your mother says you were never allowed to eat as much as you wanted.

Asimov's Frankenstein Complex notion isn't just an early version of Mori's Uncanny Valley hypothesis, it's a reasonable extension of the fear that when we create in our own image, we will inevitably re-create the worst parts of ourselves. In other words: "I'm fucked up – therefore my kids will be fucked up, too." When I completed the submission draft of this book, I had just finished the first year of my second Master's – a design degree in strategic foresight. So I had spent months listening to discussions about the iterative process. And I started to realize that a self-replicating species of machine wouldn't have the usual fears about its offspring repeating its signature mistakes, nor would it have that uncanny response to copying. Machines like that could consider their iterations as prototypes, and nothing more. Stephen King has a famous adage about killing your darlings, and they could do that – literally – without a flood of oxytocin or normative culture telling them different.

But even so, the book was rejected by a bunch of different publishers, and I still had to re-write the whole opening of the submission draft before the book became sale-able. David Nickle was invaluable for that – we watched A History of Violence together and suddenly everything clicked. Normally he gives me the end of all my stories, and this time he helped me see a new beginning. In short: it was an iterative process. ANGRY ROBOT A member of the Osprey Group Lace Market House, 54-56 High Pavement, Nottingham, NG1 1HW, UK www.angryrobotbooks.com No three rules An Angry Robot paperback original 2012 1 Copyright © Madeline Ashby 2012 Madeline Ashby asserts the moral right to be identified as the author of this work.

Functional Programming in Scala
by Paul Chiusano and Rúnar Bjarnason
Published 13 Sep 2014

Though because this style of organization is so common in FP, we sometimes don't bother to distinguish between an ordinary functional library and a "combinator library". 7.2 Choosing data types and functions Our goal in this section is to discover a data type and a set of primitive functions for our domain, and derive some useful combinators. This will be a somewhat meandering journey. Functional design can be a messy, iterative process. We hope to show at least a stylized view of this messiness that nonetheless gives some insight into how functional design proceeds in the real world. Don't worry if you don't follow absolutely every bit of discussion throughout this process. This chapter is a bit like peering over the shoulder of someone as they think through possible designs.

And lastly, you can look at your implementation and come up with laws you expect to hold based on your implementation.17 Footnote 17: This last way of generating laws is probably the weakest, since it can be a little too easy to just have the laws reflect the implementation, even if the implementation is buggy or requires all sorts of unusual side conditions that make composition difficult. EXERCISE 13: Can you think of other laws that should hold for your implementation of unit, fork, and map2? Do any of them have interesting consequences? 7.2.4 Expressiveness and the limitations of an algebra Functional design is an iterative process. After you've written down your API and have at least a prototype implementation, try using it for progressively more complex or realistic scenarios. Often you'll find that these scenarios require only some combination of existing primitive or derived combinators, and this is a chance to factor out common usage patterns into other combinators; occasionally you'll find situations where your existing primitives are insufficient.

But even if you decide you like the existing library's solution, spending an hour or two of playing with designs and writing down some type signatures is a great way to learn more about a domain, understand the design tradeoffs, and improve your ability to think through design problems. 8.3 Choosing data types and functions In this section, we will embark on another messy, iterative process of discovering data types and a set of primitive functions and combinators for doing property-based testing. As before, this is a chance to peer over the shoulder of someone working through possible designs. The particular path we take and the library we arrive at isn't necessarily the same as what you would discover.

Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems
by Martin Kleppmann
Published 17 Apr 2017

Batch Processing: Batch Processing with Unix Tools; Simple Log Analysis; The Unix Philosophy; MapReduce and Distributed Filesystems; MapReduce Job Execution; Reduce-Side Joins and Grouping; Map-Side Joins; The Output of Batch Workflows; Comparing Hadoop to Distributed Databases; Beyond MapReduce; Materialization of Intermediate State; Graphs and Iterative Processing; High-Level APIs and Languages; Summary. 11. Stream Processing: Transmitting Event Streams; Messaging Systems; Partitioned Logs; Databases and Streams; Keeping Systems in Sync; Change Data Capture; Event Sourcing; State, Streams, and Immutability; Processing Streams; Uses of Stream Processing; Reasoning About Time; Stream Joins; Fault Tolerance; Summary. 12.

Thus, when using a dataflow engine, materialized datasets on HDFS are still usually the inputs and the final outputs of a job. Like with MapReduce, the inputs are immutable and the output is completely replaced. The improvement over MapReduce is that you save yourself writing all the intermediate state to the filesystem as well.

Graphs and Iterative Processing

In “Graph-Like Data Models” on page 49 we discussed using graphs for modeling data, and using graph query languages to traverse the edges and vertices in a graph. The discussion in Chapter 2 was focused around OLTP-style use: quickly executing queries to find a small number of vertices matching certain criteria.

The opposite of bounded. 558 | Glossary Index A aborts (transactions), 222, 224 in two-phase commit, 356 performance of optimistic concurrency con‐ trol, 266 retrying aborted transactions, 231 abstraction, 21, 27, 222, 266, 321 access path (in network model), 37, 60 accidental complexity, removing, 21 accountability, 535 ACID properties (transactions), 90, 223 atomicity, 223, 228 consistency, 224, 529 durability, 226 isolation, 225, 228 acknowledgements (messaging), 445 active/active replication (see multi-leader repli‐ cation) active/passive replication (see leader-based rep‐ lication) ActiveMQ (messaging), 137, 444 distributed transaction support, 361 ActiveRecord (object-relational mapper), 30, 232 actor model, 138 (see also message-passing) comparison to Pregel model, 425 comparison to stream processing, 468 Advanced Message Queuing Protocol (see AMQP) aerospace systems, 6, 10, 305, 372 aggregation data cubes and materialized views, 101 in batch processes, 406 in stream processes, 466 aggregation pipeline query language, 48 Agile, 22 minimizing irreversibility, 414, 497 moving faster with confidence, 532 Unix philosophy, 394 agreement, 365 (see also consensus) Airflow (workflow scheduler), 402 Ajax, 131 Akka (actor framework), 139 algorithms algorithm correctness, 308 B-trees, 79-83 for distributed systems, 306 hash indexes, 72-75 mergesort, 76, 402, 405 red-black trees, 78 SSTables and LSM-trees, 76-79 all-to-all replication topologies, 175 AllegroGraph (database), 50 ALTER TABLE statement (SQL), 40, 111 Amazon Dynamo (database), 177 Amazon Web Services (AWS), 8 Kinesis Streams (messaging), 448 network reliability, 279 postmortems, 9 RedShift (database), 93 S3 (object storage), 398 checking data integrity, 530 amplification of bias, 534 of failures, 364, 495 Index | 559 of tail latency, 16, 207 write amplification, 84 AMQP (Advanced Message Queuing Protocol), 444 (see also messaging systems) comparison to log-based messaging, 448, 451 message ordering, 446 
analytics, 90 comparison to transaction processing, 91 data warehousing (see data warehousing) parallel query execution in MPP databases, 415 predictive (see predictive analytics) relation to batch processing, 411 schemas for, 93-95 snapshot isolation for queries, 238 stream analytics, 466 using MapReduce, analysis of user activity events (example), 404 anti-caching (in-memory databases), 89 anti-entropy, 178 Apache ActiveMQ (see ActiveMQ) Apache Avro (see Avro) Apache Beam (see Beam) Apache BookKeeper (see BookKeeper) Apache Cassandra (see Cassandra) Apache CouchDB (see CouchDB) Apache Curator (see Curator) Apache Drill (see Drill) Apache Flink (see Flink) Apache Giraph (see Giraph) Apache Hadoop (see Hadoop) Apache HAWQ (see HAWQ) Apache HBase (see HBase) Apache Helix (see Helix) Apache Hive (see Hive) Apache Impala (see Impala) Apache Jena (see Jena) Apache Kafka (see Kafka) Apache Lucene (see Lucene) Apache MADlib (see MADlib) Apache Mahout (see Mahout) Apache Oozie (see Oozie) Apache Parquet (see Parquet) Apache Qpid (see Qpid) Apache Samza (see Samza) Apache Solr (see Solr) Apache Spark (see Spark) 560 | Index Apache Storm (see Storm) Apache Tajo (see Tajo) Apache Tez (see Tez) Apache Thrift (see Thrift) Apache ZooKeeper (see ZooKeeper) Apama (stream analytics), 466 append-only B-trees, 82, 242 append-only files (see logs) Application Programming Interfaces (APIs), 5, 27 for batch processing, 403 for change streams, 456 for distributed transactions, 361 for graph processing, 425 for services, 131-136 (see also services) evolvability, 136 RESTful, 133 SOAP, 133 application state (see state) approximate search (see similarity search) archival storage, data from databases, 131 arcs (see edges) arithmetic mean, 14 ASCII text, 119, 395 ASN.1 (schema language), 127 asynchronous networks, 278, 553 comparison to synchronous networks, 284 formal model, 307 asynchronous replication, 154, 553 conflict detection, 172 data loss on failover, 157 reads from asynchronous 
follower, 162 Asynchronous Transfer Mode (ATM), 285 atomic broadcast (see total order broadcast) atomic clocks (caesium clocks), 294, 295 (see also clocks) atomicity (concurrency), 553 atomic increment-and-get, 351 compare-and-set, 245, 327 (see also compare-and-set operations) replicated operations, 246 write operations, 243 atomicity (transactions), 223, 228, 553 atomic commit, 353 avoiding, 523, 528 blocking and nonblocking, 359 in stream processing, 360, 477 maintaining derived data, 453 for multi-object transactions, 229 for single-object writes, 230 auditability, 528-533 designing for, 531 self-auditing systems, 530 through immutability, 460 tools for auditable data systems, 532 availability, 8 (see also fault tolerance) in CAP theorem, 337 in service level agreements (SLAs), 15 Avro (data format), 122-127 code generation, 127 dynamically generated schemas, 126 object container files, 125, 131, 414 reader determining writer’s schema, 125 schema evolution, 123 use in Hadoop, 414 awk (Unix tool), 391 AWS (see Amazon Web Services) Azure (see Microsoft) B B-trees (indexes), 79-83 append-only/copy-on-write variants, 82, 242 branching factor, 81 comparison to LSM-trees, 83-85 crash recovery, 82 growing by splitting a page, 81 optimizations, 82 similarity to dynamic partitioning, 212 backpressure, 441, 553 in TCP, 282 backups database snapshot for replication, 156 integrity of, 530 snapshot isolation for, 238 use for ETL processes, 405 backward compatibility, 112 BASE, contrast to ACID, 223 bash shell (Unix), 70, 395, 503 batch processing, 28, 389-431, 553 combining with stream processing lambda architecture, 497 unifying technologies, 498 comparison to MPP databases, 414-418 comparison to stream processing, 464 comparison to Unix, 413-414 dataflow engines, 421-423 fault tolerance, 406, 414, 422, 442 for data integration, 494-498 graphs and iterative processing, 424-426 high-level APIs and languages, 403, 426-429 log-based messaging and, 451 maintaining derived 
state, 495 MapReduce and distributed filesystems, 397-413 (see also MapReduce) measuring performance, 13, 390 outputs, 411-413 key-value stores, 412 search indexes, 411 using Unix tools (example), 391-394 Bayou (database), 522 Beam (dataflow library), 498 bias, 534 big ball of mud, 20 Bigtable data model, 41, 99 binary data encodings, 115-128 Avro, 122-127 MessagePack, 116-117 Thrift and Protocol Buffers, 117-121 binary encoding based on schemas, 127 by network drivers, 128 binary strings, lack of support in JSON and XML, 114 BinaryProtocol encoding (Thrift), 118 Bitcask (storage engine), 72 crash recovery, 74 Bitcoin (cryptocurrency), 532 Byzantine fault tolerance, 305 concurrency bugs in exchanges, 233 bitmap indexes, 97 blockchains, 532 Byzantine fault tolerance, 305 blocking atomic commit, 359 Bloom (programming language), 504 Bloom filter (algorithm), 79, 466 BookKeeper (replicated log), 372 Bottled Water (change data capture), 455 bounded datasets, 430, 439, 553 (see also batch processing) bounded delays, 553 in networks, 285 process pauses, 298 broadcast hash joins, 409 Index | 561 brokerless messaging, 442 Brubeck (metrics aggregator), 442 BTM (transaction coordinator), 356 bulk synchronous parallel (BSP) model, 425 bursty network traffic patterns, 285 business data processing, 28, 90, 390 byte sequence, encoding data in, 112 Byzantine faults, 304-306, 307, 553 Byzantine fault-tolerant systems, 305, 532 Byzantine Generals Problem, 304 consensus algorithms and, 366 C caches, 89, 553 and materialized views, 101 as derived data, 386, 499-504 database as cache of transaction log, 460 in CPUs, 99, 338, 428 invalidation and maintenance, 452, 467 linearizability, 324 CAP theorem, 336-338, 554 Cascading (batch processing), 419, 427 hash joins, 409 workflows, 403 cascading failures, 9, 214, 281 Cascalog (batch processing), 60 Cassandra (database) column-family data model, 41, 99 compaction strategy, 79 compound primary key, 204 gossip protocol, 216 hash 
partitioning, 203-205 last-write-wins conflict resolution, 186, 292 leaderless replication, 177 linearizability, lack of, 335 log-structured storage, 78 multi-datacenter support, 184 partitioning scheme, 213 secondary indexes, 207 sloppy quorums, 184 cat (Unix tool), 391 causal context, 191 (see also causal dependencies) causal dependencies, 186-191 capturing, 191, 342, 494, 514 by total ordering, 493 causal ordering, 339 in transactions, 262 sending message to friends (example), 494 562 | Index causality, 554 causal ordering, 339-343 linearizability and, 342 total order consistent with, 344, 345 consistency with, 344-347 consistent snapshots, 340 happens-before relationship, 186 in serializable transactions, 262-265 mismatch with clocks, 292 ordering events to capture, 493 violations of, 165, 176, 292, 340 with synchronized clocks, 294 CEP (see complex event processing) certificate transparency, 532 chain replication, 155 linearizable reads, 351 change data capture, 160, 454 API support for change streams, 456 comparison to event sourcing, 457 implementing, 454 initial snapshot, 455 log compaction, 456 changelogs, 460 change data capture, 454 for operator state, 479 generating with triggers, 455 in stream joins, 474 log compaction, 456 maintaining derived state, 452 Chaos Monkey, 7, 280 checkpointing in batch processors, 422, 426 in high-performance computing, 275 in stream processors, 477, 523 chronicle data model, 458 circuit-switched networks, 284 circular buffers, 450 circular replication topologies, 175 clickstream data, analysis of, 404 clients calling services, 131 pushing state changes to, 512 request routing, 214 stateful and offline-capable, 170, 511 clocks, 287-299 atomic (caesium) clocks, 294, 295 confidence interval, 293-295 for global snapshots, 294 logical (see logical clocks) skew, 291-294, 334 slewing, 289 synchronization and accuracy, 289-291 synchronization using GPS, 287, 290, 294, 295 time-of-day versus monotonic clocks, 288 timestamping 
events, 471 cloud computing, 146, 275 need for service discovery, 372 network glitches, 279 shared resources, 284 single-machine reliability, 8 Cloudera Impala (see Impala) clustered indexes, 86 CODASYL model, 36 (see also network model) code generation with Avro, 127 with Thrift and Protocol Buffers, 118 with WSDL, 133 collaborative editing multi-leader replication and, 170 column families (Bigtable), 41, 99 column-oriented storage, 95-101 column compression, 97 distinction between column families and, 99 in batch processors, 428 Parquet, 96, 131, 414 sort order in, 99-100 vectorized processing, 99, 428 writing to, 101 comma-separated values (see CSV) command query responsibility segregation (CQRS), 462 commands (event sourcing), 459 commits (transactions), 222 atomic commit, 354-355 (see also atomicity; transactions) read committed isolation, 234 three-phase commit (3PC), 359 two-phase commit (2PC), 355-359 commutative operations, 246 compaction of changelogs, 456 (see also log compaction) for stream operator state, 479 of log-structured storage, 73 issues with, 84 size-tiered and leveled approaches, 79 CompactProtocol encoding (Thrift), 119 compare-and-set operations, 245, 327 implementing locks, 370 implementing uniqueness constraints, 331 implementing with total order broadcast, 350 relation to consensus, 335, 350, 352, 374 relation to transactions, 230 compatibility, 112, 128 calling services, 136 properties of encoding formats, 139 using databases, 129-131 using message-passing, 138 compensating transactions, 355, 461, 526 complex event processing (CEP), 465 complexity distilling in theoretical models, 310 hiding using abstraction, 27 of software systems, managing, 20 composing data systems (see unbundling data‐ bases) compute-intensive applications, 3, 275 concatenated indexes, 87 in Cassandra, 204 Concord (stream processor), 466 concurrency actor programming model, 138, 468 (see also message-passing) bugs from weak transaction isolation, 233 conflict 
resolution, 171, 174 detecting concurrent writes, 184-191 dual writes, problems with, 453 happens-before relationship, 186 in replicated systems, 161-191, 324-338 lost updates, 243 multi-version concurrency control (MVCC), 239 optimistic concurrency control, 261 ordering of operations, 326, 341 reducing, through event logs, 351, 462, 507 time and relativity, 187 transaction isolation, 225 write skew (transaction isolation), 246-251 conflict-free replicated datatypes (CRDTs), 174 conflicts conflict detection, 172 causal dependencies, 186, 342 in consensus algorithms, 368 in leaderless replication, 184 Index | 563 in log-based systems, 351, 521 in nonlinearizable systems, 343 in serializable snapshot isolation (SSI), 264 in two-phase commit, 357, 364 conflict resolution automatic conflict resolution, 174 by aborting transactions, 261 by apologizing, 527 convergence, 172-174 in leaderless systems, 190 last write wins (LWW), 186, 292 using atomic operations, 246 using custom logic, 173 determining what is a conflict, 174, 522 in multi-leader replication, 171-175 avoiding conflicts, 172 lost updates, 242-246 materializing, 251 relation to operation ordering, 339 write skew (transaction isolation), 246-251 congestion (networks) avoidance, 282 limiting accuracy of clocks, 293 queueing delays, 282 consensus, 321, 364-375, 554 algorithms, 366-368 preventing split brain, 367 safety and liveness properties, 365 using linearizable operations, 351 cost of, 369 distributed transactions, 352-375 in practice, 360-364 two-phase commit, 354-359 XA transactions, 361-364 impossibility of, 353 membership and coordination services, 370-373 relation to compare-and-set, 335, 350, 352, 374 relation to replication, 155, 349 relation to uniqueness constraints, 521 consistency, 224, 524 across different databases, 157, 452, 462, 492 causal, 339-348, 493 consistent prefix reads, 165-167 consistent snapshots, 156, 237-242, 294, 455, 500 (see also snapshots) 564 | Index crash recovery, 82 
enforcing constraints (see constraints) eventual, 162, 322 (see also eventual consistency) in ACID transactions, 224, 529 in CAP theorem, 337 linearizability, 324-338 meanings of, 224 monotonic reads, 164-165 of secondary indexes, 231, 241, 354, 491, 500 ordering guarantees, 339-352 read-after-write, 162-164 sequential, 351 strong (see linearizability) timeliness and integrity, 524 using quorums, 181, 334 consistent hashing, 204 consistent prefix reads, 165 constraints (databases), 225, 248 asynchronously checked, 526 coordination avoidance, 527 ensuring idempotence, 519 in log-based systems, 521-524 across multiple partitions, 522 in two-phase commit, 355, 357 relation to consensus, 374, 521 relation to event ordering, 347 requiring linearizability, 330 Consul (service discovery), 372 consumers (message streams), 137, 440 backpressure, 441 consumer offsets in logs, 449 failures, 445, 449 fan-out, 11, 445, 448 load balancing, 444, 448 not keeping up with producers, 441, 450, 502 context switches, 14, 297 convergence (conflict resolution), 172-174, 322 coordination avoidance, 527 cross-datacenter, 168, 493 cross-partition ordering, 256, 294, 348, 523 services, 330, 370-373 coordinator (in 2PC), 356 failure, 358 in XA transactions, 361-364 recovery, 363 copy-on-write (B-trees), 82, 242 CORBA (Common Object Request Broker Architecture), 134 correctness, 6 auditability, 528-533 Byzantine fault tolerance, 305, 532 dealing with partial failures, 274 in log-based systems, 521-524 of algorithm within system model, 308 of compensating transactions, 355 of consensus, 368 of derived data, 497, 531 of immutable data, 461 of personal data, 535, 540 of time, 176, 289-295 of transactions, 225, 515, 529 timeliness and integrity, 524-528 corruption of data detecting, 519, 530-533 due to pathological memory access, 529 due to radiation, 305 due to split brain, 158, 302 due to weak transaction isolation, 233 formalization in consensus, 366 integrity as absence of, 524 network 
packets, 306 on disks, 227 preventing using write-ahead logs, 82 recovering from, 414, 460 Couchbase (database) durability, 89 hash partitioning, 203-204, 211 rebalancing, 213 request routing, 216 CouchDB (database) B-tree storage, 242 change feed, 456 document data model, 31 join support, 34 MapReduce support, 46, 400 replication, 170, 173 covering indexes, 86 CPUs cache coherence and memory barriers, 338 caching and pipelining, 99, 428 increasing parallelism, 43 CRDTs (see conflict-free replicated datatypes) CREATE INDEX statement (SQL), 85, 500 credit rating agencies, 535 Crunch (batch processing), 419, 427 hash joins, 409 sharded joins, 408 workflows, 403 cryptography defense against attackers, 306 end-to-end encryption and authentication, 519, 543 proving integrity of data, 532 CSS (Cascading Style Sheets), 44 CSV (comma-separated values), 70, 114, 396 Curator (ZooKeeper recipes), 330, 371 curl (Unix tool), 135, 397 cursor stability, 243 Cypher (query language), 52 comparison to SPARQL, 59 D data corruption (see corruption of data) data cubes, 102 data formats (see encoding) data integration, 490-498, 543 batch and stream processing, 494-498 lambda architecture, 497 maintaining derived state, 495 reprocessing data, 496 unifying, 498 by unbundling databases, 499-515 comparison to federated databases, 501 combining tools by deriving data, 490-494 derived data versus distributed transac‐ tions, 492 limits of total ordering, 493 ordering events to capture causality, 493 reasoning about dataflows, 491 need for, 385 data lakes, 415 data locality (see locality) data models, 27-64 graph-like models, 49-63 Datalog language, 60-63 property graphs, 50 RDF and triple-stores, 55-59 query languages, 42-48 relational model versus document model, 28-42 data protection regulations, 542 data systems, 3 about, 4 Index | 565 concerns when designing, 5 future of, 489-544 correctness, constraints, and integrity, 515-533 data integration, 490-498 unbundling databases, 499-515 
heterogeneous, keeping in sync, 452 maintainability, 18-22 possible faults in, 221 reliability, 6-10 hardware faults, 7 human errors, 9 importance of, 10 software errors, 8 scalability, 10-18 unreliable clocks, 287-299 data warehousing, 91-95, 554 comparison to data lakes, 415 ETL (extract-transform-load), 92, 416, 452 keeping data systems in sync, 452 schema design, 93 slowly changing dimension (SCD), 476 data-intensive applications, 3 database triggers (see triggers) database-internal distributed transactions, 360, 364, 477 databases archival storage, 131 comparison of message brokers to, 443 dataflow through, 129 end-to-end argument for, 519-520 checking integrity, 531 inside-out, 504 (see also unbundling databases) output from batch workflows, 412 relation to event streams, 451-464 (see also changelogs) API support for change streams, 456, 506 change data capture, 454-457 event sourcing, 457-459 keeping systems in sync, 452-453 philosophy of immutable events, 459-464 unbundling, 499-515 composing data storage technologies, 499-504 designing applications around dataflow, 504-509 566 | Index observing derived state, 509-515 datacenters geographically distributed, 145, 164, 278, 493 multi-tenancy and shared resources, 284 network architecture, 276 network faults, 279 replication across multiple, 169 leaderless replication, 184 multi-leader replication, 168, 335 dataflow, 128-139, 504-509 correctness of dataflow systems, 525 differential, 504 message-passing, 136-139 reasoning about, 491 through databases, 129 through services, 131-136 dataflow engines, 421-423 comparison to stream processing, 464 directed acyclic graphs (DAG), 424 partitioning, approach to, 429 support for declarative queries, 427 Datalog (query language), 60-63 datatypes binary strings in XML and JSON, 114 conflict-free, 174 in Avro encodings, 122 in Thrift and Protocol Buffers, 121 numbers in XML and JSON, 114 Datomic (database) B-tree storage, 242 data model, 50, 57 Datalog query language, 60 
excision (deleting data), 463 languages for transactions, 255 serial execution of transactions, 253 deadlocks detection, in two-phase commit (2PC), 364 in two-phase locking (2PL), 258 Debezium (change data capture), 455 declarative languages, 42, 554 Bloom, 504 CSS and XSL, 44 Cypher, 52 Datalog, 60 for batch processing, 427 recursive SQL queries, 53 relational algebra and SQL, 42 SPARQL, 59 delays bounded network delays, 285 bounded process pauses, 298 unbounded network delays, 282 unbounded process pauses, 296 deleting data, 463 denormalization (data representation), 34, 554 costs, 39 in derived data systems, 386 materialized views, 101 updating derived data, 228, 231, 490 versus normalization, 462 derived data, 386, 439, 554 from change data capture, 454 in event sourcing, 458-458 maintaining derived state through logs, 452-457, 459-463 observing, by subscribing to streams, 512 outputs of batch and stream processing, 495 through application code, 505 versus distributed transactions, 492 deterministic operations, 255, 274, 554 accidental nondeterminism, 423 and fault tolerance, 423, 426 and idempotence, 478, 492 computing derived data, 495, 526, 531 in state machine replication, 349, 452, 458 joins, 476 DevOps, 394 differential dataflow, 504 dimension tables, 94 dimensional modeling (see star schemas) directed acyclic graphs (DAGs), 424 dirty reads (transaction isolation), 234 dirty writes (transaction isolation), 235 discrimination, 534 disks (see hard disks) distributed actor frameworks, 138 distributed filesystems, 398-399 decoupling from query engines, 417 indiscriminately dumping data into, 415 use by MapReduce, 402 distributed systems, 273-312, 554 Byzantine faults, 304-306 cloud versus supercomputing, 275 detecting network faults, 280 faults and partial failures, 274-277 formalization of consensus, 365 impossibility results, 338, 353 issues with failover, 157 limitations of distributed transactions, 363 multi-datacenter, 169, 335 network problems, 277-286 
quorums, relying on, 301 reasons for using, 145, 151 synchronized clocks, relying on, 291-295 system models, 306-310 use of clocks and time, 287 distributed transactions (see transactions) Django (web framework), 232 DNS (Domain Name System), 216, 372 Docker (container manager), 506 document data model, 30-42 comparison to relational model, 38-42 document references, 38, 403 document-oriented databases, 31 many-to-many relationships and joins, 36 multi-object transactions, need for, 231 versus relational model convergence of models, 41 data locality, 41 document-partitioned indexes, 206, 217, 411 domain-driven design (DDD), 457 DRBD (Distributed Replicated Block Device), 153 drift (clocks), 289 Drill (query engine), 93 Druid (database), 461 Dryad (dataflow engine), 421 dual writes, problems with, 452, 507 duplicates, suppression of, 517 (see also idempotence) using a unique ID, 518, 522 durability (transactions), 226, 554 duration (time), 287 measurement with monotonic clocks, 288 dynamic partitioning, 212 dynamically typed languages analogy to schema-on-read, 40 code generation and, 127 Dynamo-style databases (see leaderless replica‐ tion) E edges (in graphs), 49, 403 property graph model, 50 edit distance (full-text search), 88 effectively-once semantics, 476, 516 Index | 567 (see also exactly-once semantics) preservation of integrity, 525 elastic systems, 17 Elasticsearch (search server) document-partitioned indexes, 207 partition rebalancing, 211 percolator (stream search), 467 usage example, 4 use of Lucene, 79 ElephantDB (database), 413 Elm (programming language), 504, 512 encodings (data formats), 111-128 Avro, 122-127 binary variants of JSON and XML, 115 compatibility, 112 calling services, 136 using databases, 129-131 using message-passing, 138 defined, 113 JSON, XML, and CSV, 114 language-specific formats, 113 merits of schemas, 127 representations of data, 112 Thrift and Protocol Buffers, 117-121 end-to-end argument, 277, 519-520 checking integrity, 531 
publish/subscribe streams, 512 enrichment (stream), 473 Enterprise JavaBeans (EJB), 134 entities (see vertices) epoch (consensus algorithms), 368 epoch (Unix timestamps), 288 equi-joins, 403 erasure coding (error correction), 398 Erlang OTP (actor framework), 139 error handling for network faults, 280 in transactions, 231 error-correcting codes, 277, 398 Esper (CEP engine), 466 etcd (coordination service), 370-373 linearizable operations, 333 locks and leader election, 330 quorum reads, 351 service discovery, 372 use of Raft algorithm, 349, 353 Ethereum (blockchain), 532 Ethernet (networks), 276, 278, 285 packet checksums, 306, 519 568 | Index Etherpad (collaborative editor), 170 ethics, 533-543 code of ethics and professional practice, 533 legislation and self-regulation, 542 predictive analytics, 533-536 amplifying bias, 534 feedback loops, 536 privacy and tracking, 536-543 consent and freedom of choice, 538 data as assets and power, 540 meaning of privacy, 539 surveillance, 537 respect, dignity, and agency, 543, 544 unintended consequences, 533, 536 ETL (extract-transform-load), 92, 405, 452, 554 use of Hadoop for, 416 event sourcing, 457-459 commands and events, 459 comparison to change data capture, 457 comparison to lambda architecture, 497 deriving current state from event log, 458 immutability and auditability, 459, 531 large, reliable data systems, 519, 526 Event Store (database), 458 event streams (see streams) events, 440 deciding on total order of, 493 deriving views from event log, 461 difference to commands, 459 event time versus processing time, 469, 477, 498 immutable, advantages of, 460, 531 ordering to capture causality, 493 reads as, 513 stragglers, 470, 498 timestamp of, in stream processing, 471 EventSource (browser API), 512 eventual consistency, 152, 162, 308, 322 (see also conflicts) and perpetual inconsistency, 525 evolvability, 21, 111 calling services, 136 graph-structured data, 52 of databases, 40, 129-131, 461, 497 of message-passing, 
138 reprocessing data, 496, 498 schema evolution in Avro, 123 schema evolution in Thrift and Protocol Buffers, 120 schema-on-read, 39, 111, 128 exactly-once semantics, 360, 476, 516 parity with batch processors, 498 preservation of integrity, 525 exclusive mode (locks), 258 eXtended Architecture transactions (see XA transactions) extract-transform-load (see ETL) F Facebook Presto (query engine), 93 React, Flux, and Redux (user interface libra‐ ries), 512 social graphs, 49 Wormhole (change data capture), 455 fact tables, 93 failover, 157, 554 (see also leader-based replication) in leaderless replication, absence of, 178 leader election, 301, 348, 352 potential problems, 157 failures amplification by distributed transactions, 364, 495 failure detection, 280 automatic rebalancing causing cascading failures, 214 perfect failure detectors, 359 timeouts and unbounded delays, 282, 284 using ZooKeeper, 371 faults versus, 7 partial failures in distributed systems, 275-277, 310 fan-out (messaging systems), 11, 445 fault tolerance, 6-10, 555 abstractions for, 321 formalization in consensus, 365-369 use of replication, 367 human fault tolerance, 414 in batch processing, 406, 414, 422, 425 in log-based systems, 520, 524-526 in stream processing, 476-479 atomic commit, 477 idempotence, 478 maintaining derived state, 495 microbatching and checkpointing, 477 rebuilding state after a failure, 478 of distributed transactions, 362-364 transaction atomicity, 223, 354-361 faults, 6 Byzantine faults, 304-306 failures versus, 7 handled by transactions, 221 handling in supercomputers and cloud computing, 275 hardware, 7 in batch processing versus distributed data‐ bases, 417 in distributed systems, 274-277 introducing deliberately, 7, 280 network faults, 279-281 asymmetric faults, 300 detecting, 280 tolerance of, in multi-leader replication, 169 software errors, 8 tolerating (see fault tolerance) federated databases, 501 fence (CPU instruction), 338 fencing (preventing split brain), 158, 
302-304 generating fencing tokens, 349, 370 properties of fencing tokens, 308 stream processors writing to databases, 478, 517 Fibre Channel (networks), 398 field tags (Thrift and Protocol Buffers), 119-121 file descriptors (Unix), 395 financial data, 460 Firebase (database), 456 Flink (processing framework), 421-423 dataflow APIs, 427 fault tolerance, 422, 477, 479 Gelly API (graph processing), 425 integration of batch and stream processing, 495, 498 machine learning, 428 query optimizer, 427 stream processing, 466 flow control, 282, 441, 555 FLP result (on consensus), 353 FlumeJava (dataflow library), 403, 427 followers, 152, 555 (see also leader-based replication) foreign keys, 38, 403 forward compatibility, 112 forward decay (algorithm), 16 Index | 569 Fossil (version control system), 463 shunning (deleting data), 463 FoundationDB (database) serializable transactions, 261, 265, 364 fractal trees, 83 full table scans, 403 full-text search, 555 and fuzzy indexes, 88 building search indexes, 411 Lucene storage engine, 79 functional reactive programming (FRP), 504 functional requirements, 22 futures (asynchronous operations), 135 fuzzy search (see similarity search) G garbage collection immutability and, 463 process pauses for, 14, 296-299, 301 (see also process pauses) genome analysis, 63, 429 geographically distributed datacenters, 145, 164, 278, 493 geospatial indexes, 87 Giraph (graph processing), 425 Git (version control system), 174, 342, 463 GitHub, postmortems, 157, 158, 309 global indexes (see term-partitioned indexes) GlusterFS (distributed filesystem), 398 GNU Coreutils (Linux), 394 GoldenGate (change data capture), 161, 170, 455 (see also Oracle) Google Bigtable (database) data model (see Bigtable data model) partitioning scheme, 199, 202 storage layout, 78 Chubby (lock service), 370 Cloud Dataflow (stream processor), 466, 477, 498 (see also Beam) Cloud Pub/Sub (messaging), 444, 448 Docs (collaborative editor), 170 Dremel (query engine), 93, 96 
FlumeJava (dataflow library), 403, 427 GFS (distributed file system), 398 gRPC (RPC framework), 135 MapReduce (batch processing), 390 570 | Index (see also MapReduce) building search indexes, 411 task preemption, 418 Pregel (graph processing), 425 Spanner (see Spanner) TrueTime (clock API), 294 gossip protocol, 216 government use of data, 541 GPS (Global Positioning System) use for clock synchronization, 287, 290, 294, 295 GraphChi (graph processing), 426 graphs, 555 as data models, 49-63 example of graph-structured data, 49 property graphs, 50 RDF and triple-stores, 55-59 versus the network model, 60 processing and analysis, 424-426 fault tolerance, 425 Pregel processing model, 425 query languages Cypher, 52 Datalog, 60-63 recursive SQL queries, 53 SPARQL, 59-59 Gremlin (graph query language), 50 grep (Unix tool), 392 GROUP BY clause (SQL), 406 grouping records in MapReduce, 406 handling skew, 407 H Hadoop (data infrastructure) comparison to distributed databases, 390 comparison to MPP databases, 414-418 comparison to Unix, 413-414, 499 diverse processing models in ecosystem, 417 HDFS distributed filesystem (see HDFS) higher-level tools, 403 join algorithms, 403-410 (see also MapReduce) MapReduce (see MapReduce) YARN (see YARN) happens-before relationship, 340 capturing, 187 concurrency and, 186 hard disks access patterns, 84 detecting corruption, 519, 530 faults in, 7, 227 sequential write throughput, 75, 450 hardware faults, 7 hash indexes, 72-75 broadcast hash joins, 409 partitioned hash joins, 409 hash partitioning, 203-205, 217 consistent hashing, 204 problems with hash mod N, 210 range queries, 204 suitable hash functions, 203 with fixed number of partitions, 210 HAWQ (database), 428 HBase (database) bug due to lack of fencing, 302 bulk loading, 413 column-family data model, 41, 99 dynamic partitioning, 212 key-range partitioning, 202 log-structured storage, 78 request routing, 216 size-tiered compaction, 79 use of HDFS, 417 use of ZooKeeper, 370 HDFS 
(Hadoop Distributed File System), 398-399 (see also distributed filesystems) checking data integrity, 530 decoupling from query engines, 417 indiscriminately dumping data into, 415 metadata about datasets, 410 NameNode, 398 use by Flink, 479 use by HBase, 212 use by MapReduce, 402 HdrHistogram (numerical library), 16 head (Unix tool), 392 head vertex (property graphs), 51 head-of-line blocking, 15 heap files (databases), 86 Helix (cluster manager), 216 heterogeneous distributed transactions, 360, 364 heuristic decisions (in 2PC), 363 Hibernate (object-relational mapper), 30 hierarchical model, 36 high availability (see fault tolerance) high-frequency trading, 290, 299 high-performance computing (HPC), 275 hinted handoff, 183 histograms, 16 Hive (query engine), 419, 427 for data warehouses, 93 HCatalog and metastore, 410 map-side joins, 409 query optimizer, 427 skewed joins, 408 workflows, 403 Hollerith machines, 390 hopping windows (stream processing), 472 (see also windows) horizontal scaling (see scaling out) HornetQ (messaging), 137, 444 distributed transaction support, 361 hot spots, 201 due to celebrities, 205 for time-series data, 203 in batch processing, 407 relieving, 205 hot standbys (see leader-based replication) HTTP, use in APIs (see services) human errors, 9, 279, 414 HyperDex (database), 88 HyperLogLog (algorithm), 466 I I/O operations, waiting for, 297 IBM DB2 (database) distributed transaction support, 361 recursive query support, 54 serializable isolation, 242, 257 XML and JSON support, 30, 42 electromechanical card-sorting machines, 390 IMS (database), 36 imperative query APIs, 46 InfoSphere Streams (CEP engine), 466 MQ (messaging), 444 distributed transaction support, 361 System R (database), 222 WebSphere (messaging), 137 idempotence, 134, 478, 555 by giving operations unique IDs, 518, 522 idempotent operations, 517 immutability advantages of, 460, 531 deriving state from event log, 459-464 for crash recovery, 75 in B-trees, 82, 242 
in event sourcing, 457 inputs to Unix commands, 397 limitations of, 463 Impala (query engine) for data warehouses, 93 hash joins, 409 native code generation, 428 use of HDFS, 417 impedance mismatch, 29 imperative languages, 42 setting element styles (example), 45 in doubt (transaction status), 358 holding locks, 362 orphaned transactions, 363 in-memory databases, 88 durability, 227 serial transaction execution, 253 incidents cascading failures, 9 crashes due to leap seconds, 290 data corruption and financial losses due to concurrency bugs, 233 data corruption on hard disks, 227 data loss due to last-write-wins, 173, 292 data on disks unreadable, 309 deleted items reappearing, 174 disclosure of sensitive data due to primary key reuse, 157 errors in transaction serializability, 529 gigabit network interface with 1 Kb/s throughput, 311 network faults, 279 network interface dropping only inbound packets, 279 network partitions and whole-datacenter failures, 275 poor handling of network faults, 280 sending message to ex-partner, 494 sharks biting undersea cables, 279 split brain due to 1-minute packet delay, 158, 279 vibrations in server rack, 14 violation of uniqueness constraint, 529 indexes, 71, 555 and snapshot isolation, 241 as derived data, 386, 499-504 B-trees, 79-83 building in batch processes, 411 clustered, 86 comparison of B-trees and LSM-trees, 83-85 concatenated, 87 covering (with included columns), 86 creating, 500 full-text search, 88 geospatial, 87 hash, 72-75 index-range locking, 260 multi-column, 87 partitioning and secondary indexes, 206-209, 217 secondary, 85 (see also secondary indexes) problems with dual writes, 452, 491 SSTables and LSM-trees, 76-79 updating when data changes, 452, 467 Industrial Revolution, 541 InfiniBand (networks), 285 InfiniteGraph (database), 50 InnoDB (storage engine) clustered index on primary key, 86 not preventing lost updates, 245 preventing write skew, 248, 257 serializable isolation, 257 snapshot isolation 
support, 239 inside-out databases, 504 (see also unbundling databases) integrating different data systems (see data integration) integrity, 524 coordination-avoiding data systems, 528 correctness of dataflow systems, 525 in consensus formalization, 365 integrity checks, 530 (see also auditing) end-to-end, 519, 531 use of snapshot isolation, 238 maintaining despite software bugs, 529 Interface Definition Language (IDL), 117, 122 intermediate state, materialization of, 420-423 internet services, systems for implementing, 275 invariants, 225 (see also constraints) inversion of control, 396 IP (Internet Protocol) unreliability of, 277 ISDN (Integrated Services Digital Network), 284 isolation (in transactions), 225, 228, 555 correctness and, 515 for single-object writes, 230 serializability, 251-266 actual serial execution, 252-256 serializable snapshot isolation (SSI), 261-266 two-phase locking (2PL), 257-261 violating, 228 weak isolation levels, 233-251 preventing lost updates, 242-246 read committed, 234-237 snapshot isolation, 237-242 iterative processing, 424-426 J Java Database Connectivity (JDBC) distributed transaction support, 361 network drivers, 128 Java Enterprise Edition (EE), 134, 356, 361 Java Message Service (JMS), 444 (see also messaging systems) comparison to log-based messaging, 448, 451 distributed transaction support, 361 message ordering, 446 Java Transaction API (JTA), 355, 361 Java Virtual Machine (JVM) bytecode generation, 428 garbage collection pauses, 296 process reuse in batch processors, 422 JavaScript in MapReduce querying, 46 setting element styles (example), 45 use in advanced queries, 48 Jena (RDF framework), 57 Jepsen (fault tolerance testing), 515 jitter (network delay), 284 joins, 555 by index lookup, 403 expressing as relational operators, 427 in relational and document databases, 34 MapReduce map-side joins, 408-410 broadcast hash joins, 409 merge joins, 410 partitioned hash joins, 409 MapReduce reduce-side joins, 403-408 handling 
skew, 407 sort-merge joins, 405 parallel execution of, 415 secondary indexes and, 85 stream joins, 472-476 stream-stream join, 473 stream-table join, 473 table-table join, 474 time-dependence of, 475 support in document databases, 42 JOTM (transaction coordinator), 356 JSON Avro schema representation, 122 binary variants, 115 for application data, issues with, 114 in relational databases, 30, 42 representing a résumé (example), 31 Juttle (query language), 504 K k-nearest neighbors, 429 Kafka (messaging), 137, 448 Kafka Connect (database integration), 457, 461 Kafka Streams (stream processor), 466, 467 fault tolerance, 479 leader-based replication, 153 log compaction, 456, 467 message offsets, 447, 478 request routing, 216 transaction support, 477 usage example, 4 Ketama (partitioning library), 213 key-value stores, 70 as batch process output, 412 hash indexes, 72-75 in-memory, 89 partitioning, 201-205 by hash of key, 203, 217 by key range, 202, 217 dynamic partitioning, 212 skew and hot spots, 205 Kryo (Java), 113 Kubernetes (cluster manager), 418, 506 L lambda architecture, 497 Lamport timestamps, 345 Large Hadron Collider (LHC), 64 last write wins (LWW), 173, 334 discarding concurrent writes, 186 problems with, 292 prone to lost updates, 246 late binding, 396 latency instability under two-phase locking, 259 network latency and resource utilization, 286 response time versus, 14 tail latency, 15, 207 leader-based replication, 152-161 (see also replication) failover, 157, 301 handling node outages, 156 implementation of replication logs change data capture, 454-457 (see also changelogs) statement-based, 158 trigger-based replication, 161 write-ahead log (WAL) shipping, 159 linearizability of operations, 333 locking and leader election, 330 log sequence number, 156, 449 read-scaling architecture, 161 relation to consensus, 367 setting up new followers, 155 synchronous versus asynchronous, 153-155 leaderless replication, 177-191 (see also replication) 
detecting concurrent writes, 184-191 capturing happens-before relationship, 187 happens-before relationship and concurrency, 186 last write wins, 186 merging concurrently written values, 190 version vectors, 191 multi-datacenter, 184 quorums, 179-182 consistency limitations, 181-183, 334 sloppy quorums and hinted handoff, 183 read repair and anti-entropy, 178 leap seconds, 8, 290 in time-of-day clocks, 288 leases, 295 implementation with ZooKeeper, 370 need for fencing, 302 ledgers, 460 distributed ledger technologies, 532 legacy systems, maintenance of, 18 less (Unix tool), 397 LevelDB (storage engine), 78 leveled compaction, 79 Levenshtein automata, 88 limping (partial failure), 311 linearizability, 324-338, 555 cost of, 335-338 CAP theorem, 336 memory on multi-core CPUs, 338 definition, 325-329 implementing with total order broadcast, 350 in ZooKeeper, 370 of derived data systems, 492, 524 avoiding coordination, 527 of different replication methods, 332-335 using quorums, 334 relying on, 330-332 constraints and uniqueness, 330 cross-channel timing dependencies, 331 locking and leader election, 330 stronger than causal consistency, 342 using to implement total order broadcast, 351 versus serializability, 329 LinkedIn Azkaban (workflow scheduler), 402 Databus (change data capture), 161, 455 Espresso (database), 31, 126, 130, 153, 216 Helix (cluster manager) (see Helix) profile (example), 30 reference to company entity (example), 34 Rest.li (RPC framework), 135 Voldemort (database) (see Voldemort) Linux, leap second bug, 8, 290 liveness properties, 308 LMDB (storage engine), 82, 242 load approaches to coping with, 17 describing, 11 load testing, 16 load balancing (messaging), 444 local indexes (see document-partitioned indexes) locality (data access), 32, 41, 555 in batch processing, 400, 405, 421 in stateful clients, 170, 511 in stream processing, 474, 478, 508, 522 location transparency, 134 in the actor model, 138 locks, 556 deadlock, 258 
distributed locking, 301-304, 330 fencing tokens, 303 implementation with ZooKeeper, 370 relation to consensus, 374 for transaction isolation in snapshot isolation, 239 in two-phase locking (2PL), 257-261 making operations atomic, 243 performance, 258 preventing dirty writes, 236 preventing phantoms with index-range locks, 260, 265 read locks (shared mode), 236, 258 shared mode and exclusive mode, 258 in two-phase commit (2PC) deadlock detection, 364 in-doubt transactions holding locks, 362 materializing conflicts with, 251 preventing lost updates by explicit locking, 244 log sequence number, 156, 449 logic programming languages, 504 logical clocks, 293, 343, 494 for read-after-write consistency, 164 logical logs, 160 logs (data structure), 71, 556 advantages of immutability, 460 compaction, 73, 79, 456, 460 for stream operator state, 479 creating using total order broadcast, 349 implementing uniqueness constraints, 522 log-based messaging, 446-451 comparison to traditional messaging, 448, 451 consumer offsets, 449 disk space usage, 450 replaying old messages, 451, 496, 498 slow consumers, 450 using logs for message storage, 447 log-structured storage, 71-79 log-structured merge tree (see LSM-trees) replication, 152, 158-161 change data capture, 454-457 (see also changelogs) coordination with snapshot, 156 logical (row-based) replication, 160 statement-based replication, 158 trigger-based replication, 161 write-ahead log (WAL) shipping, 159 scalability limits, 493 loose coupling, 396, 419, 502 lost updates (see updates) LSM-trees (indexes), 78-79 comparison to B-trees, 83-85 Lucene (storage engine), 79 building indexes in batch processes, 411 similarity search, 88 Luigi (workflow scheduler), 402 LWW (see last write wins) M machine learning ethical considerations, 534 (see also ethics) iterative processing, 424 models derived from training data, 505 statistical and numerical algorithms, 428 MADlib (machine learning toolkit), 428 magic scaling sauce, 18 Mahout 
(machine learning toolkit), 428 maintainability, 18-22, 489 defined, 23 design principles for software systems, 19 evolvability (see evolvability) operability, 19 simplicity and managing complexity, 20 many-to-many relationships in document model versus relational model, 39 modeling as graphs, 49 many-to-one and many-to-many relationships, 33-36 many-to-one relationships, 34 MapReduce (batch processing), 390, 399-400 accessing external services within job, 404, 412 comparison to distributed databases designing for frequent faults, 417 diversity of processing models, 416 diversity of storage, 415 comparison to stream processing, 464 comparison to Unix, 413-414 disadvantages and limitations of, 419 fault tolerance, 406, 414, 422 higher-level tools, 403, 426 implementation in Hadoop, 400-403 the shuffle, 402 implementation in MongoDB, 46-48 machine learning, 428 map-side processing, 408-410 broadcast hash joins, 409 merge joins, 410 partitioned hash joins, 409 mapper and reducer functions, 399 materialization of intermediate state, 419-423 output of batch workflows, 411-413 building search indexes, 411 key-value stores, 412 reduce-side processing, 403-408 analysis of user activity events (example), 404 grouping records by same key, 406 handling skew, 407 sort-merge joins, 405 workflows, 402 marshalling (see encoding) massively parallel processing (MPP), 216 comparison to composing storage technologies, 502 comparison to Hadoop, 414-418, 428 master-master replication (see multi-leader replication) master-slave replication (see leader-based replication) materialization, 556 aggregate values, 101 conflicts, 251 intermediate state (batch processing), 420-423 materialized views, 101 as derived data, 386, 499-504 maintaining, using stream processing, 467, 475 Maven (Java build tool), 428 Maxwell (change data capture), 455 mean, 14 media monitoring, 467 median, 14 meeting room booking (example), 249, 259, 521 membership services, 372 Memcached 
(caching server), 4, 89 memory in-memory databases, 88 durability, 227 serial transaction execution, 253 in-memory representation of data, 112 random bit-flips in, 529 use by indexes, 72, 77 memory barrier (CPU instruction), 338 MemSQL (database) in-memory storage, 89 read committed isolation, 236 memtable (in LSM-trees), 78 Mercurial (version control system), 463 merge joins, MapReduce map-side, 410 mergeable persistent data structures, 174 merging sorted files, 76, 402, 405 Merkle trees, 532 Mesos (cluster manager), 418, 506 message brokers (see messaging systems) message-passing, 136-139 advantages over direct RPC, 137 distributed actor frameworks, 138 evolvability, 138 MessagePack (encoding format), 116 messages exactly-once semantics, 360, 476 loss of, 442 using total order broadcast, 348 messaging systems, 440-451 (see also streams) backpressure, buffering, or dropping messages, 441 brokerless messaging, 442 event logs, 446-451 comparison to traditional messaging, 448, 451 consumer offsets, 449 replaying old messages, 451, 496, 498 slow consumers, 450 message brokers, 443-446 acknowledgements and redelivery, 445 comparison to event logs, 448, 451 multiple consumers of same topic, 444 reliability, 442 uniqueness in log-based messaging, 522 Meteor (web framework), 456 microbatching, 477, 495 microservices, 132 (see also services) causal dependencies across services, 493 loose coupling, 502 relation to batch/stream processors, 389, 508 Microsoft Azure Service Bus (messaging), 444 Azure Storage, 155, 398 Azure Stream Analytics, 466 DCOM (Distributed Component Object Model), 134 MSDTC (transaction coordinator), 356 Orleans (see Orleans) SQL Server (see SQL Server) migrating (rewriting) data, 40, 130, 461, 497 modulus operator (%), 210 MongoDB (database) aggregation pipeline, 48 atomic operations, 243 BSON, 41 document data model, 31 hash partitioning (sharding), 203-204 key-range partitioning, 202 lack of join support, 34, 42 leader-based replication, 153 
MapReduce support, 46, 400 oplog parsing, 455, 456 partition splitting, 212 request routing, 216 secondary indexes, 207 Mongoriver (change data capture), 455 monitoring, 10, 19 monotonic clocks, 288 monotonic reads, 164 MPP (see massively parallel processing) MSMQ (messaging), 361 multi-column indexes, 87 multi-leader replication, 168-177 (see also replication) handling write conflicts, 171 conflict avoidance, 172 converging toward a consistent state, 172 custom conflict resolution logic, 173 determining what is a conflict, 174 linearizability, lack of, 333 replication topologies, 175-177 use cases, 168 clients with offline operation, 170 collaborative editing, 170 multi-datacenter replication, 168, 335 multi-object transactions, 228 need for, 231 Multi-Paxos (total order broadcast), 367 multi-table index cluster tables (Oracle), 41 multi-tenancy, 284 multi-version concurrency control (MVCC), 239, 266 detecting stale MVCC reads, 263 indexes and snapshot isolation, 241 mutual exclusion, 261 (see also locks) MySQL (database) binlog coordinates, 156 binlog parsing for change data capture, 455 circular replication topology, 175 consistent snapshots, 156 distributed transaction support, 361 InnoDB storage engine (see InnoDB) JSON support, 30, 42 leader-based replication, 153 performance of XA transactions, 360 row-based replication, 160 schema changes in, 40 snapshot isolation support, 242 (see also InnoDB) statement-based replication, 159 Tungsten Replicator (multi-leader replication), 170 conflict detection, 177 N nanomsg (messaging library), 442 Narayana (transaction coordinator), 356 NATS (messaging), 137 near-real-time (nearline) processing, 390 (see also stream processing) Neo4j (database) Cypher query language, 52 graph data model, 50 Nephele (dataflow engine), 421 netcat (Unix tool), 397 Netflix Chaos Monkey, 7, 280 Network Attached Storage (NAS), 146, 398 network model, 36 graph databases versus, 60 imperative query APIs, 46 Network Time Protocol 
(see NTP) networks congestion and queueing, 282 datacenter network topologies, 276 faults (see faults) linearizability and network delays, 338 network partitions, 279, 337 timeouts and unbounded delays, 281 next-key locking, 260 nodes (in graphs) (see vertices) nodes (processes), 556 handling outages in leader-based replication, 156 system models for failure, 307 noisy neighbors, 284 nonblocking atomic commit, 359 nondeterministic operations accidental nondeterminism, 423 partial failures in distributed systems, 275 nonfunctional requirements, 22 nonrepeatable reads, 238 (see also read skew) normalization (data representation), 33, 556 executing joins, 39, 42, 403 foreign key references, 231 in systems of record, 386 versus denormalization, 462 NoSQL, 29, 499 transactions and, 223 Notation3 (N3), 56 npm (package manager), 428 NTP (Network Time Protocol), 287 accuracy, 289, 293 adjustments to monotonic clocks, 289 multiple server addresses, 306 numbers, in XML and JSON encodings, 114 O object-relational mapping (ORM) frameworks, 30 error handling and aborted transactions, 232 unsafe read-modify-write cycle code, 244 object-relational mismatch, 29 observer pattern, 506 offline systems, 390 (see also batch processing) stateful, offline-capable clients, 170, 511 offline-first applications, 511 offsets consumer offsets in partitioned logs, 449 messages in partitioned logs, 447 OLAP (online analytic processing), 91, 556 data cubes, 102 OLTP (online transaction processing), 90, 556 analytics queries versus, 411 workload characteristics, 253 one-to-many relationships, 30 JSON representation, 32 online systems, 389 (see also services) Oozie (workflow scheduler), 402 OpenAPI (service definition format), 133 OpenStack Nova (cloud infrastructure) use of ZooKeeper, 370 Swift (object storage), 398 operability, 19 operating systems versus databases, 499 operation identifiers, 518, 522 operational transformation, 174 operators, 421 flow of data between, 424 in stream 
processing, 464 optimistic concurrency control, 261 Oracle (database) distributed transaction support, 361 GoldenGate (change data capture), 161, 170, 455 lack of serializability, 226 leader-based replication, 153 multi-table index cluster tables, 41 not preventing write skew, 248 partitioned indexes, 209 PL/SQL language, 255 preventing lost updates, 245 read committed isolation, 236 Real Application Clusters (RAC), 330 recursive query support, 54 snapshot isolation support, 239, 242 TimesTen (in-memory database), 89 WAL-based replication, 160 XML support, 30 ordering, 339-352 by sequence numbers, 343-348 causal ordering, 339-343 partial order, 341 limits of total ordering, 493 total order broadcast, 348-352 Orleans (actor framework), 139 outliers (response time), 14 Oz (programming language), 504 P package managers, 428, 505 packet switching, 285 packets corruption of, 306 sending via UDP, 442 PageRank (algorithm), 49, 424 paging (see virtual memory) ParAccel (database), 93 parallel databases (see massively parallel processing) parallel execution of graph analysis algorithms, 426 queries in MPP databases, 216 Parquet (data format), 96, 131 (see also column-oriented storage) use in Hadoop, 414 partial failures, 275, 310 limping, 311 partial order, 341 partitioning, 199-218, 556 and replication, 200 in batch processing, 429 multi-partition operations, 514 enforcing constraints, 522 secondary index maintenance, 495 of key-value data, 201-205 by key range, 202 skew and hot spots, 205 rebalancing partitions, 209-214 automatic or manual rebalancing, 213 problems with hash mod N, 210 using dynamic partitioning, 212 using fixed number of partitions, 210 using N partitions per node, 212 replication and, 147 request routing, 214-216 secondary indexes, 206-209 document-based partitioning, 206 term-based partitioning, 208 serial execution of transactions and, 255 Paxos (consensus algorithm), 366 ballot number, 368 Multi-Paxos (total order broadcast), 367 percentiles, 14, 
556 calculating efficiently, 16 importance of high percentiles, 16 use in service level agreements (SLAs), 15 Percona XtraBackup (MySQL tool), 156 performance describing, 13 of distributed transactions, 360 of in-memory databases, 89 of linearizability, 338 of multi-leader replication, 169 perpetual inconsistency, 525 pessimistic concurrency control, 261 phantoms (transaction isolation), 250 materializing conflicts, 251 preventing, in serializability, 259 physical clocks (see clocks) pickle (Python), 113 Pig (dataflow language), 419, 427 replicated joins, 409 skewed joins, 407 workflows, 403 Pinball (workflow scheduler), 402 pipelined execution, 423 in Unix, 394 point in time, 287 polyglot persistence, 29 polystores, 501 PostgreSQL (database) BDR (multi-leader replication), 170 causal ordering of writes, 177 Bottled Water (change data capture), 455 Bucardo (trigger-based replication), 161, 173 distributed transaction support, 361 foreign data wrappers, 501 full text search support, 490 leader-based replication, 153 log sequence number, 156 MVCC implementation, 239, 241 PL/pgSQL language, 255 PostGIS geospatial indexes, 87 preventing lost updates, 245 preventing write skew, 248, 261 read committed isolation, 236 recursive query support, 54 representing graphs, 51 serializable snapshot isolation (SSI), 261 snapshot isolation support, 239, 242 WAL-based replication, 160 XML and JSON support, 30, 42 pre-splitting, 212 Precision Time Protocol (PTP), 290 predicate locks, 259 predictive analytics, 533-536 amplifying bias, 534 ethics of (see ethics) feedback loops, 536 preemption of datacenter resources, 418 of threads, 298 Pregel processing model, 425 primary keys, 85, 556 compound primary key (Cassandra), 204 primary-secondary replication (see leader-based replication) privacy, 536-543 consent and freedom of choice, 538 data as assets and power, 540 deleting data, 463 ethical considerations (see ethics) legislation and self-regulation, 542 meaning of, 539 
surveillance, 537 tracking behavioral data, 536 probabilistic algorithms, 16, 466 process pauses, 295-299 processing time (of events), 469 producers (message streams), 440 programming languages dataflow languages, 504 for stored procedures, 255 functional reactive programming (FRP), 504 logic programming, 504 Prolog (language), 61 (see also Datalog) promises (asynchronous operations), 135 property graphs, 50 Cypher query language, 52 Protocol Buffers (data format), 117-121 field tags and schema evolution, 120 provenance of data, 531 publish/subscribe model, 441 publishers (message streams), 440 punch card tabulating machines, 390 pure functions, 48 putting computation near data, 400 Q Qpid (messaging), 444 quality of service (QoS), 285 Quantcast File System (distributed filesystem), 398 query languages, 42-48 aggregation pipeline, 48 CSS and XSL, 44 Cypher, 52 Datalog, 60 Juttle, 504 MapReduce querying, 46-48 recursive SQL queries, 53 relational algebra and SQL, 42 SPARQL, 59 query optimizers, 37, 427 queueing delays (networks), 282 head-of-line blocking, 15 latency and response time, 14 queues (messaging), 137 quorums, 179-182, 556 for leaderless replication, 179 in consensus algorithms, 368 limitations of consistency, 181-183, 334 making decisions in distributed systems, 301 monitoring staleness, 182 multi-datacenter replication, 184 relying on durability, 309 sloppy quorums and hinted handoff, 183 R R-trees (indexes), 87 RabbitMQ (messaging), 137, 444 leader-based replication, 153 race conditions, 225 (see also concurrency) avoiding with linearizability, 331 caused by dual writes, 452 dirty writes, 235 in counter increments, 235 lost updates, 242-246 preventing with event logs, 462, 507 preventing with serializable isolation, 252 write skew, 246-251 Raft (consensus algorithm), 366 sensitivity to network problems, 369 term number, 368 use in etcd, 353 RAID (Redundant Array of Independent Disks), 7, 398 railways, schema migration on, 496 RAMCloud 
(in-memory storage), 89 ranking algorithms, 424 RDF (Resource Description Framework), 57 querying with SPARQL, 59 RDMA (Remote Direct Memory Access), 276 read committed isolation level, 234-237 implementing, 236 multi-version concurrency control (MVCC), 239 no dirty reads, 234 no dirty writes, 235 read path (derived data), 509 read repair (leaderless replication), 178 for linearizability, 335 read replicas (see leader-based replication) read skew (transaction isolation), 238, 266 as violation of causality, 340 read-after-write consistency, 163, 524 cross-device, 164 read-modify-write cycle, 243 read-scaling architecture, 161 reads as events, 513 real-time collaborative editing, 170 near-real-time processing, 390 (see also stream processing) publish/subscribe dataflow, 513 response time guarantees, 298 time-of-day clocks, 288 rebalancing partitions, 209-214, 556 (see also partitioning) automatic or manual rebalancing, 213 dynamic partitioning, 212 fixed number of partitions, 210 fixed number of partitions per node, 212 problems with hash mod N, 210 recency guarantee, 324 recommendation engines batch process outputs, 412 batch workflows, 403, 420 iterative processing, 424 statistical and numerical algorithms, 428 records, 399 events in stream processing, 440 recursive common table expressions (SQL), 54 redelivery (messaging), 445 Redis (database) atomic operations, 243 durability, 89 Lua scripting, 255 single-threaded execution, 253 usage example, 4 redundancy hardware components, 7 of derived data, 386 (see also derived data) Reed–Solomon codes (error correction), 398 refactoring, 22 (see also evolvability) regions (partitioning), 199 register (data structure), 325 relational data model, 28-42 comparison to document model, 38-42 graph queries in SQL, 53 in-memory databases with, 89 many-to-one and many-to-many relationships, 33 multi-object transactions, need for, 231 NoSQL as alternative to, 29 object-relational mismatch, 29 relational algebra and SQL, 42 versus 
document model convergence of models, 41 data locality, 41 relational databases eventual consistency, 162 history, 28 leader-based replication, 153 logical logs, 160 philosophy compared to Unix, 499, 501 schema changes, 40, 111, 130 statement-based replication, 158 use of B-tree indexes, 80 relationships (see edges) reliability, 6-10, 489 building a reliable system from unreliable components, 276 defined, 6, 22 hardware faults, 7 human errors, 9 importance of, 10 of messaging systems, 442 software errors, 8 Remote Method Invocation (Java RMI), 134 remote procedure calls (RPCs), 134-136 (see also services) based on futures, 135 data encoding and evolution, 136 issues with, 134 using Avro, 126, 135 using Thrift, 135 versus message brokers, 137 repeatable reads (transaction isolation), 242 replicas, 152 replication, 151-193, 556 and durability, 227 chain replication, 155 conflict resolution and, 246 consistency properties, 161-167 consistent prefix reads, 165 monotonic reads, 164 reading your own writes, 162 in distributed filesystems, 398 leaderless, 177-191 detecting concurrent writes, 184-191 limitations of quorum consistency, 181-183, 334 sloppy quorums and hinted handoff, 183 monitoring staleness, 182 multi-leader, 168-177 across multiple datacenters, 168, 335 handling write conflicts, 171-175 replication topologies, 175-177 partitioning and, 147, 200 reasons for using, 145, 151 single-leader, 152-161 failover, 157 implementation of replication logs, 158-161 relation to consensus, 367 setting up new followers, 155 synchronous versus asynchronous, 153-155 state machine replication, 349, 452 using erasure coding, 398 with heterogeneous data systems, 453 replication logs (see logs) reprocessing data, 496, 498 (see also evolvability) from log-based messaging, 451 request routing, 214-216 approaches to, 214 parallel query execution, 216 resilient systems, 6 (see also fault tolerance) response time as performance metric for services, 13, 389 
guarantees on, 298 latency versus, 14 mean and percentiles, 14 user experience, 15 responsibility and accountability, 535 REST (Representational State Transfer), 133 (see also services) RethinkDB (database) document data model, 31 dynamic partitioning, 212 join support, 34, 42 key-range partitioning, 202 leader-based replication, 153 subscribing to changes, 456 Riak (database) Bitcask storage engine, 72 CRDTs, 174, 191 dotted version vectors, 191 gossip protocol, 216 hash partitioning, 203-204, 211 last-write-wins conflict resolution, 186 leaderless replication, 177 LevelDB storage engine, 78 linearizability, lack of, 335 multi-datacenter support, 184 preventing lost updates across replicas, 246 rebalancing, 213 search feature, 209 secondary indexes, 207 siblings (concurrently written values), 190 sloppy quorums, 184 ring buffers, 450 Ripple (cryptocurrency), 532 rockets, 10, 36, 305 RocksDB (storage engine), 78 leveled compaction, 79 rollbacks (transactions), 222 rolling upgrades, 8, 112 routing (see request routing) row-oriented storage, 96 row-based replication, 160 rowhammer (memory corruption), 529 RPCs (see remote procedure calls) Rubygems (package manager), 428 rules (Datalog), 61 S safety and liveness properties, 308 in consensus algorithms, 366 in transactions, 222 sagas (see compensating transactions) Samza (stream processor), 466, 467 fault tolerance, 479 streaming SQL support, 466 sandboxes, 9 SAP HANA (database), 93 scalability, 10-18, 489 approaches for coping with load, 17 defined, 22 describing load, 11 describing performance, 13 partitioning and, 199 replication and, 161 scaling up versus scaling out, 146 scaling out, 17, 146 (see also shared-nothing architecture) scaling up, 17, 146 scatter/gather approach, querying partitioned databases, 207 SCD (slowly changing dimension), 476 schema-on-read, 39 comparison to evolvable schema, 128 in distributed filesystems, 415 schema-on-write, 39 schemaless databases (see schema-on-read) schemas, 557 Avro, 
122-127 reader determining writer’s schema, 125 schema evolution, 123 dynamically generated, 126 evolution of, 496 affecting application code, 111 compatibility checking, 126 in databases, 129-131 in message-passing, 138 in service calls, 136 flexibility in document model, 39 for analytics, 93-95 for JSON and XML, 115 merits of, 127 schema migration on railways, 496 Thrift and Protocol Buffers, 117-121 schema evolution, 120 traditional approach to design, fallacy in, 462 searches building search indexes in batch processes, 411 k-nearest neighbors, 429 on streams, 467 partitioned secondary indexes, 206 secondaries (see leader-based replication) secondary indexes, 85, 557 partitioning, 206-209, 217 document-partitioned, 206 index maintenance, 495 term-partitioned, 208 problems with dual writes, 452, 491 updating, transaction isolation and, 231 secondary sorts, 405 sed (Unix tool), 392 self-describing files, 127 self-joins, 480 self-validating systems, 530 semantic web, 57 semi-synchronous replication, 154 sequence number ordering, 343-348 generators, 294, 344 insufficiency for enforcing constraints, 347 Lamport timestamps, 345 use of timestamps, 291, 295, 345 sequential consistency, 351 serializability, 225, 233, 251-266, 557 linearizability versus, 329 pessimistic versus optimistic concurrency control, 261 serial execution, 252-256 partitioning, 255 using stored procedures, 253, 349 serializable snapshot isolation (SSI), 261-266 detecting stale MVCC reads, 263 detecting writes that affect prior reads, 264 distributed execution, 265, 364 performance of SSI, 265 preventing write skew, 262-265 two-phase locking (2PL), 257-261 index-range locks, 260 performance, 258 Serializable (Java), 113 serialization, 113 (see also encoding) service discovery, 135, 214, 372 using DNS, 216, 372 service level agreements (SLAs), 15 service-oriented architecture (SOA), 132 (see also services) services, 131-136 microservices, 132 causal dependencies across services, 493 loose 
coupling, 502 relation to batch/stream processors, 389, 508 remote procedure calls (RPCs), 134-136 issues with, 134 similarity to databases, 132 web services, 132, 135 session windows (stream processing), 472 (see also windows) sessionization, 407 sharding (see partitioning) shared mode (locks), 258 shared-disk architecture, 146, 398 shared-memory architecture, 146 shared-nothing architecture, 17, 146-147, 557 (see also replication) distributed filesystems, 398 (see also distributed filesystems) partitioning, 199 use of network, 277 sharks biting undersea cables, 279 counting (example), 46-48 finding (example), 42 website about (example), 44 shredding (in relational model), 38 siblings (concurrent values), 190, 246 (see also conflicts) similarity search edit distance, 88 genome data, 63 k-nearest neighbors, 429 single-leader replication (see leader-based rep‐ lication) single-threaded execution, 243, 252 in batch processing, 406, 421, 426 in stream processing, 448, 463, 522 size-tiered compaction, 79 skew, 557 584 | Index clock skew, 291-294, 334 in transaction isolation read skew, 238, 266 write skew, 246-251, 262-265 (see also write skew) meanings of, 238 unbalanced workload, 201 compensating for, 205 due to celebrities, 205 for time-series data, 203 in batch processing, 407 slaves (see leader-based replication) sliding windows (stream processing), 472 (see also windows) sloppy quorums, 183 (see also quorums) lack of linearizability, 334 slowly changing dimension (data warehouses), 476 smearing (leap seconds adjustments), 290 snapshots (databases) causal consistency, 340 computing derived data, 500 in change data capture, 455 serializable snapshot isolation (SSI), 261-266, 329 setting up a new replica, 156 snapshot isolation and repeatable read, 237-242 implementing with MVCC, 239 indexes and MVCC, 241 visibility rules, 240 synchronized clocks for global snapshots, 294 snowflake schemas, 95 SOAP, 133 (see also services) evolvability, 136 software bugs, 8 
maintaining integrity, 529 solid state drives (SSDs) access patterns, 84 detecting corruption, 519, 530 faults in, 227 sequential write throughput, 75 Solr (search server) building indexes in batch processes, 411 document-partitioned indexes, 207 request routing, 216 usage example, 4 use of Lucene, 79 sort (Unix tool), 392, 394, 395 sort-merge joins (MapReduce), 405 Sorted String Tables (see SSTables) sorting sort order in column storage, 99 source of truth (see systems of record) Spanner (database) data locality, 41 snapshot isolation using clocks, 295 TrueTime API, 294 Spark (processing framework), 421-423 bytecode generation, 428 dataflow APIs, 427 fault tolerance, 422 for data warehouses, 93 GraphX API (graph processing), 425 machine learning, 428 query optimizer, 427 Spark Streaming, 466 microbatching, 477 stream processing on top of batch process‐ ing, 495 SPARQL (query language), 59 spatial algorithms, 429 split brain, 158, 557 in consensus algorithms, 352, 367 preventing, 322, 333 using fencing tokens to avoid, 302-304 spreadsheets, dataflow programming capabili‐ ties, 504 SQL (Structured Query Language), 21, 28, 43 advantages and limitations of, 416 distributed query execution, 48 graph queries in, 53 isolation levels standard, issues with, 242 query execution on Hadoop, 416 résumé (example), 30 SQL injection vulnerability, 305 SQL on Hadoop, 93 statement-based replication, 158 stored procedures, 255 SQL Server (database) data warehousing support, 93 distributed transaction support, 361 leader-based replication, 153 preventing lost updates, 245 preventing write skew, 248, 257 read committed isolation, 236 recursive query support, 54 serializable isolation, 257 snapshot isolation support, 239 T-SQL language, 255 XML support, 30 SQLstream (stream analytics), 466 SSDs (see solid state drives) SSTables (storage format), 76-79 advantages over hash indexes, 76 concatenated index, 204 constructing and maintaining, 78 making LSM-Tree from, 78 staleness (old data), 
162 cross-channel timing dependencies, 331 in leaderless databases, 178 in multi-version concurrency control, 263 monitoring for, 182 of client state, 512 versus linearizability, 324 versus timeliness, 524 standbys (see leader-based replication) star replication topologies, 175 star schemas, 93-95 similarity to event sourcing, 458 Star Wars analogy (event time versus process‐ ing time), 469 state derived from log of immutable events, 459 deriving current state from the event log, 458 interplay between state changes and appli‐ cation code, 507 maintaining derived state, 495 maintenance by stream processor in streamstream joins, 473 observing derived state, 509-515 rebuilding after stream processor failure, 478 separation of application code and, 505 state machine replication, 349, 452 statement-based replication, 158 statically typed languages analogy to schema-on-write, 40 code generation and, 127 statistical and numerical algorithms, 428 StatsD (metrics aggregator), 442 stdin, stdout, 395, 396 Stellar (cryptocurrency), 532 Index | 585 stock market feeds, 442 STONITH (Shoot The Other Node In The Head), 158 stop-the-world (see garbage collection) storage composing data storage technologies, 499-504 diversity of, in MapReduce, 415 Storage Area Network (SAN), 146, 398 storage engines, 69-104 column-oriented, 95-101 column compression, 97-99 defined, 96 distinction between column families and, 99 Parquet, 96, 131 sort order in, 99-100 writing to, 101 comparing requirements for transaction processing and analytics, 90-96 in-memory storage, 88 durability, 227 row-oriented, 70-90 B-trees, 79-83 comparing B-trees and LSM-trees, 83-85 defined, 96 log-structured, 72-79 stored procedures, 161, 253-255, 557 and total order broadcast, 349 pros and cons of, 255 similarity to stream processors, 505 Storm (stream processor), 466 distributed RPC, 468, 514 Trident state handling, 478 straggler events, 470, 498 stream processing, 464-481, 557 accessing external services within job, 
474, 477, 478, 517 combining with batch processing lambda architecture, 497 unifying technologies, 498 comparison to batch processing, 464 complex event processing (CEP), 465 fault tolerance, 476-479 atomic commit, 477 idempotence, 478 microbatching and checkpointing, 477 rebuilding state after a failure, 478 for data integration, 494-498 586 | Index maintaining derived state, 495 maintenance of materialized views, 467 messaging systems (see messaging systems) reasoning about time, 468-472 event time versus processing time, 469, 477, 498 knowing when window is ready, 470 types of windows, 472 relation to databases (see streams) relation to services, 508 search on streams, 467 single-threaded execution, 448, 463 stream analytics, 466 stream joins, 472-476 stream-stream join, 473 stream-table join, 473 table-table join, 474 time-dependence of, 475 streams, 440-451 end-to-end, pushing events to clients, 512 messaging systems (see messaging systems) processing (see stream processing) relation to databases, 451-464 (see also changelogs) API support for change streams, 456 change data capture, 454-457 derivative of state by time, 460 event sourcing, 457-459 keeping systems in sync, 452-453 philosophy of immutable events, 459-464 topics, 440 strict serializability, 329 strong consistency (see linearizability) strong one-copy serializability, 329 subjects, predicates, and objects (in triplestores), 55 subscribers (message streams), 440 (see also consumers) supercomputers, 275 surveillance, 537 (see also privacy) Swagger (service definition format), 133 swapping to disk (see virtual memory) synchronous networks, 285, 557 comparison to asynchronous networks, 284 formal model, 307 synchronous replication, 154, 557 chain replication, 155 conflict detection, 172 system models, 300, 306-310 assumptions in, 528 correctness of algorithms, 308 mapping to the real world, 309 safety and liveness, 308 systems of record, 386, 557 change data capture, 454, 491 treating event log as, 460 
systems thinking, 536 T t-digest (algorithm), 16 table-table joins, 474 Tableau (data visualization software), 416 tail (Unix tool), 447 tail vertex (property graphs), 51 Tajo (query engine), 93 Tandem NonStop SQL (database), 200 TCP (Transmission Control Protocol), 277 comparison to circuit switching, 285 comparison to UDP, 283 connection failures, 280 flow control, 282, 441 packet checksums, 306, 519, 529 reliability and duplicate suppression, 517 retransmission timeouts, 284 use for transaction sessions, 229 telemetry (see monitoring) Teradata (database), 93, 200 term-partitioned indexes, 208, 217 termination (consensus), 365 Terrapin (database), 413 Tez (dataflow engine), 421-423 fault tolerance, 422 support by higher-level tools, 427 thrashing (out of memory), 297 threads (concurrency) actor model, 138, 468 (see also message-passing) atomic operations, 223 background threads, 73, 85 execution pauses, 286, 296-298 memory barriers, 338 preemption, 298 single (see single-threaded execution) three-phase commit, 359 Thrift (data format), 117-121 BinaryProtocol, 118 CompactProtocol, 119 field tags and schema evolution, 120 throughput, 13, 390 TIBCO, 137 Enterprise Message Service, 444 StreamBase (stream analytics), 466 time concurrency and, 187 cross-channel timing dependencies, 331 in distributed systems, 287-299 (see also clocks) clock synchronization and accuracy, 289 relying on synchronized clocks, 291-295 process pauses, 295-299 reasoning about, in stream processors, 468-472 event time versus processing time, 469, 477, 498 knowing when window is ready, 470 timestamp of events, 471 types of windows, 472 system models for distributed systems, 307 time-dependence in stream joins, 475 time-of-day clocks, 288 timeliness, 524 coordination-avoiding data systems, 528 correctness of dataflow systems, 525 timeouts, 279, 557 dynamic configuration of, 284 for failover, 158 length of, 281 timestamps, 343 assigning to events in stream processing, 471 for read-after-write 
consistency, 163 for transaction ordering, 295 insufficiency for enforcing constraints, 347 key range partitioning by, 203 Lamport, 345 logical, 494 ordering events, 291, 345 Titan (database), 50 tombstones, 74, 191, 456 topics (messaging), 137, 440 total order, 341, 557 limits of, 493 sequence numbers or timestamps, 344 total order broadcast, 348-352, 493, 522 consensus algorithms and, 366-368 Index | 587 implementation in ZooKeeper and etcd, 370 implementing with linearizable storage, 351 using, 349 using to implement linearizable storage, 350 tracking behavioral data, 536 (see also privacy) transaction coordinator (see coordinator) transaction manager (see coordinator) transaction processing, 28, 90-95 comparison to analytics, 91 comparison to data warehousing, 93 transactions, 221-267, 558 ACID properties of, 223 atomicity, 223 consistency, 224 durability, 226 isolation, 225 compensating (see compensating transac‐ tions) concept of, 222 distributed transactions, 352-364 avoiding, 492, 502, 521-528 failure amplification, 364, 495 in doubt/uncertain status, 358, 362 two-phase commit, 354-359 use of, 360-361 XA transactions, 361-364 OLTP versus analytics queries, 411 purpose of, 222 serializability, 251-266 actual serial execution, 252-256 pessimistic versus optimistic concur‐ rency control, 261 serializable snapshot isolation (SSI), 261-266 two-phase locking (2PL), 257-261 single-object and multi-object, 228-232 handling errors and aborts, 231 need for multi-object transactions, 231 single-object writes, 230 snapshot isolation (see snapshots) weak isolation levels, 233-251 preventing lost updates, 242-246 read committed, 234-238 transitive closure (graph algorithm), 424 trie (data structure), 88 triggers (databases), 161, 441 implementing change data capture, 455 implementing replication, 161 588 | Index triple-stores, 55-59 SPARQL query language, 59 tumbling windows (stream processing), 472 (see also windows) in microbatching, 477 tuple spaces (programming 
model), 507 Turtle (RDF data format), 56 Twitter constructing home timelines (example), 11, 462, 474, 511 DistributedLog (event log), 448 Finagle (RPC framework), 135 Snowflake (sequence number generator), 294 Summingbird (processing library), 497 two-phase commit (2PC), 353, 355-359, 558 confusion with two-phase locking, 356 coordinator failure, 358 coordinator recovery, 363 how it works, 357 issues in practice, 363 performance cost, 360 transactions holding locks, 362 two-phase locking (2PL), 257-261, 329, 558 confusion with two-phase commit, 356 index-range locks, 260 performance of, 258 type checking, dynamic versus static, 40 U UDP (User Datagram Protocol) comparison to TCP, 283 multicast, 442 unbounded datasets, 439, 558 (see also streams) unbounded delays, 558 in networks, 282 process pauses, 296 unbundling databases, 499-515 composing data storage technologies, 499-504 federation versus unbundling, 501 need for high-level language, 503 designing applications around dataflow, 504-509 observing derived state, 509-515 materialized views and caching, 510 multi-partition data processing, 514 pushing state changes to clients, 512 uncertain (transaction status) (see in doubt) uniform consensus, 365 (see also consensus) uniform interfaces, 395 union type (in Avro), 125 uniq (Unix tool), 392 uniqueness constraints asynchronously checked, 526 requiring consensus, 521 requiring linearizability, 330 uniqueness in log-based messaging, 522 Unix philosophy, 394-397 command-line batch processing, 391-394 Unix pipes versus dataflow engines, 423 comparison to Hadoop, 413-414 comparison to relational databases, 499, 501 comparison to stream processing, 464 composability and uniform interfaces, 395 loose coupling, 396 pipes, 394 relation to Hadoop, 499 UPDATE statement (SQL), 40 updates preventing lost updates, 242-246 atomic write operations, 243 automatically detecting lost updates, 245 compare-and-set operations, 245 conflict resolution and replication, 246 using explicit 
locking, 244 preventing write skew, 246-251 V validity (consensus), 365 vBuckets (partitioning), 199 vector clocks, 191 (see also version vectors) vectorized processing, 99, 428 verification, 528-533 avoiding blind trust, 530 culture of, 530 designing for auditability, 531 end-to-end integrity checks, 531 tools for auditable data systems, 532 version control systems, reliance on immutable data, 463 version vectors, 177, 191 capturing causal dependencies, 343 versus vector clocks, 191 Vertica (database), 93 handling writes, 101 replicas using different sort orders, 100 vertical scaling (see scaling up) vertices (in graphs), 49 property graph model, 50 Viewstamped Replication (consensus algo‐ rithm), 366 view number, 368 virtual machines, 146 (see also cloud computing) context switches, 297 network performance, 282 noisy neighbors, 284 reliability in cloud services, 8 virtualized clocks in, 290 virtual memory process pauses due to page faults, 14, 297 versus memory management by databases, 89 VisiCalc (spreadsheets), 504 vnodes (partitioning), 199 Voice over IP (VoIP), 283 Voldemort (database) building read-only stores in batch processes, 413 hash partitioning, 203-204, 211 leaderless replication, 177 multi-datacenter support, 184 rebalancing, 213 reliance on read repair, 179 sloppy quorums, 184 VoltDB (database) cross-partition serializability, 256 deterministic stored procedures, 255 in-memory storage, 89 output streams, 456 secondary indexes, 207 serial execution of transactions, 253 statement-based replication, 159, 479 transactions in stream processing, 477 W WAL (write-ahead log), 82 web services (see services) Web Services Description Language (WSDL), 133 webhooks, 443 webMethods (messaging), 137 WebSocket (protocol), 512 Index | 589 windows (stream processing), 466, 468-472 infinite windows for changelogs, 467, 474 knowing when all events have arrived, 470 stream joins within a window, 473 types of windows, 472 winners (conflict resolution), 173 WITH 
RECURSIVE syntax (SQL), 54 workflows (MapReduce), 402 outputs, 411-414 key-value stores, 412 search indexes, 411 with map-side joins, 410 working set, 393 write amplification, 84 write path (derived data), 509 write skew (transaction isolation), 246-251 characterizing, 246-251, 262 examples of, 247, 249 materializing conflicts, 251 occurrence in practice, 529 phantoms, 250 preventing in snapshot isolation, 262-265 in two-phase locking, 259-261 options for, 248 write-ahead log (WAL), 82, 159 writes (database) atomic write operations, 243 detecting writes affecting prior reads, 264 preventing dirty writes with read commit‐ ted, 235 WS-* framework, 133 (see also services) WS-AtomicTransaction (2PC), 355 590 | Index X XA transactions, 355, 361-364 heuristic decisions, 363 limitations of, 363 xargs (Unix tool), 392, 396 XML binary variants, 115 encoding RDF data, 57 for application data, issues with, 114 in relational databases, 30, 41 XSL/XPath, 45 Y Yahoo!

pages: 1,758 words: 342,766

Code Complete (Developer Best Practices)
by Steve McConnell
Published 8 Jun 2004

Iterate, Repeatedly, Again and Again
Iteration is appropriate for many software-development activities. During your initial specification of a system, you work with the user through several versions of requirements until you're sure you agree on them. That's an iterative process. When you build flexibility into your process by building and delivering a system in several increments, that's an iterative process. If you use prototyping to develop several alternative solutions quickly and cheaply before crafting the final product, that's another form of iteration. Iterating on requirements is perhaps as important as any other aspect of the software-development process.

As shown in Figure 5-9, a class is a lot like an iceberg: seven-eighths is under water, and you can see only the one-eighth that's above the surface.
Figure 5-9. A good class interface is like the tip of an iceberg, leaving most of the class unexposed.
Designing the class interface is an iterative process just like any other aspect of design. If you don't get the interface right the first time, try a few more times until it stabilizes. If it doesn't stabilize, you need to try a different approach.
An Example of Information Hiding
Suppose you have a program in which each object is supposed to have a unique ID stored in a member variable called id.

Iterate
You might have had an experience in which you learned so much from writing a program that you wished you could write it again, armed with the insights you gained from writing it the first time. The same phenomenon applies to design, but the design cycles are shorter and the effects downstream are bigger, so you can afford to whirl through the design loop a few times. Design is an iterative process. You don't usually go from point A only to point B; you go from point A to point B and back to point A. As you cycle through candidate designs and try different approaches, you'll look at both high-level and low-level views. The big picture you get from working with high-level issues will help you to put the low-level details in perspective.

pages: 372 words: 101,174

How to Create a Mind: The Secret of Human Thought Revealed
by Ray Kurzweil
Published 13 Nov 2012

(For an algorithmic description of genetic algorithms, see this endnote.) The key to a genetic algorithm is that the human designers don’t directly program a solution; rather, we let one emerge through an iterative process of simulated competition and improvement. Biological evolution is smart but slow, so to enhance its intelligence we greatly speed up its ponderous pace. The computer is fast enough to simulate many generations in a matter of hours or days, and we’ve occasionally had them run for as long as weeks to simulate hundreds of thousands of generations. But we have to go through this iterative process only once; as soon as we have let this simulated evolution run its course, we can apply the evolved and highly refined rules to real problems in a rapid fashion.
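The loop of "simulated competition and improvement" described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure from the book's endnote: the bit-string genome, the toy `fitness` function (count the 1 bits), tournament selection, one-point crossover, the mutation rate, and keeping the best genome unchanged each generation are all choices made for this example.

```python
import random


def fitness(genome):
    """Toy fitness (an assumption for this sketch): count the 1 bits."""
    return sum(genome)


def evolve(genome_length=20, population_size=30, generations=60,
           mutation_rate=0.02, seed=0):
    """Run a small genetic algorithm; return the best genome and the
    best fitness seen at the start of each generation (plus the final one)."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(genome_length)]
                  for _ in range(population_size)]
    history = []

    def tournament():
        # Competition: two random genomes meet, the fitter one reproduces.
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        champion = max(population, key=fitness)
        history.append(fitness(champion))
        # Carry the current best over unchanged so fitness never regresses.
        next_generation = [champion]
        while len(next_generation) < population_size:
            mother, father = tournament(), tournament()
            cut = rng.randrange(1, genome_length)        # one-point crossover
            child = mother[:cut] + father[cut:]
            # Mutation: flip each bit with a small probability.
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            next_generation.append(child)
        population = next_generation

    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history


if __name__ == "__main__":
    best, history = evolve()
    print("best fitness per generation:", history)
```

Nothing in the loop encodes the solution itself; the only problem-specific part is `fitness`, which matches the excerpt's point that the designers let a solution emerge instead of programming it directly.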

After processing the 1,025th vector, one of those clusters now has more than one point. We keep processing points in this way, always maintaining 1,024 clusters. After we have processed all the points, we represent each multipoint cluster by the geometric center of the points in that cluster. We continue this iterative process until we have run through all the sample points. Typically we would process millions of points into 1,024 (2^10) clusters; we’ve also used 2,048 (2^11) or 4,096 (2^12) clusters. Each cluster is represented by one vector that is at the geometric center of all the points in that cluster. Thus the total of the distances of all the points in the cluster to the center point of the cluster is as small as possible.
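A minimal sketch of this kind of single-pass clustering with a fixed number of clusters follows. The excerpt does not say how a new vector is absorbed once all clusters exist, so the sketch assumes the simplest rule: the point joins the cluster whose current center is nearest, and that center is updated as a running mean. The function names `nearest` and `cluster_stream` are hypothetical.

```python
def nearest(centers, point):
    """Index of the center closest to point (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda i: sum((c - x) ** 2 for c, x in zip(centers[i], point)),
    )


def cluster_stream(points, k):
    """Process points one at a time while never holding more than k clusters.

    Assumption for this sketch: the first k points each start a one-point
    cluster; every later point joins its nearest cluster.
    """
    centers = []   # geometric center of each cluster
    counts = []    # number of points absorbed by each cluster
    for point in points:
        if len(centers) < k:
            centers.append(tuple(point))
            counts.append(1)
            continue
        i = nearest(centers, point)
        counts[i] += 1
        n = counts[i]
        # Running mean: the center stays the exact average of its members.
        centers[i] = tuple(c + (x - c) / n for c, x in zip(centers[i], point))
    return centers, counts


if __name__ == "__main__":
    samples = [(0.0, 0.0), (10.0, 10.0), (0.0, 2.0), (10.0, 12.0)]
    print(cluster_stream(samples, k=2))
```

With `k=1024` and millions of sample vectors, the same loop would leave 1,024 center vectors, one per cluster, which is the representation the excerpt describes.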

pages: 193 words: 98,671

The Inmates Are Running the Asylum
by Alan Cooper
Published 24 Feb 2004

I'll talk a lot about goals in the next chapter, but we discover them in the same way we discover personas. We determine the relevant personas and their goals in a process of successive refinement during our initial investigation of the problem domain. Typically, we start with a reasonable approximation and quickly converge on a believable population of personas. Although this iterative process is similar to the iterative process used by software engineers during the implementation process, it is significantly different in one major respect. Iterating the design and its premises is quick and easy because we are working in paper and words. Iterating the implementation is slow and difficult because it requires code.

They then listen to the complaints and feedback, measure the patterns of the user's navigation clicks, change the weak parts, and then ship it again. Generally, programmers aren't thrilled about the iterative method because it means extra work for them. Typically, it's managers new to technology who like the iterative process because it relieves them of having to perform rigorous planning, thinking, and product due diligence (in other words, interaction design). Of course, it's the users who pay the dearest price. They have to suffer through one halfhearted attempt after another before they get a program that isn't too painful.

Mastering Machine Learning With Scikit-Learn
by Gavin Hackeling
Published 31 Oct 2014

In marketing, clustering is used to find segments of similar consumers. In the following sections, we will work through an example of using the K-Means algorithm to cluster a dataset.
Clustering with the K-Means algorithm
The K-Means algorithm is a clustering method that is popular because of its speed and scalability. K-Means is an iterative process of moving the centers of the clusters, or the centroids, to the mean position of their constituent points, and re-assigning instances to their closest clusters. The titular K is a hyperparameter that specifies the number of clusters that should be created; K-Means automatically assigns observations to clusters but cannot determine the appropriate number of clusters.

Each cluster's distortion is equal to the sum of the squared distances between its centroid and its constituent instances. The distortion is small for compact clusters and large for clusters that contain scattered instances. The parameters that minimize the cost function are learned through an iterative process of assigning observations to clusters and then moving the clusters. First, the clusters' centroids are initialized to random positions. In practice, setting the centroids' positions equal to the positions of randomly selected observations yields the best results. During each iteration, K-Means assigns observations to the cluster that they are closest to, and then moves the centroids to their assigned observations' mean location.
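The assign-then-move loop and the distortion cost described above can be sketched in plain Python. This is an illustrative sketch, not the book's code and not scikit-learn's implementation: the function names, the `max_iter` cap, the fixed random seed, and the choice to leave an empty cluster's centroid where it was are assumptions made for this example.

```python
import random


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def assign(points, centroids):
    """For each point, the index of its closest centroid."""
    return [
        min(range(len(centroids)),
            key=lambda j: squared_distance(p, centroids[j]))
        for p in points
    ]


def distortion(points, centroids, labels):
    """Cost function: sum of squared distances to the assigned centroid."""
    return sum(squared_distance(p, centroids[j])
               for p, j in zip(points, labels))


def k_means(points, k, max_iter=100, seed=0):
    rng = random.Random(seed)
    # Initialise centroids at k randomly selected observations.
    centroids = rng.sample(points, k)
    labels = assign(points, centroids)
    for _ in range(max_iter):
        # Move each centroid to the mean of the points assigned to it.
        new_centroids = []
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            # Assumption: an empty cluster keeps its previous centroid.
            new_centroids.append(mean_point(members) if members
                                 else centroids[j])
        centroids = new_centroids
        # Re-assign every point to its closest centroid.
        new_labels = assign(points, centroids)
        if new_labels == labels:   # assignments stable: converged
            break
        labels = new_labels
    return centroids, labels


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (100.0, 100.0), (100.0, 101.0), (101.0, 100.0)]
    centroids, labels = k_means(data, k=2)
    print(labels, distortion(data, centroids, labels))
```

Each pass can only lower or keep the distortion, because both steps (re-assigning points and moving centroids to the mean) are locally optimal for that cost; this is why the loop can stop as soon as the assignments no longer change.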

pages: 285 words: 58,517

The Network Imperative: How to Survive and Grow in the Age of Digital Business Models
by Barry Libert and Megan Beck
Published 6 Jun 2016

This is often the same group of leaders who undertook pinpointing your business model. Designing a new business, with a new business model, is a daunting undertaking, so we want to reassure you that you won’t get it right the first time. Don’t put that expectation on yourself or your team. Instead, we recommend you use an iterative process: the team creates a draft, shows it to experts in the organization for feedback, and then revisits and revises the draft. The idea is to go through the Visualize step quickly and then repeat with more care and more insight. Giving people a week to digest the new ideas between iterations will greatly increase the effectiveness of your next meeting.

Iteration is a necessary part of the process.
Analyzing Your Contribution
After selecting your top network opportunities, you consider the complementary piece: the value your firm can return to the network. You probably won’t be able to finalize your thinking on network self-service before you move on. Again, this is an iterative process. Now that you’ve begun to home in on some key networks and identify what they can provide for themselves, it’s time to start thinking about how your company fits into the picture. For each of your top networks, answer the following questions with regard to the needs you believe they could self-serve.

pages: 462 words: 172,671

Clean Code: A Handbook of Agile Software Craftsmanship
by Robert C. Martin
Published 1 Jan 2007

If you look carefully, you will notice that I reversed several of the decisions I made earlier in this chapter. For example, I inlined some extracted methods back into formatCompactedComparison, and I changed the sense of the shouldNotBeCompacted expression. This is typical. Often one refactoring leads to another that leads to the undoing of the first. Refactoring is an iterative process full of trial and error, inevitably converging on something that we feel is worthy of a professional.
Conclusion
And so we have satisfied the Boy Scout Rule. We have left this module a bit cleaner than we found it. Not that it wasn’t clean already. The authors had done an excellent job with it.


pages: 312 words: 35,664

The Mathematics of Banking and Finance
by Dennis W. Cox and Michael A. A. Cox
Published 30 Apr 2006

The current cost can now be calculated as:
142.86 cost(x1) + 285.71 cost(x2) = 142.86 × 1 + 285.71 × 1 = 428.57
There are now no remaining negative opportunity costs, so the solution has variable 1 as 142.86 and variable 2 as 285.71, with the remaining variables no longer being used since they are now zero.
17.4 THE CONCERNS WITH THE APPROACH
In practice when you are inputting data into a system and then using some iterative process to try to find a better estimate of what is the best strategy, you are actually conducting the process laid out in this chapter – it is just that the actual work is normally embedded within a computer program. However, where possible there are real merits in carrying out the analysis in a manual form, not the least of which is the relative complexity of the software solutions currently available.

. ; options design/approach to analysis, data 129–47 dice-rolling examples, probability theory 21–3, 53–5 differentiation 251 discount factors adjusted discount rates 228–9 net present value (NPV) 220–1, 228–9, 231–2 discrete data bar charts 7–12, 13 concepts 7–12, 13, 44–5, 53–5, 72 discrete uniform distribution, concepts 53–5 displays see also presentational approaches data 1–5 Disraeli, Benjamin 1 division notation 280, 282 dynamic programming complex examples 184–7 concepts 179–87 costs 180–82 examples 180–87 principle of optimality 179–87 returns 179–80 schematic 179–80 ‘travelling salesman’ problem 185–7 e-mail surveys 50–1 economic order quantity see also stock control concepts 195–201 examples 196–9 empowerment, staff 189–90 error sum of the squares (SSE), concepts 122–5, 133–47 errors, data analysis 129–47 estimates mean 76–81 probability theory 22, 25–6, 31–5, 75–81 Euler, L. 131 288 Index events independent events 22–4, 35, 58, 60, 92–5 mutually exclusive events 22–4, 58 probability theory 21–35, 58–66, 92–5 scenario analysis 40, 193–4, 271–4 tree diagrams 30–5 Excel 68, 206–7 exclusive events see mutually exclusive events expected errors, sensitivity analysis 268–9 expected value, net present value (NPV) 231–2 expert systems 275 exponent notation 282–4 exponential distribution, concepts 65–6, 209–10, 252–5 external fraud 272–4 extrapolation 119 extreme value distributions, VaR 262–4 F distribution ANOVA (analysis of variance) 110–20, 127, 134–7 concepts 85–9, 110–20, 127, 134–7 examples 85–9, 110–20, 127, 137 tables 85–8 f notation 8–9, 13–20, 26, 38–9, 44–5, 65–6, 85 factorial notation 53–5, 283–4 failure probabilities see also reliability replacement of assets 215–18, 249–60 feasibility polygons 152–7, 163–4 finance selection, linear programming 164–6 fire extinguishers, ANOVA (analysis of variance) 123–7 focus groups 51 forward recursion 179–87 four by four tables 94–5 fraud 272–4, 276 Fréchet distribution 262 frequency concepts 8–9, 13–20, 37–45 
cumulative frequency polygons 13–20, 39–40, 203 graphical presentational approaches 8–9, 13–20 frequentist approach, probability theory 22, 25–6 future cash flows 219–25, 227–34, 240–1 fuzzy logic 276 Garbage In, Garbage Out (GIGO) 261–2 general rules, linear programming 167–70 genetic algorithms 276 ghost costs, transport problems 172–7 goodness of fit test, chi-squared test 91–5 gradient (a notation), linear regression 103–4, 107–20 graphical method, linear programming 149–57, 163–4 graphical presentational approaches concepts 1–20, 149–57, 235–47 rules 8–9 greater-than notation 280–4 Greek alphabet 283 guesswork, modelling 191 histograms 2, 7, 13–20, 41, 73 class intervals 13–20, 44–5 comparative histograms 14–19 concepts 7, 13–20, 41, 73 continuous data 7, 13–14 examples 13–20, 73 skewness 41 uses 7, 13–20 holding costs 182–5, 197–201, 204–8 home insurance 10–12 Hopfield 275 horizontal axis bar charts 8–9 histograms 14–20 linear regression 103–4, 107–20 scatter plots 2–5, 103 hypothesis testing concepts 77–81, 85–95, 110–27 examples 78–80, 85 type I and type II errors 80–1 i notation 8–9, 13–20, 28–30, 37–8, 103–20 identification data 2–5, 261–5 trends 241–7 identity rule 282 impact assessments 21, 271–4 independent events, probability theory 22–4, 35, 58, 60, 92–5 independent variables, concepts 2–5, 70, 103–20, 235 infinity, normal distribution 67–72 information, quality needs 190–4 initial solution, linear programming 167–70 insurance industry 10–12, 29–30 integers 280–4 integration 65–6, 251 intercept (b notation), linear regression 103–4, 107–20 interest rates base rates 240 daily movements 40, 261 project evaluation 219–25, 228–9 internal rate of return (IRR) concepts 220–2, 223–5 examples 220–2 interpolation, IRR 221–2 interviews, uses 48, 51–2 inventory control see stock control Index investment strategies 149–57, 164–6, 262–5 IRR see internal rate of return iterative processes, linear programming 170 j notation 28–30, 37, 104–20, 121–2 JP Morgan 263 k 
notation 20, 121–7 ‘know your customer’ 272 Kohonen self-organising maps 275 Latin squares concepts 131–2, 143–7 examples 143–7 lead times, stock control 195–201 learning strategies, neural networks 275–6 less-than notation 281–4 lethargy pitfalls, decisions 189 likelihood considerations, scenario analysis 272–3 linear programming additional variables 167–70 concepts 149–70 concerns 170 constraining equations 159–70 costs 167–70, 171–7 critique 170 examples 149–57, 159–70 finance selection 164–6 general rules 167–70 graphical method 149–57, 163–4 initial solution 167–70 iterative processes 170 manual preparation 170 most profitable loans 159–66 optimal advertising allocation 154–7 optimal investment strategies 149–57, 164–6 returns 149–57, 164–6 simplex method 159–70, 171–2 standardisation 167–70 time constraints 167–70 transport problems 171–7 linear regression analysis 110–20 ANOVA (analysis of variance) 110–20 concepts 3, 103–20 equation 103–4 examples 107–20 gradient (a notation) 103–4, 107–20 intercept (b notation) 103–4, 107–20 interpretation 110–20 notation 103–4 residual sum of the squares 109–20 slope significance test 112–20 uncertainties 108–20 literature searches, surveys 48 289 loans finance selection 164–6 linear programming 159–66 risk assessments 159–60 log-normal distribution, concepts 257–8 logarithms (logs), types 20, 61 losses, banks 267–9, 271–4 lotteries 22 lower/upper quartiles, concepts 39–41 m notation 55–8 mail surveys 48, 50–1 management information, graphical presentational approaches 1–20 Mann–Whitney test see U test manual preparation, linear programming 170 margin of error, project evaluation 229–30 market prices, VaR 264–5 marketing brochures 184–7 mathematics 1, 7–8, 196–9, 219–20, 222–5, 234, 240–1, 251, 279–84 matrix plots, concepts 2, 4–5 matrix-based approach, transport problems 171–7 maximum and minimum, concepts 37–9, 40, 254–5 mean comparison of two sample means 79–81 comparisons 75–81 concepts 37–45, 59–60, 65–6, 67–74, 
75–81, 97–8, 100–2, 104–27, 134–5 confidence intervals 71, 75–81, 105, 109, 116–20, 190, 262–5 continuous data 44–5, 65–6 estimates 76–81 hypothesis testing 77–81 linear regression 104–20 normal distribution 67–74, 75–81, 97–8 sampling 75–81 mean square causes (MSC), concepts 122–7, 134–47 mean square errors (MSE), ANOVA (analysis of variance) 110–20, 121–7, 134–7 median, concepts 37, 38–42, 83, 98–9 mid-points class intervals 44–5, 241–7 moving averages 241–7 minimax regret rule, concepts 192–4 minimum and maximum, concepts 37–9, 40 mode, concepts 37, 39, 41 modelling banks 75–81, 85, 97, 267–9, 271–4 concepts 75–81, 83, 91–2, 189–90, 195–201, 215–18, 261–5 decision-making pitfalls 189–91 economic order quantity 195–201 290 Index modelling (cont.) guesswork 191 neural networks 275–7 operational risk 75, 262–5, 267–9, 271–4 output reviews 191–2 replacement of assets 215–18, 249–60 VaR 261–5 moments, density functions 65–6, 83–4 money laundering 272–4 Monte Carlo simulation bank cashier problem 209–12 concepts 203–14, 234 examples 203–8 Monty Hall problem 212–13 queuing problems 208–10 random numbers 207–8 stock control 203–8 uses 203, 234 Monty Hall problem 34–5, 212–13 moving averages concepts 241–7 even numbers/observations 244–5 moving totals 245–7 MQMQM plot, concepts 40 MSC see mean square causes MSE see mean square errors multi-way tables, concepts 94–5 multiplication notation 279–80, 282 multiplication rule, probability theory 26–7 multistage sampling 50 mutually exclusive events, probability theory 22–4, 58 n notation 7, 20, 28–30, 37–45, 54–8, 103–20, 121–7, 132–47, 232–4 n!

pages: 338 words: 106,936

The Physics of Wall Street: A Brief History of Predicting the Unpredictable
by James Owen Weatherall
Published 2 Jan 2013

To fully understand markets, and to model them as safely as possible, these facts must be accounted for. And Mandelbrot is singularly responsible for discovering the shortcomings of the Bachelier-Osborne approach, and for developing the mathematics necessary to study them. Getting the details right may be an ongoing project — indeed, we should never expect to finish the iterative process of improving our mathematical models — but there is no doubt that Mandelbrot took a crucially important step forward. After a decade of interest in the statistics of markets, Mandelbrot gave up on his crusade to replace normal distributions with other Lévy-stable distributions. By this time, his ideas on randomness and disorder had begun to find applications in a wide variety of other fields, from cosmology to meteorology.

A sledgehammer may be great for laying train rails, but you need to recognize that it won’t be very good for hammering in finishing nails on a picture frame. I believe the history that I have recounted in this book supports the closely related claims that models in finance are best thought of as tools for certain kinds of purposes, and also that these tools make sense only in the context of an iterative process of developing models and then figuring out when, why, and how they fail — so that the next generation of models are robust in ways that the older models were not. From this perspective, Bachelier represents a first volley, the initial attempt to apply new ideas from statistical physics to an entirely different set of problems.

As sociologist Donald MacKenzie has observed, financial models are as much the engine behind markets as they are a camera capable of describing them. This means that the markets financial models are trying to capture are a moving target. Far from undermining the usefulness of models in understanding markets, the fact that markets are constantly evolving only makes the iterative process I have emphasized more important. Suppose that Sornette’s model of market crashes is perfect for current markets. Even then, we have to remain ever vigilant. What would happen if investors around the world started using his methods to predict crashes? Would this prevent crashes from occurring?

pages: 396 words: 112,748

Chaos: Making a New Science
by James Gleick
Published 18 Oct 2011

The Mandelbrot set became a kind of public emblem for chaos, appearing on the glossy covers of conference brochures and engineering quarterlies, forming the centerpiece of an exhibit of computer art that traveled internationally in 1985 and 1986. Its beauty was easy to feel from these pictures; harder to grasp was the meaning it had for the mathematicians who slowly understood it. Many fractal shapes can be formed by iterated processes in the complex plane, but there is just one Mandelbrot set. It started appearing, vague and spectral, when Mandelbrot tried to find a way of generalizing about a class of shapes known as Julia sets. These were invented and studied during World War I by the French mathematicians Gaston Julia and Pierre Fatou, laboring without the pictures that a computer could provide.

It was impossible to tell. For a one-dimensional process, no one need actually resort to experimental trial. It is easy enough to establish that numbers greater than one lead to infinity and the rest do not. But in the two dimensions of the complex plane, to deduce a shape defined by an iterated process, knowing the equation is generally not enough. Unlike the traditional shapes of geometry, circles and ellipses and parabolas, the Mandelbrot set allows no shortcuts. The only way to see what kind of shape goes with a particular equation is by trial and error, and the trial-and-error style brought the explorers of this new terrain closer in spirit to Magellan than to Euclid.
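The trial-and-error test described above can be sketched in a few lines. The excerpt does not state the equation, so this sketch assumes the standard Mandelbrot map z → z² + c starting from z = 0. The function names, the iteration cap, and the bailout radius of 2 are illustrative choices, not anything taken from the book.

```python
def escapes(c: complex, max_iter: int = 100, bailout: float = 2.0) -> bool:
    """Iterate z -> z*z + c from z = 0; report whether |z| ever exceeds bailout.

    Assumption: this is the usual escape-time test. A point that has not
    escaped after max_iter steps is only *presumed* to stay bounded.
    """
    z = 0j
    for _ in range(max_iter):
        z = z * z + c
        if abs(z) > bailout:
            return True
    return False


def render(width: int = 60, height: int = 24, max_iter: int = 50) -> str:
    """Coarse ASCII picture: '#' marks points that did not escape."""
    rows = []
    for row in range(height):
        y = 1.2 - 2.4 * row / (height - 1)        # imaginary part, top to bottom
        line = ""
        for col in range(width):
            x = -2.1 + 2.8 * col / (width - 1)    # real part, left to right
            line += " " if escapes(complex(x, y), max_iter) else "#"
        rows.append(line)
    return "\n".join(rows)


if __name__ == "__main__":
    print(render())
```

Each point is decided only by running the iteration and watching what happens, which is the sense in which the set "allows no shortcuts". Raising `max_iter` changes the verdict for points near the boundary, so the picture is always an approximation.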

In Germany they built huge apartment blocks in the Bauhaus style and people move out, they don’t like to live there. There are very deep reasons, it seems to me, in society right now to dislike some aspects of our conception of nature.” Peitgen had been helping a visitor select blowups of regions of the Mandelbrot set, Julia sets, and other complex iterative processes, all exquisitely colored. In his small California office he offered slides, large transparencies, even a Mandelbrot set calendar. “The deep enthusiasm we have has to do with this different perspective of looking at nature. What is the true aspect of the natural object? The tree, let’s say—what is important?

pages: 205 words: 20,452

Data Mining in Time Series Databases
by Mark Last , Abraham Kandel and Horst Bunke
Published 24 Jun 2004

The time complexity becomes O(N nmpP), m << n, implying a substantial speedup.
5.2.3. Perturbation-Based Iterative Refinement
The set median represents an approximation of the generalized median string. The greedy algorithms and the genetic search techniques also give approximate solutions. An approximate solution p̄ can be further improved by an iterative process of systematic perturbations. This idea was first suggested in [20], but no algorithmic details are specified there. A concrete algorithm for realizing systematic perturbations is given in [26]. For each position i, the following operations are performed:
(i) Build perturbations
• Substitution: Replace the i-th symbol of p̄ by each symbol of Σ in turn and choose the resulting string x with the smallest consensus error relative to S.
• Insertion: Insert each symbol of Σ in turn at the i-th position of p̄ and choose the resulting string y with the smallest consensus error relative to S.
• Deletion: Delete the i-th symbol of p̄ to generate z.
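The perturbation step above can be sketched as follows. The excerpt breaks off after the deletion rule, so several things here are assumptions rather than the algorithm of [26]: the distance is taken to be plain Levenshtein edit distance, the best of the candidates x, y, z replaces p̄ only if it strictly lowers the consensus error, and sweeps repeat until one makes no improvement. All function names are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance with unit costs (assumed; the excerpt names no metric)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]


def consensus_error(p: str, samples: list[str]) -> int:
    """Sum (not average) of the distances from p to every input string."""
    return sum(edit_distance(p, s) for s in samples)


def refine(p: str, samples: list[str], alphabet: str) -> tuple[str, int]:
    """Repeat position-by-position perturbation sweeps until none improves p."""
    best, best_err = p, consensus_error(p, samples)
    improved = True
    while improved:
        improved = False
        i = 0
        while i <= len(best):
            candidates = [best[:i] + a + best[i:] for a in alphabet]       # insertion
            if i < len(best):
                candidates += [best[:i] + a + best[i + 1:] for a in alphabet]  # substitution
                candidates.append(best[:i] + best[i + 1:])                 # deletion
            if candidates:
                cand = min(candidates, key=lambda s: consensus_error(s, samples))
                err = consensus_error(cand, samples)
                if err < best_err:   # accept only strict improvements (assumption)
                    best, best_err = cand, err
                    improved = True
            i += 1
    return best, best_err


if __name__ == "__main__":
    print(refine("xbc", ["abc", "abc", "abd"], "abcd"))
```

Because a change is accepted only when it strictly reduces a non-negative integer error, the loop is guaranteed to stop. It can still stop at a local optimum, which fits the text's description of the result as a further improved approximation rather than an exact median.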

Note that the consensus errors of digit 6 are substantially larger than those of the other digits because of the definition of consensus error as the sum, but not the average, of the distances to all input samples. The best results are achieved by GA, followed by the dynamic approach. Except for digit 1, the greedy algorithm reveals some weakness. Looking at the median for digits 2, 3 and 6 it seems that the iterative process terminates too early, resulting in a string (digit) much shorter than it should be. The reason lies in the simple termination criterion defined in [4]. It works well for the (short) words used there, but obviously encounters difficulties in dealing with longer strings occurring in our study. At first glance, the dynamic approach needs more computation time than the greedy algorithm.

pages: 260 words: 67,823

Always Day One: How the Tech Titans Plan to Stay on Top Forever
by Alex Kantrowitz
Published 6 Apr 2020

As a teenager, Sayman was invaluable. He could help Zuckerberg learn Snapchat’s culture. “He would point us to, ‘Here’s the media that I follow,’ or ‘Here are the people I think are influential, who are cool,’” Zuckerberg said. “I’d go follow those people, or talk to them, have them come in. That ends up being the iterative process of learning what matters.” Zuckerberg said he followed these tastemakers on Instagram, and confirmed he’s a Snapchat user too. “I try to use all the stuff,” he told me. “If you want to learn, there are so many lessons out there where people will tell you about things you’re not doing as well as you could.

After assigning groups to work on the design of the car with and without a steering wheel, design removed the wheel completely, creating further technical challenges for the team now tasked with building for full autonomy. “Some design team said, ‘We’ll just take it off,’” the ex–Apple engineer said of the steering wheel. “The design team would say, ‘Oh yeah, we could get a car without a steering wheel in four or five years.’ In reality it doesn’t work that way. The lack of a fully supported iterative process is hurting Apple in terms of the new initiatives.” A second ex-Apple engineer who worked on Project Titan also marveled at design’s influence. “On top of the engineering challenges, you add these design challenges that make it almost impossible,” he said. “Engineers obviously don’t have much say on the design.

Bulletproof Problem Solving
by Charles Conn and Robert McLean
Published 6 Mar 2019

It sets out a simple but rigorous approach to defining problems, disaggregating them into manageable pieces, focusing good analytic tools on the most important parts, and then synthesizing findings to tell a powerful story. While the process has a beginning and end, we encourage you to think of problem solving as an iterative process rather than a linear one. At each stage we improve our understanding of the problem and use those greater insights to refine our early answers. In this chapter we outline the overall Bulletproof Problem Solving Process, introducing you to the seven steps that later chapters will address in more detail.

This is why the problem is rarely solved at the point of your Eureka moment. You'll have to be persuasive to convince powerful stakeholders to follow your plans. Remember, humans are visual learners and love storytelling above all else. Treat the seven‐steps process like an accordion. We have frequently referred to the seven steps as an iterative process. We have also sought to emphasize that you can compress or expand steps depending on the issue, so in some ways it is also like an accordion. Use one‐day answers to motivate the team and decision maker to go to the level of analysis that the problem requires. Don't be intimidated by any problem you face.

pages: 232 words: 71,237

Kill It With Fire: Manage Aging Computer Systems
by Marianne Bellotti
Published 17 Mar 2021

Follow-ups People often say things we don’t expect in design exercises, requiring us to divert from the structure we’ve set out for a moment to understand this new piece of information. Follow-up questions or activities are used to go deeper on individual issues as they appear. Aggregation At some point—maybe after a single exercise or after a series of interviews—we need to look at all the data and draw a conclusion. Just like engineering, design is often an iterative process. The conclusion of one exercise may create the research question for the next. For example, if a user research session reveals that users don’t understand how to interact with the product, future research sessions will test alternative interfaces until the organization has found something that works for users.

Philippe Kruchten, “Architectural Blueprints—The ‘4+1’ View Model of Software Architecture,” IEEE Software 12 no. 6 (November 1995): 42–50. 9 How to Finish When I was working for the United Nations (UN), my boss at the time would regularly start conversations with the words “When the website is done . . .” to which I would respond, “The website is never done.” I count among my accomplishments the fact that by the time I left the UN, people had bought in to the agile, iterative process that treats software as a living thing that needs to be constantly maintained and improved. They stopped saying “When the website is done . . .” Technology is never done, but modernization projects can be. This chapter covers how to define success in a way that makes it clear when the modernization effort is completed and what to do next.

pages: 52 words: 14,333

Growth Hacker Marketing: A Primer on the Future of PR, Marketing, and Advertising
by Ryan Holiday
Published 2 Sep 2013

The old way—where product development and marketing were two distinct and separate processes—has been replaced. We all find ourselves in the same position: needing to do more with less and finding, increasingly, that the old strategies no longer generate results. So in this book, I am going to take you through a new cycle, a much more fluid and iterative process. A growth hacker doesn’t see marketing as something one does, but rather as something one builds into the product itself. The product is then kick-started, shared, and optimized (with these steps repeated multiple times) on its way to massive and rapid growth. The chapters of this book follow that structure.

pages: 252 words: 73,131

The Inner Lives of Markets: How People Shape Them—And They Shape Us
by Tim Sullivan
Published 6 Jun 2016

The pioneers of information economics set the profession on a path to better describing the nature of markets, which has in turn led the current generation to turn its attention outward to dabble in the design of markets and policy. A great many applied theorists and empirical economists are, together, able to match theories up to data they can use to evaluate how they perform in practice. We hope that the iterative process of theorizing and testing of theories in the field and the reformulating of theories (a process of experimentation that we’re in the midst of) will make it more likely that economics’ increased influence on the world is ultimately for the better.
5 BUILDING AN AUCTION FOR EVERYTHING
THE TALE OF THE ROLLER-SKATING ECONOMIST
In the fall of 2006, the Japanese baseball phenom Daisuke Matsuzaka announced his interest in moving to the American big leagues.

But that’s really all that Gale and Shapley provided: a conceptual framework that market designers have, for several decades now, been applying, evaluating, and refining. They’ve learned from its successes and, unfortunately, learned even more from its inevitable failures: modeling real-life exchanges is an imprecise, iterative process in which many of us find ourselves as experimental subjects. The Complicated Job of Engineering Matches Market designer Al Roth likes to use a bridge-building metaphor to explain the contrast between his own work and that of design pioneers like Shapley. Suppose you want to build a suspension bridge connecting Brooklyn and Manhattan.

Human Transit: How Clearer Thinking About Public Transit Can Enrich Our Communities and Our Lives
by Jarrett Walker
Published 22 Dec 2011

In practice, this is a great way to make the bureaucracy grind to a halt; the coordination challenge becomes the main goal, and staff have little time left over to do the actual planning work. The better solution, in my experience, is for the plan to pass back and forth between land use and transit agencies, in an iterative process, such as that sketched in figure 16-1. The land use agency does a plan about urban structure. Then, transit planners do a long-range network plan whose core message is: “Here are the transit consequences of the proposed urban structure. Here is where we will need rapid transit, here is where we’ll need frequent local transit, and here’s where no frequent transit can be supported.

That way, when developments are being approved, the short-term land use planner can check whether the location is a good or bad one for transit and can judge developments accordingly. Meanwhile, as the long-term land use planners stare at the transit map, they have new ideas for how to build communities around the proposed lines and stations. These iterative processes scare some people, because they feel circular and therefore potentially endless. But every step is building elements of a durable consensus about a city’s future shape, and at every step certain projects will move into implementation. The goal is to show everyone the transportation consequences of their decisions about where to locate, so that those decisions, expressing the self-interest of each player, collectively produce a more efficient transit system, and thus a more resilient city.

pages: 332 words: 93,672

Life After Google: The Fall of Big Data and the Rise of the Blockchain Economy
by George Gilder
Published 16 Jul 2018

The idea of big data is that the previous slow, clumsy, step-by-step search for knowledge by human brains can be replaced if two conditions are met: All the data in the world can be compiled in a single “place,” and algorithms sufficiently comprehensive to analyze them can be written. Upholding this theory of knowledge is a theory of mind derived from the pursuit of artificial intelligence. In this view, the brain is also fundamentally algorithmic, iteratively processing data to reach conclusions. Belying this notion of the brain is the study of actual brains, which turn out to be much more like sensory processors than logic machines. Yet the direction of AI research is essentially unchanged. Like method actors, the AI industry has accepted that its job is to act “as if” the brain were a logic machine.

Google’s path to riches, for which it can show considerable evidence, is that with enough data and enough processors it can know better than we do what will satisfy our longings. Even as the previous systems of the world were embodied and enabled in crucial technologies, so the Google system of the world is embodied and enabled in a technological vision called cloud computing. If the Google theory is that universal knowledge is attained through the iterative processing of enormous amounts of data, then the data have to be somewhere accessible to the processors. Accessible in this case is defined by the speed of light. The speed-of-light limit—nine inches in a billionth of a second—requires the aggregation of processors and the memory in some central place, with energy available to access and process the data.

pages: 556 words: 46,885

The World's First Railway System: Enterprise, Competition, and Regulation on the Railway Network in Victorian Britain
by Mark Casson
Published 14 Jul 2009

It should be emphasized that this counterfactual is not superior for every conceivable consignment of traffic, but only for a typical consignment of a certain type of traffic. The counterfactual system is developed from a ‘blank sheet of paper’—almost literally—and not by simply exploring variations to the configuration of the actual network. It is constructed using an iterative process, as explained below. To avoid the need for a full evaluation of the performance of the network after each iteration, a set of simple criteria were used to guide the initial formulation of the model. These criteria represent conditions that the counterfactual would almost certainly have to fulfil if it were to stand any chance of matching the performance of the actual system.

Finally, seven regional samples were constructed—reflecting the fact that most journeys on any railway system tend to be relatively short. Comparing the results for different samples illustrates how well the actual system served different types of traffic and different parts of the country. The counterfactual network was constructed using an iterative process. The performance of the actual system was first assessed. This proved to be a most illuminating process, indicating that the actual performance of the system for many categories of traffic was much inferior to what has often been suggested— particularly in the enthusiasts’ literature. An initial counterfactual system was then constructed, using only a limited number of local lines, and its performance compared with the actual system.

Wherever possible, local lines feed into the hubs identified at the first stage. By concentrating interchange traffic at a limited number of hubs, the number of stops that long-distance trains need to make for connection purposes is reduced. At the same time, the power of hubs to act as a ‘one-stop shop’ for local connections is increased. An iterative process was then followed to fine-tune the interfaces between trunk and local networks. The final stage is based on the counterfactual timetable. The preparation of the timetable provides an opportunity to assess whether upgrading certain links to permit higher speeds would improve connections. Connections are improved when speeding up one link relative to others allows trains entering a hub from the accelerated link to make connections with trains that they would otherwise miss.

pages: 1,606 words: 168,061

Python Cookbook
by David Beazley and Brian K. Jones
Published 9 May 2013

For example: >>> s = ' hello world \n' >>> s = s.strip() >>> s 'hello world' >>> If you needed to do something to the inner space, you would need to use another technique, such as using the replace() method or a regular expression substitution. For example: >>> s.replace(' ', '') 'helloworld' >>> import re >>> re.sub(r'\s+', ' ', s) 'hello world' >>> It is often the case that you want to combine string stripping operations with some other kind of iterative processing, such as reading lines of data from a file. If so, this is one area where a generator expression can be useful. For example: with open(filename) as f: lines = (line.strip() for line in f) for line in lines: ... Here, the expression lines = (line.strip() for line in f) acts as a kind of data transform.
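The generator-expression transform described above can be tried without a real file. The following is a minimal sketch, assuming an in-memory io.StringIO object in place of the opened file; the names stripped_lines and sample are hypothetical and do not come from the book.

```python
import io

def stripped_lines(stream):
    """Lazily yield each line of `stream` with surrounding whitespace removed."""
    # A generator expression: nothing is read until the caller iterates.
    return (line.strip() for line in stream)

# io.StringIO stands in for an open text file so the sketch needs no disk access.
sample = io.StringIO("  alpha  \n\tbeta\n  gamma delta  \n")
result = list(stripped_lines(sample))
print(result)
```

Because the transform is lazy, only one line is held in memory at a time, which is the property that makes this pattern attractive for large files.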

_child_iter = None return next(self) # Advance to the next child and start its iteration else: self._child_iter = next(self._children_iter).depth_first() return next(self) The DepthFirstIterator class works in the same way as the generator version, but it’s a mess because the iterator has to maintain a lot of complex state about where it is in the iteration process. Frankly, nobody likes to write mind-bending code like that. Define your iterator as a generator and be done with it. 4.5. Iterating in Reverse Problem You want to iterate in reverse over a sequence. Solution Use the built-in reversed() function. For example: >>> a = [1, 2, 3, 4] >>> for x in reversed(a): ... print(x) ... 4 3 2 1 Reversed iteration only works if the object in question has a size that can be determined or if the object implements a __reversed__() special method.
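The condition on reversed iteration can be illustrated with a small class that supplies its own __reversed__() method. This is a sketch under assumptions: the Countdown class below is a hypothetical example written for illustration, not code quoted from the excerpt.

```python
class Countdown:
    """Hypothetical iterable that counts down from `start` to 1."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # Forward iteration: start, start - 1, ..., 1
        n = self.start
        while n > 0:
            yield n
            n -= 1

    def __reversed__(self):
        # Reverse iteration: 1, 2, ..., start
        n = 1
        while n <= self.start:
            yield n
            n += 1

forward = list(Countdown(3))
backward = list(reversed(Countdown(3)))
print(forward, backward)
```

A plain generator has neither a known size nor a __reversed__() method, so passing one to reversed() raises TypeError; converting it to a list first is the usual workaround, at the cost of holding every item in memory.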

Instead, it simply examines the set of items from the front of each input sequence and emits the smallest one found. A new item from the chosen sequence is then read, and the process repeats itself until all input sequences have been fully consumed. 4.16. Replacing Infinite while Loops with an Iterator Problem You have code that uses a while loop to iteratively process data because it involves a function or some kind of unusual test condition that doesn’t fall into the usual iteration pattern. Solution A somewhat common scenario in programs involving I/O is to write code like this: CHUNKSIZE = 8192 def reader(s): while True: data = s.recv(CHUNKSIZE) if data == b'': break process_data(data) Such code can often be replaced using iter(), as follows: def reader(s): for chunk in iter(lambda: s.recv(CHUNKSIZE), b''): process_data(chunk) If you’re a bit skeptical that it might work, you can try a similar example involving files.
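One way to check the two-argument form of iter() is sketched below. It assumes an io.BytesIO object standing in for a socket or file, and a deliberately tiny chunk size so that several chunks are produced; read_chunks is a hypothetical helper name.

```python
import io

CHUNKSIZE = 4  # deliberately tiny so the sample data yields several chunks

def read_chunks(stream, size=CHUNKSIZE):
    """Collect fixed-size chunks until read() returns the b'' sentinel."""
    chunks = []
    # iter(callable, sentinel) calls the callable repeatedly and stops
    # as soon as it returns a value equal to the sentinel.
    for chunk in iter(lambda: stream.read(size), b''):
        chunks.append(chunk)
    return chunks

chunks = read_chunks(io.BytesIO(b'iterative!'))
print(chunks)
```

The loop body never sees the sentinel value, so no explicit break or end-of-data test is needed.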

pages: 632 words: 166,729

Addiction by Design: Machine Gambling in Las Vegas
by Natasha Dow Schüll
Published 15 Jan 2012

Scholars of the so-called experience economy have picked up on this tendency toward collaboration in which corporate concerns “lie ever closer to the concerns of the consumer” and products are understood to emerge through a dynamic process of “co-creation.”75 The sociologist Michel Callon and his colleagues, for instance, have described consumer product design as an iterative process of successive adjustment in which “what is sought after is a very close relationship between what the consumer wants and expects, on the one hand, and what is offered, on the other.”76 They construe this relationship as a symmetrical “collaboration between supply and demand” wherein corporations and consumers meet on even ground, holding roughly equal hands, in order to mutually satisfy their respective desires.

The historian David Courtwright likewise views Las Vegas as an exemplary site of what he calls “limbic capitalism” (or “the reorientation of capitalist enterprise [around] providing transient but habitual pleasures, whether drugs or pornography or gambling or even sweet and fatty foods”) (2005, 121). 74. Terranova 2000. 75. Thrift 2006, 284, 279. 76. Callon, Méadel, and Rabeharisoa 2002, 202, emphasis mine. Callon and his colleagues describe the iterative process of customization as one of “qualification” and “requalification,” in which product qualities are progressively attributed, stabilized, objectified, and arranged. They are interested in the “relays and relations between the predilections and passions of the individual and the attributes and image of the product,” as Nikolas Rose puts it in his own work (1999, 245). 77.

gambling machines (technology): addictiveness of; automatic payouts on; bank hoppers in; bill acceptors on; bonus features on; cash-access systems for; cheating at; coinless; credit-based; deceptive aspects of; evolution of; haptic; mathematical programming of; as objects of enchantment; and player trust; role of in addiction; seating for; shift from handles to push-buttons on; sound of; speed of; stop features on; television monitors on; and TITO; and tolerance formation; touchscreens on; visuals of. See also gambling machine design; machine gambling; video poker; video slots. gambling machine design: attention to sensory features in; capacitive; calculative rationality of; and the death drive; ergonomic; as inscription; as an iterative process; and market research; player-centric; and promotion of irresponsible behavior; turn from art to science in. See also gambling machines (technology); multiline video slots; near misses; random number generator; video poker. games. See contrived contingency; machine gambling; play. gaming industry.

Digital Accounting: The Effects of the Internet and Erp on Accounting
by Ashutosh Deshmukh
Published 13 Dec 2005

By focusing on the flow of materials to the customer, this plan helps unearth constraints and bottlenecks in the process, which are then managed or neutralized. This understanding is made legal via agreements and contracts. A sample CPFR contract is available at www.cpfr.org. • Forecasting sales: A single forecast is developed based on sales data, production capacities and promotional activities. This forecast is an iterative process and each partner must fully commit to the forecast before it is finalized. This single forecast is split into two parts: the order forecast and the sales forecast. The order forecast represents demand between the supplier and the customer, and the sales forecast represents demand from the customer’s customers.

[…] participants can be notified via e-mail that these forms are available to set the budget process rolling. These forms support both top-down and bottom-up budgets. In the case of top-down budgets, the forms can be populated with desired numbers and sent downward. For bottom-up budgets, the budget numbers can come from the lower levels of the organization. The iterative process can then be carried out. The forms are subject to strict ownership and access requirements. The budget director establishes the ownership and accountability for each form by defining authorization for user groups. Authorization rights include read, write, modify and submit. The systems administrator manages access controls and issuance of passwords.

As usual, there is no unanimity in the definitions; though the description of the CPM process is fairly standard. The CPM process consists of seven steps: strategy formulation, scenario analysis, planning and budgeting, communication, monitoring, forecasting and reporting. Strategy formulation is carried out at the top management level. This is an iterative process and different scenarios are analyzed from different angles to select a set of strategies. These strategies are converted into operational plans and financial budgets. The planning and budgeting process needs to be speedy and flexible to be useful. Key performance measures are designed for each area, and the budgets and performance measures are communicated to all levels.

Thinking with Data
by Max Shron
Published 15 Aug 2014

Having to clarify our thoughts down to a few sentences per part is extremely helpful. Once we have them clear (or at least know what is still unclear), we can go out and acquire data, clarify our understanding, start the technical work, clarify our understanding, gradually converge on something smart and useful, and…clarify our understanding. Data science is an iterative process. Context (Co) Every project has a context, the defining frame that is apart from the particular problems we are interested in solving. Who are the people with an interest in the results of this project? What are they generally trying to achieve? What work, generally, is the project going to be furthering?

pages: 353 words: 104,146

European Founders at Work
by Pedro Gairifo Santos
Published 7 Nov 2011

But it got me thinking around that time about monetizing it. I was actually more bothered about what happens when all these early adopters have the product, and perhaps some of them have donated generously. That’s very nice. What happens next? Is it literally just the iterative process of adding more and more features? Or is there something a bit more to this? I think over time it became very apparent that to keep up the iterative process, I needed more staff. I needed help to just keep doing this. We needed to integrate Facebook, and LinkedIn, and more recently Foursquare. To make it more of a hub than simply a Twitter client. So, yes, it didn’t take very long for me to start thinking in that way.

pages: 376 words: 110,796

Realizing Tomorrow: The Path to Private Spaceflight
by Chris Dubbs , Emeline Paat-dahlstrom and Charles D. Walker
Published 1 Jun 2011

"They build a rocket for a year, go out into the desert, they press the button and hope it doesn't blow up. It really rarely works right." He wanted to follow the same "rapid iterative process" he used in developing software and apply it to his rocketry business. "The background that I came from in software is you compile and test maybe a dozen times a day. It's a cyclic thing where you try to make it right but much of the benefit you get is in the exploration of the process, not so much plan it out perfect, implement it perfect for it to work. It's an iterative process of exploring your options." Carmack taught himself aerospace engineering and became one of Armadillo's principal engineers for the project.

pages: 445 words: 105,255

Radical Abundance: How a Revolution in Nanotechnology Will Change Civilization
by K. Eric Drexler
Published 6 May 2013

These can then be separated through cascade sorting processes well suited to atomically precise mechanisms, as discussed in Nanosystems. 153 with a site that binds a feedstock molecule: A mechanism downstream can probe each site to ensure that it’s full and push empties on a path that leads them around for another go. With an iterative process of this kind, the free energy requirement for reliable binding can approach the minimum required for reducing entropy, which can be thought of as the work required for compression in a configuration space with both positional and angular coordinates. Computational chemists will note that free energy calculations involving oriented, non-solvated molecules in a rigid binding site are far less challenging than calculations that must probe the configuration space of mobile, solvated, conformationally flexible molecules.

A large slowdown factor from this base (to reduce phonon drag in bearings, for example) is compatible with a still-enormous product throughput. 154 chemical steps that prepare reactive bits of molecular structure: To work reliably with small reactive groups and fine-grained structures, placement mechanisms must be stiff enough to adequately constrain thermal fluctuations. (Appendix I and Appendix II place this requirement in perspective.) 156 each step typically must expend substantial chemical energy: As with binding, an iterated process with conditional repetition can in some instances avoid this constraint. 157 density functional methods . . . applied in conservative ways: Methods in quantum chemistry have limited accuracy and ranges of applicability that must be kept in mind when considering which methods to use and how far to trust their results.

pages: 121 words: 36,908

Four Futures: Life After Capitalism
by Peter Frase
Published 10 Mar 2015

In a paper presented to a 2014 conference of the Association for Computing Machinery, a group of medical researchers presented a method for automatically generating plausible hypotheses for scientists to test, using data mining techniques.20 Such approaches could eventually be applied to other formulaic, iterative processes like the design of pop songs or smartphone games. What’s more, there is also another way for private companies to avoid employing workers for some of these tasks: turn them into activities that people will find pleasurable and will thus do for free on their own time. The computer scientist Luis von Ahn has specialized in developing such “games with a purpose”: applications that present themselves to end users as enjoyable diversions but which also perform a useful computational task, what von Ahn calls “Human Computation.”21 One of von Ahn’s early games asked users to identify objects in photos, and the data was then fed back into a database that was used for searching images, a technology later licensed by Google to improve its Image Search.

pages: 123 words: 37,853

Do Improvise: Less push. More pause. Better results. A new approach to work (and life) (Do Books)
by Poynton, Robert
Published 14 May 2013

You may learn how not to do something, or use the experience to understand why something didn’t work. You might also discover something you weren’t looking for, which is how many of the most important discoveries get made. You can see a mistake as a ‘mis-take’, like an actor on a film set, and regard it as another take, or attempt, in an iterative process that strives to get better. This doesn’t mean you should make mistakes on purpose, but given that they are going to occur anyway, it provides a constructive way to respond and a good way to direct your energy. If mistakes are opportunities, you don’t need to make apologies, look for scapegoats or find excuses: you just get on with working out how to use them.

pages: 161 words: 39,526

Applied Artificial Intelligence: A Handbook for Business Leaders
by Mariya Yao , Adelyn Zhou and Marlene Jia
Published 1 Jun 2018

Anand Rao, Innovation Lead for US Analytics at PwC, and his team use the agile method to run four-week sprints on AI projects, during which they transform an idea into an initial implementation.(82) After each cycle, the team reviews project performance to determine whether the project needs more data or is even worth pursuing further. As Rao explains, this iterative process “gives the option to the client to experiment in a bite-size piece, as opposed to big chunk investments,” thereby mitigating risk. PwC uses efficient teams of only two or three people and runs up to 80 sprints a year. Technical Debt Building a successful machine learning model is just the first step to creating an AI product.

pages: 170 words: 42,196

Don't Make Me Think!: A Common Sense Approach to Web Usability
by Steve Krug
Published 2 Jan 2000

No one has the resources to set up the kind of controlled experiment you’d need. What testing can do is provide you with invaluable input which, taken together with your experience, professional judgment, and common sense, will make it easier for you to choose wisely—and with greater confidence—between “a” and “b.” > Testing is an iterative process. Testing isn’t something you do once. You make something, test it, fix it, and test it again. > Nothing beats a live audience reaction. One reason why the Marx Brothers’ movies are so wonderful is that before they started filming they would go on tour on the vaudeville circuit and perform scenes from the movie, doing five shows a day, improvising constantly and noting which lines got the best laughs.

pages: 137 words: 44,363

Design Is a Job
by Mike Monteiro
Published 5 Mar 2012

Never apologize for what you’re not showing. By the time you’re presenting, you should be focused on presenting what you have, not making excuses for what you don’t. And you need to believe what you’re saying to convince the client of the same. If you think the work is on the way to meeting their goals then say that. Design is an iterative process, done with a client’s proper involvement at key points. The goal isn’t always to present finished work; it’s to present work at the right time. I’ve met a few designers over the years who feel like selling design is manipulation. Manipulation is convincing someone that the truth is different than what it seems.

pages: 199 words: 43,653

Hooked: How to Build Habit-Forming Products
by Nir Eyal
Published 26 Dec 2013

The Hook Model can be a helpful tool for filtering out bad ideas with low habit potential as well as a framework for identifying room for improvement in existing products. However, after the designer has formulated new hypotheses, there is no way to know which ideas will work without testing them with actual users. Building a habit-forming product is an iterative process and requires user-behavior analysis and continuous experimentation. How can you implement the concepts in this book to measure your product’s effectiveness in building user habits? Through my studies and discussions with entrepreneurs at today’s most successful habit-forming companies, I’ve distilled this process into what I term Habit Testing.

Machine Learning Design Patterns: Solutions to Common Challenges in Data Preparation, Model Building, and MLOps
by Valliappa Lakshmanan , Sara Robinson and Michael Munn
Published 31 Oct 2020

3 See https://oreil.ly/kDndF for a primer on call and put options. 4 The dataset was generated based on the PaySim research proposed in this paper: Edgar Lopez-Rojas, Ahmad Elmir, and Stefan Axelsson, “PaySim: A financial mobile money simulator for fraud detection,” 28th European Modeling and Simulation Symposium, EMSS, Larnaca, Cyprus (2016): 249–255. Chapter 4. Model Training Patterns Machine learning models are usually trained iteratively, and this iterative process is informally called the training loop. In this chapter, we discuss what the typical training loop looks like, and catalog a number of situations in which you might want to do something different. Typical Training Loop Machine learning models can be trained using different types of optimization.
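To make the idea of a training loop concrete, here is a minimal sketch in plain Python. It is not the book's code: it assumes a one-weight linear model, mean squared error, and batch gradient descent on toy data, and the names train, xs, ys, and learning_rate are all illustrative.

```python
# Toy data generated from y = 3x; the model has a single weight w.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 6.0, 9.0, 12.0]

def train(xs, ys, learning_rate=0.01, epochs=200):
    """Fit y = w * x by repeating predict, measure loss, update."""
    w = 0.0
    losses = []
    for _ in range(epochs):
        # Forward pass: predictions with the current weight.
        preds = [w * x for x in xs]
        # Mean squared error over the whole (tiny) dataset.
        loss = sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(xs)
        losses.append(loss)
        # Gradient of the loss with respect to w.
        grad = sum(2 * (p - y) * x for p, y, x in zip(preds, ys, xs)) / len(xs)
        # Update step: move w against the gradient.
        w -= learning_rate * grad
    return w, losses

w, losses = train(xs, ys)
print(round(w, 4), losses[0], losses[-1])
```

Each pass through the loop body is one iteration of the process: the state is just w, and the update rule is fixed, which is what makes the procedure iterative in the sense used throughout these excerpts.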

After splitting our data, we’ll use the same type of analysis we employed in the “Before training” part of this section on each split of our data: training, validation, and test. As seen from this analysis, there is no one-size-fits-all solution or evaluation metric for model fairness. It is a continuous, iterative process that should be employed throughout an ML workflow—from data collection to deployed model. Trade-Offs and Alternatives There are many ways to approach model fairness in addition to the pre- and post-training techniques discussed in the Solution section. Here, we’ll introduce a few alternative tools and processes for achieving fair models.

Engineering Security
by Peter Gutmann

The Transformation is fairly straightforward: we want to go from an untrusted to a trusted state, or more abstractly we want to solve the problem of trusted knowledge distribution. (Footnote 66: Anyone who’s ever owned a cat can easily appreciate how this would lead to CATWOE.) Finally, we have the Root Definition, which is that we want to validate customers using systems that we don’t control over a network that we don’t control against systems that we do control in a situation where it’s advantageous for attackers to manipulate the process, and it all has to be done on a shoestring budget (as “Problems without Solutions” on page 396 points out, there’s a good reason why the sorts of problems that PSMs were created to address are known as “wicked problems”). The above is only one particular way of approaching things, which is why Figure 68 showed this stage as being part of a very iterative process. At the moment we’ve framed the problem from the point of view of the defender. What happens when we look at it from the attacker’s perspective? In other words rather than looking at what the defenders are trying to achieve, can we look at what the attackers are trying to achieve? About a decade ago the primary motivation for attackers would have been ego gratification, whereas today it’s far more likely to be a commercial motive.

So by changing this part of the model to “attackers (owners) cannot undetectably manipulate the communications” and then operating this new variant on paper, it’s possible to see whether the change improves (or worsens) the overall situation. As Figure 69 shows, these last two stages of the SSM are a very iterative process. Much more so than for the defenders, the attackers would use this stage to insert a typical attack into the model, run through it to see how well it works, and then try again if it doesn’t (although given that any attack will be successful if you throw enough of someone else’s resources at it, and the attackers have little shortage of those, the number of iterations may be less than expected).

The response from a security auditor on seeing this was “someone needs shooting”. (Footnote 140: In case you’re wondering about Password-Based Key Derivation Function No.1, technically there isn’t one, it’s a gap left for historical reasons.) […] whose only use is to be passed back to another DPAPI function [290] (DPAPI is covered in more detail in “Case Study: Apple’s Keychain” on page 622). An iterated hash function isn’t always the most appropriate thing to use for processing passwords. While the iteration process helps slow down password-guessing attacks it also makes legitimate password checks a lot more laborious, and really only makes sense if you expect the attacker to be able to steal your password database. As pointed out earlier, if an attacker is in a position to steal your password database then you probably have bigger problems than password-cracking to worry about.
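The trade-off described above, where iteration slows the attacker and the legitimate check alike, can be seen in a small sketch using PBKDF2 from Python's standard library. This is an illustration under assumptions rather than anything taken from the book: the iteration count, salt length, and the helper names hash_password and verify_password are all chosen here for the example.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative value only, not a recommendation from the source

def hash_password(password, salt=None, iterations=ITERATIONS):
    """Derive a digest by iterating HMAC-SHA256 over the password and salt."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt for each stored password
    digest = hashlib.pbkdf2_hmac('sha256', password.encode('utf-8'), salt, iterations)
    return salt, digest

def verify_password(password, salt, expected, iterations=ITERATIONS):
    """Repeat the same iterated derivation and compare in constant time."""
    _, digest = hash_password(password, salt, iterations)
    return hmac.compare_digest(digest, expected)

salt, stored = hash_password('correct horse')
print(verify_password('correct horse', salt, stored))
print(verify_password('wrong guess', salt, stored))
```

Note that verify_password pays the full iteration cost on every login attempt, which is the "more laborious" legitimate check the passage refers to.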

pages: 167 words: 50,652

Alternatives to Capitalism
by Robin Hahnel and Erik Olin Wright

If the proposals are rejected, households revise them. 5. Neighborhood consumption councils aggregate the approved individual consumption requests of all households in the neighborhood, append requests for whatever neighborhood public goods they want, and submit the total list as the neighborhood consumption council’s request in the planning process. 6. Higher-level federations of consumption councils make requests for whatever public goods are consumed by their membership. 7. On the basis of all of the consumption proposals along with the production proposals from worker councils, the IFB recalculates the indicative prices and, where necessary, sends proposals back to the relevant councils for revision. 8. This iterative process continues until no revisions are needed. There are two issues that I would like to raise with this account about how household consumption planning would actually work in practice: (1) How useful is household consumption planning? (2) How marketish are “adjustments”? How Useful Is Household Consumption Planning?

pages: 165 words: 50,798

Intertwingled: Information Changes Everything
by Peter Morville
Published 14 May 2014

For example, information architects are often associated with what the Agile software community calls Big Design Up Front. And it’s true that in the early days of the Web, our wireframes fit nicely into the sequential process of the waterfall model. We created blueprints for websites before designers and developers got involved. Many of us would have preferred a more collaborative, iterative process but were constrained by management’s step by step plans. Since then, the context has changed. While we still plan new sites, much of our work is about measuring and improving what exists. And when we do a responsive redesign, for instance, we know wireframes aren’t enough, so we work with designers and developers to build HTML prototypes we can test on many devices.

pages: 761 words: 231,902

The Singularity Is Near: When Humans Transcend Biology
by Ray Kurzweil
Published 14 Jul 2005

(For an algorithmic description of genetic algorithms, see this note.175) The key to a GA is that the human designers don't directly program a solution; rather, they let one emerge through an iterative process of simulated competition and improvement. As we discussed, biological evolution is smart but slow, so to enhance its intelligence we retain its discernment while greatly speeding up its ponderous pace. The computer is fast enough to simulate many generations in a matter of hours or days or weeks. But we have to go through this iterative process only once; once we have let this simulated evolution run its course, we can apply the evolved and highly refined rules to real problems in a rapid fashion.
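The loop of simulated competition and improvement described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the algorithm from the book's note: the bit-string genome, tournament selection, single-point crossover, elitism, and all parameter values (`pop_size`, `mutation_rate`, and so on) are hypothetical choices made for the example.

```python
import random


def evolve(fitness, genome_len=20, pop_size=30, generations=60,
           mutation_rate=0.02, seed=0):
    """Toy genetic algorithm over bit strings; returns the best genome found."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(genome_len)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)

    def tournament(pool):
        # Simulated competition: the fitter of two random genomes wins.
        a, b = rng.sample(pool, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        next_gen = [best[:]]  # elitism: the current best always survives
        while len(next_gen) < pop_size:
            mom, dad = tournament(population), tournament(population)
            cut = rng.randrange(1, genome_len)  # single-point crossover
            child = mom[:cut] + dad[cut:]
            # Mutation: flip each bit with a small probability.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            next_gen.append(child)
        population = next_gen
        best = max(population, key=fitness)
    return best


def count_ones(genome):
    """Example fitness function: the number of 1 bits."""
    return sum(genome)


best = evolve(count_ones)
```

Note that nothing in `evolve` encodes how to maximize `count_ones`; the designer supplies only the fitness measure, and the solution emerges from repeated generations.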

With the reverse engineering of the human brain we will be able to apply the parallel, self-organizing, chaotic algorithms of human intelligence to enormously powerful computational substrates. This intelligence will then be in a position to improve its own design, both hardware and software, in a rapidly accelerating iterative process. But there still appears to be a limit. The capacity of the universe to support intelligence appears to be only about 10^90 calculations per second, as I discussed in chapter 6. There are theories such as the holographic universe that suggest the possibility of higher numbers (such as 10^120), but these levels are all decidedly finite.

pages: 189 words: 52,741

Lifestyle Entrepreneur: Live Your Dreams, Ignite Your Passions and Run Your Business From Anywhere in the World
by Jesse Krieger
Published 2 Jun 2014

This encourages creativity and you may be pleasantly surprised. Once the initial sketch designs are completed, I’ll look for various elements in the logo that I like and write feedback asking them to incorporate various aspects from the initial designs into a new round of logos based on my feedback. This is an iterative process where each round of designs helps clarify the idea I have in mind and informs the directions I give the designer for the next round of improvements. Generally going through this process 2-3 times gets me 80-90% of the way there, and then the final changes usually revolve around changing font styles, adjusting color schemes and the placement of elements within the logo.

pages: 222 words: 53,317

Overcomplicated: Technology at the Limits of Comprehension
by Samuel Arbesman
Published 18 Jul 2016

A humble approach to our technologies helps us strive to understand these human-made, messy constructions, yet still yield to our limits. And this humble approach to technology fits quite nicely with biological thinking. While at every moment an incremental approach to knowledge provides additional understanding of a system, this iterative process will always feel incomplete. And that’s okay. New York Times columnist David Brooks has noted, “Wisdom starts with epistemological modesty.” Humility, alongside an interest in the details of complex systems, can do what both fear and worship cannot: help us peer and poke around the backs of our systems, even if we never look them in the face with complete understanding.

pages: 176 words: 54,784

The Subtle Art of Not Giving a F*ck: A Counterintuitive Approach to Living a Good Life
by Mark Manson
Published 12 Sep 2016

And that will be a good thing. Because that will mean I have grown. There’s a famous Michael Jordan quote about him failing over and over and over again, and that’s why he succeeded. Well, I’m always wrong about everything, over and over and over again, and that’s why my life improves. Growth is an endlessly iterative process. When we learn something new, we don’t go from “wrong” to “right.” Rather, we go from wrong to slightly less wrong. And when we learn something additional, we go from slightly less wrong to slightly less wrong than that, and then to even less wrong than that, and so on. We are always in the process of approaching truth and perfection without actually ever reaching truth or perfection.

pages: 196 words: 54,339

Team Human
by Douglas Rushkoff
Published 22 Jan 2019

Our science fiction movies depict races of robots taking revenge on their human overlords—as if this problem is somehow more relevant than the unacknowledged legacy of slavery still driving racism in America, or the twenty-first-century slavery on which today’s technological infrastructure depends. We are moving into a world where we care less about how other people regard us than how AIs do. 56. Algorithms do reflect the brilliance of the engineers who craft them, as well as the power of iterative processes to solve problems in novel ways. They can answer the specific questions we bring them, or even generate fascinating imitations of human creations, from songs to screenplays. But we are mistaken if we look to algorithms for direction. They are not guided by a core set of values so much as by a specific set of outcomes.

pages: 696 words: 143,736

The Age of Spiritual Machines: When Computers Exceed Human Intelligence
by Ray Kurzweil
Published 31 Dec 1998

The above paradigm is called an evolutionary (sometimes called genetic) algorithm.31 The system designers don’t directly program a solution; they let one emerge through an iterative process of simulated competition and improvement. Recall that evolution is smart but slow, so to enhance its intelligence we retain its discernment while greatly speeding up its ponderous pace. The computer is fast enough to simulate thousands of generations in a matter of hours or days or weeks. But we have only to go through this iterative process one time. Once we have let this simulated evolution run its course, we can apply the evolved and highly refined rules to real problems in a rapid fashion.

pages: 517 words: 147,591

Small Wars, Big Data: The Information Revolution in Modern Conflict
by Eli Berman , Joseph H. Felter , Jacob N. Shapiro and Vestal Mcintyre
Published 12 May 2018

You gather data and see that places with successes and failures all have copper mines, but all places with successes have agricultural districts and all places with failures do not. Once you recognize that some aspect of an agricultural economy interacts well with the new welfare program, you can design it and deploy it more effectively. As we mentioned, sometimes the iterative process yields big surprises. Eli’s evening with development contractors in Kabul was one of those, revealing that they were themselves lacking a doctrine. Since then, we have been chipping away at the question of how development assistance should be designed in an insurgency setting, iterating between a large set of theories and a growing set of empirical findings, as we will discuss at greater length in subsequent chapters.

See also labor market Edhi, Abdul Sattar, 232 Eggers, Andrew C., 136 Egypt, 99–100 election fraud, 283–87 ELN (National Liberation Army), 255 El Salvador, 310 Elvidge, Chris, 14 empirical, as opposed to quantitative, 331n20 Empirical Studies of Conflict (ESOC), 25–54; background on, 25–31; epistemology of, 33–43; establishing causal relationships as goal of, 43–50; incremental accumulation of facts in, 53; iterative process of, 50–53; value of work done by, 316 endorsement experiments, 193–94, 196–97, 231–32, 357n28 Enikolopov, Ruben, 132 equilibrium, in game theoretic models, 62, 69–77, 338n17 Erbil, Iraq, 112–13 ESOC. See Empirical Studies of Conflict ESOC Philippines Database, xvii, 38, 137, 268, 270, 325–27, 333n17, 368n11 Estrada, Joseph, 66 Ethiopia, 140 European Union, 296 Fair, Christine, 47, 192, 194–95 Fallujah, Iraq, 110–11, 113, 152 FARC (Revolutionary Armed Forces of Colombia), 159, 235, 255–56, 309 Fearon, James, 226, 363n13 Federally Administered Tribal Areas, 11, 231, 297 Felter, Joe, 24–29, 32, 49, 58, 60–61, 89, 129–31, 134–36, 139, 179, 186, 208, 220, 224, 236, 241, 253, 257–58, 268–74, 305, 325–27 Fetzer, Thiemo, 251 Fine, Patrick, 156 1st Infantry Division, 3–7, 9 Fishman, Brian, 190 Flanagan, Mary, 79 Fluor, 110–12 FM (Field Manual) 3–24, 7 FMLN (Farabundo Martí National Liberation Front), 310 food aid, 139–42 foreign aid.

pages: 1,233 words: 239,800

Public Places, Urban Spaces: The Dimensions of Urban Design
by Matthew Carmona , Tim Heath , Steve Tiesdell and Taner Oc
Published 15 Feb 2010

In each development phase, particularly the design phase, the urban designer’s thought processes can be disaggregated into a series of stages:
• Setting goals – in conjunction with other actors (particularly clients and stakeholders) and having regard to economic and political realities, proposed timescale, and client and stakeholder requirements.
• Analysis – gathering and analysing information and ideas that might inform the design solution.
• Visioning – generating and developing possible solutions through an iterative process of imaging and presenting – usually informed by personal experience and design philosophies.
• Synthesis and prediction – testing the generated solutions as a means to identify workable alternatives.
• Decision-making – identifying alternatives to be discarded and those worthy of further refinement or promoting as the preferred design solution.

Each stage represents a complex set of activities, which, while generally portrayed as a linear process, is iterative and cyclical, and less mechanistic and more intuitive than diagrams of the design process suggest. Furthermore, in each phase, the nature of the problem changes and evolves as new information and influences come to bear, resulting in an iterative process with designs – including design policies and other guidance – reconsidered in the light of new objectives, or implemented in part and later changed and adapted as new influences come to bear. At this level, urban design parallels similar design processes in urban planning at the city-wide scale, architectural design of individual buildings and landscape design across the range of scales (Figure 3.13).

Zeisel (1981, 2006) conceived of a ‘design spiral’, in which design interconnects three basic activities – imaging, presenting and testing – relying on two types of information – heuristic and catalytic for imaging, and a body of knowledge for testing. He conceived of design as a cyclical and iterative process through which solutions are gradually refined through a series of creative leaps or ‘conceptual shifts’ as designers continuously modify their results in the light of new information (Figure 3.14). FIGURE 3.14 The design spiral (Image: John Zeisel 1981). A problem is identified, to which the designer forms a tentative solution, a range of solutions or, more generally, approaches to a solution.

pages: 211 words: 58,677

Philosophy of Software Design
by John Ousterhout
Published 28 Jan 2018

The result is CS 190 at Stanford University. In this class I put forth a set of principles of software design. Students then work through a series of projects to assimilate and practice the principles. The class is taught in a fashion similar to a traditional English writing class. In an English class, students use an iterative process where they write a draft, get feedback, and then rewrite to make improvements. In CS 190, students develop a substantial piece of software from scratch. We then go through extensive code reviews to identify design problems, and students revise their projects to fix the problems. This allows students to see how their code can be improved by applying design principles.

Data Mining: Concepts and Techniques
by Jiawei Han , Micheline Kamber and Jian Pei
Published 21 Jun 2011

All the patterns in each “ball” are then fused together to generate a set of superpatterns. These superpatterns form a new pool. If the pool contains more than K patterns, the next iteration begins with this pool for the new round of random drawing. As the support set of every superpattern shrinks with each new iteration, the iteration process terminates. Note that Pattern-Fusion merges small subpatterns of a large pattern instead of incrementally-expanding patterns with single items. This gives the method an advantage to circumvent midsize patterns and progress on a path leading to a potential colossal pattern. The idea is illustrated in Figure 7.10.

A number of automated techniques have been proposed that search for a “good” network structure. These typically use a hill-climbing approach that starts with an initial structure that is selectively modified. 9.2.3. Backpropagation “How does backpropagation work?” Backpropagation learns by iteratively processing a data set of training tuples, comparing the network's prediction for each tuple with the actual known target value. The target value may be the known class label of the training tuple (for classification problems) or a continuous value (for numeric prediction). For each training tuple, the weights are modified so as to minimize the mean-squared error between the network's prediction and the actual target value.
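The per-tuple weight update described above can be illustrated with its simplest case. The sketch below trains a single linear unit by gradient descent on squared error (the delta rule); it is not full backpropagation, since there are no hidden layers to propagate error through, but the update has the same shape. The function names, learning rate, epoch count, and toy data are assumptions made for the example.

```python
def train_linear_unit(tuples, learning_rate=0.1, epochs=200):
    """Fit prediction = bias + sum(w * x) by per-tuple gradient steps.

    tuples is a list of (inputs, target) pairs; returns (weights, bias).
    """
    n_inputs = len(tuples[0][0])
    weights = [0.0] * n_inputs
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in tuples:
            prediction = bias + sum(w * x for w, x in zip(weights, inputs))
            error = prediction - target
            # Gradient of 0.5 * error**2 is error * x per weight, error for the bias.
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias -= learning_rate * error
    return weights, bias


def mean_squared_error(tuples, weights, bias):
    """Average squared difference between predictions and targets."""
    total = 0.0
    for inputs, target in tuples:
        prediction = bias + sum(w * x for w, x in zip(weights, inputs))
        total += (prediction - target) ** 2
    return total / len(tuples)


# Toy training tuples drawn from the line y = 2x + 1.
data = [((0.0,), 1.0), ((1.0,), 3.0), ((2.0,), 5.0), ((3.0,), 7.0)]
weights, bias = train_linear_unit(data)
```

In a multilayer network the same error signal would additionally be passed backward through each hidden layer to compute the gradients of the earlier weights.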

It tackles the problem in an iterative, greedy way. Like the k-means algorithm, the initial representative objects (called seeds) are chosen arbitrarily. We consider whether replacing a representative object by a nonrepresentative object would improve the clustering quality. All the possible replacements are tried out. The iterative process of replacing representative objects by other objects continues until the quality of the resulting clustering cannot be improved by any replacement. This quality is measured by a cost function of the average dissimilarity between an object and the representative object of its cluster. Specifically, let o1, …, ok be the current set of representative objects (i.e., medoids).
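The swap-until-no-improvement loop described above can be sketched as follows. This is a simplified illustration, not the book's exact procedure: it accepts the first improving swap it finds rather than evaluating every swap before choosing, it takes the first `k` points as the arbitrary seeds, and the names `pam` and `average_cost` are hypothetical.

```python
from itertools import product


def average_cost(points, medoids, dist):
    """Average dissimilarity between each point and its nearest medoid."""
    return sum(min(dist(p, m) for m in medoids) for p in points) / len(points)


def pam(points, k, dist):
    """Greedy k-medoids: swap medoids with non-medoids while quality improves."""
    medoids = list(points[:k])  # arbitrary initial representative objects
    cost = average_cost(points, medoids, dist)
    improved = True
    while improved:
        improved = False
        for i, candidate in product(range(k), points):
            if candidate in medoids:
                continue
            # Try replacing the i-th medoid with a non-representative object.
            trial = medoids[:i] + [candidate] + medoids[i + 1:]
            trial_cost = average_cost(points, trial, dist)
            if trial_cost < cost:
                medoids, cost, improved = trial, trial_cost, True
    return medoids, cost


# Toy one-dimensional data with two obvious clusters.
points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
medoids, cost = pam(points, 2, lambda a, b: abs(a - b))
```

Because the cost strictly decreases on every accepted swap and there are finitely many medoid sets, the loop always terminates, though only at a local optimum in general.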

pages: 186 words: 50,651

Interactive Data Visualization for the Web
by Scott Murray
Published 15 Mar 2013

The increased speed enables us to work with much larger datasets of thousands or millions of values; what would have taken years of effort by hand can be mapped in a moment. Just as important, we can rapidly experiment with alternate mappings, tweaking our rules and seeing their output re-rendered immediately. This loop of write/render/evaluate is critical to the iterative process of refining a design. Sets of mapping rules function as design systems. The human hand no longer executes the visual output; the computer does. Our human role is to conceptualize, craft, and write out the rules of the system, which is then finally executed by software. Unfortunately, software (and computation generally) is extremely bad at understanding what, exactly, people want.

pages: 202 words: 62,199

Essentialism: The Disciplined Pursuit of Less
by Greg McKeown
Published 14 Apr 2014

For example, when I was still in the exploratory mode of the book, before I’d even begun to put pen to paper (or fingers to keyboard), I would share a short idea (my minimal viable product) on Twitter. If it seemed to resonate with people there, I would write a blog piece on Harvard Business Review. Through this iterative process, which required very little effort, I was able to find where there seemed to be a connection between what I was thinking and what seemed to have the highest relevancy in other people’s lives. It is the process Pixar uses on their movies. Instead of starting with a script, they start with storyboards—or what have been described as the comic book version of a movie.

pages: 217 words: 63,287

The Participation Revolution: How to Ride the Waves of Change in a Terrifyingly Turbulent World
by Neil Gibb
Published 15 Feb 2018

This “heuristic” – or trial-and-error – approach to product development is a critical characteristic of the participatory innovation mind-set. There was no hanging out in labs trying to develop the perfect product. Woodman’s lab was the waves, and his cohorts were the other surfers who clustered around him in the café as he shared his early results. It was an agile, iterative process. Like Eastman’s trip to the Dominican Republic and Land’s moment on the beach with his daughter, Woodman’s insight emerged directly from real-life experience. There was no need for focus groups or workshops on what users might need – like Simon Mottram at Rapha, Woodman had deep empathy with the needs of the user, because he was the user.

pages: 523 words: 61,179

Human + Machine: Reimagining Work in the Age of AI
by Paul R. Daugherty and H. James Wilson
Published 15 Jan 2018

The pain point might be a cumbersome, lengthy internal process (for example, an HR department taking an inordinate amount of time to fill staff positions), or it might be a frustrating, time-consuming external process (for example, customers having to file multiple forms to get a medical procedure approved by their insurer). Often, identifying such opportunities for process reimagination is an iterative process. Consider the case of a large agricultural company that was developing an AI system to help farmers improve their operations. The system would have access to an enormous amount of data from a variety of sources, including information on soil properties, historic weather data, etc. The initial plan was to build an application that would help farmers better predict their crop yields for upcoming seasons.

pages: 333 words: 64,581

Clean Agile: Back to Basics
by Robert C. Martin
Published 13 Oct 2019

Essentially, both movements want to achieve very similar things. They both want customer satisfaction, they both desire close collaboration, and they both value short feedback loops. Both want to deliver high-quality, valuable work, and both want professionalism. In order to achieve business agility, companies need not only collaborative and iterative processes, but also good engineering skills. Combining Agile and Craftsmanship is the perfect way to achieve that. Conclusion At the Snowbird meeting in 2001, Kent Beck said that Agile was about the healing of the divide between development and business. Unfortunately, as the project managers flooded into the Agile community, the developers—who had created the Agile community in the first place—felt dispossessed and undervalued.

pages: 543 words: 163,997

The Billion-Dollar Molecule
by Barry Werth

Each week the company tested dozens of new compounds to see how they bound to FKBP, whether they inhibited protein folding and whether they stopped, through any of a half-dozen mechanisms, T cells from proliferating in test tubes. There was an animal pharmacology lab where promising molecules were pumped into mice with skin grafts on their foot pads, a simple model for testing rejection. Drug development is an iterative process, a backbreaking multiyear series of assays aimed at funneling molecules upward through an evolutionary gauntlet. At the lowest level, molecules are tested for chemical affinity to a target, usually a protein. Those that are successful are examined for biochemical activity. If they’re active—if they change the functioning of a target—they go into cells.

Whether or not he needed a molecular doorstop—a group of atoms to prop open the flap region—Murcko now doubted more than ever that the effector region alone accounted for the drug’s action. The atomic configuration around the active site, as Boger had speculated, seemed also to be involved. Murcko pinned Vertex’s computer network with dozens of studies comparing the binding energy of the two complexes. He had always talked about designing drugs as an “iterative process,” a kind of smart person’s trial and error. Not that he considered himself smarter than the chemists, but by modeling different atomic configurations, then calculating their efficiency, he could predict which ones would be most potent. He could see how moving a few atoms an angstrom or two to fill an empty pocket might make a compound bind more tightly.

pages: 233 words: 67,596

Competing on Analytics: The New Science of Winning
by Thomas H. Davenport and Jeanne G. Harris
Published 6 Mar 2007

Once the pieces fall into place, it still takes time for an organization to get the large-scale results it needs to become an analytical competitor. Changing business processes and employee behaviors is always the most difficult and time-consuming part of any major organizational change. And by its nature, developing an analytical capability is an iterative process, as managers gain better insights into the dynamics of their business over time by working with data and refining analytical models. Our research and experience suggests that it takes eighteen to thirty-six months of regularly working with data to start developing a steady stream of rich insights that can be translated into practice.

pages: 265 words: 70,788

The Wide Lens: What Successful Innovators See That Others Miss
by Ron Adner
Published 1 Mar 2012

Any red light that appears on your map—whether because of a partner’s inability to deliver or unwillingness to cooperate, or due to a problem on your part—must be addressed. This can mean any number of scenarios, from managing incentives to finding a way to eliminate the troublesome link in your blueprint. Often, identifying the most promising path is an iterative process. Only once you have made the necessary adjustments can you confidently start your engines. This is not to say that seeing all green guarantees success; you will still face all the usual unknowns of the market and its vagaries. Execution is critical. But unless you have a plan to get to green across the board, expect delays and disappointment even if you deliver your own part flawlessly.

pages: 239 words: 64,812

Geek Sublime: The Beauty of Code, the Code of Beauty
by Vikram Chandra
Published 7 Nov 2013

“Of all the different types of people I’ve known, hackers and painters are among the most alike,” writes Graham. “What hackers and painters have in common is that they’re both makers. Along with composers, architects, and writers, what hackers and painters are trying to do is make good things.”1 According to Graham, the iterative processes of programming—write, debug (discover and remove bugs, which are coding errors, mistakes), rewrite, experiment, debug, rewrite—exactly duplicate the methods of artists: “The way to create something beautiful is often to make subtle tweaks to something that already exists, or to combine existing ideas in a slightly new way … You should figure out programs as you’re writing them, just as writers and painters and architects do.”2 Attention to detail further marks good hackers with artist-like passion: All those unseen details [in a Leonardo da Vinci painting] combine to produce something that’s just stunning, like a thousand barely audible voices all singing in tune.

pages: 202 words: 64,725

Designing Your Life: How to Build a Well-Lived, Joyful Life
by Bill Burnett and Dave Evans
Published 12 Sep 2016

Knowing the current status of your health / work / play / love dashboard gives you a framework and some data about yourself, all in one place. Only you know what’s good enough or not good enough—right now. After a few more chapters and a few more tools and ideas, you may want to come back to this assessment and check the dashboard one more time, to see if anything has changed. Since life design is an iterative process of prototypes and experimentation, there are lots of on ramps and off ramps along the way. If you’re beginning to think like a designer, you will recognize that life is never done. Work is never done. Play is never done. Love and health are never done. We are only done designing our lives when we die.

pages: 212 words: 68,690

Independent Diplomat: Dispatches From an Unaccountable Elite
by Carne Ross
Published 25 Apr 2007

The 2003 war is discussed in chapter 4, but I do not share the view of those who think the war was “about” oil, any more than I think French and Russian (or indeed German or anyone else’s) opposition was “about” their economic interest in the existing regime. From my experience, and I have talked to a number of senior diplomats and foreign policy-makers who share this view, only very rarely do decision-makers set down a list of their motives, objectives and “interests”. More generally, this is an unordered and iterative process where a paradigmatic view of a situation is built up and then continually reinforced until, in a process similar to the shifts in scientific views described by Thomas Kuhn,4 something dramatic happens that forces that view to change.5 Those involved in formulating and expounding the view accumulate a series of facts to justify their interpretation.

pages: 244 words: 66,977

Subscribed: Why the Subscription Model Will Be Your Company's Future - and What to Do About It
by Tien Tzuo and Gabe Weisert
Published 4 Jun 2018

At these times, the total number of accounts grew rapidly, but revenue per account stagnated or sank. Each of those periods was followed by a correctional phase when the net new accounts decreased, but the average revenue per account increased. Pricing in the Subscription Economy is a flexible, iterative process. Companies frequently experiment with a combination of set fees and usage-based models as they seek to “land and expand.” Strategies prioritizing net new account growth will frequently drive growth with competitive pricing, and then later “switch levers” and attempt to drive ARPA with usage-based billing and by upselling into larger accounts.

pages: 227 words: 63,186

An Elegant Puzzle: Systems of Engineering Management
by Will Larson
Published 19 May 2019

I’ve met many product managers who are excellent operators, but few product managers who can operate at a high degree while also getting deep into their users’ needs. Likewise, I’ve worked with many engineering managers who ground their work in their users’ needs, but I’ve known few who can fix their attention on those users when things start getting rocky within their team. Figure 3.3 Iterative process of product development. Reality isn’t always accommodating of this ideal setup. Maybe your team’s product manager leaves or a new team is being formed,7 and you, as an engineering leader, need to cover both roles for a few months. This can be exciting, and, yes this can be a time when “exciting” rhymes with “terrifying.”

pages: 200 words: 67,943

Working Identity, Updated Edition, With a New Preface: Unconventional Strategies for Reinventing Your Career
by Herminia Ibarra
Published 17 Oct 2023

Three rough categories of career-change strategies emerged almost immediately: (1) growing a side project; (2) generating job offers or temporary assignments by talking to headhunters and canvassing old friends and coworkers; and (3) taking a sabbatical or time-out from full-time work, usually to go back to school or take courses. As these categories emerged, I used the theoretical sampling approach to make sure I had enough examples of each type to afford comparison. In the next stage of the data collection and analysis, I used an iterative process of moving back and forth between the data; the relevant literature in psychology, sociology, and organizational behavior; and my emerging concepts to begin to develop more abstract conceptual categories. I compared my emerging conceptual model; data from the study; and the literature on identity, career adaptation, and professional socialization to guide decisions about what other kinds of people to interview and what other themes to develop.

pages: 728 words: 182,850

Cooking for Geeks
by Jeff Potter
Published 2 Aug 2010

However, the methodical approach is to look at A, wonder if maybe B would be better, and rework it until you have B. ("Hmm, seems a bit dull, needs a bit more zing, how about some lemon juice?") The real skill isn’t in getting to B, though: it’s in holding the memory of A in your head and judging whether B is actually an improvement. It’s an iterative process—taste, adjust, taste, adjust—with each loop either improving the dish or educating you about what guesses didn’t work out. Even the bad guesses are useful because they’ll help you build up a body of knowledge. Taste the dish. It’s your feedback mechanism both for checking if A is "good enough" and for determining if B is better than A.

If your dream is to play in a band, don’t expect to get up on stage after a day or even a month; start by picking up a basic book on learning to play the guitar and practicing somewhere you’re comfortable. A beta tester for this book commented: While there are chefs with natural-born abilities, people have to be aware that learning to cook is an iterative process. They have to learn to expect not to get it right the first time, and proceed from there, doing it again and again. What about when you fubar (foobar?) a meal and can’t figure out why? Think of it like not solving a puzzle on the first try. When starting to cook, make sure you don’t pick puzzles that are too difficult.

pages: 220 words: 73,451

Democratizing innovation
by Eric von Hippel
Published 1 Apr 2005

As illustration, I first discuss the repartitioning of the tasks involved in custom semiconductor chip development. Then, I show how the same principles can be applied in the less technical context of custom food design. Traditionally, fully customized integrated circuits were developed in an iterative process like that illustrated in figure 11.1. The process began with a user specifying the functions that the custom chip was to perform to a manufacturer of integrated circuits. The chip would then be designed by manufacturer employees, and an (expensive) prototype would be produced and sent to the user.

pages: 252 words: 74,167

Thinking Machines: The Inside Story of Artificial Intelligence and Our Race to Build the Future
by Luke Dormehl
Published 10 Aug 2016

While his team work on the machine-learning tools that will make the technology a reality, Eterni.me instead focuses on collecting the users’ data that will one day give its avatars their digital lifeblood. He doesn’t think Eterni.me’s 30,269 early adopters are going to be waiting forever, though. ‘This isn’t technology that is decades away,’ he says. ‘Building lifelike avatars is an iterative process. Think of it like search results; they’ll just get better and better, more and more accurate as time goes on.’ Yourself in Machine Form As it happens, Marius Ursache is far from the first person to consider how machines might allow humans to live on after their death. Despite only dating back a few decades, multiplayer online games have already had to grapple with what happens when a popular player or creator dies.

pages: 257 words: 76,785

Shorter: Work Better, Smarter, and Less Here's How
by Alex Soojung-Kim Pang
Published 10 Mar 2020

But as designers will tell you, it’s not enough to design and build a prototype; to see if it really works, and how it can be improved, you have to test it. 5 Test In the test phase of the design thinking process, you gather data on the performance of your prototype and use it to improve your plans and guide the next prototype. At its simplest and briefest, product design is an iterative process, with groups moving through several generations of ideas, prototypes, and testing before delivering a final product. In our world, designing the workday is a continuous, open-ended process. It never really ends. Clients change, employees come and go, new technologies emerge that can help you automate tasks or augment workers’ abilities, and they all present opportunities to improve your workday.

Hands-On Machine Learning With Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems
by Aurelien Geron
Published 14 Aug 2019

The number of rooms per household is also more informative than the total number of rooms in a district—obviously the larger the houses, the more expensive they are. This round of exploration does not have to be absolutely thorough; the point is to start off on the right foot and quickly gain insights that will help you get a first reasonably good prototype. But this is an iterative process: once you get a prototype up and running, you can analyze its output to gain more insights and come back to this exploration step. Prepare the Data for Machine Learning Algorithms It’s time to prepare the data for your Machine Learning algorithms. Instead of just doing this manually, you should write functions to do that, for several good reasons: This will allow you to reproduce these transformations easily on any dataset (e.g., the next time you get a fresh dataset).

pages: 265 words: 75,202

The Heart of Business: Leadership Principles for the Next Era of Capitalism
by Hubert Joly
Published 14 Jun 2021

When a company strives to do good things and help people, the connection between personal drive and the company’s noble purpose is easy to make. An increasing number of businesspeople agree. But how does it work in practice? How do we foster that connection and nurture it? For us at Best Buy, it took an iterative process that continues today and entails the following: Explicitly articulating the people-first philosophy Exploring what drives people around you Capturing moments that matter Sharing stories and encouraging role modeling Framing the company’s purpose in a meaningful, human, and authentic fashion Spreading meaning Explicitly Articulating the People-First Philosophy On Monday, August 20, 2012, the day that my appointment as CEO of Best Buy was announced, I addressed 500 or so directors and officers of the company gathered at headquarters.

When Free Markets Fail: Saving the Market When It Can't Save Itself (Wiley Corporate F&A)
by Scott McCleskey
Published 10 Mar 2011

This generally means adding “credit enhancement,” such as overcollateralizing the loans in the pool, or it could mean a change to the characteristics of the underlying assets. But the rating agencies aren’t supposed to advise these potential clients; otherwise, they could be construed as part of the group creating the investment and their legal departments would have a fit (liability and all that). But in this “iterative process,” as the rating agencies describe it, there is a thin line between being helpful and being part of the team. Regulatory proposals in the European Union have taken a particularly aggressive view with respect to the advisory nature of rating agencies in the structured finance process, and a federal court opinion in 2009 made the point outright that the rating agencies were in fact part of the syndicate creating the securities that were the focus of a lawsuit.10 WHAT REALLY KEEPS THE RATING AGENCIES UP AT NIGHT (AND IT IS NOT YOUR MORTGAGE) Which brings us to the issue that is arguably the biggest concern for rating agencies, if their lobbying activities are any indication: civil liability.

pages: 307 words: 17,123

Behind the cloud: the untold story of how Salesforce.com went from idea to billion-dollar company--and revolutionized an industry
by Marc Benioff and Carlye Adler
Published 19 Nov 2009

The system includes entering a country, establishing a beachhead, gaining customers, earning local references, and then making hires. Next, we seek partners, build add-ons, and grow field sales. It is a system that operates as a machine with distinct cogs that work together. The best part is that it is an iterative process that works in almost all markets; or as Doug Farber, who’s built our markets in Australia and Asia, says, the ability to “rinse and repeat” is the key to global growth. Play #81: Uphold a One-Company Attitude Across Borders The Internet was making the world more homogeneous when it came to IT needs, and the services we were selling were not affected by global boundaries.

Writing Effective Use Cases
by Alistair Cockburn
Published 30 Sep 2000

Often, they will be developed by different teams. These situations show up with shrink-wrapped software such as word processors, as illustrated above. The second situation is when you are writing additions to a locked requirements document. Susan Lilly writes, "You’re working on a project with an iterative process and multiple drops. You have baselined requirements for a drop. In a subsequent drop, you extend a baselined use case with new or additional functionality. You do not touch the baselined use case." If the base use case is not locked, then the extension is fragile: changing the base use case can mess up the condition mentioned in the extending use case.

pages: 411 words: 80,925

What's Mine Is Yours: How Collaborative Consumption Is Changing the Way We Live
by Rachel Botsman and Roo Rogers
Published 2 Jan 2010

Collaborative Consumption shows consumers that their material wants and needs do not need to be in conflict with the responsibilities of a connected citizen. The idea of happiness being epitomized by the lone shopper surrounded by stuff becomes absurd, and happiness becomes a much broader, more iterative process. Reputation Bank Account Reputation is one of the most salient areas where the push and pull between the collective good and self-interest have real impact. Reputation is a personal reward that is intimately bound up with respecting and considering the needs of others. Undeniably, almost all of us wonder and care, at least a little bit, what other people—friends, family, coworkers, and people we have just met—think about us.

pages: 345 words: 86,394

Frequently Asked Questions in Quantitative Finance
by Paul Wilmott
Published 3 Jan 2007

Expected loss The average loss once a specified threshold has been breached. Used as a measure of Value at Risk. See page 48. Finite difference A numerical method for solving differential equations wherein derivatives are approximated by differences. The differential equation thus becomes a difference equation which can be solved numerically, usually by an iterative process. Gamma The sensitivity of an option’s delta to the underlying. Therefore it is the second derivative of an option price with respect to the underlying. See page 111. GARCH Generalized Auto Regressive Conditional Heteroscedasticity, an econometric model for volatility in which the current variance depends on the previous random increments.
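The finite-difference entry above can be made concrete with a small sketch. It is not taken from the book; it assumes a toy boundary-value problem, u''(x) = -1 on [0, 1] with u(0) = u(1) = 0, and uses Jacobi iteration as one possible iterative solver. The function name `solve_poisson_1d` and its parameters are hypothetical, chosen only for illustration.

```python
def solve_poisson_1d(n=11, tol=1e-10, max_iter=10_000):
    """Solve u'' = -1 on [0, 1], u(0) = u(1) = 0, by finite differences.

    The second derivative is replaced by the central difference
    (u[i-1] - 2*u[i] + u[i+1]) / h**2, which turns the differential
    equation into the difference equation
        u[i] = (u[i-1] + u[i+1] + h**2) / 2.
    Jacobi iteration (an assumed choice of solver) repeats this update
    until successive approximations stop changing.
    """
    h = 1.0 / (n - 1)
    u = [0.0] * n  # initial guess; the two end points stay fixed at 0
    for iteration in range(1, max_iter + 1):
        new_u = u[:]
        for i in range(1, n - 1):
            new_u[i] = 0.5 * (u[i - 1] + u[i + 1] + h * h)
        change = max(abs(a - b) for a, b in zip(new_u, u))
        u = new_u
        if change < tol:
            return u, iteration
    raise RuntimeError("Jacobi iteration did not converge")


if __name__ == "__main__":
    values, steps = solve_poisson_1d()
    # The exact solution is u(x) = x * (1 - x) / 2, so u(0.5) = 0.125.
    print(f"u(0.5) ~ {values[5]:.6f} after {steps} iterations")
```

Each pass uses only the previous approximation to produce the next one, which is the sense in which the difference equation is solved "by an iterative process".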

pages: 310 words: 82,592

Never Split the Difference: Negotiating as if Your Life Depended on It
by Chris Voss and Tahl Raz
Published 3 Oct 1989

In the twenty years I spent at the Bureau we’d designed a system that had successfully resolved almost every kidnapping we applied it to. But we didn’t have grand theories. Our techniques were the products of experiential learning; they were developed by agents in the field, negotiating through crisis and sharing stories of what succeeded and what failed. It was an iterative process, not an intellectual one, as we refined the tools we used day after day. And it was urgent. Our tools had to work, because if they didn’t someone died. But why did they work? That was the question that drew me to Harvard, to that office with Mnookin and Blum. I lacked confidence outside my narrow world.

The Buddha and the Badass: The Secret Spiritual Art of Succeeding at Work
by Vishen Lakhiani
Published 14 Sep 2020

Then put them in place and see how it optimizes your time and speeds up innovation. Step 3: Stick to the rule “Don’t present, reflect.” If people are presenting well-thought-out ideas they are probably overthinking them, not collaborating well, and slowing down innovation. It’s okay to be messy when brainstorming back and forth. Innovation is often an iterative process. CHAPTER 9 UPGRADE YOUR IDENTITY Life isn’t about finding yourself. Life is about creating yourself. —George Bernard Shaw The Universe acts as a mirror. It reflects back to you what you are. The miracle of this is that you can shift your identity and the world will obey.

pages: 291 words: 85,822

The Truth About Lies: The Illusion of Honesty and the Evolution of Deceit
by Aja Raden
Published 10 May 2021

Yes, synthetics are coming—are here, actually—but there’s no competition for hearts and minds. There’s no competition for market share. There’s no real competition over anything, because they’re all competing with themselves … literally. This dustup over synthetic diamonds is little more than a dumb show; everyone’s just going through the motions of that iterative process by which new facts become truth. And if it seems as if they’re phoning it in a little, it’s because they are. It’s because they can. Diamonds are truth made manifest through mass consensus and recursive reality on such a scale that it includes the whole world. Their mythology is so fully hardened into stone that it doesn’t matter that you know it’s a lie—you’ll still agree to believe it.

pages: 280 words: 85,091

The Wisdom of Psychopaths: What Saints, Spies, and Serial Killers Can Teach Us About Success
by Kevin Dutton
Published 15 Oct 2012

(See Christoph Grüter, Cristiano Menezes, Vera L. Imperatriz-Fonseca, and Francis L. W. Ratnieks, “A Morphologically Specialized Soldier Caste Improves Colony Defense in a Neotropical Eusocial Bee,” PNAS 109, no. 4 (2012):1182–86. doi:10.1073/pnas.1113398109.) But can we actually observe this iterative process in action, this repeated unfolding of the Prisoner’s Dilemma dynamic? We are, after all, firmly in the realm of a thought experiment here. Do these abstract observations pan out in real life? The answer depends on what we mean by “real.” If in “real” we’re prepared to include the “virtual,” then it turns out we might be in luck.

pages: 249 words: 81,217

The Art of Rest: How to Find Respite in the Modern Age
by Claudia Hammond
Published 5 Dec 2019

These were often hard questions to answer, and the temptation was sometimes to make something up or slightly embellish an answer. But whenever I tried this, Hurlburt spotted it immediately. He has been doing this for forty years; forty years questioning people in detail about single moments in time. Gradually I found myself giving more honest, if dull answers. And this was how it should be. It’s an iterative process, and like everyone I got better at it. His method is called Descriptive Experience Sampling or DES and through it he has found that five particular elements turn up a lot in people’s mind wanderings. He calls them the ‘five frequent phenomena’ – visual imagery, inner speech, feelings, sensory awareness and unsymbolised thought.

pages: 321

Finding Alphas: A Quantitative Approach to Building Trading Strategies
by Igor Tulchinsky
Published 30 Sep 2019

However, a researcher can increase diversification at the batch level by varying the set of input data (for example, fundamentals, price–volume), trial functions (linear combination, time-series regression), performance testing (maximizing returns, minimizing risk), or even the search process itself (different iterative processes). Combining alphas from many batches may further reduce the correlations among the alphas and the overall portfolio risk. SENSITIVITY TESTS AND SIGNIFICANCE TESTS A good alpha signal should be insensitive to noise. Cross-validation of data from different periods, from different durations, on random subsets of data, on each sector of stocks, and so forth can be a good way to mitigate the risks of overfitting, as well as the risks of noise data: we have more confidence in the signals that are less sensitive to these input changes.

pages: 346 words: 84,111

Beautiful Solutions: A Toolbox for Liberation
by Elandria Williams, Eli Feghali, Rachel Plattus and Nathan Schneider
Published 15 Dec 2024

In Reggio-style schools, teachers behave as facilitators for learning, and a significant part of their role is focused on documenting the behavior of children and tailoring teaching styles accordingly. By documenting and thinking deeply about the progress of students, and then discussing broader observations among themselves, teachers practice an iterative process rather than an inflexible and repetitive one. A popular book about Reggio Emilia preschools, The Hundred Languages of Children, explains that teachers “need to redefine their roles so that they are more than conveyors of knowledge and supporters of children. They are creators and sustainers of relationships; they are researchers; they must learn to follow children’s time and interests; they must sometimes think ahead, be the chief protagonist, invent, prompt, design, create, be the audience and the listener, the arbiter and judge, the author and scribe, the listener and recorder.”

pages: 678 words: 216,204

The Wealth of Networks: How Social Production Transforms Markets and Freedom
by Yochai Benkler
Published 14 May 2006

The fact that power law distributions of attention to Web sites result from random distributions of interests, not from formal or practical bottlenecks that cannot be worked around, means that whenever an individual chooses to search based on some mechanism other than the simplest, thinnest belief that individuals are all equally similar and dissimilar, a different type of site will emerge as highly visible. Topical sites cluster, unsurprisingly, around topical preference groups; one site does not account for all readers irrespective of their interests. We, as individuals, also go through an iterative process of assigning a likely relevance to the judgments of others. Through this process, we limit the information overload that would threaten to swamp our capacity to know; we diversify the sources of information to which we expose ourselves; and we avoid a stifling dependence on an editor whose judgments we cannot circumvent.

The process is replicated at larger and more general clusters, to the point where positions that have been synthesized "locally" and "regionally" can reach Web-wide visibility and salience. It turns out that we are not intellectual lemmings. We do not use the freedom that the network has made possible to plunge into the abyss of incoherent babble. Instead, through iterative processes of cooperative filtering and "transmission" through the high visibility nodes, the low-end thin tail turns out to be a peer-produced filter and transmission medium for a vastly larger number of speakers than was imaginable in the mass-media model. The effects of the topology of the network are reinforced by the cultural forms of linking, e-mail lists, and the writable Web.

pages: 366 words: 94,209

Throwing Rocks at the Google Bus: How Growth Became the Enemy of Prosperity
by Douglas Rushkoff
Published 1 Mar 2016

It’s no longer a marketplace driven directly by supply and demand, business conditions, or commodity prices. Rather, prices, flows, and volatility are determined by the trading going on among all the algorithms. Each algorithm is a feedback loop, taking an action, observing the resulting conditions, and taking another action after that. Again, and again, and again. It’s an iterative process, in which the algorithms adjust themselves and their activity on every loop, responding less to the news on the ground than to one another. Such systems go out of control because the feedback of their own activity has become louder than the original signal. It’s like when a performer puts a microphone too close to an amplified speaker.

pages: 290 words: 87,549

The Airbnb Story: How Three Ordinary Guys Disrupted an Industry, Made Billions...and Created Plenty of Controversy
by Leigh Gallagher
Published 14 Feb 2017

“You might still call it a trip, you might still call it travel, but it’s going to make all the travel you knew before that look very, very different.” In many ways, this plan is a logical extension of the company’s core business. It doubles down on its focus on “living like a local,” the “anti-Frommer’s” approach to tourism that Airbnb has homed in on in the past few years. During the iteration process, the company plucked a tourist named Ricardo out of Fisherman’s Wharf and followed him with a photographer for a few days, documenting him at Alcatraz Island, trying to gaze through a fogged-out view of the Golden Gate Bridge, and eating at Bubba Gump Shrimp Company. Airbnb tallied up his receipts and found he spent most of his money on chain franchises based in other cities.

pages: 398 words: 86,855

Bad Data Handbook
by Q. Ethan McCallum
Published 14 Nov 2012

This was one of the areas where our data contest approach led to quite a different experience than you’d find with a more traditionally outsourced project. All of the previous preparation steps had created an extremely well-defined problem for the contestants to tackle, and on our end we couldn’t update any data or change the rules part-way through. Working with a consultant or part-time employee is a much more iterative process, because you can revise your requirements and inputs as you go. Because those changes are often costly in terms of time and resources, up-front preparation is still extremely effective. Several teams apparently tried to use external data sources to help improve their results, without much success.

The Fractalist
by Benoit Mandelbrot
Published 30 Oct 2012

These cartoons are scaling, have dependent jumps and long tails, and can be fine-tuned easily. Much of Benoit’s work was based on a simple idea—scaling, iteration, and dimension—applied with great finesse in new settings. By far, the biggest surprise is the Mandelbrot set. In class we set up the simple formula and describe the iteration process and how to color-code the result. Then we run the program and wait for the shock to spread across the room. “This formula produces that picture??? Are you kidding me?” “Just wait, you haven’t seen anything yet. Let’s magnify a bit and see what we find.” “You mean those complicated twirls and swirls still come from the same little formula?”
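The excerpt above does not spell out the formula, so the following is only a sketch of the standard escape-time rendering of the Mandelbrot set: iterate z → z² + c from z = 0 and record how many steps it takes |z| to exceed 2. The function names, the viewing window, and the character palette (standing in for color-coding) are all assumptions made for illustration.

```python
def escape_time(c: complex, max_iter: int = 50) -> int:
    """Return the step at which |z| first exceeds 2 under z -> z*z + c.

    Points that never escape within max_iter steps are treated as
    belonging to the set and return max_iter.
    """
    z = 0j
    for n in range(max_iter):
        if abs(z) > 2.0:
            return n
        z = z * z + c
    return max_iter


def render(width: int = 60, height: int = 24, max_iter: int = 50) -> list[str]:
    """Color-code escape times as characters (a stand-in for real colors)."""
    palette = " .:-=+*#%@"  # hypothetical palette: later escape -> denser glyph
    rows = []
    for j in range(height):
        y = 1.2 - 2.4 * j / (height - 1)
        row = ""
        for i in range(width):
            x = -2.1 + 3.0 * i / (width - 1)
            n = escape_time(complex(x, y), max_iter)
            row += palette[(n * (len(palette) - 1)) // max_iter]
        rows.append(row)
    return rows


if __name__ == "__main__":
    print("\n".join(render()))
```

Magnifying the picture amounts to shrinking the x and y window in `render` while keeping the same iteration rule, which is why the detail keeps coming "from the same little formula".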

pages: 292 words: 94,324

How Doctors Think
by Jerome Groopman
Published 15 Jan 2007

"I believe that my thinking in the clinic is helped by having a laboratory. If you do an experiment two times and you don't get results, then it doesn't make sense to do it the same way a third time. You have to ask yourself: What am I missing? How should I do it differently the next time? It is the same iterative process in the clinic. If you are taking care of someone and he is not getting better, then you have to think of a new way to treat him, not just keep giving him the same therapy. You also have to wonder whether you are missing something." This seemingly obvious set of statements is actually a profound realization, because it is much easier both psychologically and logistically for a doctor to keep treating a serious disease with a familiar therapy even when the disease is not responding.

Data Mining the Web: Uncovering Patterns in Web Content, Structure, and Usage
by Zdravko Markov and Daniel T. Larose
Published 5 Apr 2007

Thus, the following weighted mean and standard deviation are computed: $\mu_C = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}$ and $\sigma_C^2 = \frac{\sum_{i=1}^{n} w_i (x_i - \mu_C)^2}{\sum_{i=1}^{n} w_i}$. Note that the sums go for all values, not only for those belonging to the corresponding cluster. Thus, given a sample size n, we have an n-component weight vector for each cluster. The iterative process is similar to that of k-means; the data points are redistributed among clusters repeatedly until the process reaches a fixpoint. The k-means algorithm stops when the cluster membership does not change from one iteration to the next. k-Means uses “hard” cluster assignment, however, whereas the EM uses “soft” assignment—probability of membership. (Footnote 2: In fact, there exist versions of k-means with soft assignment, which are special cases of EM.)
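As a rough illustration of the weighted statistics described above, the sketch below computes one cluster's weighted mean and standard deviation from an n-component weight vector. The function name `weighted_mean_std` is hypothetical, and the sketch covers only this update step, not a full EM implementation.

```python
import math


def weighted_mean_std(xs, ws):
    """Weighted mean and standard deviation for one cluster.

    xs: all n data values (not only those assigned to the cluster).
    ws: the cluster's n weights, e.g. membership probabilities in EM.
    """
    total = sum(ws)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    mean = sum(w * x for w, x in zip(ws, xs)) / total
    var = sum(w * (x - mean) ** 2 for w, x in zip(ws, xs)) / total
    return mean, math.sqrt(var)


if __name__ == "__main__":
    data = [1.0, 3.0, 100.0]
    # "Soft" assignment: every point contributes, scaled by its weight.
    print(weighted_mean_std(data, [0.9, 0.8, 0.05]))
    # "Hard" assignment is the special case where weights are 0 or 1.
    print(weighted_mean_std(data, [1, 1, 0]))
```

With 0/1 weights the result reduces to the ordinary mean and standard deviation of the points assigned to the cluster, which is one way to see the kinship between k-means and EM that the passage points out.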

pages: 315 words: 93,522

How Music Got Free: The End of an Industry, the Turn of the Century, and the Patient Zero of Piracy
by Stephen Witt
Published 15 Jun 2015

A NOTE ON SOURCES A private detective once explained to me the essence of the investigative method: “You start with a document. Then you take that document to a person, and ask them about it. Then that person tells you about another document. You repeat this process until you run out of people, or documents.” Starting with the Affinity e-zine interview quoted in this book, and following this iterative process for the next four years, I ended up with dozens of people and tens of thousands of documents. A comprehensive catalog would take pages—below is a selection. The key interview subjects for this book were Karlheinz Brandenburg, Robert Buchanan, Brad Buckles, Leonardo Chiariglione, Ernst Eberlein, Keith P.

pages: 336 words: 93,672

The Future of the Brain: Essays by the World's Leading Neuroscientists
by Gary Marcus and Jeremy Freeman
Published 1 Nov 2014

Even if it is clear which kinds of measurements we want to make (for example, whole-brain calcium imaging of the larval zebrafish, two-photon imaging of multiple areas of mouse cortex), it is not clear which behaviors the organism should be performing while we collect those data, or which environment it should be experiencing. It is hard to imagine a single dataset, however massive, from which the truths we seek will emerge with only the right analysis, especially when we consider the nearly infinite set of alternative experiments we might have performed. Instead, we need an iterative process by which we move back and forth between using analytic tools to identify patterns in data and using the recovered patterns to inform and guide the next set of experiments. After many iterations, the patterns we identify may coalesce into rules and themes, perhaps even themes that extend across different systems and modalities.

pages: 297 words: 90,806

Blood, Sweat, and Pixels: The Triumphant, Turbulent Stories Behind How Video Games Are Made
by Jason Schreier
Published 4 Sep 2017

Each index card contained a story beat or scene idea—one midgame sequence, for example, was just called “epic chase”—and taken together, they told the game’s entire narrative. “One thing we’ve never done here is sat down and written down an entire script for the whole game start to front,” said Josh Scherr, a writer who sat with Straley and Druckmann for many of these meetings. “That never happens. And the reason it doesn’t happen is because game design is an iterative process, and if you do that you’re just asking for heartbreak when things inevitably change because the gameplay doesn’t work out the way you expected it to, or you have a better idea further down the line, or anything like that. You have to be able to be flexible.” Over the next few weeks, Druckmann and Straley put together a two-hour presentation that outlined their vision for Uncharted 4, then showed it to the rest of Naughty Dog.

pages: 304 words: 95,306

Duty of Care: One NHS Doctor's Story of the Covid-19 Crisis
by Dr Dominic Pimenta
Published 2 Sep 2020

We must therefore buy time with aggressive containment measures, including banning mass gatherings, closing the schools, isolating the vulnerable and restricting travel. With an incubation period of two weeks, any action we take now won’t have an effect for a fortnight. We need to act today – better yet, two weeks ago. It’s an iterative process, but we get there in the end. Dilsan posts it on her groups and I ask my groups to share and sign it – in an hour, we have nearly 500 NHS staff signatures. We order some food, and over dinner, I check the news for that day. Many groups are now pulling in the same direction, and behavioural scientists have called out the lack of a decision to lock down as being based on “flawed science”.

pages: 285 words: 86,853

What Algorithms Want: Imagination in the Age of Computing
by Ed Finn
Published 10 Mar 2017

We end up admiring not just the artistic concept of the volume but the piecework poems themselves. In this way, Of the Subcontract performs an algorithmic critique of Mechanical Turk, relying on the system itself to process or run that critique. Our work as readers, then, is to examine the iterative process of commissioning and assembling these poems. The implementation, the ways that the work of these anonymous turkers was collected and contextualized, is an integral part of the whole poetic mechanism. Experimental poet Darren Wershler strikes at the heart of this critical tangle in his afterword, which is worth quoting at length: We have also read essays explaining that the Turk is in fact an elegant metaphor for the precarious condition of the worker in a globalized and networked milieu.

pages: 287 words: 95,152

The Dawn of Eurasia: On the Trail of the New World Order
by Bruno Macaes
Published 25 Jan 2018

But the faith in an endless power to transform reality can now be found on every corner of the planet. The process has a certain negative character: the attempt is made to free oneself from the existing model only to realize this model has been replaced by a broader but still limited set of possibilities, which in turn need to be overcome, and so on in an iterative process. More importantly, perhaps, each society has its own modernization path. Each society starts from a traditional model and creates new abstractions from that starting point. As the whole world becomes modern, we should expect different or multiple modernities to develop, rather than the cultural programme of modernity as it developed in Europe to become universal.

pages: 302 words: 90,215

Experience on Demand: What Virtual Reality Is, How It Works, and What It Can Do
by Jeremy Bailenson
Published 30 Jan 2018

Barbara had covered the storm, and done a lot of documentary work, so she was well acquainted with the kind of details that could bring this terrible scenario to life. Not to mention that the physical tracking space in the lab at Stanford was about the size of an actual rooftop, so the experience would really leverage the space. What followed was an iterative process: we’d pore over Barbara’s notes and videos, program the visual scene and interactivity to match, and then repeat the process, often bringing in new journalists to give us feedback and then going back to the drawing board. The project culminated in a large “opening” where a number of prominent visitors were able to experience Katrina.

How to Stand Up to a Dictator
by Maria Ressa
Published 19 Oct 2022

When he accepts 1 percent disinformation on his site, it’s like saying it’s okay50 to have 1 percent virus in a population unchecked. Both can take over, and if not eradicated, they can ultimately kill. I have tried to understand how Zuckerberg could come to those decisions, and the best that I can see is that it’s baked into the iterative process of software development. When you build technology products, there’s a prioritization process. Like when building a house, you have to break it down into the elements: nails, cement, tools, wood. Then you build in phases, what tech calls “agile development,” a breakdown into tasks that allows quick shifts depending on what has been accomplished.

pages: 344 words: 96,020

Hacking Growth: How Today's Fastest-Growing Companies Drive Breakout Success
by Sean Ellis and Morgan Brown
Published 24 Apr 2017

It wasn’t the immaculate conception of a world-changing product nor any single insight, lucky break, or stroke of genius that rocketed these companies to success. In reality, their success was driven by the methodical, rapid-fire generation and testing of new ideas for product development and marketing, and the use of data on user behavior to find the winning ideas that drove growth. If this iterative process sounds familiar, it’s likely because you’ve encountered a similar approach in agile software development or the Lean Startup methodology. What those two approaches have done for new business models and product development, respectively, growth hacking does for customer acquisition, retention, and revenue growth.

Beautiful Visualization
by Julie Steele
Published 20 Apr 2010

Make It Informative As I mentioned earlier, a visualization must be informative and useful to be successful. There are two main areas to consider to ensure that what is created is useful: the intended message and the context of use. Considering and integrating insight from these areas is usually an iterative process, involving going back and forth between them as the design evolves. Conventions should also be taken into consideration, to support the accessibility of the design (careful use of certain conventions allows users to assume some things about the data—such as the use of the colors red and blue in visuals about American politics).

pages: 364 words: 102,528

An Economist Gets Lunch: New Rules for Everyday Foodies
by Tyler Cowen
Published 11 Apr 2012

They are a constant challenge as to whether I have mastered various codes of Indian cooking and their lack of detail gives me room to improvise, learn, and make mistakes. Every now and then I go back to the more thorough cookbooks (another is 1,000 Indian Recipes by Neelam Batra) to learn new recipes and techniques, and then I do a batch more Indian cooking from the shorter guides. It’s an iterative process where I step back and forth between a food world where I am told what to do and a food world where I am immersed in the implicit codes of meaning and contributing to innovation within established structures. Some of your cookbooks, or more broadly your recipe sources, should have very short recipes for use in this manner.

pages: 398 words: 100,679

The Knowledge: How to Rebuild Our World From Scratch
by Lewis Dartnell
Published 15 Apr 2014

This gives the culture more nutrients to reproduce and continually doubles the size of the microbial territory to expand into. After about a week, once you have a healthy-smelling culture reliably growing and frothing after every replenishment, like a microbial pet thriving on the feed left in its bowl, you are ready to extract some of the dough and bake bread. By running through this iterative process you have essentially created a rudimentary microbiological selection protocol—narrowing down to wild strains that can grow on the starch nutrients in the flour with the fastest cell division rates at a temperature of around 20°–30°C. Your resultant sourdough is not a pure culture of a single isolate, but actually a balanced community of lactobacillus bacteria, able to break down the complex storage molecules of the grain, and yeast living on the byproducts of the lactobacilli and releasing carbon dioxide gas to leaven the bread.

pages: 323 words: 95,939

Present Shock: When Everything Happens Now
by Douglas Rushkoff
Published 21 Mar 2013

The digital can be stacked; the human gets to live in real time. This experience is what makes us creative, intelligent, and capable of learning. As science and innovation writer Steven Johnson has shown, great ideas don’t really come out of sudden eureka moments, but after long, steady slogs through problems.31 They are slow, iterative processes. Great ideas, as Johnson explained it to a TED audience, “fade into view over long periods of time.” For instance, Charles Darwin described his discovery of evolution as a eureka moment that occurred while he was reading Malthus on a particular night in October of 1838. But Darwin’s notebooks reveal that he had the entire theory of evolution long before this moment; he simply hadn’t fully articulated it yet.

pages: 372 words: 101,678

Lessons from the Titans: What Companies in the New Economy Can Learn from the Great Industrial Giants to Drive Sustainable Success
by Scott Davis , Carter Copeland and Rob Wertheimer
Published 13 Jul 2020

As of 2019 more than a hundred thousand people worked at CAT, and the stock market valued the company at $80 billion—but those numbers could have been much higher. The company just recently came out of a two-decade-long stretch with painful periods of poor performance and destructive acquisitions. Great industrials have endured through continuous improvement—a slow, iterative process with increasingly huge benefits as time goes by—the benefits of which are locked in as competence becomes ingrained across the employee base. Caterpillar’s experience illustrates the depth of the problems that can arise when a systematic culture and operating system isn’t present in a large organization.

pages: 367 words: 97,136

Beyond Diversification: What Every Investor Needs to Know About Asset Allocation
by Sebastien Page
Published 4 Nov 2020

Some of the objectives and constraints are “behavioral” in nature in that they don’t fit neatly within classical utility functions. Often, they represent plan sponsor needs. Also, our portfolio managers and researchers use judgment and experience in combination with data and models. They adjust portfolio weights on the margin to account for the uncertainty in the models and specific market needs. It’s an iterative process.10 Separately, we also optimize allocations to sub–asset classes and strategies. This second optimization process is how we “fill the buckets” within stocks and bonds. To do so, we use a variety of portfolio optimization techniques, scenario analyses, and, again, judgment and experience. We’ll discuss portfolio optimization in more detail shortly.

pages: 550 words: 89,316

The Sum of Small Things: A Theory of the Aspirational Class
by Elizabeth Currid-Halkett
Published 14 May 2017

Nonmonetary, nonfunctional goods may cost the same in price, but the cost of information, rather than the actual cost of the good, is what creates the barrier. This appropriation of value onto nonpecuniary behaviors is what the Columbia University sociologist Shamus Khan calls a “learned form of capital”; in other words, knowledge about ways of doing things becomes internalized, and acquiring it is an iterative process that in itself becomes valuable. In his study of Concord’s elite school, St. Paul’s, Khan argues that subtle forms of class assimilation, or “hidden curriculum,” are in many ways more reinforcing of social position than more ostentatious and material symbols. The boys Khan interviewed discussed the hard work it took to gain entry into St.

Risk Management in Trading
by Davis Edwards
Published 10 Jul 2014

For example, the interest rate on a 10‐year loan might be higher than the interest rate needed for a 3‐year loan. In the same way, a BAA‐rated bond might have an interest rate that is 1 percent over the one‐year risk‐free rate, 2 percent over the five‐year risk‐free rate, and 3 percent over the 10‐year rate. Using these points, it is possible to construct a forward curve by using an iterative process—first calculating the default rates associated with the soonest‐to‐mature bonds and then calculating the later rates. When interest rates are calculated using interest rates or CDS spreads, they will be calculated from the valuation date to some point in the future. To construct a forward curve, it is necessary to use the probability of default between the valuation date (t0) and the dates in the future (t1, t2) to calculate the probability of default between t1 and t2.
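The final step of that excerpt can be sketched numerically. The Python below is a minimal illustration, not the book's own method: it assumes the cumulative default probabilities from the valuation date t0 to each future date are already known, and it reads "probability of default between t1 and t2" as the probability conditional on surviving to t1. The function name and the sample numbers are hypothetical.

```python
def forward_default_probabilities(cumulative):
    """Convert cumulative default probabilities (measured from t0) into
    forward probabilities for each successive interval, conditional on
    having survived to the start of that interval."""
    forwards = []
    previous = 0.0  # cumulative default probability at t0 is zero
    for current in cumulative:
        if not previous <= current < 1.0:
            raise ValueError("cumulative probabilities must be non-decreasing and below 1")
        survival_to_start = 1.0 - previous
        # P(default in interval | survived to its start)
        forwards.append((current - previous) / survival_to_start)
        previous = current
    return forwards


# Hypothetical cumulative default probabilities at t1, t2, t3.
example_cumulative = [0.02, 0.05, 0.10]
example_forwards = forward_default_probabilities(example_cumulative)
```

Working from the nearest date outward mirrors the iterative order described in the excerpt: each interval's forward probability depends only on the cumulative figures at its two endpoints.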

pages: 268 words: 109,447

The Cultural Logic of Computation
by David Golumbia
Published 31 Mar 2009

While the rhetoric of CRM often focuses on “meeting customer needs,” the tools themselves are constructed so as to manage human behavior often against the customer’s own interest and in favor of statistically-developed corporate goals that are implemented at a much higher level of abstraction than the individual: CRM analytics can perform predictive modeling of customer behavior, customer segmentation grouping, profitability analysis, what-if scenarios, and campaign management, as well as personalization and event monitoring. These functions take advantage of a customer’s interactions with a business in realtime. CRM is a highly iterative process. When data from any source is harvested and fed back into the system, it improves the personalization capability of the very next customer transaction or e-mail campaign. More traditional marketing processes such as direct marketing often see months of lag time between a campaign’s execution and the review of its metrics.

pages: 723 words: 98,951

Down the Tube: The Battle for London's Underground
by Christian Wolmar
Published 1 Jan 2002

His explanation of the PPP makes it sound like a journey, an exceedingly long one, and is peppered with expressions like ‘the view we took’ and ‘in order to do that’.* Every solution involves opening up new issues which, in turn, must be solved. It was what the management consultants call an ‘iterative process’, one which started as a relatively simple idea and which built up gradually as the need to cover more and more eventualities became apparent. The best metaphor is probably that favourite school science experiment, the copper sulphate grain left hanging in a strong solution which grows into a complex crystalline structure as the water evaporates.

pages: 420 words: 100,811

We Are Data: Algorithms and the Making of Our Digital Selves
by John Cheney-Lippold
Published 1 May 2017

Halberstam writes of gender as “learned, imitative behavior that can be processed so well that it comes to look natural,” we see how an identity like gender is dependent on reflexive disciplinarity.76 The processed output of gender in Halberstam’s formula is an output that must immediately return to the individual subject, an iterative process that continually cycles back to some perceived naturality of gendered performance. Philosopher Judith Butler’s own interpretation of gender follows closely to Halberstam’s: “gender is a contemporary way of organising past and future cultural norms, a way of situating oneself in and through those norms, an active style of living one’s body in the world.”77 Indeed, we can step back to consider that most categories of identity need to directly interface with their subject in order to have the desired normalizing effect.

pages: 302 words: 100,493

Working Backwards: Insights, Stories, and Secrets From Inside Amazon
by Colin Bryar and Bill Carr
Published 9 Feb 2021

Rather than focusing on the sheer number of items added, they could instead add the items that would make the biggest impact on sales. Sounds simple, but with the wrong input metrics or an input metric that is too crude, your efforts may not be rewarded with an improvement in your output metrics. The right input metrics get the entire organization focused on the things that matter most. Finding exactly the right one is an iterative process that needs to happen with every input metric. Note: Most of the examples we give in this chapter are of large companies with substantial resources. But DMAIC and the WBR process are eminently scalable. Your level of investment should be on par with the resources you have. If you are a nonprofit, figure out a modest number of key metrics that reliably show how well you are doing.

pages: 411 words: 108,119

The Irrational Economist: Making Decisions in a Dangerous World
by Erwann Michel-Kerjan and Paul Slovic
Published 5 Jan 2010

Figure 6.2 provides a snapshot of the quality of the decision at a given point in time, in order to judge whether more work is needed. The practical challenge is to select decision-making methods that move the cursor in each link to the right efficiently, while periodically taking stock of the overall profile, without overshooting the optimal target. This is essentially a heuristic and iterative process, guided by intuition and decision coaching, in order to find the optimal position for each link. It is important to recognize that in the rational components of the model lie judgments and values that are behaviorally rooted and, thus, that deep biases may never fully surface or be completely eliminated.

pages: 446 words: 102,421

Network Security Through Data Analysis: Building Situational Awareness
by Michael S Collins
Published 23 Feb 2014

It enables you to move from simply reacting to signatures to continuous audit and protection. It provides you with baselines and an efficient anomaly detection strategy, it identifies critical assets, and it provides you with contextual information to speed up the process of filtering alerts.
Creating an Initial Network Inventory and Map
Network mapping is an iterative process that combines technical analysis and interviews with site administrators. The theory behind this process is that any inventory generated by design is inaccurate to some degree, but accurate enough to begin the process of instrumentation and analysis. Acquiring this inventory begins with identifying the personnel responsible for managing the network.

pages: 378 words: 110,408

Peak: Secrets From the New Science of Expertise
by Anders Ericsson and Robert Pool
Published 4 Apr 2016

Once you get to the edge of your field, you may not know exactly where you’re headed, but you know the general direction, and you have spent a good deal of your life building this ladder, so you have a good sense of what it takes to add on one more step. Researchers who study how the creative geniuses in any field—science, art, music, sports, and so on—come up with their innovations have found that it is always a long, slow, iterative process. Sometimes these pathbreakers know what they want to do but don’t know how to do it—like a painter trying to create a particular effect in the eye of the viewer—so they explore various approaches to find one that works. And sometimes they don’t know exactly where they’re going, but they recognize a problem that needs a solution or a situation that needs improving—like mathematicians trying to prove an intractable theorem—and again they try different things, guided by what has worked in the past.

pages: 502 words: 107,510

Natural Language Annotation for Machine Learning
by James Pustejovsky and Amber Stubbs
Published 14 Oct 2012

In particular, we will look at:
What makes a good annotation goal
Where to find related research
How your dataset reflects your annotation goals
Preparing the data for annotators to use
How much data you will need for your task
What you should be able to take away from this chapter is a clear answer to the questions “What am I trying to do?”, “How am I trying to do it?”, and “Which resources best fit my needs?”. As you progress through the MATTER cycle, the answers to these questions will probably change—corpus creation is an iterative process—but having a stated goal will help keep you from getting off track.
Defining Your Goal
In terms of the MATTER cycle, at this point we’re right at the start of “M”—being able to clearly explain what you hope to accomplish with your corpus is the first step in creating your model. While you probably already have a good idea about what you want to do, in this section we’ll give you some pointers on how to create a goal definition that is useful and will help keep you focused in the later stages of the MATTER cycle.

pages: 398 words: 108,889

The Paypal Wars: Battles With Ebay, the Media, the Mafia, and the Rest of Planet Earth
by Eric M. Jackson
Published 15 Jan 2004

Neither side is asking the right questions regarding the pressing needs of the day. “In our own way, at PayPal this is what we’ve been doing all along. We’ve been creating a system that enables global commerce for everyone. And we’ve been fighting the people who would do us and our users harm. It’s been a gradual, iterative process, and we’ve gotten plenty of stuff wrong along the way, but we’ve kept moving in the right direction to address these major issues while the rest of the world has been ignoring them. “And so I’d like to send a message back to planet Earth from Palo Alto. Life is good here in Palo Alto. We’ve been able to improve on many of the ways you do things.

pages: 325 words: 110,330

Creativity, Inc.: Overcoming the Unseen Forces That Stand in the Way of True Inspiration
by Ed Catmull and Amy Wallace
Published 23 Jul 2009

Think about how off-putting a movie about rats preparing food could be, or how risky it must’ve seemed to start WALL-E with 39 dialogue-free minutes. We dare to attempt these stories, but we don’t get them right on the first pass. And this is as it should be. Creativity has to start somewhere, and we are true believers in the power of bracing, candid feedback and the iterative process—reworking, reworking, and reworking again, until a flawed story finds its throughline or a hollow character finds its soul. As I’ve discussed, first we draw storyboards of the script and then edit them together with temporary voices and music to make a crude mock-up of the film, known as reels.

pages: 484 words: 104,873

Rise of the Robots: Technology and the Threat of a Jobless Future
by Martin Ford
Published 4 May 2015

It’s generally accepted by AI researchers that such a system would eventually be driven to direct its intelligence inward. It would focus its efforts on improving its own design, rewriting its software, or perhaps using evolutionary programming techniques to create, test, and optimize enhancements to its design. This would lead to an iterative process of “recursive improvement.” With each revision, the system would become smarter and more capable. As the cycle accelerated, the ultimate result would be an “intelligence explosion”—quite possibly culminating in a machine thousands or even millions of times smarter than any human being. As Hawking and his collaborators put it, it “would be the biggest event in human history.”

pages: 489 words: 106,008

Risk: A User's Guide
by Stanley McChrystal and Anna Butrico
Published 4 Oct 2021

The lesson is clear. A shared narrative allows us to unite around a common purpose that is required to undertake effective action. What is true for governments is true for organizations: we must take care in defining the problem, naming the issues and complexities, and crafting solutions. This will be an iterative process. Narratives are not static—quite the opposite. Teams will constantly have to alter their understandings of the world, sharpen their own stories, and calibrate their norms and expectations as conditions shift. Iteration, coupled with a constant commitment to shared values and goals, is vital—and what defines a strong Risk Immune System.

pages: 571 words: 106,255

The Bitcoin Standard: The Decentralized Alternative to Central Banking
by Saifedean Ammous
Published 23 Mar 2018

To try to commit fraudulent transactions to the Bitcoin ledger is to deliberately waste resources on solving the proof‐of‐work only to watch nodes reject it at almost no cost, thereby withholding the block reward from the miner. As time goes by, it becomes increasingly difficult to alter the record, as the energy needed is larger than the energy already expended, which only grows with time. This highly complex iterative process has grown to require vast quantities of processing power and electricity but produces a ledger of ownership and transactions that is beyond dispute, without having to rely on the trustworthiness of any single third party. Bitcoin is built on 100% verification and 0% trust.4 Bitcoin's shared ledger can be likened to the Rai stones of Yap Island discussed in Chapter 2, in that the money does not actually move for transactions to take place.

pages: 430 words: 107,765

The Quantum Magician
by Derek Künsken
Published 1 Oct 2018

Belisarius turned to Marie after switching off the link. “Is there no way we could channel your creativity in other directions?” he asked. Chapter Twenty-Nine THE AIS OF the Anglo-Spanish Banks were wholly artificial things, electronically grown and printed on inorganic templates by iterative processes that mimicked embryonic stages. And if reports were to be believed, some of them achieved a limited but highly functional sentience. The Congregate did not pursue true artificial intelligence. The Scarecrow and other mobile AIs were constructed out of a kind of petrification process of living brains.

pages: 362 words: 108,359

The Accidental Investment Banker: Inside the Decade That Transformed Wall Street
by Jonathan A. Knee
Published 31 Jul 2006

I gave his supervisor my assessment of the banker and asked him whether he wanted me to fill out a voluntary review form. As it turned out, the supervisor wanted the banker promoted and preferred that I keep my views to myself. Because class ranking and compensation are determined through an iterative process in which that supervisor would have a role, if I had nonetheless submitted a voluntary review on his banker, it would have undoubtedly cost me or someone who worked for me money. This was just how the game was played. Because the majority of my coverage list and revenues were associated with companies that Morgan had never done business with before—revenue sheets at Morgan for a time actually separately categorized “first-time business”—I was generally treated quite well in this process.

pages: 321 words: 105,480

Filterworld: How Algorithms Flattened Culture
by Kyle Chayka
Published 15 Jan 2024

The TikTok app reveals to creators at which point in a video users tune out and flip to the next video. If viewers were skipping at nineteen seconds, Kabvina would go back and examine the underperforming section, and then try to avoid its problems in the next video. Such specific data allowed him to optimize for engagement at every moment. Kabvina liked the granular feedback and the iterative process to improve his work, perhaps a holdover from his math background. “I see creators get frustrated with the algorithm; they’re assuming something’s wrong,” he said. “It’s a lot easier to blame the algorithm than to try to say, ‘My content isn’t that good.’ ” For independent creators, the algorithm takes the place of bosses and performance reviews; it’s a real-time authority gauging your success at adapting to its definition of compelling content, which is always shifting.

pages: 321 words: 113,564

AI in Museums: Reflections, Perspectives and Applications
by Sonja Thiel and Johannes C. Bernhardt
Published 31 Dec 2023

For example, in our model, the keywords ‘religion’ and ‘bible’ have a cosine similarity of 0.45317942, while the keywords ‘forest’ and ‘digitization’, by contrast, have a similarity of -0.008358647.
Selection Process
To emphasize the exploratory nature of our algorithmic approach, we opted to initiate the selection algorithm with just one keyword and let the iterative process evolve from there (see fig. 2).
Figure 2: Flowchart of the iterative selection process.
This simple algorithm starts with a manually chosen initial keyword node k0 and iteratively adds relevant keyword nodes to the keyword selection KS based on their edge weights (sum of edge weights to all nodes in KS).
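The selection loop described in that excerpt can be sketched in a few lines of Python. This is a hedged illustration, not the authors' implementation: it assumes that edge weights are the cosine similarities between keyword vectors, that the candidate with the largest summed weight to the current selection KS is added at each step, and that the loop stops at a fixed selection size. The toy vectors, the function names, and the stopping rule are all hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def select_keywords(vectors, k0, size):
    """Grow a keyword selection from the seed k0, each time adding the
    candidate whose summed similarity to the current selection is largest."""
    selected = [k0]
    candidates = set(vectors) - {k0}
    while candidates and len(selected) < size:
        def score(keyword):
            return sum(cosine_similarity(vectors[keyword], vectors[s]) for s in selected)
        # sorted() makes tie-breaking deterministic
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Invented three-dimensional vectors, for illustration only.
toy_vectors = {
    "religion": [1.0, 0.1, 0.0],
    "bible": [0.9, 0.3, 0.0],
    "church": [0.8, 0.0, 0.2],
    "forest": [0.0, 1.0, 0.1],
    "digitization": [0.0, 0.0, 1.0],
}
selection = select_keywords(toy_vectors, "religion", 3)
```

Because each candidate is scored against the whole selection rather than only the seed, later picks drift toward whatever cluster the early picks establish, which fits the exploratory intent the excerpt describes.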

pages: 397 words: 110,130

Smarter Than You Think: How Technology Is Changing Our Minds for the Better
by Clive Thompson
Published 11 Sep 2013

Chess-playing software could show you how an artificial opponent would respond to any move. This dramatically increased the pace at which young chess players built up intuition. If you were sitting at lunch and had an idea for a bold new opening move, you could instantly find out which historic players had tried it, then war-game it yourself by playing against software. The iterative process of thought experiments—“If I did this, then what would happen?”—sped up exponentially. Chess itself began to evolve. “Players became more creative and daring,” as Frederic Friedel, the publisher of the first popular chess databases and software, tells me. Before computers, grand masters would stick to lines of attack they’d long studied and honed.

pages: 523 words: 112,185

Doing Data Science: Straight Talk From the Frontline
by Cathy O'Neil and Rachel Schutt
Published 8 Oct 2013

The degrees themselves aren’t giving us a real understanding of how interconnected a given node is, though, so in the next iteration, add the degrees of all the neighbors of a given node, again scaled. Keep iterating on this, adding degrees of neighbors one further step out each time. In the limit as this iterative process goes on forever, we’ll get the eigenvalue centrality vector.
A First Example of Random Graphs: The Erdos-Renyi Model
Let’s work out a simple example where a network can be viewed as a single realization of an underlying stochastic process. Namely, where the existence of a given edge follows a probability distribution, and all the edges are considered independently.
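The iteration described in the first paragraph of that excerpt is what is usually called power iteration, and its limit is more commonly known as eigenvector centrality. The Python below is a minimal sketch under stated assumptions: the graph is undirected, connected, and not bipartite (so the iteration settles), scores are rescaled so the largest is 1, and the four-node example graph and function name are hypothetical.

```python
def eigenvector_centrality(adjacency, max_iterations=1000, tolerance=1e-10):
    """Power iteration on an undirected graph given as {node: [neighbors]}.
    Starting from all ones, the first pass yields the scaled degrees."""
    scores = {node: 1.0 for node in adjacency}
    for _ in range(max_iterations):
        # Each node collects the current scores of its neighbors.
        summed = {node: sum(scores[nb] for nb in adjacency[node]) for node in adjacency}
        largest = max(summed.values())
        if largest == 0:
            raise ValueError("graph has no edges")
        rescaled = {node: value / largest for node, value in summed.items()}
        change = max(abs(rescaled[node] - scores[node]) for node in adjacency)
        scores = rescaled
        if change < tolerance:
            break
    return scores


# Hypothetical graph: a triangle a-b-c with an extra node d hanging off a.
example_graph = {
    "a": ["b", "c", "d"],
    "b": ["a", "c"],
    "c": ["a", "b"],
    "d": ["a"],
}
centrality = eigenvector_centrality(example_graph)
```

In this example node a ends up with the highest score and d with the lowest, even though d's only neighbor is the best-connected node: the score rewards being connected to well-connected nodes, but a single link is still only a single link.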

Decoding Organization: Bletchley Park, Codebreaking and Organization Studies
by Christopher Grey
Published 22 Mar 2012

However, this does not mean that the utilization of various theories arises simply or solely from the ‘facts’ of what happened at, in this case, BP: I am not advocating naïve empiricism. Clearly the way in which I select and identify those facts, questions and puzzles which seem interesting or important is itself something arising from the kinds of theories and ideas which I bring to bear in my selection and interpretation of the evidence available to me. There is an iterative process in play between theory and empirics, mediated, of course, by my own concerns, pre-occupations and predispositions, which is irreducible. So in saying that my approach is one of ‘pragmatism’ I do not seek to deny the ‘theory-ladenness’ of empirical knowledge, I just endeavour not to become hamstrung by theoretical purism or tribalism.

pages: 425 words: 112,220

The Messy Middle: Finding Your Way Through the Hardest and Most Crucial Part of Any Bold Venture
by Scott Belsky
Published 1 Oct 2018

However, I have also learned over the years—often the hard way—how important it is to let people have their own process. Matias and I often butted heads when it came to aligning the Behance design team’s efforts with the broader company early on. While certain business and product decisions could be made in a single meeting, design decisions required a more iterative process of mock-ups and feedback before anything conclusive. However, I always wanted to find the answer fast. Since our team had so little time and resources, I was always pushing for the quickest responses with the largest outcomes. I considered myself the pacemaker, and in my effort to keep the team moving, I would push for a solution without regard for the steps Matias’s team needed to take to find the best one.

Succeeding With AI: How to Make AI Work for Your Business
by Veljko Krunic
Published 29 Mar 2020

Cost of capital—In the context of starting a new business project, cost of capital is the minimum rate of return that must be exceeded for the project to be worthwhile. Cross industry standard process for data mining (CRISP-DM)—A standard that defines the process for analytics and data mining. It predates the popularity of big data and AI. It’s an iterative process in which you start with understanding the business, then understanding the data. Next comes preparing the data for modeling, performing the modeling, and then evaluating the results. If the results are satisfactory, you then deploy the model; otherwise, you repeat the aforementioned cycle. See Wikipedia [184] for details.

pages: 424 words: 114,820

Neurodiversity at Work: Drive Innovation, Performance and Productivity With a Neurodiverse Workforce
by Amanda Kirby and Theo Smith
Published 2 Aug 2021

Any intervention needs to consider ways to enhance communication, trust and teamwork so there is a long-term plan. It is important, therefore, that this is not put on ‘a’ champion, but the work is undertaken as part of a group. No single neurodivergent person can offer everything required for success. It also requires an iterative process as each organization will be at a different stage and readiness for change. When working with champions, particularly when they are neurodivergent, there can be significant impacts on self-confidence and self-esteem if there is a lack of support for their role, especially if things do not go as expected.

pages: 412 words: 116,685

The Metaverse: And How It Will Revolutionize Everything
by Matthew Ball
Published 18 Jul 2022

It has since defined many of the mobile internet era’s visual design principles, economics, and business practices. In truth, however, there’s never a moment when a switch flips. We can identify when a specific technology was created, tested, or deployed, but not when an era precisely began, or ended. Transformation is an iterative process in which many different changes converge. Consider, as a case study, the process of electrification, which began in the late 19th century and ran midway into the 20th century, and focused on the adoption and usage of electricity, skipping past the centuries-long effort to understand, capture, and transmit it.

pages: 370 words: 112,809

The Equality Machine: Harnessing Digital Technology for a Brighter, More Inclusive Future
by Orly Lobel
Published 17 Oct 2022

I’ve long argued against a false dilemma between centralized command-and-control regulation and collaborative private-public governance.6 The web of interests and relationships that we saw, for instance, in Chapter 3—tackling pay equity through new laws, reporting requirements, social activism, intermediary platforms, and corporate practices—demonstrates the iterative process of regulation and private-market innovation. As we move beyond traditional litigation frameworks, government agencies also become research and development arms that incentivize, test, approve, and monitor proactive prevention programs. The immense challenge of harnessing technology for equality is one that must involve people from all disciplines and sectors.

pages: 2,466 words: 668,761

Artificial Intelligence: A Modern Approach
by Stuart Russell and Peter Norvig
Published 14 Jul 2019

As we remarked in our discussion of policy iteration in Chapter 16, these Bellman equations are linear when the policy π is fixed, so they can be solved using any linear algebra package. Alternatively, we can adopt the approach of modified policy iteration (see page 568), using a simplified value iteration process to update the utility estimates after each change to the learned model. Because the model usually changes only slightly with each observation, the value iteration process can use the previous utility estimates as initial values and typically converge very quickly. Learning the transition model is easy, because the environment is fully observable. This means that we have a supervised learning task where the input for each training example is a state–action pair, (s, a), and the output is the resulting state, s'.
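The warm-start idea in that excerpt can be illustrated with a small sketch. The Python below is not the book's code: it assumes a fixed policy, so the Bellman update reduces to U(s) = R(s) + γ Σ P(s' | s) U(s'), and it uses an invented three-state model (states A, B, and a terminal state T) with made-up rewards and transition probabilities. All names are hypothetical.

```python
def evaluate_policy(transitions, rewards, gamma, initial, tolerance=1e-8, max_sweeps=10_000):
    """Simplified value iteration for a fixed policy.
    transitions[s] maps successor states to probabilities under that policy.
    Returns the utility estimates and the number of sweeps used."""
    utilities = dict(initial)
    for sweep in range(1, max_sweeps + 1):
        updated = {
            s: rewards[s] + gamma * sum(p * utilities[s2] for s2, p in transitions[s].items())
            for s in transitions
        }
        delta = max(abs(updated[s] - utilities[s]) for s in transitions)
        utilities = updated
        if delta < tolerance:
            return utilities, sweep
    raise RuntimeError("policy evaluation did not converge")


rewards = {"A": -0.04, "B": -0.04, "T": 1.0}
zeros = {"A": 0.0, "B": 0.0, "T": 0.0}

# Learned model before and after one more observation (invented numbers).
old_model = {"A": {"A": 0.10, "B": 0.90}, "B": {"B": 0.2, "T": 0.8}, "T": {}}
new_model = {"A": {"A": 0.12, "B": 0.88}, "B": {"B": 0.2, "T": 0.8}, "T": {}}

old_utilities, _ = evaluate_policy(old_model, rewards, 0.9, zeros)
cold_utilities, cold_sweeps = evaluate_policy(new_model, rewards, 0.9, zeros)
warm_utilities, warm_sweeps = evaluate_policy(new_model, rewards, 0.9, old_utilities)
```

Both runs on the new model reach the same utilities, but the run seeded with the previous estimates needs fewer sweeps, because the model changed only slightly and the old estimates were already close.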

(s) is the utility obtained if πi is executed starting in s, and the policy loss is the most the agent can lose by executing πi instead of the optimal policy π*. The policy loss of πi is connected to the error in Ui by the following inequality: In practice, it often occurs that πi becomes optimal long before Ui has converged. Figure 16.8 shows how the maximum error in Ui and the policy loss approach zero as the value iteration process proceeds for the 4 × 3 environment with γ = 0.9. The policy πi is optimal when i = 5, even though the maximum error in Ui is still 0.51.

A machine translation program that runs on your phone and allows you to read signs in a foreign city is helpful—but not if it runs down the battery after an hour of use. Keep track of all the factors that lead to acceptance or rejection of your system, and design a process where you can quickly iterate the process of getting a new idea, running an experiment, and evaluating the results of the experiment to see if you have made progress. Making this iteration process fast is one of the most important factors for success in machine learning.
19.9.4 Trust, interpretability, and explainability
We have described a machine learning methodology where you develop your model with training data, choose hyperparameters with validation data, and get a final metric with test data.

pages: 597 words: 119,204

Website Optimization
by Andrew B. King
Published 15 Mar 2008

Most of your rankings in search engines are determined by the number and popularity of your inbound links.[19] These concepts will come up again and again as you optimize for search-friendliness, and we'll discuss them in more detail shortly.
Step 1: Determine Your Keyword Phrases
Finding the best keyword phrases to target is an iterative process. First, start with a list of keywords that you want to target with your website. Next, expand that list by brainstorming about other phrases, looking at competitor sites and your logfiles, and including plurals, splits, stems, synonyms, and common misspellings. Then triage those phrases based on search demand and the number of result pages to find the most effective phrases.

pages: 471 words: 124,585

The Ascent of Money: A Financial History of the World
by Niall Ferguson
Published 13 Nov 2007

This provides the basis for the concept of statistical significance and modern formulations of probabilities at specified confidence intervals (for example, the statement that 40 per cent of the balls in the jar are white, at a confidence interval of 95 per cent, implies that the precise value lies somewhere between 35 and 45 per cent - 40 plus or minus 5 per cent). 4. Normal distribution. It was Abraham de Moivre who showed that outcomes of any kind of iterated process could be distributed along a curve according to their variance around the mean or standard deviation. ‘Tho’ Chance produces Irregularities,’ wrote de Moivre in 1733, ‘still the Odds will be infinitely great, that in process of Time, those Irregularities will bear no proportion to recurrency of that Order which naturally results from Original Design.’

pages: 472 words: 117,093

Machine, Platform, Crowd: Harnessing Our Digital Future
by Andrew McAfee and Erik Brynjolfsson
Published 26 Jun 2017

It’s highly energy efficient, using technology that reduces its carbon footprint by 34,000 metric tons per year, and sparing enough in its use of materials to save $58 million in construction costs. What’s more, we find its twisting, gleaming form quite beautiful. Both the building’s initial shape and structure were computer-generated. They were then advanced and refined by teams of human architects in a highly iterative process, but the starting point for these human teams was a computer-designed building, which is about as far from a blank sheet of paper as you can get. What We Are That Computers Aren’t Autogenerated-music pioneer David Cope says, “Most of what I’ve heard [and read] is the same old crap. It’s all about machines versus humans, and ‘aren’t you taking away the last little thing we have left that we can call unique to human beings — creativity?’

pages: 420 words: 124,202

The Most Powerful Idea in the World: A Story of Steam, Industry, and Invention
by William Rosen
Published 31 May 2010

And not just a screw fastener; the reason lathes are frequently called history’s “first self-replicating machines” is that, beginning in the sixteenth century, they were used to produce their own leadscrews. A dozen inventors from all over Europe, including the Huguenots Jacques Besson and Salomon de Caus, the Italian clockmaker Torriano de Cremona, the German military engineer Konrad Keyser, and the Swede Christopher Polhem, mastered the iterative process by which a lathe could use one leadscrew to cut another, over and over again, each time achieving a higher order of accuracy. By connecting the lathe spindle and carriage to the leadscrew, the workpiece could be moved a set distance for every revolution of the spindle; if the workpiece revolved eight times while the cutting tool was moved a single inch, then eight spiral grooves would be cut on the metal for every inch: eight turns per inch.

pages: 1,064 words: 114,771

Tcl/Tk in a Nutshell
by Paul Raines and Jeff Tranter
Published 25 Mar 1999

Tcl treats "#" as an ordinary character if it is not at the beginning of a command. A Symbolic Gesture Much of Tcl's strength as a programming language lies in the manipulation of strings and lists. Compare the following two methods for printing each element of a list:

    set cpu_types [list pentium sparc powerpc m88000 alpha mips hppa]

    # "C-like" method of iterative processing
    for {set i 0} {$i < [llength $cpu_types]} {incr i} {
        puts [lindex $cpu_types $i]
    }

    # "The Tcl Way"-using string symbols
    foreach cpu $cpu_types {
        puts $cpu
    }

The loop coded with for is similar to how a C program might be coded, iterating over the list by the use of an integer index value. The second loop, coded with foreach, is more natural for Tcl.

pages: 468 words: 124,573

How to Build a Billion Dollar App: Discover the Secrets of the Most Successful Entrepreneurs of Our Time
by George Berkowski
Published 3 Sep 2014

We even added the ability for the customer to pay with a stored credit card (yes, we faked it again). The first time I summoned one of our test group drivers he actually drove quite a few miles to pick me up, and I have to say that I was truly wowed. I couldn’t believe it had worked. Really Viable Building your own prototype is a tricky and iterative process. What you are trying to do is create the bare bones of something – the very basic vision of your app – and see whether it can become something that people love. You need to get to wow as quickly, cheaply and efficiently as possible. There’s no point wasting time or money on any app that doesn’t get to wow.

pages: 481 words: 125,946

What to Think About Machines That Think: Today's Leading Thinkers on the Age of Machine Intelligence
by John Brockman
Published 5 Oct 2015

I concede to AI proponents all of the semantic prowess of Shakespeare, the symbol juggling they do perfectly. Missing is the direct relationship with the ideas the symbols represent. Much of what is certain to come soon would have belonged in the old-school “Strong AI” territory. Anything that can be approached in an iterative process can and will be achieved, sooner than many think. On this point I reluctantly side with the proponents: exaflops in CPU+GPU performance, 10K resolution immersive VR, personal petabyte databases . . . here in a couple of decades. But it is not all “iterative.” There’s a huge gap between that and the level of conscious understanding that truly deserves to be called Strong, as in “Alive AI.”

Text Analytics With Python: A Practical Real-World Approach to Gaining Actionable Insights From Your Data
by Dipanjan Sarkar
Published 1 Dec 2016

Doig, Introduction to Topic Modeling in Python) The black box in the figure represents the core algorithm that makes use of the previously mentioned parameters to extract K topics from the documents. The following steps give a very simplistic explanation of what happens in the algorithm for everyone's benefit:
1. Initialize the necessary parameters.
2. For each document, randomly initialize each word to one of the K topics.
3. Start an iterative process as follows and repeat it several times. For each document D, for each word W in the document, and for each topic T:
   - Compute P(T | D), which is the proportion of words in D assigned to topic T.
   - Compute P(W | T), which is the proportion of assignments to topic T over all documents having the word W.
   - Reassign word W with topic T with probability P(T | D) × P(W | T), considering all other words and their topic assignments.
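The reassignment loop described above can be sketched in plain Python. This is a toy illustration under stated assumptions, not the book's code or any library's implementation: the name `toy_topic_assignments`, the parameters `n_iter` and `seed`, and the small smoothing constant are hypothetical, and the two proportions are read as P(T | D) (share of the document's other words on topic T) and P(W | T) (share of topic T's assignments that come from word W).

```python
import random
from collections import defaultdict


def toy_topic_assignments(docs, k, n_iter=50, seed=0):
    """Toy version of the iterative topic reassignment loop.

    docs: list of documents, each a list of word strings.
    k: number of topics.
    Returns one topic id per word, grouped by document.
    """
    rng = random.Random(seed)
    smooth = 0.01  # hypothetical smoothing so no topic ever has zero weight

    # Randomly assign each word occurrence to one of the k topics.
    assign = [[rng.randrange(k) for _ in doc] for doc in docs]

    # Count tables kept in sync with the assignments.
    doc_topic = [[0] * k for _ in docs]        # words in document d on topic t
    word_topic = defaultdict(lambda: [0] * k)  # occurrences of word w on topic t
    topic_total = [0] * k                      # all word occurrences on topic t
    for d, doc in enumerate(docs):
        for w, t in zip(doc, assign[d]):
            doc_topic[d][t] += 1
            word_topic[w][t] += 1
            topic_total[t] += 1
    vocab_size = len(word_topic)

    # Repeat the reassignment sweep several times.
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                old = assign[d][i]
                # Remove this occurrence so only the other words are counted.
                doc_topic[d][old] -= 1
                word_topic[w][old] -= 1
                topic_total[old] -= 1

                weights = []
                for t in range(k):
                    p_t_given_d = (doc_topic[d][t] + smooth) / (len(doc) - 1 + k * smooth)
                    p_w_given_t = (word_topic[w][t] + smooth) / (topic_total[t] + vocab_size * smooth)
                    weights.append(p_t_given_d * p_w_given_t)

                # Draw the new topic with probability proportional to the product.
                new = rng.choices(range(k), weights=weights)[0]
                assign[d][i] = new
                doc_topic[d][new] += 1
                word_topic[w][new] += 1
                topic_total[new] += 1
    return assign
```

Calling `toy_topic_assignments([["cat", "dog", "cat"], ["stock", "bond"]], k=2)` returns a topic id for every word; with a corpus this small the result says little, but the loop structure is the point.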

pages: 451 words: 125,201

What We Owe the Future: A Million-Year View
by William MacAskill
Published 31 Aug 2022

In practical terms, you might follow these steps: 1. Research your options. 2. Make your best guess about the best longer-term path for you. 3. Try it for a couple of years. 4. Update your best guess. 5. Repeat. Rather than feeling locked in to one career path, you would see it as an iterative process in which you figure out the role that is best for you and best for the world. The value of treating your career like an experiment can be really high: if you find a career that’s twice as impactful as your current best guess, it would be worth spending up to half of your entire career searching for that path.
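The arithmetic behind the "up to half" figure can be made explicit. A minimal sketch, under simplifying assumptions that the excerpt does not state: impact accrues at a constant rate, the search itself contributes nothing, and the search is certain to succeed. Normalize the whole career to length 1 and the impact of the current best guess to 1, and let f be the fraction of the career spent searching for a path that is twice as impactful.

```latex
\[
  \underbrace{2\,(1 - f)}_{\text{impact after searching}}
  \;\ge\;
  \underbrace{1}_{\text{impact of current best guess}}
  \quad\Longleftrightarrow\quad
  f \le \tfrac{1}{2}
\]
```

Under these assumptions the search breaks even at f = 1/2, which matches the figure in the passage.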

pages: 503 words: 131,064

Liars and Outliers: How Security Holds Society Together
by Bruce Schneier
Published 14 Feb 2012

If you're knowledgeable and experienced and perform a good analysis, you can make some good guesses, but it can be impossible to know the actual effects—or unintended consequences—of a particular societal pressure until you've already implemented it. This means that implementing societal pressures is always an iterative process. We try something, see how well it works, then fine-tune. Any society—a family, a business, a government—is constantly balancing its need for security with the side effects, unintended consequences, and other considerations. Can we afford this particular societal pressure system? Are our fundamental freedoms and liberties more important than more security?

Bi-Rite Market's Eat Good Food: A Grocer's Guide to Shopping, Cooking & Creating Community Through Food
by Sam Mogannam and Dabney Gough
Published 17 Oct 2011

Rather than get bogged down by decoding labels and memorizing vintages or relying on subjective point systems, I propose a much simpler and much more rewarding approach. The key is to build a relationship with a top-notch wine retailer you trust who will help you bypass the dreck and home in on wine that you will enjoy. This is an iterative process. It takes a little research and trial and error to find such a shop, but it’s an investment of time that will pay off in the long run. The more often you go, the more the staff there will learn about your tastes. They’ll start to make informed suggestions for you, and before long it’ll be as if you have your very own personal wine shopper.

pages: 624 words: 127,987

The Personal MBA: A World-Class Business Education in a Single Volume
by Josh Kaufman
Published 2 Feb 2011

The worst response you can get when asking for Feedback isn’t emphatic dislike: it’s total apathy. If no one seems to care about what you’ve created, you don’t have a viable business idea. 5. Give potential customers the opportunity to preorder. One of the most important pieces of Feedback you can receive during the iteration process is the other person’s willingness to actually purchase what you’re creating. It’s one thing for a person to say that they’d purchase something and quite another for them to be willing to pull out their wallet or credit card and place a real order. You can do this even if the offer isn’t ready yet—a tactic called Shadow Testing (discussed later).

Producing Open Source Software: How to Run a Successful Free Software Project
by Karl Fogel
Published 13 Oct 2005

Most patches arrive either as posts to the project's development mailing list or as a pull request submitted through the version control system, but there are a number of different routes a patch can take after arrival. Sometimes someone reviews the patch, finds problems, and bounces it back to the original author for cleanup. This usually leads to an iterative process—all visible in a public forum—in which the original author posts revised versions of the patch until the reviewer has nothing more to criticize. It is not always easy to tell when this process is done: if the reviewer commits the patch, then clearly the cycle is complete. But if she does not, it might be because she simply didn't have time, or doesn't have commit access herself and couldn't rope any of the other developers into doing it.

Commodity Trading Advisors: Risk, Performance Analysis, and Selection
by Greg N. Gregoriou , Vassilios Karavas , François-Serge Lhabitant and Fabrice Douglas Rouah
Published 23 Sep 2004

1. Following Chang, Pinegar, and Schachter (1997), the volume and volatility relationship is modeled without including past volatility. 2. Following Irwin and Yoshimaru (1999), volatility lags are included as independent variables to account for the time series persistence of volatility. 3. Following Bessembinder and Seguin (1993), the persistence in volume and volatility is modeled through specification of an iterative process.5 Since estimation results for the different model specifications are quite similar, only results for a modified version of Chang, Pinegar, and Schachter’s specification are reported here.6 Chang, Pinegar, and Schachter (1997) regress futures price volatility on volume associated with large speculators (as provided by the CFTC large trader reports) and all other market volume.

AI 2041: Ten Visions for Our Future
by Kai-Fu Lee and Qiufan Chen
Published 13 Sep 2021

This level of complexity is much greater than what’s needed for making video games and developing apps. But without high-quality professional content, people won’t buy the devices. And without a proliferation of devices, the content will not monetize well. This chicken-and-egg problem will require an iterative process that will ultimately create a virtuous cycle, just like television and Netflix took a substantial amount of time and investment to become mainstream. That said, once the tools are invented and tested, the proliferation will be very rapid. One could imagine that professional tools like Unreal and Unity may evolve into the XR version of photo filters one day.

The Book of Why: The New Science of Cause and Effect
by Judea Pearl and Dana Mackenzie
Published 1 Mar 2018

Information bits are transformed into codewords; these are transmitted and received at the destination with noise (errors). (b) Bayesian network representation of turbo code. Information bits are scrambled and encoded twice. Decoding proceeds by belief propagation on this network. Each processor at the bottom uses information from the other processor to improve its guess of the hidden codeword, in an iterative process. This capsule history is correct except for one thing: Berrou did not know that he was working with Bayesian networks! He had simply discovered the belief propagation algorithm himself. It wasn’t until five years later that David MacKay of Cambridge realized that it was the same algorithm that he had been enjoying in the late 1980s while playing with Bayesian networks.

pages: 512 words: 131,112

Retrofitting Suburbia, Updated Edition: Urban Design Solutions for Redesigning Suburbs
by Ellen Dunham-Jones and June Williamson
Published 23 Mar 2011

Representative government says that at the end of the day it’s really those elected officials and their staff who need to take that responsibility with appropriate participation. But you need to create venues for participation that are meaningful. It’s not meaningful to ask a group of civilians to come up with the final determination of what the reuse of a mall is. But what it is appropriate to do is to use an iterative process to talk about the components that make a place interesting and livable. What do you want to preserve? What do you want to change? Q. How did the public-private partnership with Continuum Partners come about? A. City officials in Lakewood had a very broad concept. We wanted a more interesting urban space.

Programming Android
by Zigurd Mednieks , Laird Dornin , G. Blake Meike and Masumi Nakamura
Published 15 Jul 2011

In the following text, we describe SQLite commands as they are used inside the sqlite3 command-line utility. Later we will show ways to achieve the same effects using the Android API. Although command-line SQL will not be part of the application you ship, it can certainly help to debug applications as you’re developing them. You will find that writing database code in Android is usually an iterative process of writing Java code to manipulate tables, and then peeking at created data using the command line. SQL Data Definition Commands Statements in the SQL language fall into two distinct categories: those used to create and modify tables—the locations where data is stored—and those used to create, read, update, and delete the data in those tables.

pages: 1,025 words: 150,187

ZeroMQ
by Pieter Hintjens
Published 12 Mar 2013

Your goal as leader of a community is to motivate people to get out there and explore; to ensure they can do so safely and without disturbing others; to reward them when they make successful discoveries; and to ensure they share their knowledge with everyone else (and not because we ask them, not because they feel generous, but because it’s The Law). It is an iterative process. You make a small product, at your own cost, but in public view. You then build a small community around that product. If you have a small but real hit, the community then helps design and build the next version, and grows larger. And then that community builds the next version, and so on. It’s evident that you remain part of the community, maybe even a majority contributor, but the more control you try to assert over the material results, the less people will want to participate.

Poking a Dead Frog: Conversations With Today's Top Comedy Writers
by Mike Sacks
Published 23 Jun 2014

Things happen for no reason, and lead to nothing, or lead to something, but with weak causation. But in revision they get tighter and funnier and also gentler. And one thing leads to the next in a tighter, more undeniable way—a way that seems to “mean.” Which, I guess, makes sense, if we think of revision as just an iterative process of exerting one’s taste. Gradually the story comes to feel more like “you” than you could have imagined at the outset, and starts to manifest a sort of superlogic—an internal logic that is more direct and “caused” than mere real-life logic. The thing is, writing is really just the process of charming someone via prose—compelling them to keep reading.

pages: 560 words: 158,238

Fifty Degrees Below
by Kim Stanley Robinson
Published 25 Oct 2005

“The Corps has always done things on a big scale. Huge scale. Sometimes with huge blunders. All with the best intentions of course. That’s just the way things happen. We’re still gung-ho to try. Lots of things are reversible, in the long run. Hopefully this time around we’ll be working with better science. But, you know, it’s an iterative process. So, long story short, you get a project approved, and we’re good to go. We’ve got the expertise. The Corps’ esprit de corps is always high.” “What about budget?” Frank asked. “What about it? We’ll spend what we’re given.” “Well, but is there any kind of, you know, discretionary fund that you can tap into?”

pages: 660 words: 141,595

Data Science for Business: What You Need to Know About Data Mining and Data-Analytic Thinking
by Foster Provost and Tom Fawcett
Published 30 Jun 2013

Let’s now discuss the steps in detail. Business Understanding Initially, it is vital to understand the problem to be solved. This may seem obvious, but business projects seldom come pre-packaged as clear and unambiguous data mining problems. Often recasting the problem and designing a solution is an iterative process of discovery. The diagram shown in Figure 2-2 represents this as cycles within a cycle, rather than as a simple linear process. The initial formulation may not be complete or optimal so multiple iterations may be necessary for an acceptable solution formulation to appear. The Business Understanding stage represents a part of the craft where the analysts’ creativity plays a large role.

pages: 573 words: 157,767

From Bacteria to Bach and Back: The Evolution of Minds
by Daniel C. Dennett
Published 7 Feb 2017

With discrimination and recognition comes the prospect of reflection: judgments that these two things are the same and those two are different can lead to the recognition of the higher-order pattern of sameness and difference, which then become two additional “things” in the manifest image of the child. Like the prebiotic cycles that provided the iterative processes from which life and evolution itself emerged, these iterated manipulations provide an engine of recombination from which the densely populated manifest image of a maturing human child can be constructed. We will look more closely at how the manifest image becomes manifest to the child, that is, part of conscious experience, in chapter 14.

pages: 688 words: 147,571

Robot Rules: Regulating Artificial Intelligence
by Jacob Turner
Published 29 Oct 2018

In a sense, unsupervised learning can be thought of as finding patterns in the data above and beyond what would be considered pure unstructured noise.122 A particularly vivid example of unsupervised learning was a program that, after being exposed to the entire YouTube library, was able to recognise images of cat faces, despite the data being unlabelled.123 This process is not limited to frivolous uses such as feline identification: its applications include genomics as well as in social network analysis.124 Reinforcement learning, sometimes referred to as “weak supervision”, is a type of machine learning which maps situations and actions so as to maximise a reward signal. The program is not told which actions to take, but instead has to discover which actions yield the most reward through an iterative process: in other words, it learns through trying different things out.125 One use of reinforcement learning involves a program being asked to achieve a certain goal, but without being told how it should do so. In 2014, Ian Goodfellow and colleagues including Yoshua Bengio at the University of Montreal developed a new technique for machine learning which goes even further towards taking humans out of the picture: Generative Adversarial Nets (GANs).
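The trial-and-error loop described for reinforcement learning can be illustrated with a toy example. This is only a sketch under assumptions of my own: the setting is a simple multi-armed bandit, the epsilon-greedy strategy is one common choice that the source does not name, and `epsilon_greedy_bandit` and its parameters are hypothetical.

```python
import random


def epsilon_greedy_bandit(reward_probs, steps=5000, epsilon=0.1, seed=0):
    """Learn which action pays best purely from trial and error.

    reward_probs: hidden probability of receiving a reward of 1 per action.
    Returns the learned value estimate for every action.
    """
    rng = random.Random(seed)
    n_actions = len(reward_probs)
    counts = [0] * n_actions
    values = [0.0] * n_actions  # running average reward per action

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try an action at random.
            action = rng.randrange(n_actions)
        else:
            # Exploit: pick the action that currently looks best.
            action = max(range(n_actions), key=lambda a: values[a])
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0
        counts[action] += 1
        # Incremental mean update of the estimate for the chosen action.
        values[action] += (reward - values[action]) / counts[action]
    return values
```

The program is never told which action is better; it only sees rewards, and its estimates drift toward the hidden probabilities as it keeps trying.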

pages: 584 words: 149,387

Essential Scrum: A Practical Guide to the Most Popular Agile Process
by Kenneth S. Rubin
Published 19 Jul 2012

See Sprint planning sprints, 21–23
Planning Poker: defined, 412; how to play, 131–133; overview of, 129–130; scale in assigning estimates, 130
Planning principles: emphasis on small, frequent releases, 252–254; focus on adapting and replanning rather than conforming, 249–251; keeping options open, 249; learning fast and pivoting as necessary, 254–255; managing inventory of planning artifacts, 251–252; not assuming up-front plans are right, 248; overview of, 247–248; up-front planning should be helpful not excessive, 248–249
Platforms: lack of experience resulting in technical debt, 140; testing in definition of done, 75
PMI (Project Management Institute), 237–239
Point inflation, 138, 412
Pollinators (Goldberg and Rubin), 217
Portfolio backlog: defined, 413; estimating, 121; inflow strategies, 275–280; outflow strategies, 280–283; portfolio planning and, 267, 269; in-process strategies, 283–285; release train approach (Leffingwell), 221
Portfolio planning: balancing product flow into/out of portfolio backlog, 276–278; calculating cost of delays, 271–274; defined, 413; economic filter for go/no-go decision making, 275–276; embracing emergent opportunities, 278–279; establishing WIP limits, 281–282; estimating for accuracy not precision, 274–275; focusing on idle work not idle workers, 281; managing economics of, 236; marginal economics applied to in-process products, 283–285; in multilevel planning, 259; optimizing scheduling for lifecycle profitability, 270–271; overview of, 267; participants in, 268; planning level details for, 258; process of, 268–270; product owner participating in, 168; small, frequent releases in, 279–280; strategies for in-process products, 283; strategies for inflow, 275; strategies for outflow, 280; strategies for sequence of products, 270; timing of, 267; waiting until entire team is in place, 282–283
Potentially shippable product increments (PSIs): defined, 413; defining when sprint is complete or done, 74–78; as input to sprint review, 368–369; inspecting and adapting during sprint review, 363; as outcome of iterative process, 2–3; planning release of features to customers, 307; release train approach (Leffingwell) and, 220, 222–223; sprint results, 25–26
Practices: activities. See Activities; artifacts. See Artifacts; defined, 413; roles. See Roles; rules. See Rules; in Scrum framework, 14
Pragmatism: no-goal-altering-change rule and, 72; Pragmatic Marketing Framework, 178–179
Precision: defined, 413; vs. accuracy in estimating, 125, 274–275
Prediction: balancing predictive work with adaptive work, 43–44; just enough predictive planning, 300; plan-driven development compared with agile development, 59; technical debt decreasing predictability, 143; timeboxing improving predictability, 64
Prediction and adaptation principle, in agile development: accepting that you can’t get it right up front, 38–39; adaptive, exploratory approach in, 39–40; balancing predictive work with adaptive work, 43–44; handling cost of change, 40–43; keeping options open, 37–38; overview of, 37; pivoting and, 254–255
Predictive process.

pages: 470 words: 158,007

The Quiet Coup: Neoliberalism and the Looting of America
by Mehrsa Baradaran
Published 7 May 2024

* * * AS SCHOLARS have found and as Supreme Court opinions amply demonstrate, the flow of ideas between the Court’s right wing and Federalist Society panelists and scholars goes both ways. Prominent scholars on the right provide justices with innovative originalist arguments that can be used flexibly in upcoming cases, while practitioners hear from judges themselves what types of arguments will be persuasive. In an iterative process, judicial decisions that push the boundaries of interpretation can be built on and expanded even further. While the Federalist Society network provides the Court with fodder for argumentation, it also serves as a brake on the Court. According to founding member Steven Calabresi, the justices are “absolutely” “kept in check” by the Federalist Society: “When one tries to think about what kinds of checks exist on officials as powerful as Supreme Court Justices, I think the check of criticism by law schools, journalists, and conservative think tanks like the Federalist Society, criticism from those quarters is something that they notice.”20 The Federalist Society learned relatively early on that it needed to keep its judges in line.

pages: 515 words: 152,128

Material World: A Substantial Story of Our Past and Future
by Ed Conway
Published 15 Jun 2023

From the moment it received governmental approval, it was planning to destroy them. The point of the archaeology was not to preserve the caves but to remove any treasures before the diggers and blasters arrived. The artefacts were placed in a shipping container on the site of the mine. Mining is a slow, iterative process; one area is blasted, dug and cleared, then work begins on the next. Satellite imagery shows the Brockman mine slowly expanding in the direction of the caves. In September 2010 it was 4.4 kilometres from the Juukan Gorge. By 2015 it was within 300 metres of the gorge. By November 2019 it was 120 metres away.

pages: 923 words: 516,602

The C++ Programming Language
by Bjarne Stroustrup
Published 2 Jan 1986

The purpose of “design” is to create a clean and relatively simple internal structure, sometimes also called an architecture, for a program. In other words, we want to create a framework into which the individual pieces of code can fit and thereby guide the writing of those individual pieces of code. A design is the end product of the design process (as far as there is an end product of an iterative process). It is the focus of the communication between the designer and the programmer and between programmers. It is important to have a sense of proportion here. If I – as an individual programmer – design a small program that I’m going to implement tomorrow, the appropriate level of precision and detail may be some scribbles on the back of an envelope.

– Consider minimalism, completeness, and convenience.
[3] Refine the classes by specifying their dependencies.
    – Consider parameterization, inheritance, and use dependencies.
[4] Specify the interfaces.
    – Separate functions into public and protected operations.
    – Specify the exact type of the operations on the classes.
Note that these are steps in an iterative process. Typically, several loops through this sequence are needed to produce a design one can comfortably use for an initial implementation or a reimplementation. One advantage of well-done analysis and data abstraction as described here is that it becomes relatively easy to reshuffle class relationships even after code has been written.

pages: 680 words: 157,865

Beautiful Architecture: Leading Thinkers Reveal the Hidden Beauty in Software Design
by Diomidis Spinellis and Georgios Gousios
Published 30 Dec 2008

Most attendees express that the annual conference gives them a boost in motivation and in the effectiveness of their development work. The organization and structure KDE shows today is not the brain child of a group of executives who asked themselves how a Free Software project should be organized. It is the result of an iterative process of trying to find a suitable structure for the main nontechnical goals—to remain free and to ensure the longevity of the project and sustainable growth of the community. Freedom is used here not only in the sense of being able to provide the software for free and as Free Software, but also to be free of dominating influences from third parties.

pages: 561 words: 157,589

WTF?: What's the Future and Why It's Up to Us
by Tim O'Reilly
Published 9 Oct 2017

Their next-generation tool set supports what is called “generative design.” The engineer, architect, or product designer enters a set of design constraints—functionality, cost, materials; a cloud-based genetic algorithm (a primitive form of AI) returns hundreds or even thousands of possible options for achieving those goals. In an iterative process, man and machine together design new forms that humans have never seen and might not otherwise conceive. Most intriguing is the use of computation to help design radically new kinds of shapes and materials and processes. For example, Arup, the global architecture and engineering firm, showcases a structural part designed using the latest methods that is half the size and uses half the material, but can carry the same load.

pages: 1,331 words: 163,200

Hands-On Machine Learning With Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems
by Aurélien Géron
Published 13 Mar 2017

The number of rooms per household is also more informative than the total number of rooms in a district — obviously the larger the houses, the more expensive they are. This round of exploration does not have to be absolutely thorough; the point is to start off on the right foot and quickly gain insights that will help you get a first reasonably good prototype. But this is an iterative process: once you get a prototype up and running, you can analyze its output to gain more insights and come back to this exploration step. Prepare the Data for Machine Learning Algorithms It’s time to prepare the data for your Machine Learning algorithms. Instead of just doing this manually, you should write functions to do that, for several good reasons: This will allow you to reproduce these transformations easily on any dataset (e.g., the next time you get a fresh dataset).
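The advice to wrap data preparation in functions can be shown with the rooms-per-household ratio mentioned above. This is a minimal sketch, assuming each district is a plain dictionary with `total_rooms` and `households` keys; those field names, the function name, and the sample figures are hypothetical and chosen only for illustration.

```python
def add_rooms_per_household(districts):
    """Return new district records with a derived rooms_per_household field.

    The input records are copied, so the same function can be re-run on any
    fresh dataset without side effects.
    """
    enriched = []
    for district in districts:
        row = dict(district)
        row["rooms_per_household"] = district["total_rooms"] / district["households"]
        enriched.append(row)
    return enriched


# Hypothetical figures, purely for illustration.
sample = [
    {"total_rooms": 880, "households": 126},
    {"total_rooms": 7099, "households": 1138},
]
for row in add_rooms_per_household(sample):
    print(round(row["rooms_per_household"], 2))
```

Keeping the transformation in one function means the exploration step can be revisited and the derived feature regenerated whenever the prototype suggests a change.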

pages: 574 words: 164,509

Superintelligence: Paths, Dangers, Strategies
by Nick Bostrom
Published 3 Jun 2014

The idea of using learning as a means of bootstrapping a simpler system to human-level intelligence can be traced back at least to Alan Turing’s notion of a “child machine,” which he wrote about in 1950: Instead of trying to produce a programme to simulate the adult mind, why not rather try to produce one which simulates the child’s? If this were then subjected to an appropriate course of education one would obtain the adult brain.3 Turing envisaged an iterative process to develop such a child machine: We cannot expect to find a good child machine at the first attempt. One must experiment with teaching one such machine and see how well it learns. One can then try another and see if it is better or worse. There is an obvious connection between this process and evolution….

pages: 505 words: 161,581

The Founders: The Story of Paypal and the Entrepreneurs Who Shaped Silicon Valley
by Jimmy Soni
Published 22 Feb 2022

“Neither side is asking the right questions regarding the pressing needs of the day. “In our own way, at PayPal this is what we’ve been doing all along. We’ve been creating a system that enables global commerce for everyone. And we’ve been fighting the people who would do us and our users harm. It’s been a gradual, iterative process, and we’ve gotten plenty of stuff wrong along the way, but we’ve kept moving in the right direction to address these major issues while the rest of the world has been ignoring them. “And so I’d like to send a message back to Planet Earth from Palo Alto. Life is good here in Palo Alto. We’ve been able to improve on many of the ways you do things.

Smart Grid Standards
by Takuro Sato
Published 17 Nov 2015

By getting one safety container, and using brute force to compute all valid combinations of CRC1 and VCN that generate the same CRC2 as in the received message, a set of possible CRC1 values can be obtained. With the knowledge that the CRC1 is static over the session lifetime, the remaining combinations can be reduced to the CRC1 that is in use. This has to be done as an iterative process that terminates when the correct CRC1 has been found. The remaining challenge is to find the actual VCN very quickly for all received safety containers. The VCN will increase monotonically at a rate depending on the bus period time, host, and device period time executing the safety layer.
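The narrowing procedure described above can be sketched as a set intersection over successive containers. This is an illustration under loud assumptions: the excerpt does not give the real CRC2 computation, so `toy_check` is a placeholder built from a truncated SHA-256 and is not a CRC at all; the search spaces are deliberately tiny; and every name here is hypothetical.

```python
import hashlib


def toy_check(crc1, vcn, payload):
    """Placeholder 16-bit check value; NOT the real CRC2 algorithm."""
    data = crc1.to_bytes(2, "big") + vcn.to_bytes(2, "big") + payload
    return int.from_bytes(hashlib.sha256(data).digest()[:2], "big")


def candidate_crc1s(payload, observed, check_fn, crc1_space, vcn_space):
    """All CRC1 values for which some VCN reproduces the observed check value."""
    return {
        crc1
        for crc1 in crc1_space
        for vcn in vcn_space
        if check_fn(crc1, vcn, payload) == observed
    }


def recover_crc1(containers, check_fn, crc1_space, vcn_space):
    """Intersect candidate sets across containers until one CRC1 remains.

    containers: iterable of (payload, observed_check_value) pairs from one
    session, during which CRC1 is assumed to stay constant.
    Returns the unique CRC1, or None if the containers do not pin it down.
    """
    remaining = None
    for payload, observed in containers:
        found = candidate_crc1s(payload, observed, check_fn, crc1_space, vcn_space)
        remaining = found if remaining is None else remaining & found
        if len(remaining) == 1:
            return next(iter(remaining))  # terminate once CRC1 is unambiguous
    return None
```

Because CRC1 is static while the VCN changes, a false candidate from one container is unlikely to survive the next, which is why the intersection shrinks. A real CRC is linear, so the number of containers needed in practice may differ from what this placeholder suggests.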

pages: 700 words: 160,604

The Code Breaker: Jennifer Doudna, Gene Editing, and the Future of the Human Race
by Walter Isaacson
Published 9 Mar 2021

The competition was run by First Robotics, a nationwide program created by the irrepressible Segway inventor Dean Kamen. 2. Interviews, audio and video recordings, notes, and slides provided by Jennifer Doudna, Megan Hochstrasser, and Fyodor Urnov; Walter Isaacson, “Ivory Power,” Air Mail, Apr. 11, 2020. 3. See chapter 12 on the yogurt makers for a fuller discussion of the iterative process that can occur between basic researchers and technological innovation. Chapter 1: Hilo 1. Author’s interviews with Jennifer Doudna and Sarah Doudna. Other sources for this section include The Life Scientific, BBC Radio, Sept. 17, 2017; Andrew Pollack, “Jennifer Doudna, a Pioneer Who Helped Simplify Genome Editing,” New York Times, May 11, 2015; Claudia Dreifus, “The Joy of the Discovery: An Interview with Jennifer Doudna,” New York Review of Books, Jan. 24, 2019; Jennifer Doudna interview, National Academy of Sciences, Nov. 11, 2004; Jennifer Doudna, “Why Genome Editing Will Change Our Lives,” Financial Times, Mar. 14, 2018; Laura Kiessling, “A Conversation with Jennifer Doudna,” ACS Chemical Biology Journal, Feb. 16, 2018; Melissa Marino, “Biography of Jennifer A.

Alpha Trader
by Brent Donnelly
Published 11 May 2021

Then, in Chapter 4, we take this information and study the research on trading to solve for the more specific equation of trading success. As you read Part One, think about where you match up or diverge from the Alpha Trader profile, and take notes. The point of Part One is to study, understand and define the factors that lead to trading success and then start an iterative process where you repeatedly learn, plan, execute, analyze, and improve to get there. CHAPTER 1 KNOW YOURSELF Good traders are introspective and self-aware In trading, the enemy is not bad luck. It’s not other traders, or the market. The enemy is not the algos or the central banks or the goldbugs or sales or your manager or the HODLers or Dave Portnoy or QE.

pages: 821 words: 178,631

The Rust Programming Language
by Steve Klabnik and Carol Nichols
Published 14 Jun 2018

Most of the time when specifying one of the Fn trait bounds, you can start with Fn and the compiler will tell you if you need FnMut or FnOnce based on what happens in the closure body. To illustrate situations where closures that can capture their environment are useful as function parameters, let’s move on to our next topic: iterators. Processing a Series of Items with Iterators The iterator pattern allows you to perform some task on a sequence of items in turn. An iterator is responsible for the logic of iterating over each item and determining when the sequence has finished. When you use iterators, you don’t have to reimplement that logic yourself.

pages: 799 words: 187,221

Leonardo Da Vinci
by Walter Isaacson
Published 16 Oct 2017

With each category, he included striking observations, such as this: “There is always a space where the light falls and then is reflected back towards its cause; it meets the original shadow and mingles with it and modifies it somewhat.”16 Reading his studies on reflected light provides us with a deeper appreciation for the subtleties of the light-dappled shadow on the edge of Cecilia’s hand in Lady with an Ermine or the Madonna’s hand in Virgin of the Rocks, and it reminds us why these are innovative masterpieces. Studying the paintings, in turn, leads to a more profound understanding of Leonardo’s scientific inquiry into rebounding and reflected light. This iterative process was true for him as well: his analysis of nature informed his art, which informed his analysis of nature.17 SHAPES WITHOUT LINES Leonardo’s reliance on shadows, rather than contour lines, to define the shape of most objects stemmed from a radical insight, one that he derived from both observation and mathematics: there was no such thing in nature as a precisely visible outline or border to an object.

Blueprint: The Evolutionary Origins of a Good Society
by Nicholas A. Christakis
Published 26 Mar 2019

Just because our tools are limited and sometimes (or even often) fail does not mean they serve no useful purpose. Ever since Galileo used his rudimentary telescope to find moons around Jupiter, spots on the sun, and craters on the moon, people have discovered quite a lot with tools that seem primitive by today’s standards. Science is an iterative process. We cannot go back and skip over the earlier, imperfect invention of a terrestrial telescope to today’s space-based telescopes, even if the former had limitations imposed by light pollution and atmospheric interference. It would be like saying that the paper maps many people used to navigate for centuries served no purpose now that we have GPS.

Statistics in a Nutshell
by Sarah Boslaugh
Published 10 Nov 2012

The initial cluster centers are related to the correlations and the corresponding principal components extracted in the previous analysis. Cluster 1 is strongly associated with reading, verbal, and spelling; cluster 2 with arithmetic and geometry; and cluster 3 with sports. Although there are some changes during the iterative process, these groupings tend not to change. The resulting group allocations are simply a function of the distance from each centroid. The pairwise distances between each centroid are also reasonably consistent with each other; that is, the between-group distances appear to have been successfully maximized, and there does not appear to have been difficulty in separating them.
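The statement that group allocations are "simply a function of the distance from each centroid" can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the excerpt does not say which distance metric was used (Euclidean is assumed), and the function name `nearest_centroid` and the coordinates are made up.

```python
import math


def nearest_centroid(point, centroids):
    """Return the index of the centroid closest to point (Euclidean distance assumed)."""
    distances = [math.dist(point, centroid) for centroid in centroids]
    return distances.index(min(distances))


# Hypothetical two-dimensional centroids and observations, not data from the book.
centroids = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]
points = [(1.0, 0.5), (4.0, 6.0), (9.0, 1.0)]

# Each observation is allocated to the cluster whose centroid is nearest.
labels = [nearest_centroid(point, centroids) for point in points]
print(labels)
```

In a full clustering run this assignment step would alternate with recomputing the centroids, which is the iterative process the excerpt refers to.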

pages: 685 words: 203,949

The Organized Mind: Thinking Straight in the Age of Information Overload
by Daniel J. Levitin
Published 18 Aug 2014

Letter boxes had to be taken down from a shelf and opened up, a time-consuming operation when large amounts of filing were done. As Yates notes, keeping track of whether a given document or pile of documents was deemed active or archival was not always made explicit. Moreover, if the user wanted to expand, this might require transferring the contents of one box to another in an iterative process that might require dozens of boxes being moved down in the cabinet, to make room for the new box. To help prevent document loss, and to keep documents in the order they were filed, a ring system was introduced around 1881, similar to the three-ring binders we now use. The advantages of ringed flat files were substantial, providing random access (like Phaedrus’s 3 x 5 index card system) and minimizing the risk of document loss.

The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling
by Ralph Kimball and Margy Ross
Published 30 Jun 2013

The modeling effort typically works through the following sequence of tasks and deliverables, as illustrated in Figure 18.1: (1) high-level model defining the model's scope and granularity; (2) detailed design with table-by-table attributes and metrics; (3) review and validation with IT and business representatives; (4) finalization of the design documentation. As with any data modeling effort, dimensional modeling is an iterative process. You will work back and forth between business requirements and source details to further refine the model, changing the model as you learn more. This section describes each of these major tasks. Depending on the design team's experience and exposure to dimensional modeling concepts, you might begin with basic dimensional modeling education before kicking off the effort to ensure everyone is on the same page regarding standard dimensional vocabulary and best practices.

pages: 935 words: 197,338

The Power Law: Venture Capital and the Making of the New Future
by Sebastian Mallaby
Published 1 Feb 2022

I know.”[12] With that, Markkula decided to put his energy behind Apple. He became an adviser to the Steves, writing their business plan, serving as marketing chief and company chairman, arranging a bank credit line, and ultimately investing $91,000 of his own capital in exchange for 26 percent of the company.[13] After a circuitous and iterative process, the Silicon Valley network had finally come to the right solution. Jobs and Wozniak had been turned down repeatedly, by multiple investors. But one introduction had led to another, and Apple had eventually secured the lifeline it needed. Markkula was not a venture capitalist. He was arguably the Valley’s first “angel investor”: somebody grown rich from the success of one startup who recycles his wealth and experience into more startups.

Four Battlegrounds
by Paul Scharre
Published 18 Jan 2023

Militaries continually adopt new technologies to improve their operations, and the same processes for evolving new tactics and procedures will also work with AI, although not without some trial and error. And similar to other technologies, the militaries that learn how to employ AI most effectively will have significant advantages on the battlefield. Militaries that invest in an iterative process of experimentation, prototyping, and concept development will be best poised to take the lead in benefiting from AI. The intelligentization, or cognitization, of warfare will unfold over decades. Once AI has been seeded into every crevice of military operations, what will war look like? The industrial revolution transformed warfare, increasing its scale, lethality, and destructive power.

pages: 1,294 words: 210,361

The Emperor of All Maladies: A Biography of Cancer
by Siddhartha Mukherjee
Published 16 Nov 2010

In other words, if you started off with 100,000 leukemia cells in a mouse and administered a drug that killed 99 percent of those cells in a single round, then every round would kill cells in a fractional manner, resulting in fewer and fewer cells after every round of chemotherapy: 100,000 . . . 1,000 . . . 10 . . . and so forth, until the number finally fell to zero after four rounds. Killing leukemia was an iterative process, like halving a monster’s body, then halving the half, and halving the remnant half. Second, Skipper found that by adding drugs in combination, he could often get synergistic effects on killing. Since different drugs elicited different resistance mechanisms, and produced different toxicities in cancer cells, using drugs in concert dramatically lowered the chance of resistance and increased cell killing.
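The fractional-kill arithmetic in this excerpt can be sketched in a few lines of Python. The starting count of 100,000 cells and the 99 percent kill per round come from the excerpt; the function name `surviving_cells` is hypothetical, and the sketch tracks expected cell counts only, not any real pharmacology.

```python
from fractions import Fraction


def surviving_cells(initial, kill_fraction, rounds):
    """Return the expected cell count before treatment and after each round."""
    counts = [Fraction(initial)]
    for _ in range(rounds):
        # Every round removes the same fraction of whatever is left.
        counts.append(counts[-1] * (1 - kill_fraction))
    return counts


counts = surviving_cells(100_000, Fraction(99, 100), 4)
for round_number, count in enumerate(counts):
    print(round_number, float(count))
```

Under these assumptions the expected count after the fourth round is far below a single cell, which is the sense in which the sequence reaches zero.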

pages: 823 words: 220,581

Debunking Economics - Revised, Expanded and Integrated Edition: The Naked Emperor Dethroned?
by Steve Keen
Published 21 Sep 2011

The auctioneer then refuses to allow any sale to take place, and instead adjusts prices – increasing the price of those commodities where demand exceeded supply, and decreasing the price where demand was less than supply. This then results in a second set of prices, which are also highly unlikely to balance demand and supply for all commodities; so another round of price adjustments will take place, and another, and another. Walras called this iterative process of trying to find a set of prices which equates supply to demand for all commodities ‘tatonnement’ – which literally translates as ‘groping.’ He believed that this process would eventually converge to an equilibrium set of prices, where supply and demand are balanced in all markets (so long as trade at disequilibrium prices can be prevented).
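The price-adjustment loop described above can be sketched as follows. This is a toy illustration, not a model from the book: the function names, the step size, and the two linear markets are all assumptions, and the loop converges here only because the hypothetical markets are independent and linear.

```python
def tatonnement(excess_demand, prices, step=0.1, tolerance=1e-9, max_rounds=10_000):
    """Adjust prices round by round until excess demand is near zero everywhere.

    excess_demand maps a list of prices to a list of (demand - supply)
    values, one per commodity. No trade happens during the search.
    """
    for round_number in range(max_rounds):
        gaps = excess_demand(prices)
        if all(abs(gap) < tolerance for gap in gaps):
            return prices, round_number
        # Raise the price where demand exceeds supply, lower it otherwise.
        prices = [price + step * gap for price, gap in zip(prices, gaps)]
    raise RuntimeError("prices did not converge")


def toy_excess_demand(prices):
    # Hypothetical markets: demand 10 - p, supply p (commodity 1);
    # demand 8 - 2p, supply 2p (commodity 2).
    p1, p2 = prices
    return [(10 - p1) - p1, (8 - 2 * p2) - 2 * p2]


equilibrium, rounds = tatonnement(toy_excess_demand, [1.0, 1.0])
print(equilibrium, rounds)
```

With interdependent or nonlinear markets the same loop may oscillate or diverge, so the `max_rounds` guard is there to make a failure to converge explicit.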

Seeking SRE: Conversations About Running Production Systems at Scale
by David N. Blank-Edelman
Published 16 Sep 2018

Requiring high standards of writing can be counterproductive, intimidating engineers from creating docs. Similarly, polishing a document past the point where key information is up to date, discoverable, and clearly conveyed is a waste of time that could be spent improving other parts of the documentation (or your service itself). Just as code is an iterative process, so too is documentation. Learn to embrace what Anne Lamott describes as the “shitty first draft”: an imperfect document is infinitely more useful than a perfect one that does not yet exist. Ask yourself this: Does this doc meet its functional requirements and is the required information present and clearly conveyed?

pages: 496 words: 174,084

Masterminds of Programming: Conversations With the Creators of Major Programming Languages
by Federico Biancuzzi and Shane Warden
Published 21 Mar 2009

Maybe I make the decision that color is important for me and all of a sudden I realize, “Wow, I’ve just alienated the whole community of color-blind programmers.” Every one of those things becomes a constraint that I have to work out, and I have to deal with the consequences of those constraints. That argues for an iterative process. Grady: Absolutely. All of life is iterative. It goes back to the point I made earlier, which is you can’t a priori know enough to even ask the right questions. One has to take a leap of faith and move forward in the presence of imperfect information. Is it likely we’ll see a break-out visual programming language or system in the next 10 years?

pages: 828 words: 232,188

Political Order and Political Decay: From the Industrial Revolution to the Globalization of Democracy
by Francis Fukuyama
Published 29 Sep 2014

Hayek have argued, that human beings are never knowledgeable or wise enough to be able to predict the outcomes of their efforts to design institutions or plan policies with full ex ante knowledge of the results.1 But the exercise of human agency is not a one-shot affair: human beings learn from their mistakes and take actions to correct them in an iterative process. The constitution adopted by the Federal Republic of Germany in 1949 differed in significant ways from the constitution of the Weimar Republic, precisely because Germans had learned from the failure of democracy during the 1930s. In biological evolution, there are separate specific and general processes.

pages: 920 words: 233,102

Unelected Power: The Quest for Legitimacy in Central Banking and the Regulatory State
by Paul Tucker
Published 21 Apr 2018

Further, they would need to address whether an independent agency’s decisions would be observable, and whether outcomes could be evaluated against a standard fixed in advance. The public would need to be told if the success of the proposed regime might be hard to track. And all of that would need to be open to challenge and revision in an iterative process. In terms of some of today’s most potent Continental European traditions of political thought, this seems to bring about something of a reconciliation between the Freiburg ordo-liberal desire for rules of the game for socioeconomic life and the Frankfurt Habermasian prescription of political choices being made through rich and reasoned debate.

pages: 721 words: 238,678

Fall Out: A Year of Political Mayhem
by Tim Shipman
Published 30 Nov 2017

She would give a short speech on the Sunday on Brexit. Used to governing by speech, May’s aides say she used the writing process to define policy, rather than have the speech reflect a pre-ordained line. Timothy discussed with May what she wanted and then wrote a text. The finer points were clarified in ‘an iterative process’ involving May, Timothy, Hill, Jojo Penn, the deputy chief of staff, and Chris Wilkins, the head of strategy who had penned May’s ‘nasty party’ speech fourteen years earlier. ‘The first draft is a hypothesis that either she agrees with or not,’ one of those involved said. ‘Nick being Nick would write the most “out there” option and it would get reined in.

pages: 1,156 words: 229,431

The IDA Pro Book
by Chris Eagle
Published 16 Jun 2011

Note The sigmake documentation file, sigmake.txt, recommends that signature filenames follow the MS-DOS 8.3 name-length convention. This is not a hard-and-fast requirement, however. When longer filenames are used, only the first eight characters of the base filename are displayed in the signature-selection dialog. Signature generation is often an iterative process, as it is during this phase when collisions must be handled. A collision occurs anytime two functions have identical patterns. If collisions are not resolved in some manner, it is not possible to determine which function is actually being matched during the signature-application process. Therefore, sigmake must be able to resolve each generated signature to exactly one function name.

pages: 864 words: 272,918

Palo Alto: A History of California, Capitalism, and the World
by Malcolm Harris
Published 14 Feb 2023

In the early days of the radio age, the Bay Area’s inventions tended to end up in the hands of litigious East Coast capital, one way or another. This early trauma still lurks in Silicon Valley’s unconscious. Figuring out how to maintain control of their inventions and the profits they yielded was an iterative process for the Bay Area’s entrepreneurs and engineers, and that process selected for adventurous types. Consider Alexander M. Poniatoff, a Russian-born tinkerer who flew in the czar’s air corps in World War I and then fought with the Whites against the Bolsheviks. Defeated, he retreated through Siberia to Shanghai.

pages: 892 words: 91,000

Valuation: Measuring and Managing the Value of Companies
by Tim Koller, McKinsey & Company Inc., Marc Goedhart, David Wessels, Barbara Schwimmer and Franziska Manoury
Published 16 Aug 2015

During the middle part of 2014, when we performed this valuation, UPS’s stock traded between $95 and $105 per share, well within a reasonable range of the DCF valuation (reasonable changes in forecast assumptions or WACC estimates can easily move a company’s value by up to 15 percent). Although this chapter presents the enterprise DCF valuation sequentially, valuation is an iterative process. To value operations, first reorganize the company’s financial statements to separate operating items from nonoperating items and capital structure. Then analyze the company’s historical performance; define and project free cash flow over the short, medium, and long [Footnote 5: A noncontrolling interest arises when an outside investor owns a minority share of a subsidiary.]

The Art of Computer Programming: Sorting and Searching
by Donald Ervin Knuth
Published 15 Jan 1998

As an extreme example, consider placing one run on T1 and n runs on T2, T3, T4, T5; if we alternately do five-way merging to T6 and T1 until T2, T3, T4, T5 are empty, the processing time is $(2n^2 + 3n)$ initial run lengths, essentially proportional to $S^2$ instead of $S \log S$, although five-way merging was done throughout. Tape splitting. Efficient overlapping of rewind time is a problem that arises in many applications, not just sorting, and there is a general approach that can often be used. Consider an iterative process that uses two tapes in the following way:

           T1          T2
Phase 1    Output 1    -
           Rewind      -
Phase 2    Input 1     Output 2
           Rewind      Rewind
Phase 3    Output 3    Input 2
           Rewind      Rewind
Phase 4    Input 3     Output 4
           Rewind      Rewind

and so on, where "Output k" means write the kth output file and "Input k" means read it. The rewind time can be avoided when three tapes are used, as suggested by C.

The Art of Computer Programming
by Donald Ervin Knuth
Published 15 Jan 2001

Chapple published the first $O(N^3)$ method for power series reversion in CACM 4 (1961), 317-318, 503. It was an offline algorithm essentially equivalent to the method of exercise 16, with running time approximately the same as that of Algorithms L and T. Iteration of series. If we want to study the behavior of an iterative process $x_n \leftarrow f(x_{n-1})$, we are interested in studying the n-fold composition of a given function $f$ with itself, namely $x_n = f(f(\ldots f(x_0) \ldots))$. Let us define $f^{[0]}(x) = x$ and $f^{[n]}(x) = f(f^{[n-1]}(x))$, so that $f^{[m+n]}(x) = f^{[m]}(f^{[n]}(x))$ (18) for all integers $m, n \ge 0$. In many cases the notation $f^{[n]}(x)$ makes sense also when n is a negative integer, namely if $f^{[n]}$ and $f^{[-n]}$ are inverse functions such that $x = f^{[n]}(f^{[-n]}(x))$; if inverse functions are unique, (18) holds for all integers m and n.
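The n-fold composition defined in this excerpt can be sketched on ordinary numeric functions rather than power series. The helper `iterate` and the two example functions are hypothetical names chosen for illustration; negative n is handled only when the caller supplies an inverse, matching the excerpt's condition that inverse functions exist.

```python
def iterate(f, n, x, inverse=None):
    """Return the n-fold composition of f applied to x.

    Zero applications return x unchanged; each further application
    wraps one more call to f. A negative n needs an inverse function.
    """
    if n < 0:
        if inverse is None:
            raise ValueError("negative n requires an inverse function")
        f, n = inverse, -n
    for _ in range(n):
        x = f(x)
    return x


def double_plus_one(x):
    return 2 * x + 1


def halve_minus_half(x):
    # Inverse of double_plus_one: solves y = 2x + 1 for x.
    return (x - 1) / 2


# Composition law: applying 2 more steps after 3 steps equals 5 steps.
three_steps = iterate(double_plus_one, 3, 0)
five_steps = iterate(double_plus_one, 2, three_steps)
print(three_steps, five_steps, iterate(double_plus_one, 5, 0))

# A negative count undoes the forward steps when an inverse is given.
print(iterate(double_plus_one, -3, three_steps, inverse=halve_minus_half))
```

The loop keeps only the current value of x between steps, so it is itself an iterative process in the fixed-state sense used elsewhere on this page.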

pages: 1,797 words: 390,698

Power at Ground Zero: Politics, Money, and the Remaking of Lower Manhattan
by Lynne B. Sagalyn
Published 8 Sep 2016

The drafting committees convened again to review the public comments and make adjustments. During the process, Contini constantly went back to the Families Advisory Council to keep its members informed; there were always a few who didn’t agree, but most agreed with what was being formulated. The memorial, she said, “had to be about the individual and about the larger event.” The iterative process, Goldberger wrote, produced a final version “not nearly so genteel” as the initial attempt at a mission statement, which was “notable for its cautious, even hesitant language and sense of propriety.” The final version was “short, simpler, and blunter”:21 Remember and honor the thousands of innocent men, women, and children murdered by terrorists in the horrific attacks of February 26, 1993, and September 11, 2001.

Programming Python
by Mark Lutz
Published 5 Jan 2011

The Static Language Build Cycle Using traditional static languages, there is an unavoidable overhead in moving from coded programs to working systems: compile and link steps add a built-in delay to the development process. In some environments, it’s common to spend many hours each week just waiting for a static language application’s build cycle to finish. Given that modern development practice involves an iterative process of building, testing, and rebuilding, such delays can be expensive and demoralizing (if not physically painful). Of course, this varies from shop to shop, and in some domains the demand for performance justifies build-cycle delays. But I’ve worked in C++ environments where programmers joked about having to go to lunch whenever they recompiled their systems.