The Tyranny of Metrics
by Jerry Z. Muller
Published 23 Jan 2018

Is the success of the Cleveland Clinic a function of the fact that the Clinic publishes its outcomes? Or is the Clinic eager to publicize its outcomes precisely because they are so impressive? In fact, the Cleveland Clinic was one of the world’s great medical institutions before the rise of performance metrics, and it maintains that standing in the age of performance metrics. But to conclude that there is a causal relationship between the clinic’s quality and the publication of its performance metrics is to fall prey to the fallacy of post hoc ergo propter hoc. The success may have far more to do with local conditions—the ways in which the organizational culture of the Cleveland Clinic makes use of metrics—than with quality measurement per se.10 Metrics at Geisinger are effective because of the way in which they are embedded in a larger system.

Patten, Simon, 32 pay for performance, 19; in business and finance, 137–45; extrinsic and intrinsic rewards and, 53–57; in medicine, 114–16; in New Public Management, 52; origins of, 29–31; in schools, 95–96; situations for successful use of, 179–80; Taylorism and, 31–32 Pentagon, the, 35–36 performance measurement, 8, 63–64, 74, 177, 180; college, 73–75; medicine, 2–5, 107, 123, 176; and transparency as enemy of performance, 159–65 Peters, Tom, 17 pharmaceutical industry, 140–42 Phelps, Edmund, 172 philanthropy and foreign aid, 153–56 philosophical critiques of metrics, 59–64 Pisano, Gary, 150–51 Polanyi, Michael, 59 policing, 125–29, 175 politics and government, 160–62; Bush's use of performance metrics in, 11, 64, 89, 90; diplomacy and intelligence in, 162–65; higher education and (see higher education); Obama's use of performance metrics and, 33, 81–82, 85, 94; Obsessive Measurement Disorder in, 155–56; public policy related to accountability and, 12, 41, 73; schools and (see schools); Thatcher's use of performance metrics in, 56–57, 62–63, 73 Porter, Michael E., 107–8 practical tacit knowledge, 59–60 pretense of knowledge, 60 Princeton Review, 76 principal-agent theory, 49–51 productivity: increased numbers of college graduates and, 68; measuring academic, 78–80; metric fixation and costs of, 173 Pronovost, Peter, 109–10, 111–12, 176 ProPublica, 115, 116 public policy, 12, 41, 73 Public School Administration, 33 Race to the Top, 94–95, 100 Rand Corporation, 116, 131, 135 rankings, college, 75–78, 81 Rappaport, Alfred, 148 rationalism, 59–60 Ravitch, Diane, 89 remedial college courses, 70–71 Repenning, Nelson, 150 resistance to change, 46 rewarding of luck, 171 rewards, extrinsic and intrinsic, 53–57, 119–20, 137–38, 144 Rigas, John, 144 risk adjustment, 122 risk-taking, discouragement of, 62, 117–18, 171 rule cascades, 171 Sarbanes-Oxley Act of 2002, 144–45 SAT and ACT tests, 70 schools, 11, 24, 89, 175–76; achievement gap in, 20, 91, 96–99; costs of attempted 
gap-closing in, 99–101; paying for performance in, 95–96; problems and purported solution of NCLB for, 89–91; Race to the Top and, 94–95, 100; unintended consequences of NCLB for, 92–94.

Metric fixation, which aspires to imitate science, too often resembles faith. All of that is not intended to claim that measurement is useless or intrinsically pernicious. One of the purposes of this book is to specify when performance metrics are genuinely useful—how to use metrics without the characteristic dysfunctions of metric fixation. The next chapter, “Recurring Flaws,” provides a taxonomy of the most frequent types of flaws in the use of performance metrics. Defining and labeling them will make it easier to refer back to them later. Then, in part II, we examine the origins of metric fixation and account for its spread and tenacity in spite of its frequent failures, in addition to exploring some of the deeper philosophical sources of its shortcomings.

Mastering Machine Learning With Scikit-Learn
by Gavin Hackeling
Published 31 Oct 2014

Table of Contents

Preface
Chapter 1: The Fundamentals of Machine Learning
  Learning from experience
  Machine learning tasks
  Training data and test data
  Performance measures, bias, and variance
  An introduction to scikit-learn
  Installing scikit-learn
  Installing scikit-learn on Windows
  Installing scikit-learn on Linux
  Installing scikit-learn on OS X
  Verifying the installation
  Installing pandas and matplotlib
  Summary
Chapter 2: Linear Regression
  Simple linear regression
  Evaluating the fitness of a model with a cost function
  Solving ordinary least squares for simple linear regression
  Evaluating the model
  Multiple linear regression
  Polynomial regression
  Regularization
  Applying linear regression
  Exploring the data
  Fitting and evaluating the model
  Fitting models with gradient descent
  Summary
Chapter 3: Feature Extraction and Preprocessing
  Extracting features from categorical variables
  Extracting features from text
  The bag-of-words representation
  Stop-word filtering
  Stemming and lemmatization
  Extending bag-of-words with TF-IDF weights
  Space-efficient feature vectorizing with the hashing trick
  Extracting features from images
  Extracting features from pixel intensities
  Extracting points of interest as features
  SIFT and SURF
  Data standardization
  Summary
Chapter 4: From Linear Regression to Logistic Regression
  Binary classification with logistic regression
  Spam filtering
  Binary classification performance metrics
  Accuracy
  Precision and recall
  Calculating the F1 measure
  ROC AUC
  Tuning models with grid search
  Multi-class classification
  Multi-class classification performance metrics
  Multi-label classification and problem transformation
  Multi-label classification performance metrics
  Summary
Chapter 5: Nonlinear Classification and Regression with Decision Trees
  Decision trees
  Training decision trees
  Selecting the questions
  Information gain
  Gini impurity
  Decision trees with scikit-learn
  Tree ensembles
  The advantages and disadvantages of decision trees
  Summary
Chapter 6: Clustering with K-Means
  Clustering with the K-Means algorithm
  Local optima
  The elbow method
  Evaluating clusters
  Image quantization
  Clustering to learn features
  Summary
Chapter 7: Dimensionality Reduction with PCA
  An overview of PCA
  Performing Principal Component Analysis
  Variance, Covariance, and Covariance Matrices
  Eigenvectors and eigenvalues
  Dimensionality reduction with Principal Component Analysis
  Using PCA to visualize high-dimensional data
  Face recognition with PCA
  Summary
Chapter 8: The Perceptron
  Activation functions
  The perceptron learning algorithm
  Binary classification with the perceptron
  Document classification with the perceptron
  Limitations of the perceptron
  Summary
Chapter 9: From the Perceptron to Support Vector Machines
  Kernels and the kernel trick
  Maximum margin classification and support vectors
  Classifying characters in scikit-learn
  Classifying handwritten digits
  Classifying characters in natural images
  Summary
Chapter 10: From the Perceptron to Artificial Neural Networks
  Nonlinear decision boundaries
  Feedforward and feedback artificial neural networks
  Multilayer perceptrons
  Minimizing the cost function
  Forward propagation
  Backpropagation
  Approximating XOR with Multilayer perceptrons
  Classifying handwritten digits
  Summary
Index

Preface

Recent years have seen the rise of machine learning, the study of software that learns from experience.

I.ll pay you back by mid february. Pls.
Prediction: ham. Message: Where do you need to go to get it?

How well does our classifier perform? The performance metrics we used for linear regression are inappropriate for this task. We are only interested in whether the predicted class was correct, not how far it was from the decision boundary. In the next section, we will discuss some performance metrics that can be used to evaluate binary classifiers.

Binary classification performance metrics

A variety of metrics exist to evaluate the performance of binary classifiers against trusted labels. The most common metrics are accuracy, precision, recall, F1 measure, and ROC AUC score.
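As a rough illustration of the first four metrics named above, the following sketch computes them from two lists of labels in plain Python. It is not the book's code: the function name `binary_classification_metrics` and the toy labels are assumptions made for this example, and ROC AUC is left out because it needs predicted scores rather than hard labels.

```python
def binary_classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 from hard labels.

    Hypothetical helper for illustration; `positive` marks the class
    treated as the positive one (for example, spam).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against empty denominators when a class is never predicted.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels (made up): 1 = spam, 0 = ham.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
metrics = binary_classification_metrics(y_true, y_pred)
print(metrics)
```

With these toy labels there are three true positives, three true negatives, one false positive and one false negative, so all four metrics come out at 0.75.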

This problem transformation ensures that the single-label problems will have the same number of training examples as the multilabel problem, but ignores relationships between the labels.

Multi-label classification performance metrics

Multi-label classification problems must be assessed using different performance measures than single-label classification problems. Two of the most common performance metrics are Hamming loss and Jaccard similarity. Hamming loss is the average fraction of incorrect labels. Note that Hamming loss is a loss function, and that the perfect score is zero. Jaccard similarity, or the Jaccard index, is the size of the intersection of the predicted labels and the true labels divided by the size of the union of the predicted and true labels.
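The two definitions can be sketched directly in plain Python. This is an illustrative sketch, not the book's code: the function names, the binary indicator layout for Hamming loss, and the choice to return 1.0 when both label sets are empty are assumptions made here.

```python
def hamming_loss(y_true, y_pred):
    """Average fraction of incorrect labels.

    Both arguments are lists of equal-length binary indicator rows,
    one row per instance and one column per label.
    """
    total = sum(len(row) for row in y_true)
    wrong = sum(
        1
        for true_row, pred_row in zip(y_true, y_pred)
        for t, p in zip(true_row, pred_row)
        if t != p
    )
    return wrong / total


def jaccard_similarity(true_labels, predicted_labels):
    """Size of the intersection divided by the size of the union."""
    true_set, pred_set = set(true_labels), set(predicted_labels)
    union = true_set | pred_set
    if not union:
        return 1.0  # assumption: two empty label sets count as identical
    return len(true_set & pred_set) / len(union)


# One of four label slots is wrong, so the loss is 0.25.
loss = hamming_loss([[0, 1], [1, 1]], [[1, 1], [1, 1]])

# One shared label out of three distinct labels gives 1/3.
similarity = jaccard_similarity({"a", "b"}, {"b", "c"})
print(loss, similarity)
```

A perfect prediction gives a Hamming loss of 0 and a Jaccard similarity of 1, which is why the two scores move in opposite directions as a model improves.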

The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling
by Ralph Kimball and Margy Ross
Published 30 Jun 2013

Chapter 8 Customer Relationship Management

Aggregated Facts as Dimension Attributes

Business users are often interested in constraining the customer dimension based on aggregated performance metrics, such as filtering on all customers who spent over a certain dollar amount during last year or perhaps over the customer's lifetime. Selected aggregated facts can be placed in a dimension as targets for constraining and as row labels for reporting. The metrics are often presented as banded ranges in the dimension table. Dimension attributes representing aggregated performance metrics add burden to the ETL processing, but ease the analytic burden in the BI layer.

Dynamic Value Bands

A dynamic value banding report is organized as a series of report row headers that define a progressive set of varying-sized ranges of a target numeric fact.
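A minimal sketch of the idea, assuming an ETL step written in Python: order facts are summed per customer and the total is stored alongside a banded label that can be used as a constraint or report row header. The band boundaries, labels, and names such as `spend_band` and `lifetime_spend_band` are hypothetical and are not taken from the book.

```python
from bisect import bisect_right
from collections import defaultdict

# Hypothetical band lower bounds (in dollars) and their labels.
BAND_LOWER_BOUNDS = [0, 100, 1000, 10000]
BAND_LABELS = ["Under $100", "$100 to $999",
               "$1,000 to $9,999", "$10,000 and over"]


def spend_band(total_spend):
    """Map an aggregated spend amount to its banded range label."""
    if total_spend < 0:
        raise ValueError("total_spend must be non-negative")
    # bisect_right finds the first lower bound above the amount, so the
    # band that contains the amount sits one position before it.
    return BAND_LABELS[bisect_right(BAND_LOWER_BOUNDS, total_spend) - 1]


def build_customer_attributes(order_facts):
    """Aggregate (customer_key, amount) facts into dimension attributes."""
    totals = defaultdict(float)
    for customer_key, amount in order_facts:
        totals[customer_key] += amount
    return {
        key: {"lifetime_spend": total,
              "lifetime_spend_band": spend_band(total)}
        for key, total in totals.items()
    }


# Made-up order facts: (customer_key, order_amount).
orders = [(1, 40.0), (1, 75.0), (2, 12000.0), (3, 20.0)]
customer_attributes = build_customer_attributes(orders)
for key, attributes in sorted(customer_attributes.items()):
    print(key, attributes)
```

The aggregation cost is paid once during loading, which mirrors the trade-off described above: more ETL work in exchange for simpler filtering in the BI layer.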

Bus Matrix for HR Processes Although an employee dimension with precise type 2 slowly changing dimension tracking coupled with a monthly periodic snapshot of core HR performance metrics is a good start, they just scratch the surface when it comes to tracking HR data. Figure 9.4 illustrates other processes that HR professionals and functional managers are likely keen to analyze. We've embellished this preliminary bus matrix with the type of fact table that might be used for each process; however, your source data realities and business requirements may warrant a different or complementary treatment. Figure 9.4 Bus matrix rows for HR processes. Some of these business processes capture performance metrics, but many result in factless fact tables, such as benefit eligibility or participation.

We also explore a series of basic and advanced techniques for handling slowly changing dimension attributes; we've built on the long-standing foundation of type 1 (overwrite), type 2 (add a row), and type 3 (add a column) as we introduce readers to type 0 and types 4 through 7. Chapter 6: Order Management In this case study, we look at the business processes that are often the first to be implemented in DW/BI systems as they supply core business performance metrics—what are we selling to which customers at what price? We discuss dimensions that play multiple roles within a schema. We also explore the common challenges modelers face when dealing with order management information, such as header/line item considerations, multiple currencies or units of measure, and junk dimensions with miscellaneous transaction indicators.

Text Analytics With Python: A Practical Real-World Approach to Gaining Actionable Insights From Your Data
by Dipanjan Sarkar
Published 1 Dec 2016

    # predict sentiment for test movie reviews dataset
    sentiwordnet_predictions = [analyze_sentiment_sentiwordnet_lexicon(review)
                                for review in test_reviews]

    # evaluation helpers from the book's utils module
    from utils import display_evaluation_metrics, display_confusion_matrix, display_classification_report

    # get model performance statistics
    print('Performance metrics:')
    display_evaluation_metrics(true_labels=test_sentiments,
                               predicted_labels=sentiwordnet_predictions,
                               positive_class='positive')
    print('\nConfusion Matrix:')
    display_confusion_matrix(true_labels=test_sentiments,
                             predicted_labels=sentiwordnet_predictions,
                             classes=['positive', 'negative'])
    print('\nClassification report:')
    display_classification_report(true_labels=test_sentiments,
                                  predicted_labels=sentiwordnet_predictions,
                                  classes=['positive', 'negative'])

    Performance metrics:
    Accuracy: 0.59
    Precision: 0.56
    Recall: 0.92
    F1 Score: 0.7

    Confusion Matrix:
                      Predicted:
                      positive  negative
    Actual: positive      6941       569
            negative      5510      1980

    Classification report:
                 precision    recall  f1-score   support
    positive          0.56      0.92      0.70      7510
    negative          0.78      0.26      0.39      7490
    avg / total       0.67      0.59      0.55     15000

Our model has a sentiment prediction accuracy of around 60% and an F1-score of approximately 70%.

The following snippet shows the model sentiment prediction performance on the entire test movie reviews dataset:

    # evaluation helpers from the book's utils module
    from utils import display_evaluation_metrics, display_confusion_matrix, display_classification_report

    # predict sentiment for test movie reviews dataset
    vader_predictions = [analyze_sentiment_vader_lexicon(review, threshold=0.1)
                         for review in test_reviews]

    # get model performance statistics
    print('Performance metrics:')
    display_evaluation_metrics(true_labels=test_sentiments,
                               predicted_labels=vader_predictions,
                               positive_class='positive')
    print('\nConfusion Matrix:')
    display_confusion_matrix(true_labels=test_sentiments,
                             predicted_labels=vader_predictions,
                             classes=['positive', 'negative'])
    print('\nClassification report:')
    display_classification_report(true_labels=test_sentiments,
                                  predicted_labels=vader_predictions,
                                  classes=['positive', 'negative'])

    Performance metrics:
    Accuracy: 0.7
    Precision: 0.65
    Recall: 0.86
    F1 Score: 0.74

    Confusion Matrix:
                      Predicted:
                      positive  negative
    Actual: positive      6434      1076
            negative      3410      4080

    Classification report:
                 precision    recall  f1-score   support
    positive          0.65      0.86      0.74      7510
    negative          0.79      0.54      0.65      7490
    avg / total       0.72      0.70      0.69     15000

The preceding metrics show that our model has a sentiment prediction accuracy of around 70 percent and an F1-score close to 75 percent, which is definitely better than our previous model.

The following snippet achieves the same:

    # evaluation helpers from the book's utils module
    from utils import display_evaluation_metrics, display_confusion_matrix, display_classification_report

    # predict sentiment for test movie reviews dataset
    pattern_predictions = [analyze_sentiment_pattern_lexicon(review, threshold=0.1)
                           for review in test_reviews]

    # get model performance statistics
    print('Performance metrics:')
    display_evaluation_metrics(true_labels=test_sentiments,
                               predicted_labels=pattern_predictions,
                               positive_class='positive')
    print('\nConfusion Matrix:')
    display_confusion_matrix(true_labels=test_sentiments,
                             predicted_labels=pattern_predictions,
                             classes=['positive', 'negative'])
    print('\nClassification report:')
    display_classification_report(true_labels=test_sentiments,
                                  predicted_labels=pattern_predictions,
                                  classes=['positive', 'negative'])

    Performance metrics:
    Accuracy: 0.77
    Precision: 0.76
    Recall: 0.79
    F1 Score: 0.77

    Confusion Matrix:
                      Predicted:
                      positive  negative
    Actual: positive      5958      1552
            negative      1924      5566

    Classification report:
                 precision    recall  f1-score   support
    positive          0.76      0.79      0.77      7510
    negative          0.78      0.74      0.76      7490
    avg / total       0.77      0.77      0.77     15000

This model gives a better and more balanced performance toward predicting the sentiment of both positive and negative classes.
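The headline numbers for the three lexicon models can be re-derived from their confusion matrices alone. The sketch below does that arithmetic in plain Python; the function `metrics_from_confusion` and the dictionary keys are names invented for this example, and the counts are copied from the outputs quoted in these excerpts.

```python
def metrics_from_confusion(tp, fn, fp, tn):
    """Derive accuracy, precision, recall and F1 for the positive class."""
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {name: round(value, 2) for name, value in
            [("accuracy", accuracy), ("precision", precision),
             ("recall", recall), ("f1", f1)]}


# (tp, fn, fp, tn) with 'positive' as the positive class, read from the
# confusion matrices: rows are actual classes, columns are predictions.
confusion_counts = {
    "sentiwordnet": (6941, 569, 5510, 1980),
    "vader": (6434, 1076, 3410, 4080),
    "pattern": (5958, 1552, 1924, 5566),
}

results = {model: metrics_from_confusion(*counts)
           for model, counts in confusion_counts.items()}
for model, scores in results.items():
    print(model, scores)
```

Laying the three models side by side this way makes the trade-off visible: the SentiWordNet model reaches high recall only by labelling most reviews positive, while the Pattern model gives up some recall for much better precision.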

pages: 263 words: 75,455

Quantitative Value: A Practitioner's Guide to Automating Intelligent Investment and Eliminating Behavioral Errors
by Wesley R. Gray and Tobias E. Carlisle
Published 29 Nov 2012

Figure 1.1 sets out a brief graphical overview of the performance of the cheapest stocks according to common fundamental price ratios, such as the price-to-earnings (P/E) ratio, the price-to-book (P/B) ratio, and the EBITDA enterprise multiple (total enterprise value divided by earnings before interest, taxes, depreciation, and amortization, or TEV/EBITDA).

FIGURE 1.1 Cumulative Returns to Common Price Ratios

As Figure 1.1 illustrates, value investing according to simple fundamental price ratios has cumulatively beaten the S&P 500 over almost 50 years. Table 1.1 shows some additional performance metrics for the price ratios. The numbers illustrate that value strategies have been very successful (Chapter 7 has a detailed discussion of our investment simulation procedures).

TABLE 1.1 Long-Term Performance of Common Price Ratios (1964 to 2011)

The counterargument to the empirical outperformance of value stocks is that these stocks are inherently more risky.

We have chosen eight years as our “long term” for two reasons: First, eight years likely captures a boom-and-bust cycle for the typical stock, and, second, there are sufficient stocks with eight years of historical data that we can identify a sufficiently large universe of stocks.9 We analyze three long-term, high-return operating performance metrics and rank these variables against the entire universe of stocks: long-term free cash flow on assets, long-term geometric return on assets, and long-term geometric return on capital, discussed next. The first measure is long-term free cash flow on assets (CFOA), defined as the sum of eight years of free cash flow divided by total assets.
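The CFOA definition translates into a one-line calculation. The following is a minimal sketch assuming annual free cash flow figures and a single total assets figure in the same currency units; the function name `long_term_cfoa` and the sample numbers are made up for illustration and do not come from the book.

```python
def long_term_cfoa(free_cash_flows, total_assets):
    """Sum of eight years of free cash flow divided by total assets."""
    if len(free_cash_flows) != 8:
        raise ValueError("expected exactly eight years of free cash flow")
    if total_assets <= 0:
        raise ValueError("total_assets must be positive")
    return sum(free_cash_flows) / total_assets


# Hypothetical company: eight annual free cash flows and total assets,
# all in millions of dollars.
annual_fcf = [10, 12, 8, 15, 11, 9, 14, 13]
cfoa = long_term_cfoa(annual_fcf, total_assets=400)
print(round(cfoa, 4))
```

Here the eight flows sum to 92, so the ratio is 92 / 400 = 0.23. Ranking this value against the same ratio for every other stock in the universe would give the percentile the text refers to.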

Maximum margin is calculated in the following way: MM = Max [Percentile (MS), Percentile (MG)] where percentile is simply the performance of the stock according to each variable expressed as its percentile in the universe of stocks. Maximum margin takes each stock's best-performing profit margin metric and awards this rank to the stock. For example, a stock that scores 50 on the margin growth, and 64 on the margin stability, is awarded a maximum margin rank of 64 because this is the stock's best-performing metric. MM allows each stock to put its best foot forward. It ensures that stocks with high profit margin growth get recognized for the growth, and are not penalized for the lack of stability. Similarly, maximum margin credits stocks for stable profit margins, but does not penalize them for the lack of growth.
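The maximum margin rule can be sketched in a few lines. The `maximum_margin` function follows the formula in the text; the `percentile_rank` helper is an assumption added here, since the excerpt does not say exactly how percentiles are computed, and it simply uses the share of the universe at or below a stock's value.

```python
def percentile_rank(value, universe):
    """Percentile of `value` within `universe` (assumed definition:
    percentage of observations less than or equal to the value)."""
    if not universe:
        raise ValueError("universe must not be empty")
    at_or_below = sum(1 for v in universe if v <= value)
    return 100.0 * at_or_below / len(universe)


def maximum_margin(ms_percentile, mg_percentile):
    """MM = max(percentile of margin stability, percentile of margin growth)."""
    return max(ms_percentile, mg_percentile)


# The worked example from the text: 50 on margin growth, 64 on stability.
mm = maximum_margin(ms_percentile=64, mg_percentile=50)
print(mm)

# Hypothetical universe of margin growth values for four stocks.
growth_universe = [0.1, 0.2, 0.3, 0.4]
print(percentile_rank(0.3, growth_universe))
```

Taking the maximum rather than an average is what lets each stock be credited for whichever margin characteristic it is best at.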

pages: 240 words: 78,436

Open for Business: Harnessing the Power of Platform Ecosystems
by Lauren Turner Claire, Laure Claire Reillier and Benoit Reillier
Published 14 Oct 2017

4 Economic characteristics of platforms
5 Platforms as business models
6 Platform-powered ecosystems
7 Life stages of platforms: design
8 Platform ignition: proving the concept
9 Platform scaling: reaching critical mass
10 Platform maturity: defending profitable growth
11 Platform pricing
12 Trust, governance and brand
13 Platforms, regulation and competition
14 Competing against platforms
15 The future of platforms
A word from the authors
Index

Figures

1.1 Airbnb global listing growth
2.1 Digital transformation from linear to non-linear
2.2 Platform-powered businesses in top 10 FT Global 500 (market cap)
2.3 Platform companies by region
2.4 The top 10 most valuable brands in the world
2.5 Private value of platform-powered unicorn start-ups
4.1 Network effects
5.1 The linear firm
5.2 Michael Porter’s value chain (1985)
5.3 The business model canvas
5.4 The rocket model
6.1 Amazon’s main business lines
6.2 Relative growth of Amazon retail vs Amazon marketplace
6.3 Apple’s main business lines
6.4 Split of Apple’s hardware vs services revenues
6.5 Apple App Store billings vs Hollywood US box office revenues
6.6 Google’s main business lines
7.1 The main life stages of a platform
7.2 Pre-launch rocket questions
7.3 Direct vs indirect platforms
8.1 Ignition rocket questions to achieve platform fit
9.1 Scaling-up the rocket to reach critical mass
9.2 User and producer acquisition sources
10.1 Defending platform position and growing profitably
10.2 Assessing platform changes on users and producers
12.1 Trust survey
12.2 The trust bank
12.3 The 7Cs of trust
12.4 BlaBlaCar driver profile
12.5 eBay star system
13.1 NYC taxi medallion prices (2004–2015)
13.2 Google’s proposed search page during negotiations
13.3 Uber banner appearing in Google Maps mobile searches
14.1 Denial

Tables

1.1 Comparison of the largest hotel groups vs Airbnb
1.2 Examples of digital platforms
2.1 High-level typology of sharing economy platforms
3.1 Simplified typology of platform and non-platform business models
5.1 Economic strengths and weaknesses of selected business models
6.1 Amazon’s ecosystem
6.2 Apple’s ecosystem
6.3 Google’s ecosystem
7.1 Rocket life stages
7.2 Hilton’s value proposition
7.3 Deconstructing value propositions for multisided businesses
7.4 Airbnb’s value propositions
8.1 Examples of performance metrics at platform ignition
9.1 Illustrative value proposition for a scaling product marketplace
9.2 Mapping interactions between participants – eBay illustration
9.3 Examples of performance metrics at platform scaling
10.1 Upwork’s new sliding fee structure
10.2 Examples of performance metrics at platform maturity
11.1 Seller fee examples
11.2 Matching platform objectives with pricing levers and examples
13.1 Selected Google services and some competitors’ services by year of launch

About the authors

Benoit Reillier is managing director and co-founder of Launchworks and specializes in helping businesses design and execute winning strategies to harness the power of communities and platform ecosystems.

If early participants are tech-savvy early adopters who like the platform concept, they won’t mind the imperfections and will willingly provide feedback to improve the user experience. As the number of participants is still small, the launch team should be able to be right on top of any participant issue that comes up, and update governance rules on an ongoing basis. Participants should feel

Table 8.1 Examples of performance metrics at platform ignition

Platform fit
• Engagement: % of sign-ups that search, connect, transact
• Customer feedback: particularly qualitative
• Customer retention: % of users that remain active, ‘retention curves’

Clearing core interaction bottlenecks
Liquidity:
• Ratio of active users (producers) to total users (producers) and ratio of active users to active producers
• Number of active users/producers vs minimum liquidity target
• On-boarding completion rates
• Fulfilment completion rates and time (e.g. suppliers delivering goods on time), waiting times for users (e.g.

How much capital is required to support growth over the next few months? (See Chapter 8 on ignition for a list of metrics.) A few generic metric examples are given in Table 9.3. Note that the ‘North Star’, introduced in Chapter 7, should ideally continue to act as the overarching growth metric for the business.16

Table 9.3 Examples of performance metrics at platform scaling

Attracting participants
• Growth rate of interactions
• Growth rate of active users and producers
• Cohort analysis of growth, % of interactions from ‘new’ users
• ‘New/marginal’ user feedback
• Viral coefficient, breakdown of paid vs organic viral growth
• Retention curves, frequency/number of interactions per user

Liquidity and balance
• Number of/ratio of active producers/users above some activity threshold (e.g.

Investing Amid Low Expected Returns: Making the Most When Markets Offer the Least
by Antti Ilmanen
Published 24 Feb 2022

This book's subtitle reflects the same aspiration applied to the current market environment. I dedicated my first book to Rory's memory.

6 The historical average return levels depend crucially on the leverage and volatility applied to these long/short strategies. Sharpe ratio (SR) is a more robust (scale-invariant) performance metric than average return. I will thus use SRs extensively, despite their own shortcomings. As a reminder, SR is the ratio of average over volatility for any investment's excess return over cash. Here I conservatively target bond-like 5% volatility per style, which gives 2.5% to 4.5% average premium per style.

7 I found 30-odd years ago that I belong to the majority who naturally read text and gloss over equations in any article, while most of my peers in the Finance Ph.D. program were inclined to do the opposite.
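The Sharpe ratio definition and its scale invariance can be shown with a short sketch. It assumes a list of excess returns over cash measured over equal periods and uses the sample standard deviation as the volatility estimate; the function name and the toy returns are illustrative, and no annualization is applied. The implied Sharpe ratios of 0.5 and 0.9 are simply the quoted premia divided by the 5% volatility target.

```python
from statistics import mean, stdev


def sharpe_ratio(excess_returns):
    """Average excess return over cash divided by its volatility."""
    return mean(excess_returns) / stdev(excess_returns)


# Toy excess returns over cash (made-up numbers).
excess = [0.02, 0.00, 0.01]
sr = sharpe_ratio(excess)

# Doubling leverage doubles both the average and the volatility,
# so the ratio is unchanged (scale invariance).
levered_sr = sharpe_ratio([2 * r for r in excess])
print(sr, levered_sr)

# Premium = SR x volatility: a 5% volatility target with premia of
# 2.5% to 4.5% corresponds to Sharpe ratios of 0.5 to 0.9.
target_volatility = 0.05
implied_srs = [premium / target_volatility for premium in (0.025, 0.045)]
print(implied_srs)
```

This is why average returns of long/short strategies are hard to compare directly: they move with the chosen leverage, while the ratio does not.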

The best resource on historical equity premia is the Dimson-Marsh-Staunton yearbooks, which by 2021 cover up to 90 countries, though 32 are included in the main results and “only” 21 have the full 121-year history, 1900–2020. For global equities, the total annual compound return or geometric mean is 8.3% (9.7% arithmetic mean7). The corresponding real return is 5.3% (6.7%), equity premium over US bills 4.4% (5.9%), equity premium over bonds 3.1% (4.3%), and SR 0.35.8 Figure 4.4 shows four different performance metrics for five multi-country composites: World (the middle bar), its split to the US and World-ex-US, and an alternative split to Developed and Emerging Markets.9 Whichever metric we use, equities have a proud history behind them, with statistically and economically significant positive premia.10

Figure 4.4 Average Compound Returns and Premia for Global Equities, 1900–2020
Source: Data from Dimson-Marsh-Staunton (2021).

Lack of mark-to-market pricing implies smoother returns, which may result in understated risk (volatility, drawdowns, equity market betas, and correlations). Many investors like this feature and it may have asset pricing implications. There is often less data in length, frequency, and quality in illiquid assets. Performance metrics based on internal rates of return (IRR) can be particularly problematic (opaque and gameable), and investors learn more slowly from them. Lack of investable indices means that investors must be active in illiquid assets, and the performance dispersion among managers is much wider than in traditional assets.

pages: 321

Finding Alphas: A Quantitative Approach to Building Trading Strategies
by Igor Tulchinsky
Published 30 Sep 2019

Figure 31.2 PnL graph for sample WebSim alpha, where Alpha = rank(sales/assets).

Table 31.1 Performance metrics for sample WebSim alpha in Figure 31.2

In addition, numerous metrics are displayed, giving the user an opportunity to evaluate the aggregate performance of the alpha, as shown in Table 31.1. These performance metrics reflect the distribution of capital across the stocks and the alpha’s performance, including the annual and aggregate PnL, Sharpe ratio, turnover, and other parameters. The first thing to consider is whether the alpha is profitable.

Each trade in or out of a position carries transaction costs (fees and spread costs). If the turnover number is high – for example, over 40% – the transaction costs may eradicate some or all of the PnL that the alpha generated during simulation. The other performance metrics and their uses in evaluating alpha performance are discussed in more detail in the WebSim user guides and in videos in the educational section of the website. In addition to the aggregate performance metrics, WebSim data visualization charts and graphs help to confirm that an alpha has an acceptable distribution of positions and returns across equities grouped by capitalization, industry, or sector.

We can use the same mean-reversion idea mentioned above and express it in terms of a mathematical expression as follows:

Alpha1 = -(close(today) - close(5_days_ago)) / close(5_days_ago)

To find out if this idea works, we need a simulator to do backtesting. We can use WebSim for this purpose. Using WebSim, we get the sample results for this alpha, as shown in Figure 5.1. Table 5.1 shows several performance metrics used to evaluate an alpha. We focus on the most important metrics. The backtesting is done from 2010 through 2015, so each row of the output lists the annual performance of that year. The total simulation book size is always fixed at $20 million; the PnL is the annual PnL.

Figure 5.1 Sample simulation result of Alpha1 by WebSim (cumulative profit over time)

Annual return is defined as:

Ann_return = ann_pnl / (booksize / 2)

The annual return measures the profitability of the alpha.
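A small sketch of the two formulas, under stated assumptions: the alpha is read as the negated five-day return, which is what a mean-reversion idea implies (the sign is an interpretation, since operators are missing from the extracted expression), and the annual return divides annual PnL by half the book size. The function names, the price series and the $2 million PnL are invented for illustration; only the $20 million book size comes from the text. This does not reproduce WebSim itself.

```python
def alpha1(closes):
    """Negated five-day return: positive after a fall, negative after a rise.

    `closes` is a list of daily closing prices, oldest first, with the
    last element being today's close.
    """
    if len(closes) < 6:
        raise ValueError("need at least six closes (today and 5 days ago)")
    today, five_days_ago = closes[-1], closes[-6]
    return -(today - five_days_ago) / five_days_ago


def annual_return(ann_pnl, booksize):
    """Ann_return = ann_pnl / (booksize / 2)."""
    return ann_pnl / (booksize / 2)


# Hypothetical price path that rose 10% over five days.
closes = [100.0, 101.0, 102.0, 103.0, 104.0, 110.0]
signal = alpha1(closes)
print(signal)

# Hypothetical annual PnL of $2 million on the $20 million book.
ret = annual_return(ann_pnl=2_000_000, booksize=20_000_000)
print(ret)
```

With these inputs the signal is -0.1, meaning the stock would be underweighted after its run-up, and the annual return is 0.2, that is 20% of the $10 million that half the book represents.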

pages: 597 words: 119,204

Website Optimization
by Andrew B. King
Published 15 Mar 2008

Tracking and Metrics You should track the success of all PPC elements through website analytics and conversion tracking. Google offers a free analytics program called Google Analytics. With it you can track multiple campaigns and get separate data for organic and paid listings. Whatever tracking program you use, you have to be careful to keep track of performance metrics correctly. The first step in optimizing a PPC campaign is to use appropriate metrics. Profitable campaigns with equally valued conversions might be optimized to: Reduce the CPC given the same (or greater) click volume and conversion rates. Increase the CTR given the same (or a greater) number of impressions and the same (or better) conversion rates.

According to the study, "Approximately 31 percent of U.S. computer users clear their first-party cookies in a month." Under these conditions, a server-centric measurement would overestimate unique visitors by 150%.

[166] PathLoss is a metric developed by Paul Holstein of

Web Performance Metrics

At first glance, measuring the speed of a web page seems straightforward. Start a timer. Load up the page. Click Stop when the web page is "ready." Write down the time. For users, however, "ready" varies across different browsers on different connection speeds (dial-up, DSL, cable, LAN) at different locations (Washington, DC, versus Mountain View, California, versus Bangalore, India) at different times of the day (peak versus off-peak times) and from different browse paths (fresh from search results or accessed from a home page).

IBM Page Detailer IBM Page Detailer is a Windows tool that sits quietly in the background as you browse. It captures snapshots of how objects are loading on the page behind the scenes. Download it from IBM Page Detailer captures three basic performance metrics: load time, bytes, and items. These correlate to the Document Complete, kilobytes received, and number of requests metrics we are tracking. We recommend capturing three to five page loads and averaging the metrics to ensure that no anomalies impacted performance in the data, such as a larger ad.

pages: 372 words: 67,140

Jenkins Continuous Integration Cookbook
by Alan Berg
Published 15 Mar 2012

Consider storing User-Agents and other browser headers in a text file, and then picking the values up for HTTP requests through the CSV Data Set Config element. This is useful if resources returned to your web browser, such as JavaScript or images, depend on the User-Agents. JMeter can then loop through the User-Agents, asserting that the resources exist.

See also
• Reporting JMeter performance metrics
• Functional testing using JMeter assertions

Reporting JMeter performance metrics

In this recipe, you will be shown how to configure Jenkins to run a JMeter test plan, and then collect and report the results. The passing of variables from an Ant script to JMeter will also be explained.

Getting ready

It is assumed that you have run through the last recipe, Creating JMeter test plans.

Testing Remotely In this chapter, we will cover the following recipes: Deploying a WAR file from Jenkins to Tomcat Creating multiple Jenkins nodes Testing with Fitnesse Activating Fitnesse HtmlUnit Fixtures Running Selenium IDE tests Triggering failsafe integration tests with Selenium Webdriver Creating JMeter test plans Reporting JMeter performance metrics Functional testing using JMeter assertions Enabling Sakai web services Writing test plans with SoapUI Reporting SoapUI test results Introduction By the end of this chapter, you will have run performance and functional tests against web applications and web services. Two typical setup recipes are included.

This approach is especially important when starting from an HTML mockup of a web application, whose underlying code is changing rapidly. The test plan logs in and out of your local instance of Jenkins, checking size, duration, and text found in the login response. Getting ready We assume that you have already performed the Creating JMeter test plans and Reporting JMeter performance metrics recipes. The recipe requires the creation of a user tester1 in Jenkins. Feel free to change the username and password. Remember to delete the test user once it is no longer needed. How to do it... Create a user in Jenkins named tester1 with password testtest. Run JMeter. In the Test Plan element, change Name to LoginLogoutPlan, and add the following details for User Defined Variables: Name: USER; Value: tester1. Name: PASS; Value: testtest. Right-click on Test Plan, then select Add | Config Element | HTTP Cookie Manager.

pages: 502 words: 107,510

Natural Language Annotation for Machine Learning
by James Pustejovsky and Amber Stubbs
Published 14 Oct 2012

Again, using some of the features that are identified in Natural Language Processing with Python, we have:[2] F1: last_letter = “a” F2: last_letter = “k” F3: last_letter = “f” F4: last_letter = “r” F5: last_letter = “y” F6: last_2_letters = “yn” Choose a learning algorithm to infer the target function from the experience you provide it with. We will start with the decision tree method. Evaluate the results according to the performance metric you have chosen. We will use accuracy over the resultant classifications as a performance metric. But, now, where do we start? That is, which feature do we use to start building our tree? When using a decision tree to partition your data, this is one of the most difficult questions to answer. Fortunately, there is a very nice way to assess the impact of choosing one feature over another.

We will assume that target function is represented as the MAP of the Bayesian classifier over the features. Choose a learning algorithm to infer the target function from the experience you provide it with. This is tied to the way we chose to represent the function, namely: Evaluate the results according to the performance metric you have chosen. We will use accuracy over the resultant classifications as a performance metric. Sentiment classification Now let’s look at some classification tasks where different feature sets resulting from richer annotation have proved to be helpful for improving results. We begin with sentiment or opinion classification of texts.

In particular, we will answer the following question: when does annotation actually help in a learning algorithm? Defining Our Learning Task To develop an algorithm, we need to have a precise representation of what we are trying to learn. We’ll start with Tom Mitchell’s [1] definition of a learning task: Learning involves improving on a task, T, with respect to a performance metric, P, based on experience, E. Given this statement of the problem (inspired by Simon’s concise phrasing shown earlier), Mitchell then discusses the five steps involved in the design of a learning system. Consider what the role of a specification and the associated annotated data will be for each of the following steps for designing a learning system: Choose the “training experience.”

pages: 233 words: 67,596

Competing on Analytics: The New Science of Winning
by Thomas H. Davenport and Jeanne G. Harris
Published 6 Mar 2007

Unintegrated systems 2 Localized analytics Autonomous activity builds experience and confidence using analytics; creates new analytically based insights Disconnected, very narrow focus Pockets of isolated analysts (may be in finance, SCM, or marketing/CRM) Functional and tactical Desire for more objective data, successes from point use of analytics start to get attention Recent transaction data unintegrated, missing important information. Isolated BI/analytic efforts 3 Analytical aspirations Coordinated; establish enterprise performance metrics, build analytically based insights Mostly separate analytic processes. Building enterpriselevel plan Analysts in multiple areas of business but with limited interaction Executive—early stages of awareness of competitive possibilities Executive support for fact-based culture—may meet considerable resistance Proliferation of BI tools.

Bolstered by a series of smaller successes, management should set its sights on using analytics in the company’s distinctive capability and addressing strategic business problems. For the first time, program benefits should be defined in terms of improved business performance and care should be taken to measure progress against broad business objectives. A critical element of stage 3 is defining a set of achievable performance metrics and putting the processes in place to monitor progress. To focus scarce resources appropriately, the organization may create a centralized “business intelligence competency center” to foster and support analytical activities. In stage 3, companies will launch their first major project to use analytics in their distinctive capability.

The team realized that a major obstacle to building an enterprise-level analytical capability would be resistance from department heads. Their performance measures were based on the assets of their departments, not on enterprise-wide metrics. The bank’s senior management team responded by introducing new performance metrics that would assess overall enterprise performance (including measures related to asset size and profitability) and cross-departmental cooperation. These changes cleared the path for an enterprise-wide initiative to improve BankCo’s analytical orientation, beginning with the creation of an integrated and consistent customer database (to the extent permitted by law) as well as coordinated retail, trust, and brokerage marketing campaigns.

Unknown Market Wizards: The Best Traders You've Never Heard Of
by Jack D. Schwager
Published 2 Nov 2020

Over a decade later, Brandt’s desire to trade re-emerged, and he once again was very successful. The moral is: Be sure you really want to trade. And don’t confuse wanting to be rich with wanting to trade. Unless you love the endeavor, you are unlikely to succeed. * * * 1 See Appendix 2, Performance Metrics, for an explanation of the adjusted Sortino ratio and how it differs from the conventionally calculated Sortino ratio. 2 See Appendix 2 for an explanation of this performance metric. 3 Jack D. Schwager, A Complete Guide to the Futures Market (New Jersey, John Wiley and Sons, Inc., 2017), 205–231. 4 Jack D. Schwager, Market Wizards (New Jersey, John Wiley and Sons, Inc., 2012), 9–82. 5 This three-word designation is a misnomer in at least two ways.

Schwager Contents Preface Acknowledgments Part I: Futures Traders Peter Brandt: Strong Opinions, Weakly Held Jason Shapiro: The Contrarian Richard Bargh: The Importance of Mindset Amrit Sall: The Unicorn Sniper Daljit Dhaliwal: Know Your Edge John Netto: Monday Is My Favorite Day Part II: Stock Traders Jeffrey Neumann: Penny Wise, Dollar Wise Chris Camillo: Neither Marsten Parker: Don’t Quit Your Day Job Michael Kean: Complementary Strategies Pavel Krejčí: The Bellhop Who Beat the Pros Conclusion: 46 Market Wizard Lessons Epilogue Appendix 1: Understanding the Futures Markets Appendix 2: Performance Metrics Publishing details Other books by Jack D. Schwager Market Wizards: Interviews with Top Traders The New Market Wizards: Conversations with America’s Top Traders Hedge Fund Market Wizards: How Winning Traders Win Stock Market Wizards: Interviews with America’s Top Stock Traders The Little Book of Market Wizards: Lessons from the Greatest Traders A Complete Guide to the Futures Markets: Technical Analysis, Trading Systems, Fundamental Analysis, Options, Spreads and Trading Principles Market Sense and Nonsense: How the Markets Really Work (and How They Don’t) Schwager on Futures: Technical Analysis Schwager on Futures: Fundamental Analysis Schwager on Futures: Managed Trading Myths and Truths To Aspen The Next Generation May you have the charm, beauty, and sense of humor of both your parents and the spending sense of neither.

Winning trades (or smaller losing trades, as was the case in this example) can be bad trades if they violate trading and risk control rules that have been responsible for a trader’s longer-term success. Similarly, losing trades can be good trades if the trader followed a process that has demonstrated efficacy in generating profits with acceptable risk. * * * 12 See Appendix 2 for an explanation of these performance metrics. Amrit Sall: The Unicorn Sniper AMRIT Sall has one of the best track records I have ever encountered. Over a 13-year career, Sall has achieved an average annual compounded return of 337% (yes, that’s annual, not cumulative). And return is not even the most impressive aspect of his performance.

pages: 294 words: 77,356

Automating Inequality
by Virginia Eubanks

According to court documents, in December 2007 just over 11,000 documents were unindexed. By February 2009, nearly 283,000 documents had disappeared, an increase of 2,473 percent. The rise in technical errors far outpaced increased system use. The consequences are staggering if you consider that any single missing document could cause an applicant to be denied benefits. Performance metrics designed to speed eligibility determinations created perverse incentives for call center workers to close cases prematurely. Timeliness could be improved by denying applications and then advising applicants to reapply, which required that they wait an additional 30 or 60 days for a new determination.

By removing human discretion from frontline social servants and moving it instead to engineers and private contractors, the Indiana experiment supercharged discrimination. The “social specs” for the automation were based on time-worn, race- and class-motivated assumptions about welfare recipients that were encoded into performance metrics and programmed into business processes: they are lazy and must be “prodded” into contributing to their own support, they are sneaky and prone to fraudulent claims, and their burdensome use of public resources must be repeatedly discouraged. Each of these assumptions relies on, and is bolstered by, race- and class-based stereotypes.

In Indiana, the combination of eligibility automation and privatization achieved striking reductions in the welfare rolls. Cumbersome administrative processes and unreasonable expectations kept people from accessing the benefits they were entitled to and deserved. Brittle rules and poorly designed performance metrics meant that when mistakes were made, they were always interpreted as the fault of the applicant, not the state or the contractor. The assumption that automated decision-making tools were infallible meant that computerized decisions trumped procedures intended to provide applicants with procedural fairness.

pages: 318 words: 78,451

Kanban: Successful Evolutionary Change for Your Technology Business
by David J. Anderson
Published 6 Apr 2010

If it is necessary, it should be a significantly looser agreement than that offered for standard class items, for example, 60 days with 50 percent due-date performance. Determining a Service Delivery Target In the example set of classes of service above, the Standard class of service used a target lead time, for example, 28 days (4 weeks). The concept of offering a target lead time coupled with a due-date performance metric is an alternative to treating each item individually and having to estimate and commit to a delivery date for each item. The service-level agreement allows us to avoid costly activities, such as estimation; low-trust activities, such as making commitments; and to spread risk by aggregating a large collection of requests and promising only aggregate performance in the form of a percentage due-date performance.

If you perform a spectral analysis of some historical data and can see that perhaps 70 percent are delivered within 28 days, and the remaining 30 percent spread out over another 100 days, then perhaps it’s reasonable to suggest a target delivery date of 28 days. I’ve learned that the use of classes of service is a very powerful technique. With my team in 2007, approximately 30 percent of all requests were late compared to the target lead time. We reported this as the Due Date Performance metric. It was never above 70 percent. However, despite this dismal performance versus the target date, we had very few complaints. The reasons for this became evident: All the important items—those with high risk or high value—were always on time, and there was a trust that the late ones would be delivered within an additional two or four weeks, as deliveries were happening with dependable regularity.
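As a rough illustration of the Due Date Performance metric described above, the following sketch computes the percentage of delivered items whose lead time met the target. The function name `due_date_performance` and the sample lead times are hypothetical and chosen only to mirror the 28-day target and the 70 percent figure in the text; they are not data from the book.

```python
def due_date_performance(lead_times_days, target_days=28):
    """Return the percentage of items delivered within the target lead time."""
    if not lead_times_days:
        raise ValueError("need at least one delivered item")
    # An item counts as on time when its lead time does not exceed the target.
    on_time = sum(1 for days in lead_times_days if days <= target_days)
    return 100.0 * on_time / len(lead_times_days)


# Hypothetical lead times (in days) for ten delivered items.
lead_times = [12, 20, 25, 28, 27, 15, 28, 35, 60, 90]
ddp = due_date_performance(lead_times, target_days=28)
print(f"Due Date Performance: {ddp:.0f}%")
```

Reporting the metric per class of service would simply mean calling the function once per class with that class's lead times and target.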

Figure 12.4 Example of report showing mean lead time and due date performance Due Date Performance I’ve found it useful to report Due Date Performance for the most recent month and for the year to date. You may also want to report performance year-on-year (or 12 months ago) for comparison. Hence, it is useful to have 13 months of data. With the Fixed Delivery Date class of service items, you can include these in the Due Date Performance metric. In this case, you are answering the question, “Was the item delivered on time?” However, although you will have a lead time recorded, that in itself is not as interesting as comparing the estimated lead time to the actual. Estimate versus actual demonstrates how predictable the team is and how well they are performing with Fixed Delivery Date service items.

pages: 280 words: 82,355

Extreme Teams: Why Pixar, Netflix, AirBnB, and Other Cutting-Edge Companies Succeed Where Most Fail
by Robert Bruce Shaw, James Foster and Brilliance Audio
Published 14 Oct 2017

The vetting of new members is treated seriously because teams are rewarded in Whole Foods based on team performance in areas such as overall sales and profit per labor hour. A team bonus is paid monthly, which can result in thousands of extra dollars each year for the members of a successful group.4 Whole Foods then goes one step further. It posts each team’s monthly results for everyone to see. A produce team, for example, will see how it stacks up on key performance metrics compared to the meat or seafood teams within its own store. Team leaders can also compare their team’s performance against other teams across a region. New team members who do not pull their weight pose two risks. First, poor performers can reduce the bonus pay of all team members if the team’s results suffer.

For instance, research shows that some people will work less diligently when part of a team, allowing others in their group to compensate for their lack of effort. Social scientists call this the “freeloader” or “social loafing” problem.16 In these situations, a few team members contribute less than others and yet benefit from being part of a team where others make up for their shortcomings. Whole Foods deals with this problem by having clear performance metrics and team-level rewards. These practices, along with other informal methods such as peer feedback, increase the likelihood that everyone will contribute to the success of his or her team. New hires at Whole Foods quickly learn that they are not simply employees of the company or accountable only to their managers—they are, above all else, working for each other with financial and reputational consequences if they don’t perform.

TAKEAWAYS Cutting-edge firms actively communicate the broader context to their members (market opportunities and threats, financial realities . . . ). They then clarify their vital few strategic priorities—the three or four goals that must be achieved to move the firm or team forward. These priorities are defined in a manner that ensures that everyone knows what success looks like, including performance metrics and accountabilities. Cutting-edge firms, however, also understand that too much focus can be self-defeating—thus, they foster ongoing experimentation in an attempt to identify innovative customer and revenue opportunities. CHAPTER 5 PUSH HARDER, PUSH SOFTER Every Great Culture Embraces a Great Contradiction Most firms operate with either a hard or soft edge.1 Those with a hard edge emphasize the need for clear performance targets, disciplined practices, and absolute accountability for results.

pages: 514 words: 111,012

The Art of Monitoring
by James Turnbull
Published 1 Dec 2014

This separation allows us to individually manage each plugin and lends itself to management with a configuration management tool like Puppet, Chef, or Ansible. Now let's configure each plugin. The cpu plugin The first plugin we're going to configure is the cpu plugin. The cpu plugin collects CPU performance metrics on our hosts. By default, the cpu plugin emits CPU metrics in Jiffies: the number of ticks since the host booted. We're going to also send something a bit more useful: percentages. First, we're going to create a file to hold our plugin configuration. We'll put it into the /etc/collectd.d directory.

Diamond — An open-source metrics collector originally written by Brightcove but now maintained by a wider community. Fullerite — An open-source metrics collector written by the Yelp Engineering team. It's written in Go and designed for large scale metrics collection. PCP and Vector — Used by Netflix this combination provides high resolution on host performance metrics suitable for diagnostics. sumd — A lightweight Python collector that allows you to run processes, for example Nagios plugins, locally and send the results to Riemann. Note There's also some overlap with these tools and the collection and graphing tools we looked at in Chapters 3 and 4. Summary In our Riemann configuration we saw how we can make use of this data to monitor our hosts and their components, and how we can notify on specific events or thresholds.

The application architecture can require understanding the interconnection between multiple containers, instances, and hosts. Added to this, the lifespan of a container might be in seconds or minutes. This makes the traditional monitoring techniques used for a single host or instance problematic. From a monitoring perspective there are three major issues with this new host model: convergence and dynamism, performance, and metric volume. Let's first talk about convergence and dynamism. The speed and limited lifespan means a lot of churn in your monitoring configuration: hosts appearing and disappearing quickly. Sometimes a host will even appear and disappear before your monitoring environment is aware of it. In many monitoring environments your configuration is applied after the installation of the host or service, either manually or via a configuration management tool like Puppet or Chef.

Digital Transformation at Scale: Why the Strategy Is Delivery
by Andrew Greenway, Ben Terrett, Mike Bracken, Tom Loosemore
Published 18 Jun 2018

Some businesses measure hundreds of different variables in their quest for profitability. Most governments tend to be similarly thorough, with the added complication of managing multiple desired outcomes at the same time, where the operational measures often fail to match up with lofty political goals. In the UK, to keep things simple, we selected four performance metrics: digital take-up, completion rate, cost per transaction and user satisfaction. We could have picked more. Four was a manageable number, and effectively covered the bases for the GDS’s primary strategic aims: getting more people to use online government services, building services that worked first time, saving money and meeting user needs.

In government, measuring user satisfaction picks up false signals: about how happy people are about paying tax, even about how happy they are with the government’s political performance in general. These are not things that any digital service team can do anything about. In the end, the most reliable way to measure user satisfaction was in the research lab, watching real people use the service. This was difficult to scale, but always worth the effort. The GDS’s choice of four performance metrics acted as useful pointers for stories to celebrate or worries to address. They weren’t designed to provide the people managing the services day to day with all the detailed insight needed to make incremental improvements to services; more detailed web analytics packages delivered that. What they offered was an indication of relative progress, and a measure of momentum.

If these priorities had been reversed – saving money before meeting needs – it is unlikely the users would get much of a look in. Summary Write a list of all the services your organisation provides and use it to gauge where digital change can have the biggest impact for users. Choose performance metrics that give clues as to how well you are meeting user needs; these may differ from organisational objectives. Use metrics to judge velocity of change, rather than setting hard targets. Make an economic case for applying digital transformation to your organisation. Move away from spreadsheet data requests to automated real-time data collection as fast as you can.

pages: 351 words: 123,876

Beautiful Testing: Leading Professionals Reveal How They Improve Software (Theory in Practice)
by Adam Goucher and Tim Riley
Published 13 Oct 2009

The performance test cases, however, were renamed “Performance Testing Checkpoints” and included the following (abbreviated here):

• Collect baseline system performance metrics and verify that each functional task included in the system usage model achieves performance requirements under a user load of 1 for each performance testing build in which the functional task has been implemented.
  - [Functional tasks listed, one per line]
• Collect system performance metrics and verify that each functional task included in the system usage model achieves performance requirements under a user load of 10 for each performance testing build in which the functional task has been implemented.
  - [Functional tasks listed, one per line]
• Collect system performance metrics and verify that the system usage model achieves performance requirements under the following loads to the degree that the usage model has been implemented in each performance testing build.
  - [Increasing loads from 100 users to 3,000 users, listed one per line]
• Collect system performance metrics and verify that the system usage model achieves performance requirements for the duration of a 9-hour, 1,000-user stress test on performance testing builds that the lead developer, performance tester, and project manager deem appropriate.

Now understanding the intent, I suggested that Harold schedule a conference room for a few hours for us to discuss his task further. He agreed. As it turned out, it took more than one meeting for Harold to explain to me the client’s expectations, the story behind his task, and for me to explain to Harold why we didn’t want to be contractually obligated to performance metrics that were inherently ambiguous, what those ambiguities were, and what we could realistically measure that would be valuable. Finally, Harold and I took what were now several sheets of paper with the following bullets to Sandra, our project manager, to review: “System Performance Testing Requirements: • Performance testing will be conducted under a variety of loads and usage models, to be determined when system features and workflows are established

pages: 217 words: 63,287

The Participation Revolution: How to Ride the Waves of Change in a Terrifyingly Turbulent World
by Neil Gibb
Published 15 Feb 2018

The pursuit of happiness 14. Together 15. Home III. How it works Framework 1. Create a cause A new kind of leadership Bank to the future The non-linear business model 2. Mobilise a movement Weapons of mass participation The Art of transformation Analytics and performance metrics 3. Build a community Together That thing we most seek Social economics IV. Into action A call to action Manifesto An open-source tool kit “Tomorrow belongs to those who can hear it coming” David Bowie I. Introduction When things fall apart “You can’t stop the waves, but you can learn to surf” Jon Kabat-Zinn Galileo Galilei was a clever lad.

Letting it go can be hard and painful, which is why we often hang on. At times, the process of transformation can feel a lot like grief. It is difficult and disruptive. But context changes everything. When we have a big why – something that strikes us as worthwhile, meaningful and important to us – our experience is transformed. 3. Analytics and performance metrics “Out of this crisis, there could be a rebirth of economics. I’m not someone who would say that all that’s been done in the past is terrible. It’s just that the models we had were rather narrow and fragile. The problem came when the world was tipped upside down and those models were ill-equipped to making sense of behaviours” Andrew Haldane, chief economist, Bank of England, 2017 When Andy Haldane addressed the Institute of Government in London in early 2017, he described his profession’s inability to foresee the collapse of Lehman Brothers or the ensuing global financial crisis as its “Michael Fish moment” – referring to an infamous incident in 1987 when a BBC weather forecaster confidently predicted that a hurricane was going to miss the UK, only for it to hit the country with full-force the next day, causing devastation and mayhem; the worst storm in a century.

You can’t assess the value of open-source software, intellectual crowd-sourcing, peer-based wellbeing, or businesses like Facebook on the value of what they produce. The key metric of corporations – productivity – is a measure that has been flatlining for the best part of a decade. The reason for this is that it isn’t measuring what is driving the new economy. Productivity as a performance metric emerged out of the factory system, where value was calculated by assessing the price of the end product in relation to the cost of the materials, labour, and processes that made it. It was a linear calculation. Enterprises in the emerging paradigm don’t work like that. They are not linear, they are networked.

pages: 571 words: 105,054

Advances in Financial Machine Learning
by Marcos Lopez de Prado
Published 2 Feb 2018

In this scheme:

1. The dataset is partitioned into k subsets.
2. For i = 1,…,k:
   (a) The ML algorithm is trained on all subsets excluding i.
   (b) The fitted ML algorithm is tested on i.

Figure 7.1 Train/test splits in a 5-fold CV scheme

The outcome from k-fold CV is a k×1 array of cross-validated performance metrics. For example, in a binary classifier, the model is deemed to have learned something if the cross-validated accuracy is over 1/2, since that is the accuracy we would achieve by tossing a fair coin. In finance, CV is typically used in two settings: model development (like hyper-parameter tuning) and backtesting.
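The k-fold procedure described above can be sketched in plain Python. This is a minimal illustration, not the book's implementation: the helper names (`k_fold_indices`, `cross_validate`), the contiguous unshuffled folds, the toy majority-class "model", and the data are all assumptions made for the example.

```python
def k_fold_indices(n_samples, k):
    """Partition range(n_samples) into k contiguous, near-equal test folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validate(X, y, k, fit, predict):
    """Return a list of k accuracies, one per held-out fold."""
    scores = []
    for test_idx in k_fold_indices(len(X), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        # Train on every subset except fold i, then test on fold i.
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        preds = predict(model, [X[i] for i in test_idx])
        correct = sum(p == y[i] for p, i in zip(preds, test_idx))
        scores.append(correct / len(test_idx))
    return scores


# Toy stand-in for an ML algorithm: always predict the majority training label.
def fit_majority(X_train, y_train):
    return max(set(y_train), key=y_train.count)


def predict_majority(model, X_test):
    return [model] * len(X_test)


X = list(range(10))                    # hypothetical features
y = [1, 1, 0, 1, 1, 0, 1, 1, 0, 1]     # hypothetical binary labels
scores = cross_validate(X, y, k=5, fit=fit_majority, predict=predict_majority)
print(scores)
```

The returned list plays the role of the k×1 array of cross-validated metrics; averaging it and comparing against 1/2 gives the coin-toss check mentioned in the text.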

The implication is that sample weighted cross-entropy loss estimates the classifier's performance in terms of variables involved in a PnL (mark-to-market profit and losses) calculation: It uses the correct label for the side, probability for the position size, and sample weight for the observation's return/outcome. That is the right ML performance metric for hyper-parameter tuning of financial applications, not accuracy. When we use log loss as a scoring statistic, we often prefer to change its sign, hence referring to “neg log loss.” The reason for this change is cosmetic, driven by intuition: A high neg log loss value is preferred to a low neg log loss value, just as with accuracy.
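To make the sign convention concrete, here is a small sketch of a sample-weighted "neg log loss" for binary labels. It is an illustration under assumptions: the function name, the clipping constant `eps`, and normalizing by the sum of the weights are choices made for this example rather than anything specified in the passage.

```python
import math


def neg_log_loss(y_true, p_pred, sample_weight=None, eps=1e-15):
    """Sample-weighted negative log loss for binary labels in {0, 1}.

    p_pred holds the predicted probability of label 1. The result is at most
    zero, and values closer to zero are better, as with accuracy.
    """
    if sample_weight is None:
        sample_weight = [1.0] * len(y_true)
    total = 0.0
    for y, p, w in zip(y_true, p_pred, sample_weight):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total += w * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / sum(sample_weight)


y_true = [1, 0, 1]
p_pred = [0.9, 0.2, 0.4]          # hypothetical predicted probabilities
weights = [1.0, 1.0, 3.0]         # hypothetical sample weights

print(neg_log_loss(y_true, p_pred))            # unweighted
print(neg_log_loss(y_true, p_pred, weights))   # weighted
```

Because the third observation is predicted poorly and carries the largest weight, the weighted score comes out lower than the unweighted one, which is the kind of penalty accuracy alone would not register.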

Most backtests published in journals are flawed, as the result of selection bias on multiple tests (Bailey, Borwein, López de Prado, and Zhu [2014]; Harvey et al. [2016]). A full book could be written listing all the different errors people make while backtesting. I may be the academic author with the largest number of journal articles on backtesting1 and investment performance metrics, and still I do not feel I would have the stamina to compile all the different errors I have seen over the past 20 years. This chapter is not a crash course on backtesting, but a short list of some of the common errors that even seasoned professionals make. 11.2 Mission Impossible: The Flawless Backtest In its narrowest definition, a backtest is a historical simulation of how a strategy would have performed should it have been run over a past period of time.

pages: 354 words: 26,550

High-Frequency Trading: A Practical Guide to Algorithmic Strategies and Trading Systems
by Irene Aldridge
Published 1 Dec 2009

Kurtosis indicates whether the tails of the distribution are normal; high kurtosis signifies “fat tails,” a higher than normal probability of extreme positive or negative events. COMPARATIVE RATIOS While average return, standard deviation, and maximum drawdown present a picture of the performance of a particular trading strategy, the measures do not lend to an easy point comparison among two or more strategies. Several comparative performance metrics have been developed in an attempt to summarize mean, variance, and tail risk in a single number that can be used to compare different trading strategies. Table 5.1 summarizes the most popular point measures. The first generation of point performance measures were developed in the 1960s and include the Sharpe ratio, Jensen’s alpha, and the Treynor ratio.

Of course, the original VaR assumes normal distributions of returns, whereas the returns are known to be fat-tailed. To address this issue, a modified VaR (MVaR) measure was proposed by Gregoriou and Gueyie (2003) and takes into account deviations from normality. Gregoriou and Gueyie (2003) also suggest using MVaR in place of standard deviation in Sharpe ratio calculations. How do these performance metrics stack up against each other? It turns out that all metrics deliver comparable rankings of trading strategies. Eling and Schuhmacher (2007) compare hedge fund ranking performance of the 13 measures listed and conclude that the Sharpe ratio is an adequate measure for hedge fund performance.

Methods for forecast comparisons include:

• Mean squared error (MSE)
• Mean absolute deviation (MAD)
• Mean absolute percentage error (MAPE)
• Distributional performance
• Cumulative accuracy profiling

If the value of a financial security is forecasted to be $x_{F,t}$ at some future time $t$ and the realized value of the same security at time $t$ is $x_{R,t}$, the forecast error for the given forecast, $\varepsilon_{F,t}$, is computed as follows:

$$\varepsilon_{F,t} = x_{F,t} - x_{R,t} \tag{15.2}$$

The mean squared error (MSE) is then computed as the average of squared forecast errors over $T$ estimation periods, analogously to volatility computation:

$$\mathrm{MSE} = \frac{1}{T} \sum_{\tau=1}^{T} \varepsilon_{F,\tau}^{2} \tag{15.3}$$

The mean absolute deviation (MAD) and the mean absolute percentage error (MAPE) also summarize properties of forecast errors:

$$\mathrm{MAD} = \frac{1}{T} \sum_{\tau=1}^{T} \left| \varepsilon_{F,\tau} \right| \tag{15.4}$$

$$\mathrm{MAPE} = \frac{1}{T} \sum_{\tau=1}^{T} \left| \frac{\varepsilon_{F,\tau}}{x_{R,\tau}} \right| \tag{15.5}$$

Naturally, the lower each of the three metrics (MSE, MAD, and MAPE), the better the forecasting performance of the trading system. The distributional evaluation of forecast performance also examines forecast errors $\varepsilon_{F,t}$ normalized by the realized value, $x_{R,t}$. Unlike MSE, MAD, and MAPE metrics, however, the distributional performance metric seeks to establish whether the forecast errors are random. If the errors are indeed random, there exists no consistent bias in either direction of price movement, and the distribution of normalized errors $\varepsilon_{F,t} / x_{R,t}$ should fall on the uniform [0, 1] distribution. If the errors are nonrandom, the forecast can be improved.
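The three error summaries defined in this excerpt (MSE, MAD, and MAPE) can be sketched in a few lines. This is a minimal illustration assuming equally weighted periods and nonzero realized values; the function names and the sample prices are hypothetical.

```python
def forecast_errors(forecast, realized):
    """Per-period forecast error: forecast value minus realized value."""
    return [f - r for f, r in zip(forecast, realized)]


def mse(errors):
    """Mean squared error over all periods."""
    return sum(e * e for e in errors) / len(errors)


def mad(errors):
    """Mean absolute deviation of the forecast errors."""
    return sum(abs(e) for e in errors) / len(errors)


def mape(errors, realized):
    """Mean absolute percentage error, as a fraction of the realized value.

    Assumes every realized value is nonzero.
    """
    return sum(abs(e / r) for e, r in zip(errors, realized)) / len(errors)


# Hypothetical forecast and realized prices, for illustration only.
forecast = [101.0, 99.0, 105.0, 98.0]
realized = [100.0, 100.0, 100.0, 100.0]

errors = forecast_errors(forecast, realized)  # [1.0, -1.0, 5.0, -2.0]
print(mse(errors), mad(errors), mape(errors, realized))
```

MSE penalizes the single large miss (5.0) far more heavily than MAD does, which is one reason the measures are usually reported together.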

pages: 719 words: 181,090

Site Reliability Engineering: How Google Runs Production Systems
by Betsy Beyer, Chris Jones, Jennifer Petoff and Niall Richard Murphy
Published 15 Apr 2016

A given set of production dependencies can be shared, possibly with different stipulations around intent.

Performance metrics

Demand for one service trickles down to result in demand for one or more other services. Understanding the chain of dependencies helps formulate the general scope of the bin packing problem, but we still need more information about expected resource usage. How many compute resources does service Foo need to serve N user queries? For every N queries of service Foo, how many Mbps of data do we expect for service Bar? Performance metrics are the glue between dependencies. They convert from one or more higher-level resource type(s) to one or more lower-level resource type(s).
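One way to picture a performance metric as a conversion between resource types is a lookup table of conversion factors. The sketch below is purely illustrative: the table layout, the function name, and every number in it are assumptions, not a description of how Google represents these metrics.

```python
# Hypothetical conversion factors: one unit of a higher-level resource
# (user queries to Foo) maps to amounts of lower-level resources.
PERFORMANCE_METRICS = {
    ("Foo", "queries"): {
        ("Foo", "cpu_cores"): 0.002,  # assumed cores needed per query
        ("Bar", "mbps"): 0.05,        # assumed Mbps of Bar traffic per query
    },
}


def lower_level_demand(service, resource, amount, metrics):
    """Convert demand for a higher-level resource into lower-level demand.

    Returns a dict keyed by (service, resource) with the derived amounts.
    Unknown inputs yield an empty dict.
    """
    factors = metrics.get((service, resource), {})
    return {target: amount * factor for target, factor in factors.items()}


# Demand derived from 1,000 queries to Foo under the assumed factors.
demand = lower_level_demand("Foo", "queries", 1000, PERFORMANCE_METRICS)
```

Chaining such conversions along the dependency graph would give the total lower-level demand that the bin packing step has to place.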

Of course, when an area of uncertainty resolves into a fault, you need to select additional branch points.

Testing Scalable Tools

As pieces of software, SRE tools also need testing. SRE-developed tools might perform tasks such as the following:

• Retrieving and propagating database performance metrics
• Predicting usage metrics to plan for capacity risks
• Refactoring data within a service replica that isn’t user accessible
• Changing files on a server

SRE tools share two characteristics:

• Their side effects remain within the tested mainstream API
• They’re isolated from user-facing production by an existing validation and release barrier

Barrier Defenses Against Risky Software

Software that bypasses the usual heavily tested API (even if it does so for a good cause) could wreak havoc on a live service.

In Google’s experience, services tend to achieve the best wins as they cross to step 3: good degrees of flexibility are available, and the ramifications of this request are in higher-level and understandable terms. Particularly sophisticated services may aim for step 4.

Precursors to Intent

What information do we need in order to capture a service’s intent? Enter dependencies, performance metrics, and prioritization.

Dependencies

Services at Google depend on many other infrastructure and user-facing services, and these dependencies heavily influence where a service can be placed. For example, imagine user-facing service Foo, which depends upon Bar, an infrastructure storage service.

pages: 161 words: 39,526

Applied Artificial Intelligence: A Handbook for Business Leaders
by Mariya Yao, Adelyn Zhou and Marlene Jia
Published 1 Jun 2018

They can also provide support to public company leadership who are especially subject to the whims of quarterly earnings reports. Keeping your board educated and updated is essential if you aspire to larger projects.

Build An Enterprise-Wide Case For AI

Your case for investing in AI and in automation will depend on your champions and stakeholders since they possess different business priorities, performance metrics, technical aptitude, propensity for risk, and political relationships. Presenting a clear ROI on AI initiatives is the best way to persuade executive stakeholders, but this can be challenging when enterprise AI adoption is early and still being proven in many sectors. Many corporations are still completing their big data investments and have yet to broach analytics.

If you’re operating off of a minuscule customer base, then even a 200 percent increase in a key metric may not lead to meaningful boosts to revenue. If the problem is worth pursuing, then how much can better conversion rates and longer lifetime values improve sales volume? If you’re partnering with a vendor, ask them for performance metrics. What results have other clients seen? What is the upper and lower limit of improvements? When did they begin to see results?

Decreasing Costs

Measuring the ability to reduce costs is another popular way to assess returns on AI investments. AI promises greater operational efficiencies, predominantly in middle and back office functions, such as in legal, finance and accounting, operations, and human resources.

pages: 523 words: 112,185

Doing Data Science: Straight Talk From the Frontline
by Cathy O'Neil and Rachel Schutt
Published 8 Oct 2013

  (bin), summarise, mean_p = mean(p), mean_y = mean(y))
  fin <- data.frame(bin = summ$bin, mean_p = summ$mean_p,
                    mean_y = summ$mean_y, t)
  # Get wMAE
  num = 0
  den = 0
  for (i in c(1:nrow(fin))) {
    num <- num + fin$Freq[i] * abs(fin$mean_p[i] - fin$mean_y[i])
    den <- den + fin$Freq[i]
  }
  wmae <- num / den
  if (doplot == 1) {
    plot(summ$bin, summ$mean_p, type = "p",
         main = paste(title," MAE =", wmae),
         col = "blue", ylab = "P(C | AD, X)", xlab = "P(C | AD, X)")
    points(summ$bin, summ$mean_y, type = "p", col = "red")
    rug(p)
  }
  return(wmae)
}

library(ROCR)
get_auc <- function(ind, y) {
  pred <- prediction(ind, y)
  perf <- performance(pred, 'auc', fpr.stop = 1)
  auc <- as.numeric(substr(slot(perf, "y.values"), 1, 8), double)
  return(auc)
}

# Get X-Validated performance metrics for a given feature set
getxval <- function(vars, data, folds, mae_bins) {
  # assign each observation to a fold
  data["fold"] <- floor(runif(nrow(data)) * folds) + 1
  auc <- c()
  wmae <- c()
  fold <- c()
  # make a formula object
  f = as.formula(paste("Y", "~", paste(vars, collapse = "+")))
  for (i in c(1:folds)) {
    train <- data[(data$fold != i), ]
    test <- data[(data$fold == i), ]
    mod_x <- glm(f, data=train, family = binomial(logit))
    p <- predict(mod_x, newdata = test, type = "response")
    # Get wMAE
    wmae <- c(wmae, getmae(p, test$Y, mae_bins, "dummy", 0))
    fold <- c(fold, i)
    auc <- c(auc, get_auc(p, test$Y))
  }
  return(data.frame(fold, wmae, auc))
}

###############################################################
##########          MAIN: MODELS AND PLOTS           ##########
###############################################################

# Now build a model on all variables and look at coefficients and model fit
vlist <- c("AT_BUY_BOOLEAN", "AT_FREQ_BUY", "AT_FREQ_LAST24_BUY",
           "AT_FREQ_LAST24_SV", "AT_FREQ_SV", "EXPECTED_TIME_BUY",
           "EXPECTED_TIME_SV", "LAST_BUY", "LAST_SV", "num_checkins")
f = as.formula(paste("Y_BUY", "~" , paste(vlist, collapse = "+")))
fit <- glm(f, data = train, family = binomial(logit))
summary(fit)

# Get performance metrics on each variable
vlist <- c("AT_BUY_BOOLEAN", "AT_FREQ_BUY", "AT_FREQ_LAST24_BUY",
           "AT_FREQ_LAST24_SV", "AT_FREQ_SV", "EXPECTED_TIME_BUY",
           "EXPECTED_TIME_SV", "LAST_BUY", "LAST_SV", "num_checkins")

# Create empty vectors to store the performance/evaluation metrics
auc_mu <- c()
auc_sig <- c()
mae_mu <- c()
mae_sig <- c()
for (i in c(1:length(vlist))) {
  a <- getxval(c(vlist[i]), set, 10, 100)
  auc_mu <- c(auc_mu, mean(a$auc))
  auc_sig <- c(auc_sig, sd(a$auc))
  mae_mu <- c(mae_mu, mean(a$wmae))
  mae_sig <- c(mae_sig, sd(a$wmae))
}
univar <- data.frame(vlist, auc_mu, auc_sig, mae_mu, mae_sig)

# Get MAE plot on single variable - use holdout group for evaluation
set <- read.table(file, header = TRUE, sep = "\t", row.names="client_id")
names(set)
split <- .65
set["rand"] <- runif(nrow(set))
train <- set[(set$rand <= split), ]
test <- set[(set$rand > split), ]
set$Y <- set$Y_BUY
fit <- glm(Y_BUY ~ num_checkins, data = train, family = binomial(logit))
y <- test$Y_BUY
p <- predict(fit, newdata = test, type = "response")
getmae(p,y,50,"num_checkins",1)

# Greedy Forward Selection
rvars <- c("LAST_SV", "AT_FREQ_SV", "AT_FREQ_BUY", "AT_BUY_BOOLEAN",
           "LAST_BUY", "AT_FREQ_LAST24_SV", "EXPECTED_TIME_SV",
           "num_checkins", "EXPECTED_TIME_BUY", "AT_FREQ_LAST24_BUY")

# Create empty vectors
auc_mu <- c()
auc_sig <- c()
mae_mu <- c()
mae_sig <- c()
for (i in c(1:length(rvars))) {
  vars <- rvars[1:i]
  vars
  a <- getxval(vars, set, 10, 100)
  auc_mu <- c(auc_mu, mean(a$auc))
  auc_sig <- c(auc_sig, sd(a$auc))
  mae_mu <- c(mae_mu, mean(a$wmae))
  mae_sig <- c(mae_sig, sd(a$wmae))
}
kvar <- data.frame(auc_mu, auc_sig, mae_mu, mae_sig)

# Plot 3 AUC Curves
y <- test$Y_BUY
fit <- glm(Y_BUY~LAST_SV, data=train, family = binomial(logit))
p1 <- predict(fit, newdata=test, type="response")
fit <- glm(Y_BUY~LAST_BUY, data=train, family = binomial(logit))
p2 <- predict(fit, newdata=test, type="response")
fit <- glm(Y_BUY~num_checkins, data=train, family = binomial(logit))
p3 <- predict(fit, newdata=test, type="response")
pred <- prediction(p1,y)
perf1 <- performance(pred,'tpr','fpr')
pred <- prediction(p2,y)
perf2 <- performance(pred,'tpr','fpr')
pred <- prediction(p3,y)
perf3 <- performance(pred,'tpr','fpr')
plot(perf1, col="blue",
     main="LAST_SV (blue), LAST_BUY (red), num_checkins (green)")
plot(perf2, col="red", add=TRUE)
plot(perf3, col="green", add=TRUE)

Chapter 6.

Let’s remind ourselves of the various possibilities using the truth table in Table 9-1.

Table 9-1. Actual versus predicted table, also called the Confusion Matrix

                     Actual = True          Actual = False
Predicted = True     TP (true positive)     FP (false positive)
Predicted = False    FN (false negative)    TN (true negative)

The most straightforward performance metric is Accuracy, which is defined using the preceding notation as the ratio:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

Another way of thinking about accuracy is that it’s the probability that your model gets the right answer. Given that there are very few positive examples of fraud, at least compared with the overall number of transactions, accuracy is not a good metric of success, because the “everything looks good” model, or equivalently the “nothing looks fraudulent” model, is dumb but has good accuracy.
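The point about accuracy on rare events is easy to check numerically. The sketch below uses made-up data (10 fraudulent transactions out of 1,000) and a "nothing looks fraudulent" model; the function names and counts are illustrative assumptions, not figures from the book.

```python
def confusion_counts(actual, predicted):
    """Return (tp, fp, fn, tn) for two equal-length lists of booleans."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a and p)
    fp = sum(1 for a, p in pairs if not a and p)
    fn = sum(1 for a, p in pairs if a and not p)
    tn = sum(1 for a, p in pairs if not a and not p)
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    """Share of all cases the model labels correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def recall(tp, fp, fn, tn):
    """Share of actual positives the model finds (0.0 if there are none)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical data: 10 fraud cases among 1,000 transactions.
actual = [True] * 10 + [False] * 990
# The "nothing looks fraudulent" model predicts False for everything.
predicted = [False] * 1000

counts = confusion_counts(actual, predicted)
print(accuracy(*counts))  # high accuracy despite catching no fraud
print(recall(*counts))    # no fraud cases detected
```

Reporting recall (or precision) next to accuracy is one common way to expose a model that scores well only because positives are rare.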

pages: 304 words: 80,965

What They Do With Your Money: How the Financial System Fails Us, and How to Fix It
by Stephen Davis, Jon Lukomnik and David Pitt-Watson
Published 30 Apr 2016

Elson commented, “Even the best corporate boards will fail to address executive compensation concerns unless they tackle the structural bias created by external peer group benchmarking metrics. … Boards should measure performance and determine compensation by focusing on internal metrics. For example, if customer satisfaction is deemed important to the company, then results of customer surveys should play into the compensation equation. Other internal performance metrics can include revenue growth, cash flow, and other measures of return.” In other words, boards should focus, as owners do, on what makes the business flourish.

USE THE RIGHT METRICS

As discussed earlier, 90 percent of large American companies measure the performance of their executive teams over a three-year period or less.


Mastering Private Equity
by Claudia Zeisberger, Michael Prahl and Bowen White
Published 15 Jun 2017

Every €1m of EBITDA increase is valued at €10m enterprise value and results in €1m of cashflow. The Management Equity Plan accrues 10% of the €10m and of the cashflows—so every €1m of EBITDA increase delivers €1.1m directly to the management pot. This is highly motivating to management. Consequently, they embrace the performance metrics and scrutiny of their private equity investors. They thrive on seeing the EBITDA increase and the net debt go down. A great private equity CEO recognises that to get the best exit value for their business they have to create a business that has a sustainable growth strategy, a strong management team and a consistent track record of financial performance.
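The arithmetic behind the €1.1m figure can be written out explicitly. The sketch below simply restates the numbers in the passage (a 10x enterprise value multiple, €1m of cash flow per €1m of EBITDA, and a 10% management share); the function and parameter names are hypothetical.

```python
def management_gain(ebitda_increase, ev_multiple=10.0,
                    cash_conversion=1.0, mep_share=0.10):
    """Value accruing to the management pot for a given EBITDA increase.

    ebitda_increase: EBITDA uplift (in millions of euros)
    ev_multiple:     enterprise value per unit of EBITDA (10x in the text)
    cash_conversion: cash flow per unit of EBITDA (1:1 in the text)
    mep_share:       share accruing to the Management Equity Plan (10%)
    """
    value_gain = ebitda_increase * ev_multiple     # 10.0 for a 1.0 uplift
    cash_gain = ebitda_increase * cash_conversion  # 1.0 for a 1.0 uplift
    return mep_share * (value_gain + cash_gain)


# 10% of (10 + 1) gives roughly 1.1 million euros per 1 million of EBITDA.
gain = management_gain(1.0)
```

Changing `ev_multiple` shows how sensitive the management payoff is to the exit valuation rather than to cash generation alone.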

Interim Fund Performance

The performance of a PE fund is reported to its LPs on a quarterly basis. These quarterly reports offer insight into the value of a fund’s portfolio companies and the overall performance of the fund to date. Exhibit 19.1 shows the basic steps taken to translate the value of a fund’s portfolio companies to its gross and net performance metrics.

[Exhibit 19.1: Evaluating PE Fund Performance]

Most limited partnership agreements require that a GP reports the fair market value of a fund’s investments, a fund’s net asset value (NAV) plus its gross multiple of money invested (MoM) and its internal rate of return (IRR) as of the reporting date.

A GP may therefore decide to distribute the shares of publicly held companies in the fund “in-kind,” allowing each LP to act according to its preference.

Gross Performance

With realized and unrealized valuations of its portfolio companies in hand, a GP will calculate a range of fund-level performance metrics, including the fund’s MoM, NAV and IRR. Calculating the MoM of each investment, and ultimately of the fund, is fairly straightforward; it is simply realized plus unrealized equity value divided by the capital invested in the company. Similarly, the NAV is simply the sum of the unrealized equity value in a fund’s portfolio companies.
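The MoM and NAV definitions above translate directly into code. The following is a minimal sketch with a made-up two-company portfolio; the `Investment` class, its field names, and the pooled fund-level MoM (total value over total capital invested) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Investment:
    name: str
    invested: float    # capital invested in the company
    realized: float    # realized equity value (proceeds received)
    unrealized: float  # current valuation of the remaining stake


def mom(inv):
    """Multiple of money: (realized + unrealized) / capital invested."""
    return (inv.realized + inv.unrealized) / inv.invested


def fund_mom(investments):
    """Fund-level MoM, pooled across all investments (an assumption)."""
    total_value = sum(i.realized + i.unrealized for i in investments)
    total_invested = sum(i.invested for i in investments)
    return total_value / total_invested


def nav(investments):
    """Net asset value: sum of unrealized equity value in the portfolio."""
    return sum(i.unrealized for i in investments)


# Hypothetical portfolio, figures in millions.
portfolio = [
    Investment("Company A", invested=10.0, realized=25.0, unrealized=0.0),
    Investment("Company B", invested=20.0, realized=5.0, unrealized=25.0),
]

print([mom(i) for i in portfolio], fund_mom(portfolio), nav(portfolio))
```

IRR is left out on purpose: unlike MoM and NAV it depends on the timing of each cash flow, which the passage does not describe.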

pages: 297 words: 91,141

Market Sense and Nonsense
by Jack D. Schwager
Published 5 Oct 2012

Figure 3.11 NAV Comparison: Three-Period Prior Best S&P Sector versus Prior Worst and Average. Data source: S&P Dow Jones Indices.

So far, the analysis has only considered returns and has shown that choosing the best past sector would have yielded slightly lower returns than an equal-allocation approach (that is, the average). Return, however, is an incomplete performance metric. Any meaningful performance comparison must also consider risk (a concept we will elaborate on in Chapter 4). We use two measures of risk here:

1. Standard deviation. The standard deviation is a volatility measure that indicates how spread out the data is; in this case, how broadly the returns vary.

Figure 8.12 2DUC: Manager E versus Manager F

Investment Misconceptions

Investment Misconception 23: The average annual return is probably the single most important performance statistic.
Reality: Return alone is a meaningless statistic because return can always be increased by increasing risk. The return/risk ratio should be the primary performance metric.

Investment Misconception 24: For a risk-seeking investor considering two investment alternatives, an investment with expected lower return/risk but higher return may often be preferable to an equivalent-quality investment with the reverse characteristics.
Reality: The higher return/risk alternative would still be preferable, even for risk-seeking investors, because by using leverage it can be translated into an equivalent return with lower risk (or higher return with equal risk).
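The leverage argument in Misconception 24 can be illustrated with a small calculation. The sketch assumes that leverage scales both return and risk linearly and that borrowing is free, which is a simplification; the two investments and their figures are invented for illustration.

```python
def levered(ret, risk, leverage):
    """Scale return and risk by a leverage factor.

    Simplifying assumptions: borrowing costs are zero and risk
    scales linearly with leverage.
    """
    return ret * leverage, risk * leverage


# Hypothetical alternatives (annual return, annual risk).
a_ret, a_risk = 0.10, 0.05  # return/risk = 2.0, lower return
b_ret, b_risk = 0.15, 0.15  # return/risk = 1.0, higher return

# Lever A until its return matches B's.
leverage = b_ret / a_ret
a_lev_ret, a_lev_risk = levered(a_ret, a_risk, leverage)

# Same return as B, but with about half of B's risk.
print(leverage, a_lev_ret, a_lev_risk)
```

Under these assumptions the return/risk ratio is unchanged by leverage, which is why it, rather than raw return, identifies the better underlying investment.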

However, pro forma results that only adjust for differences between current and past fees and commissions can be more representative than actual results. It is critical to differentiate between these two radically different applications of the same term: pro forma.

16. Return alone is a meaningless statistic because return can be increased by increasing risk. The return/risk ratio should be the primary performance metric.
17. Although the Sharpe ratio is by far the most widely used return/risk measure, return/risk measures based on downside risk come much closer to reflecting risk as it is perceived by most investors.
18. Conventional arithmetic-scale net asset value (NAV) charts provide a distorted picture, especially for longer-term track records that traverse a wide range of NAV levels.

pages: 353 words: 88,376

The Investopedia Guide to Wall Speak: The Terms You Need to Know to Talk Like Cramer, Think Like Soros, and Buy Like Buffett
edited by Jack Guinan
Published 27 Jul 2009

A term that measures a particular aspect of a company’s financial well-being, determined by dividing one metric by another metric. The metric in the numerator is typically larger than the one in the denominator, because the top metric usually is supposed to be many times larger than the bottom metric. It is calculated as follows:

$$\text{Multiple} = \frac{\text{Performance Metric “A”}}{\text{Performance Metric “B”}}$$

Investopedia explains Multiple

As an example, the term “multiple” can be used to show how much investors are willing to pay per dollar of earnings, as computed by the P/E ratio. Suppose one is analyzing a stock with $2 of earnings per share (EPS) that is trading at $20; this stock has a P/E of 10.

To hedge that risk, the investor could purchase currency futures to lock in a specified exchange rate for the future stock sale and conversion back into the foreign currency.

Related Terms:
• Credit Derivative
• Hedge
• Stock Option
• Forward Contract
• Option

Diluted Earnings per Share (Diluted EPS)

What Does Diluted Earnings per Share (Diluted EPS) Mean?

A performance metric used to gauge the quality of a company’s earnings per share (EPS) if all convertible securities were exercised. Convertible securities refer to all outstanding convertible preferred shares, convertible debentures, stock options (primarily employee-based), and warrants.

There Is No Planet B: A Handbook for the Make or Break Years
by Mike Berners-Lee
Published 27 Feb 2019

Its latest report, ‘Global Warming of 1.5 °C’, makes its most compelling call so far for urgent global action.

Jobs
A way of spending time that can be useful, fulfilling and which can be a mechanism for appropriate wealth distribution. Worth having when at least two of these three criteria are met, but otherwise not. Therefore to be used with caution as a national performance metric.

Kids (ours)
The people who will have to understand better than their parents, the nature of the Anthropocene challenge and how to deal with it.

Leadership
A scarce and much needed quality for dealing with the issues covered in this book. Anyone can display it in any walk of life and small actions can occasionally go viral.


sustainability 48–50, 141 Index malnutrition and inequalities of distribution 15–16 overeating/obesity 16 personal actions 30, 34–35, 40, 43, 50 research needs 49 rice farming 29–30 soya bean farming 21, 22 supply chains 48 technology in agriculture 45–46 vegetarianism/veganism 26–29 see also waste food food imports, and population growth 150 food markets 130–31 food miles 30–32, 230 food retailers fish 35–36 food wastage 40–42 rice 30 vegetarianism/veganism 28 fossil fuel companies 223 fossil fuels 63–64, 216, 223 carbon pricing 145–47, 209–10 carbon taxes 142–43 evidence against using 64–66 global deals 87–91, 161, 205–6, 208–9 global distribution of reserves 89, 89–90 limitations of using electricity instead 73–86, 85–87 need to leave in the ground 87–91, 161, 205–6, 208–9, 223 sea travel 115 using renewables instead of or as well as 81–82 fracking 79–81, 81, 224 free markets 127–30, 172, 228 free will 95, 167 frog in a pan of water analogy 236, 241 fun 224 281 fundamentalism 176, 192 future scenarios aims and visions 8–9 climate change lag times 204–5 energy use 93–94 planning ahead 204–5 thinking/caring about 187, 191, 229 travel and transport 100–1, 109–10 gambling industry 139–40, 152, 265 gas analogy of wealth distribution 136–39 gas (natural gas) 224; see also fracking; methane GDP big picture perspective 196–97 as inappropriate metric of healthy growth 123–24, 126–27 risks of further growth 121–22 genetic modification 45–46 genuineness 172 geo engineering solutions 224–25 Germany, tax system 145 Gini coefficient of income inequality 144 global cultural norms 171–72, 197 global deals 163 fossil fuels 87–91, 208–10 inequity 210 global distribution, fossil fuels 89–91, 89 global distribution, solar energy 69–71, 70, 89 global distribution, wind energy 74, 74 global food surplus 12, 13 global governance 127–30, 141, 225 global solutions, big picture perspective 196 global systems 5–6, 186, 225 global temperature increases 200–1 282 global thinking skills 186 
global travel, by mode of transport 100 global wealth distribution 130–35, 132, 132, 134, 144, 145 governmental roles big picture perspective 196 climate change policies 51–53, 200–11 energy use policies 59, 97 fishing industry 36 promoting culture of truth 178–80 sustainable farming 29, 45 technological changes 168 wealth distribution 138 see also global governance greed 225–26; see also individualism greenhouse gas emissions 209 exponential growth curves 202–4, 203, 220 food and agriculture 23 market forces 128 measurement 127 mitigation of food waste 42, 43, 43 risks of further growth 120 scientific facts 51–53 units 243 see also carbon dioxide; carbon footprints; methane; nitrogen dioxide greenwash 215, 226 growth 226; see also economic growth; energy use growth; exponential growth hair shirts 212, 224, 226–27 Handy, Charles 236 Happy Planet Index 126 Hardy, Lew 143 Hawking , Stephen 2, 166–67 Hong Kong, population growth 149–50 INDEX How Bad Are Bananas?

pages: 343 words: 103,376

The Alternative: How to Build a Just Economy
by Nick Romeo
Published 15 Jan 2024

It looks more likely to be regulatory Whac-A-Mole. For now, the Department of Labor does not seem interested in helping irregular workers by changing their workforce board performance metrics so that more of them might support public platforms. Rowan has found multiple leaders of workforce boards reluctant to move forward for this reason. He was initially optimistic that the Biden administration would be open to revising the Workforce Innovation and Opportunity Act performance metrics. Unfortunately, even progressive politicians seem afraid of trying any approach to improving conditions for gig workers other than incremental regulation, which to date has proved inadequate.

THE ROAD AHEAD The bad conditions that many gig workers face have led some to conclude that all flexible work is necessarily exploitative and should not be supported by public agencies. In short, a counterreaction motivated by the misdeeds of the dominant gig-work companies now risks sabotaging a project with the potential to help workers. One locus of this conflict is a seemingly narrow issue: the six performance metrics by which America’s public workforce boards are assessed. These metrics reflect a strong preference for standard full-time employment, effectively disincentivizing the leaders of local workforce boards interested in experimenting with a platform that could help part-time workers.36 If your organization is evaluated on the number of people it helps move into full-time positions, spending time and money helping the people who will never work full-time jobs is a risky proposition.

pages: 739 words: 174,990

The TypeScript Workshop: A Practical Guide to Confident, Effective TypeScript Programming
by Ben Grynhaus, Jordan Hudgens, Rayon Hunte, Matthew Thomas Morgan and Wekoslav Stefanovski
Published 28 Jul 2021

All the added functionalities stem from the system in which the class lives. The basketball game, by itself, does not need authorization, or performance metrics, or auditing. But the scoreboard application does need all of those and more. Note that all the added logic is already encapsulated within methods (audit, isAuthorized, logDuration), and the code that actually performs all the aforementioned operations is outside your method. The code you inserted into your function does the bare minimum – yet it still complicated your code. In addition, authorization, performance metrics, and auditing will be needed in many places within your application, and in none of those places will that code be instrumental to the actual working of the code that is being authorized or measured or audited.

The Solution Let's take a better look at one of the concerns from the previous section, the performance metric, that is, the duration measurement. This is something that is very important to an application, and to add it to any specific method, you need a few lines of code at the beginning and a few lines at the end of the method:
    const start = Date.now();
    // actual code of the method
    const end = Date.now();
    logDuration("updateScore", start, end);
You'll need to add this to each and every method you want to measure.
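One way to avoid repeating those timing lines in every method is to write them once in a wrapper function. The following is a minimal sketch under stated assumptions, not the book's solution: `withDuration` is a hypothetical helper, the body of `logDuration` is invented here because the excerpt only names it, and `updateScore` is a stand-in for the scoreboard method.

```typescript
// Hypothetical logger: the excerpt names logDuration but does not show its body.
const durations: string[] = [];
function logDuration(name: string, start: number, end: number): void {
  durations.push(`${name} took ${end - start} ms`);
}

// Hypothetical helper: wraps any function so the timing code is written once.
function withDuration<A extends unknown[], R>(
  name: string,
  fn: (...args: A) => R
): (...args: A) => R {
  return (...args: A): R => {
    const start = Date.now();
    try {
      return fn(...args);
    } finally {
      // Runs whether fn returns or throws, so every call is measured.
      logDuration(name, start, Date.now());
    }
  };
}

// Stand-in for the scoreboard's updateScore method (assumed signature).
function updateScore(current: number, points: number): number {
  return current + points;
}

const timedUpdateScore = withDuration("updateScore", updateScore);
const newScore = timedUpdateScore(10, 3);
```

The wrapped function keeps the original signature, so callers do not change, and the measured code no longer contains any timing logic of its own.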

Second decorator factory:
    function Second () {
      console.log("Generating second decorator")
      return function (constructor: Function) {
        console.log("Applying second decorator")
      }
    }
Now imagine that they are applied on a single target:
    @First()
    @Second()
    class Target {}
The generation process will generate the first decorator before the second, but in the application process, the second will be applied, and then the first:
    Generating first decorator
    Generating second decorator
    Applying second decorator
    Applying first decorator
Activity 7.02: Using Decorators to Apply Cross-Cutting Concerns In this activity, we're going full circle to the basketball game example (Example_Basketball.ts). You are tasked with adding all the necessary cross-cutting concerns, such as authentication, performance metrics, auditing, and validation to the Example_Basketball.ts file in a maintainable manner. You can begin the activity with the code that you already have in the Example_Basketball.ts. First, take stock of the elements that are already present in the file: The interface that describes the team.
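The ordering described above (factories run top to bottom, the decorators they return are applied bottom to top) can be reproduced by hand without the `@` syntax. This sketch is an illustration only: it records messages in a `log` array instead of calling `console.log`, and the manual loop is an assumption standing in for what the compiler emits.

```typescript
const log: string[] = [];

type ClassDecoratorFn = (constructor: Function) => void;

function First(): ClassDecoratorFn {
  log.push("Generating first decorator");
  return () => {
    log.push("Applying first decorator");
  };
}

function Second(): ClassDecoratorFn {
  log.push("Generating second decorator");
  return () => {
    log.push("Applying second decorator");
  };
}

class Target {}

// Generation: the factories are called in source order, top to bottom.
const decorators: ClassDecoratorFn[] = [First(), Second()];

// Application: the resulting decorators are applied in reverse, bottom to top.
for (let i = decorators.length - 1; i >= 0; i--) {
  decorators[i](Target);
}
```

After this runs, `log` holds the four messages in the same order as the output listed in the excerpt.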

pages: 404 words: 107,356

The Future of Fusion Energy
by Jason Parisi and Justin Ball
Published 18 Dec 2018

Hence, we conclude that the power plant designs that would be most attractive to electric utilities (and society) are small, but still produce lots of electricity. Not exactly surprising, right? These arguments tell us that we should do everything we can to maximize the net electric power density of our design, rather than any other quantity. You can think of the power density as being the ultimate performance metric for power plant economics, just like the triple product was the ultimate performance metric for a D–T plasma. With this concrete, quantifiable goal in mind, we can now conclude our discussion of economics and delve into the physics complexities of fusion power plants. As a warning, this chapter is a bit more technical than previous ones — designing a fusion power plant isn’t easy!

Even if a tokamak plasma is ignited, power will be required to drive the plasma current. This external power must be minimized as it detracts from the net electricity that can be sold to the consumer. The balance of this can be seen in Figure 8.1, which shows the electricity that is needed by the plant and the electricity that is produced by the plant. Our ultimate power plant performance metric, the net electric power density, is simply the difference of the two divided by the volume:
Figure 8.1: The flow of power through a tokamak power plant. Here, P_input is the external power required by the plant, η_h&cd is the efficiency of the heating and current drive systems, P_fusion is the fusion power produced by the plasma, f_blanket is the power multiplication that occurs in the tritium breeding blanket, and η_steam is the steam cycle efficiency of the turbine.
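In symbols, the verbal definition of the net electric power density reads roughly as follows. The names $p_{\text{net}}$, $P_{\text{produced}}$, $P_{\text{needed}}$ and $V$ are placeholders chosen here, not the authors' notation, and the expression restates only the sentence "the difference of the two divided by the volume"; it does not expand the two powers in terms of the quantities in the figure caption.

```latex
\[
  p_{\text{net}} \;=\; \frac{P_{\text{produced}} - P_{\text{needed}}}{V}
\]
```

Here $P_{\text{produced}}$ is the electricity produced by the plant, $P_{\text{needed}}$ is the electricity the plant itself consumes, and $V$ is the volume referred to in the text.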

pages: 561 words: 114,843

Startup CEO: A Field Guide to Scaling Up Your Business, + Website
by Matt Blumberg
Published 13 Aug 2013

For example, I wasn’t surprised that there was a high degree of convergence in the way people thought about the organization’s values since we had a strong values-driven culture that people were living every day, even if those values hadn’t been well articulated in the past. But it was a little surprising that we could effectively crowdsource a strategy statement and key performance metrics at a time when the business was at a fork in the road. Given this degree of alignment, our task as an executive team became less about picking concepts and more about picking words. We worked together to come up with a solid draft that took the best of what was submitted to us. We worked with a copywriter to make the statements flow well.

What is the capital required to get there and what are your financing requirements from where your balance sheet sits today? The costs are easier to forecast, especially if you carefully articulated your resource requirements. As everybody in the startup world knows, ROI is trickier. You’re not leading an enterprise that has extremely detailed historical performance metrics to rely on in their forecasting. When Schick or Gillette introduces a new razor into the marketplace, they can very accurately forecast how much it’s going to cost them and what their return will be. If you’re creating a new product in a new marketplace, that isn’t the case. While monthly burn and revenue projections will inevitably change, capital expenditures can be more predictable, though you need to make sure you understand the cash flow mechanics of capital expenditure.

Finally, an earn-out can’t be too high a percentage of the deal. The preponderance will have to be cash and stock. Otherwise, the process of judging performance should be shared by both parties. In one of our largest deals at Return Path, each side appointed representatives who met quarterly to agree on performance metrics, adjustments, and so on. We also designated a third representative in advance who was available to adjudicate any disagreements. We never had to use him. Whatever mechanism you put in place, trust plays a huge role here. If it’s not there, this acquisition might not be a good idea. THE FLIP SIDE OF M&A: DIVESTITURE When Return Path turned six years old in 2005, we had gone from being a startup focused on our initial ECOA business to the world’s smallest conglomerate, with five lines of business: in addition to change of address, we were market leaders in email delivery assurance (a market we created), email-based market research (a tiny market when we started) and email list management and list rental (both huge markets when we founded the company).

pages: 49 words: 12,968

Industrial Internet
by Jon Bruner
Published 27 Mar 2013

Newer wind turbines use software that acts in real-time to squeeze a little more current out of each revolution, pitching the blades slightly as they rotate to compensate for the fact that gravity shortens them as they approach the top of their spin and lengthens them as they reach the bottom. Power producers use higher-level data analysis to inform longer-range capital strategies. The 150-foot-long blades on a wind turbine, for instance, chop at the air as they move through it, sending turbulence to the next row of turbines and reducing efficiency. By analyzing performance metrics from existing wind installations, planners can recommend new layouts that take into account common wind patterns and minimize interference. Automotive Google captured the public imagination when, in 2010, it announced that its autonomous cars had already driven 140,000 miles of winding California roads without incident.

pages: 571 words: 124,448

Building Habitats on the Moon: Engineering Approaches to Lunar Settlements
by Haym Benaroya
Published 12 Jan 2018

Loss Analysis is based on frequency and probability data, and is used to extract performance metrics that are meaningful to facility stakeholders; metrics such as upper bound economic loss during the owner-investor’s planning period. Risk-management decisions can be made based on the loss analysis. The project manager for the design and fabrication of a lunar facility has concerns that go beyond investor metrics. The lunar facility is more than a project; rather, it is something that has an almost metaphysical hold on those who have devoted their lives (figuratively and literally) to its creation. Performance metrics for a lunar base need to be based on survivability and development, as well as technical aspects of its operation.

Such technologies are being studied and developed but are far from being usable. Do you see self-healing as a critical technology? We had a program called InFlex (Intelligent Flexible Materials for Deployable Space Structures) in the early 2000s. The focus was increasing inflatable structures performance metrics from habitats, to space suits, to aeroshells. We categorized each individual threat (Micrometeoroid and Orbital Debris [MMOD], impact from external equipment, impact from inside, material degradation, etc.) in every possible environment (LEO, Moon, Mars, etc.) and looked at what technologies made sense to deal with the threats (self-healing, structural health monitoring, layered materials, etc.).

pages: 291 words: 77,596

Total Recall: How the E-Memory Revolution Will Change Everything
by Gordon Bell and Jim Gemmell
Published 15 Feb 2009

Proceedings of International IEEE Workshop on Wearable and Implantable Body Sensor Networks (BSN), Aachen, Germany, March 2007. Schlenoff, Craig, et al. “Overview of the First Advanced Technology Evaluations for ASSIST.” Proceedings of Performance Metrics for Intelligent Systems (PerMIS) 2006, IEEE Press, Gaithersburg, Maryland, August 2006. Stevers, Michelle Potts. “Utility Assessments of Soldier-Worn Sensor Systems for ASSIST.” Proceedings of the Performance Metrics for Intelligent Systems Workshop, 2006. Starner, Thad. “The Virtual Patrol: Capturing and Accessing Information for the Soldier in the Field.” Proceedings of the 3rd ACM Workshop on Continuous Archival and Retrieval of Personal Experiences, Santa Barbara, California, 2006.

pages: 231 words: 71,248

Shipping Greatness
by Chris Vander Mey
Published 23 Aug 2012

This is a good, but not sufficiently specific, framework. I prefer the Great Delta Convention (described in Chapter 10). If you apply the Great Delta Convention to your goals, nobody will question them—they will almost be S.M.A.R.T. by definition (lacking only the “reasonable” part). Business Performance Business performance metrics tell you where your problems are and how you can improve your user’s experience. These metrics are frequently measured as ratios, such as conversion from when a user clicks the Buy button to when the checkout process is complete. Like goal metrics, it’s critical to measure the right aspects of your business.

Most major websites have testing frameworks that they use to roll out features incrementally and ensure that a new feature or experience has the intended effect. If it’s even remotely possible, try to build an experimentation framework in from the beginning (see Chapter 7’s discussion of launching for other benefits of experiments). Systems Performance Systems performance metrics measure the health of your product in real time. Metrics like these include 99.9% mean latency, total requests per second, simultaneous users, orders per second, and other time-based metrics. When these metrics go down substantially, something has gone wrong. A pager should go off. If you’re a very fancy person, you’ll want to look at your metrics through the lens of statistical process control (SPC).

Designing Search: UX Strategies for Ecommerce Success
by Greg Nudelman and Pabini Gabriel-Petit
Published 8 May 2011

A/B Testing and Multivariate Testing To ensure you create successful design solutions that meet both business and customer goals, your team should conduct frequent, quantitative A/B testing of your no search results page and other search results pages. Follow up with qualitative lab and field testing to help you make sense of your A/B testing results and suggest ideas for future improvements. The central idea behind A/B testing is to have two different user interface designs running on your site at the same time, while collecting key performance metrics (KPMs) that enable you to measure desired customer behavior. For example, say you want to introduce some improvements to your current no search results page, which you can call variant A. To determine whether your redesigned version of the page, variant B, offers any improvement, you can deploy variant B and send a small percentage of site visitors—for example, 1% to 10%—to that server and observe the metrics for variant B: Did those visitors buy more stuff?
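The traffic split described above can be sketched in a few lines. Everything named here is an assumption made for illustration: the excerpt routes visitors to a separate server, whereas this sketch assigns variants with a simple, non-cryptographic hash of a hypothetical visitor ID, and the only metric computed is the share of visits that ended in a purchase.

```typescript
type Variant = "A" | "B";

interface Visit {
  visitorId: string; // hypothetical identifier, e.g. a cookie value
  purchased: boolean;
}

// Map a visitor ID to a stable number in [0, 100). Illustrative hash only.
function hashToPercent(visitorId: string): number {
  let hash = 0;
  for (let i = 0; i < visitorId.length; i++) {
    hash = (hash * 31 + visitorId.charCodeAt(i)) % 10000;
  }
  return hash / 100;
}

// The same visitor always lands in the same variant for a given split.
function assignVariant(visitorId: string, percentToB: number): Variant {
  return hashToPercent(visitorId) < percentToB ? "B" : "A";
}

// Share of a variant's visits that ended in a purchase (0 if it had no visits).
function conversionRate(visits: Visit[], variant: Variant, percentToB: number): number {
  const inVariant = visits.filter(
    (v) => assignVariant(v.visitorId, percentToB) === variant
  );
  if (inVariant.length === 0) {
    return 0;
  }
  return inVariant.filter((v) => v.purchased).length / inVariant.length;
}
```

Calling `conversionRate(visits, "A", 5)` and `conversionRate(visits, "B", 5)` on the same visit log would give the two numbers to compare for a 5% split. Whether a difference between them is meaningful is a separate statistical question that this sketch does not address.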

Finally, it is important to note that, although this framework can be helpful and these five ecommerce search roles apply to most ecommerce projects, this generalized framework is not precise. For any framework to be maximally useful for a specific project, you should refine it through direct observation of customers and careful study of key performance metrics. As the prominent philosopher and the father of General Semantics Alfred Korzybski so eloquently stated, “The map is not the territory.” You can neither camp on the little triangles that represent mountains on a map nor go swimming in those blue patches of ink that represent lakes. Rather than viewing this role framework as “reality,” use it as you would a map—to help you navigate your ecommerce search design projects, and as the foundation for developing your own approach—subject to change as you get more data and gain a better understanding of the needs and behaviors of your customers.

Hands-On Machine Learning With Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems
by Aurelien Geron
Published 14 Aug 2019

This is controlled by the n_init hyperparameter: by default, it is equal to 10, which means that the whole algorithm described earlier actually runs 10 times when you call fit(), and Scikit-Learn keeps the best solution. But how exactly does it know which solution is the best? Well of course it uses a performance metric! It is called the model’s inertia: this is the sum of the squared distances between each instance and its closest centroid. It is roughly equal to 223.3 for the model on the left of Figure 9-5, 237.5 for the model on the right of Figure 9-5, and 211.6 for the model in Figure 9-3. The KMeans class runs the algorithm n_init times and keeps the model with the lowest inertia: in this example, the model in Figure 9-3 will be selected (unless we are very unlucky with n_init consecutive random initializations).
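To make the selection rule concrete, here is a small sketch that computes inertia by hand and keeps the candidate with the lowest value. It is not Scikit-Learn's implementation: it follows the library's documented definition of `inertia_` as the sum of squared distances to the closest centroid, and the helper names and the four sample points are invented for illustration.

```typescript
type Point = number[];

function squaredDistance(a: Point, b: Point): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return sum;
}

// Inertia: for each instance, the squared distance to its closest centroid, summed.
function inertia(points: Point[], centroids: Point[]): number {
  let total = 0;
  for (const p of points) {
    let best = Infinity;
    for (const c of centroids) {
      best = Math.min(best, squaredDistance(p, c));
    }
    total += best;
  }
  return total;
}

// Keep the candidate solution with the lowest inertia, as repeated runs would.
function bestSolution(points: Point[], candidates: Point[][]): Point[] {
  let best = candidates[0];
  for (const c of candidates) {
    if (inertia(points, c) < inertia(points, best)) {
      best = c;
    }
  }
  return best;
}

// Invented data: two tight pairs of points, far apart from each other.
const points: Point[] = [[0, 0], [0, 2], [10, 0], [10, 2]];
const goodCentroids: Point[] = [[0, 1], [10, 1]]; // one centroid per pair
const poorCentroids: Point[] = [[5, 0], [5, 2]];  // centroids between the pairs
const chosen = bestSolution(points, [poorCentroids, goodCentroids]);
```

With these numbers the first candidate puts every point at squared distance 25 from its nearest centroid and the second at squared distance 1, so the second candidate is the one kept.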

Bad choices for the number of clusters You might be thinking that we could just pick the model with the lowest inertia, right? Unfortunately, it is not that simple. The inertia for k=3 is 653.2, which is much higher than for k=5 (which was 211.6), but with k=8, the inertia is just 119.1. The inertia is not a good performance metric when trying to choose k since it keeps getting lower as we increase k. Indeed, the more clusters there are, the closer each instance will be to its closest centroid, and therefore the lower the inertia will be. Let’s plot the inertia as a function of k (see Figure 9-8): Figure 9-8. Selecting the number of clusters k using the “elbow rule” As you can see, the inertia drops very quickly as we increase k up to 4, but then it decreases much more slowly as we keep increasing k.

pages: 342 words: 72,927

Transport for Humans: Are We Nearly There Yet?
by Pete Dyson and Rory Sutherland
Published 15 Jan 2021

Created by Ogilvy for Deutsche Bahn, when German Instagram users search for glamorous destinations, an algorithm shows them an attraction of similar beauty much closer to home. Figure 20. Travel can be reframed to promote domestic destinations, with social media enabling smart and timely targeting. 5. Design for perception first Recall Goodhart’s Law: quantification means that a lot of investment is spent improving performance metrics. In comparison, much less effort is needed to improve perceptual ones. This insight is known to proponents of Kano theory, a Japanese model of product development.37 It states that products have ‘delight attributes’ that make us extremely happy but are tangential to what the product is designed to do.

Highways England reduced the level of error in forecasting benefits from a peak of 30% in 2002 to 3% seven years later. It did this by setting up ‘post-opening project evaluations’ at one and five years after completion. This created a meta-report of all schemes on a two-year cycle. The method went deeper than traditional operational performance metrics by surveying local businesses and communities to evaluate the social and human impacts of projects. It revealed who valued a new road, and what they now used it for. The What Works Centre, established by the London School of Economics in partnership with Arup, is also applying a rigorous experimental approach, using a difference-in-differences method.

pages: 444 words: 86,565

Investment Banking: Valuation, Leveraged Buyouts, and Mergers and Acquisitions
by Joshua Rosenbaum, Joshua Pearl and Joseph R. Perella
Published 18 May 2009

Second, we analyze and compare the trading multiples for the peer group, placing particular emphasis on the best comparables. Benchmark the Financial Statistics and Ratios The first stage of the benchmarking analysis involves a comparison of the target and comparables universe on the basis of key financial performance metrics. These metrics, as captured in the financial profile framework outlined in Steps I and III, include measures of size, profitability, growth, returns, and credit strength. They are core value drivers and typically translate directly into relative valuation. The results of the benchmarking exercise are displayed on spreadsheet output pages that present the data for each company in an easy-to-compare format (see Exhibits 1.53 and 1.54).

As shown in Exhibit 3.39, this approach led us to hold capex constant throughout the projection period at 2% of sales. Based on this assumption, capex increases from $21.6 million in 2009E to $25.3 million in 2013E. EXHIBIT 3.39 ValueCo Historical and Projected Capex Change in Net Working Capital Projections As with ValueCo’s other financial performance metrics, historical working capital levels normally serve as reliable indicators of future performance. The direct prior year’s ratios are typically the most indicative provided they are consistent with historical levels. This was the case for ValueCo’s 2007 working capital ratios, which we held constant throughout the projection period (see Exhibit 3.40).
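The capex assumption above is simple enough to check with a few lines of arithmetic. This sketch uses only the figures quoted in the excerpt (2% of sales, $21.6 million in 2009E, $25.3 million in 2013E). The sales levels and growth rate it derives are implied by those figures, not taken from the book's exhibits, and the function names are made up for illustration.

```typescript
const CAPEX_PERCENT_OF_SALES = 0.02; // held constant over the projection period

// Capex projected from a sales figure (both in $ millions).
function projectedCapex(sales: number): number {
  return sales * CAPEX_PERCENT_OF_SALES;
}

// Sales level implied by a capex figure under the same assumption.
function impliedSales(capex: number): number {
  return capex / CAPEX_PERCENT_OF_SALES;
}

// Compound annual growth rate between two values over a number of years.
function cagr(start: number, end: number, years: number): number {
  return Math.pow(end / start, 1 / years) - 1;
}

const impliedSales2009 = impliedSales(21.6); // about 1,080 ($ millions)
const impliedSales2013 = impliedSales(25.3); // about 1,265 ($ millions)

// Because capex is a fixed share of sales, it grows at the same rate as sales:
// roughly 4% per year over the four years from 2009E to 2013E.
const impliedGrowth = cagr(21.6, 25.3, 4);
```

Holding the ratio constant means the capex line carries no independent forecast; any change in it comes entirely from the sales projection.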

pages: 277 words: 81,718

Vassal State
by Angus Hanton
Published 25 Mar 2024

The cat was out of the bag for treating work seriously and producing outsized returns in relation to effort. Right through the twentieth century America developed many new systems for better processes and management. ‘Lean manufacturing’ sought to streamline production and reduce waste; ‘Six Sigma’ was developed by Motorola to improve quality in a systematic way; and ‘performance metrics’ made sure that key performance indicators (KPIs) were measured.34 American businesses developed a ‘continuous improvement culture’ encouraging all employees to look for opportunities to solve problems and improve efficiency, and ‘agile methodologies’ allowed iterative and incremental improvements to efficiency.35 Other countries copied, but only very slowly, and they were almost always behind the US curve and forced to play catch-up.

Drucker, ‘The rise of the knowledge society’ [PDF], Wilson Quarterly 17/2 (1993), 33 ‘Hawthorne effect’, Wikipedia [website], 34 See, for instance, ‘Lean manufacturing’, Wikipedia [website],; ‘Six Sigma’, Wikipedia [website],; ‘What are performance metrics?’, Sage [website], 35 See, for instance, Helen Gray, ‘What is continuous improvement? (And how to include it in your company culture)’, eLearning Industry [website] (19 January 2021),; Sarah Laoyan, ‘What is agile methodology? (A beginner’s guide)’, Asana [website] (15 October 2022), 36 See, for instance, ‘JIT just-in-time manufacturing’, University of Cambridge [website], 37 ‘Marshall Field (1834–1906)’, American Experience [website], 38 Quoted in Charles Arthur, ‘Walled gardens look rosy for Facebook, Apple – and would-be censors’, Guardian (17 April 2012), 39 Keshav Thakur, ‘Gillette’s razor-sharp success: how a simple strategy built a business empire’, LinkedIn [website] (22 November 2023), 40 ‘EG Group reaches deal to buy Tesla Superchargers for its forecourts’, Car (14 November 2023), 41 ‘Market share held by mobile operating systems in the United Kingdom (UK) from January 2018 to July 2023’, Statista [website] (4 September 2023), 42 Denny Ludwell, America Conquers Britain: A Record of Economic War (New York: A.

pages: 92 words: 23,741

Lessons From Private Equity Any Company Can Use
by Orit Gadiesh and Hugh MacArthur
Published 14 Aug 2008

Additionally, the measurement of performance via quantitative metrics (such as return on investment, or ROI) gave the sales force a new communication tool with their clients. The sales force found that renewal rates were higher in categories where they could articulate a high ROI from the client’s ad expenditure. The new performance metrics also provided a regular report card on whether the SYP turnaround was on track. They made transparent for the first time how much revenue per customer each dollar invested in the sales effort yielded. The clear metrics helped give a powerful signal to other potential investors that the new sales force strategy was working.

pages: 516 words: 157,437

Principles: Life and Work
by Ray Dalio
Published 18 Sep 2017

Look at what people in comparable jobs with comparable experience and credentials make, add some small premium over that, and build in bonuses or other incentives so they will be motivated to knock the cover off the ball. Never pay based on the job title alone. b. Have performance metrics tied at least loosely to compensation. While you will never fully capture all the aspects that make for a great work relationship in metrics, you should be able to establish many of them. Tying performance metrics to compensation will help crystallize your understanding of your deal with people, provide good ongoing feedback, and influence how the person behaves on an ongoing basis. c. Pay north of fair.

Look for people who have lots of great questions. b. Show candidates your warts. c. Play jazz with people with whom you are compatible but who will also challenge you. 8.6 When considering compensation, provide both stability and opportunity. a. Pay for the person, not the job. b. Have performance metrics tied at least loosely to compensation. c. Pay north of fair. d. Focus more on making the pie bigger than on exactly how to slice it so that you or anyone else gets the biggest piece. 8.7 Remember that in great partnerships, consideration and generosity are more important than money. a. Be generous and expect generosity from others. 8.8 Great people are hard to find so make sure you think about how to keep them. 9 Constantly Train, Test, Evaluate, and Sort People 9.1 Understand that you and the people you manage will go through a process of personal evolution.

pages: 290 words: 87,549

The Airbnb Story: How Three Ordinary Guys Disrupted an Industry, Made Billions...and Created Plenty of Controversy
by Leigh Gallagher
Published 14 Feb 2017

But decline too many requests or respond too slowly or cancel too many reservations or simply appear inhospitable in reviews, and Airbnb can drop a powerful hammer: it can lower your listing in search results or even deactivate your account. Behave well, though, and Airbnb will shine its love upon you. If you hit a certain series of performance metrics—in the past year, if you have hosted at least ten trips, if you have maintained a 90 percent response rate or higher, if you have received a five-star review at least 80 percent of the time, and if you’ve canceled a reservation only rarely or in extenuating circumstances, you are automatically elevated to “Superhost” status.
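The four criteria listed above can be read as a simple rule check. The following is a minimal sketch under stated assumptions, not Airbnb's actual logic: the `HostYear` record layout and the `max_cancellations` parameter are hypothetical, and the passage gives no number for cancellations beyond "only rarely".

```python
from dataclasses import dataclass


@dataclass
class HostYear:
    """One host's activity over the past year (hypothetical record layout)."""
    trips_hosted: int
    response_rate: float    # fraction of requests answered, 0.0 to 1.0
    five_star_share: float  # fraction of reviews that were five-star
    cancellations: int      # cancellations without extenuating circumstances


def is_superhost(host: HostYear, max_cancellations: int = 0) -> bool:
    """Apply the four criteria quoted in the passage.

    max_cancellations is an assumption: the passage says only "rarely".
    """
    return (
        host.trips_hosted >= 10
        and host.response_rate >= 0.90
        and host.five_star_share >= 0.80
        and host.cancellations <= max_cancellations
    )


# Made-up hosts for illustration only.
good = HostYear(trips_hosted=12, response_rate=0.95, five_star_share=0.85, cancellations=0)
slow = HostYear(trips_hosted=12, response_rate=0.70, five_star_share=0.85, cancellations=0)
print(is_superhost(good))  # True
print(is_superhost(slow))  # False
```

Because every criterion must hold at once, a host who falls short on a single measure, such as response rate, loses the status regardless of the others.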

W., 139 Maslow, Abraham, 70–71, 92 Mason, Andrew, 49 matching (guest and host), 44–45 McAdoo, Greg, 30–31, 35–36, 164 McCann, Pol, 74, 116, 117 McChrystal, Stanley, 173, 186 McGovern, George, xvi, 167 McNamara, Robert, 166 media and press Airbnb in pop culture, xv–xvi, 60–61 at conventions (2009), 38 Democratic National Convention coverage, 19–20 “Meet Carol” television ad, 112 negative exposure, 50–55, 80–82, 86, 91 presidential inauguration, 28 “Meet Carol” television ad, 112 Meyer, Danny, 191 Michael (original guest), 8, 10 Mildenhall, Jonathan, 64 millennials as Airbnb early adopters, xii, xiii, 59, 66, 150–51, 157–58 apartments and, 129–30 hotel industry and, 141, 152 as mobilizing force, 134 New York and, 108 mission statement, xiv, xix, 36, 64–67, 78–79, 117, 171, 172, 194, 205 Moore, Geoffrey, 181, 188 Morey, Elizabeth, 31 Morgan, Jonathan, 74–75, 116, 117, 134 Morgan Stanley, 145 Morris, Phil, 202 Moxy, 152 multifamily buildings, 129–31 Multiple Dwelling Law, 107, 115 multiunit listings, 110–13, 116–17 Murphy, Laura, 102, 171 Mushroom Dome, 60, 183 Musk, Elon, 196 N Nassetta, Christopher, 141–42 neighbors, 83–85, 109, 118–19, 132–34 network effect, 40–41 New Jersey, 126 New York City, 105–37 anti-Airbnb alliance, 109 attorney general’s report, 109–110 Chesky’s reaction to, 113 commercial “multiunit” listings, 110–13, 115–16 customer base, 26–28, 106, 119, 126 future negotiations, 133–37 objections to short-term rentals, 118–24 Warren verdict, 108–9 Noirbnb, 102 O Oasis, 154, 155–56 Obama, Barack, 18, 28, 92, 161–62, 173–74, 209 Obama O’s, 20–23, 24, 33, 47, 174 Olympics, 156 “one host, one home” policy, 114 onefinestay, 153, 154–55, 158 online travel agencies (OTAs), 148 Open Doors policy, 102 Orbitz, 148 Orlando, 142 Oswald, Lee Harvey, xvi P Packard, Dave, 1 Paltrow, Gwyneth, 59, 60, 191 Panetta, Leon, x Paris, Airbnb Open, 77–78 parties, 81–90 Patel, Elissa, 159, 209 Patton, George S., 166 payment system, 14, 16, 27, 39–40, 42–43 PayPal, 43 Peak 
(Conley), 70–71 Penz, Hans, 200 performance metrics, 72–73 photography, xvii, 27, 45, 99, 100–104, 206 Pillow, 75 politics Airbnb as force for change, 126–28 Airbnb guests and, 133 future negotiations, 133–37 Lehane and, 125–29 New York advertising policy, 121–22 New York short-term rentals, 105–10 pop culture, xv–xvi, 60–61 popular listings, 60 Pressler, Paul, 196 Priceline, 148, 154, 198 pricing, as issue, 27, 99–100 privacy policy, 87, 115 product evolution, 59–60 product/market fit, 34–37 professional operators, Airbnb, 111 profit and earnings, 73, 110, 112–13, 127 property management, 129 Proposition F, 128–29 prototype operations, 177–78 R Rabois, Keith, 31 racial discrimination, 99–104 “Ramen profitable,” 26, 29 rankings, 16, 72–73, 162 Rasulo, Jay, 196 Rausch Street apartment, 7–8, 14, 25, 36–38, 179, 183, 208 rebranding, 64–67, 78–79 regulations.

pages: 324 words: 92,805

The Impulse Society: America in the Age of Instant Gratification
by Paul Roberts
Published 1 Sep 2014

On the downside, Autor told me, those jobs will always be low-wage “because the skills they use are generic and almost anyone can be productive at them within a couple of days.”34 And, in fact, there will likely be far more downsides to these jobs than upsides. For example, because Big Data will allow companies to more easily and accurately measure worker productivity, workers will be under constant pressure to meet specific performance metrics and will be subject to constant ratings, just as restaurants and online products are today. Companies will assess every data point that might affect performance, so that every aspect of employment, from applying for a job to the actual performance of duties, will become much more closely scrutinized and assessed.

“Rather than balancing our budget with higher taxes or lower benefits,” Cowen says, “we will allow the real wages of many workers to fall, and thus we will allow the creation of a new underclass.” Certain critics have found such dystopic visions far too grim. And yet, the signs of such a future are everywhere. Already, companies are using Big Data performance metrics to determine whom to cut—meaning that to be laid off is to be branded unemployable. In the ultimate corruption of innovation, a technology that might be used to help workers upgrade their skills and become more secure is instead being used to harass them. To be sure, Big Data will be put to more beneficial uses.

pages: 323 words: 90,868

The Wealth of Humans: Work, Power, and Status in the Twenty-First Century
by Ryan Avent
Published 20 Sep 2016

What our firm is, is not so much a business that produces a weekly magazine, but a way of doing things consisting of an enormous set of processes. You run that programme, and you get a weekly magazine at the end of it. Employees want job security, to advance, to receive pay rises. Those desires are linked to tangible performance metrics; within The Economist, it matters that a writer delivers the expected stories with the expected frequency and with the expected quality. Yet that is not all that matters. Advancement is also about the extent to which a worker thrives within a culture. What constitutes thriving depends on the culture.

The notion of a ‘disruptive’ technology was first described in detail by Clayton Christensen, a scholar at Harvard Business School.4 Disruption is one of the most important ideas in business and management to emerge over the last generation. A disruptive innovation, in Christensen’s sense, is one that is initially not very good, in the sense that it does badly on the performance metrics that industry leaders care about, but which then catches on rapidly, wrong-footing older firms and upending the industry. Christensen explained his idea through the disk-drive industry, which was once dominated by large, 8-inch disks that could hold lots of information and access it very quickly.

pages: 374 words: 94,508

Infonomics: How to Monetize, Manage, and Measure Information as an Asset for Competitive Advantage
by Douglas B. Laney
Published 4 Sep 2017

The high-level metrics framework includes the following attributes, definition, and sample metrics, along with performance measures transposed into sample information supply chain metrics:

Performance Attribute: Reliability

Classic Supply Chain Performance Attribute Definition (note 2): The ability to perform tasks as expected. Reliability focuses on the predictability of the outcome of a process. Typical metrics for the reliability attribute include: on-time, the right quantity, the right quality.

Sample Information Supply Chain Performance Metrics:
• Query/update performance
• Data quality (accuracy, completeness, timeliness, integrity, etc.)

Power 57, 67, 229 Jessup, Beau Rose 35 Juniper Networks 33 keiretsus 132 Keough, Don 132 Knowledge Centered Support (KCS) 171n15 knowledge management (KM): information strategy 179; knowhow 155–6; people 191 “KnowMe” program, Westpac 54 Kosmix 31 Kovitz Investment 164 Kraft 39 Kreditech 62 Krishna, Dilip 260 Kroger 32, 36 Kumar, V. 30 Kushner, Theresa 272 Kyoto Protocol 63 Ladley, John 25, 120n13, 148, 187 34 Latulippe, Barb 188, 195, 234–5 Leatherberry, Tonie 260 legal rulings, information property rights 303–5 LexisNexis 57 liability, information as 216 library and information science (LIS) 156–8 library science: information strategy 179–80; metrics 184 lifecycle 252; see also information lifecycle LinkedIn 64 liquidity, information 20–1 Lockheed Martin 41–3, 62; project information 88 Logan, Valerie 144, 244 logical data warehouse (LDW) 181 Loss Adjustment Expense (LAE) 95 Lovelock, James 144n7 Lowans, Brian 272 loyalty 15, 22, 37, 80, 246 Lyft 141 McCrory, Dave 261 McDonald’s 236 McGilvray, Danette 272 McKnight, William 148 Magic Quadrants 68 marginal utility 273, 276; architecting for optimized information utility 278–9; concept of 284n4; of information 276–8; information for people 277; of information for people 277; information for technologies 278; of information for technologies 278; law of diminishing 276; negative 276; positive 276; understanding 273 market: cultivating for information product 73; entering new 35–6; market value of information (MVI) 257, 262, 266, 274; monetization success 74 Mashey, James 101n8 Mears, Rena 260 measurement: business-related benefits 244–5; data quality 246–8; future of infonomics 292; information assets 242–6; information asset valuation models 249–60; information–related benefits 242–4; information valuation 260–1; value of information 246–61 Mechanical Turk 65 Medicaid 44, 98 Medicare 44, 98 Megaupload 225 Memorial Healthcare System 62 Merck 66 Mercy Hospitals 98 metrics: applied asset management 184–6; assessing data 
quality 246–9; information asset management 182–6; information management challenges 299; information supply chain 126–7; objective quality 247–8; subjective quality 249 Microsoft 42, 223–5, 239n8 Microstrategy 133 Miller, Nolan 231, 272 Mishra, Gokula 235 MISO Energy 267 Mobilink 80 Mondelez 39 monetization 11–13, 16–18, 20, 29, 31–2, 34, 40–1, 44, 46, 55–6, 66–9, 80–100, 139, 176, 195, 244–5, 257, 265–6, 273, 277, 281, 286–7, 290, 292 monetizing information 28–9; analytics as engine of 77; bartering for favorable terms and conditions 38–9; bartering for goods and services 37–8; being in information business 48–9; creating supplemental revenue stream 32–3; defraying costs of information management and analytics 40–1; enabling competitive differentiation 36–7; entering new markets 35–6; future of infonomics 292; improving citizen well-being 45–8; increasing customer acquisition and retention 29–31; introducing a new line of business 33–4; measurement benefits 244–5; more than cash for 14–16; myths of 12–13; possibilities of 11–12; recognizing organizational roadblocks to 287–8; reducing fraud and risk 44–5; reducing maintenance costs, cost overruns and delays 41–4; success of 74–6; understanding unstructured information 94–5; value of 265–6 monetizing information steps 55–74; alternatives for direct and indirect monetization 66–8; available information assets 59–66; feasibility of ideas 69–73; high-value ideas from other industries 69; information product management function 56–9; market cultivation for information 73–4; preparing data for monetization 73 Monsanto 8, 21 Mozenda 65 Mullier, Graham 239n15 multiple listing service (MLS) 53–4 Multispectral 35 Nabisco 39 Nash Equilibrium 273 National Health Service 232 Naudé, Glabriel 156 Negroponte, Nicholas 147 Netflix 59 network effect 25, 27n14 New Jersey, state of 179, 192 New York City, reducing fraud and risk 44–5 New York Stock Exchange 23, 134 Ng, Andrew 231 Nielsen 34, 57, 63, 67 non-rivalrous 19, 131, 235, 
256, 277, 280 non-rivalry principle 142, 274 Nordic Wellness Products 33 North Atlantic Treaty Organization (NATO) 147–8 Oberholzer-Gee, Felix 169 Ocean Tomo 213 O’Neal, Kelle 272 98 operational data, information asset 61 opportunity cost for information 279–80 Orange 33 ownership: brief history of 223; habeas corpus of rulings on 226–7; information 221–2, 237–8; information location 223–5; infothievery 225–6; internal, of information 233–7; owning usage of information 230–1; personal, of personal information 231–3; see also information ownership; property, information as Pacioli, Luca 210 Panchmatia, Nimish 43 Patel, Ash 134 patent 1, 19, 28; algorithm 239nn17–18; applications 230–1, 260; economic value 229; intangible asset 168, 207; intellectual property 128, 130 Patrick, Charlotte 34 People Capability Maturity Model (P-CMM) 165–6 people-process-technology 99 Pepsi 40 performance metrics, information supply chain 126–7 performance value of information (PVI) 254–5 periodicity 248 personal information, ownership of 231–3 personally identifiable information (PII) 25, 76, 178 physical asset management 158–63; asset condition 161–2; asset maintenance and replacement costs 162–3; asset register 160–1; asset risks 161–2; governance 188; vision 176 Pigott, Ian 7 Pinterest 31 Plotkin, David 187 PNC Bank 192 Poste Italiane 47 Post Malaysia 47 Potbot 64 Preska, Loretta 224 Prevedere Software 97 Price, James 106, 114–15, 181, 186, 272 price elasticity of information 275–6 process: applied asset management 194–6; information management 193–6; information management challenges 301–2; maturity, challenges and remedies 193–4 production possibility frontier 280 productive efficiency 280 productization, monetization success 75 product management function 56–9 profitability, information 24–6 ProgrammableWeb 63 property, information as 227–30 public data, information asset 64 Publicly Available Specification (PAS): metrics 184; physical asset management 158–60, 162–3 public-private 
partnerships 47 quality see data quality Radio Shack 141 Rajesekhar, Ruchi 267 Raskino, Mark 37 53 records information management (RIM) 152–3 Reddit 23 Redman, Thomas 235, 272 relationships, bartering for 38–9 reusable nature, information 19–20 revenue: monetization success 74; supplemental stream 32–3; value of expanded 266 Ricardian Rent 273 risk 12, 14, 28, 44–5, 62, 74–5, 85, 89, 91–2, 106, 115, 125, 139, 152, 159–61, 185, 194, 216, 242–4, 256–7, 286–90; reduction 44–5, 74–5, 85, 89; monetization success 74 Rite Aid 32 Roosevelt, Franklin 37 Rosenkranz, E.

pages: 312 words: 92,131

Beginners: The Joy and Transformative Power of Lifelong Learning
by Tom Vanderbilt
Published 5 Jan 2021

Eventually, he starts losing, because “his whole game is based on this plateau.” He professes the “love of the game” but has a “tight and hangdog…smile.” But surfing wasn’t competitive for me. There wasn’t any marathon finishing time or cycling PR I was trying to beat. I didn’t have any quantified performance metrics. I didn’t know what “losing” meant in surfing, other than perhaps losing the joy of doing the thing. If the day came when I felt bored riding down the line, I could start experimenting with other things. I could travel to new places; I could buy a new board. I was far from jaded. The smallest waves at Rockaway gave me a little charge when I saw them on the surfcam.

Patricia soon joined me, but only because she had another full week of swimming on deck and wanted to pace herself. It was all a bit embarrassing, but it also felt, strangely, exhilarating. My travails in the water were actually one of the things I loved about open-water swimming. I appreciated that the ocean was, for me, one big blank slate. On a bike, I had a precisely calibrated sense of my performance metrics and an obsessive sense of obligation to meet or exceed them; I spent hours on the “sports social network” site Strava studying my rides, seeing what imaginary trophies I could amass or how I stacked up against people I knew. With swimming, I not only had no sense of what good swimming times were; I found I didn’t care—and, well, good thing!

pages: 98 words: 25,753

Ethics of Big Data: Balancing Risk and Innovation
by Kord Davis and Doug Patterson
Published 30 Dec 2011

We live in an age when the amount of data we expect to be generated in the world is measured in exabytes and zettabytes. By 2025, the forecast is that the Internet will exceed the brain capacity of everyone living on the entire planet. Additionally, the variety of sources and data types being generated expands as fast as new technology can be created. Performance metrics from in-car monitors, manufacturing floor yield measurements, all manner of healthcare devices, and the growing number of Smart Grid energy appliances all generate data. More importantly, they generate data at a rapid pace. The velocity of data generation, acquisition, processing, and output increases exponentially as the number of sources and increasingly wider variety of formats grows over time.

pages: 98 words: 27,201

Are Chief Executives Overpaid?
by Deborah Hargreaves
Published 29 Nov 2018

Although at first it seemed as though this binding vote would be a damp squib, it has proved important in some cases and it has focused the attention of companies who might be tempted to manipulate their performance goals. In conjunction with this binding vote, companies are also required to make their performance metrics clearer and more understandable. They also have to present a table about how much will be paid out for certain levels of performance, which is meant to make the remuneration report easier to understand and in an accessible format. These reforms were aimed at simplifying the multiple pages of the remuneration report contained in a company’s annual report, but they have done little to address the complexity of executive pay.

pages: 556 words: 46,885

The World's First Railway System: Enterprise, Competition, and Regulation on the Railway Network in Victorian Britain
by Mark Casson
Published 14 Jul 2009

The engineering assumptions are very conservative relative to actual railway practice, while the use of detailed land surveys and large-scale maps means that major infringements of local parks and amenities have been avoided.

1.4. PERFORMANCE METRICS: DISTANCE AND TIME

Two main performance metrics are used in this study: journey distance and journey time. The most obvious metric by which to compare the actual and counterfactual systems is by the route mileages between pairs of towns. This metric is not quite so useful as it seems, however. For many types of traffic, including passengers, mail, troops, and perishable goods, it is the time taken by the journey that is important and not the distance per se.

In practice the counterfactual system, being smaller, would have been completed much earlier than the actual system, assuming that the pace of construction had been the same. Thus the average working life of the counterfactual system would have been longer—another advantage which is not formally included in the comparison.

3.4. CONSTRUCTION OF THE COUNTERFACTUAL: PERFORMANCE METRICS

To compare the performance of the actual and counterfactual systems a set of 250 representative journeys was examined. Ten different types of journey were distinguished, and sub-samples of 25 journeys of each type were generated. Performance was measured for each type of journey, and an overall measure of performance, based on an arithmetic average, was constructed.
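The sampling scheme described here (ten journey types, 25 journeys each, an arithmetic average overall) can be sketched in a few lines. This is an illustrative reading only: the function name, the journey-type labels, and the journey times below are placeholders invented for the example, not data from the study, and the passage does not say exactly how the per-type figures were combined beyond "an arithmetic average".

```python
from statistics import mean


def overall_performance(times_by_type: dict[str, list[float]]) -> float:
    """Average each journey type, then take the arithmetic mean of those averages."""
    per_type = {name: mean(times) for name, times in times_by_type.items()}
    return mean(per_type.values())


# Placeholder data: 10 journey types x 25 journeys, times in minutes (invented).
sample = {
    f"type_{i}": [60.0 + i + j for j in range(25)]
    for i in range(10)
}

print(sum(len(times) for times in sample.values()))  # 250
print(overall_performance(sample))  # 76.5
```

With equal sub-samples of 25, the mean of the ten per-type averages equals the mean over all 250 journeys, so no journey type is weighted more heavily than another.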

R. 367 Clitheroe as secondary natural hub 83 Tab 3.4 Clyde River 199 Clyde Valley 156 coal industry 1, 50 exports 5 see also regional coalfields coal traffic 53, 182–3, 270 coalfield railways 127, 167 Coalville 187 Coatbridge 157 Cobden, Richard 37 Cockermouth 219 Colchester 69, 107, 108 Coldstream 158, 159 Colebrook 198 Colonial Office, British 48 Combe Down Tunnel 144 commerce, industry and railways 308 Index Commercial Railway Scheme, London 152, 154 Commission on the Merits of the Broad and Narrow Gauge 228 Tab 6.2 company law 42–3 competing local feeders 204–7 competition adverse effects of 221 adversarial 316–19 concept applied to railways 258–60 Duopolistic on networks 492–4 and duplication of routes 94 and excess capacity 477–97 excessive 16–19 and fare reduction 261–2 individual/multiple linkages 266, 267 inter-town 323–4 and invasions by competing companies 268–9, 273 and invasions in joint venture schemes (large) 166–73 and invasions in joint venture schemes (small) 173–8 network effects 262–4 principle of 221 and territorial strategy 286–7 wastage/inefficiency 162, 166 compulsory purchase of land 30, 223, 288 concavity principle 72, 82 connectivity and networks 2–3 Connel Ferry 161 construction costs 16–17 consultant engineers see civil engineers; mechanical engineers contour principle 72 contractors 301–2 Conway River 136 cooperation between companies 324–6 core and peripheral areas, UK 85 Fig 3.8 Corn Laws, Repeal (1846) 37, 110 Cornwall 152 Cornwall Railway 141 corporate social responsibility 311–13 corridor trains 311 Cosham 147, 190 Cotswold Hills 110, 111, 114, 149 counterfactual map of the railway network East Midlands 90 Fig 3.10 North of England 92 Fig 3.12 South East England 90 Fig 3.10 Wales 91 Fig 3.11 West of England 91 Fig 3.11 counterfactual railway network 4–29, 58–104 bypass principle 80–2, 89 and cities 306 concavity principle 82 continuous linear trunk network with coastal constraints 74 Fig 3.2 503 continuous linear trunk network 
with no coastal constraints 73 Fig 3.1 contour principle 87, 88 Fig 3.9 core and periphery principle 82–6, 84 Tab 3.5, 85 Fig 3.8 coverage of cities, town and villages 62–3 cross-country linkages on the symmetric network 100 Fig 3.19 cross-country routes 274 cut-off principle 80, 81 Fig 3.7, 89 cut-off principle with traffic weighting 81 Fig 3.7 Darlington core hub 89 Derby core hub 89 frequency of service 65–6 Gloucester as corner hub 82 heuristic principles of 10–12, 71–2 hubs 439–71, 440–9 Tab A5.1 hubs, configuration of 89, 94–103 hubs, size and distribution 95 Fig 3.13 Huddersfield core hub 89 influence of valleys and mountains 88 Fig 3.9 iterative process 64 Kirkby Lonsdale core hub 89 Leicester core hub 89 Lincolnshire region cross-country routes 119 London as corner hub 82 London terminals 155 loop principle 86–7 Melrose core hub 89, 158–9 mileage 437 Tab A4.4 Newcastle as corner hub 82 North-South linkages 148 North-South spine with ribs 75 Fig 3.3 objections to 12–14 optimality of the system 91–3 performance compared to actual system 64–5, 65 Tab 3.2 performance metrics 63–6 quality of network 392 Tab A4.1 and rational principles 322 Reading core hub 89 role of network 392, 393 Tab A4.2 route description 392–438, 393–436 Tab A4.3 and Severn Tunnel 112–14 Shoreham as corner hub 82 Southampton as corner hub 82 space-filling principle 87–9 Steiner solution 76 Fig 3.4 Steiner solution with traffic weighting 78 Fig 3.5 Stoke-on-Trent as corner hub 89 timetable 8, 89–90, 472–6, 474–6 Tab A6.1 timetable compared with actual 315–16 504 Index counterfactual railway network (cont.) 
traffic flows 66–71 traffic-weighting principle 77, 78 Fig 3.5 trial solution, first 89–91, 90 Fig 3.10, 91 Fig 3.11, 92 Fig 3.12 triangle principle 77–80, 79 Fig 3.6, 89, 96 triangle principle without traffic weighting 79 Fig 3.6 Trowbridge core hub 89 Warrington as corner hub 82 Wetherby core hub 122 country towns avoided by railway schemes 307–9 Coventry 68, 118, 135 Coventry Canal 117 Crafts, Nicholas F.

Data and the City
by Rob Kitchin, Tracey P. Lauriault, Gavin McArdle
Published 2 Aug 2017

Such dangers are even greater in situations where experiments and data gathering are slow and difficult to replicate, such as policy research (Reichman et al. 2011: 704). Data provenance failures can result in massive societal and economic losses (see, for example, discussions around the liability of geographic information/data in Onsrud 1999 and Phillips 1999). While urban policy and planning have long been guided by data, big urban data, performance metrics and data analytics are increasingly shaping urban policy and planning (Kitchin 2014a; Townsend 2013). The provenance of data, particularly geographic data (Monmonier 1995), must then be firmly established to produce trust and the necessary information required to avoid possible misinterpretation or misuses of the data.

Here, it is recognized that cities are not mechanical systems that can be disassembled into their component parts and fixed, or steered and controlled through data levers. Instead, systems and governance are understood as complex and multi-level in nature, and the effects of policy measures are diverse and multifaceted, and neither is easily reducible to targets and performance metrics (Van Assche et al. 2010). Indicators highlight trends and potential issues, but do not show their causes or prescribe answers. Conceived in this way, city dashboards provide useful contextual data – that can be used in conjunction with other data and initiatives – but are not used in a strongly instrumentalist, mechanistic way to direct management practices (Kitchin et al. 2015).

pages: 349 words: 98,309

Hustle and Gig: Struggling and Surviving in the Sharing Economy
by Alexandrea J. Ravenelle
Published 12 Mar 2019

Sarah, twenty-nine, the Tasker profiled in the opening chapter, explained, “There were people complaining and in tears because they were like, ‘I don’t have any income now. I was your top TaskRabbit and you are treating me like shit,’ and [it] kind of felt like a betrayal.” The pivot also introduced strict requirements in terms of response rates and task acceptance. “You know,” said Sarah, “they have these performance metrics, which is not a very good way to measure. You have to accept 85 percent of what’s given to you. The way that the metrics work is a thirty-day kind of thing, so sometimes it just doesn’t add up—and you don’t know what you are going to get or when you are going to get it.” As a result, Taskers who don’t accept several tasks within a relatively short period may find themselves flagged.
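Sarah's complaint that the metric "just doesn't add up" is easier to see with a small calculation. The sketch below assumes a trailing thirty-day window and an 85 percent threshold, as she describes; the platform's real formula is not given in the text, and the `acceptance_rate` function, the `(date, accepted)` record layout, and the dates are all hypothetical.

```python
from datetime import date, timedelta

THRESHOLD = 0.85  # "You have to accept 85 percent of what's given to you"
WINDOW_DAYS = 30  # "a thirty-day kind of thing"


def acceptance_rate(offers, today, window_days=WINDOW_DAYS):
    """Share of offers accepted in the trailing window, or None if there were none.

    offers is a list of (date_offered, accepted) pairs; this layout is hypothetical.
    """
    start = today - timedelta(days=window_days)
    recent = [accepted for day, accepted in offers if start < day <= today]
    if not recent:
        return None
    return sum(recent) / len(recent)


# Made-up offers for illustration only.
today = date(2019, 3, 12)
offers = [
    (date(2019, 1, 5), True),   # outside the window, ignored
    (date(2019, 2, 20), True),
    (date(2019, 3, 1), False),  # a single decline
    (date(2019, 3, 10), True),
]
rate = acceptance_rate(offers, today)
print(rate is not None and rate < THRESHOLD)  # True: 2 of 3 accepted is below 85 percent
```

When only a handful of tasks arrive in the window, one decline is enough to push the rate under the threshold, which fits her point that a worker cannot know what they will be offered or when.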

See also piecemeal system overtime: avoidance of, 36; independent contractor status and, 94; paid overtime, 189 overwork, 6, 15–16 The Overworked American (Schor), 16 owner-occupied move-ins, 41 paid time off, 180, 188, 190 paid travel time, 189 pajama policy, 204–5, 206 participant recruitment and methodology, 223n85, 225n34, 228n30, 229n5; overview, 22; Airbnb, 42–43; Kitchensurfing, 42–43, 57; research methodology, 21–22; shared assets, 42; sharing economy, 42–43; skills issue, 42; TaskRabbit, 42–43; Uber, 42–43; underused asset access, 42 participation barriers: capital, 42–43, 43table1, 160, 166–68, 183; entrepreneurship and, 38; skills, 42–43, 43table1, 160, 166–68, 183 partners, 3, 79 part-time workers, 180 party-line rides, 105–6 payment rate changes: Kitchensurfing, 59; TaskRabbit, 79–80; Uber, 74–79 PayPal, 26 pay-to-work situations, 2–3 Peeple, 156 Peers, 56, 72, 225n31 peer-to-peer connections: lack of full disclosure in, 97–100; marketing as, 21, 182; political language and, 23; sexual behavior and, 134 peer-to-peer firms, 26 performance metrics, 78–79 Perkins, Frances, 93, 227n8 personal assistant services, 42 Peters, Diniece, 228n32 piecemeal system, 5–6, 22, 66. See also outsourcing Piketty, Thomas, 40 Pinkerton National Detective Agency, 68, 179 pivots: effects of, 11; Kitchensurfing, 57, 222n62; TaskRabbit, 1, 17, 55–56, 79–80, 138, 203, 222n62; Uber, 74–79 platform economy: forms of, 26–28, 28fig. 2; growth of, 7; term usage, 5.

pages: 302 words: 100,493

Working Backwards: Insights, Stories, and Secrets From Inside Amazon
by Colin Bryar and Bill Carr
Published 9 Feb 2021

Before we start building, we write a Press Release to clearly define how the new idea or product will benefit customers, and we create a list of Frequently Asked Questions to resolve the tough issues up front. We carefully and critically study and modify each of these documents until we’re satisfied before we move on to the next step. The customer is also at the center of how we analyze and manage performance metrics. Our emphasis is on what we call controllable input metrics, rather than output metrics. Controllable input metrics (e.g., reducing internal costs so you can affordably lower product prices, adding new items for sale on the website, or reducing standard delivery time) measure the set of activities that, if done well, will yield the desired results, or output metrics (such as monthly revenue and stock price).

See also Amazon Prime Mechanisms Amazon Leadership Principles and annual planning process compensation plan S-Team goals metrics Amazon flywheel Amazon history and origins of anatomy of metrics chart anecdote and correct, controllable input metrics the deck (data package) DMAIC (Define-Measure-Analyze-Improve-Control) DMAIC analyze stage DMAIC control stage DMAIC define stage DMAIC improve stage DMAIC measure stage Fast Track In Stock and life cycle of output and input metrics pitfall of disaster meetings pitfall of noise obscuring signal weekly and monthly metrics on single graph at Weekly Business Review meetings year-over-year (YOY) trends microservices-based architecture Microsoft Excel MIT Media Lab Mobipocket mock-ups Music 2.0 (digital music industry conference) Napster Narrative Information Multiplier narratives sample narrative tenets and FAQs See also Press Release/Frequently Asked Questions process (PR/FAQ); six-pager NBC NBC Universal Netflix House of Cards (original series) Watch Now (video streaming service) New Project Initiatives (NPI) choosing our priorities with force-ranking our options with News Corp Nichols, Dorothy Nintendo Wii noise obscuring signal Obidos Offer Through Onboarding (Bar Raiser step) 1-Click button operating plan annual planning process OP1 process OP2 O’Reilly, Tim O’Reilly Emerging Technology conference O’Reilly Group Ownership leadership principle Palm Computing Peacock (NBC’s streaming service) performance metrics personal bias Phone Screen (Bar Raiser step) Piacentini, Diego PlayStation PowerPoint (PP) drawbacks of six-pager compared with in S-Team meetings pre-authorization, credit card Press Release/Frequently Asked Questions process (PR/FAQ) Amazon Leadership Principles and Amazon Web Services and dependencies and example of Melinda (Smart Mailbox) FAQ components features and benefits Fire Phone and history and origins of Kindle and narrative forms and PowerPoint compared with press release components price and Prime 
and Prime Video and process and product Working Backwards and Price, Roy Prime.

pages: 132 words: 31,976

Getting Real
by Jason Fried, David Heinemeier Hansson, Matthew Linderman and 37 Signals
Published 1 Jan 2006

—The Ganssle Group (from Keep It Small)

Optimize for Happiness

Choose tools that keep your team excited and motivated

A happy programmer is a productive programmer. That's why we optimize for happiness and you should too. Don't just pick tools and practices based on industry standards or performance metrics. Look at the intangibles: Is there passion, pride, and craftsmanship here? Would you truly be happy working in this environment eight hours a day? This is especially important for choosing a programming language. Despite public perception to the contrary, they are not created equal. While just about any language can create just about any application, the right one makes the effort not merely possible or bearable, but pleasant and invigorating.

pages: 484 words: 104,873

Rise of the Robots: Technology and the Threat of a Jobless Future
by Martin Ford
Published 4 May 2015

Police departments across the globe are turning to algorithmic analysis to predict the times and locations where crimes are most likely to occur and then deploying their forces accordingly. The City of Chicago’s data portal allows residents to see both historical trends and real-time data in a range of areas that capture the ebb and flow of life in a major city—including energy usage, crime, performance metrics for transportation, schools and health care, and even the number of potholes patched in a given period of time. Tools that provide new ways to visualize data collected from social media interactions as well as sensors built into doors, turnstiles, and escalators offer urban planners and city managers graphic representations of the way people move, work, and interact in urban environments, a development that may lead directly to more efficient and livable cities.

This amounted to about 200 million pages of information, including dictionaries and reference books, works of literature, newspaper archives, web pages, and nearly the entire content of Wikipedia. Next they collected historical data for the Jeopardy! quiz show. Over 180,000 clues from previously televised matches became fodder for Watson’s machine learning algorithms, while performance metrics from the best human competitors were used to refine the computer’s betting strategy.20 Watson’s development required thousands of separate algorithms, each geared toward a specific task—such as searching within text; comparing dates, times, and locations; analyzing the grammar in clues; and translating raw information into properly formatted candidate responses.

pages: 128 words: 38,187

The New Prophets of Capital
by Nicole Aschoff
Published 10 Mar 2015

Women who channel their energies toward reaching the top of corporate America undermine the struggles of women trying to realize institutional change by organizing unions and implementing laws that protect women (and men) in the workplace. An anecdote shared by Sandberg illustrates this point: In 2010 Mark Zuckerberg pledged $100 million to improve the performance metrics of the Newark Public Schools. The money would be distributed through a new foundation called Startup: Education. Sandberg recommended Jen Holleran, a woman she knew “with deep knowledge and experience in school reform” to run the foundation. The only problem was that Jen was raising fourteen-month-old twins at the time, working part time, and not getting much help from her husband.

pages: 147 words: 37,622

Personal Kanban: Mapping Work, Navigating Life
by Jim Benson and Tonianne Demaria Barry
Published 2 Feb 2011

We recognize the pressures placed on groups to perform, and understand that factors like policies, growth, partnerships, and internal politics directly impact that performance. Each team or organization has its own dynamic. Modus helps teams learn to create tools and practices that create collaborative systems, make constraints explicit, reward innovation, and provide meaningful performance metrics. MORE FROM MODUS COOPERANDI PRESS Books: Scrumban: Essays on Kanban Systems for Lean Software Development by Corey Ladas Available on Amazon, iBooks, and at Coming Soon: Scrumban II: Stories of Continuous Improvement Kidzban: Personal Kanban for Kids The latest publications and videos from Modus Cooperandi Press are available at: Table of Contents TITLE PAGE COPYRIGHT PAGE FOREWORD The Agony of Crisis Management INTRODUCTION Personal Kanban: 100% New Age Free CHAPTER 1 The Basics of Personal Kanban Towards a More Personal Kanban Rules for a System That Abhors Rules Why Visualize Your Work: Navigate Safely Why Limit Your WIP Why Call it Personal Kanban How to Use This Book PKFlow Tips CHAPTER 2 Building Your First Personal Kanban Step One: Get Your Stuff Ready Step Two: Establish Your Value Stream Step Three: Establish Your Backlog Step Four: Establish Your WIP Limit Step Five: Begin to Pull Step Six: Reflect Personal Kanban Power Boosters What it All Means CHAPTER 3 My Time Management is in League with the Freeway Flow Like Traffic Setting WIP Limits Living the Days of Our Lives Clarity Calms Carl To-do Lists: Spawns of the Devil CHAPTER 4 Nature Flows Flow: Work’s Natural Movement Cadence: Work’s Beat Slack: Avoiding Too Many Notes Pull, Flow, Cadence, and Slack in Action Busboy Wisdom: The Nature of Pull CHAPTER 5 Components of a Quality Life Metacognition: A Cure for the Common Wisdom Productivity, Efficiency, and Effectiveness Defining a Good Investment Reality Check CHAPTER 6 Finding Our Priorities Structure, Clarity, and Our Ability to Prioritize Smaller, Faster,
Better: Prioritization in Theory and Practice

pages: 302 words: 82,233

Beautiful security
by Andy Oram and John Viega
Published 15 Dec 2009

[Figure 10-3. Best practices dependencies: Performance and Capacity. Diagram labels, grouped under the Explore, Execute, and Volume deploy phases: operational profile definition; problem definition prioritizes key performance and capacity needs; architect for performance, capacity, and future growth; performance budgets; performance targets; performance engineer begins work during requirements phase; annotated use cases and user scenarios; releases and iterations prioritized to validate key performance issues early; prototyping; performance estimates; benchmarks; performance measurements; code instrumentation; automated execution of performance and load tests; performance data capture; test tools/scripts for field measurement of performance/capacity; project management tracks performance metrics.]

[Figure 10-4. Diagram labels, grouped under the Explore, Execute, and Volume deploy phases: problem definition prioritizes key functions needed; operational profile definition; reliability engineer begins work during requirements phase to understand critical functions and constraints; tune physical and functional architecture for reliability and availability; define acceptable failure and recovery rates (availability and reliability targets); annotated use cases and user scenarios; predict expected reliability and availability; releases and iterations prioritized to handle capabilities early; fault/failure injection testing; failure data collected and analyzed and predictions made; fault detection, isolation, and repair; system auditing and sanity control; automated execution of stability testing; code instrumentation; project management tracks quality index; reliability budgets for failure and recovery rates; reliability and availability data capture; field measurement of failures and recovery.]

The results proved that this decision actually increased compliance with the security plan. With the requirement to pass the static analysis test still hanging over teams, they felt the need to remove defects earlier in the lifecycle so that they would avoid last-minute rejections. The second decision was the implementation of a detailed reporting framework in which key performance metrics (for instance, percentage of high-risk vulnerabilities per lines of code) were shared with team leaders, their managers, and the CIO on a monthly basis. The vulnerability information from the static code analyzer was summarized at the project, portfolio, and organization level and shared with all three sets of stakeholders.

Trading Risk: Enhanced Profitability Through Risk Control
by Kenneth L. Grant
Published 1 Sep 2004

If on each transaction you strive to save yourself a few pennies on commission, achieve a tick or two better on each execution, manage your risk to slightly more precise parameters, and conduct that extra little bit of research, you will achieve a dramatic, positive impact on your performance. I promise that, depending on where you are in your P/L cycle, this will turn good periods into great ones, mediocre periods into respectable ones, and otherwise catastrophic intervals into ones where the consequences are acceptable. Managing these types of performance metrics is hard work, but it is not nearly as difficult as losing lots of money or simply treading water. What is more, you’ll never maximize your returns unless you factor these components in. Improving performance at the margins of your trading activity will be a main theme of this book. Do Not Become Overreliant on Other Market Participants for Comfort or Assistance.

See Scientific method Optimal f, 245–251 Optimism, importance of, 4 Options: asymmetric payoff functions, 150–151 implications of, generally, 148–149 implied volatility, 86–89, 150 leverage, 151–153 nonlinear pricing dynamics, 149 pricing, 88–89, 106 strike price/underlying price, relationship between, 149–150 volatility arbitrage, 106 Out-of-the-money option, 150 Over-the-counter derivatives, 148 Performance analysis, 7–8 Performance metrics, 16, 35 Performance objectives: “going to the beach,” 32–36 importance of, 19–20, 29 nominal target return, 20, 24–26 optimal target return, 20–24 stop-out level, 20–21, 26–32 Performance ratio, 188–200 Performance success metrics: accuracy ratio (win/loss), 184–186 impact ratio, 186–188 performance ratio, 188–200 profitability concentration (90/10) ratio, 200–208 Planning, importance of, 9.

pages: 133 words: 42,254

Big Data Analytics: Turning Big Data Into Big Money
by Frank J. Ohlhorst
Published 28 Nov 2012

That is why it is important to build objectives, measurements, and milestones that demonstrate the benefits of a team focused on Big Data analytics. Developing performance measurements is an important part of designing a business plan. With Big Data, those metrics can be assigned to the specific goal in mind. For example, if an organization is looking to bring efficiency to a warehouse, a performance metric may be measuring the amount of empty shelf space and what the cost of that empty shelf space means to the company. Analytics can be used to identify product movement, sales predictions, and so forth to move product into that shelf space to better service the needs of customers. It is a simple comparison of the percentage of space used before the analytics process and the percentage of space used after the analytics team has tackled the issue.
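The before-and-after comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the book's method: the function names `shelf_utilization` and `empty_space_cost`, the slot counts, and the cost per empty slot are all hypothetical.

```python
def shelf_utilization(used_slots: int, total_slots: int) -> float:
    """Return the percentage of shelf slots currently in use."""
    if total_slots <= 0:
        raise ValueError("total_slots must be positive")
    return 100.0 * used_slots / total_slots


def empty_space_cost(used_slots: int, total_slots: int,
                     cost_per_empty_slot: float) -> float:
    """Return the cost attributed to unused slots (the unit cost is an assumed input)."""
    return (total_slots - used_slots) * cost_per_empty_slot


# Hypothetical warehouse with 1,000 slots: 620 in use before the
# analytics project and 810 in use afterwards.
before = shelf_utilization(620, 1000)
after = shelf_utilization(810, 1000)
improvement = after - before  # change in percentage points
```

Reporting the change in percentage points, alongside the cost of the remaining empty space, keeps the metric tied to the specific goal the team set out to improve.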

pages: 320 words: 33,385

Market Risk Analysis, Quantitative Methods in Finance
by Carol Alexander
Published 2 Jan 2007

Then we solve the portfolio allocation decision for a risk averse investor, following and then generalizing the classical problem of portfolio selection that was introduced by Markowitz (1959). This lays the foundation for our review of the theory of asset pricing, and our critique of the many risk adjusted performance metrics that are commonly used by asset managers.

ABOUT THE CD-ROM

My golden rule of teaching has always been to provide copious examples, and whenever possible to illustrate every formula by replicating it in an Excel spreadsheet. Virtually all the concepts in this book are illustrated using numerical and empirical examples, and the Excel workbooks for each chapter may be found on the accompanying CD-ROM.

Many risk adjusted performance measures that are commonly used today are either not linked to a utility function at all, or if they are associated with a utility function we assume the investor cares nothing at all about the gains he makes above a certain threshold. Kappa indices can be loosely tailored to the degree of risk aversion of the investor, but otherwise the rankings produced by the risk adjusted performance measure may not be ranking in the order of an investor’s preference! The only universal risk adjusted performance metric, i.e. one that can rank investments having any returns distributions for investors having any type of utility function, is the certain equivalent. The certain equivalent of an uncertain investment is the amount of money, received for certain, that gives the same utility to the investor as the uncertain investment.
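The definition of the certain equivalent amounts to solving U(CE) = E[U(W)] for CE, where U is the investor's utility function and W is the uncertain outcome. A minimal Python sketch follows, assuming a discrete set of outcomes and a utility function whose inverse is known in closed form. The helper name `certain_equivalent`, the choice of logarithmic utility, and the wealth figures are illustrative assumptions rather than examples taken from the book.

```python
import math
from typing import Callable, Sequence


def certain_equivalent(
    outcomes: Sequence[float],
    probabilities: Sequence[float],
    utility: Callable[[float], float],
    inverse_utility: Callable[[float], float],
) -> float:
    """Return the sure amount whose utility equals the expected utility."""
    if len(outcomes) != len(probabilities):
        raise ValueError("outcomes and probabilities must have equal length")
    if not math.isclose(sum(probabilities), 1.0):
        raise ValueError("probabilities must sum to 1")
    expected_utility = sum(p * utility(x) for x, p in zip(outcomes, probabilities))
    return inverse_utility(expected_utility)


# Assumed example: logarithmic utility, final wealth of 50 or 200 with equal odds.
ce = certain_equivalent([50.0, 200.0], [0.5, 0.5], math.log, math.exp)
expected_wealth = 0.5 * 50.0 + 0.5 * 200.0
```

For a risk averse (concave) utility such as the logarithm, the certain equivalent comes out below the expected wealth, and the gap between the two is the premium the investor would give up to avoid the uncertainty.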

pages: 410 words: 119,823

Radical Technologies: The Design of Everyday Life
by Adam Greenfield
Published 29 May 2017

Amazon is again the leading indicator here.28 Its warehouse workers are hired on fixed, short-term contracts, through a deniable outsourcing agency, and precluded from raises, benefits, opportunities for advancement or the meaningful prospect of permanent employment. They work under conditions of “rationalized” oversight in the form of performance metrics that are calibrated in real time. Any degree of discretion or autonomy they might have retained is ruthlessly pared away by efficiency algorithm. The point couldn’t be made much more clearly: these facilities are places that no one sane would choose to be if they had any other option at all.

Theatro’s devices are less elaborate than a Hitachi wearable called Business Microscope, which aims to capture, quantify and make inferences from several dimensions of employee behavior.33 As grim as call-center work is, a Hitachi press release brags about their ability to render it more dystopian yet via the use of this tool—improving performance metrics not by reducing employees’ workload, but by compelling them to be more physically active during their allotted break periods.34 Hitachi’s wearables, in turn, are less capable than the badges offered by Cambridge, MA, startup Sociometric Solutions, which are “equipped with two microphones, a location sensor and an accelerometer” and are capable of registering “tone of voice, posture and body language, as well as who spoke to whom for how long.”35 As with all of these devices, the aim is to continuously monitor (and eventually regulate) employee behavior.

pages: 493 words: 139,845

Women Leaders at Work: Untold Tales of Women Achieving Their Ambitions
by Elizabeth Ghaffari
Published 5 Dec 2011

So I don’t view physicians as the enemy. It just doesn’t make good business sense. Ghaffari: How many departments did you end up having under you? Luttgens: I had a total of ten professional services departments. Most of them were physician-led or physician-supported. Ghaffari: What was your performance metric that you did for them? Luttgens: Back in those days, the early eighties, we didn’t have quality management or outcomes as we do today. You needed to control expenses, enhance revenue, increase patient volume, and get along. I was well-known around the medical center for getting substantial capital funding for items in my capital budgets each year.

That taught me that I was both good at, and enjoyed, fundraising because I understood the customer and believed in the product. Ghaffari: Was your primary responsibility there in an executive director role? What were some of your key accomplishments? Roden: Yes. Regarding accomplishments, we tracked several metrics. First of all, sponsorship was an important performance metric. When I started, SVASE was bringing in about $10,000 a year in sponsorship. When I left, it was $300,000 a year. Another key metric was the mailing list. When I started, we had about two thousand people on our e-mail list. When I left, it was about twenty thousand people. When I started, we had about twenty volunteers.

How I Became a Quant: Insights From 25 of Wall Street's Elite
by Richard R. Lindsey and Barry Schachter
Published 30 Jun 2007

In the early 1990s, the entire banking industry was moving headlong toward Raroc as a pricing and performance measurement framework. However, as early as 1992, I recognized that the common Raroc measure based on own portfolio risk or VaR was at odds with equilibrium and arbitrage pricing theory (see Wilson (1992)). Using classical finance to make the point, I recast a simple CAPM model into a Raroc performance metric and showed that Raroc based on own portfolio risk without the recognition of funding was inherently biased. In the years since 1992, many other authors have followed a similar line of thought. What is the appropriate cost of capital, by line of business, if capital is allocated based on the standalone risk of each underlying business?

See Credit risk integrated tool set, application, 80 technology, usage, 134–135 Portfolio optimization, 281–283 “Portfolio Optimization with Factors, Scenarios, and Realistic Short Positions,” 281 Portfolio Theory (Levy/Sarnat), 228 Portfolio trading, mathematics, 128–130 Positive interest rates, ensuring, 161–162 Prepayment data, study, 183 Press, Bill, 36 Price/book controls, pure return, 272 Price data, study, 183 Price/earnings ratios, correlation, 269 Price limits, impact, 77 Primitive polynomial modulo two, 170 Prisoner’s dilemma, 160 Private equity returns, benchmarks (finding), 145 Private signals, quality (improvement), 159–160 Publicly traded contingent claims, combinations (determination), 249 Public pension funds, investment, 25 Pure mathematics, 119, 126 Quantitative active management, growth, 46–47 Quantitative approach, usage, 26–27 Quantitative finance, 237–238 purpose, 96–98 Quantitative Financial Research (Bloomberg), 137 Quantitative investing, limitation, 209 Quantitative label, implication, 25–26 Quantitative methods, role, 96–97 Quantitative Methods for Financial Analysis (Stephen/Kritzman), 253 Quantitative models, enthusiasm, 234 Quantitative portfolio management, 130–131 Quantitative strategies, usage, 240 Quantitative Strategies (SAC Capital Management, LLC), 107 Quantitative training, application process, 255–260 Quants business sense, discussion, 240–241 characteristics/description, 208–210 conjecture, 177–179 conversion, 327 data mining, 209–210 description, New York Times article, 32 due diligence, requirement, 169 future, 13–16, 261 innovations, 255–258 myths, dispelling, 258–260 perspective, change, 134–135 process, 92–93 research, 127–128 Quigg, Laura, 156–158, 160 Quotron, recorded data (usage), 22 Rahl, Leslie, 83–93 Ramaswamy, Krishna, 253 385 RAND Corporation, 13–17 Raroc models, usage/development, 102–103 Raroc performance metric, 103 Reagan, Ronald, 15 Real economic behavior, level (usefulness), 101 Real options (literature), 
study, 149 Real-time artificial intelligence, 16 Rebonato, Riccardo, 168, 169, 232 Reed, John, 89 Registered investment advisors, 79 Regression, time-varying, 239 Renaissance Medallion fund, 310 Representation Theory and Complex Geometry, 122–125 Resampling statistics, usage, 239–240 Research collaboration, type, 157–158 Research Objectivity Standards, 280–281 Retail markets, changes, 148–149 Return, examination, 71–72 Return-predictor relationships, 269 Returns separation, 34–35 variance, increasing, 72 “Revenue Recognition Certificates: A New Security” (LeClair/Schulman), 82 Rich, Don, 256 Riemann Hypothesis, solution, 108 Risk analytics, sale, 301 bank rating, 216 buckets, 71 cost, 129 examination, 70–71 forecast, BARRA bond model (usage), 39 importance, 34–35 manager, role, 302–303 reversal, 299 worries, 39 Risk-adjusted return, 102 Risk management, 233 consulting firm, 293 technology, usage, 134–135 world developments, 96 Risk Management (Clinton Group), 295 Risk Management & Quantitative Research (Permal Group), 227 RiskMetrics, 300–301 business, improvement, 301 computational device, 240 Technical Document, publication (1996), 66 Risk/return trade-off, 259 RJR Nabisco, LBO, 39 Roll, Richard, 140 Ronn, Ehud, 157, 160–162 Rosenberg, Barr, 34–42 models, development, 34–37 Rosenbluth, Jeff, 132 Ross, Stephen A., 141, 254, 336 arbitrage pricing model, development, 147–148 Rubinstein, Mark, 278, 336 Rudd, Andrew, 35, 307 historical performance analysis, 44 Rudy, Rob, 219 Russell 3000, constitution, 275 Salomon Brothers, Bloomberg (employ), 73 Samuelson, Paul, 256–257 time demonstration, 258 Sankar, L., 162 Sargent, Thomas, 188 Savine, Antoine, 167 Sayles, Loomis, 33 SBCC, 285 Scholes, Myron, 11, 88, 177, 336 input, 217 Schulman, Evan, 67–82 Schwartz, Robert J., 293, 320 Secret, classification, 16–18 Securities Act of 1933, 147 Securities Exchange Act of 1934, 147 Security replication, probability
(usage), 122 SETS, 77 Settlement delays, 174 Seymour, Carl, 175–176 Shareholder value creation, questions, 98 Sharpe, William, 34, 254 algorithm, 257–258 modification, 258 Shaw, Julian, 227–242 Sherring, Mike, 232 Short selling, 275–276 Short selling, risk-reducing/return-enhancing benefits, 277 Short-term reversal strategy, 198–199 Shubik, Martin, 288–289, 291, 293 Siegel’s Paradox, 321–322 Sklar’s theorem, 240 Slawsky, Al, 40–41 Small-cap stocks, purchase, 268 Smoothing, 192–193 Sobol’ numbers, 173–173 Social Sciences Research Network (SSRN), 122 Social Security system, bankruptcy, 148 Society for Quantitative Analysis (SQA), 253 Spatt, Chester, 252 Spot volatility, parameter, 89–90 Standard & Poor’s futures, price (relationship), 75 Start-up company, excitement, 24–25 Statistical data analysis, 213–214 Statistical error, 228 Sterge, Andrew J., 317–327 Stevens, Ross, 201 Stochastic calculus, 239 Stock market crash (1987), 282 Stocks portfolio trading, path trace, 129 stories, analogy, 23–26 Strategic Business Development (RiskMetrics Group), 49 Sugimoto, E., 171 Summer experience, impact, 57 Sun Unix workstation, 22 Surplus insurance, usage, 255–256 Swaps rate, Black volatilities, 172 usage, 292–293 Sweeney, Richard, 190 Symbolics, 16, 18 Taleb, Nassim, 132 Tenenbein, Aaron, 252 Textbook learning, expansion, 144 Theoretical biases, 103 Theory, usage/improvement, 182–185 Thornton, Dan, 139 Time diversification, myths, 258 Top secret, classification, 16–18 Tracking error, focus, 80–81 Trading, 72–73 Transaction cost, 129 absence, 247 impact, 273–274 Transaction pricing, decision-making process, 248 Transistor experiment (TX), 11 Transistorized Experimental Computer Zero (tixo), usage, 86 Treynor, Jack, 34, 254 Trigger, usage, 117–118 Trimability, 281 TRS-80 (Radio Shack), usage, 50, 52, 113 Trust companies, individually managed accounts (growth), 79 Tucker, Alan, 334 Uncertainty examination, 149–150 resolution, 323–324 Unit initialization, 172 Universal
Investment Reasoning, 19–20 Upstream Technologies, LLC, 67 U.S. individual stock data, research, 201–202 Value-at-Risk (VaR), 195. calculation possibility tails, changes, 100 design, 293 evolution, 235 measurement, 196 number, emergence, 235 van Eyseren, Olivier, 173–175 Vanilla interest swaptions, 172 VarianceCoVariance (VCV), 235 Variance reduction techniques, 174 Vector auto-regression (VAR), 188 Venture capital investments, call options (analogy), 145–146 Volatility, 100, 174, 193–194 Volcker, Paul, 32 von Neumann, John, 319 Waddill, Marcellus, 318 Wall Street business, arrival, 61–65 interest, 160–162 move, shift, 125–127 quant search, genesis, 32 roots, 83–85 Wanless, Derek, 173 Wavelets, 239 Weisman, Andrew B., 187–196 Wells Fargo Nikko Investment Advisors, Grinold (attachment), 44 Westlaw database, 146–148 “What Practitioners Need to Know” (Kritzman), 255 Wigner, Eugene, 54 Wiles, Andrew, 112 Wilson, Thomas C., 95–105 Windham Capital Management LLC, 251, 254 Wires, rat consumption (prevention), 20–23 Within-horizon risk, usage, 256 Worker longevity, increase, 148 Wyckoff, Richard D., 321 Wyle, Steven, 18 Yield, defining, 182 Yield curve, 89–90, 174 Zimmer, Bob, 131–132

pages: 168 words: 49,067

Becoming Data Literate: Building a great business, culture and leadership through data and analytics
by David Reed
Published 31 Aug 2021

Measuring productivity in the data realm While much of the focus around data projects tends to be on achieving cost efficiencies within business processes, such as through enhancing productivity using automation, data leaders also need to keep a close eye on the productivity of their own department. As with the need for prioritisation outlined earlier, there are performance metrics that need to be put in place to help justify the continued investment made by the organisation into this area. Simple measures, such as task completion, offer a top-line indicator of productivity. If combined with measures of stakeholder satisfaction, this may be sufficient to show that the data department is working effectively.

pages: 892 words: 91,000

Valuation: Measuring and Managing the Value of Companies
by Tim Koller , McKinsey , Company Inc. , Marc Goedhart , David Wessels , Barbara Schwimmer and Franziska Manoury
Published 16 Aug 2015

Equal attention is paid to the long-term value-creating intent behind short-term profit targets, and people across the company are in constant communication about the adjustments needed to stay in line with long-term performance goals. We approach performance management from both an analytical and an organizational perspective. The analytical perspective focuses first on ensuring that companies use the right metrics at the right level in the organization. Companies should not just rely on performance metrics for divisions or business units, but disaggregate performance to the level of individual business segments. In addition to historical performance measures, companies need to use diagnostic metrics that help them understand and manage their ability to create value over the longer term. Second, we analyze how to set appropriate targets, giving examples of analytically sound performance measurement in action.

Once that point is reached, the associated investments and operating costs need to be factored in for target setting in individual business segments.

Footnote 6: For example, declining sales in one segment would imply increasing capital allocated to other segments even if their sales would be unchanged.

The Right Metrics in Action

Choosing the right performance metrics can provide new insights into how a company might improve its performance in the future. For instance, Exhibit 26.8 illustrates the most important value drivers for a pharmaceutical company. The exhibit shows the key value drivers, the company’s current performance relative to best- and worst-in-class benchmarks, its aspirations for each driver, and the potential value impact from meeting its targets.

The greatest value creation would come from three areas: accelerating the rate of release of new products from 0.5 to 0.8 per year, reducing from six years to four the time it takes for a new drug to reach 80 percent of peak sales, and cutting the cost of goods sold from 26 percent to 23 percent of sales. Some of the value drivers (such as new-drug development) are long-term, whereas others (such as reducing cost of goods sold) have a shorter-term focus. Similarly, focusing on the right performance metrics can help reveal what may be driving underperformance. A consumer goods company we know illustrates the importance of having a tailored set of key value metrics. For several years, a business unit showed consistent double-digit growth in economic profit. Since the financial results were consistently strong—in fact, the strongest across all the business units—corporate managers were pleased and did not ask many questions of the business unit.

pages: 204 words: 54,395

Drive: The Surprising Truth About What Motivates Us
by Daniel H. Pink
Published 1 Jan 2008

Indeed, other economists have shown that providing an employee a high level of base pay does more to boost performance and organizational commitment than an attractive bonus structure. Of course, by the very nature of the exercise, paying above the average will work for only about half of you. So get going before your competitors do.

3. IF YOU USE PERFORMANCE METRICS, MAKE THEM WIDE-RANGING, RELEVANT, AND HARD TO GAME

Imagine you're a product manager and your pay depends largely on reaching a particular sales goal for the next quarter. If you're smart, or if you've got a family to feed, you're going to try mightily to hit that number. You probably won't concern yourself much with the quarter after that or the health of the company or whether the firm is investing enough in research and development.

Android Developer Tools Essentials: Android Studio to Zipalign
by Mike Wolfson and Donn Felker
Published 13 Aug 2013

Unused layouts in your hierarchy are a common problem with potentially big performance impacts, as each additional ViewGroup makes the measure pass described in "Two-pass layout" take more time (and it’s already the bulk of the time required to render the screen). It is reasonably easy to identify unused layouts. In this case, there is one LinearLayout (in the middle towards the left) that doesn’t show any performance metrics (there is just a blank space where the colored balls and timing information would be). This indicates that it is not being rendered and should be removed.

Figure 13-15. Hierarchy View: bad detail

Using the Tree tool to inspect the good UI

The performance indicators in the good UI look much better than the bad one in the Tree View.

pages: 188 words: 54,942

Drone Warfare: Killing by Remote Control
by Medea Benjamin
Published 8 Apr 2013

Joshua Foust of the American Security Project discovered that in some targeting programs, contracted staffers have review quotas—that is, they must review a certain number of possible targets per given length of time. “Because they are contractors, their continued employment depends on their ability to satisfy the stated performance metrics,” Foust explained.256 “So they have a financial incentive to make life-or-death decisions about possible kill targets just to stay employed. This should be an intolerable situation, but because the system lacks transparency or outside review it is almost impossible to monitor or alter.” A policy paper on UAVs by the United Kingdom’s Ministry of Defence asked questions rarely heard in US government circles.

pages: 172 words: 51,837

How to Read Numbers: A Guide to Statistics in the News (And Knowing When to Trust Them)
by Tom Chivers and David Chivers
Published 18 Mar 2021

You need to measure things in order to see whether they’re going well: it’s impossible, in a modern nation of millions of people, for government to individually assess every school and hospital. The same is true of any large modern business. Metrics are there for a reason: a car company might give a bonus to the salesperson who sells the most cars, for instance, and by incentivising them to work harder towards that goal could improve overall performance. Metrics are necessary. But there’s a trade-off. If your car salespeople start competing against each other, rather than co-operating – undermining each other in front of the customers – you could find that you sell fewer cars overall. If the people in charge aren’t careful, they can lose sight of the fact that the metric isn’t what you really care about, but is a proxy for an often complex, multifaceted and hard-to-define – but nonetheless real – underlying quality which you do care about.

pages: 559 words: 155,372

Chaos Monkeys: Obscene Fortune and Random Failure in Silicon Valley
by Antonio Garcia Martinez
Published 27 Jun 2016

Just as my first view of Facebook’s high-level revenue dashboard proved a dispiriting exercise, Chorizo’s final results, which took months to produce, were a similar tale of woe. No user data we had, if fed freely into the topics that Facebook’s savviest marketers used to target their ads, improved any performance metric we had access to. That meant that advertisers trying to find someone who, say, wanted to buy a car, benefited not at all from all the car chatter taking place on Facebook. It was as if we had fed a mile-long trainful of meat cows into a slaughterhouse, and had come out with one measly sausage to show for it.

Immature advertising markets, the embryonic state of their e-commerce infrastructure, and their lower general wealth meant the impact of new optimization tricks or targeting data on those countries was minimal. And so the Ads team would slice off tranches of the FB user base in rich ads markets and dose them with different versions of the ads system to measure the effect of a new feature, as you would test subjects in a clinical drug trial.* The performance metrics of interest included clickthrough rates, which are a coarse measure of user interest. More convincing is the actual downstream monetization resulting from someone clicking through and buying something—assuming Facebook got the conversion data, which it often didn’t, given that Facebook didn’t have a conversion-tracking system.

pages: 519 words: 155,332

Tailspin: The People and Forces Behind America's Fifty-Year Fall--And Those Fighting to Reverse It
by Steven Brill
Published 28 May 2018

Except for the most civic minded among them, corporate executives—who spend millions to lobby against employment laws forcing even a fraction of these due process protections on their companies when they hire or fire their own employees—are not likely to worry about the straitjacket their government faces in recruiting talent or in training or in dismissing the untalented. Nor do they care much that their government doesn’t produce a budget or performance metrics, or pay enough to hire and keep competent people in jobs managing billions of dollars’ worth of programs. Similarly, there is an imbalance of passion and interest when it comes to perhaps the most obvious common good: the nation’s infrastructure. America’s deteriorating roads and power grids, and broken mass transit systems, are daily reminders of how the protected have undermined the government’s ability to fulfill its most basic purpose.

Or, as suggested, one or more news organizations could finally generate broad disgust over the auctioning of American democracy by routinely attaching to the name of every politician, whenever he or she holds a hearing or votes on an issue, a tally of the campaign contributions received from the special interests involved. Maybe an elderly woman’s struggle to get the Social Security money she is owed will go viral and snowball into a meme that taps everyone’s frustrations about broken government—and makes the Partnership for Public Service’s wonky plans for civil service reform, agency performance metrics, and a rational budgeting process a cause that politicians have to latch on to. Perhaps one of the cable news networks could televise Stier’s annual “Sammy Awards” dinner for stellar public servants to drive home the message that government can work if the right people are attracted to the mission.

pages: 204 words: 58,565

Keeping Up With the Quants: Your Guide to Understanding and Using Analytics
The variables in deciding whether to acquire Battier from the Grizzlies would be the cost of acquiring him (outright or in trade for other players), the amount that he would be paid going forward, various individual performance measures, and ideally some measure of team performance while Battier was on the court versus when he was not. DATA COLLECTION (MEASUREMENT). The individual performance metrics and financials were easy to gather. And there is a way to measure an individual player’s impact on team performance. The “plus/minus” statistic, adapted by Roland Beech of from a similar statistic used in hockey, compares how a team performs with a particular player in the game versus its performance when he is on the bench.
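As a rough illustration of the plus/minus idea described above, the sketch below compares a team's scoring margin per minute while a player is on the court with its margin while he is on the bench. The `Stint` record layout and the sample numbers are hypothetical, invented for illustration only; they are not Battier's actual figures or the statistic's official formula.

```python
from dataclasses import dataclass


@dataclass
class Stint:
    """One continuous stretch of play (hypothetical record layout)."""
    minutes: float
    team_points: int
    opponent_points: int
    player_on_court: bool


def net_per_minute(stints, on_court):
    """Team scoring margin per minute over the stints matching on_court."""
    selected = [s for s in stints if s.player_on_court == on_court]
    minutes = sum(s.minutes for s in selected)
    if minutes == 0:
        return 0.0
    margin = sum(s.team_points - s.opponent_points for s in selected)
    return margin / minutes


def on_off_differential(stints):
    """Margin per minute with the player minus margin per minute without him."""
    return net_per_minute(stints, True) - net_per_minute(stints, False)


# Made-up sample data: two stints with the player, two without.
stints = [
    Stint(10.0, 24, 18, True),
    Stint(8.0, 15, 19, False),
    Stint(14.0, 30, 26, True),
    Stint(16.0, 31, 35, False),
]

differential = on_off_differential(stints)
```

A positive differential suggests the team does better with the player in the game, though a single season of stints can be noisy and depends heavily on who else is on the court.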

If you give constructive advice once or twice a week, for example, look for daily opportunities for sincere, specific, positive reinforcement. Monthly constructive advice? Shoot for the positive stuff slightly more than weekly. And so on. You get the math. Once you get hooked on the uptick in improvement, we’re confident you’ll be converted. How will you know if you’re getting it right? Your performance metric is someone else’s improvement. If you’re not seeing improvement, then it’s your job to try another way. Restate your observations with more specificity. Build more trust so that your recipient(s) can actually hear what you have to say. You don’t get credit for trying regardless of whether you were effective.

A Dog-free Example

While no dogs were harmed in the making of this chapter (it’s all fun and games until someone sprouts an extra pair of legs), I wanted to give you one more practical example of the useful things you can do with class variables. Something that’s a little closer to the real-world applications for class variables. So here it is. The following CountedObject class keeps track of how many times it was instantiated over the lifetime of a program (which might actually be an interesting performance metric to know):

    class CountedObject:
        num_instances = 0

        def __init__(self):
            self.__class__.num_instances += 1

CountedObject keeps a num_instances class variable that serves as a shared counter. When the class is declared, it initializes the counter to zero and then leaves it alone. Every time you create a new instance of this class, it increments the shared counter by one when the __init__ constructor runs:

    >>> CountedObject.num_instances
    0
    >>> CountedObject().num_instances
    1
    >>> CountedObject().num_instances
    2
    >>> CountedObject().num_instances
    3
    >>> CountedObject.num_instances
    3

Notice how this code needs to jump through a little hoop to make sure it increments the counter variable on the class.
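To make the "little hoop" concrete, here is a minimal sketch contrasting the class above with a variant that increments through `self` instead. The name `BuggyCountedObject` is made up for this comparison and does not come from the excerpt.

```python
class CountedObject:
    num_instances = 0

    def __init__(self):
        # Increment the attribute on the class, so all instances share it.
        self.__class__.num_instances += 1


class BuggyCountedObject:
    num_instances = 0

    def __init__(self):
        # Reads the class attribute (0), adds 1, then binds the result as a
        # new attribute on this instance. The class-level counter never moves.
        self.num_instances += 1


CountedObject()
CountedObject()

a = BuggyCountedObject()
b = BuggyCountedObject()

shared_count = CountedObject.num_instances      # counts both instances
buggy_class_count = BuggyCountedObject.num_instances  # still the initial value
```

In the buggy variant each instance ends up with its own `num_instances` equal to 1 that shadows the class variable, which is why the original code goes through `self.__class__`.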

Founded in 2006 by Jay Coen Gilbert, Bart Houlahan and Andrew Kassoy, B Lab is a non-profit dedicated to ‘using business as a force for good’.120 It has created the Global Impact Investing Rating System (GIIRS) to measure the impact of all stakeholders, including workers, customers and communities.121 Other efforts include that of the Global Impact Investing Network (GIIN), founded in 2009, which provides a catalogue of standardized performance metrics for businesses receiving impact investment capital. The Sustainability Accounting Standards Board (SASB), founded in 2011, focuses on serving the needs of investors – SASB standards measure the impact of businesses across a range of issues relating to sustainability. The Global Reporting Initiative’s (GRI) Sustainability Reporting Standards, first launched in 2000, focus on sustainability, transparency and corporate disclosure, rather than on impact measurement.

Supply Chain Management

Supplier Relationship Management/E-Procurement
• Supports multiple standards for security and data communication
• Support for sourcing
• Support for online negotiation
• Support for catalog development and hosting
• Support for auctions
• Support for customization and automation according to trading agreements, workflows and business rules

Vendor Self-Service
• Get information on request for quotation or proposals, purchase order revision, receipt or return of goods and payments
• Performance metrics for quality and delivery
• Inventory requirements available through EDI or Internet posting
• Drill-down capabilities
• Chat facilities with purchase people

Expenditures
• Online forms for claiming travel and entertainment expenses
• Online management of expense reimbursements
• Online payroll
• Online time sheets
• Support for online travel centers

Product Development
• Online design and development tools
• Sharing of product design over the Internet
• Virtual testing and collaboration
• Integration or interface with Computer Aided Design/Computer Aided Manufacturing (CAD/CAM) software

Human Resources
• Access to personal files, job performance and company policies
• Access to 401K funds

Copyright © 2006, Idea Group Inc. Copying or distributing in print or electronic forms without written permission of Idea Group Inc. is prohibited.

pages: 253 words: 65,834

Gail made a point of previewing news with each director right before the board meeting to allow them to have some reflection time before the meeting, and to alert her to any of their initial concerns. “I wanted them to know exactly what they were going to hear in the meeting. By the time we got into the board meeting, everybody was informed and we could really get into the meat of whatever the issue was.” The “no surprises” rule applies to changes in management as well as performance metrics. “The board would lose confidence in some team members at different times,” Gail told me. “So I was very clear about saying, ‘I see the same weaknesses. But here’s what they’re doing. And I’ll make the decision about this person at the right time.’ You can’t fool these guys. If you have an executive that has weaknesses and you try to deny it, it erodes the board’s confidence, makes them think you don’t have good judgment when it comes to people.

Some excel when we can assume that each number in the list is unique (as with social security numbers). So even within the constraint of developing a precise recipe for a precise computational task, there may be choices and trade-offs to confront. As the previous paragraph suggests, computer science has traditionally focused on algorithmic trade-offs related to what we might consider performance metrics, including computational speed, the amount of memory required, or the amount of communication required between algorithms running on separate computers. But this book, and the emerging research it describes, is about an entirely new dimension in algorithm design: the explicit consideration of social values such as privacy and fairness.

According to this logic, organizations are the way they are and do things the way they do for a reason, and that reason is to efficiently and effectively achieve the tasks they were designed for. Meyer and his colleagues are very much focused on the contemporary period, so there is a need to exercise caution when reading their insights back into previous centuries. Yet if organizations in an era of detailed performance metrics, huge data-processing and analytical capacities, and a whole industry of professional managers and consultants nevertheless can deviate so fundamentally from the rational ideal, there are good reasons to think that their early modern counterparts would have had an even harder time coming anywhere near this idealized mark.

‘While I’m dropping to the ground and finding cover, do you think I can do so, in a constructive, manly fashion?’ ‘I don’t care what fashion you do it in so long as you don’t get in my way or end up dead. Both options would be hugely helpful. If you do choose to die, could you do it after I have retrieved the egg? A failed recovery and a dead civilian will play hell with my performance metrics.’ ‘I’ll try not to spoil your clean-up rate. That’s obviously at the top of my priorities.’ I sighed. This wasn’t what I needed. We both needed to be alert, not at odds. I was used to telling people what they needed to do, and then they did it. I didn’t mind when Clio or Ramin talked back because I valued their insight and experience.

Redoing models, gathering new data, and retraining can all be expensive, and careful measurement of drift allows for wise decisions about whether to incur these costs. Detecting drift using model performance, data quality, and product efficacy metrics can reveal the source of low reliability. Model performance metrics that we explicate in Chapter 7, “Measuring the Loop,” include those around predictive accuracy, such as an F1 score. Data-quality metrics include those around label coverage. Product efficacy metrics get to the true customer ROI, such as failure rates on a production line. The interaction between these metrics is also important: if the distribution of data changes but the predictions are still accurate, that just means the world changed.
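A minimal sketch of two of the metric families mentioned above, assuming binary labels encoded as 0 and 1 and unlabeled records encoded as `None`. The function names, the sample data, and the 0.05 tolerance are illustrative assumptions, not values taken from the book.

```python
def f1_score(y_true, y_pred):
    """F1 for binary labels: harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def label_coverage(labels):
    """Data-quality metric: fraction of records that carry a label."""
    if not labels:
        return 0.0
    return sum(1 for label in labels if label is not None) / len(labels)


def f1_has_drifted(baseline_f1, current_f1, tolerance=0.05):
    """Flag drift when F1 falls more than `tolerance` below the baseline."""
    return (baseline_f1 - current_f1) > tolerance


# Made-up evaluation windows: the model was perfect at launch, worse later.
truth = [1, 1, 0, 0, 1, 0]
baseline_f1 = f1_score(truth, [1, 1, 0, 0, 1, 0])
current_f1 = f1_score(truth, [1, 0, 0, 1, 0, 0])

drifted = f1_has_drifted(baseline_f1, current_f1)
coverage = label_coverage([1, None, 0, 1])
```

Tracking the two numbers together helps separate the cases described in the passage: a drop in F1 with stable coverage points at the model, while a drop in coverage points at the data pipeline.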

But Purple’s work demands were different. She was expected to respond to texts and emails on her company-provided mobile phone and laptop at all times. She was required to be on conference calls for her global clients sometimes as early as one in the morning. Even after doing this and hitting all her performance metrics, she was refused a promotion and told by her senior VP, “You’re not ready.” Discouraged and disillusioned, Purple decided to play the game to her advantage. Instead of staying in a particular role or with a company under the assumption that her hard work and loyalty would eventually be rewarded, she changed her approach.

• Count on the best people outperforming the worst by about 10:1. • Count on the best performer being about 2.5 times better than the median performer. • Count on the half that are better-than-median performers outdoing the other half by more than 2:1. These rules apply for virtually any performance metric you define. So, for instance, the better half of a sample will do a given job in less than half the time the others take; the more defect-prone half will put in more than two thirds of the defects, and so on. Results of the Coding War Games were very much in line with this profile. Take as an example Figure 8–2, which shows the performance spread of time to achieve the first milestone (clean compile, ready for test) in one year’s games.

For any respondent who wanted it, we provided a coded identification number that enabled the individual to examine the results and reports for personal reasons. In some cases, we even let them examine their own data in comparison to others in the financial elite. For a generation of business men and women who believe in measurement, and who grew up with IQ tests, SAT scores, and other performance metrics, this quantitative capability was an often irresistible source of pleasure. This was particularly true because the individuals had been on a special journey, one their upbringings had left them largely unprepared for, and so understanding the journeys of others was a means for understanding their own trips and themselves.

“The first thing you think is where’s the edge, where can I make a bit more money, how can I push, push the boundaries, maybe you know a bit of a gray area, push the edge of the envelope,” he said in one early interview. “But the point is, you are greedy, you want every little bit of money that you can possibly get because, like I say, that is how you are judged, that is your performance metric.” Paper coffee cups piled up as Hayes went over the minutiae of the case: how to hedge a forward rate agreement; the nuances of Libor and Tibor; why he and Darin hated each other so much. One of the interviews was conducted in the dark so Hayes could talk the investigators through his trading book, which was beamed onto a wall.

Having said that, the question “Would it be politically feasible and functionally possible to implement a shorter workweek for some but not all in your organization?” is one you have to answer for yourself, and the answer is going to be different with every organization. RUN A TRIAL Even after discussing it with employees, writing up contingency plans, and establishing performance metrics, very few companies make a permanent switch to a shorter workweek immediately. They first start with a trial period during which they give people time to adjust to the new schedule, observe, and solve unexpected problems. They then review the experiment at regular intervals to assess how things are going, absorb new lessons, and make course corrections.

a typical household spends: U.S. Energy Information Administration, The city of Shenzhen: Michael J. Coren, “Buses with Batteries,” Quartz, Jan. 2, 2018, Photo: Bloomberg via Getty Images. According to a 2017 study: Shashank Sripad and Venkatasubramanian Viswanathan, “Performance Metrics Required of Next-Generation Batteries to Make a Practical Electric Semi Truck,” ACS Energy Letters, June 27, 2017, Table: Green Premiums to replace diesel: Rhodium Group, Evolved Energy Research, IRENA, and Agora Energiewende. Retail price is the average in the United States from 2015 to 2018.

Much of what radiologists currently do involves taking images and then identifying issues of concern. They predict abnormalities in images. AIs are increasingly able to perform that prediction function at human levels of accuracy or better, which can assist radiologists and other medical specialists in making decisions that have an impact on patients. The critical performance metric is the accuracy of the diagnosis: whether the machine predicts a disease when the patient is ill and predicts no disease when the patient is healthy. But we must consider what such decisions involve. Suppose doctors suspect a lump and must decide how to determine if it is cancerous. One option is medical imaging.

Perhaps the most interesting application of technology in college education is the Minerva Project, a startup university now entering its fifth year. At Minerva, students take classes online, but they do so while living together in dorm-style housing. Minerva’s online interface is unusual in that the student’s face is shown the whole time, and they get called on to ensure accountability and engagement. This “facetime” is even the main performance metric—there aren’t final exams. Professors review the classes to see if individual students are demonstrating the right “habits of mind.” Minerva saves money by not investing in libraries, athletic facilities, sports teams, and the like. Students spend up to one year each in different dorms in San Francisco, Buenos Aires, Berlin, Seoul, and Istanbul.

“In reality, our CO2 emissions are still increasing, but the efforts we are making are very big.”17 Things were beginning to change politically as well. Niu’s green GDP was beginning to catch on. In 2015 China’s environment ministry again floated the idea that the performance of provincial officials should be judged partly by progress on improving the environment. In 2014 more than seventy smaller cities and counties jettisoned GDP as a performance metric for government officials, prioritizing environmental protection and poverty reduction instead. That summer President Xi had told party officials, “We need to look at obvious achievements as well as hidden achievements. We can no longer simply use GDP growth rates to decide who the [party] heroes are.”18 Internationally too, Beijing had gone from laggard to putative world leader.

What is most popular on the Internet is not wholly a matter of what users click on and how websites are hyperlinked—there are a variety of processes at play. Max Holloway of Search Engine Watch notes, “Similarly, with Google, when you click on a result—or, for that matter, don’t click on a result—that behavior impacts future results. One consequence of this complexity is difficulty in explaining system behavior. We primarily rely on performance metrics to quantify the success or failure of retrieval results, or to tell us which variations of a system work better than others. Such metrics allow the system to be continuously improved upon.”52 The goal of combining search terms, then, in the context of the landscape of the search engine optimization logic, is only the beginning.

For decades, the corporate world has been consumed with metrics. Managers love tangible measures by which they can determine success or failure. Counting work hours is one of the easiest ways to measure employee performance, but total hours worked is a meaningless statistic. In fact, while goal setting can be helpful, creating performance metrics for employees is often counterproductive. Metrics can be useful and even enlightening, but if they are overused or even employed to measure things that are unmeasurable like innovation, metrics become destructive. Trying to meet numerical goals is also not particularly inspiring to the human mind, and so metrics don’t encourage creative thought.

According to Friedman, people who believe that business should not be concerned merely with profits but should also promote social ends such as providing employment or avoiding pollution are preaching pure socialism.3 Milton Friedman’s perspective has one obvious advantage: it is simple. There is just one constituency to please—shareholders—and one performance metric that matters—profits. The Friedman doctrine remained business gospel for decades. In 1997, the Business Roundtable, which includes the CEOs of the largest and most influential companies in the United States, published a statement that declared: “The Business Roundtable wishes to emphasize that the principal objective of a business enterprise is to generate economic returns to its owners.”4 My view started changing when I was still a consultant, and my subsequent experience at the helm of several companies only confirmed what I started to feel in those latter days at McKinsey.

European and Asian executives, even those running multinational corporations, are paid a fraction of the salaries paid in the Anglo sphere.”41 CEO Lemons: The Collapse of Pay-for-Performance in America Foreign scholars describe American firms as providing “pathological overcompensation of fair-weather captains.”42 They are correct: the rise in US executive compensation of recent decades is unjustified by any performance metric, vastly outstripping indices like sales, profits, or returns to shareholders. The Clinton administration’s Secretary of Labor, Robert Reich, unearthed the smoking gun evidence: “By 2006, CEOs were earning, on average, eight times as much per dollar of corporate profits as they did in the 1980s.”43 A vast disparity like this in trend lines is powerful evidence that executive pay suffers from market failure.

She pulled a chair out and sat down and the steward poured her a cup of coffee immediately. I noticed that even on a cruise ship she was dressed in a business suit, although it looked somewhat the worse for wear. “Coffee, please,” I called after the retreating steward. “We met in Darmstadt, ’97,” she said. “You’re Marcus Jackman? I critiqued your paper on performance metrics for IEEE maintenance transactions.” The penny dropped. “Karla . . . Carrol?” I asked. She smiled. “Yes, I remember your review.” I did indeed, and nearly burned my tongue on the coffee trying not to let slip precisely how I remembered it. I’m not fit to be rude until after at least the third cup of the morning.

You can also limit your rules to two or three, as we have seen elsewhere in the book, to increase the odds that you will remember and follow them. After crafting your preliminary rules, it is helpful to measure how well they are working. Measuring impact allows you to pinpoint what is and isn’t working, and evidence of success also provides more motivation to stick with the rules. The best performance metrics are tightly linked to what will move the needles for you—pounds lost for a dieter, or dollars invested if you are trying to save for retirement. Apps have made collecting data and tracking progress easier than at any other time in history. Imagine what the legendary self-improver Benjamin Franklin could have accomplished if he’d had an iPhone.

“Analytics Mailbag: Save Percentages, PDO, and Repeatability.” May 27, 2014. 205 The statistic, later dubbed PDO: Details on PDO and NHL statistics given in: Weissbock, Joshua, Herna Viktor, and Diana Inkpen. “Use of Performance Metrics to Forecast Success in the National Hockey League” (paper presented at the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, Prague, September 23–27, 2013). 205 England had the lowest PDO: Burn-Murdoch, John. “Were England the Unluckiest Team in the World Cup Group Stages?”

“Securing premises using surfaced-based computing technology,” U.S. Patent number: 8138882. Issue date: March 20, 2012. The quantified-self movement—“Counting Every Moment,” The Economist, March 3, 2012. Apple earbuds for bio-measurements—Jesse Lee Dorogusker, Anthony Fadell, Donald J. Novotney, and Nicholas R Kalayjian, “Integrated Sensors for Tracking Performance Metrics,” U.S. Patent Application 20090287067. Assignee: Apple. Application Date: 2009-07-23. Publication Date: 2009-11-19. Derawi Biometrics, “Your Walk Is Your PIN-Code,” press release, February 21, 2011 ( iTrem information—See the iTrem project page of the Landmarc Research Center at Georgia Tech ( and email exchange.

Gallup’s Q12 Employee Engagement Survey completely shatters the idea that friendships at work are unproductive. The study concludes that one of the key determinants to engagement at work is having a workplace bestie. Coworkers who report a best friend at work are seven times more engaged at work than their disconnected counterparts. They score higher on all performance metrics. They are better with customers, bring more innovation to projects and have superior mental acuity and reduced rates of error and injury. This is because having social bonds at work makes people feel good. It makes them happy. In 2014, I interviewed Shawn Achor, Harvard researcher and bestselling author of The Happiness Advantage, Beyond Happiness and Big Potential.

Mitchell, the E. Fredkin University Professor in the Machine Learning Department of Carnegie Mellon University’s School of Computer Science, offers a good definition of machine learning in “The Discipline of Machine Learning.” He writes: “We say that a machine learns with respect to a particular task T, performance metric P, and type of experience E, if the system reliably improves its performance P at task T, following experience E. Depending on how we specify T, P, and E, the learning task might also be called by names such as data mining, autonomous discovery, database updating, programming by example, etc.”9 I think this is a good definition because Mitchell uses very precise language to define learning.

User Metrics: What user behaviors can you measure that will indicate that they adopt, use, and place value in your solution?
6. Adoption Strategy: How will customers and users discover and adopt your solution?
7. Business Problem: What problem for your business does building this product, feature, or enhancement solve?
8. Business Metrics: What business performance metrics will be affected by the success of this solution? These metrics are often a consequence of users changing their behavior.
9. Budget: How much money and/or development would you budget to discover, build, and refine this solution?

Opportunity Shouldn’t Be a Euphemism

I know your company probably doesn’t use opportunities.

Martin wrote: A wife bonus, I was told, might be hammered out in a pre-nup or post-nup, and distributed on the basis of not only how well her husband’s fund had done but her own performance—how well she managed the home budget, whether the kids got into a “good” school—the same way their husbands were rewarded at investment banks. In turn these bonuses were a ticket to a modicum of financial independence and participation in a social sphere where you don’t just go to lunch, you buy a $10,000 table at the benefit luncheon a friend is hosting. Women who didn’t get them joked about possible sexual performance metrics. Women who received them usually retreated, demurring when pressed to discuss it further, proof to an anthropologist that a topic is taboo, culturally loaded and dense with meaning. The detail caused a media whirlwind, shock, and mockery—and, of course, a backlash. The wife-bonus women became a trope of elitist female depravity.

Now, this statement needs to be revised because we changed the testing procedure by appending new requirements, and the results might not be the same as before. The forecasted input will be received by the first “Data receiver” component. Knowing particular values of expected traffic and component performance metrics, we can finally estimate required capacity adjustments. We can calculate the potential maximum capacity we have now, the required capacity that can handle the maximum traffic expected, and then find the delta between the two. But this will be true only for the “Data receiver” because the forecasted input defines the size for this component only.
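The delta calculation described above can be sketched as follows. The unit (requests per second), the function name, and all numbers are assumptions made for illustration; the excerpt does not give concrete figures for the “Data receiver” component.

```python
import math


def capacity_delta(expected_peak_rps, per_instance_rps, current_instances):
    """Compare current maximum capacity with the capacity the forecast needs.

    Returns (current_capacity, required_capacity, delta, extra_instances),
    where delta > 0 means the component is under-provisioned.
    """
    if per_instance_rps <= 0:
        raise ValueError("per_instance_rps must be positive")
    current_capacity = current_instances * per_instance_rps
    required_capacity = expected_peak_rps
    delta = required_capacity - current_capacity
    # Round up: a fraction of an instance still means one more instance.
    extra_instances = max(0, math.ceil(delta / per_instance_rps))
    return current_capacity, required_capacity, delta, extra_instances


# Hypothetical figures: each receiver instance was measured at 1,500 requests
# per second, six instances are running, and the forecast peak is 12,000.
current, required, delta, extra = capacity_delta(
    expected_peak_rps=12_000,
    per_instance_rps=1_500,
    current_instances=6,
)
```

As the passage notes, the result sizes only the component that receives the forecasted input directly; downstream components would need their own input estimates before the same arithmetic applies.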

Although throughput is important, in practice it really matters for only the largest companies. This is because for smaller companies, developer time is almost always worth more than infrastructure costs. It’s only at very large scale that overall throughput really begins to matter when considering total cost of ownership (TCO). Instead, tail latency ends up being the most important performance metric for both small and large companies. This is because the causes of high tail latency are hard to understand and lead to a large amount of developer and operator cognitive load. Engineer time is typically the most precious resource for an organization, and debugging tail latency issues is one of the most time-intensive things that engineers do.
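To show why tail latency can tell a different story from typical latency, here is a small sketch that computes percentiles with the nearest-rank method, which is one of several common percentile definitions. The sample latencies are invented for illustration.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# Made-up request latencies in milliseconds: most requests are fast,
# but two out of a hundred are very slow.
latencies_ms = [10] * 98 + [500, 2000]

p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)
worst = percentile(latencies_ms, 100)
mean = sum(latencies_ms) / len(latencies_ms)
```

In this sample the median stays at the fast value while the 99th percentile lands on one of the slow requests, so a dashboard showing only the median or the mean would hide the behavior that engineers end up debugging.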

The bug count goes way down, but the number of bugs stays the same. Developers are clever this way. Whatever you try to measure, they’ll find a way to maximize, and you’ll never quite get what you want. Robert D. Austin, in his book Measuring and Managing Performance in Organizations, says there are two phases when you introduce new performance metrics. At first, you actually get what you want, because nobody has figured out how to cheat. In the second phase, you actually get something worse, as everyone figures out the trick to maximizing the thing that you’re measuring, even at the cost of ruining the company. Worse, Econ 101 managers think that they can somehow avoid this situation just by tweaking the metrics.

Corporate compensation committees responded in three ways. First, “everybody got a raise to $1 million,” Nell Minow, a corporate governance critic, told me.16 Next, corporate compensation committees, which remained bent on showering chief executives indiscriminately with cash, started inventing make-believe performance metrics. For instance, AES Corp., a firm based in Arlington, Virginia, that operates power plants, made it one of chief executive Dennis Bakke’s performance goals to ensure that AES remained a “fun” place to work. (“To some, it’s soft,” the fun-loving Bakke told Businessweek. “To me, it’s a vision of the world.”)

“The crossover between the two sides has been excellent.”4 Even though some students need to take the MCAS multiple times before they pass—vocational schools are particularly committed to offering help and remediation for students who fail—only three seniors did not receive diplomas in 2002. Moreover, Massachusetts vocational schools do far better than comprehensive high schools on crucial performance metrics.5 The statewide dropout rate at regular/comprehensive high schools averaged 2.8 percent in 2011 but was only 1.6 percent among the thirty-nine vocational technical schools and averaged 0.9 percent among regional vocational technical schools. (Massachusetts requires every school district to offer students a career vocational technical education option, either by providing it themselves—common among the larger districts—or as part of a regional career vocational technical high school system.)

The task of using a model where the parameters were learned by statistical inference to actually make predictions on previously unseen data is known as statistical prediction or forecasting. We need to be able to understand the metrics of how to differentiate between a good model and a bad model. There are several well known and well understood performance metrics for different models. For regression prediction problems, we should try to minimize the differences between the predicted value and the actual value of the target variable. These differences are known as residual errors; larger errors mean worse models and, in regression, we try to minimize the sum of these residual errors, or the sum of the square of these residual errors (squaring has the effect of penalizing large outliers more strongly, but more on that later).
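A small sketch of the two error sums mentioned above, using made-up numbers. The function names and both hypothetical models are assumptions for illustration; the sum of residuals is taken over absolute values here so that positive and negative errors do not cancel.

```python
def residuals(actual, predicted):
    """Per-observation differences between actual and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return [a - p for a, p in zip(actual, predicted)]


def sum_absolute_errors(errors):
    return sum(abs(e) for e in errors)


def sum_squared_errors(errors):
    return sum(e * e for e in errors)


actual = [3.0, 5.0, 7.0, 9.0]
model_a = [2.0, 6.0, 6.0, 10.0]  # four small misses of size 1
model_b = [3.0, 5.0, 7.0, 5.0]   # one large miss of size 4

res_a = residuals(actual, model_a)
res_b = residuals(actual, model_b)

abs_a, abs_b = sum_absolute_errors(res_a), sum_absolute_errors(res_b)
sq_a, sq_b = sum_squared_errors(res_a), sum_squared_errors(res_b)
```

Both models have the same total absolute error, but the squared sum is four times larger for the model with the single big miss, which is the outlier-penalizing effect the passage refers to.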

pages: 343 words: 91,080

Uberland: How Algorithms Are Rewriting the Rules of Work
by Alex Rosenblat
Published 22 Oct 2018

While the rating system is described as a simple way to compare Driver X to Driver Y across Uberland and to scale trust between drivers and passengers, in practice its implementation has troubling implications.15 In our case study of Uber's drivers, Luke Stark and I found that passengers effectively perform one of the roles of middle managers, because they are responsible for evaluating worker performance.16 When workers are monitored through an opaque system like Uber's, it's much harder to see the extent to which control and power dynamics are at play.17 In addition to sending in-the-moment nudges to drivers, Uber also exerts longer-term performance management through weekly performance metrics.

Driverless: Intelligent Cars and the Road Ahead
by Hod Lipson and Melba Kurman
Published 22 Sep 2016

Who is at fault in a driverless-car accident needs to be clarified. While accidents involving driverless cars are likely to be relatively rare and the question may wind up being irrelevant, the issue still requires examination. In the United States, insurance law is defined and enforced at the state level. If the federal government can clarify standard performance metrics for each major system in a driverless car (i.e., the software, hardware sensors, and the automotive body), insurance companies will have a clear framework for quantifying risk, and manufacturers will be protected from frivolous lawsuits.

pages: 358 words: 93,969

Climate Change
by Joseph Romm
Published 3 Dec 2015

There were "statistically significant and meaningful reductions in decision-making performance" in the test subjects based on a standard assessment used for assessing cognitive function: At 1,000 ppm CO2, compared with 600 ppm, performance was significantly diminished on six of nine metrics of decision-making performance. At 2,500 ppm CO2, compared with 600 ppm, performance was significantly reduced in seven of nine metrics of performance, with percentile ranks for some performance metrics decreasing to levels associated with marginal or dysfunctional performance.

pages: 347 words: 91,318

Netflixed: The Epic Battle for America's Eyeballs
by Gina Keating
Published 10 Oct 2012

Mario Cibelli of Marathon Partners spent one of the more interesting days of his career as a hedge fund manager at a Netflix distribution center on Long Island, and came away with a different opinion. “There’s not a snowball’s chance in hell that Blockbuster can do this,” he told his colleagues when he returned to work later that day. The warehouse manager, a former aerospace engineer, had shown Cibelli a series of charts posted on the wall; they showed about two dozen optimum performance metrics. “As long as my performance is within this band, I won’t hear from senior management,” the man said, indicating the charts. “As soon as I move out of this band, I will get a call.” Hastings and his team had spent the time and thought to build a quality business, and management clearly was running Netflix for the long term, Cibelli thought.

Concentrated Investing
by Allen C. Benello
Published 7 Dec 2016

After his campaign went public, Comcast quietly began to make a lot of changes. A new chief financial officer, Michael Angelakis, joined the company and brought discipline to capital spending, operating budgets, and acquisitions. Compensation, which was entirely based on EBITDA size, was broadened to include more appropriate performance metrics. Share retirement and dividend increases became regular and predictable, while remaining somewhat anemic. Free cash flow became, for the first time, the company’s mantra. Although Comcast generates a lot of free cash flow they don’t return a significant amount of it. They just move up the dividend and buy back gently over time.

pages: 317 words: 89,825

No Rules Rules: Netflix and the Culture of Reinvention
by Reed Hastings and Erin Meyer
Published 7 Sep 2020

But at Netflix, where we have to be able to adapt direction quickly in response to rapid changes, the last thing we want is our employees rewarded in December for attaining some goal fixed the previous January. The risk is that employees will focus on a target instead of spotting what’s best for the company in the present moment. Many of our Hollywood-based employees come from studios like WarnerMedia or NBC, where a big part of executive compensation is based on specific financial performance metrics. If this year the target is to increase operating profit by 5 percent, the way to get your bonus (often a quarter of annual pay) is to focus doggedly on increasing operating profit. But what if, in order to be competitive five years down the line, a division needs to change course? Changing course involves investment and risk that may reduce this year’s profit margin.

The Internet Trap: How the Digital Economy Builds Monopolies and Undermines Democracy
by Matthew Hindman
Published 24 Sep 2018

Cambridge, MA: Harvard University Press. Pai, A. (2017, April). The importance of economic analysis at the FCC. Remarks of the FCC Chairman at the Hudson Institute. Retrieved from _public/attachmatch/DOC-344248A1.pdf. Palmer, J. W. (2002). Website usability, design, and performance metrics. Information Systems Research, 13(2), 151–67. Pan, B., Hembrooke, H., Joachims, T., Lorigo, L., Gay, G., and Granka, L. (2007). Journal of Computer-Mediated Communication, 12(3), 801–23. Pandey, S., Aly, M., Bagherjeiran, A., Hatch, A., Ciccolo, P., Ratnaparkhi, A., and Zinkevich, M. (2011).

pages: 328 words: 90,677

Ludicrous: The Unvarnished Story of Tesla Motors
by Edward Niedermeyer
Published 14 Sep 2019

Both of these principles are part of Toyota’s managerial approach, called the Toyota Way. Not only are these principles the best method to improve performance in a business as large and complex as an automaker, they are also the only way to build a successful culture. Put differently, a culture founded on the principles of kaizen will not simply try to improve its performance metrics over the long term, it will also seek to continuously improve its ability to continuously improve. For this reason, any automaker with mass-market aspirations like those outlined in Musk’s master plan must build its culture on the proven values of lean manufacturing long before it reaches mass-production volume.

How to Stand Up to a Dictator
by Maria Ressa
Published 19 Oct 2022

Could we somehow employ the right stories, the right information to inspire real-world action, whether that was to vote, help a village during a flood, or call out corruption? Studies15 had shown that 80 to 95 percent of how we make decisions is based not on what we think but on how we feel.16 I thought of it like an iceberg: the tip was the story you could read and see (with performance metrics we could measure), but each story carried an emotion, which traveled over social networks—now social media, four times more powerful than physical networks. Over time, our theory went, that mix could change behavior. If we could map those networks, we would have an idea of how and why the emotions associated with stories traveled through society and changed human behavior.

pages: 1,088 words: 228,743

Expected Returns: An Investor's Guide to Harvesting Market Rewards
by Antti Ilmanen
Published 4 Apr 2011

The rational camp responds that risk stories can explain a surprisingly large part of observed returns without resorting to irrationality—and that various market frictions can make exploiting any remaining opportunities difficult. Specifically, Broadie–Chernov–Johannes (2009) argue that options are often thought to be mispriced because the performance metrics that are used (Sharpe ratios and CAPM alphas) are ill-suited for option analysis, especially over short samples. After documenting the huge challenge for rational models—massively negative average returns for long index puts, losses of 30% per month, or worse, as noted earlier—they proceed to show that standard option-pricing models can largely explain these average returns.

Conclusions

The portfolio SR is a good starting point but it needs to be supplemented with other portfolio attributes. All of the desirable attributes discussed above may be worth some SR sacrifice. However, no single risk-adjusted return measure can capture them all, and many of these tradeoffs can only be assessed in a qualitative fashion. Multiple performance metrics are needed, given the multi-dimensional nature of the problem.

28.2.4 Smart risk taking and portfolio construction

There now follow some intuitive rules of thumb for smart investing: a recipe for optimal diversification and the “fundamental law of active management”. First, here is a recipe for smart portfolio construction, which sums up mean variance optimization in a nutshell: allocate equal volatility to each asset class (or return source) in a portfolio, unless some assets’ exceptional SRs or diversification abilities justify deviating from equal volatility weightings.
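The equal-volatility recipe can be illustrated with a short calculation. This is a minimal sketch under stated assumptions, not the author's method: it reads "equal volatility" as making each asset's stand-alone volatility contribution (weight times volatility) the same, which ignores correlations between assets, and the asset names and volatility figures are hypothetical.

```python
def equal_volatility_weights(volatilities):
    """Weights proportional to 1 / volatility, normalized to sum to 1.

    Assumption: correlations between assets are ignored, so each asset's
    stand-alone contribution (weight * volatility) comes out equal.
    """
    if any(v <= 0 for v in volatilities.values()):
        raise ValueError("volatilities must be positive")
    inverse = {name: 1.0 / v for name, v in volatilities.items()}
    total = sum(inverse.values())
    return {name: x / total for name, x in inverse.items()}


# Hypothetical annualized volatilities for three asset classes.
vols = {"equities": 0.20, "bonds": 0.05, "commodities": 0.10}
weights = equal_volatility_weights(vols)

for name, w in weights.items():
    # Each line shows the weight and the stand-alone volatility it carries.
    print(f"{name}: weight={w:.4f}, weight*vol={w * vols[name]:.4f}")
```

With these inputs the least volatile asset class receives the largest weight (bonds at 4/7, commodities at 2/7, equities at 1/7), and each carries the same stand-alone volatility. The excerpt's caveat still applies: exceptional SRs or diversification benefits may justify deviating from these weights.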

Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems
by Martin Kleppmann
Published 17 Apr 2017

For example, make it fast to roll back configuration changes, roll out new code gradually (so that any unexpected bugs affect only a small subset of users), and provide tools to recompute data (in case it turns out that the old computation was incorrect).
• Set up detailed and clear monitoring, such as performance metrics and error rates. In other engineering disciplines this is referred to as telemetry. (Once a rocket has left the ground, telemetry is essential for tracking what is happening, and for understanding failures [14].) Monitoring can show us early warning signals and allow us to check whether any assumptions or constraints are being violated.
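As an illustration of monitoring an error rate and checking whether a constraint is being violated, here is a minimal sketch. It is not from the book: the class name `ErrorRateMonitor`, the window size, and the threshold are hypothetical choices, and a production system would normally rely on a dedicated metrics and alerting stack rather than in-process code like this.

```python
from collections import deque


class ErrorRateMonitor:
    """Tracks the error rate over the most recent requests (hypothetical helper)."""

    def __init__(self, window_size=100, threshold=0.05):
        # A bounded deque discards the oldest outcome once the window is full.
        self.outcomes = deque(maxlen=window_size)
        self.threshold = threshold

    def record(self, ok):
        """Record one request outcome: True for success, False for an error."""
        self.outcomes.append(bool(ok))

    def error_rate(self):
        """Fraction of failed requests in the current window (0.0 if empty)."""
        if not self.outcomes:
            return 0.0
        return self.outcomes.count(False) / len(self.outcomes)

    def is_violated(self):
        """True when the error rate exceeds the configured threshold."""
        return self.error_rate() > self.threshold


monitor = ErrorRateMonitor(window_size=10, threshold=0.2)
for ok in [True] * 8 + [False] * 2:
    monitor.record(ok)
print(monitor.error_rate(), monitor.is_violated())  # 0.2 False

monitor.record(False)  # the oldest success drops out of the window
print(monitor.error_rate(), monitor.is_violated())  # 0.3 True
```

A sliding window is used so that a burst of recent failures shows up quickly as an early warning signal instead of being averaged away by a long history of successes.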

support, 239 inside-out databases, 504 (see also unbundling databases) integrating different data systems (see data integration) integrity, 524 coordination-avoiding data systems, 528 correctness of dataflow systems, 525 in consensus formalization, 365 integrity checks, 530 (see also auditing) end-to-end, 519, 531 use of snapshot isolation, 238 maintaining despite software bugs, 529 Interface Definition Language (IDL), 117, 122 intermediate state, materialization of, 420-423 internet services, systems for implementing, 275 invariants, 225 (see also constraints) inversion of control, 396 IP (Internet Protocol) unreliability of, 277 ISDN (Integrated Services Digital Network), 284 isolation (in transactions), 225, 228, 555 correctness and, 515 for single-object writes, 230 serializability, 251-266 actual serial execution, 252-256 serializable snapshot isolation (SSI), 261-266 two-phase locking (2PL), 257-261 violating, 228 weak isolation levels, 233-251 preventing lost updates, 242-246 read committed, 234-237 snapshot isolation, 237-242 iterative processing, 424-426 J Java Database Connectivity (JDBC) distributed transaction support, 361 network drivers, 128 Java Enterprise Edition (EE), 134, 356, 361 Java Message Service (JMS), 444 (see also messaging systems) comparison to log-based messaging, 448, 451 distributed transaction support, 361 message ordering, 446 Java Transaction API (JTA), 355, 361 Java Virtual Machine (JVM) bytecode generation, 428 garbage collection pauses, 296 process reuse in batch processors, 422 JavaScript in MapReduce querying, 46 setting element styles (example), 45 use in advanced queries, 48 Jena (RDF framework), 57 Jepsen (fault tolerance testing), 515 jitter (network delay), 284 joins, 555 by index lookup, 403 expressing as relational operators, 427 in relational and document databases, 34 MapReduce map-side joins, 408-410 broadcast hash joins, 409 merge joins, 410 partitioned hash joins, 409 MapReduce reduce-side joins, 403-408 handling 
skew, 407 sort-merge joins, 405 parallel execution of, 415 secondary indexes and, 85 stream joins, 472-476 stream-stream join, 473 stream-table join, 473 table-table join, 474 time-dependence of, 475 support in document databases, 42 JOTM (transaction coordinator), 356 JSON Avro schema representation, 122 binary variants, 115 for application data, issues with, 114 in relational databases, 30, 42 representing a résumé (example), 31 Juttle (query language), 504 K k-nearest neighbors, 429 Kafka (messaging), 137, 448 Kafka Connect (database integration), 457, 461 Kafka Streams (stream processor), 466, 467 fault tolerance, 479 leader-based replication, 153 log compaction, 456, 467 message offsets, 447, 478 request routing, 216 transaction support, 477 usage example, 4 Ketama (partitioning library), 213 key-value stores, 70 as batch process output, 412 hash indexes, 72-75 in-memory, 89 partitioning, 201-205 by hash of key, 203, 217 by key range, 202, 217 dynamic partitioning, 212 skew and hot spots, 205 Kryo (Java), 113 Kubernetes (cluster manager), 418, 506 L lambda architecture, 497 Lamport timestamps, 345 Large Hadron Collider (LHC), 64 last write wins (LWW), 173, 334 discarding concurrent writes, 186 problems with, 292 prone to lost updates, 246 late binding, 396 latency instability under two-phase locking, 259 network latency and resource utilization, 286 response time versus, 14 tail latency, 15, 207 leader-based replication, 152-161 (see also replication) failover, 157, 301 handling node outages, 156 implementation of replication logs change data capture, 454-457 (see also changelogs) statement-based, 158 trigger-based replication, 161 write-ahead log (WAL) shipping, 159 linearizability of operations, 333 locking and leader election, 330 log sequence number, 156, 449 read-scaling architecture, 161 relation to consensus, 367 setting up new followers, 155 synchronous versus asynchronous, 153-155 leaderless replication, 177-191 (see also replication)
detecting concurrent writes, 184-191 capturing happens-before relationship, 187 happens-before relationship and concurrency, 186 last write wins, 186 merging concurrently written values, 190 version vectors, 191 multi-datacenter, 184 quorums, 179-182 consistency limitations, 181-183, 334 sloppy quorums and hinted handoff, 183 read repair and anti-entropy, 178 leap seconds, 8, 290 in time-of-day clocks, 288 leases, 295 implementation with ZooKeeper, 370 need for fencing, 302 ledgers, 460 distributed ledger technologies, 532 legacy systems, maintenance of, 18 less (Unix tool), 397 LevelDB (storage engine), 78 leveled compaction, 79 Levenshtein automata, 88 limping (partial failure), 311 linearizability, 324-338, 555 cost of, 335-338 CAP theorem, 336 memory on multi-core CPUs, 338 definition, 325-329 implementing with total order broadcast, 350 in ZooKeeper, 370 of derived data systems, 492, 524 avoiding coordination, 527 of different replication methods, 332-335 using quorums, 334 relying on, 330-332 constraints and uniqueness, 330 cross-channel timing dependencies, 331 locking and leader election, 330 stronger than causal consistency, 342 using to implement total order broadcast, 351 versus serializability, 329 LinkedIn Azkaban (workflow scheduler), 402 Databus (change data capture), 161, 455 Espresso (database), 31, 126, 130, 153, 216 Helix (cluster manager) (see Helix) profile (example), 30 reference to company entity (example), 34 (RPC framework), 135 Voldemort (database) (see Voldemort) Linux, leap second bug, 8, 290 liveness properties, 308 LMDB (storage engine), 82, 242 load approaches to coping with, 17 describing, 11 load testing, 16 load balancing (messaging), 444 local indexes (see document-partitioned indexes) locality (data access), 32, 41, 555 in batch processing, 400, 405, 421 in stateful clients, 170, 511 in stream processing, 474, 478, 508, 522 location transparency, 134 in the actor model, 138 locks, 556 deadlock, 258 distributed
locking, 301-304, 330 fencing tokens, 303 implementation with ZooKeeper, 370 relation to consensus, 374 for transaction isolation in snapshot isolation, 239 in two-phase locking (2PL), 257-261 making operations atomic, 243 performance, 258 preventing dirty writes, 236 preventing phantoms with index-range locks, 260, 265 read locks (shared mode), 236, 258 shared mode and exclusive mode, 258 in two-phase commit (2PC) deadlock detection, 364 in-doubt transactions holding locks, 362 materializing conflicts with, 251 preventing lost updates by explicit locking, 244 log sequence number, 156, 449 logic programming languages, 504 logical clocks, 293, 343, 494 for read-after-write consistency, 164 logical logs, 160 logs (data structure), 71, 556 advantages of immutability, 460 compaction, 73, 79, 456, 460 for stream operator state, 479 creating using total order broadcast, 349 implementing uniqueness constraints, 522 log-based messaging, 446-451 comparison to traditional messaging, 448, 451 consumer offsets, 449 disk space usage, 450 replaying old messages, 451, 496, 498 slow consumers, 450 using logs for message storage, 447 log-structured storage, 71-79 log-structured merge tree (see LSM-trees) replication, 152, 158-161 change data capture, 454-457 (see also changelogs) coordination with snapshot, 156 logical (row-based) replication, 160 statement-based replication, 158 trigger-based replication, 161 write-ahead log (WAL) shipping, 159 scalability limits, 493 loose coupling, 396, 419, 502 lost updates (see updates) LSM-trees (indexes), 78-79 comparison to B-trees, 83-85 Lucene (storage engine), 79 building indexes in batch processes, 411 similarity search, 88 Luigi (workflow scheduler), 402 LWW (see last write wins) M machine learning ethical considerations, 534 (see also ethics) iterative processing, 424 models derived from training data, 505 statistical and numerical algorithms, 428 MADlib (machine learning toolkit), 428 magic scaling sauce, 18 Mahout (machine learning
toolkit), 428 maintainability, 18-22, 489 defined, 23 design principles for software systems, 19 evolvability (see evolvability) operability, 19 simplicity and managing complexity, 20 many-to-many relationships in document model versus relational model, 39 modeling as graphs, 49 many-to-one and many-to-many relationships, 33-36 many-to-one relationships, 34 MapReduce (batch processing), 390, 399-400 accessing external services within job, 404, 412 comparison to distributed databases designing for frequent faults, 417 diversity of processing models, 416 diversity of storage, 415 comparison to stream processing, 464 comparison to Unix, 413-414 disadvantages and limitations of, 419 fault tolerance, 406, 414, 422 higher-level tools, 403, 426 implementation in Hadoop, 400-403 the shuffle, 402 implementation in MongoDB, 46-48 machine learning, 428 map-side processing, 408-410 broadcast hash joins, 409 merge joins, 410 partitioned hash joins, 409 mapper and reducer functions, 399 materialization of intermediate state, 419-423 output of batch workflows, 411-413 building search indexes, 411 key-value stores, 412 reduce-side processing, 403-408 analysis of user activity events (example), 404 grouping records by same key, 406 handling skew, 407 sort-merge joins, 405 workflows, 402 marshalling (see encoding) massively parallel processing (MPP), 216 comparison to composing storage technologies, 502 comparison to Hadoop, 414-418, 428 master-master replication (see multi-leader replication) master-slave replication (see leader-based replication) materialization, 556 aggregate values, 101 conflicts, 251 intermediate state (batch processing), 420-423 materialized views, 101 as derived data, 386, 499-504 maintaining, using stream processing, 467, 475 Maven (Java build tool), 428 Maxwell (change data capture), 455 mean, 14 media monitoring, 467 median, 14 meeting room booking (example), 249, 259, 521 membership services, 372 Memcached (caching server),
4, 89 memory in-memory databases, 88 durability, 227 serial transaction execution, 253 in-memory representation of data, 112 random bit-flips in, 529 use by indexes, 72, 77 memory barrier (CPU instruction), 338 MemSQL (database) in-memory storage, 89 read committed isolation, 236 memtable (in LSM-trees), 78 Mercurial (version control system), 463 merge joins, MapReduce map-side, 410 mergeable persistent data structures, 174 merging sorted files, 76, 402, 405 Merkle trees, 532 Mesos (cluster manager), 418, 506 message brokers (see messaging systems) message-passing, 136-139 advantages over direct RPC, 137 distributed actor frameworks, 138 evolvability, 138 MessagePack (encoding format), 116 messages exactly-once semantics, 360, 476 loss of, 442 using total order broadcast, 348 messaging systems, 440-451 (see also streams) backpressure, buffering, or dropping messages, 441 brokerless messaging, 442 event logs, 446-451 comparison to traditional messaging, 448, 451 consumer offsets, 449 replaying old messages, 451, 496, 498 slow consumers, 450 message brokers, 443-446 acknowledgements and redelivery, 445 comparison to event logs, 448, 451 multiple consumers of same topic, 444 reliability, 442 uniqueness in log-based messaging, 522 Meteor (web framework), 456 microbatching, 477, 495 microservices, 132 (see also services) causal dependencies across services, 493 loose coupling, 502 relation to batch/stream processors, 389, 508 Microsoft Azure Service Bus (messaging), 444 Azure Storage, 155, 398 Azure Stream Analytics, 466 DCOM (Distributed Component Object Model), 134 MSDTC (transaction coordinator), 356 Orleans (see Orleans) SQL Server (see SQL Server) migrating (rewriting) data, 40, 130, 461, 497 modulus operator (%), 210 MongoDB (database) aggregation pipeline, 48 atomic operations, 243 BSON, 41 document data model, 31 hash partitioning (sharding), 203-204 key-range partitioning, 202 lack of join support, 34, 42 leader-based replication, 153 MapReduce support, 46,
400 oplog parsing, 455, 456 partition splitting, 212 request routing, 216 secondary indexes, 207 Mongoriver (change data capture), 455 monitoring, 10, 19 monotonic clocks, 288 monotonic reads, 164 MPP (see massively parallel processing) MSMQ (messaging), 361 multi-column indexes, 87 multi-leader replication, 168-177 (see also replication) handling write conflicts, 171 conflict avoidance, 172 converging toward a consistent state, 172 custom conflict resolution logic, 173 determining what is a conflict, 174 linearizability, lack of, 333 replication topologies, 175-177 use cases, 168 clients with offline operation, 170 collaborative editing, 170 multi-datacenter replication, 168, 335 multi-object transactions, 228 need for, 231 Multi-Paxos (total order broadcast), 367 multi-table index cluster tables (Oracle), 41 multi-tenancy, 284 multi-version concurrency control (MVCC), 239, 266 detecting stale MVCC reads, 263 indexes and snapshot isolation, 241 mutual exclusion, 261 (see also locks) MySQL (database) binlog coordinates, 156 binlog parsing for change data capture, 455 circular replication topology, 175 consistent snapshots, 156 distributed transaction support, 361 InnoDB storage engine (see InnoDB) JSON support, 30, 42 leader-based replication, 153 performance of XA transactions, 360 row-based replication, 160 schema changes in, 40 snapshot isolation support, 242 (see also InnoDB) statement-based replication, 159 Tungsten Replicator (multi-leader replication), 170 conflict detection, 177 N nanomsg (messaging library), 442 Narayana (transaction coordinator), 356 NATS (messaging), 137 near-real-time (nearline) processing, 390 (see also stream processing) Neo4j (database) Cypher query language, 52 graph data model, 50 Nephele (dataflow engine), 421 netcat (Unix tool), 397 Netflix Chaos Monkey, 7, 280 Network Attached Storage (NAS), 146, 398 network model, 36 graph databases versus, 60 imperative query APIs, 46 Network Time Protocol (see NTP) networks
congestion and queueing, 282 datacenter network topologies, 276 faults (see faults) linearizability and network delays, 338 network partitions, 279, 337 timeouts and unbounded delays, 281 next-key locking, 260 nodes (in graphs) (see vertices) nodes (processes), 556 handling outages in leader-based replication, 156 system models for failure, 307 noisy neighbors, 284 nonblocking atomic commit, 359 nondeterministic operations accidental nondeterminism, 423 partial failures in distributed systems, 275 nonfunctional requirements, 22 nonrepeatable reads, 238 (see also read skew) normalization (data representation), 33, 556 executing joins, 39, 42, 403 foreign key references, 231 in systems of record, 386 versus denormalization, 462 NoSQL, 29, 499 transactions and, 223 Notation3 (N3), 56 npm (package manager), 428 NTP (Network Time Protocol), 287 accuracy, 289, 293 adjustments to monotonic clocks, 289 multiple server addresses, 306 numbers, in XML and JSON encodings, 114 O object-relational mapping (ORM) frameworks, 30 error handling and aborted transactions, 232 unsafe read-modify-write cycle code, 244 object-relational mismatch, 29 observer pattern, 506 offline systems, 390 (see also batch processing) stateful, offline-capable clients, 170, 511 offline-first applications, 511 offsets consumer offsets in partitioned logs, 449 messages in partitioned logs, 447 OLAP (online analytic processing), 91, 556 data cubes, 102 OLTP (online transaction processing), 90, 556 analytics queries versus, 411 workload characteristics, 253 one-to-many relationships, 30 JSON representation, 32 online systems, 389 (see also services) Oozie (workflow scheduler), 402 OpenAPI (service definition format), 133 OpenStack Nova (cloud infrastructure) use of ZooKeeper, 370 Swift (object storage), 398 operability, 19 operating systems versus databases, 499 operation identifiers, 518, 522 operational transformation, 174 operators, 421 flow of data between, 424 in stream processing, 464
optimistic concurrency control, 261 Oracle (database) distributed transaction support, 361 GoldenGate (change data capture), 161, 170, 455 lack of serializability, 226 leader-based replication, 153 multi-table index cluster tables, 41 not preventing write skew, 248 partitioned indexes, 209 PL/SQL language, 255 preventing lost updates, 245 read committed isolation, 236 Real Application Clusters (RAC), 330 recursive query support, 54 snapshot isolation support, 239, 242 TimesTen (in-memory database), 89 WAL-based replication, 160 XML support, 30 ordering, 339-352 by sequence numbers, 343-348 causal ordering, 339-343 partial order, 341 limits of total ordering, 493 total order broadcast, 348-352 Orleans (actor framework), 139 outliers (response time), 14 Oz (programming language), 504 P package managers, 428, 505 packet switching, 285 packets corruption of, 306 sending via UDP, 442 PageRank (algorithm), 49, 424 paging (see virtual memory) ParAccel (database), 93 parallel databases (see massively parallel processing) parallel execution of graph analysis algorithms, 426 queries in MPP databases, 216 Parquet (data format), 96, 131 (see also column-oriented storage) use in Hadoop, 414 partial failures, 275, 310 limping, 311 partial order, 341 partitioning, 199-218, 556 and replication, 200 in batch processing, 429 multi-partition operations, 514 enforcing constraints, 522 secondary index maintenance, 495 of key-value data, 201-205 by key range, 202 skew and hot spots, 205 rebalancing partitions, 209-214 automatic or manual rebalancing, 213 problems with hash mod N, 210 using dynamic partitioning, 212 using fixed number of partitions, 210 using N partitions per node, 212 replication and, 147 request routing, 214-216 secondary indexes, 206-209 document-based partitioning, 206 term-based partitioning, 208 serial execution of transactions and, 255 Paxos (consensus algorithm), 366 ballot number, 368 Multi-Paxos (total order broadcast), 367 percentiles, 14, 556 calculating
efficiently, 16 importance of high percentiles, 16 use in service level agreements (SLAs), 15 Percona XtraBackup (MySQL tool), 156 performance describing, 13 of distributed transactions, 360 of in-memory databases, 89 of linearizability, 338 of multi-leader replication, 169 perpetual inconsistency, 525 pessimistic concurrency control, 261 phantoms (transaction isolation), 250 materializing conflicts, 251 preventing, in serializability, 259 physical clocks (see clocks) pickle (Python), 113 Pig (dataflow language), 419, 427 replicated joins, 409 skewed joins, 407 workflows, 403 Pinball (workflow scheduler), 402 pipelined execution, 423 in Unix, 394 point in time, 287 polyglot persistence, 29 polystores, 501 PostgreSQL (database) BDR (multi-leader replication), 170 causal ordering of writes, 177 Bottled Water (change data capture), 455 Bucardo (trigger-based replication), 161, 173 distributed transaction support, 361 foreign data wrappers, 501 full text search support, 490 leader-based replication, 153 log sequence number, 156 MVCC implementation, 239, 241 PL/pgSQL language, 255 PostGIS geospatial indexes, 87 preventing lost updates, 245 preventing write skew, 248, 261 read committed isolation, 236 recursive query support, 54 representing graphs, 51 serializable snapshot isolation (SSI), 261 snapshot isolation support, 239, 242 WAL-based replication, 160 XML and JSON support, 30, 42 pre-splitting, 212 Precision Time Protocol (PTP), 290 predicate locks, 259 predictive analytics, 533-536 amplifying bias, 534 ethics of (see ethics) feedback loops, 536 preemption of datacenter resources, 418 of threads, 298 Pregel processing model, 425 primary keys, 85, 556 compound primary key (Cassandra), 204 primary-secondary replication (see leader-based replication) privacy, 536-543 consent and freedom of choice, 538 data as assets and power, 540 deleting data, 463 ethical considerations (see ethics) legislation and self-regulation, 542 meaning of, 539 surveillance, 537
tracking behavioral data, 536 probabilistic algorithms, 16, 466 process pauses, 295-299 processing time (of events), 469 producers (message streams), 440 programming languages dataflow languages, 504 for stored procedures, 255 functional reactive programming (FRP), 504 logic programming, 504 Prolog (language), 61 (see also Datalog) promises (asynchronous operations), 135 property graphs, 50 Cypher query language, 52 Protocol Buffers (data format), 117-121 field tags and schema evolution, 120 provenance of data, 531 publish/subscribe model, 441 publishers (message streams), 440 punch card tabulating machines, 390 pure functions, 48 putting computation near data, 400 Q Qpid (messaging), 444 quality of service (QoS), 285 Quantcast File System (distributed filesystem), 398 query languages, 42-48 aggregation pipeline, 48 CSS and XSL, 44 Cypher, 52 Datalog, 60 Juttle, 504 MapReduce querying, 46-48 recursive SQL queries, 53 relational algebra and SQL, 42 SPARQL, 59 query optimizers, 37, 427 queueing delays (networks), 282 head-of-line blocking, 15 latency and response time, 14 queues (messaging), 137 quorums, 179-182, 556 for leaderless replication, 179 in consensus algorithms, 368 limitations of consistency, 181-183, 334 making decisions in distributed systems, 301 monitoring staleness, 182 multi-datacenter replication, 184 relying on durability, 309 sloppy quorums and hinted handoff, 183 R R-trees (indexes), 87 RabbitMQ (messaging), 137, 444 leader-based replication, 153 race conditions, 225 (see also concurrency) avoiding with linearizability, 331 caused by dual writes, 452 dirty writes, 235 in counter increments, 235 lost updates, 242-246 preventing with event logs, 462, 507 preventing with serializable isolation, 252 write skew, 246-251 Raft (consensus algorithm), 366 sensitivity to network problems, 369 term number, 368 use in etcd, 353 RAID (Redundant Array of Independent Disks), 7, 398 railways, schema migration on, 496 RAMCloud (in-memory storage), 89
ranking algorithms, 424 RDF (Resource Description Framework), 57 querying with SPARQL, 59 RDMA (Remote Direct Memory Access), 276 read committed isolation level, 234-237 implementing, 236 multi-version concurrency control (MVCC), 239 no dirty reads, 234 no dirty writes, 235 read path (derived data), 509 read repair (leaderless replication), 178 for linearizability, 335 read replicas (see leader-based replication) read skew (transaction isolation), 238, 266 as violation of causality, 340 read-after-write consistency, 163, 524 cross-device, 164 read-modify-write cycle, 243 read-scaling architecture, 161 reads as events, 513 real-time collaborative editing, 170 near-real-time processing, 390 (see also stream processing) publish/subscribe dataflow, 513 response time guarantees, 298 time-of-day clocks, 288 rebalancing partitions, 209-214, 556 (see also partitioning) automatic or manual rebalancing, 213 dynamic partitioning, 212 fixed number of partitions, 210 fixed number of partitions per node, 212 problems with hash mod N, 210 recency guarantee, 324 recommendation engines batch process outputs, 412 batch workflows, 403, 420 iterative processing, 424 statistical and numerical algorithms, 428 records, 399 events in stream processing, 440 recursive common table expressions (SQL), 54 redelivery (messaging), 445 Redis (database) atomic operations, 243 durability, 89 Lua scripting, 255 single-threaded execution, 253 usage example, 4 redundancy hardware components, 7 of derived data, 386 (see also derived data) Reed–Solomon codes (error correction), 398 refactoring, 22 (see also evolvability) regions (partitioning), 199 register (data structure), 325 relational data model, 28-42 comparison to document model, 38-42 graph queries in SQL, 53 in-memory databases with, 89 many-to-one and many-to-many relationships, 33 multi-object transactions, need for, 231 NoSQL as alternative to, 29 object-relational mismatch, 29 relational algebra and SQL, 42 versus document model
convergence of models, 41 data locality, 41 relational databases eventual consistency, 162 history, 28 leader-based replication, 153 logical logs, 160 philosophy compared to Unix, 499, 501 schema changes, 40, 111, 130 statement-based replication, 158 use of B-tree indexes, 80 relationships (see edges) reliability, 6-10, 489 building a reliable system from unreliable components, 276 defined, 6, 22 hardware faults, 7 human errors, 9 importance of, 10 of messaging systems, 442 software errors, 8 Remote Method Invocation (Java RMI), 134 remote procedure calls (RPCs), 134-136 (see also services) based on futures, 135 data encoding and evolution, 136 issues with, 134 using Avro, 126, 135 using Thrift, 135 versus message brokers, 137 repeatable reads (transaction isolation), 242 replicas, 152 replication, 151-193, 556 and durability, 227 chain replication, 155 conflict resolution and, 246 consistency properties, 161-167 consistent prefix reads, 165 monotonic reads, 164 reading your own writes, 162 in distributed filesystems, 398 leaderless, 177-191 detecting concurrent writes, 184-191 limitations of quorum consistency, 181-183, 334 sloppy quorums and hinted handoff, 183 monitoring staleness, 182 multi-leader, 168-177 across multiple datacenters, 168, 335 handling write conflicts, 171-175 replication topologies, 175-177 partitioning and, 147, 200 reasons for using, 145, 151 single-leader, 152-161 failover, 157 implementation of replication logs, 158-161 relation to consensus, 367 setting up new followers, 155 synchronous versus asynchronous, 153-155 state machine replication, 349, 452 using erasure coding, 398 with heterogeneous data systems, 453 replication logs (see logs) reprocessing data, 496, 498 (see also evolvability) from log-based messaging, 451 request routing, 214-216 approaches to, 214 parallel query execution, 216 resilient systems, 6 (see also fault tolerance) response time as performance metric for services, 13, 389 guarantees on, 298
latency versus, 14 mean and percentiles, 14 user experience, 15 responsibility and accountability, 535 REST (Representational State Transfer), 133 (see also services) RethinkDB (database) document data model, 31 dynamic partitioning, 212 join support, 34, 42 key-range partitioning, 202 leader-based replication, 153 subscribing to changes, 456 Riak (database) Bitcask storage engine, 72 CRDTs, 174, 191 dotted version vectors, 191 gossip protocol, 216 hash partitioning, 203-204, 211 last-write-wins conflict resolution, 186 leaderless replication, 177 LevelDB storage engine, 78 linearizability, lack of, 335 multi-datacenter support, 184 preventing lost updates across replicas, 246 rebalancing, 213 search feature, 209 secondary indexes, 207 siblings (concurrently written values), 190 sloppy quorums, 184 ring buffers, 450 Ripple (cryptocurrency), 532 rockets, 10, 36, 305 RocksDB (storage engine), 78 leveled compaction, 79 rollbacks (transactions), 222 rolling upgrades, 8, 112 routing (see request routing) row-oriented storage, 96 row-based replication, 160 rowhammer (memory corruption), 529 RPCs (see remote procedure calls) Rubygems (package manager), 428 rules (Datalog), 61 S safety and liveness properties, 308 in consensus algorithms, 366 in transactions, 222 sagas (see compensating transactions) Samza (stream processor), 466, 467 fault tolerance, 479 streaming SQL support, 466 sandboxes, 9 SAP HANA (database), 93 scalability, 10-18, 489 approaches for coping with load, 17 defined, 22 describing load, 11 describing performance, 13 partitioning and, 199 replication and, 161 scaling up versus scaling out, 146 scaling out, 17, 146 (see also shared-nothing architecture) scaling up, 17, 146 scatter/gather approach, querying partitioned databases, 207 SCD (slowly changing dimension), 476 schema-on-read, 39 comparison to evolvable schema, 128 in distributed filesystems, 415 schema-on-write, 39 schemaless databases (see schema-on-read) schemas, 557 Avro, 122-127 reader 
determining writer’s schema, 125 schema evolution, 123 dynamically generated, 126 evolution of, 496 affecting application code, 111 compatibility checking, 126 in databases, 129-131 in message-passing, 138 in service calls, 136 flexibility in document model, 39 for analytics, 93-95 for JSON and XML, 115 merits of, 127 schema migration on railways, 496 Thrift and Protocol Buffers, 117-121 schema evolution, 120 traditional approach to design, fallacy in, 462 searches building search indexes in batch processes, 411 k-nearest neighbors, 429 on streams, 467 partitioned secondary indexes, 206 secondaries (see leader-based replication) secondary indexes, 85, 557 partitioning, 206-209, 217 document-partitioned, 206 index maintenance, 495 term-partitioned, 208 problems with dual writes, 452, 491 updating, transaction isolation and, 231 secondary sorts, 405 sed (Unix tool), 392 self-describing files, 127 self-joins, 480 self-validating systems, 530 semantic web, 57 semi-synchronous replication, 154 sequence number ordering, 343-348 generators, 294, 344 insufficiency for enforcing constraints, 347 Lamport timestamps, 345 use of timestamps, 291, 295, 345 sequential consistency, 351 serializability, 225, 233, 251-266, 557 linearizability versus, 329 pessimistic versus optimistic concurrency control, 261 serial execution, 252-256 partitioning, 255 using stored procedures, 253, 349 serializable snapshot isolation (SSI), 261-266 detecting stale MVCC reads, 263 detecting writes that affect prior reads, 264 distributed execution, 265, 364 performance of SSI, 265 preventing write skew, 262-265 two-phase locking (2PL), 257-261 index-range locks, 260 performance, 258 Serializable (Java), 113 serialization, 113 (see also encoding) service discovery, 135, 214, 372 using DNS, 216, 372 service level agreements (SLAs), 15 service-oriented architecture (SOA), 132 (see also services) services, 131-136 microservices, 132 causal dependencies across services, 493 loose coupling, 502
relation to batch/stream processors, 389, 508 remote procedure calls (RPCs), 134-136 issues with, 134 similarity to databases, 132 web services, 132, 135 session windows (stream processing), 472 (see also windows) sessionization, 407 sharding (see partitioning) shared mode (locks), 258 shared-disk architecture, 146, 398 shared-memory architecture, 146 shared-nothing architecture, 17, 146-147, 557 (see also replication) distributed filesystems, 398 (see also distributed filesystems) partitioning, 199 use of network, 277 sharks biting undersea cables, 279 counting (example), 46-48 finding (example), 42 website about (example), 44 shredding (in relational model), 38 siblings (concurrent values), 190, 246 (see also conflicts) similarity search edit distance, 88 genome data, 63 k-nearest neighbors, 429 single-leader replication (see leader-based replication) single-threaded execution, 243, 252 in batch processing, 406, 421, 426 in stream processing, 448, 463, 522 size-tiered compaction, 79 skew, 557 clock skew, 291-294, 334 in transaction isolation read skew, 238, 266 write skew, 246-251, 262-265 (see also write skew) meanings of, 238 unbalanced workload, 201 compensating for, 205 due to celebrities, 205 for time-series data, 203 in batch processing, 407 slaves (see leader-based replication) sliding windows (stream processing), 472 (see also windows) sloppy quorums, 183 (see also quorums) lack of linearizability, 334 slowly changing dimension (data warehouses), 476 smearing (leap seconds adjustments), 290 snapshots (databases) causal consistency, 340 computing derived data, 500 in change data capture, 455 serializable snapshot isolation (SSI), 261-266, 329 setting up a new replica, 156 snapshot isolation and repeatable read, 237-242 implementing with MVCC, 239 indexes and MVCC, 241 visibility rules, 240 synchronized clocks for global snapshots, 294 snowflake schemas, 95 SOAP, 133 (see also services) evolvability, 136 software bugs, 8 maintaining integrity,
529 solid state drives (SSDs) access patterns, 84 detecting corruption, 519, 530 faults in, 227 sequential write throughput, 75 Solr (search server) building indexes in batch processes, 411 document-partitioned indexes, 207 request routing, 216 usage example, 4 use of Lucene, 79 sort (Unix tool), 392, 394, 395 sort-merge joins (MapReduce), 405 Sorted String Tables (see SSTables) sorting sort order in column storage, 99 source of truth (see systems of record) Spanner (database) data locality, 41 snapshot isolation using clocks, 295 TrueTime API, 294 Spark (processing framework), 421-423 bytecode generation, 428 dataflow APIs, 427 fault tolerance, 422 for data warehouses, 93 GraphX API (graph processing), 425 machine learning, 428 query optimizer, 427 Spark Streaming, 466 microbatching, 477 stream processing on top of batch processing, 495 SPARQL (query language), 59 spatial algorithms, 429 split brain, 158, 557 in consensus algorithms, 352, 367 preventing, 322, 333 using fencing tokens to avoid, 302-304 spreadsheets, dataflow programming capabilities, 504 SQL (Structured Query Language), 21, 28, 43 advantages and limitations of, 416 distributed query execution, 48 graph queries in, 53 isolation levels standard, issues with, 242 query execution on Hadoop, 416 résumé (example), 30 SQL injection vulnerability, 305 SQL on Hadoop, 93 statement-based replication, 158 stored procedures, 255 SQL Server (database) data warehousing support, 93 distributed transaction support, 361 leader-based replication, 153 preventing lost updates, 245 preventing write skew, 248, 257 read committed isolation, 236 recursive query support, 54 serializable isolation, 257 snapshot isolation support, 239 T-SQL language, 255 XML support, 30 SQLstream (stream analytics), 466 SSDs (see solid state drives) SSTables (storage format), 76-79 advantages over hash indexes, 76 concatenated index, 204 constructing and maintaining, 78 making LSM-Tree from, 78 staleness (old data), 162 cross-channel
timing dependencies, 331 in leaderless databases, 178 in multi-version concurrency control, 263 monitoring for, 182 of client state, 512 versus linearizability, 324 versus timeliness, 524 standbys (see leader-based replication) star replication topologies, 175 star schemas, 93-95 similarity to event sourcing, 458 Star Wars analogy (event time versus processing time), 469 state derived from log of immutable events, 459 deriving current state from the event log, 458 interplay between state changes and application code, 507 maintaining derived state, 495 maintenance by stream processor in stream-stream joins, 473 observing derived state, 509-515 rebuilding after stream processor failure, 478 separation of application code and, 505 state machine replication, 349, 452 statement-based replication, 158 statically typed languages analogy to schema-on-write, 40 code generation and, 127 statistical and numerical algorithms, 428 StatsD (metrics aggregator), 442 stdin, stdout, 395, 396 Stellar (cryptocurrency), 532 stock market feeds, 442 STONITH (Shoot The Other Node In The Head), 158 stop-the-world (see garbage collection) storage composing data storage technologies, 499-504 diversity of, in MapReduce, 415 Storage Area Network (SAN), 146, 398 storage engines, 69-104 column-oriented, 95-101 column compression, 97-99 defined, 96 distinction between column families and, 99 Parquet, 96, 131 sort order in, 99-100 writing to, 101 comparing requirements for transaction processing and analytics, 90-96 in-memory storage, 88 durability, 227 row-oriented, 70-90 B-trees, 79-83 comparing B-trees and LSM-trees, 83-85 defined, 96 log-structured, 72-79 stored procedures, 161, 253-255, 557 and total order broadcast, 349 pros and cons of, 255 similarity to stream processors, 505 Storm (stream processor), 466 distributed RPC, 468, 514 Trident state handling, 478 straggler events, 470, 498 stream processing, 464-481, 557 accessing external services within job, 474, 477, 478, 517
combining with batch processing lambda architecture, 497 unifying technologies, 498 comparison to batch processing, 464 complex event processing (CEP), 465 fault tolerance, 476-479 atomic commit, 477 idempotence, 478 microbatching and checkpointing, 477 rebuilding state after a failure, 478 for data integration, 494-498 maintaining derived state, 495 maintenance of materialized views, 467 messaging systems (see messaging systems) reasoning about time, 468-472 event time versus processing time, 469, 477, 498 knowing when window is ready, 470 types of windows, 472 relation to databases (see streams) relation to services, 508 search on streams, 467 single-threaded execution, 448, 463 stream analytics, 466 stream joins, 472-476 stream-stream join, 473 stream-table join, 473 table-table join, 474 time-dependence of, 475 streams, 440-451 end-to-end, pushing events to clients, 512 messaging systems (see messaging systems) processing (see stream processing) relation to databases, 451-464 (see also changelogs) API support for change streams, 456 change data capture, 454-457 derivative of state by time, 460 event sourcing, 457-459 keeping systems in sync, 452-453 philosophy of immutable events, 459-464 topics, 440 strict serializability, 329 strong consistency (see linearizability) strong one-copy serializability, 329 subjects, predicates, and objects (in triplestores), 55 subscribers (message streams), 440 (see also consumers) supercomputers, 275 surveillance, 537 (see also privacy) Swagger (service definition format), 133 swapping to disk (see virtual memory) synchronous networks, 285, 557 comparison to asynchronous networks, 284 formal model, 307 synchronous replication, 154, 557 chain replication, 155 conflict detection, 172 system models, 300, 306-310 assumptions in, 528 correctness of algorithms, 308 mapping to the real world, 309 safety and liveness, 308 systems of record, 386, 557 change data capture, 454, 491 treating event log as, 460 systems thinking,
536 T t-digest (algorithm), 16 table-table joins, 474 Tableau (data visualization software), 416 tail (Unix tool), 447 tail vertex (property graphs), 51 Tajo (query engine), 93 Tandem NonStop SQL (database), 200 TCP (Transmission Control Protocol), 277 comparison to circuit switching, 285 comparison to UDP, 283 connection failures, 280 flow control, 282, 441 packet checksums, 306, 519, 529 reliability and duplicate suppression, 517 retransmission timeouts, 284 use for transaction sessions, 229 telemetry (see monitoring) Teradata (database), 93, 200 term-partitioned indexes, 208, 217 termination (consensus), 365 Terrapin (database), 413 Tez (dataflow engine), 421-423 fault tolerance, 422 support by higher-level tools, 427 thrashing (out of memory), 297 threads (concurrency) actor model, 138, 468 (see also message-passing) atomic operations, 223 background threads, 73, 85 execution pauses, 286, 296-298 memory barriers, 338 preemption, 298 single (see single-threaded execution) three-phase commit, 359 Thrift (data format), 117-121 BinaryProtocol, 118 CompactProtocol, 119 field tags and schema evolution, 120 throughput, 13, 390 TIBCO, 137 Enterprise Message Service, 444 StreamBase (stream analytics), 466 time concurrency and, 187 cross-channel timing dependencies, 331 in distributed systems, 287-299 (see also clocks) clock synchronization and accuracy, 289 relying on synchronized clocks, 291-295 process pauses, 295-299 reasoning about, in stream processors, 468-472 event time versus processing time, 469, 477, 498 knowing when window is ready, 470 timestamp of events, 471 types of windows, 472 system models for distributed systems, 307 time-dependence in stream joins, 475 time-of-day clocks, 288 timeliness, 524 coordination-avoiding data systems, 528 correctness of dataflow systems, 525 timeouts, 279, 557 dynamic configuration of, 284 for failover, 158 length of, 281 timestamps, 343 assigning to events in stream processing, 471 for read-after-write consistency, 163 for 
transaction ordering, 295 insufficiency for enforcing constraints, 347 key range partitioning by, 203 Lamport, 345 logical, 494 ordering events, 291, 345 Titan (database), 50 tombstones, 74, 191, 456 topics (messaging), 137, 440 total order, 341, 557 limits of, 493 sequence numbers or timestamps, 344 total order broadcast, 348-352, 493, 522 consensus algorithms and, 366-368 implementation in ZooKeeper and etcd, 370 implementing with linearizable storage, 351 using, 349 using to implement linearizable storage, 350 tracking behavioral data, 536 (see also privacy) transaction coordinator (see coordinator) transaction manager (see coordinator) transaction processing, 28, 90-95 comparison to analytics, 91 comparison to data warehousing, 93 transactions, 221-267, 558 ACID properties of, 223 atomicity, 223 consistency, 224 durability, 226 isolation, 225 compensating (see compensating transactions) concept of, 222 distributed transactions, 352-364 avoiding, 492, 502, 521-528 failure amplification, 364, 495 in doubt/uncertain status, 358, 362 two-phase commit, 354-359 use of, 360-361 XA transactions, 361-364 OLTP versus analytics queries, 411 purpose of, 222 serializability, 251-266 actual serial execution, 252-256 pessimistic versus optimistic concurrency control, 261 serializable snapshot isolation (SSI), 261-266 two-phase locking (2PL), 257-261 single-object and multi-object, 228-232 handling errors and aborts, 231 need for multi-object transactions, 231 single-object writes, 230 snapshot isolation (see snapshots) weak isolation levels, 233-251 preventing lost updates, 242-246 read committed, 234-238 transitive closure (graph algorithm), 424 trie (data structure), 88 triggers (databases), 161, 441 implementing change data capture, 455 implementing replication, 161 triple-stores, 55-59 SPARQL query language, 59 tumbling windows (stream processing), 472 (see also windows) in microbatching, 477 tuple spaces (programming model), 507 Turtle (RDF
data format), 56 Twitter constructing home timelines (example), 11, 462, 474, 511 DistributedLog (event log), 448 Finagle (RPC framework), 135 Snowflake (sequence number generator), 294 Summingbird (processing library), 497 two-phase commit (2PC), 353, 355-359, 558 confusion with two-phase locking, 356 coordinator failure, 358 coordinator recovery, 363 how it works, 357 issues in practice, 363 performance cost, 360 transactions holding locks, 362 two-phase locking (2PL), 257-261, 329, 558 confusion with two-phase commit, 356 index-range locks, 260 performance of, 258 type checking, dynamic versus static, 40 U UDP (User Datagram Protocol) comparison to TCP, 283 multicast, 442 unbounded datasets, 439, 558 (see also streams) unbounded delays, 558 in networks, 282 process pauses, 296 unbundling databases, 499-515 composing data storage technologies, 499-504 federation versus unbundling, 501 need for high-level language, 503 designing applications around dataflow, 504-509 observing derived state, 509-515 materialized views and caching, 510 multi-partition data processing, 514 pushing state changes to clients, 512 uncertain (transaction status) (see in doubt) uniform consensus, 365 (see also consensus) uniform interfaces, 395 union type (in Avro), 125 uniq (Unix tool), 392 uniqueness constraints asynchronously checked, 526 requiring consensus, 521 requiring linearizability, 330 uniqueness in log-based messaging, 522 Unix philosophy, 394-397 command-line batch processing, 391-394 Unix pipes versus dataflow engines, 423 comparison to Hadoop, 413-414 comparison to relational databases, 499, 501 comparison to stream processing, 464 composability and uniform interfaces, 395 loose coupling, 396 pipes, 394 relation to Hadoop, 499 UPDATE statement (SQL), 40 updates preventing lost updates, 242-246 atomic write operations, 243 automatically detecting lost updates, 245 compare-and-set operations, 245 conflict resolution and replication, 246 using explicit locking, 244 preventing 
write skew, 246-251 V validity (consensus), 365 vBuckets (partitioning), 199 vector clocks, 191 (see also version vectors) vectorized processing, 99, 428 verification, 528-533 avoiding blind trust, 530 culture of, 530 designing for auditability, 531 end-to-end integrity checks, 531 tools for auditable data systems, 532 version control systems, reliance on immutable data, 463 version vectors, 177, 191 capturing causal dependencies, 343 versus vector clocks, 191 Vertica (database), 93 handling writes, 101 replicas using different sort orders, 100 vertical scaling (see scaling up) vertices (in graphs), 49 property graph model, 50 Viewstamped Replication (consensus algorithm), 366 view number, 368 virtual machines, 146 (see also cloud computing) context switches, 297 network performance, 282 noisy neighbors, 284 reliability in cloud services, 8 virtualized clocks in, 290 virtual memory process pauses due to page faults, 14, 297 versus memory management by databases, 89 VisiCalc (spreadsheets), 504 vnodes (partitioning), 199 Voice over IP (VoIP), 283 Voldemort (database) building read-only stores in batch processes, 413 hash partitioning, 203-204, 211 leaderless replication, 177 multi-datacenter support, 184 rebalancing, 213 reliance on read repair, 179 sloppy quorums, 184 VoltDB (database) cross-partition serializability, 256 deterministic stored procedures, 255 in-memory storage, 89 output streams, 456 secondary indexes, 207 serial execution of transactions, 253 statement-based replication, 159, 479 transactions in stream processing, 477 W WAL (write-ahead log), 82 web services (see services) Web Services Description Language (WSDL), 133 webhooks, 443 webMethods (messaging), 137 WebSocket (protocol), 512 windows (stream processing), 466, 468-472 infinite windows for changelogs, 467, 474 knowing when all events have arrived, 470 stream joins within a window, 473 types of windows, 472 winners (conflict resolution), 173 WITH RECURSIVE syntax (SQL), 54
workflows (MapReduce), 402 outputs, 411-414 key-value stores, 412 search indexes, 411 with map-side joins, 410 working set, 393 write amplification, 84 write path (derived data), 509 write skew (transaction isolation), 246-251 characterizing, 246-251, 262 examples of, 247, 249 materializing conflicts, 251 occurrence in practice, 529 phantoms, 250 preventing in snapshot isolation, 262-265 in two-phase locking, 259-261 options for, 248 write-ahead log (WAL), 82, 159 writes (database) atomic write operations, 243 detecting writes affecting prior reads, 264 preventing dirty writes with read committed, 235 WS-* framework, 133 (see also services) WS-AtomicTransaction (2PC), 355 X XA transactions, 355, 361-364 heuristic decisions, 363 limitations of, 363 xargs (Unix tool), 392, 396 XML binary variants, 115 encoding RDF data, 57 for application data, issues with, 114 in relational databases, 30, 41 XSL/XPath, 45 Y Yahoo!

pages: 1,237 words: 227,370

Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems
by Martin Kleppmann
Published 16 Mar 2017

For example, make it fast to roll back configuration changes, roll out new code gradually (so that any unexpected bugs affect only a small subset of users), and provide tools to recompute data (in case it turns out that the old computation was incorrect). Set up detailed and clear monitoring, such as performance metrics and error rates. In other engineering disciplines this is referred to as telemetry. (Once a rocket has left the ground, telemetry is essential for tracking what is happening, and for understanding failures [14].) Monitoring can show us early warning signals and allow us to check whether any assumptions or constraints are being violated.
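
As a rough illustration of the monitoring advice above, the following Python sketch tracks an error rate over a sliding window of recent requests and flags when it crosses a threshold. The class name `ErrorRateMonitor`, its parameters, and the threshold values are hypothetical, chosen only for this example; the book does not prescribe any particular implementation.

```python
from collections import deque


class ErrorRateMonitor:
    """Hypothetical helper: error rate over the most recent `window` requests."""

    def __init__(self, window: int = 100, threshold: float = 0.05):
        # Each entry is True if the request failed; old entries fall off automatically.
        self.outcomes = deque(maxlen=window)
        self.threshold = threshold

    def record(self, failed: bool) -> None:
        """Record the outcome of one request."""
        self.outcomes.append(failed)

    def error_rate(self) -> float:
        """Fraction of failed requests in the current window (0.0 if empty)."""
        if not self.outcomes:
            return 0.0
        return sum(self.outcomes) / len(self.outcomes)

    def should_alert(self) -> bool:
        """Early warning signal: the error rate exceeds the assumed threshold."""
        return self.error_rate() > self.threshold


# Usage: eight successes followed by two failures in a window of ten.
monitor = ErrorRateMonitor(window=10, threshold=0.2)
for failed in [False] * 8 + [True] * 2:
    monitor.record(failed)
print(monitor.error_rate(), monitor.should_alert())
```

A bounded window keeps the signal responsive to recent behavior, which matters during a gradual rollout: a spike shortly after new code ships is a cue to roll back quickly. In practice the alert would feed an existing metrics or paging system rather than a print statement.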

