Bayesian statistics

description: a branch of statistics based on the Bayesian probability theory

62 results

pages: 589 words: 69,193

Mastering Pandas
by Femi Anthony
Published 21 Jun 2015

Index A .at operatorabout / The .iat and .at operators Active State PythonURL / Third-party Python software installation aggregate methodusing / Using the aggregate method aggregation, in Rabout / Aggregation in R aliases, for Time Series frequenciesabout / Aliases for Time Series frequencies alphaabout / The alpha and p-values alternative hypothesisabout / The null and alternative hypotheses Anacondaabout / Continuum Analytics Anaconda URL / Continuum Analytics Anaconda, Final step for all platforms, Other numeric or analytics-focused Python distributions installing / Installing Anaconda URL, for download / Installing Anaconda installing, on Linux / Linux installing, on Mac OS/X / Mac OS X installing, on Windows / Windows installing, final steps / Final step for all platforms numeric or analytics-focused Python distributions / Other numeric or analytics-focused Python distributions IPython installation / Install via Anaconda (for Linux/Mac OS X) scikit-learn, installing via / Installing via Anaconda appendusing / Using append arithmetic operationsapplying, on columns / Arithmetic operations on columns B Bayesian analysis exampleswitchpoint detection / Bayesian analysis example – Switchpoint detection Bayesiansabout / How the model is defined Bayesian statistical analysisconducting, steps / Conducting Bayesian statistical analysis Bayesian statisticsabout / Introduction to Bayesian statistics reference link / Introduction to Bayesian statistics mathematical framework / Mathematical framework for Bayesian statistics references / Mathematical framework for Bayesian statistics, Applications of Bayesian statistics, References applications / Applications of Bayesian statistics versus Frequentist statistics / Bayesian statistics versus Frequentist statistics Bayes theoryabout / Bayes theory and odds Bernoulli distributionabout / The Bernoulli distribution reference link / The Bernoulli distribution big datareferences / We live in a big data world 4V’s / 4 V's of big data 
about / 4 V's of big data examples / The move towards real-time analytics binomial distributionabout / The binomial distribution Boolean indexingabout / Boolean indexing any() method / The is in and any all methods isin method / The is in and any all methods all method / The is in and any all methods where() method, using / Using the where() method indexes, operations / Operations on indexes C 4-4-5 calendarreference link / pandas/tseries central limit theoremreference link / Background central limit theorem (CLT)about / The mean classes, converter.pyConverter / pandas/tseries Formatters / pandas/tseries Locators / pandas/tseries classes, offsets.pyDateOffset / pandas/tseries BusinessMixin / pandas/tseries MonthOffset / pandas/tseries MonthBegin / pandas/tseries MonthEnd / pandas/tseries BusinessMonthEnd / pandas/tseries BusinessMonthBegin / pandas/tseries YearOffset / pandas/tseries YearBegin / pandas/tseries YearEnd / pandas/tseries BYearEnd / pandas/tseries BYearBegin / pandas/tseries Week / pandas/tseries WeekDay / pandas/tseries WeekOfMonth / pandas/tseries LastWeekOfMonth / pandas/tseries QuarterOffset / pandas/tseries QuarterEnd / pandas/tseries QuarterrBegin / pandas/tseries BQuarterEnd / pandas/tseries BQuarterBegin / pandas/tseries FY5253Quarter / pandas/tseries FY5253 / pandas/tseries Easter / pandas/tseries Tick / pandas/tseries classes, parsers.pyTextFileReader / pandas/io ParserBase / pandas/io CParserWrapper / pandas/io PythonParser / pandas/io FixedWidthReader / pandas/io FixedWithFieldParser / pandas/io classes, plm.pyPanelOLS / pandas/stats MovingPanelOLS / pandas/stats NonPooledPanelOLS / pandas/stats classes, sql.pyPandasSQL / pandas/io PandasSQLAlchemy / pandas/io PandasSQLTable / pandas/io PandasSQLTableLegacy / pandas/io PandasSQLLegacy / pandas/io columnmultiple functions, applying to / Applying multiple functions column namespecifying, in R / Specifying column name in R specifying, in pandas / Specifying column name in pandas 
columnsarithmetic operations, applying on / Arithmetic operations on columns concat functionabout / The concat function concat function, elementsobjs function / The concat function axis function / The concat function join function / The concat function join_axes function / The concat function keys function / The concat function concat operationreference link / The join function Condadocumentation, URL / Final step for all platforms conda commandURL / Final step for all platforms Confidence (Frequentist) intervalversus Credible (Bayesian) interval / Confidence (Frequentist) versus Credible (Bayesian) intervals confidence intervalabout / Confidence intervals example / An illustrative example container types, RVector / R data types List / R data types DataFrame / R data types Matrix / R data types continuous probability distributionsabout / Continuous probability distributions continuous uniform distribution / The continuous uniform distribution exponential distribution / The exponential distribution normal distribution / The normal distribution continuous uniform distributionabout / The continuous uniform distribution Continuum AnalyticsURL / Third-party Python software installation correlationabout / Correlation and linear regression, Correlation reference link / Correlation, An illustrative example Credible (Bayesian) intervalversus Confidence (Frequentist) interval / Confidence (Frequentist) versus Credible (Bayesian) intervals cross-sections / Cross sections cut() function, pandasabout / The pandas solution cut() method, Rabout / An R example using cut() reference link / An R example using cut() Cython / What is pandas?

A Brief Tour of Bayesian Statistics
Introduction to Bayesian statistics
Mathematical framework for Bayesian statistics
Bayes theory and odds
Applications of Bayesian statistics
Probability distributions
Fitting a distribution
Discrete probability distributions
Discrete uniform distributions
The Bernoulli distribution
The binomial distribution
The Poisson distribution
The Geometric distribution
The negative binomial distribution
Continuous probability distributions
The continuous uniform distribution
The exponential distribution
The normal distribution
Bayesian statistics versus Frequentist statistics
What is probability?

For a deeper look at the statistics topics that we touched on, please take a look at Understanding Statistics in the Behavioral Sciences, which can be found at http://www.amazon.com/Understanding-Statistics-Behavioral-Sciences-Robert/dp/0495596523.

Chapter 8. A Brief Tour of Bayesian Statistics

In this chapter, we will take a brief tour of an alternative approach to statistical inference called Bayesian statistics. It is not intended to be a full primer, but just to serve as an introduction to the Bayesian approach. We will also explore the associated Python-related libraries and how to use pandas and matplotlib to help with the data analysis. The various topics that will be discussed are as follows:

Introduction to Bayesian statistics
Mathematical framework for Bayesian statistics
Probability distributions
Bayesian versus Frequentist statistics
Introduction to PyMC and Monte Carlo simulation
Illustration of Bayesian inference – Switchpoint detection

Introduction to Bayesian statistics

The field of Bayesian statistics is built on the work of Reverend Thomas Bayes, an 18th-century statistician, philosopher, and Presbyterian minister.

pages: 561 words: 120,899

The Theory That Would Not Die: How Bayes' Rule Cracked the Enigma Code, Hunted Down Russian Submarines, and Emerged Triumphant From Two Centuries of Controversy
by Sharon Bertsch McGrayne
Published 16 May 2011

DeGroot, Morris H. (1986b) A conversation with Persi Diaconis. Statistical Science (1:3) 319–34. Diaconis P, Efron B. (1983) Computer-intensive methods in statistics. Scientific American (248) 116–30. Diaconis, Persi. (1985) Bayesian statistics as honest work. Proceedings of the Berkeley Conference in Honor of Jerzy Neyman and Jack Kiefer (1), eds., Lucien M. Le Cam and Richard A. Olshen. Wadsworth. Diaconis P, Holmes S. (1996) Are there still things to do in Bayesian statistics? Erkenntnis (45) 145–58. Diaconis P. (1998) A place for philosophy? The rise of modeling in statistical science. Quarterly of Applied Mathematics (56:4) 797–805. DuMouchel WH, Harris JE. (1983) Bayes methods for combining the results of cancer studies in humans and other species.

Drawing on primary source material and interviews with statisticians and other scientists, The Theory That Would Not Die is the riveting account of how a seemingly simple theorem ignited one of the greatest controversies of all time”—Provided by publisher. Includes bibliographical references and index. ISBN 978-0-300-16969-0 (hardback) 1. Bayesian statistical decision theory—History. I. Title. QA279.5.M415 2011 519.5’42—dc22 2010045037 A catalogue record for this book is available from the British Library. This paper meets the requirements of ANSI/NISO Z39.48–1992 (Permanence of Paper). 10 9 8 7 6 5 4 3 2 1 When the facts change, I change my opinion.

He introduced the signature features of Bayesian methods: an initial belief modified by objective new information. He could move from observations of the world to abstractions about their probable cause. And he discovered the long-sought grail of probability, what future mathematicians would call the probability of causes, the principle of inverse probability, Bayesian statistics, or simply Bayes’ rule. Given the revered status of his work today, it is also important to recognize what Bayes did not do. He did not produce the modern version of Bayes’ rule. He did not even employ an algebraic equation; he used Newton’s old-fashioned geometric notation to calculate and add areas.

Bulletproof Problem Solving
by Charles Conn and Robert McLean
Published 6 Mar 2019

This problem is a consequence of the underlying mathematics—and a reminder to always use the simplest model that sufficiently explains your phenomenon.

Bayesian Statistics and the Space Shuttle Challenger Disaster

For those who lived through the Space Shuttle Challenger disaster, it is remembered as an engineering failure. It was that of course, but more importantly it was a problem solving failure. It involved risk assessment relating to O‐ring damage that we now know is best assessed with Bayesian statistics. Bayesian statistics are useful in incomplete data environments, and especially as a way of assessing conditional probability in complex situations.

We start with simple data analysis and then move on to multiple regression, Bayesian statistics, simulations, constructed experiments, natural experiments, machine learning, crowd‐sourced problem solving, and finish up with another big gun for competitive settings, game theory. Of course each of these tools could warrant a textbook on their own, so this is necessarily only an introduction to the power and applications of each technique.

Summary of Case Studies

Data visualization: London air quality
Multivariate regression: Understanding obesity
Bayesian statistics: Space Shuttle Challenger disaster
Constructed experiments: RCTs and A|B testing
Natural experiments: Voter prejudice
Simulations: Climate change example
Machine learning: Sleep apnea, bus routing, and shark spotting
Crowd‐sourcing algorithms
Game theory: Intellectual property and serving in tennis

It is a reasonable amount of effort to work through these, but bear with us—these case studies will give you a solid sense of which advanced tool to use in a variety of problem settings.

Several lessons emerge for the use of big guns in data analysis from the Challenger disaster. First is that the choice of model, in this case Bayesian statistics, can have an impact on conclusions about risks, in this case catastrophic risks. Second is that it takes careful thinking to arrive at the correct conditional probability. Finally, how you handle extreme values like launch temperature at 31F, when the data is incomplete, requires a probabilistic approach where a distribution is fitted to available data. Bayesian statistics may be the right tool to test your hypothesis when the opportunity exists to do updating of a prior probability with new evidence, in this case exploring the full experience of success and failure at a temperature not previously experienced.

pages: 283 words: 102,484

Everything Is Predictable: How Bayesian Statistics Explain Our World
by Tom Chivers
Published 6 May 2024

Even Daniël Lakens, whom Bayesians think of as the arch-frequentist, says that “often frequentist approaches are best, but sometimes you do have enough prior information to say we can use Bayesian statistics, and in those situations it has clear advantages. That’s the nuanced position, but you’re not going to write a book saying that.” Cassie Kozyrkov, the Google data scientist, in her blog post about whether you’re a Bayesian or a frequentist, has a subheading, “So, which one is better?”, and her answer is: “Wrong question! The right one to choose depends on how you want to approach your decision-making.” She also points out, probably rightly, that during her graduate studies at Duke University—“which is to Bayesian statistics approximately what the Vatican is to Catholicism”—the loudest voices shouting about how great Bayesianism is weren’t the professors but the students, mainly because the basic Bayesian ideas are easier to grasp.

That might not have been as huge an accolade as it sounds: away from UCL, Bayesianism was very much a sideshow. “At University College the world looked Bayesian; thus, it came as a kind of a shock to discover that in most statistical conferences you had to fight for your right to work within Bayesian statistics to a mainly unsympathetic audience, with no real time left to go into the details of your work,” Bernardo went on. Grieve remembers something similar: “When we were first giving public lectures on Bayes,” he says, “it wasn’t unusual to be given the before- or after-lunch slot. The comedy slot.

It continues in this vein at some length. Over the years there was also a song called “José Bernardo,” which was sung to the tune of the Macarena; Andy Grieve sang a repurposed medieval students’ drinking song, “Gaudeamus Igitur,” along with another future president of the Royal Statistical Society, Professor Sir David Spiegelhalter; there was a “Bayesians in the Night” to the tune of “Strangers in the Night”; a “Like a Bayesian” (“Like a Virgin”). And so on. I mentioned this on Twitter and Sir David got in touch to say that, alas, “Our performance of ‘The Full Monty Carlo’ was before the smartphone era, so no recordings exist.” (“Who would want to see a video of six male professors of Bayesian statistics taking their clothes off in front of a screaming crowd in a Spanish nightclub?” he went on to ask, in my view entirely misjudging the nature of the modern internet.) There absolutely are videos, from later conferences, of “Bayesian Believer,” a Monkees reimagining (“Then I saw Tom Bayes, now I’m a believer”), and “What a Bayesian World,” à la Louis Armstrong.

pages: 354 words: 105,322

The Road to Ruin: The Global Elites' Secret Plan for the Next Financial Crisis
by James Rickards
Published 15 Nov 2016

Yet, based on the French-Russian reaction alone, Somary correctly inferred that world war was inevitable. His analysis was that if an insignificant matter excited geopolitical tensions to the boiling point, then larger matters, which inevitably occur, must lead to war. This inference is a perfect example of Bayesian statistics. Somary, in effect, started with a hypothesis about the probability of world war, which in the absence of any information is weighted fifty-fifty. As incidents like the sanjak railway emerge, they are added to the numerator and denominator of the mathematical form of Bayes’ theorem, increasing the odds of war.
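The updating described in this excerpt can be sketched in a few lines of Python. This is only an illustration: the function name `bayes_update` and the two conditional probabilities (0.6 and 0.3) are hypothetical values chosen for the sketch, not figures from the book. The only number taken from the excerpt is the fifty-fifty starting point.

```python
def bayes_update(prior, p_evidence_if_true, p_evidence_if_false):
    """Return P(H | E) from P(H), P(E | H) and P(E | not H) via Bayes' theorem."""
    numerator = p_evidence_if_true * prior
    denominator = numerator + p_evidence_if_false * (1.0 - prior)
    return numerator / denominator


# Assumption for illustration: every new incident is twice as likely
# to be observed if war is coming (0.6) as if it is not (0.3).
belief = 0.5  # the fifty-fifty starting point from the excerpt
history = [belief]
for _ in range(3):
    belief = bayes_update(belief, p_evidence_if_true=0.6, p_evidence_if_false=0.3)
    history.append(belief)

print([round(b, 3) for b in history])
```

Under these assumed numbers each incident doubles the odds of the hypothesis, so the belief rises step by step from the fifty-fifty prior. Different likelihoods would give a different path.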

Every analysis starts with the same data. Yet when you enter that data into a deficient model, you get deficient output. Investors who use complexity theory can leave mainstream analysis behind and get better forecasting results. The third tool in addition to behavioral psychology and complexity theory is Bayesian statistics, a branch of etiology also referred to as causal inference. Both terms derive from Bayes’ theorem, an equation first described by Thomas Bayes and published posthumously in 1763. A version of the theorem was elaborated independently and more formally by the French mathematician Pierre-Simon Laplace in 1774.

Money matters, but an emphasis on money to the exclusion of psychology is a fatal flaw. Keynesian and monetarist schools have lately merged into the neoliberal consensus, a nightmarish surf and turf presenting the worst of both. In this book, I write as a theorist using complexity theory, Bayesian statistics, and behavioral psychology to study economics. That approach is unique and not yet a “school” of economic thought. This book also uses one other device—history. When asked to identify which established school of economic thought I find most useful, my reply is Historical. Notable writers of the Historical school include the liberal Walter Bagehot, the Communist Karl Marx, and the conservative Austrian-Catholic Joseph A.

The Book of Why: The New Science of Cause and Effect
by Judea Pearl and Dana Mackenzie
Published 1 Mar 2018

Where causation is concerned, a grain of wise subjectivity tells us more about the real world than any amount of objectivity. In the above paragraph, I said that “most of” the tools of statistics strive for complete objectivity. There is one important exception to this rule, though. A branch of statistics called Bayesian statistics has achieved growing popularity over the last fifty years or so. Once considered almost anathema, it has now gone completely mainstream, and you can attend an entire statistics conference without hearing any of the great debates between “Bayesians” and “frequentists” that used to thunder in the 1960s and 1970s.

Did it come from the neighborhood grocery or a shady gambler? If it’s just an ordinary quarter, most of us would not let the coincidence of nine heads sway our belief so dramatically. On the other hand, if we already suspected the coin was weighted, we would conclude more willingly that the nine heads provided serious evidence of bias. Bayesian statistics give us an objective way of combining the observed evidence with our prior knowledge (or subjective belief) to obtain a revised belief and hence a revised prediction of the outcome of the coin’s next toss. Still, what frequentists could not abide was that Bayesians were allowing opinion, in the form of subjective probabilities, to intrude into the pristine kingdom of statistics.
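A minimal Python sketch of the coin example follows. It assumes only two competing explanations, a fair coin and a coin weighted to land heads 90 percent of the time, and it reads "nine heads" as nine heads in nine tosses. The weighting, the two prior values, and the function names are all hypothetical choices made for illustration; none of them come from the book.

```python
def posterior_weighted(prior_weighted, heads, tosses, p_fair=0.5, p_weighted=0.9):
    """Posterior probability that the coin is weighted, given the observed tosses."""
    tails = tosses - heads
    # The binomial coefficient is the same for both hypotheses, so it cancels.
    like_fair = p_fair ** heads * (1 - p_fair) ** tails
    like_weighted = p_weighted ** heads * (1 - p_weighted) ** tails
    joint_weighted = prior_weighted * like_weighted
    joint_fair = (1 - prior_weighted) * like_fair
    return joint_weighted / (joint_weighted + joint_fair)


def next_head_probability(post_weighted, p_fair=0.5, p_weighted=0.9):
    """Revised prediction for the next toss, averaging over both hypotheses."""
    return post_weighted * p_weighted + (1 - post_weighted) * p_fair


# Assumed priors: 0.001 for an ordinary quarter, 0.5 for a coin from a shady gambler.
for label, prior in [("ordinary quarter", 0.001), ("shady gambler's coin", 0.5)]:
    post = posterior_weighted(prior, heads=9, tosses=9)
    print(f"{label}: P(weighted | data) = {post:.3f}, "
          f"P(next toss is heads) = {next_head_probability(post):.3f}")
```

The same nine heads move the two observers to very different conclusions, because the evidence is combined with different prior beliefs.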

Inquiries into Human Faculty and Its Development. Macmillan, London, UK. Galton, F. (1889). Natural Inheritance. Macmillan, London, UK. Goldberger, A. (1972). Structural equation models in the social sciences. Econometrica: Journal of the Econometric Society 40: 979–1001. Lindley, D. (1987). Bayesian Statistics: A Review. CBMS-NSF Regional Conference Series in Applied Mathematics (Book 2). Society for Industrial and Applied Mathematics, Philadelphia, PA. McGrayne, S. B. (2011). The Theory That Would Not Die. Yale University Press, New Haven, CT. Pearl, J. (2000). Causality: Models, Reasoning, and Inference.

Life Is Simple: How Occam's Razor Set Science Free and Shapes the Universe
by Johnjoe McFadden
Published 27 Sep 2021

As a church minister he was probably involved in fundraising events, such as tombolas, raffles or lotteries, so in his paper he starts his argument by asking us to ‘imagine a person present at the drawing of a lottery, who knows nothing of its scheme or of the proportion of Blanks to Prizes in it’. At this point it will be easier to appreciate the role of Occam’s razor in Bayesian statistics if we substitute dice for the tombola. We will imagine that Bayes’s friend Mr Price owns two dice. The first is a conventional simple six-sided dice and the second is an unconventional and more complex sixty-sided dice. We will further imagine that Mr Price persuades the reverend to play a game in which, behind a screen, he throws just one of the two dice and calls out its number.

Now the situation is uncertain as 5 could have been thrown by either dice. Are they both equally likely? The Reverend Bayes thought not and devised his statistical methods to deal with precisely this kind of inductive problem where two, or several, or even an infinite number of hypotheses or models fit the data. How do you choose between them? The key factor in Bayesian statistics is the Bayesian likelihood, which, as first pointed out by the statistician Harold Jeffreys in his textbook on probability published in 1989 and then further elaborated by many succeeding Bayesian statisticians, automatically incorporates Occam’s razor by favouring simple theories and punishing complex ones.

This is ten times the posterior probability for the more complex sixty-sided dice. So it is ten times more likely that the number 5 has been thrown by the six-sided rather than the sixty-sided dice. On the grounds of his innovative statistics, Bayes would call out ‘It’s the six-sided dice’ and, in this case, he wins again. Likelihood provides Bayesian statistics with its own built-in razor that automatically favours simpler hypotheses because they have a higher probability of generating the data. Another way of visualising this is to consider the parameter space, which is the range of values that is possible for each model or hypothesis or, equally, the range of observations that each might generate.
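The factor of ten in the dice example can be checked with a short Python sketch. It assumes equal prior probabilities for the two dice, which is what the "ten times" figure implies but which this excerpt does not state outright. The function name `dice_posteriors` is a hypothetical helper.

```python
from fractions import Fraction


def dice_posteriors(roll, sides_options, priors):
    """Posterior probability of each die, given the number that was called out."""
    # A fair die with s sides shows any particular face with probability 1/s.
    likelihoods = [Fraction(1, s) if 1 <= roll <= s else Fraction(0)
                   for s in sides_options]
    joint = [lik * prior for lik, prior in zip(likelihoods, priors)]
    total = sum(joint)
    return [j / total for j in joint]


# Assumption: before the throw, both dice are considered equally likely.
six_sided, sixty_sided = dice_posteriors(
    roll=5, sides_options=[6, 60], priors=[Fraction(1, 2), Fraction(1, 2)]
)
print(six_sided, sixty_sided, six_sided / sixty_sided)
```

With equal priors the posterior ratio equals the likelihood ratio, (1/6) / (1/60) = 10. A call of, say, 30 would rule out the six-sided dice entirely, since its likelihood for that number is zero.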

Super Thinking: The Big Book of Mental Models
by Gabriel Weinberg and Lauren McCann
Published 17 Jun 2019

To do this, Bayesians begin by bringing related evidence to statistical determinations. For example, picking a penny up off the street, you’d probably initially estimate a fifty-fifty chance that it would come up heads if you flipped it, even if you’d never observed a flip of that particular coin before. In Bayesian statistics, you can bring such knowledge of base rates to a problem. In frequentist statistics, you cannot. Many people find this Bayesian way of looking at probability more intuitive because it is similar to how your beliefs naturally evolve. In everyday life, you aren’t starting from scratch every time, as you would in frequentist statistics.

These statistics tell you that if you ran an experiment many times (e.g., the one-hundred-coin-flips example we presented), the confidence intervals calculated should contain the parameter you are studying (e.g., 50 percent probability of getting heads) to the level of confidence specified (e.g., 95 percent of the time). To many people’s dismay, a confidence interval does not say there is a 95 percent chance of the true value of the parameter being in the interval. By contrast, Bayesian statistics analogously produces credible intervals, which do say that; credible intervals specify the current best estimated range for the probability of the parameter. As such, this Bayesian way of doing things is again more intuitive. In practice, though, both approaches yield very similar conclusions, and as more data becomes available, they should converge on the same conclusion.
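The repeated-experiment reading of a confidence interval can be illustrated with a small simulation. This sketch repeats the one-hundred-coin-flips experiment many times and counts how often the interval contains the true value of 50 percent. The normal-approximation (Wald) interval, the number of repetitions, and the random seed are assumptions made for the illustration. The sketch covers only the frequentist interval, not the credible interval.

```python
import math
import random


def wald_interval(heads, n, z=1.96):
    """Approximate 95% confidence interval for the probability of heads."""
    p_hat = heads / n
    margin = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def coverage(true_p=0.5, n=100, experiments=2000, seed=0):
    """Share of repeated experiments whose interval contains the true value."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(experiments):
        heads = sum(rng.random() < true_p for _ in range(n))
        low, high = wald_interval(heads, n)
        if low <= true_p <= high:
            hits += 1
    return hits / experiments


print(f"share of intervals containing the true value: {coverage():.3f}")
```

The share should land close to 95 percent. Because the Wald interval is an approximation, the result can sit slightly below the nominal level. The 95 percent describes the procedure across many experiments, not the chance that any single interval contains the true value.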

For example, Netflix held a contest in 2009 in which crowdsourced researchers beat Netflix’s own recommendation algorithms. Crowdsourcing can help you get a sense of what a wide array of people think about a topic, which can inform your future decision making, updating your prior beliefs (see Bayesian statistics in Chapter 5). It can also help you uncover unknown unknowns and unknown knowns as you get feedback from people with previous experiences you might not have had. In James Surowiecki’s book The Wisdom of Crowds, he examines situations where input from crowds can be particularly effective.

Science Fictions: How Fraud, Bias, Negligence, and Hype Undermine the Search for Truth
by Stuart Ritchie
Published 20 Jul 2020

Doing away with p-values wouldn’t necessarily improve matters; in fact, by introducing another source of subjectivity, it might make the situation a lot worse. With tongue only partly in cheek, John Ioannidis has noted that if we remove all such objective measures we invite a situation where ‘all science will become like nutritional epidemiology’ – a scary prospect indeed. The same criticism is often levelled at the other main alternative to p-values: Bayesian statistics. Drawing on a probability theorem devised by the eighteenth-century statistician Thomas Bayes, this method allows researchers to take the strength of previous evidence – referred to as a ‘prior’ – into account when assessing the significance of new findings. For instance, if someone tells you their weather forecast predicts a rainy day in London in the autumn, it won’t take too much to convince you that they’re right.

However, the Bayesian ‘prior’ is inherently subjective: we can all agree that the Sahara is hot and dry, but how strongly we should believe before a study starts that a particular drug will reduce depression symptoms, or that a specific government policy will boost economic growth, is wholly debatable. Aside from taking prior evidence into account, Bayesian statistics also have other differences from p-values. They’re less affected by sample size, for example: statistical power is not a factor because the Bayesian approach is aimed not at detecting the effect of a particular set of conditions, but simply at weighing up the evidence for and against a hypothesis.

That’s because, fundamentally, users of p-values are interested in frequencies – most notably the frequency with which you’ll find results with p-values below 0.05 if you run your study an infinite number of times and the hypothesis you’re testing isn’t true.

30. A useful annotated reading list that serves as an introduction to Bayesian statistics is given by Etz et al., ‘How to Become a Bayesian in Eight Easy Steps: An Annotated Reading List’, Psychonomic Bulletin & Review 25, no. 1 (Feb. 2018): 219–34; https://doi.org/10.3758/s13423-017-1317-5. See also Richard McElreath, Statistical Rethinking: A Bayesian Course with Examples in R and Stan, Chapman & Hall/CRC Texts in Statistical Science Series 122 (Boca Raton: CRC Press/Taylor & Francis Group, 2016).

pages: 442 words: 94,734

The Art of Statistics: Learning From Data
by David Spiegelhalter
Published 14 Oct 2019

So the product of the likelihood ratio and the prior odds ends up being around 72,000/1,000,000, which are odds of around 7/100, corresponding to a probability of 7/107 or 7% that he is a cheat. So we should give him the benefit of the doubt at this stage, whereas we might not be so generous with someone we had just met in the pub. And perhaps we should keep a careful eye on the Archbishop.

Bayesian Statistical Inference

Bayes’ theorem, even if it is not permitted in UK courts, is the scientifically correct way to change our mind on the basis of new evidence. Expected frequencies make Bayesian analysis reasonably straightforward for simple situations that involve only two hypotheses, say about whether someone does or does not have a disease, or has or has not committed an offence.
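The odds arithmetic at the start of this excerpt can be reproduced with a short Python sketch. The excerpt gives only the product of the likelihood ratio and the prior odds (72,000/1,000,000), not the two factors separately, so the sketch starts from that product. The function name `odds_to_probability` is a hypothetical helper.

```python
from fractions import Fraction


def odds_to_probability(odds):
    """Convert odds in favour of a hypothesis into a probability."""
    return odds / (1 + odds)


# Posterior odds = likelihood ratio x prior odds; only the product is given.
posterior_odds = Fraction(72_000, 1_000_000)
exact = odds_to_probability(posterior_odds)

# The excerpt rounds the odds to 7/100 before converting.
rounded = odds_to_probability(Fraction(7, 100))

print(f"exact: {exact} = {float(exact):.3f}")
print(f"rounded: {rounded} = {float(rounded):.3f}")
```

Both routes give a probability of roughly 7%, matching the figure in the excerpt. For odds this small, the odds and the probability are numerically close.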

Locators in italics refer to figures and tables A A/B tests 107 absolute risk 31–2, 36–7, 383 adjustment 110, 133, 135, 383 adjuvant therapy 181–5, 183–4 agricultural experiments 105–6 AI (artificial intelligence) 144–5, 185–6, 383 alcohol consumption 112–13, 299–300 aleatory uncertainty 240, 306, 383 algorithms – accuracy 163–7 – biases 179 – for classification 143–4, 148 – complex 174–7 – contests 148, 156, 175, 277–8 see also Titanic challenge – meaning of 383 – parameters 171 – performance assessment 156–63, 176, 177 – for prediction 144, 148 – robustness 178 – sensitivity 157 – specificity 157 – and statistical variability 178–9 – transparency 179–81 allocation bias 85 analysis 6–12, 15 apophenia 97, 257 Arbuthnot, John 253–5 Archbishop of Canterbury 322–3 arm-crossing behaviour 259–62, 260, 263, 268–70, 269 artificial intelligence (AI) 144–5, 185–6, 383 ascertainment bias 96, 383 assessment of statistical claims 368–71 associations 109–14, 138 autism 113 averages 46–8, 383 B bacon sandwiches 31–4 bar charts 28, 30 Bayes, Thomas 305 Bayes factors 331–2, 333, 384 Bayes’ Theorem 307, 313, 315–16, 384 Bayesian hypothesis testing 219, 305–38 Bayesian learning 331 Bayesian smoothing 330 Bayesian statistical inference 323–34, 325, 384 beauty 179 bell-shaped curves 85–91, 87 Bem, Daryl 341, 358–9 Bernoulli distribution 237, 384 best-fit lines 125, 393 biases 85, 179 bias/variance trade-off 169–70, 384 big data 145–6, 384 binary data 22, 385 binary variables 27 binomial distribution 230–6, 232, 235, 385 birth weight 85–91 blinding 101, 385 BMI (body mass index) 28 body mass index (BMI) 28 Bonferroni correction 280, 290–1, 385 boosting 172 bootstrapping 195–203, 196, 198, 200, 202, 208, 229–30, 386 bowel cancer 233–6, 235 Box, George 139 box-and-whisker plots 42, 43, 44, 45 Bradford-Hill, Austin 114 Bradford-Hill criteria 114–17 brain tumours 95–6, 135, 301–3 breast cancer screening 214–16, 215 breast cancer surgery 181–5, 183–4 Brier score 164–7, 386 Bristol Royal 
Infirmary 19–21, 56–8 C Cairo, Alberto 25, 65 calibration 161–3, 162, 386 Cambridge University 110, 111 cancer – breast 181–5, 183–4, 214–16, 215 – lung 98, 114, 266 – ovarian 361 – risk of 31–6 carbonated soft drinks 113 Cardiac Surgical Registry (CSR) 20–1 case-control studies 109, 386 categorical variables 27–8, 386 causation 96–9, 114–17, 128 reverse causation 112–15, 404 Central Limit Theorem 199, 238–9, 386–7 chance 218, 226 child heart surgery see heart surgery chi-squared goodness-of-fittest 271, 272, 387 chi-squared test of association 268–70, 387 chocolate 348 classical probability 217 classification 143–4, 148–54 classification trees 154–6, 155, 168, 174, 387 cleromancy 81 clinical trials 82–3, 99–107, 131, 280, 347 clustering 147 cohort studies 109, 387 coins 308, 309 communication 66–9, 353, 354, 364–5 complex algorithms 138–9 complexity parameters 171 computer simulation 205–7, 208 conclusions 15, 22, 347 conditional probability 214–16 confidence intervals 241–4, 243, 248–51, 250, 271–3, 335–6, 387–8 confirmatory studies 350–1, 388 confounders 110, 135, 388 confusion matrixes 157 continuous variables 46, 388 control groups 100, 389 control limits 234, 389 correlation 96–7, 113 count variables 44–6, 389 counterfactuals 97–8, 389 crime 83–5, 321–2 see also homicides Crime Survey for England and Wales 83–5 cross-sectional studies 108–9 cross-validation 170–1, 389 CSR(Cardiac Surgical Registry) 20–1 D Data 7–12, 15, 22 data collection 345 data distribution see sample distribution data ethics 371 data literacy 12, 389 data science 11, 145–6, 389 data summaries 40 data visualization 22, 25, 65–6, 69 data-dredging 12 death 9 see also mortality; murder; survival rates deduction 76 deep learning 147, 389 dependent events 214, 389 dependent variables 60, 125–6, 389 deterministic models 128–9, 138 dice 205–7, 206, 213 differences between groups of numbers 51–6 distribution 43 DNA evidence 216 dogs 179 Doll, Richard 114 doping 310–13, 311–12, 314, 315–16 
dot-diagrams 42, 43, 44, 45 dynamic graphics 71 E Ears 108–9 education 95–6, 106–7, 131, 135, 178–9 election result predictions 372–6, 375 see also opinion polls empirical distribution 197, 404 enumerative probability 217–18 epidemiology 95, 117, 389 epistemic uncertainty 240, 306, 308, 309, 390 error matrixes 157, 158, 390 errors in coding 345–6 ESP (extra-sensory perception) 341, 358–9 ethics 371 eugenics 39 expectation 231, 390 expected frequencies 32, 209–13, 211, 214–16, 215, 390 explanatory variables 126, 132–5 exploratory studies 350, 390 exposures 114, 390 external validity 82–3, 390 extra-sensory perception (ESP) 341, 358–9 F False discovery rate 280, 390 false-positives 278–80, 390 feature engineering 147, 390 Fermat, Pierre de 207 final odds 316 financial crisis of 2007–2008 139–40 financial models 139–40 Fisher, Ronald 258, 265–6, 336, 345 five-sigma results 281–2 forensic epidemiology 117, 391 forensic statistics 6 framing 391 – of numbers 24–5 – of questions 79–80 fraud 347–50 funnel plots 234, 391 G Gallup, George 81 Galton, Francis 39–40, 58, 121–2, 238–9 gambler’s fallacy 237 gambling 205–7, 206, 213 garden of forking paths 350 Gaussian distribution see normal distribution GDP (Gross Domestic Product) 8–9 gender discrimination 110, 111 Gini index 49 Gombaud, Antoine 205–7 Gross Domestic Product (GDP) 8–9 Groucho principle 358 H Happiness 9 HARKing 351–2 hazard ratios 357, 391 health 169–70 heart attacks 99–104 Heart Protection Study (HPS) 100–2, 103, 273–5, 274, 282–7 heart surgery 19–21, 22–4, 23, 56–8, 57, 93, 136–8, 137 heights 122–5, 123, 124, 127, 134, 201, 202, 243, 275–8, 276 hernia surgery 106 HES (Hospital Episode Statistics) 20–1 hierarchical modelling 328, 391 Higgs bosons 281–2 histograms 42, 43, 44, 45 homicides 1–6, 222–6, 225, 248, 270–1, 272, 287–94 Hospital Episode Statistics (HES) 20–1 hospitals 19–21, 25–7, 26, 56–61, 138 house prices 48, 112–14 HPS (Heart Protection Study) 100–2, 103, 273–5, 274, 282–7 hypergeometric 
distribution 264, 391 hypotheses 256–7 hypothesis testing 253–303, 336, 392 see also Neyman-Pearson Theory; null hypothesis significance testing; P-values I IARC (International Agency for Research in Cancer) 31 icon arrays 32–4, 33, 392 income 47–8 independent events 214, 392 independent variables 60, 126, 392 induction 76–7, 392 inductive behaviour 283 inductive inference 76–83, 78, 239, 392 infographics 69, 70 insurance 180 ‘intention to treat’ principle 100–1, 392 interactions 172, 392 internal validity 80–1, 392 International Agency for Research in Cancer (IARC) 31 inter-quartile range (IQR) 51, 89, 392 IQ 349 IQR (inter-quartile range) 49, 51, 89, 392 J Jelly beans in a jar 40–6, 48, 49, 50 K Kaggle contests 148, 156, 175, 277–8 see also Titanic challenge k-nearest neighbors algorithm 175 L LASSO 172–4 Law of Large Numbers 237, 393 law of the transposed conditional 216, 313 league tables 25, 130–1 see also tables least-squares regression lines 124, 125, 393 left-handedness 113–14, 229–33, 232 legal cases 313, 321, 331–2 likelihood 327, 336, 394 likelihood ratios 314–23, 319–20, 332, 394 line graphs 4, 5 linear models 132, 138 literal populations 91–2 logarithmic scale 44, 45, 394 logistic regression 136, 172, 173, 394 London Underground 24 loneliness 80 long-run frequency probability 218 look elsewhere effect 282 lung cancer 98, 114, 266 lurking factors 113, 135, 394–5 M Machine learning 139, 144–5, 395 mammography 214–16, 215 margins of error 189, 199, 200, 244–8, 395 mean average 46–8 mean squared error (MSE) 163–4, 165, 395 measurement 77–9 meat 31–4 media 356–8 median average 46, 47–8, 51, 89, 395 Méré, Chevalier de 205–7, 213 meta-analysis 102, 104, 395 metaphorical populations 92–3 mode 46, 48, 395 mortality 47, 113–14 MRP (multilevel regression and post-stratification) 329, 396 MSE (mean squared error) 163–4, 165, 395 mu 190 multilevel regression and post-stratification (MRP) 329, 396 multiple linear regression 132–3, 134 multiple regression 135, 136, 
396 multiple testing 278–80, 290, 396 murders 1–6, 222–6, 225, 248, 270–1, 287–94 N Names, popularity of 66, 67 National Sexual Attitudes and Lifestyle Survey (Natsal) 52, 69, 70, 73–5 natural variability 226 neural networks 174 Neyman, Jerzy 242, 283, 335–6 Neyman-Pearson Theory 282–7, 336–7 NHST (null hypothesis significance testing) 266–71, 294–7, 296 non-significant results 299, 346–7, 370 normal distribution 85–91, 87, 226, 237–9, 396–7 null hypotheses 257–65, 336, 397 null hypothesis significance testing (NHST) 266–71, 294–7, 296 O Objective priors 327 observational data 108, 114–17, 128 odds 34, 314, 316 odds ratios 34–6 one-sided tests 264, 397–8 one-tailed P-values 264, 398 opinion polls 82, 245–7, 246, 328–9 see also election result predictions ovarian cancer 361 over-fitting 167–71, 168 P P-hacking 351 P-values 264–5, 283, 285, 294–303, 336, 401 parameters 88, 240, 398 Pascal, Blaise 207 patterns 146–7 Pearson, Egon 242, 283, 336 Pearson, Karl 58 Pearson correlation coefficient 58, 59, 96–7, 126, 398 percentiles 48, 89, 398–9 performance assessment of algorithms 156–67, 176, 177 permutation tests 261–4, 263, 399 personal probability 218–19 pie charts 28, 29 placebo effect 131 placebos 100, 101, 399 planning 13–15, 344–5 Poisson distribution 223–4, 225, 270–1, 399 poker 322–3 policing 107 popes 114 population distribution 86–91, 195, 399 population growth 61–6, 62–4 population mean 190–1, 395 see also expectation populations 74–5, 80–93, 399 posterior distributions 327, 400 power of a test 285–6, 400 PPDAC (Problem, Plan, Data, Analysis, Conclusion) problem-solving cycle 13–15, 14, 108–9, 148–54, 344–8, 372–6, 400 practical significance 302, 400 prayer 107 precognition 341, 358–9 Predict 2.1 182 prediction 144, 148–54 predictive analytics 144, 400 predictor variables 392 pre-election polls see opinion polls presentation 22–7 press offices 355–6 priming 80 prior distributions 327, 400 prior odds 316 probabilistic forecasts 161, 400 probabilities, accuracy 
163–7 probability 10 meaning of 216–22, 400–1 rules of 210–13 and uncertainty 306–7 probability distribution 90, 401 probability theory 205–27, 268–71 probability trees 210–13, 212 probation decisions 180 Problem, Plan, Data, Analysis, Conclusion (PPDAC) problem-solving cycle 13–15, 14, 108–9, 148–54, 344–8, 372–6, 400 problems 13 processed meat 31–4 propensity 218 proportions, comparisons 28–37, 33, 35 prosecutor’s fallacy 216, 313 prospective cohort studies 109, 401 pseudo-random-number generators 219 publication bias 367–8 publication of findings 355 Q QRPs (questionable research practices) 350–3 quartiles 89, 402 questionable research practices (QRPs) 350–3 Quetelet, Adolphe 226 R Race 179 random forests 174 random match probability 321, 402 random observations 219 random sampling 81–2, 208, 220–2 random variables 221, 229, 402 randomization 108, 266 randomization tests 261–4, 263, 399 randomized controlled trials (RCTs) 100–2, 105–7, 114, 135, 402 randomizing devices 219, 220–1 range 49, 402 rate ratios 357, 402 Receiver Operating Characteristic (ROC) curves 157–60, 160, 402 recidivism algorithms 179–80 regression 121–40 regression analysis 125–8, 127 regression coefficients 126, 133, 403 regression modelling strategies 138–40 regression models 171–4 regression to the mean 125, 129–32, 403 regularization 170 relative risk 31, 403 reliability of data 77–9 replication crisis in science 11–12 representative sampling 82 reproducibility crisis 11–12, 297, 342–7, 403 researcher degrees of freedom 350–1 residual errors 129, 403 residuals 122–5, 403 response variables 126, 135–8 retrospective cohort studies 109, 403 reverse causation 112–15, 404 Richard III 316–21 risk, expression of 34 robust measures 51 ROC (Receiver Operating Characteristic) curves 157–60, 160, 402 Rosling, Hans 71 Royal Statistical Society 68, 79 rules for effective statistical practice 379–80 Ryanair 79 S Salmon 279 sample distribution 43 sample mean 190–1, 395 sample size 191, 192–5, 193–4, 
283–7 sampling 81–2, 93 sampling distributions 197, 404 scatter-plots 2–4, 3 scientific research 11–12 selective reporting 12, 347 sensitivity 157–60, 404 sentencing 180 Sequential Probability Ratio Test (SPRT) 292, 293 sequential testing 291–2, 404 sex ratio 253–5, 254, 261, 265 sexual partners 47, 51–6, 53, 55, 73–5, 191–201, 193–4, 196, 198, 200 Shipman, Harold 1–6, 287–94, 289, 293 shoe sizes 49 shrinkage 327, 404 sigma 190, 281–2 signal and the noise 129, 404 significance testing see null hypothesis significance testing Silver, Nate 27 Simonsohn, Uri 349–52, 366 Simpson’s Paradox 111, 112, 405 size of a test 285–6, 405 skewed distribution 43, 405 smoking 98, 114, 266 social acceptability bias 74 social physics 226 Somerton, Francis see Titanic challenge sortilege 81 sortition 81 Spearman’s rank correlation 58–60, 405 specificity 157–9, 405 speed cameras 130, 131–2 speed of light 247 sports doping 310–13, 311–12, 314, 315–16 sports teams 130–1 spread 49–51 SPRT (Sequential Probability Ratio Test) 292, 293 standard deviation 49, 88, 126, 405 standard error 231, 405–6 statins 36–7, 99–104, 273–5, 274, 282–7 statistical analysis 6–12, 15 statistical inference 208, 219, 229–51, 305–38, 323–8, 335, 404 statistical methods 12, 346–7, 379 statistical models 121, 128–9, 404 statistical practice 365–7 statistical science 2, 7, 404 statistical significance 255, 265–8, 270–82, 404 Statistical Society 68 statistics – assessment of claims 368–71 – as a discipline 10–11 – ideology 334–8 – improvements 362–4 – meaning of 404 – publications 16 – rules for effective practice 379–80 – teaching of 13–15 STEP (Study of the Therapeutic Effects of Intercessory Prayer) 107 storytelling 69–71 stratification 110, 383 Streptomycin clinical trial 105, 114 strip-charts 42, 43, 44, 45 strokes 99–104 Student’s t-statistic 275–7 Study of the Therapeutic Effects of Intercessory Prayer (STEP) 107 subjective probability 218–19 summaries 40, 49, 50, 51 supermarkets 112–14 supervised learning 
143–4, 404 support-vector machines 174 surgery – breast cancer surgery 181–5, 183–4 – heart surgery 19–21, 22–4, 23, 56–8, 57, 93, 136–8, 137 – hernia surgery 106 survival rates 25–7, 26, 56–61, 57, 60–1 systematic reviews 102–4 T T-statistic 275–7, 404 tables 22–7, 23 tail-area 231 tea tasting 266 teachers 178–9 teaching of statistics 13–15 technology 1 telephone polls 82 Titanic challenge 148–56, 150, 152–3, 155, 162, 166–7, 172, 173, 175, 176, 177, 277 transposed conditionals, law of 216, 313 trees 7–8 trends 61–6, 62–4, 67 two-sided tests 265, 397–8 two-tailed P-values 265, 398 Type I errors 283–5, 404 Type II errors 283–5, 407 U Uncertainty 208, 240, 306–7, 383, 390 uncertainty intervals 199, 200, 241, 335 unemployment 8–9, 189–91, 271–3 university education 95–6, 135, 301–3 see also Cambridge University unsupervised learning 147, 407 US Presidents 167–9 V Vaccination 113 validity of data 79–83 variability 10, 49–51, 178–9, 407 variables 27, 56–61 variance 49, 407 Vietnam War draft lottery 81–2 violence 113 virtual populations 92 volunteer bias 85 voting age 79–80 W Waitrose 112–14 weather forecasts 161, 164, 165 weight loss 348 ‘When I’m Sixty-Four’ 351–2 wisdom of crowds 39–40, 48, 51, 407 Z Z-scores 89, 407

CHAPTER 9: Putting Probability and Statistics Together
1 To derive this distribution, we could calculate the probability of two left-handers as 0.2 × 0.2 = 0.04, the probability of two right-handers as 0.8 × 0.8 = 0.64, and so the probability of one of each must be 1 − 0.04 − 0.64 = 0.32.
2 There are important exceptions to this – some distributions have such long, ‘heavy’ tails that their expectations and standard deviations do not exist, and so averages have nothing to converge to.
3 If we can assume that all our observations are independent and come from the same population distribution, the standard error of their average is just the standard deviation of the population distribution divided by the square root of the sample size.
4 We shall see in Chapter 12 that practitioners of Bayesian statistics are happy using probabilities for epistemic uncertainty about parameters.
5 Strictly speaking, a 95% confidence interval does not mean there is a 95% probability that this particular interval contains the true value, although in practice people often give this incorrect interpretation.
6 Both of whom I had the pleasure of knowing in their more advanced years.
7 More precisely, 95% confidence intervals are often set as plus or minus 1.96 standard errors, based on assuming a precise normal sampling distribution for the statistic.
8 With 1,000 participants, the margin of error (in %) is at most ±100/√1,000 ≈ 3%.
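The arithmetic in notes 1, 3, 7 and 8 can be checked with a short Python sketch. The function names are illustrative and do not come from the book. The 0.2 chance of left-handedness and the sample of 1,000 are taken from the notes; everything else is an assumption made for the example.

```python
import math

def pair_distribution(p_left):
    """Note 1: chances of 0, 1 or 2 left-handers among two independent people."""
    two_left = p_left * p_left
    two_right = (1 - p_left) * (1 - p_left)
    one_each = 1 - two_left - two_right
    return {0: two_right, 1: one_each, 2: two_left}

def standard_error(population_sd, n):
    """Note 3: standard error of the average of n independent observations."""
    return population_sd / math.sqrt(n)

def confidence_interval_95(estimate, se):
    """Note 7: estimate plus or minus 1.96 standard errors (normal assumption)."""
    half_width = 1.96 * se
    return (estimate - half_width, estimate + half_width)

def max_margin_of_error_percent(n):
    """Note 8: upper bound on a poll's margin of error, in percentage points."""
    return 100 / math.sqrt(n)

dist = pair_distribution(0.2)
print(dist)                               # probabilities for 0, 1, 2 left-handers
print(max_margin_of_error_percent(1000))  # a little over 3 percentage points
```

The bound in note 8 is the worst case for a proportion: the standard error of a sample proportion is largest when the true proportion is 50%, and roughly two such standard errors give 100/√n percentage points.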

The Ethical Algorithm: The Science of Socially Aware Algorithm Design
by Michael Kearns and Aaron Roth
Published 3 Oct 2019

Differential privacy can also be interpreted as a promise that no outside observer can learn very much about any individual because of that person’s specific data, while still allowing observers to change their beliefs about particular individuals as a result of learning general facts about the world, such as that smoking and lung cancer are correlated. To clarify this, we need to think for a moment about how learning (machine or otherwise) works. The framework of Bayesian statistics provides a mathematical formalization of learning. A learner starts out with some set of initial beliefs about the world. Whenever he observes something, he changes his beliefs about the world. After he updates his beliefs, he now has a new set of beliefs about the world (his posterior beliefs).

See also p-hacking advantages of machine learning, 190–93 advertising, 191–92 Afghanistan, 50–51 age data, 27–29, 65–66, 86–89 aggregate data, 2, 30–34, 50–51 AI labs, 145–46 alcohol use data, 51–52 algebraic equations, 37 algorithmic game theory, 100–101 Amazon, 60–61, 116–17, 121, 123, 125 analogies, 57–63 anonymization of data “de-anonymizing,” 2–3, 14–15, 23, 25–26 reidentification of anonymous data, 22–31, 33–34, 38 shortcomings of anonymization methods, 23–29 and weaknesses of aggregate data, 31–32 Apple, 47–50 arbitrary harms, 38 Archimedes, 160–62 arms races, 180–81 arrest data, 92 artificial intelligence (AI), 13, 176–77, 179–82 Atari video games, 132 automation, 174–78, 180 availability of data, 1–3, 51, 66–67 averages, 40, 44–45 backgammon, 131 backpropagation algorithm, 9–10, 78–79, 145–46 “bad equilibria,” 95, 97, 136 Baidu, 148–51, 166, 185 bans on data uses, 39 Bayesian statistics, 38–39, 173 behavioral data, 123 benchmark datasets, 136 Bengio, Yoshua, 133 biases and algorithmic fairness, 57–63 and data collection, 90–93 and word embedding, 58–63, 77–78 birth date information, 23 bitcoin, 183–84 blood-type compatibility, 130 board games, 131–32 Bonferroni correction, 149–51, 153, 156, 164 book recommendation algorithms, 117–21 Bork, Robert, 24 bottlenecks, 107 breaches of data, 32 British Doctors Study, 34–36, 39, 51 brute force tasks, 183–84, 186 Cambridge University, 51–52 Central Intelligence Agency (CIA), 49–50 centralized differential privacy, 46–47 chain reaction intelligence growth, 185 cheating, 115, 148, 166 choice, 101–3 Chrome browser, 47–48, 195 classification of data, 146–48, 152–55 cloud computing, 121–23 Coase, Ronald, 159 Coffee Meets Bagel (dating app), 94–97, 100–101 coin flips, 42–43, 46–47 Cold War, 100 collaborative filtering, 23–24, 116–18, 123–25 collective behavioral data, 105–6, 109, 123–24 collective good, 112 collective language, 64 collective overfitting, 136.

Hands-On Machine Learning With Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems
by Aurelien Geron
Published 14 Aug 2019

[Equation: Bayes’ theorem] Unfortunately, in a Gaussian mixture model (and many other problems), the denominator p(X) is intractable, as it requires integrating over all the possible values of z (Equation 9-3). This means considering all possible combinations of cluster parameters and cluster assignments. [Equation 9-3: The evidence p(X) is often intractable] This is one of the central problems in Bayesian statistics, and there are several approaches to solving it. One of them is variational inference, which picks a family of distributions q(z; λ) with its own variational parameters λ (lambda), then optimizes these parameters to make q(z) a good approximation of p(z|X). This is achieved by finding the value of λ that minimizes the KL divergence from q(z) to p(z|X), denoted D_KL(q‖p).
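To make the quantity being minimized concrete, here is a minimal sketch of D_KL(q‖p) for discrete distributions. The three-state target and the two candidate approximations are invented numbers, not taken from the book. In real variational inference p(z|X) is not available in closed form, so the divergence is not evaluated directly like this; the known target here only serves to show what the measure rewards.

```python
import math

def kl_divergence(q, p):
    """D_KL(q || p) = sum over z of q(z) * log(q(z) / p(z)), discrete case.

    Terms with q(z) == 0 contribute nothing; p(z) must be positive wherever
    q(z) is positive.
    """
    return sum(qz * math.log(qz / pz) for qz, pz in zip(q, p) if qz > 0)

# Hypothetical target posterior p(z|X) over three latent states.
p_posterior = [0.7, 0.2, 0.1]

# Two hypothetical members of the approximating family q.
candidates = {
    "q_a": [0.6, 0.3, 0.1],
    "q_b": [0.34, 0.33, 0.33],
}

# Pick the candidate closest to the target in the KL sense.
best = min(candidates, key=lambda name: kl_divergence(candidates[name], p_posterior))
print(best, kl_divergence(candidates[best], p_posterior))
```

The divergence is zero only when q matches p exactly and positive otherwise, which is why driving it down makes q a better stand-in for the posterior.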

A simpler approach to maximizing the ELBO is called black box stochastic variational inference (BBSVI): at each iteration, a few samples are drawn from q and they are used to estimate the gradients of the ELBO with regard to the variational parameters λ, which are then used in a gradient ascent step. This approach makes it possible to use Bayesian inference with any kind of model (provided it is differentiable), even deep neural networks: this is called Bayesian deep learning. Tip: If you want to dive deeper into Bayesian statistics, check out the Bayesian Data Analysis book by Andrew Gelman, John Carlin, Hal Stern, David Dunson, Aki Vehtari, and Donald Rubin. Gaussian mixture models work great on clusters with ellipsoidal shapes, but if you try to fit a dataset with different shapes, you may get unpleasant surprises. For example, let’s see what happens if we use a Bayesian Gaussian mixture model to cluster the moons dataset (see Figure 9-24). [Figure 9-24] Oops, the algorithm desperately searched for ellipsoids, so it found 8 different clusters instead of 2.

pages: 294 words: 81,292

Our Final Invention: Artificial Intelligence and the End of the Human Era
by James Barrat
Published 30 Sep 2013

He was not in his office but at home, perhaps calculating the probability of God’s existence. According to Dr. Holtzman, sometime before he died, Good updated that probability from zero to point one. He did this because as a statistician, he was a long-term Bayesian. Named for the eighteenth-century mathematician and minister Thomas Bayes, Bayesian statistics’ main idea is that in calculating the probability of some statement, you can start with a personal belief. Then you update that belief as new evidence comes in that supports your statement or doesn’t. If Good’s original disbelief in God had remained 100 percent, no amount of data, not even God’s appearance, could change his mind.
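The point about a belief of exactly zero can be illustrated with Bayes' rule directly. This is a minimal sketch: the likelihood values (evidence being nine times more probable if the statement is true) are hypothetical and chosen only to make the arithmetic easy to follow. Both likelihoods are assumed to be nonzero.

```python
def update(prior, likelihood_if_true, likelihood_if_false):
    """One application of Bayes' rule for a single yes/no statement."""
    numerator = prior * likelihood_if_true
    evidence = numerator + (1 - prior) * likelihood_if_false
    return numerator / evidence

def update_repeatedly(prior, likelihood_if_true, likelihood_if_false, times):
    """Feed the same kind of supporting evidence in several times over."""
    belief = prior
    for _ in range(times):
        belief = update(belief, likelihood_if_true, likelihood_if_false)
    return belief

# A prior of exactly zero never moves, however much evidence arrives.
print(update_repeatedly(0.0, 0.9, 0.1, times=100))

# A small but nonzero prior climbs quickly under the same evidence.
print(update_repeatedly(0.1, 0.9, 0.1, times=5))
```

The zero prior multiplies every likelihood by zero, so the posterior stays at zero; any starting belief above zero leaves room for the evidence to accumulate.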

Aboujaoude, Elias accidents AI and, see risks of artificial intelligence nuclear power plant Adaptive AI affinity analysis agent-based financial modeling “Age of Robots, The” (Moravec) Age of Spiritual Machines, The: When Computers Exceed Human Intelligence (Kurzweil) AGI, see artificial general intelligence AI, see artificial intelligence AI-Box Experiment airplane disasters Alexander, Hugh Alexander, Keith Allen, Paul Allen, Robbie Allen, Woody AM (Automatic Mathematician) Amazon Anissimov, Michael anthropomorphism apoptotic systems Apple iPad iPhone Siri Arecibo message Aristotle artificial general intelligence (AGI; human-level AI): body needed for definition of emerging from financial markets first-mover advantage in jump to ASI from; see also intelligence explosion by mind-uploading by reverse engineering human brain time and funds required to develop Turing test for artificial intelligence (AI): black box tools in definition of drives in, see drives as dual use technology emotional qualities in as entertainment examples of explosive, see intelligence explosion friendly, see Friendly AI funding for jump to AGI from Joy on risks of, see risks of artificial intelligence Singularity and, see Singularity tight coupling in utility function of virtual environments for artificial neural networks (ANNs) artificial superintelligence (ASI) anthropomorphizing gradualist view of dealing with jump from AGI to; see also intelligence explosion morality of nanotechnology and runaway Artilect War, The (de Garis) ASI, see artificial superintelligence Asilomar Guidelines ASIMO Asimov, Isaac: Three Laws of Robotics of Zeroth Law of Association for the Advancement of Artificial Intelligence (AAAI) asteroids Atkins, Brian and Sabine Automated Insights availability bias Banks, David L. Bayes, Thomas Bayesian statistics Biden, Joe biotechnology black box systems Blue Brain project Bok globules Borg, Scott Bostrom, Nick botnets Bowden, B. V. 
brain augmentation of, see intelligence augmentation basal ganglia in cerebral cortex in neurons in reverse engineering of synapses in uploading into computer Brautigan, Richard Brazil Brooks, Rodney Busy Child scenario Butler, Samuel CALO (Cognitive Assistant that Learns and Organizes) Carr, Nicholas cave diving Center for Applied Rationality (CFAR) Chandrashekar, Ashok chatbots chess-playing computers Deep Blue China Chinese Room Argument Cho, Seung-Hui Church, Alonso Churchill, Winston Church-Turing hypothesis Clarke, Arthur C.

pages: 283 words: 81,376

The Doomsday Calculation: How an Equation That Predicts the Future Is Transforming Everything We Know About Life and the Universe
by William Poundstone
Published 3 Jun 2019

The Carter-Leslie argument promises to be more customizable, more suited to those who like to tinker. Gott’s 1993 article does not mention Bayes’s theorem or prior probabilities. For some Nature readers that was a great sin. I asked Gott why he omitted Bayes, and he had a quick answer: “Bayesians.” “I didn’t put any Bayesian statistics in this paper because I didn’t want to muddy the waters,” he explained. “Because Bayesian people will argue about their priors, endlessly. I had a falsifiable hypothesis.” The long-standing complaint is that prior probabilities are subjective. A Bayesian prediction can be a case of garbage in, garbage out.

“incredibly irresponsible”; “Anybody can see it’s garbage”: Caves interview, December 12, 2017. 2. “Gott dismisses the entire process”: Caves 2000, 2. 3. “it was important to find”: Caves 2000, 2. 4. “a notarized list of…24 dogs”: Caves 2000, 15. 5. “Gott is on record as applying”: Caves 2008, 2. 6. “We can distinguish two forms”: Bostrom 2002, 89. 7. “I didn’t put any Bayesian statistics”: Gott interview, July 31, 2017. 8. “When you can’t identify any time scales”: Caves 2008, 11. 9. “No other formula in the alchemy of logic”: Keynes 1921, 89. 10. Goodman’s objection to Gott: Goodman 1994. 11. Jeffreys prior compatible with location-and scale-invariance: This fact was demonstrated not by Jeffreys but by Washington University physicist E.

Dinosaurs Rediscovered
by Michael J. Benton
Published 14 Sep 2019

We then ran calculations to work out whether speciation and extinction rates were stable, rising, or falling through the Mesozoic. We were looking for one of three possible outcomes: that overall the balance of speciation and extinction gave ever-rising values, or levelling off, or declining values. We used Bayesian statistical methods, which involve seeding the calculations with a starting model, and then running the data millions or billions of times to assess how well the starting model fits the data, allowing for every possible source of uncertainty, and repeatedly adjusting the model to make it fit better. In this case, Manabu modelled uncertainty about dating the rocks, gaps in the record, accuracy of the phylogenetic tree, and many other issues.

McNeill 215, 216, 218, 228–29, 234, 252 Allen, Percy 73 alligators 118, 164–65, 194 Allosaurus 49, 121, 188 animated skin of 250 diet 206 fact file 188–89 feeding mechanisms 186–88, 190–91, 193, 193 medullary bone 145 Morrison Formation 69, 71 movement 248 skulls 17–18, X teeth and bite force 188, 189, 192, 196 Alvarez, Luis 259–62, 260, 264, 267, 285, 286 Alvarez, Walter 259, 260, 261–62, 264 amber dinosaurs preserved in 131–32, VI extracting DNA from fossils in 136, 137 American Museum of Natural History (AMNH) 54, 156, 166, 243 American National Science Foundation 52 Amherst College Museum, Connecticut 223, 224–25, 227 Amphicoelias 206 analogues, modern 16 Anatosaurus 221, 221 Anchiornis 68–69, 70, V fact file 70 feathers 125, 126 flight 245 footprints 224–25, 225 angiosperms 78–79 animation 249–52, 251 Ankylosaurus 65, 79, 272 extinction 276 fact file 272–73 Hell Creek Formation 270 use of arms and legs 236 Anning, Mary 195 apatite 142 Apatosaurus 206 Archaeopteryx 110, 112, IV as ‘missing link’ fossil 114, 121 fact file 112–13 flight 114, 124, 247 Richard Owen and 111, 114 skeleton found at Solnhofen 111, 277 archosauromorphs 35–36, 37 archosaurs 16, 21–22, 35, 39, 56 Armadillosuchus 201 Asaro, Frank 259 Asilisaurus 32–33 asteroid impact 254–69, 275–76, 280, 281, 286–87, XIX Attenborough, David 98, 213 B Bakker, Bob 109–10, 115, 126 asteroid impact and extinction 262 Deinonychus 110, 111, 221, 244–45 dinosaurs as warm-blooded creatures 109, 116, 117 modern birds as dinosaurs 110 speed of dinosaurs 230 validity of Owen’s Dinosauria 57, 59 Baron, Matt 80–83 Barosaurus 206 Barreirosuchus 201 Barrett, Paul 80–83 Baryonyx 193 Bates, Karl 192 Bayesian statistical methods 273, 275 BBC Horizon 229, 264–65 Walking with Dinosaurs 249–52, 251 beetles 78, 139, 204 Beloc, Haiti 265–66, 265 Bernard Price Palaeontological Institute 160, 163 Bernardi, Massimo 43, 46 biodiversity, documenting 52 bioinformatics 52 bipedal dinosaurs arms and legs 235–40 early images of 219–21 
movement and posture 221–22, 222, 249 speed 228 Bird, Roland T. 242–43 birds 145 brains 129 breathing 118 eggs 155, 158, 159, 166 evolution of 277, 278–79, 279–81, 280 feathers 125–26, 127 flight 244, 247, 248 gastroliths 194 growth 174 identifying ancestral genetic sequences 151–52 intelligence 128 as living dinosaurs 110–15, 118, 120–21, 124, 132 and the mass extinction 277–81 medullary bone 143, 145 Mesozoic birds from China 118–24 movement 234 sexual selection 126 using feet to hold prey down 235, 235 bite force 191–94 blood, identifying dinosaur 141–43 Bonaparte, José 239 bones 99 age of 155 bone histology 116–18, 119 bone remodelling 116–17 casting 100 composition 142 excavating from rock 87–99, 105 extracting blood from 141–42 first found 65 first illustrated 65 growth lines 116, 117, 154–55, 170, 172–73, 184 how dinosaurs’ jaws worked 186 mapping 93–94 reconstructing 99–101 structures 170, XIII Brachiosaurus 49, 69, 178–79 diet 206, 207–8 fact file 178–79 Morrison Formation 69 size 175 bracketing 15–17 brain size 128–30, XI, XII breakpoint analysis 42, 43 breathing 118 Bristol City Museum 104 Bristol Dinosaur Project 101–4 British Museum, London 111, 114 Brontosaurus 69, 225 Brookes, Richard 65 Brown, Barnum 273 Brusatte, Steve 32, 36–37, 39 bubble plots 42, 43 Buckland, William 67, 195 Buckley, Michael 142 Burroughs, Edgar Rice, The Land that Time Forgot 134 Butler, Richard 32 Button, David 208, 213 C Camarasaurus 175, 206, 208–9, 209, 213, IX Cano, Raúl 136 Carcharodontosaurus 196 Carnegie, Andrew 211 Carnian Pluvial Episode 40, 42, 43, 45, 46, 50 carnivores 201 see also individual dinosaurs Carnotaurus 201, 238, 239, 240 fact file 239 carotenoids 124 cartilage 142 Caudipteryx 121, 123 fact file 123 Centrosaurus 87, 88 fact file 88–89 ceratopsians 79, 143, 156 diversity of 272, 275 use of arms and legs 236 Ceratosaurus 69, 71, 187, 206 Cetiosaurus 57, 66 Chapman Andrews, Roy 156, 166 Charig, Alan 22–23, 34, 39 Chasmosaurus 87 Chen, Pei-ji 121 Chicxulub 
crater, Mexico 264–68, 267, 285, 286 Chin, Karen 195, 204 China Jurassic dinosaurs 68–71 Mesozoic birds from 118–24 Chinsamy-Turan, Anusuya 145 chitin 139 chromosomes 151–52 Chukar partridges 248 clades 55, 82, 110 cladistics 53–55, 82–83 cladograms 55, 56 Clashach, Scotland 85, 86 classic model 21, 21 classification, evolutionary trees 52–84, 60–61 climate climate change 22, 40, 41, 43 Cretaceous 269 identifying ancient 46–47 Late Triassic 40, 41, 43, 49 Triassic Period 48, 49 cloning 134–35, 137, 148–51, 150 Coelophysis 193, 236, I, X Colbert, Ned 22, 23, 34 Romer-Colbert ecological relay model 22, 35, 36, 39–40 size and core temperature 118 cold-blooded animals 116 collagen 142, 143 colour of dinosaurs 124–25 of feathers 8–10, 17, 139, V computational methods 35–39 Conan Doyle, Sir Arthur, The Lost World 133–34, 133, 135 Confuciusornis 144, 145, 147, XIII fact file 146–47 conifers 22, 131, 197, III Connecticut Valley 223–26, 224–25, 227, 243 contamination of DNA 138 continental plates 47 Cope, Edward 208 coprolites 195, 195, 197, 204 coprophagy 204 crests 126, 128, 143 Cretaceous 50, 71–75 birds 277–78 climate 269 decline of dinosaurs 274, 275 dinosaur evolution rates 77 ecosystems 205 in North America 240–42 ornithopods 71 sauropods 71 see also Early Cretaceous; Late Cretaceous Cretaceous–Palaeogene boundary 260, 261–62, 265–66, 269 evolution of birds 276, 277, 278–79 Cretaceous Terrestrial Revolution 77–80, 131 Crichton, Michael, Jurassic Park 134–35, 136 criticism and scientific method 287–88 crocodiles 218 Adamantina Formation food web 201–3 eggs and babies 155, 159, 164, 165 feeding methods 194 function of the snout 193 crurotarsans 39 CT (computerized tomographic) scanning 97, 99 dinosaur embryos 160, 162 dinosaur skulls 163, 191 Currie, Phil 86, 91, 121 Cuvier, Georges 257 D Dal Corso, Jacopo 40 Daohugou Bed, China 68 Darwin, Charles 23, 107, 114, 132, 287 Daspletosaurus 170, 171 dating dinosaurian diversification 44–46 de-extinction science 149, 151 
death of dinosaurs see extinction Deccan Traps 268, 285, 287 Deinonychus 112, 114, 121 fact file 112–13 John Ostrom’s monograph on 110, 111, 113, 116, 244–45 movement 221 dentine 196, 197 Dial, Ken 248 diet collapsing food webs 204–5 dinosaur food webs 201–4 fossil evidence for 194–95 microwear on teeth and diet 199–201 niche division and specialization in 205–13 digital models 17, 18, 19, 191–94, 231–34, 249, 252 dimorphism, sexual 126, 143 dinomania 107 Dinosaur Park Formation, Drumheller 86, 91–99, 100 Dinosaur Provincial Park, Alberta 86, 87, 91–92, 91 Dinosaur Ridge, Colorado 240 Dinosauria 33, 55, 82, 107 discovery of the clade 57–59 Diplodocus 175, 210–11, II diet 207, 208–9, 213 fact file 210–11 Morrison Formation 69 skulls IX teeth and bite force 209, 213 diversification of dinosaurs 29, 44–46 DNA (deoxyribonucleic acid) 134–35 cloning 148–51 dinosaurian genome 151–52 extracting from fossils in amber 136 extracting from museum skins and skeletons 138 identifying dinosaur 136–37 survival of in fossils 138–39, 141 Doda, Bajazid 180 Dolly the sheep 148, 149 Dromaeosaurus 87, 121 duck-billed dinosaurs see hadrosaurs dung beetles 204 dwarf dinosaurs 180–84 Dysalotosaurus 145 Dzik, Jerzy 29, 31 E Early Cretaceous diversity of species on land and in sea 78 Jehol Beds 124 Wealden 72–74, 74, 75, 78 ecological relay model 21, 22, 35, 36, 39 ecology, and the origin of dinosaurs 23–25 education, using dinosaurs in 101–4 eggs, birds 155, 158, 159, 166 eggs, dinosaur 154, 155–56 dinosaur embryos 160–63 nests and parental care 163–67 size of 158–59 El Kef, Tunisia 276 Elgin, Scotland 25–26, 26, 34, 85–86 embryos, dinosaur 154, 160–63 enamel, tooth 196, 197 enantiornithines 277–78 encephalization quotient (EQ) 130 engineering models 17–18 Eoraptor 29 Erickson, Greg 154–55, 170, 172–73, 184–85, 197 eumelanin 124 eumelanosomes V Euoplocephalus 87, 88 fact file 88–89 Europasaurus 117 European Synchrotron Radiation Facility (ESRF) 162 evolution 13, 23, 40 evolutionary trees 
52–84, 60–61, 281 Richard Owen’s views on 106–7, 114 size and 181, 184 Evolution (journal) 109 excavations 87–99 Dinosaur Park Formation 86, 91–99, 100 recording 92–97 extant phylogenetic bracket 16, 217 external fundamental system (EFS) 170 extinction Carnian Pluvial Episode 40, 42, 43, 45, 46, 50 end-Triassic event 64 mass extinction 254–85 Permian–Triassic mass extinction 14, 33–34, 46, 222 sudden or gradual 270–75 eyes 100 F faeces, fossil 194, 195, 197, 204 Falkingham, Peter 192, 226 feathers 99, 245 in amber 131, VI bird feathers 125–26, 127 colour of 8–10, 17, 139, V as insulation 126 melanosomes 8–10, 8, 17, 124–25, 132, V sexual signalling 126, 128, 143 Sinosauropteryx 8–9, 8, 10, 17, 119, 120–21, 125, 126 Field, Dan 279, 281 films, dinosaurs in 249–52 Jurassic Park 134–35, 136, 217, 252 finding dinosaurs 87–105 finite element analysis (FEA) 18, 190–91, 199, 208 fishes 128, 159, 163–64, 196 flight 244–49 flowering plants 78–79, III food webs 71–75, 201–4 Adamantina Formation 201–4, 202–3 collapsing 204–5 Wealden 74, 75 footprints 223–27, 240 megatracksites 242 photogrammetry 94 swimming tracks 242, 243 fossils casting 100 extracting skeletons from 94–99, 105 plants 269 reconstructing 99–101 scanning 97, 99 survival of organic molecules in 138–39, 141 Framestore 249–50 Froude, William 228–29 G Galton, Peter 58, 59, 110, 115, 221, 221 Garcia, Mariano 232, 234 gastroliths 194 Gatesy, Stephen 226, 231 gaur 148–49 Gauthier, Jacques 53, 59, 245 genetic engineering, bringing dinosaurs back to life with 148–51 genome, dinosaurian 151–52 geological time scale 6–7, 44–45 gharials 193, 194 gigantothermy 117, 118 Gill, Pam 199 glasses, impact 265–66, 269 gliding 245, 247, 248 Gorgosaurus 87, 170, 171 Granger, Walter 157 Great Exhibition (1851) 107, 108 Gregory, William 157 Grimaldi, David 131 growth dwarf dinosaurs 180–84 growth rates 154, 170–74, 184 growth rings 116, 117, 154–55, 170, 172–73, 184 growth spurts 145 how dinosaurs could be so huge 175–79 Gryposaurus 87 
Gubbio, Italy 260, 261–62, 265, 266, 286 H hadrosaurs 79, 143 Dinosaur Park Formation 91–99, 100 diversity of 272, 275 first skeleton 218–19, 220 teeth 196–97, 198, 201, XVIII use of arms and legs 236 Hadrosaurus foulkii 220 Haiti 265–66, 265 Haldane, J.

pages: 319 words: 90,965

The End of College: Creating the Future of Learning and the University of Everywhere
by Kevin Carey
Published 3 Mar 2015

To survive and prosper in the world with limited cognitive capacity, humans filter waves of constant sensory information through neural patterns—heuristics and mental shortcuts that our minds use to weigh the odds that what we are sensing is familiar and categorizable based on our past experience. Sebastian Thrun’s self-driving car does this with Bayesian statistics built into silicon and code, while the human mind uses electrochemical processes that we still don’t fully understand. But the underlying principle is the same: Based on the pattern of lines and shapes and edges, that is probably a boulder and I should drive around it. That is probably a group of three young women eating lunch at a table near the sushi bar and I should pay them no mind.

Abelard, Peter, 19, 190, 232 Academically Adrift (Arum and Roksa), 9, 85, 244 Accredible, 216–18, 248 Accreditation, 50, 58, 117, 150 ACT scores, 213 Adaptive Control of Thought—Rational (ACT-R), 101–4 Adler, Mortimer, 49 Administrative Behavior (Suppes), 78 Advanced Placement (AP) classes, 14, 15 Advanced Research Project Agency Network (ARPANET), 125, 126, 148, 205 African-Americans, 43 Agarwal, Anant, 11, 170–73, 214 Agincourt, Battle of, 98 Air Force, U.S., 186, 187 ROTC, 190 Alexander the Great, 92, 100, 226 Alexandria, Great Library of, 108 Amazon, 127, 145 American Revolution, 23 Anderson, John R., 101, 136 Andreessen, Marc, 126, 128, 205 Angell, James Rowland, 96 Anna Karenina (Tolstoy), 99 AOL, 204 Apache, 146 Apple, 126, 144, 146, 147 Argenteuil, Heloise d’, 19 Aristotle, 16, 17, 31, 44, 90–92, 95, 111, 226 Army, U.S., 90–91, 98 Air Force, 91 Artificial Intelligence (AI), 11, 79, 136, 153, 159, 170, 264n Adaptive Control of Thought—Rational (ACT-R) model for, 101–4 cognitive tutoring using, 103, 105, 138, 179, 210 Dartmouth conference on, 79, 101 learning pathways for, 155 personalized learning with, 5, 232 theorem prover based in, 110 Thrun’s work in, 147–50 Arum, Richard, 9, 10, 36, 85, 244 Associate’s degrees, 6, 61, 117, 141, 193, 196, 198 Atlantic magazine, 29, 65, 79, 123 AT&T, 146 Australian National University, 204 Bachelor’s degrees, 6–9, 31, 36, 60–61, 64 for graduate school admission, 30 percentage of Americans with, 8, 9, 57, 77 professional versus liberal arts, 35 required for public school teachers, 117 social mobility and, 76 time requirement for, 6, 22 value in labor market of, 58 Badges, digital, 207–12, 216–18, 233, 245, 248 Barzun, Jacques, 32–34, 44, 45, 85 Bayesian statistics, 181 Bell Labs, 123–24 Bellow, Saul, 59, 78 Berlin, University of, 26, 45-46 Bhave, Amol, 214–15 Bing, 212 Binghamton, State University of New York at, 183–84 Bishay, Shereef, 139, 140 Bloomberg, Michael, 251 Blue Ocean Strategy (Kim and Mauborgne), 130 Bologna, 
University of, 16–17, 21, 41 Bonn, University of, 147 Bonus Army, 51 Borders Books, 127 Boston College, 164, 175 Boston Gazette, 95 Boston Globe, 2 Boston University (BU), 59, 61–62, 64 Bowen, William G., 112–13 Bowman, John Gabbert, 74–75 Brigham Young University, 2 Brilliant, 213 British Army, 98 Brookings Institution, 54 Brooklyn College, 44 Brown v.

pages: 404 words: 92,713

The Art of Statistics: How to Learn From Data
by David Spiegelhalter
Published 2 Sep 2019

So the product of the likelihood ratio and the prior odds ends up being around 72,000/1,000,000, which are odds of around 7/100, corresponding to a probability of 7/107 or 7% that he is a cheat. So we should give him the benefit of the doubt at this stage, whereas we might not be so generous with someone we had just met in the pub. And perhaps we should keep a careful eye on the Archbishop. Bayesian Statistical Inference Bayes’ theorem, even if it is not permitted in UK courts, is the scientifically correct way to change our mind on the basis of new evidence. Expected frequencies make Bayesian analysis reasonably straightforward for simple situations that involve only two hypotheses, say about whether someone does or does not have a disease, or has or has not committed an offence.
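The arithmetic in this passage can be checked with a short sketch. It assumes the quoted product splits into a likelihood ratio of 72,000 and prior odds of 1 to 1,000,000; the excerpt gives only the product, so that split and the function name `posterior_from_odds` are illustrative assumptions.

```python
def posterior_from_odds(likelihood_ratio, prior_odds):
    """Bayes' theorem in odds form: posterior odds = likelihood ratio * prior odds."""
    posterior_odds = likelihood_ratio * prior_odds
    # Convert odds o into a probability with o / (1 + o).
    posterior_probability = posterior_odds / (1 + posterior_odds)
    return posterior_odds, posterior_probability


# Assumed split of the quoted product 72,000/1,000,000 (not stated in the excerpt).
odds, probability = posterior_from_odds(72_000, 1 / 1_000_000)
print(f"posterior odds: {odds:.3f}")                # about 7/100
print(f"posterior probability: {probability:.3f}")  # about 7/107, roughly 7%
```

Working in odds keeps the update to a single multiplication; the conversion to a probability happens only at the end.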

* If we can assume that all our observations are independent and come from the same population distribution, the standard error of their average is just the standard deviation of the population distribution divided by the square root of the sample size. * We shall see in Chapter 12 that practitioners of Bayesian statistics are happy using probabilities for epistemic uncertainty about parameters. * Strictly speaking, a 95% confidence interval does not mean there is a 95% probability that this particular interval contains the true value, although in practice people often give this incorrect interpretation.
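The standard-error formula in the first footnote can be sketched in a few lines. The function name and the example numbers are hypothetical and do not come from the book.

```python
import math


def standard_error_of_mean(population_sd, sample_size):
    """Standard error of the average of independent draws from one population."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return population_sd / math.sqrt(sample_size)


# Hypothetical numbers: quadrupling the sample size halves the standard error.
se_small = standard_error_of_mean(10.0, 25)
se_large = standard_error_of_mean(10.0, 100)
print(se_small, se_large)
```

The square root is why precision improves slowly: halving the standard error takes four times as many observations.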

pages: 397 words: 102,910

The Idealist: Aaron Swartz and the Rise of Free Culture on the Internet
by Justin Peters
Published 11 Feb 2013

“We’d put the tools that we have at our disposal in their hands.”32 Swartz had actually been building tools like these for several months with his colleagues at ThoughtWorks. Victory Kit, as the project was called, was an open-source version of the expensive community-organizing software used by groups such as MoveOn. Victory Kit incorporated Bayesian statistics—an analytical method that gets smarter as it goes along by consistently incorporating new information into its estimates—to improve activists’ ability to reach and organize their bases. “In the end, a lot of what the software was about was doing quite sophisticated A/B testing of messages for advocacy,” remembered Swartz’s friend Nathan Woodhull.33 Swartz was scheduled to present Victory Kit to the group at the Holmes retreat.

Ashcroft, 137–38, 140 FBI file on, 191–92, 223 fleeing the system, 8, 145, 151, 158–59, 161, 171, 173, 193, 248, 267 and free culture movement, 3–4, 141, 152–55, 167, 223 and Harvard, 3, 205, 207, 223, 224, 229 health issues of, 9, 150, 165–66, 222 immaturity of, 8–9 and Infogami, 147, 148–51, 158 interests of, 6–7, 8–9, 204, 221 “Internet and Mass Collaboration, The,” 166–67 lawyers for, 6, 254–55 legacy of, 14–15, 268, 269–70 and Library of Congress, 139 and Malamud, 187–93, 222, 223 manifesto of, 6–7, 178–81, 189–90, 201, 228–30, 247 mass downloading of documents by, 1, 3, 188–94, 197–202, 207, 213, 215, 222, 228, 235 media stories about, 125 and MIT, 1, 3, 201, 204, 207, 213, 222, 227, 232, 249–50, 262 and money, 170–71 on morality and ethics, 205–6 and Open Library, 163, 173, 179, 223, 228 and PCCC, 202–3, 225 as private person/isolation of, 2–3, 5, 124, 127, 143, 154–55, 158–60, 166, 169, 205, 224, 227, 228, 248–49, 251 and public domain, 123 as public speaker, 213–14, 224, 243, 257 and Reddit, see Reddit The Rules broken by, 14 “saving the world” on bucket list of, 7, 8, 15, 125, 151–52, 181, 205–6, 247–48, 266, 267, 268 self-help program of, 251–53 and theinfo.org, 172–73 and US Congress, 224–25, 239–40 Swartz, Robert: and Aaron’s death, 261, 262, 264 and Aaron’s early years, 124, 127 and Aaron’s legal woes, 232, 250, 254 and MIT Media Lab, 203–4, 212, 219, 232, 250 and technology, 124, 212 Swartz, Susan, 128–29, 160, 192 Swartz’s legal case: as “the bad thing,” 3, 7–8, 234 change in defense strategy, 256–57 evidence-suppression hearing, 259–60 facts of, 11 felony charges in, 235, 253 grand jury, 232–33 indictment, 1, 5, 8, 10, 11, 233, 234, 235–37, 241, 253–54 investigation and capture, 215–17, 223, 228 JSTOR’s waning interest in, 231–32 manifesto as evidence in, 228–30 motion to suppress, 6 motives sought in, 223, 229 Norton subpoenaed in, 1–2, 227–29 ongoing, 248, 249–51 online petitions against, 236–37 original charges in, 218, 222 plea deals offered, 
227, 250 possible prison sentence, 1, 2, 5, 7–8, 11, 222, 232, 235–36, 253, 260 potential harm assessed, 218, 219, 222, 235 prosecutor’s zeal in, 7–8, 11, 218, 222–24, 235–37, 253–54, 259–60, 263, 264 search and seizure in, 6, 223–24, 256–57 Symbolics, 103 systems, flawed, 265–67 T. & J. W. Johnson, 49 Tammany Hall, New York, 57 tech bubble, 146, 156 technology: Bayesian statistics in, 258–59 burgeoning, 69, 71, 84, 87–88 communication, 12, 13, 18, 87–88 computing, see computers and digital culture, 122 and digital utopia, 91, 266–67 of electronic publishing, 120 and intellectual property, 90–91 and irrational exuberance, 146 in library of the future, 81–83 as magic, 152 moving inexorably forward, 134 overreaching police action against, 233 power of metadata, 128, 130 as private property, 210 resisting change caused by, 120 saving humanity via, 101 thinking machines, 102 unknown, future, 85 and World War II, 208 telephone, invention of, 69 Templeton, Brad, 261 theinfo.org, 172–73 theme parks, 134 ThoughtWorks, 9, 248, 257, 258 “thumb drive corps,” 187, 191, 193 Toyota Motor Corporation, “lean production” of, 7, 257, 265 Trumbull, John, McFingal, 26 trust-busting, 75 Tucher, Andie, 34 Tufte, Edward, 263–64 “tuft-hunter,” use of term, 28 Tumblr, 240 Twain, Mark, 60, 62, 73 Tweed, William “Boss,” 57 Twitter, 237 Ulrich, Lars, 133 United States: Articles of Confederation, 26 copyright laws in, 26–27 economy of, 44–45, 51, 55, 56 freedom to choose in, 80, 269 industrialization, 57 literacy in, 25, 26–27, 39, 44, 48 migration to cities in, 57 national identity of, 28, 32 new social class in, 69–70 opportunity in, 58, 80 poverty in, 59 railroads, 55, 56 rustic nation of, 44–45 values of, 85 UNIVAC computer, 81, 90 Universal Studios Orlando, 134 University of Illinois at Urbana-Champaign, 94, 95–96, 112–15 Unix, 104 US Chamber of Commerce, 239 utilitarianism, 214 Valenti, Jack, 111, 132 Van Buren, Martin, 44 Van Dyke, Henry, The National Sin of Literary Piracy, 61 venture 
capital, 146 Viaweb, 146 Victor, O.

pages: 412 words: 115,266

The Moral Landscape: How Science Can Determine Human Values
by Sam Harris
Published 5 Oct 2010

If we are measuring sanity in terms of sheer numbers of subscribers, then atheists and agnostics in the United States must be delusional: a diagnosis which would impugn 93 percent of the members of the National Academy of Sciences.63 There are, in fact, more people in the United States who cannot read than who doubt the existence of Yahweh.64 In twenty-first-century America, disbelief in the God of Abraham is about as fringe a phenomenon as can be named. But so is a commitment to the basic principles of scientific thinking—not to mention a detailed understanding of genetics, special relativity, or Bayesian statistics. The boundary between mental illness and respectable religious belief can be difficult to discern. This was made especially vivid in a recent court case involving a small group of very committed Christians accused of murdering an eighteen-month-old infant.65 The trouble began when the boy ceased to say “Amen” before meals.

For instance, there is a difference between expected uncertainty—where one knows that one’s observations are unreliable—and unexpected uncertainty, where something in the environment indicates that things are not as they seem. The difference between these two modes of cognition has been analyzed within a Bayesian statistical framework in terms of their underlying neurophysiology. It appears that expected uncertainty is largely mediated by acetylcholine and unexpected uncertainty by norepinephrine (Yu & Dayan, 2005). Behavioral economists sometimes distinguish between “risk” and “ambiguity”: the former being a condition where probability can be assessed, as in a game of roulette, the latter being the uncertainty borne of missing information.

pages: 396 words: 117,149

The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World
by Pedro Domingos
Published 21 Sep 2015

The distinction between descriptive and normative theories was articulated by John Neville Keynes in The Scope and Method of Political Economy (Macmillan, 1891). Chapter Six Sharon Bertsch McGrayne tells the history of Bayesianism, from Bayes and Laplace to the present, in The Theory That Would Not Die (Yale University Press, 2011). A First Course in Bayesian Statistical Methods,* by Peter Hoff (Springer, 2009), is an introduction to Bayesian statistics. The Naïve Bayes algorithm is first mentioned in Pattern Classification and Scene Analysis,* by Richard Duda and Peter Hart (Wiley, 1973). Milton Friedman argues for oversimplified theories in “The methodology of positive economics,” which appears in Essays in Positive Economics (University of Chicago Press, 1966).

pages: 416 words: 118,522

Why Machines Learn: The Elegant Math Behind Modern AI
by Anil Ananthaswamy
Published 15 Jul 2024

MLE is powerful when you have a lot of sampled data, while MAP works best with fewer data. And as the amount of sampled data grows, MAP and MLE begin converging in their estimate of the underlying distribution. Most of us are intuitively frequentists. But the Bayesian approach to statistics is extremely powerful. (Note: Bayesian statistics is not the same as Bayes’s theorem. Even frequentists value Bayes’s theorem. They just object to this whole idea of having prior beliefs about the parameters of a distribution when trying to discern the properties of that very distribution from data.) One of the first large-scale demonstrations of using Bayesian reasoning for machine learning was due to two statisticians, Frederick Mosteller and David Wallace, who used the technique to figure out something that had been bothering historians for centuries: the authorship of the disputed Federalist Papers.
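A minimal sketch of that convergence, assuming a coin-flip (Bernoulli) model with a Beta prior. The model, the Beta(5, 5) prior, and the 70% heads rate are illustrative choices and are not taken from the book.

```python
def mle_estimate(heads, flips):
    """Maximum likelihood estimate of the heads probability."""
    return heads / flips


def map_estimate(heads, flips, alpha=5.0, beta=5.0):
    """Mode of the Beta(alpha + heads, beta + tails) posterior (assumes alpha, beta > 1)."""
    return (heads + alpha - 1) / (flips + alpha + beta - 2)


gaps = []
for flips in (10, 100, 10_000):
    heads = flips * 7 // 10  # hypothetical data: 70% heads at every sample size
    mle = mle_estimate(heads, flips)
    map_ = map_estimate(heads, flips)
    gaps.append(abs(mle - map_))
    print(f"n={flips:>6}  MLE={mle:.4f}  MAP={map_:.4f}")
```

With few flips the prior pulls the MAP estimate toward 0.5; as the data accumulate, the prior's pseudo-counts are swamped and the two estimates nearly coincide.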

Data scientist Paul van der Laken: “The Monty Hall Problem: Simulating and Visualizing the Monty Hall Problem in Python & R,” paulvanderlaken.com/2020/04/14/simulating-visualizing-monty-hall-problem-python-r/. “born in 1701 with probability 0.8”: Stephen M. Stigler, “Richard Price, the First Bayesian,” Statistical Science 33, No. 1 (2018): 117–25. Royal Tunbridge Wells in England: “Thomas Bayes: English Theologian and Mathematician,” Science & Tech, Britannica, n.d., www.britannica.com/biography/Thomas-Bayes. Bayes and Price were kindred spirits: Stigler, “Richard Price, the First Bayesian,” p. 117.

pages: 829 words: 186,976

The Signal and the Noise: Why So Many Predictions Fail-But Some Don't
by Nate Silver
Published 31 Aug 2012

Scott Armstrong, The Wharton School, University of Pennsylvania LIBRARY OF CONGRESS CATALOGING IN PUBLICATION DATA Silver, Nate. The signal and the noise : why most predictions fail but some don’t / Nate Silver. p. cm. Includes bibliographical references and index. ISBN 978-1-101-59595-4 1. Forecasting. 2. Forecasting—Methodology. 3. Forecasting—History. 4. Bayesian statistical decision theory. 5. Knowledge, Theory of. I. Title. CB158.S54 2012 519.5'42—dc23 2012027308 While the author has made every effort to provide accurate telephone numbers, Internet addresses, and other contact information at the time of publication, neither the publisher nor the author assumes any responsibility for errors, or for changes that occur after publication.

This is why it is sometimes said that poker is a hard way to make an easy living. Of course, if this player really did have some way to know that he was a long-term winner, he’d have reason to persevere through his losses. In reality, there’s no sure way for him to know that. The proper way for the player to estimate his odds of being a winner, instead, is to apply Bayesian statistics,31 where he revises his belief about how good he really is, on the basis of both his results and his prior expectations. If the player is being honest with himself, he should take quite a skeptical attitude toward his own success, even if he is winning at first. The player’s prior belief should be informed by the fact that the average poker player by definition loses money, since the house takes some money out of the game in the form of the rake while the rest is passed around between the players.32 The Bayesian method described in the book The Mathematics of Poker, for instance, would suggest that a player who had made $30,000 in his first 10,000 hands at a $100/$200 limit hold ’em game was nevertheless more likely than not to be a long-term loser.

Nickerson, “Null Hypothesis Significance Testing: A Review of an Old and Continuing Controversy,” Psychological Methods, 5, 2 (2000), pp. 241–301. http://203.64.159.11/richman/plogxx/gallery/17/%E9%AB%98%E7%B5%B1%E5%A0%B1%E5%91%8A.pdf. 62. Andrew Gelman and Cosma Rohilla Shalizi, “Philosophy and the Practice of Bayesian Statistics,” British Journal of Mathematical and Statistical Psychology, pp. 1–31, January 11, 2012. http://www.stat.columbia.edu/~gelman/research/published/philosophy.pdf. 63. Although there are several different formulations of the steps in the scientific method, this version is mostly drawn from “APPENDIX E: Introduction to the Scientific Method,” University of Rochester. http://teacher.pas.rochester.edu/phy_labs/appendixe/appendixe.html. 64.

pages: 314 words: 122,534

The Missing Billionaires: A Guide to Better Financial Decisions
by Victor Haghani and James White
Published 27 Aug 2023

Indeed, uncertainty about expected returns is usually significantly more important than for any other type of parameter, and we've seen that even uncertainty in expected returns (in the stylized case we explored) is ultimately not especially impactful for one of the most important situations we care about. Notes a. A joint probability distribution is a probability distribution that describes the simultaneous occurrence of two or more random variables. It gives the probability of different combinations of values for the random variables. b. In Bayesian statistics, a “prior” belief is a probability distribution that reflects your beliefs about the probability of a certain event occurring before any new data is collected. c. Two people, each wearing a necktie, argue over who has the cheaper one. They agree to a bet wherein the person with the more expensive necktie must give it to the other person.

Actively managed mutual funds, 215–217 Active managers, 306–307 Active trading, 6, 88 Allais' paradox, 104 Alternative investments, 229–230 Ambiguity aversion, 279, 281 Annuities: Constant Standard of Living (CSL) annuities, 237–239 lifetime (see Lifetime annuities) Annuity puzzle, 166 Anomaly persistence, 220 AQR, 223, 230 Arbitrage pricing theory (APT), 218, 351n7 Arbitrage theories, 116 ARKK, 216–217 Art, 231 Asness, Cliff, 217–218 Asset classes, 201–234 actively managed mutual funds, 215–217 alternative investments, 229–230 art and collectibles, 231 commodities, 231 corporate bonds, 212–213 crypto, 231 ESG investing, 224 factor investing, 217–224 foreign equity markets, 209–210 index funds, 210–212 individual stocks, 224–226, 226e long‐term inflation‐linked government bonds, 204 nominal government bills/bonds, 204–205 options, 227 real estate, 213–215 special situation trades, 227–229 stock markets, 205–208 Average annual return, compound vs., 205–206 Average investor hypothesis, 203 Axioms of rational choice, 74 Bachelier, Louis, 351n6 Bacon, Louis, 61 Bankman‐Fried, Sam, 106n Base‐rate fallacy, 221n Bayesian statistics, 276n Behavioral economics, 104, 190 Bengen, Bill, 142 Bentham, Jeremy, 116, 176n Bequest function, 138n Bequests: end‐of‐life, 169–170 intergenerational, 182 Berkshire Hathaway, 307 Bernoulli, Daniel, 7, 70–72, 74, 113 Bernstein, William, 261 Beta, 217n, 351n9 Bezos, Jeff, 131 Bias, home, 209–210 Binary options, 263 Black, Fischer, 201, 351n9, 303 Black Monday, see October 1987 stock market crash Black‐Scholes‐Merton model of options pricing, 247 Black‐Scholes option pricing formula, 39 Bloomberg, 189n, 263 Bogle, John (“Jack”), 64, 209, 215–216, 246 Bonds.

pages: 573 words: 157,767

From Bacteria to Bach and Back: The Evolution of Minds
by Daniel C. Dennett
Published 7 Feb 2017

Each problem is couched thus: Given that your expectations based on past experience (including, we may add, the experience of your ancestors as passed down to you) are such and such (expressed as probabilities for each alternative), what effect on your future expectations should the following new data have? What adjustments in your probabilities would it be rational for you to make? Bayesian statistics, then, is a normative discipline, purportedly prescribing the right way to think about probabilities.41 So it is a good candidate for a competence model of the brain: it works as an expectation-generating organ, creating new affordances on the fly. Consider the task of identifying handwritten symbols (letters and digits).

“Knowing One’s Place: A Free-Energy Approach to Pattern Regulation.” Journal of the Royal Society Interface, 12: 20141383. Frith, Chris D. 2012. “The Role of Metacognition in Human Social Interactions.” Philosophical Transactions of the Royal Society B: Biological Sciences 367 (1599): 2213–2223. Gelman, Andrew. 2008. “Objections to Bayesian Statistics.” Bayesian Anal. 3 (3): 445–449. Gibson, James J. 1966. “The Problem of Temporal Order in Stimulation and Perception.” Journal of Psychology 62 (2): 141–149. —. 1979. The Ecological Approach to Visual Perception. Boston: Houghton Mifflin. Godfrey-Smith, Peter. 2003. “Postscript on the Baldwin Effect and Niche Construction.”

pages: 208 words: 57,602

Futureproof: 9 Rules for Humans in the Age of Automation
by Kevin Roose
Published 9 Mar 2021

Because while I’m confident that the most valuable skills for the future are lowercase-h humanities—those surprising, social, and scarce abilities that we’ve been talking about—I’m much less confident that just studying the traditional, capital-H Humanities subjects in school will get them there. Is the average anthropology major likely to be more socially adept than the average engineering major? Does reading Beowulf make you better at handling surprises, or developing scarce talents, than learning Bayesian statistics? Many ideas have been proposed and tested for bringing our educational system into the twenty-first century, including personalized curricula, massive open online courses (MOOCs), and “lifelong learning” adult education programs. But few of them have been adequately tested, and all of the ideas deal primarily with how we should teach people, leaving open the question of what we should teach them.

Natural Language Processing with Python and spaCy
by Yuli Vasiliev
Published 2 Apr 2020

More no-nonsense books from NO STARCH PRESS PYTHON CRASH COURSE, 2ND EDITION A Hands-On, Project-Based Introduction to Programming by ERIC MATTHES MAY 2019, 544 pp., $39.95 ISBN 978-1-59327-928-8 MATH ADVENTURES WITH PYTHON An Illustrated Guide to Exploring Math with Code by PETER FARRELL JANUARY 2019, 304 pp., $29.95 ISBN 978-1-59327-867-0 THE BOOK OF R A First Course in Programming and Statistics by TILMAN M. DAVIES JULY 2016, 832 pp., $49.95 ISBN 978-1-59327-651-5 BAYESIAN STATISTICS THE FUN WAY Understanding Statistics and Probability with Star Wars, LEGO, and Rubber Ducks by WILL KURT JULY 2019, 256 pp., $34.95 ISBN 978-1-59327-956-1 PYTHON ONE-LINERS by CHRISTIAN MAYER SPRING 2020, 256 pp., $39.95 ISBN 978-1-7185-0050-1 AUTOMATE THE BORING STUFF WITH PYTHON, 2ND EDITION Practical Programming for Total Beginners by AL SWEIGART NOVEMBER 2019, 592 pp., $39.95 ISBN 978-1-59327-992-9 PHONE: 800.420.7240 OR 415.863.9900 EMAIL: SALES@NOSTARCH.COM WEB: WWW.NOSTARCH.COM BUILD YOUR OWN NLP APPLICATIONS Natural Language Processing with Python and spaCy will show you how to create NLP applications like chatbots, text-condensing scripts, and order-processing tools quickly and easily.

Analysis of Financial Time Series
by Ruey S. Tsay
Published 14 Oct 2001

In particular, we discuss Bayesian inference via Gibbs sampling and demonstrate various applications of MCMC methods. Rapid developments in the MCMC methodology make it impossible to cover all the new methods available in the literature. Interested readers are referred to some recent books on Bayesian and empirical Bayesian statistics (e.g., Carlin and Louis, 2000; Gelman, Carlin, Stern, and Rubin, 1995). For applications, we focus on issues related to financial econometrics. The demonstrations shown in this chapter only represent a small fraction of all possible applications of the techniques in finance. As a matter of fact, it is fair to say that Bayesian inference and the MCMC methods discussed here are applicable to most, if not all, of the studies in financial econometrics.

For MCMC methods, use of conjugate priors means that a closed-form solution for the conditional posterior distributions is available. Random draws of the Gibbs sampler can then be obtained by using the commonly available computer routines of probability distributions. In what follows, we review some well-known conjugate priors. For more information, readers are referred to textbooks on Bayesian statistics (e.g., DeGroot, 1970, Chapter 9). Result 1: Suppose that $x_1, \ldots, x_n$ form a random sample from a normal distribution with mean $\mu$, which is unknown, and variance $\sigma^2$, which is known and positive. Suppose that the prior distribution of $\mu$ is a normal distribution with mean $\mu_o$ and variance $\sigma_o^2$.
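The excerpt breaks off before stating the conclusion of Result 1. The sketch below uses the standard textbook result for a normal prior and a normal likelihood with known variance, in which the posterior is again normal; it is not quoted from the book, and the function name and sample values are hypothetical.

```python
def normal_posterior(xs, sigma2, mu_o, sigma_o2):
    """Posterior mean and variance of mu for normal data with known variance sigma2
    and a normal prior with mean mu_o and variance sigma_o2."""
    n = len(xs)
    x_bar = sum(xs) / n
    # Precisions (inverse variances) add: n data points plus the prior.
    posterior_precision = n / sigma2 + 1.0 / sigma_o2
    posterior_variance = 1.0 / posterior_precision
    # The posterior mean is a precision-weighted average of x_bar and mu_o.
    posterior_mean = posterior_variance * (n * x_bar / sigma2 + mu_o / sigma_o2)
    return posterior_mean, posterior_variance


# Hypothetical sample: three observations, unit data variance, N(0, 1) prior.
post_mean, post_var = normal_posterior([1.0, 2.0, 3.0], sigma2=1.0, mu_o=0.0, sigma_o2=1.0)
print(post_mean, post_var)
```

Because the posterior has a closed form, a Gibbs sampler can draw from it with an ordinary normal random-number routine, which is the convenience the passage attributes to conjugate priors.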

pages: 206 words: 70,924

The Rise of the Quants: Marschak, Sharpe, Black, Scholes and Merton
by Colin Read
Published 16 Jul 2012

He postulated that the rational decision-maker will align his or her beliefs of unknown probabilities to the consensus bets of impartial bookmakers, a technique often called the Dutch Book. Thirty years later, the great mind Leonard “Jimmie” Savage (1917–1971) elaborated his concept into an axiomatic approach to decision-making under uncertainty using arguments remarkably similar to Ramsey’s logic. The concepts of Ramsey and Savage also formed the basis for the theory of Bayesian statistics and are important in many aspects of financial decision-making. Marschak’s great insight While Ramsey created and Savage broadened the logical landscape for the inclusion of uncertainty into decision-making, it was not possible to incorporate their logic until the finance discipline could develop actual measures of uncertainty.

pages: 654 words: 191,864

Thinking, Fast and Slow
by Daniel Kahneman
Published 24 Oct 2011

And if you believe that there is a 30% chance that candidate X will be elected president, and an 80% chance that he will be reelected if he wins the first time, then you must believe that the chances that he will be elected twice in a row are 24%. The relevant “rules” for cases such as the Tom W problem are provided by Bayesian statistics. This influential modern approach to statistics is named after an English minister of the eighteenth century, the Reverend Thomas Bayes, who is credited with the first major contribution to a large problem: the logic of how people should change their mind in the light of evidence. Bayes’s rule specifies how prior beliefs (in the examples of this chapter, base rates) should be combined with the diagnosticity of the evidence, the degree to which it favors the hypothesis over the alternative.
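Both rules in this passage reduce to a line or two of arithmetic. In the sketch below, the 30% and 80% figures come from the passage; the 3% base rate and the likelihood ratio of 4 are hypothetical inputs chosen only to illustrate how a base rate combines with diagnosticity, and the function names are assumptions.

```python
def joint_probability(p_first, p_second_given_first):
    """Probability that both events happen: P(A) * P(B given A)."""
    return p_first * p_second_given_first


def bayes_rule(base_rate, likelihood_ratio):
    """Combine a base rate with the diagnosticity of the evidence.

    likelihood_ratio is how many times more likely the evidence is
    under the hypothesis than under the alternative.
    """
    prior_odds = base_rate / (1 - base_rate)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)


elected_twice = joint_probability(0.30, 0.80)
posterior = bayes_rule(base_rate=0.03, likelihood_ratio=4)  # hypothetical inputs
print(f"elected twice in a row: {elected_twice:.2f}")
print(f"posterior after evidence: {posterior:.2f}")
```

Even evidence that favors the hypothesis four to one leaves the posterior low when the base rate is small, which is the sense in which base rates should anchor the judgment.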

.); WYSIATI (what you see is all there is) and associative memory; abnormal events and; anchoring and; causality and; confirmation bias and; creativity and; and estimates of causes of death Åstebro, Thomas Atlantic, The attention; in self-control Attention and Effort (Kahneman) Auerbach, Red authoritarian ideas availability; affect and; and awareness of one’s biases; expectations about; media and; psychology of; risk assessment and, see risk assessment availability cascades availability entrepreneurs bad and good, distinctions between banks bank teller problem Barber, Brad Bargh, John baseball baseball cards baseline predictions base rates; in cab driver problem; causal; in helping experiment; low; statistical; in Tom W problem; in Yale exam problem basic assessments basketball basketball tickets bat-and-ball problem Baumeister, Roy Bayes, Thomas Bayesian statistics Bazerman, Max Beane, Billy Beatty, Jackson Becker, Gary “Becoming Famous Overnight” (Jacoby) behavioral economics Behavioral Insight Team “Belief in the Law of Small Numbers” (Tversky and Kahneman) beliefs: bias for; past, reconstruction of Benartzi, Shlomo Bentham, Jeremy Berlin, Isaiah Bernoulli, Daniel Bernoulli, Nicholas Beyth, Ruth bicycle messengers Black Swan, The (Taleb) blame Blink (Gladwell) Borg, Björn Borgida, Eugene “Boys Will Be Boys” (Barber and Odean) Bradlee, Ben brain; amygdala in; anterior cingulate in; buying and selling and; emotional framing and; frontal area of; pleasure and; prefrontal area of; punishment and; sugar in; threats and; and variations of probabilities British Toxicology Society broad framing Brockman, John broken-leg rule budget forecasts Built to Last (Collins and Porras) Bush, George W.

pages: 586 words: 186,548

Architects of Intelligence
by Martin Ford
Published 16 Nov 2018

In the case that Roger Shepard was thinking about, he was working on the basic mathematics of how might an organism, having experienced a certain stimulus to have some good or negative consequence, figure out which other things in the world are likely to have that same consequence? Roger had introduced some mathematics based on Bayesian statistics for solving that problem, which was a very elegant formulation of the general theory of how organisms could generalize from experience and he was looking to neural networks to try to take that theory and implement it in a more scalable way. Somehow, I wound up working with him on this project.

Even a very young child can learn this new causal relation between moving your finger in a certain way and a screen lighting up, and that is how all sorts of other possibilities of action open to you. These problems of how we make a generalization from just one or a few examples are what I started working on with Roger Shepard when I was just an undergraduate. Early on, we used these ideas from Bayesian statistics, Bayesian inference, and Bayesian networks, to use the mathematics of probability theory to formulate how people’s mental models of the causal structure of the world might work. It turns out that tools that were developed by mathematicians, physicists, and statisticians to make inferences from very sparse data in a statistical setting were being deployed in the 1990s in machine learning and AI, and it revolutionized the field.

pages: 239 words: 74,845

The Antisocial Network: The GameStop Short Squeeze and the Ragtag Group of Amateur Traders That Brought Wall Street to Its Knees
by Ben Mezrich
Published 6 Sep 2021

And a third classmate, Michael, whom Jeremy had met in his advanced linear algebra class, happened to share Jeremy’s double major—math and psychology—which meant they had a joint penchant for making themselves miserable, coupled with a drive to figure out why they were chasing said misery. Between Jeremy’s bubble, which got together twice a week, and his course load, which included such mouthfuls as Bayesian statistics, probabilistic machine learning, and the cinema of psychopathology, it was almost possible to forget that the outside world had come to a grinding halt. Jeremy yanked his hood back as he moved deeper into his apartment, freeing his tangled mop of reddish hair, which sprang up above his high forehead like some sort of demented, rust-colored halo.

pages: 345 words: 75,660

Prediction Machines: The Simple Economics of Artificial Intelligence
by Ajay Agrawal , Joshua Gans and Avi Goldfarb
Published 16 Apr 2018

These applications are a microcosm of what most businesses will be doing in the near future. If you’re lost in the fog trying to figure out what AI means for you, then we can help you understand the implications of AI and navigate through the advances in this technology, even if you’ve never programmed a convolutional neural network or studied Bayesian statistics. If you are a business leader, we provide you with an understanding of AI’s impact on management and decisions. If you are a student or recent graduate, we give you a framework for thinking about the evolution of jobs and the careers of the future. If you are a financial analyst or venture capitalist, we offer a structure around which you can develop your investment theses.

pages: 267 words: 72,552

Reinventing Capitalism in the Age of Big Data
by Viktor Mayer-Schönberger and Thomas Ramge
Published 27 Feb 2018

Divergences would be flagged and brought to the attention of factory directors, then to government decision makers sitting in a futuristic operations room. From there the officials would send directives back to the factories. Cybersyn was quite sophisticated for its time, employing a network approach to capturing and calculating economic activity and using Bayesian statistical models. Most important, it relied on feedback that would loop back into the decision-making processes. The system never became fully operational. Its communications network was in place and was used in the fall of 1972 to keep the country running when striking transportation workers blocked goods from entering Santiago.

pages: 250 words: 79,360

Escape From Model Land: How Mathematical Models Can Lead Us Astray and What We Can Do About It
by Erica Thompson
Published 6 Dec 2022

We are back to the reference class question: weather-like models are those for which we have enough reasonable evidence to suggest that the model outputs form a reasonable reference class for the real outcome. Where did Berger and Smith go with this idea? Essentially, having made this distinction, they claim (and I agree) that for weather-like models a formal Bayesian statistical framework is a reliable way to assess and present uncertainty in forward predictions derived from models, stressing the necessity of incorporating formal descriptions of model imperfections. In climate-like situations, they demonstrate the possibility of unmodelled phenomena resulting in outcomes that are beyond the range of model outputs and therefore unquantifiable in principle as well as in practice.

pages: 290 words: 82,871

The Hidden Half: How the World Conceals Its Secrets
by Michael Blastland
Published 3 Apr 2019

Sondergaard et al., ‘Non-steroidal Anti-inflammatory Drug Use is Associated with Increased Risk of Out-of-Hospital Cardiac Arrest: A Nationwide Case-time-control Study’, European Heart Journal – Cardiovascular Pharmacotherapy, vol. 3, no. 2, 2017, pp. 100–107.
4 I wrote about this case in a blog for the Winton Centre for Risk and Evidence Communication: ‘Here we Go Again’, 21 March 2017.
5 See, for example, James Ware, ‘The Limitations of Risk Factors as Prognostic Tools’, New England Journal of Medicine, 21 December 2006; and Tjeerd-Pieter van Staa et al., ‘Prediction of Cardiovascular Risk Using Framingham, ASSIGN and QRISK2: How Well Do They Predict Individual Rather than Population Risk?’, PLOS One, 1 October 2014.
6 This is a metaphor often used by some statisticians. I have a lot of time for it. But we are teetering here on the brink of a discussion of Bayesian statistics, and had better resist. Readers can find plenty of such discussions elsewhere.
7 We simply don’t have the data to do it at the individual level. Some people think we do, but to begin to convert one to the other requires a series of medical trials involving multiple tests on the same person, known as ‘N of 1’ trials, and these are not standard.
8 For a favourable explanation of how NNTs are calculated, their advantages, and for a searchable database of NNTs for different treatments, see: theNNT.com.
9 The wide variability of response in individuals that could produce the kind of average effect shown in the chart – but might also be consistent with a quite different set of individual reactions – is discussed in two articles by Stephen Senn on https://errorstatistics.com: ‘Responder Despondency’ and ‘Painful Dichotomies’.

pages: 277 words: 87,082

Beyond Weird
by Philip Ball
Published 22 Mar 2018

Those beliefs do not become realized as facts until they impinge on the consciousness of the observer – and so the facts are specific to every observer (although different observers can find themselves agreeing on the same facts). This notion takes its cue from standard Bayesian probability theory, introduced in the eighteenth century by the English mathematician and clergyman Thomas Bayes. In Bayesian statistics, probabilities are not defined with reference to some objective state of affairs in the world, but instead quantify personal degrees of belief of what might happen – which we update as we acquire new information. The QBist view, however, says something much more profound than simply that different people know different things.

pages: 301 words: 85,126

AIQ: How People and Machines Are Smarter Together
by Nick Polson and James Scott
Published 14 May 2018

Allen WannaCry (ransomware attack) waterfall diagram Watson (IBM supercomputer) Waymo (autonomous-car company) WeChat word vectors word2vec model (Google) World War I World War II Battle of the Bulge Bayesian search and Hopper, Grace, and Schweinfurt-Regensburg mission (World War II) Statistical Research Group (Columbia) and Wald’s survivability recommendations for aircraft Yormark, Brett YouTube Zillow ABOUT THE AUTHORS NICK POLSON is professor of Econometrics and Statistics at the Chicago Booth School of Business. He does research on artificial intelligence, Bayesian statistics, and deep learning, and is a frequent speaker at conferences. He lives in Chicago. JAMES SCOTT is associate professor of Statistics at the University of Texas at Austin. He earned his Ph.D. in statistics from Duke University in 2009 after studying mathematics at the University of Cambridge on a Marshall Scholarship.

The Data Revolution: Big Data, Open Data, Data Infrastructures and Their Consequences
by Rob Kitchin
Published 25 Aug 2014

They include parametric statistics which are employed to assess hypotheses using interval and ratio level data, such as correlation and regression; non-parametric statistics used for testing hypotheses using nominal or ordinal-level data; and probabilistic statistics that determine the probability of a condition occurring, such as Bayesian statistics. The armoury of descriptive and inferential statistics that have traditionally been used to analyse small data are also being applied to big data, though as discussed in Chapter 9 this is not always straightforward because many of these techniques were developed to draw insights from relatively scarce rather than exhaustive data.

pages: 292 words: 94,660

The Loop: How Technology Is Creating a World Without Choices and How to Fight Back
by Jacob Ward
Published 25 Jan 2022

He developed notions of time-sharing and utility computing that gave rise to today’s $250 billion cloud-computing industry. And he later founded and ran Stanford’s AI lab, while Marvin Minsky ran MIT’s. Wide-eyed, sleep-deprived Solomonoff went on to propose the first notions of “algorithmic probability” that could be used for predictions, and created the theoretical framework for using Bayesian statistics to deal with uncertainty, which makes him the ancestor of everything from modern weather prediction to AI that can spit out a reasonable-sounding term paper from a one-sentence prompt. RAND Corporation’s Allen Newell went on to publish the first doctoral dissertation in AI, “Information Processing: A New Technique for the Behavioral Sciences.”

Learn Algorithmic Trading
by Sebastien Donadio
Published 7 Nov 2019

Recently, there has been a resurgence in interest in machine learning algorithms and applications owing to the availability of extremely cost-effective processing power and the easy availability of large datasets. Understanding machine learning techniques in great detail is a massive field at the intersection of linear algebra, multivariate calculus, probability theory, frequentist and Bayesian statistics, and an in-depth analysis of machine learning is beyond the scope of a single book. Machine learning methods, however, are surprisingly easily accessible in Python and quite intuitive to understand, so we will explain the intuition behind the methods and see how they find applications in algorithmic trading.

pages: 289 words: 92,714

The Rationalist's Guide to the Galaxy: Superintelligent AI and the Geeks Who Are Trying to Save Humanity's Future
by Tom Chivers
Published 12 Jun 2019

In fact, we can be even more specific than that. For AI specialists like Bostrom, intelligence is the ability to make ‘probabilistically optimal use of available information’1 – to make the best bets with the information you have. There’s quite a lot of formal maths involved in this – about Bayesian statistics and complexity and so on – but essentially it’s about picking the course of action most likely to bring about whatever objective you’ve been set. If someone’s set you the task of finding all the lost pennies in Britain and using them to build a bronze statue of Makka Pakka off of In the Night Garden, then there is an optimally efficient way of doing that – you can perform that task intelligently.

pages: 375 words: 102,166

The Genetic Lottery: Why DNA Matters for Social Equality
by Kathryn Paige Harden
Published 20 Sep 2021

Yes, socially constructed race differences are systematically related to genetic ancestry. And, yes, within European-ancestry populations, genetic differences between people are associated with differences in their socially important life outcomes. But neither one of these pieces of information gives you any information about the sources of racial disparities. In Bayesian statistics, there is something called a prior, which is a mathematical representation of what you believe—and how uncertain you are about those beliefs—before (prior to) any evidence is taken into account. What do you know—or believe that you know—when no information is available? That is the situation we find ourselves in regarding between-population genetic differences in complex life outcomes such as education.

pages: 340 words: 97,723

The Big Nine: How the Tech Titans and Their Thinking Machines Could Warp Humanity
by Amy Webb
Published 5 Mar 2019

Jean Bartik, mathematician and one of the original programmers for the ENIAC computer. Albert Turner Bharucha-Reid, mathematician and theorist who made significant contributions in Markov chains, probability theory, and statistics. David Blackwell, statistician and mathematician who made significant contributions to game theory, information theory, probability theory, and Bayesian statistics. Mamie Phipps Clark, a PhD and social psychologist whose research focused on self-consciousness. Thelma Estrin, who pioneered the application of computer systems in neurophysiological and brain research. She was a researcher in the Electroencephalography Department of the Neurological Institute of Columbia Presbyterian at the time of the Dartmouth Summer Research Project.

pages: 407 words: 104,622

The Man Who Solved the Market: How Jim Simons Launched the Quant Revolution
by Gregory Zuckerman
Published 5 Nov 2019

Brown, Mercer, and the others relied upon Bayesian mathematics, which had emerged from the statistical rule proposed by Reverend Thomas Bayes in the eighteenth century. Bayesians will attach a degree of probability to every guess and update their best estimates as they receive new information. The genius of Bayesian statistics is that it continuously narrows a range of possibilities. Think, for example, of a spam filter, which doesn’t know with certainty if an email is malicious, but can be effective by assigning odds to each one received by constantly learning from emails previously classified as “junk.” (This approach wasn’t as strange as it might seem.
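The spam-filter idea in the excerpt above can be illustrated with a small sketch of Bayes' rule applied one word at a time. This is not taken from the book: the function names, the word likelihoods, and the assumption that words are independent given the class (the "naive Bayes" simplification) are all illustrative choices.

```python
def update_spam_probability(prior, p_word_given_spam, p_word_given_ham):
    """Apply Bayes' rule once: return P(spam | word) from P(spam) and the
    two likelihoods P(word | spam) and P(word | not spam)."""
    numerator = p_word_given_spam * prior
    evidence = numerator + p_word_given_ham * (1.0 - prior)
    return numerator / evidence


def spam_posterior(prior, word_likelihoods):
    """Fold in each observed word in turn, using the posterior from one
    step as the prior for the next. Assumes (hypothetically) that words
    are independent given the class."""
    probability = prior
    for p_spam, p_ham in word_likelihoods:
        probability = update_spam_probability(probability, p_spam, p_ham)
    return probability


if __name__ == "__main__":
    # Made-up likelihoods: each word is four times as common in junk mail.
    print(update_spam_probability(0.5, 0.8, 0.2))        # 0.8
    print(spam_posterior(0.5, [(0.8, 0.2), (0.8, 0.2)]))  # 16/17, about 0.941
```

Each additional piece of evidence moves the estimate further from the 50/50 starting point, which is one way to read the excerpt's remark that the method continuously narrows a range of possibilities.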

pages: 363 words: 109,834

The Crux
by Richard Rumelt
Published 27 Apr 2022

They contain a strong random element. Track your monthly spending on groceries. A blip upward does not mean your finances are out of control, and a downward blip does not signal coming starvation. However, to insert proper logic into their estimates of value, the analysts would need PhDs in advanced Bayesian statistical modeling and certainly would not use spreadsheets. By construction, their fairly primitive estimating tools grossly overreact to blips. A third problem is that the “true” value of a company is very hard to know. Fischer Black, coauthor of the famous 1973 Black-Scholes option-pricing formula, was a believer that market prices were unbiased estimates of true value.3 But, over drinks, he also told me that the “true” value of a company was anywhere from half to twice the current stock price.

pages: 523 words: 112,185

Doing Data Science: Straight Talk From the Frontline
by Cathy O'Neil and Rachel Schutt
Published 8 Oct 2013

Just because your method converges, it doesn’t mean the results are meaningful. Make sure you’ve created a reasonable narrative and ways to check its validity. Chapter 12. Epidemiology The contributor for this chapter is David Madigan, professor and chair of statistics at Columbia. Madigan has over 100 publications in such areas as Bayesian statistics, text mining, Monte Carlo methods, pharmacovigilance, and probabilistic graphical models. Madigan’s Background Madigan went to college at Trinity College Dublin in 1980, and specialized in math except for his final year, when he took a bunch of stats courses, and learned a bunch about computers: Pascal, operating systems, compilers, artificial intelligence, database theory, and rudimentary computing skills.

Succeeding With AI: How to Make AI Work for Your Business
by Veljko Krunic
Published 29 Mar 2020

Finally, I’ll show you how to make sure your team possesses all the skills that the specific AI project you’re running requires. 2.7.1 Data science unicorns Data science could be considered an umbrella term that covers many skills. A survey performed in 2013 lists 22 different areas that are part of data science [66]. Examples of those areas include topics like statistics, operational research, Bayesian statistics, programming, and many others. It gets worse! Today, there are new areas that would certainly be considered important (for example, deep learning). Clearly, a data science unicorn should be a world-class expert in each one of those areas, right? No, these are individually very complex areas.

pages: 2,466 words: 668,761

Artificial Intelligence: A Modern Approach
by Stuart Russell and Peter Norvig
Published 14 Jul 2019

The Gibbs sampler was devised by Geman and Geman (1984) for inference in undirected Markov networks. The application of Gibbs sampling to Bayesian networks is due to Pearl (1987). The papers collected by Gilks et al. (1996) cover both theory and applications of MCMC. Since the mid-1990s, MCMC has become the workhorse of Bayesian statistics and statistical computation in many other disciplines including physics and biology. The Handbook of Markov Chain Monte Carlo (Brooks et al., 2011) covers many aspects of this literature. The BUGS package (Gilks et al., 1994) was an early and influential system for Bayes net modeling and inference using Gibbs sampling.

The text by Rasmussen and Williams (2006) covers the Gaussian process, which gives a way of defining prior distributions over the space of continuous functions. The material in this chapter brings together work from the fields of statistics and pattern recognition, so the story has been told many times in many ways. Good texts on Bayesian statistics include those by DeGroot (1970), Berger (1985), and Gelman et al. (1995). Bishop (2007), Hastie et al. (2009), Barber (2012), and Murphy (2012) provide excellent introductions to statistical machine learning. For pattern classification, the classic text for many years has been Duda and Hart (1973), now updated (Duda et al., 2001).

The annual NeurIPS (Neural Information Processing Systems, formerly NIPS) conference, whose proceedings are published as the series Advances in Neural Information Processing Systems, includes many Bayesian learning papers, as does the annual conference on Artificial Intelligence and Statistics. Specifically Bayesian venues include the Valencia International Meetings on Bayesian Statistics and the journal Bayesian Analysis.
1 Statistically sophisticated readers will recognize this scenario as a variant of the urn-and-ball setup. We find urns and balls less compelling than candy.
2 We stated earlier that the bags of candy are very large; otherwise, the i.i.d. assumption fails to hold.

pages: 398 words: 120,801

Little Brother
by Cory Doctorow
Published 29 Apr 2008

They hopped from Xbox to Xbox until they found one that was connected to the Internet, then they injected their material as undecipherable, encrypted data. No one could tell which of the Internet's packets were Xnet and which ones were just plain old banking and e-commerce and other encrypted communication. You couldn't find out who was tying the Xnet, let alone who was using the Xnet. But what about Dad's "Bayesian statistics?" I'd played with Bayesian math before. Darryl and I once tried to write our own better spam filter and when you filter spam, you need Bayesian math. Thomas Bayes was an 18th century British mathematician that no one cared about until a couple hundred years after he died, when computer scientists realized that his technique for statistically analyzing mountains of data would be super-useful for the modern world's info-Himalayas.

pages: 755 words: 121,290

Statistics hacks
by Bruce Frey
Published 9 May 2006

He earned his Bachelor's degree in educational research and psychology from Bucknell University in 2000, and his Doctorate in psychometric methods from the University of Massachusetts, Amherst in 2004. His primary research interest is in the application of mathematical models to psychometric data, including the use of Bayesian statistics for solving practical measurement problems. He also enjoys applying his knowledge of statistics and probability to everyday situations, such as playing poker against the author of this book! Acknowledgments I'd like to thank all the contributors to this book, both those who are listed in the "Contributors" section and those who helped with ideas, reviewed the manuscript, and provided suggestions of sources and resources.

pages: 415 words: 125,089

Against the Gods: The Remarkable Story of Risk
by Peter L. Bernstein
Published 23 Aug 1996

Slovic, Paul, Baruch Fischoff, and Sarah Lichtenstein, 1990. "Rating the Risks." In Glickman and Gough, 1990, pp. 61-75. Smith, Clifford W., Jr., 1995. "Corporate Risk Management: Theory and Practice." Journal of Derivatives, Summer, pp. 21-30. Smith, M. F. M., 1984. "Present Position and Potential Developments: Some Personal Views of Bayesian Statistics." Journal of the Royal Statistical Association, Vol. 147, Part 3, pp. 245-259. Smithson, Charles W., and Clifford W. Smith, Jr., 1995. Managing Financial Risk: A Guide to Derivative Products, Financial Engineering, and Value Maximization. New York: Irwin.* Sorensen, Eric, 1995. "The Derivative Portfolio Matrix-Combining Market Direction with Market Volatility."

pages: 483 words: 141,836

Red-Blooded Risk: The Secret History of Wall Street
by Aaron Brown and Eric Kim
Published 10 Oct 2011

Just as Archimedes claimed that with a long enough lever he could move the earth, I claim that with a big enough numeraire, I can make any faith-based action seem reasonable. Frequentist statistics suffers from paradoxes because it doesn’t insist everything be stated in moneylike terms, without which there’s no logical connection between frequency and degree of belief. Bayesian statistics suffers from insisting on a single, universal numeraire, which is often not appropriate. One thing we know about money is that it can’t buy everything. One thing we know about people is they have multiple natures, and groups of people are even more complicated. There are many numeraires, more than there are people.

No Slack: The Financial Lives of Low-Income Americans
by Michael S. Barr
Published 20 Mar 2012

Romich, Jennifer, Sarah Gordon, and Eric N. Waithaka. 2009. “A Tool for Getting By or Getting Ahead? Consumers’ Views on Prepaid Cards.” Working Paper 2009-WP-09. Terre Haute: Indiana State University, Networks Financial Institute (http://ssrn.com/ abstract=1491645). Rossi, Peter E., Greg M. Allenby, and Robert McCulloch. 2005. Bayesian Statistics and Marketing. West Sussex, U.K.: John Wiley & Sons. Sawtooth Software. 2008. “Proceedings of the Sawtooth Software Conference, October 2007” (www.sawtoothsoftware.com/download/techpap/2007Proceedings.pdf ). Seidman, Ellen, Moez Hababou, and Jennifer Kramer. 2005. A Financial Services Survey of Low- and Moderate-Income Households.

pages: 523 words: 143,139

Algorithms to Live By: The Computer Science of Human Decisions
by Brian Christian and Tom Griffiths
Published 4 Apr 2016

distilled down to a single estimate: Laplace’s Law is derived by working through the calculation suggested by Bayes—the tricky part is the sum over all hypotheses, which involves a fun application of integration by parts. You can see a full derivation of Laplace’s Law in Griffiths, Kemp, and Tenenbaum, “Bayesian Models of Cognition.” From the perspective of modern Bayesian statistics, Laplace’s Law is the posterior mean of the binomial rate using a uniform prior. If you try only once and it works out: You may recall that in our discussion of multi-armed bandits and the explore/exploit dilemma in chapter 2, we also touched on estimates of the success rate of a process—a slot machine—based on a set of experiences.
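The statement that Laplace's Law is the posterior mean under a uniform prior can be checked with a short sketch. A uniform prior on the success rate is a Beta(1, 1) distribution; after w successes in n attempts the posterior is Beta(w + 1, n - w + 1), whose mean is (w + 1) / (n + 2). The function name below is hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def laplace_estimate(successes, attempts):
    """Posterior mean of a binomial rate under a uniform prior:
    (successes + 1) / (attempts + 2)."""
    if attempts < 0 or not 0 <= successes <= attempts:
        raise ValueError("need 0 <= successes <= attempts")
    return Fraction(successes + 1, attempts + 2)


if __name__ == "__main__":
    print(laplace_estimate(1, 1))  # 2/3: one try, one success
    print(laplace_estimate(0, 0))  # 1/2: no data, so the prior mean
    print(laplace_estimate(3, 3))  # 4/5
```

Using exact fractions keeps the results readable; with no data at all the estimate falls back to 1/2, the mean of the uniform prior.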

pages: 574 words: 164,509

Superintelligence: Paths, Dangers, Strategies
by Nick Bostrom
Published 3 Jun 2014

They also provide important insight into the concept of causality.28 One advantage of relating learning problems from specific domains to the general problem of Bayesian inference is that new algorithms that make Bayesian inference more efficient will then yield immediate improvements across many different areas. Advances in Monte Carlo approximation techniques, for example, are directly applied in computer vision, robotics, and computational genetics. Another advantage is that it lets researchers from different disciplines more easily pool their findings. Graphical models and Bayesian statistics have become a shared focus of research in many fields, including machine learning, statistical physics, bioinformatics, combinatorial optimization, and communication theory.35 A fair amount of the recent progress in machine learning has resulted from incorporating formal results originally derived in other academic fields.

pages: 579 words: 183,063

Tribe of Mentors: Short Life Advice From the Best in the World
by Timothy Ferriss
Published 14 Jun 2017

Often, I end up realizing that those things aren’t important and I just forget about them forever. What is one of the best or most worthwhile investments you’ve ever made? Lots of time spent doing math and philosophy has paid off and will continue to pay off, I have (almost) no doubt. Questioning the foundation of Bayesian statistics has been a very valuable process. Reworking definitions and impossibility results from consensus literature has been equally valuable. What purchase of $100 or less has most positively impacted your life in the last six months (or in recent memory)? An audio lecture series on institutional economics called “International Economic Institutions: Globalism vs.

Statistics in a Nutshell
by Sarah Boslaugh
Published 10 Nov 2012

Bayes studied logic and theology at the University of Edinburgh and earned his livelihood as a minister in Holborn and Tunbridge Wells, England. However, his fame today rests on his theory of probability, which was developed in his essay, published after his death by the Royal Society of London. There is an entire field of study today known as Bayesian statistics, which is based on the notion of probability as a statement of strength of belief rather than as a frequency of occurrence. However, it is uncertain whether Bayes himself would have embraced this definition because he published relatively little on mathematics during his lifetime. Enough Exposition, Let’s Do Some Statistics!

pages: 685 words: 203,949

The Organized Mind: Thinking Straight in the Age of Information Overload
by Daniel J. Levitin
Published 18 Aug 2014

For every 5 people who take the treatment, 1 will be cured (because that person actually has the disease) and .25 will have the side effects. In this case, with two tests, you’re now about 4 times more likely to experience the cure than the side effects, a nice reversal of what we saw before. (If it makes you uncomfortable to talk about .25 of a person, just multiply all the numbers above by 4.) We can take Bayesian statistics a step further. Suppose a newly published study shows that if you are a woman, you’re ten times more likely to get the disease than if you’re a man. You can construct a new table to take this information into account, and to refine the estimate that you actually have the disease. The calculations of probabilities in real life have applications far beyond medical matters.

pages: 1,737 words: 491,616

Rationality: From AI to Zombies
by Eliezer Yudkowsky
Published 11 Mar 2015

Kruschke and Yudkowsky have replied that frequentism is even more “subjective” than Bayesianism, because frequentism’s probability assignments depend on the intentions of the experimenter.10 Importantly, this philosophical disagreement shouldn’t be conflated with the distinction between Bayesian and frequentist data analysis methods, which can both be useful when employed correctly. Bayesian statistical tools have become cheaper to use since the 1980s, and their informativeness, intuitiveness, and generality have come to be more widely appreciated, resulting in “Bayesian revolutions” in many sciences. However, traditional frequentist methods remain more popular, and in some contexts they are still clearly superior to Bayesian approaches.

You can’t avoid assigning a probability to the mathematician making one statement or another. You’re just assuming the probability is 1, and that’s unjustified.” To which the one replied, “Yes, that’s what the Bayesians say. But frequentists don’t believe that.” And I said, astounded: “How can there possibly be such a thing as non-Bayesian statistics?” That was when I discovered that I was of the type called “Bayesian.” As far as I can tell, I was born that way. My mathematical intuitions were such that everything Bayesians said seemed perfectly straightforward and simple, the obvious way I would do it myself; whereas the things frequentists said sounded like the elaborate, warped, mad blasphemy of dreaming Cthulhu.

pages: 848 words: 227,015

On the Edge: The Art of Risking Everything
by Nate Silver
Published 12 Aug 2024

In a normal distribution, 68 percent, 95 percent, and 99.7 percent of the data, respectively, is within one, two and three standard deviations of the mean, so someone whose IQ is said to be three standard deviations above the mean is very smart indeed. Statistically significant: Unlikely to be due to chance. In classical statistics, it means that the null hypothesis can be rejected with a specified probability, usually 95 percent. The term is falling out of favor in the River as a result of the adaptation of Bayesian statistics and the replication crisis, the failure of many published academic findings using classical statistics to be verified by other researchers. Steam chasing: In sports betting, the practice of following steam—changes in betting prices at bookmakers that you believe reflect the action of sharp bettors but which have not yet been incorporated by other bookmakers.
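The 68, 95, and 99.7 percent figures quoted in the glossary entry on the normal distribution above can be reproduced from the error function: for a normal variable, the probability of landing within k standard deviations of the mean is erf(k / sqrt(2)). The sketch below uses only the Python standard library; the function name is an illustrative choice.

```python
import math


def normal_coverage(k):
    """Probability that a normally distributed value lies within
    k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2.0))


if __name__ == "__main__":
    for k in (1, 2, 3):
        print(f"within {k} standard deviation(s): {normal_coverage(k):.4f}")
```

The three values round to 0.6827, 0.9545, and 0.9973, matching the percentages in the excerpt.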

pages: 827 words: 239,762

The Golden Passport: Harvard Business School, the Limits of Capitalism, and the Moral Failure of the MBA Elite
by Duff McDonald
Published 24 Apr 2017

In short, their work opened up just about any business problem to mathematical analysis, without necessarily sacrificing expert opinion in the process. In 1959, Schlaifer published Probability and Statistics for Business Decisions, and in 1961, Raiffa and Schlaifer coauthored Applied Statistical Decision Theory, which “set the direction of Bayesian statistics for the next two decades.”10 But this was geeky stuff, especially for the more “broad-gauged” crowd at HBS. So even if the School was trying as hard as it could to keep up with the GSIAs of the world, it still felt a need to apologize for getting too geeky with Applied Statistical Decision Theory.

pages: 764 words: 261,694

The Elements of Statistical Learning (Springer Series in Statistics)
by Trevor Hastie, Robert Tibshirani and Jerome Friedman
Published 25 Aug 2009

A modified principal component technique based on the lasso, Journal of Computational and Graphical Statistics 12: 531–547. Jones, L. (1992). A simple lemma on greedy approximation in Hilbert space and convergence rates for projection pursuit regression and neural network training, Annals of Statistics 20: 608–613. Jordan, M. (2004). Graphical models, Statistical Science (Special Issue on Bayesian Statistics) 19: 140–155. Jordan, M. and Jacobs, R. (1994). Hierarchical mixtures of experts and the EM algorithm, Neural Computation 6: 181–214. Kalbfleisch, J. and Prentice, R. (1980). The Statistical Analysis of Failure Time Data, Wiley, New York. Kaufman, L. and Rousseeuw, P. (1990). Finding Groups in Data: An Introduction to Cluster Analysis, Wiley, New York.