Therac-25


description: Radiotherapy machine involved in six accidents

10 results

Humble Pi: A Comedy of Maths Errors
by Matt Parker
Published 7 Mar 2019

So if the electron beam’s power was increased, it was vital to make sure the metal target and a collimator (a filter to shape the X-ray beam) had been placed in between the electron beam and the patient. For this, and a host of other safety reasons, the Therac-25 looped through a piece of set-up code, and only if all the systems were verified as being in the correct settings could the beam be turned on. The software had a number stored with the catchy name of Class3 (that’s just how creative programmers can be when naming their variables). Only after the Therac-25 machine had verified that everything was safe would it set Class3 = 0. To make sure that it was checked every time, the set-up loop code would add one to Class3 at the beginning of each loop so it started at non-zero.

So every 256th time the set-up loop ran, Class3 would be set to zero, not because the machine was safe but merely because the value had rolled over from 255 back to zero. This means that roughly 0.4 per cent of the time a Therac-25 machine would skip running Chkcol because Class3 was already set to zero, as if the collimator had already been checked and verified as being in the correct position. For a mistake with such deadly consequences, 0.4 per cent is a terrifying amount of time. On 17 January 1987 in Yakima Valley Memorial Hospital in Washington State, US (now Virginia Mason Memorial), a patient was due to receive eighty-six rads from a Therac-25 machine (rads is an antiquated unit of radiation absorption). Before the patient was to receive their dose of X-rays, however, the metal target and collimator had been moved out of the way so the machine could be aligned using normal visible light.
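
To see the mechanism concretely, here is a minimal Python sketch of the flag logic described above. The names Class3 and Chkcol come from the excerpt; the loop structure is simplified and the reset after a successful check is omitted, so this is illustrative, not the actual machine code.

```python
# Illustrative sketch (not the real Therac-25 code): Class3 lives in one byte,
# so the increment silently wraps from 255 back to 0 every 256th pass.

class3 = 0
skipped_checks = 0

def chkcol() -> None:
    """Stand-in for the collimator/target position check."""
    pass  # in the real machine this verified the beam path was safe

for iteration in range(512):
    class3 = (class3 + 1) & 0xFF   # one-byte increment: 255 + 1 wraps to 0

    if class3 != 0:
        chkcol()                    # non-zero means "not yet verified": run the check
    else:
        skipped_checks += 1         # zero is read as "already verified", so no check

print(skipped_checks)               # 2: iterations 256 and 512 skip the check entirely
```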

This was no longer required for the Ariane 5, but it lived on as a piece of vestigial code. In general, reusing code without retesting can cause all sorts of problems. Remember the Therac-25 radiation therapy machine, which had a 256-roll-over problem and accidentally overdosed people? During the course of the resulting investigation it was found that its predecessor, the Therac-20, had the same issues in its software, but it had physical safety locks to stop overdoses, so no one ever noticed the programming error. The Therac-25 reused code but did not have those physical checks, so the roll-over error was able to manifest itself in disaster. If there is any moral to this story, it’s that, when you are writing code, remember that someone may have to comb through it and check everything when it is being repurposed in the future.

Robot Futures
by Illah Reza Nourbakhsh
Published 1 Mar 2013

All these axes of complexity make the resulting system errors less clearly understandable and less accountable, with no one ever directly or solely responsible for the behavior of a complex robotic system.

Brainspotting 101

Technology ethics and design courses frequently study the tragedy of the Therac-25 to understand how much can go wrong when poor design, incorrect training, and simple errors are compounded (Leveson and Turner 1993). The Therac-25 was a radiation therapy machine that provided focused radiation to cancer victims to destroy malignant tumors by rapidly moving a high-energy radiation beam. The nurse-operator of the machine would configure the machine for a customized treatment pattern, then launch its autonomous radiation therapy mode.

In the rare event that the operator entered the mode incorrectly, then backed up in the interface and corrected the entry within eight seconds, the machine would configure to an incorrect internal setting, where it would deliver one hundred times the intended dose of radiation, inducing massive pain in the patient and, eventually, killing patients through radiation sickness. Many aspects of the Therac-25 therapy process are partially to blame for this. The interface was poorly designed, making incorrect data entry easy. Training for the operators was lightweight, and the nurses afforded the expensive, fancy machines more authority than the machines deserved. When the patients complained of pain during the procedure, the nurses would discount this complaint because the machine indicated that all was well.
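
The failure mode described here is a check-then-act race: the machine reads the treatment entry, spends several seconds configuring hardware, and never notices that the entry changed in the meantime. A deliberately simplified Python sketch of that general pattern follows; the eight-second window comes from the excerpt, while the names, timings, and structure below are invented for illustration.

```python
import threading
import time

settings = {"mode": "xray"}   # shared, mutable treatment entry (illustrative only)

def setup_and_fire():
    mode_at_check = settings["mode"]   # check: read the entry once at set-up start
    time.sleep(0.5)                    # stand-in for the ~8 seconds of hardware set-up
    # act: the hardware was positioned for mode_at_check, but the entry may have changed
    print("hardware positioned for:", mode_at_check, "| entry now reads:", settings["mode"])

def operator_edit():
    time.sleep(0.1)                    # operator backs up and corrects the entry mid-set-up
    settings["mode"] = "electron"

t1 = threading.Thread(target=setup_and_fire)
t2 = threading.Thread(target=operator_edit)
t1.start(); t2.start()
t1.join(); t2.join()
# Output shows the hardware configuration and the recorded entry disagreeing,
# the kind of inconsistent internal state the passage describes.
```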

Kelly, Kevin. 2010. What Technology Wants. New York: Viking Press.
Kurzweil, Ray. 2006. The Singularity Is Near: When Humans Transcend Biology. New York: Penguin Group.
Lerner, Steve. 2010. Sacrifice Zones. Cambridge, MA: MIT Press.
Leveson, N. G., and C. S. Turner. 1993. “An Investigation of the Therac-25 Accidents.” Computer 26 (7): 18–41.
Lewis, M., S. Carpin, and S. Balakirsky. 2009. “Virtual Robots RoboCupRescue Competition: Contributions to Infrastructure and Science.” In Proceedings of IJCAI Workshop on Competitions in Artificial Intelligence and Robotics.
Lewis, M., and K. Sycara. 2011. “Network-Centric Control for Multirobot Teams in Urban Search and Rescue.”

pages: 222 words: 53,317

Overcomplicated: Technology at the Limits of Comprehension
by Samuel Arbesman
Published 18 Jul 2016

We next turn to the social and biological limits of human comprehension: the reasons why our brains—and our societies—are particularly bad at dealing with these complex systems, no matter how hard we try. Chapter 3 LOSING THE BUBBLE In 1985, a patient entered a clinic to undergo radiation treatment for cancer of the cervix. The patient was prepared for treatment, and the operator of the large radiation machine known as the Therac-25 proceeded with radiation therapy. The machine responded with an error message, as well as noting that “no dose” had been administered. The operator tried again, with the same result. The operator tried three more times, for a total of five attempts, and each time the machine returned an error and responded that no radiation dosage had been delivered.

Several months later, the patient died of her cancer. It was discovered that she had suffered horrible radiation overexposure—her hip would have needed to be replaced—despite the machine’s having indicated that no dose of radiation was delivered. This was not the only instance of this radiation machine malfunctioning. In the 1980s, the Therac-25 failed for six patients, irradiating them with many times the dose they should have received. Damage from the massive radiation overdoses killed some of these people. These overdoses were considered the worst failures in the history of this type of machine. Could these errors have been prevented, or at least minimized?

This report implies a lack of awareness on the part of its makers that software could have a deadly complexity and be responsible for a radiation overdose. Software bugs are a fact of life, and yet the safety analysis almost completely ignored the risks they present. The people responsible for ensuring the safety of the Therac-25 misunderstood technological complexity, with lethal consequences. In hindsight it’s almost easy to see where they went wrong: they downplayed the importance of whole portions of the constructed system, and the result was a catastrophic failure. However, it’s more and more difficult to diagnose these kinds of problems in new technology.

Engineering Security
by Peter Gutmann

Just how dangerous it can be to assign arbitrary probabilities to events, in this instance for a fault tree, was illustrated in the design of the Therac-25 medical electron accelerator. This led to what has been described as the worst series of radiation accidents in the 35-year history of medical accelerators [131], with patients receiving estimated doses as high as 25,000 rads (a normal dose from the machine was under 200 rads, and 500 rads is the generally accepted lethal dose for full-body radiation, although the Therac-25 only affected one small area, which is often less radiosensitive than the body as a whole). The analysis had assigned probabilities of 1×10⁻¹¹ to “Computer selects wrong energy” and 4×10⁻⁹ to “Computer selects wrong mode”, with no explanation of how these values were obtained [132].
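
To see why those assigned figures foreclose any further worry, here is a back-of-the-envelope Python sketch. The two leaf probabilities are the ones quoted above; the gate structure and the independence assumption are illustrative and are not taken from the actual Therac-25 fault tree.

```python
p_wrong_energy = 1e-11   # "Computer selects wrong energy" (figure quoted above)
p_wrong_mode = 4e-9      # "Computer selects wrong mode" (figure quoted above)

# OR gate: probability that either software fault occurs, assuming independence
p_either = 1 - (1 - p_wrong_energy) * (1 - p_wrong_mode)

# AND gate: probability that both faults occur together, again assuming independence
p_both = p_wrong_energy * p_wrong_mode

print(f"either fault: {p_either:.2e}")   # ~4.01e-09
print(f"both faults:  {p_both:.2e}")     # ~4.00e-20
```

With inputs that small, any combination of branches looks negligible on paper, which is exactly the danger of assigning arbitrary probabilities that the passage points out.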

“Integrating Cyber Attacks within Fault Trees”, Igor Fovino, Marcelo Masera and Alessio De Cian, Reliability Engineering & System Safety, Vol.94, No.9 (September 2009), p.1394.
“’Rear Guard’ Security Podcast: Interview with Bob Blakley”, Bob Blakley, July 2009, http://www.rearguardsecurity.com/episodes/rearguard_security_6.mp3.
“Demystifying the Threat-modelling Process”, Peter Torr, IEEE Security and Privacy, Vol.3, No.5 (September/October 2005), p.66.
[131] “Report on the Therac-25”, J. Rawlinson, OCTRF/OCI Physicists Meeting, 7 May 1987.
[132] “An Investigation of the Therac-25 Accidents”, Nancy Leveson and Clark Turner, IEEE Computer, Vol.26, No.7 (July 1993), p.18.
[133] “Designing Safety-Critical Computer Systems”, William Dunn, IEEE Computer, Vol.36, No.11 (November 2003), p.40.
[134] “Managing Attack Graph Complexity through Visual Hierarchical Aggregation”, Steven Noel and Sushil Jajodia, Proceedings of the Workshop on Visualization and Data Mining for Computer Security (VizSEC’04), October 2004, p.109.
[135] “Multiple Coordinated Views for Network Attack Graphs”, Steven Noel, Michael Jacobs, Pramod Kalapa and Sushil Jajodia, Proceedings of the Workshop on Visualization for Computer Security (VizSEC’05), October 2005, p.99.
[136] “An Interactive Attack Graph Cascade and Reachability Display”, Leevar Williams, Richard Lippmann and Kyle Ingols, Proceedings of the Workshop on Visualization for Computer Security (VizSEC’07), October 2007, p.221.
[137] “GARNET: A Graphical Attack Graph and Reachability Network Evaluation Tool”, Leevar Williams, Richard Lippmann and Kyle Ingols, Proceedings of the 5th Workshop on Visualization for Computer Security (VizSEC’08), September 2008, p.44.
[138] “Identifying Critical Attack Assets in Dependency Attack Graphs”, Reginald Sawilla and Xinming Ou, Proceedings of the 13th European Symposium on Research in Computer Security (ESORICS’08), Springer-Verlag LNCS No.5283, October 2008, p.18.
[139] “Modeling Modern Network Attacks and Countermeasures Using Attack Graphs”, Kyle Ingols, Matthew Chu, Richard Lippmann, Seth Webster and Stephen Boyer, Proceedings of the 25th Annual Computer Security Applications Conference (ACSAC’09), December 2009, p.117.
[140] “Evaluating Network Security With Two-Layer Attack Graphs”, Anming Xie, Zhuhua Cai, Cong Tang, Jianbin Hu and Zhong Chen, Proceedings of the 25th Annual Computer Security Applications Conference (ACSAC’09), December 2009, p.127.
[141] “Using Attack Graphs to Design Systems”, Suvajit Gupta and Joel Winstead, IEEE Security and Privacy, Vol.5, No.4 (July/August 2007), p.80.
[142] “The Basics of FMEA”, Robin McDermott, Raymond Mikulak and Michael Beauregard, Productivity Press, 1996.
[143] “Failure Mode and Effect Analysis: FMEA from Theory to Execution”, D.

A particularly notorious instance of user satisficing occurred with the Therac-25 medical electron accelerator, whose control software was modified to allow operators to click their way through the configuration process (or at least hit Enter repeatedly, since the interface was a VT-100 text terminal) after they had complained that the original design, which required them to manually re-enter the values to confirm them, took too long [142]. This (and a host of other design problems) led to situations where patients could be given huge radiation overdoses, resulting in severe injuries and even deaths (the Therac-25 case has gone down in control-system failure history, and is covered in more detail in “Other Threat Analysis Techniques” on page 259).

pages: 394 words: 118,929

Dreaming in Code: Two Dozen Programmers, Three Years, 4,732 Bugs, and One Quest for Transcendent Software
by Scott Rosenberg
Published 2 Jan 2006

“In all of modern technology”: From a video distributed by the Software Engineering Institute, available at http://www.sei.cmu.edu/videos/watts/DPWatts.mov. Minasi, The Software Conspiracy. The Mariner 1 bug is described at http://nssdc.gsfc.nasa.gov/nmc/tmp/MARIN1.htm. James Gleick tells the story of the Ariane 5 bug at http://www.around.com/ariane.htm. The Therac-25 bug is detailed in a paper by Nancy Leveson and Clark S. Turner in IEEE Computer, July 1993, at http://courses.cs.vt.edu/~cs3604/lib/Therac_25/Therac_1.htm. The 1991 Patriot missile bug is well documented, for instance at http://www.cs.usyd.edu.au/~alum/patriot_bug.htm. Jon Ogg’s talk was at the Systems & Software Technology Conference, Salt Lake City, April 2004.

In June 1996, the European Space Agency’s $500 million unmanned Ariane 5 rocket exploded forty seconds after liftoff because of a bug in the software that controlled its guidance system. (It tried to convert a 64-bit variable to a 16-bit variable, but the number was too high, the conversion overflowed, and the system froze.) From 1985 to 1987 a radiation therapy machine named the Therac-25 delivered massive X-ray overdoses to a half-dozen patients because of software flaws. During the 1991 Gulf War, a battery of American Patriot missiles failed to fire against incoming Scud missiles; the enemy missile hit a U.S. barracks, leaving twenty-eight dead. Investigations found that the software’s calculations had a defect that compounded over time, and after one hundred hours of continuous use, the Patriot’s figures were so far off, it simply didn’t fire.
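
The conversion failure described in the first sentence is easy to reproduce in miniature. Below is a hedged Python sketch of the same class of error; the real Ariane 5 code was Ada and operated on the horizontal-bias value, so the variable name and the input here are purely illustrative.

```python
import struct

horizontal_bias = 65_537   # illustrative value, too large for a signed 16-bit integer

try:
    # ">h" packs a big-endian signed 16-bit integer; anything outside
    # -32768..32767 cannot be represented and raises an error
    struct.pack(">h", horizontal_bias)
except struct.error as exc:
    # In the Ariane 5 software the equivalent operand error went unhandled
    # and the guidance system shut itself down
    print("16-bit conversion overflow:", exc)
```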

pages: 1,172 words: 114,305

New Laws of Robotics: Defending Human Expertise in the Age of AI
by Frank Pasquale
Published 14 May 2020

Given a decade of research on algorithmic accountability, neither justification should immunize such firms. We all now know that algorithms can harm people.28 Moreover, lawyers have grappled with the problem of malfunctioning computers for decades, dating back at least to the autopilot crashes of the 1950s and the Therac-25 debacle of the 1980s (when a software malfunction caused tragic overdoses of radiation).29 Nevertheless, some proposals would severely diminish the role of courts in the AI field, preempting their traditional role in assigning blame for negligent conduct. Others would kneecap federal regulatory agencies, leaving it up to judges to determine remedies appropriate for accidents.

See education television, 30, 96, 97, 115; robots portrayed on, 200 tenure, 141, 187 terrorism, 21, 147, 154, 160; databases for, 127; and drones, 3, 164; and facial recognition, 128; and non-state actors, 162; and online media, 98; and “terror capitalism,” 166–167. See also bioterrorism; 9 / 11 terrorist attacks Therac-25, 40 Thomas, Raymond, 241n69 Thrall, James, 42 Three Body Problem (Liu), 209 Tokui, Nao, 219 Tokyo University, 68–69 Toyama, Kentaro, 82 Toyota, 6 transportation, 6, 21, 177, 179, 182, 192, 193, 207. See also cars; self-driving cars; Uber Trump, Donald, 93, 94, 161 Tsai Ing-wen, 160 Tufekci, Zeynep, 160 Tumblr, 104 Turing, Alan, 203, 208, 210–211; and the Turing test, 203, 218, 255n32 Turkle, Sherry, 51–52, 80 Twitter, 12, 48, 90, 94, 100, 132 2001: A Space Odyssey, 210 Uber, 177, 207, 227 UBI.

pages: 351 words: 123,876

Beautiful Testing: Leading Professionals Reveal How They Improve Software (Theory in Practice)
by Adam Goucher and Tim Riley
Published 13 Oct 2009

id=6962

References

Chernak, Y. 2001. “Validating and Improving Test-Case Effectiveness.” IEEE Software, 18(1): 81–86.
Kidwell, P. A. 1998. “Stalking the Elusive Computer Bug.” Annals of the History of Computing, 20: 5–9.
McPhee, N. “Therac-25 accidents,” http://www.morris.umn.edu/~mcphee/Courses/Readings/Therac_25_accidents.html.
Smithsonian National Museum of American History. “Log Book With Computer Bug,” http://americanhistory.si.edu/collections/object.cfm?key=35&objkey=30.
Tzu, Sun. The Art of War. Trans. Lionel Giles. http://www.gutenberg.org/etext/132.

CHAPTER SEVEN: Beautiful XMPP Testing, by Remko Tronçon

At my first job interview, one of the interviewers asked me if I knew what “unit testing” was and whether I had used it before.

pages: 239 words: 64,812

Geek Sublime: The Beauty of Code, the Code of Beauty
by Vikram Chandra
Published 7 Nov 2013

A bug can exist for half a century despite our best efforts to exterminate it.17 That software algorithms are now running our whole world means that software faults or errors can send us down the wrong highway, injure or kill people, and cause disasters. Every programmer is familiar with the most infamous bugs: the French Ariane 5 rocket that went off course and self-destructed forty seconds after lift-off because of an error in converting between representations of number values; the Therac-25 radiation therapy machine that reacted to a combination of operator input and a “counter overflow” by delivering doses of radiation a hundred times more intense than required, resulting in the agonizing deaths of five people and injuries to many others; the “Flash Crash” of 2010, when the Dow Jones suddenly plunged a thousand points and recovered just as suddenly, apparently as a result of automatic trading programs reacting to a single large trade.

pages: 931 words: 79,142

Concepts, Techniques, and Models of Computer Programming
by Peter Van-Roy and Seif Haridi
Published 15 Feb 2004

More complicated programs have many more possible interleavings. Programming with concurrency and state together is largely a question of mastering the interleavings. In the history of computer technology, many famous and dangerous bugs were due to designers not realizing how difficult this really is. The Therac-25 radiation therapy machine is an infamous example. Because of concurrent programming errors, it sometimes gave its patients radiation doses that were thousands of times greater than normal, resulting in death or serious injury [128]. This leads us to a first lesson for programming with state and concurrency: if at all possible, do not use them together!
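
As a concrete illustration of the interleavings the authors have in mind, here is a small Python sketch (a generic counter, not Therac-25 code): two threads perform a read-modify-write on shared state, and an ill-timed switch between the read and the write silently loses updates.

```python
import threading
import time

counter = 0   # shared, mutable state

def unsafe_increment(times: int) -> None:
    global counter
    for _ in range(times):
        value = counter       # read
        time.sleep(0)         # yield: the other thread may run right here
        counter = value + 1   # write back a possibly stale value

threads = [threading.Thread(target=unsafe_increment, args=(1000,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)   # usually well below 2000: interleaved updates were lost
```

Guarding the read-modify-write with a lock removes this particular race, but the authors’ first lesson stands: avoid combining shared state and concurrency where you can.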

In Second International Symposium on Operating Systems, IRIA, October 1978. Reprinted in Operating Systems Review, 13(2), April 1979, pp. 3–19.
[126] Doug Lea. Concurrent Programming in Java. Addison-Wesley, 1997.
[127] Doug Lea. Concurrent Programming in Java, 2nd edition. Addison-Wesley, 2000.
[128] Nancy Leveson and Clark S. Turner. An investigation of the Therac-25 accidents. IEEE Computer, 26(7):18–41, July 1993.
[129] Henry M. Levy. Capability-Based Computer Systems. Digital Press, Bedford, MA, 1984. Available for download from the author.
[130] Henry Lieberman. Using prototypical objects to implement shared behavior in object-oriented systems. In 1st Conference on Object-Oriented Programming Languages, Systems, and Applications (OOPSLA 86), pages 214–223, September 1986.

., 334 task (in concurrency), 780 tautology, 632 TCP (Transmission Control Protocol), 712, 740 technology, xv dangers of concurrency, 21 history of computing, 176 magic, 314 molecular computing, 176 Prolog implementation, 661 reengineering, 522 singularity, 176 software component, 462 synchronous digital, 267 transition to 64-bit, 78 Tel, Gerard, 353 tell operation, 782, 787 temporal logic, 603 temporary failure, 739 term Erlang, 391 Oz, 833 Prolog, 664 termination detection, 276, 382 ping-pong example, 305 failure in declarative program, 245 partial, 243, 338, 804 proof, 449 total, 804 test-driven development, 452 testing declarative programs, 111, 407 dynamic typing, 105 programming in the small, 219 stateful programs, 407 text file, 210 Thalys high-speed train, 382 theorem binomial, 4 Church-Rosser, 331 Gödel’s completeness, 634 Gödel’s incompleteness, 634 halting problem, 681 theorem prover, 117, 634, 662 Therac-25 scandal, 21 thinking machine, 621 third-party independence, 335 32-bit address, 78 32-bit word, 74, 174 this, see self Thompson, D’Arcy Wentworth, 405 thread, 846 declarative model, 233 hanging, 399 interactive interface, 89 introduction, 15 Java, 615 monotonicity property, 239, 781, 782 priority, 253 ready, 239 runnable, 239 suspended, 239 synchronization, 333 thread statement, 241, 785 Thread class (in Java), 616 throughput, 263 thunk, 432 ticket, 480, 714 Connection module, 715 ticking, 307 time complexity, 11 time slice, 252–254 duration, 254 898 Index time-lease mechanism, 480, 734, 738 time-out, 740 Erlang, 391–394 system design, 460 timer protocol, 368 timestamp, 207, 602 timing measurement active object, 379 memory consumption, 173 palindrome product (constraint version), 758 palindrome product (naive version), 629 transitive closure, 471 word frequency, 201 token equality, 418, 714, 723 token passing, 579, 588, 591, 721 token syntax (of Oz), 833 tokenizer, 32, 162 top-down software development, 8, 451 total termination, 804 trade-off asynchronous communication vs. fault confinement, 745 compilation time vs. execution efficiency, 457 compositional vs. noncompositional design, 461 dynamic vs. static scoping, 58 dynamic vs. static typing, 104 explicit state vs. implicit state, 315, 409 expressiveness vs. execution efficiency, 116 expressiveness vs. manipulability, 681 functional decomposition vs. type decomposition, 542 helper procedure placement, 120 indexed collections, 435 inheritance vs. component composition, 462, 492 kernel language design, 844 language design, 811 lazy vs. eager execution, 329 memory use vs. execution speed, 177 names vs. atoms, 510 nonstrictness vs. explicit state, 331, 344 objects vs.

pages: 1,201 words: 233,519

Coders at Work
by Peter Seibel
Published 22 Jun 2009

Eich: So a blue-collar language like Java shouldn't have a crazy generic system because blue-collar people can't figure out what the hell the syntax means with covariant, contravariant type constraints. Certainly I've experienced some toe loss due to C and C++'s foot guns. Part of programming is engineering; part of engineering is working out various safety properties, which matter. Doing a browser they matter. They matter more if you're doing the Therac-25. Though that was more a thread-scheduling problem, as I recall. But even then, you talk about better languages for writing concurrent programs or exploiting hardware parallelism. We shouldn't all be using synchronized blocks—we certainly shouldn't be using mutexes or spin locks. So the kind of leverage you can get through languages may involve trade-offs where you say, “I'm going, for safety, to sacrifice some expressiveness.”