description: type of code injection, used to attack vulnerable data-driven software applications
The Web Application Hacker's Handbook: Finding and Exploiting Security Flaws
by Dafydd Stuttard and Marcus Pinto
Published 30 Sep 2007
First, database components may themselves contain SQL injection flaws. Second, user input may be passed to potentially dangerous functions in unsafe ways. SQL Injection: Chapter 9 described how prepared statements can be used as a safe alternative to dynamic SQL statements to prevent SQL injection attacks. However, even if prepared statements are properly used throughout the web application's own code, SQL injection flaws may still exist if database code components construct queries from user input in an unsafe manner. The following is an example of a stored procedure that is vulnerable to SQL injection in the @name parameter: CREATE PROCEDURE show_current_orders (@name varchar(400) = NULL) AS DECLARE @sql nvarchar(4000) SELECT @sql = 'SELECT id_num, searchstring FROM searchorders WHERE ' + 'searchstring = ''' + @name + ''''; EXEC (@sql) GO Even if the application passes the user-supplied name value to the stored procedure in a safe manner, the procedure itself concatenates this directly into a dynamic query and therefore is vulnerable.
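The difference between concatenating input into a dynamic query and binding it as a parameter can be sketched in a few lines. This is only an analogue of the stored procedure above, written in Python against an in-memory SQLite database rather than T-SQL; the sample rows and the function names show_orders_unsafe and show_orders_safe are assumptions made for illustration.

```python
import sqlite3

# Hypothetical stand-in for the searchorders table used by the procedure.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE searchorders (id_num INTEGER, searchstring TEXT)")
conn.executemany(
    "INSERT INTO searchorders VALUES (?, ?)",
    [(1, "widgets"), (2, "gadgets"), (3, "secret")],
)

def show_orders_unsafe(name):
    # Mirrors the vulnerable pattern: the input becomes part of the SQL text.
    sql = ("SELECT id_num, searchstring FROM searchorders "
           "WHERE searchstring = '" + name + "'")
    return conn.execute(sql).fetchall()

def show_orders_safe(name):
    # The input is bound as a value and is never parsed as SQL.
    sql = ("SELECT id_num, searchstring FROM searchorders "
           "WHERE searchstring = ?")
    return conn.execute(sql, (name,)).fetchall()

payload = "x' OR '1'='1"
unsafe_rows = show_orders_unsafe(payload)  # the OR clause matches every row
safe_rows = show_orders_safe(payload)      # no row has this literal string
print(len(unsafe_rows), len(safe_rows))
```

In a real stored procedure the same idea would normally be expressed with the database's own parameterized dynamic SQL facility; the sketch only shows why the concatenated form is the weak point.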
…
Contents
Introduction
Chapter 1 Web Application (In)security: The Evolution of Web Applications; Common Web Application Functions; Benefits of Web Applications; Web Application Security; "This Site Is Secure"; The Core Security Problem: Users Can Submit Arbitrary Input; Key Problem Factors; The New Security Perimeter; The Future of Web Application Security; Summary
Chapter 2 Core Defense Mechanisms: Handling User Access; Authentication; Session Management; Access Control; Handling User Input; Varieties of Input; Approaches to Input Handling; Boundary Validation; Multistep Validation and Canonicalization; Handling Attackers; Handling Errors; Maintaining Audit Logs; Alerting Administrators; Reacting to Attacks
Chapter 3
Chapter 4
Chapter 5 Bypassing Client-Side Controls: Transmitting Data Via the Client; Hidden Form Fields; HTTP Cookies; URL Parameters; The Referer Header; Opaque Data; The ASP.NET ViewState; Capturing User Data: HTML Forms; Length Limits; Script-Based Validation; Disabled Elements; Capturing User Data: Browser Extensions; Common Browser Extension Technologies; Approaches to Browser Extensions; Intercepting Traffic from Browser Extensions; Decompiling Browser Extensions; Attaching a Debugger; Native Client Components; Handling Client-Side Data Securely; Transmitting Data Via the Client; Validating Client-Generated Data; Logging and Alerting; Summary; Questions
Chapter 6 Attacking Authentication: Authentication Technologies; Design Flaws in Authentication Mechanisms; Bad Passwords; Brute-Forcible Login; Verbose Failure Messages; Vulnerable Transmission of Credentials; Password Change Functionality; Forgotten Password Functionality; "Remember Me" Functionality; User Impersonation Functionality; Incomplete Validation of Credentials; Nonunique Usernames; Predictable Usernames; Predictable Initial Passwords; Insecure Distribution of Credentials; Implementation Flaws in Authentication; Fail-Open Login Mechanisms; Defects in Multistage Login Mechanisms; Insecure Storage of Credentials; Securing Authentication; Use Strong Credentials; Handle Credentials Secretively; Validate Credentials Properly; Prevent Information Leakage; Prevent Brute-Force Attacks; Prevent Misuse of the Password Change Function; Prevent Misuse of the Account Recovery Function; Log, Monitor, and Notify; Summary; Questions
Chapter 7 Attacking Session Management: The Need for State; Alternatives to Sessions; Weaknesses in Token Generation; Meaningful Tokens; Predictable Tokens; Encrypted Tokens; Weaknesses in Session Token Handling; Disclosure of Tokens on the Network; Disclosure of Tokens in Logs; Vulnerable Mapping of Tokens to Sessions; Vulnerable Session Termination; Client Exposure to Token Hijacking; Liberal Cookie Scope; Securing Session Management; Generate Strong Tokens; Protect Tokens Throughout Their Life Cycle; Log, Monitor, and Alert; Summary; Questions
Chapter 8 Attacking Access Controls: Common Vulnerabilities; Completely Unprotected Functionality; Identifier-Based Functions; Multistage Functions; Static Files; Platform Misconfiguration; Insecure Access Control Methods; Attacking Access Controls; Testing with Different User Accounts; Testing Multistage Processes; Testing with Limited Access; Testing Direct Access to Methods; Testing Controls Over Static Resources; Testing Restrictions on HTTP Methods; Securing Access Controls; A Multilayered Privilege Model; Summary; Questions
Chapter 9 Attacking Data Stores: Injecting into Interpreted Contexts; Bypassing a Login; Injecting into SQL; Exploiting a Basic Vulnerability; Injecting into Different Statement Types; Finding SQL Injection Bugs; Fingerprinting the Database; The UNION Operator; Extracting Useful Data; Extracting Data with UNION; Bypassing Filters; Second-Order SQL Injection; Advanced Exploitation; Beyond SQL Injection: Escalating the Database Attack; Using SQL Exploitation Tools; SQL Syntax and Error Reference; Preventing SQL Injection; Injecting into NoSQL; Injecting into MongoDB; Injecting into XPath; Subverting Application Logic; Informed XPath Injection; Blind XPath Injection; Finding XPath Injection Flaws; Preventing XPath Injection; Injecting into LDAP; Exploiting LDAP Injection; Finding LDAP Injection Flaws; Preventing LDAP Injection; Summary; Questions
Chapter 10 Attacking Back-End Components: Injecting OS Commands; Example 1: Injecting Via Perl; Example 2: Injecting Via ASP; Injecting Through Dynamic Execution; Finding OS Command Injection Flaws; Finding Dynamic Execution Vulnerabilities; Preventing OS Command Injection; Preventing Script Injection Vulnerabilities; Manipulating File Paths; Path Traversal Vulnerabilities; File Inclusion Vulnerabilities; Injecting into XML Interpreters; Injecting XML External Entities; Injecting into SOAP Services; Finding and Exploiting SOAP Injection; Preventing SOAP Injection; Injecting into Back-end HTTP Requests; Server-side HTTP Redirection; HTTP Parameter Injection; Injecting into Mail Services; E-mail Header Manipulation; SMTP Command Injection; Finding SMTP Injection Flaws; Preventing SMTP Injection; Summary; Questions
Chapter 11 Attacking Application Logic: The Nature of Logic Flaws; Real-World Logic Flaws; Example 1: Asking the Oracle; Example 2: Fooling a Password Change Function; Example 3: Proceeding to Checkout; Example 4: Rolling Your Own Insurance; Example 5: Breaking the Bank; Example 6: Beating a Business Limit; Example 7: Cheating on Bulk Discounts; Example 8: Escaping from Escaping; Example 9: Invalidating Input Validation; Example 10: Abusing a Search Function; Example 11: Snarfing Debug Messages; Example 12: Racing Against the Login; Avoiding Logic Flaws; Summary; Questions
Chapter 12 Attacking Users: Cross-Site Scripting: Varieties of XSS; Reflected XSS Vulnerabilities; Stored XSS Vulnerabilities; DOM-Based XSS Vulnerabilities; XSS Attacks in Action; Real-World XSS Attacks; Payloads for XSS Attacks; Delivery Mechanisms for XSS Attacks; Finding and Exploiting XSS Vulnerabilities; Finding and Exploiting Reflected XSS Vulnerabilities; Finding and Exploiting Stored XSS Vulnerabilities; Finding and Exploiting DOM-Based XSS Vulnerabilities; Preventing XSS Attacks; Preventing Reflected and Stored XSS; Preventing DOM-Based XSS; Summary; Questions
Chapter 13 Attacking Users: Other Techniques: Inducing User Actions; Request Forgery; UI Redress; Capturing Data Cross-Domain; Capturing Data by Injecting HTML; Capturing Data by Injecting CSS; JavaScript Hijacking; The Same-Origin Policy Revisited; The Same-Origin Policy and Browser Extensions; The Same-Origin Policy and HTML5; Crossing Domains with Proxy Service Applications; Other Client-Side Injection Attacks; HTTP Header Injection; Cookie Injection; Open Redirection Vulnerabilities; Client-Side SQL Injection; Client-Side HTTP Parameter Pollution; Local Privacy Attacks; Persistent Cookies; Cached Web Content; Browsing History; Autocomplete; Flash Local Shared Objects; Silverlight Isolated Storage; Internet Explorer userData; HTML5 Local Storage Mechanisms; Preventing Local Privacy Attacks; Attacking ActiveX Controls; Finding ActiveX Vulnerabilities; Preventing ActiveX Vulnerabilities; Attacking the Browser; Logging Keystrokes; Stealing Browser History and Search Queries; Enumerating Currently Used Applications; Port Scanning; Attacking Other Network Hosts; Exploiting Non-HTTP Services; Exploiting Browser Bugs; DNS Rebinding; Browser Exploitation Frameworks; Man-in-the-Middle Attacks; Summary; Questions
Chapter 14 Automating Customized Attacks: Uses for Customized Automation; Enumerating Valid Identifiers; The Basic Approach; Detecting Hits; Scripting the Attack; JAttack; Harvesting Useful Data; Fuzzing for Common Vulnerabilities; Putting It All Together: Burp Intruder; Barriers to Automation; Session-Handling Mechanisms; CAPTCHA Controls; Summary; Questions
Chapter 15 Exploiting Information Disclosure: Exploiting Error Messages; Script Error Messages; Stack Traces; Informative Debug Messages; Server and Database Messages; Using Public Information; Engineering Informative Error Messages; Gathering Published Information; Using Inference; Preventing Information Leakage; Use Generic Error Messages; Protect Sensitive Information; Minimize Client-Side Information Leakage; Summary; Questions
Chapter 16 Attacking Native Compiled Applications: Buffer Overflow Vulnerabilities; Stack Overflows; Heap Overflows; "Off-by-One" Vulnerabilities; Detecting Buffer Overflow Vulnerabilities; Integer Vulnerabilities; Integer Overflows; Signedness Errors; Detecting Integer Vulnerabilities; Format String Vulnerabilities; Detecting Format String Vulnerabilities; Summary; Questions
Chapter 17 Attacking Application Architecture: Tiered Architectures; Attacking Tiered Architectures; Securing Tiered Architectures; Shared Hosting and Application Service Providers; Virtual Hosting; Shared Application Services; Attacking Shared Environments; Securing Shared Environments; Summary; Questions
Chapter 18 Attacking the Application Server: Vulnerable Server Configuration; Default Credentials; Default Content; Directory Listings; WebDAV Methods; The Application Server as a Proxy; Misconfigured Virtual Hosting; Securing Web Server Configuration; Vulnerable Server Software; Application Framework Flaws; Memory Management Vulnerabilities; Encoding and Canonicalization; Finding Web Server Flaws; Securing Web Server Software; Web Application Firewalls; Summary; Questions
Chapter 19 Finding Vulnerabilities in Source Code: Approaches to Code Review; Black-Box Versus White-Box Testing; Code Review Methodology; Signatures of Common Vulnerabilities; Cross-Site Scripting
Chapter 20: Technical Challenges Faced by Scanners; Current Products; Using a Vulnerability Scanner; Other Tools; Wikto/Nikto; Firebug; Hydra; Custom Scripts; Summary
Chapter 21 A Web Application Hacker's Methodology: General Guidelines; 1 Map the Application's Content; 1.1 Explore Visible Content; 1.2 Consult Public Resources; 1.3 Discover Hidden Content; 1.4 Discover Default Content; 1.5 Enumerate Identifier-Specified Functions; 1.6 Test for Debug Parameters; 2 Analyze the Application; 2.1 Identify Functionality; 2.2 Identify Data Entry Points; 2.3 Identify the Technologies Used; 2.4 Map the Attack Surface; 3 Test Client-Side Controls; 3.1 Test Transmission of Data Via the Client; 3.2 Test Client-Side Controls Over User Input; 3.3 Test Browser Extension Components; 4 Test the Authentication Mechanism; 4.1 Understand the Mechanism; 4.2 Test Password Quality; 4.3 Test for Username Enumeration; 4.4 Test Resilience to Password Guessing; 4.5 Test Any Account Recovery Function; 4.6 Test Any Remember Me Function; 4.7 Test Any Impersonation Function; 4.8 Test Username Uniqueness; 4.9 Test Predictability of Autogenerated Credentials; 4.10 Check for Unsafe Transmission of Credentials; 4.11 Check for Unsafe Distribution of Credentials; 4.12 Test for Insecure Storage; 4.13 Test for Logic Flaws; 4.14 Exploit Any Vulnerabilities to Gain Unauthorized Access; 5 Test the Session Management Mechanism; 5.1 Understand the Mechanism; 5.2 Test Tokens for Meaning; 5.3 Test Tokens for Predictability; 5.4 Check for Insecure Transmission of Tokens; 5.5 Check for Disclosure of Tokens in Logs; 5.6 Check Mapping of Tokens to Sessions; 5.7 Test Session Termination; 5.8 Check for Session Fixation; 5.9 Check for CSRF; 5.10 Check Cookie Scope; 6 Test Access Controls; 6.1 Understand the Access Control Requirements; 6.2 Test with Multiple Accounts; 6.3 Test with Limited Access; 6.4 Test for Insecure Access Control Methods; 7 Test for Input-Based Vulnerabilities; 7.1 Fuzz All Request Parameters; 7.2 Test for SQL Injection; 7.3 Test for XSS and Other Response Injection; 7.4 Test for OS Command Injection; 7.5 Test for Path Traversal; 7.6 Test for Script Injection; 7.7 Test for File Inclusion; 8 Test for Function-Specific Input Vulnerabilities; 8.1 Test for SMTP Injection; 8.2 Test for Native Software Vulnerabilities; 8.3 Test for SOAP Injection; 8.4 Test for LDAP Injection; 8.5 Test for XPath Injection; 8.6 Test for Back-End Request Injection; 8.7 Test for XXE Injection; 9 Test for Logic Flaws; 9.1 Identify the Key Attack Surface; 9.2 Test Multistage Processes; 9.3 Test Handling of Incomplete Input; 9.4 Test Trust Boundaries; 9.5 Test Transaction Logic; 10 Test for Shared Hosting Vulnerabilities; 10.1 Test Segregation in Shared Infrastructures; 10.2 Test Segregation Between ASP-Hosted Applications; 11 Test for Application Server Vulnerabilities; 11.1 Test for Default Credentials; 11.2 Test for Default Content; 11.3 Test for Dangerous HTTP Methods; 11.4 Test for Proxy Functionality; 11.5 Test for Virtual Hosting Misconfiguration; 11.6 Test for Web Server Software Bugs; 11.7 Test for Web Application Firewalling; 12 Miscellaneous Checks; 12.1 Check for DOM-Based Attacks; 12.2 Check for Local Privacy Vulnerabilities; 12.3 Check for Weak SSL Ciphers; 12.4 Check Same-Origin Policy Configuration; 13 Follow Up Any Information Leakage
Index
Introduction: This book is a practical guide to discovering and exploiting security flaws in web applications.
…
The material in Chapters 9 and 10 has been reorganized to create more manageable chapters and a more logical arrangement of topics. Chapter 9, "Attacking Data Stores," focuses on SQL injection and similar attacks against other data store technologies. As SQL injection vulnerabilities have become more widely understood and addressed, this material now focuses more on practical situations where SQL injection is still found. There are also minor updates throughout to reflect current technologies and attack methods. A new section on using automated tools for exploiting SQL injection vulnerabilities is included. The material on LDAP injection has been largely rewritten to include more detailed Introduction xxxi coverage of specific technologies (Microsoft Active Directory and OpenLDAP), as well as new techniques for exploiting common vulnerabilities.
SQL Hacks
by Andrew Cumming and Gordon Russell
Published 28 Nov 2006
Finding the attacker's real name and where he lives from an IP address is not difficult for the authorities. You can try your skills, without fear of prosecution, at http://sqlzoo.net/hack. 6.7.7. See Also "Prevent an SQL Injection Attack" [Hack #48] Hack 48. Prevent an SQL Injection Attack You can take steps to prevent an SQL injection attack. You can also minimize the consequences of an SQL injection attack. Preventing an SQL injection attack is simply a matter of escaping values that come from web pages. When you escape a string you replace special characters with escape sequences. For string input the only special character you need to worry about is the single quote.
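The escaping step described above can be sketched as follows, assuming a database that uses standard SQL string quoting (SQLite here), where a single quote inside a literal is written as two single quotes. The helper name escape_sql_string and the guest table are hypothetical. Some engines treat further characters as special inside strings (MySQL's default backslash escapes are one example), so this is an illustration of the idea rather than a complete defense; bound parameters are usually the more robust choice.

```python
import sqlite3

def escape_sql_string(value):
    # Standard SQL quoting: a quote inside a literal is written twice.
    return value.replace("'", "''")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE guest (name TEXT)")

names = ["O'Brien", "x'); DROP TABLE guest; --"]
for name in names:
    # The escaped value stays inside the string literal, so the second
    # name is stored as text instead of being run as SQL.
    conn.execute("INSERT INTO guest VALUES ('" + escape_sql_string(name) + "')")

stored = [row[0] for row in conn.execute("SELECT name FROM guest ORDER BY rowid")]
print(stored)
```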
…
Don't Overreact The basic SQL injection attack will not reveal your SQL user password or your operating system passwords. Only data in the database is exposed. It may be possible for someone to obtain an encrypted version of your SQL account password, but if you have chosen a sound password, that does not constitute a threat. You should not underestimate the power of an SQL injection attack, but neither should you overestimate it. My site, http://sqlzoo.net, is vulnerable to SQL injection attacks by its very nature, and it has been running fairly smoothly for several years. I can't guard against SQL injection because I allow users to execute any SQL command they want against several different SQL engines.
…
Online Applications Hack 41. Copy Web Pages into a Table Hack 42. Present Data Graphically Using SVG Hack 43. Add Navigation Features to Web Applications Hack 44. Tunnel into MySQL from Microsoft Access Hack 45. Process Web Server Logs Hack 46. Store Images in a Database Hack 47. Exploit an SQL Injection Vulnerability Hack 48. Prevent an SQL Injection Attack Chapter 7. Organizing Data Hack 49. Keep Track of Infrequently Changing Values Hack 50. Combine Tables Containing Different Data Hack 51. Display Rows As Columns Hack 52. Display Columns As Rows Hack 53. Clean Inconsistent Records Hack 54. Denormalize Your Tables Hack 55.
Python Web Penetration Testing Cookbook
by Cameron Buchanan, Terry Ip, Andrew Mabbitt, Benjamin May and Dave Mound
Published 28 Jun 2015
We append the matching character to our answer list, converting the ASCII value with the chr function: if yes in req.text: answer.append(chr(asciivalue)) break If the yes value is not present, we increment asciivalue to move on to the next candidate character for that position: else: asciivalue = asciivalue + 1 pass Finally, we reset asciivalue for each loop, and when the loop reaches the length of the string, we finish by printing the whole recovered string: asciivalue = 1 print("Recovered String: " + ''.join(answer)) There's more… Potentially, this script could be altered to iterate through tables and recover multiple values through better-crafted SQL Injection strings. Ultimately, this provides a base, as with the later Blind SQL Injection script, for developing more complicated and impressive scripts to handle challenging tasks. See the Exploiting Blind SQL Injection script for an advanced implementation of these concepts. Exploiting Blind SQL Injection Sometimes, life hands you lemons; blind SQL Injection points are some of those lemons. Sometimes you are reasonably sure you have found an SQL Injection vulnerability, but there are no errors and you cannot get it to return your data. In these situations, you can use timing commands within SQL to make the page pause before returning a response, and then use that timing to make judgments about the database and its data.
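The character-by-character loop in this recipe can be tried without any network target. The sketch below is not the book's script: it swaps the HTTP request for a local function, page_says_yes, that runs a deliberately vulnerable query against an in-memory SQLite table. The table, the stored value, and the function names are assumptions for illustration, and the injected condition relies on SQLite's substr and unicode functions.

```python
import sqlite3

# Local practice target standing in for the vulnerable page.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, password TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'parrot')")

def page_says_yes(injected_condition):
    # Hypothetical oracle: True when the injected condition holds,
    # playing the role of 'yes in req.text' in the recipe.
    sql = "SELECT 1 FROM users WHERE id = 1 AND " + injected_condition
    return conn.execute(sql).fetchone() is not None

def recover_string(length):
    answer = []
    for position in range(1, length + 1):
        asciivalue = 1  # reset for each character position
        while asciivalue < 128:
            condition = "unicode(substr(password, %d, 1)) = %d" % (
                position, asciivalue)
            if page_says_yes(condition):
                answer.append(chr(asciivalue))
                break
            asciivalue += 1  # try the next candidate character
    return "".join(answer)

recovered = recover_string(6)
print("Recovered String: " + recovered)
```

The length is passed in directly here; the recipe determines it with a separate step.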
…
If successful, the message should appear in our logs: req = requests.post(url, headers=headers) Chapter 4. SQL Injection In this chapter, we will cover the following topics: Checking jitter Identifying URL-based SQLi Exploiting Boolean SQLi Exploiting Blind SQLi Encoding payloads Introduction SQL Injection is the loud and noisy attack that beats you over the head in every tech-related media provider you see. It is one of the most common and most devastating attacks of recent history and continues to thrive in new installations. This chapter focuses on both performing and supporting SQL Injection attacks. We will create scripts that encode attack strings, perform attacks, and time normal actions to normalize attack times.
…
Vulnerability Identification Introduction Automated URL-based Directory Traversal Getting ready How to do it… How it works… There's more Automated URL-based Cross-site scripting How to do it… How it works… There's more… Automated parameter-based Cross-site scripting How to do it… How it works… There's more… Automated fuzzing Getting ready How to do it… How it works… There's more… See also jQuery checking How to do it… How it works… There's more… Header-based Cross-site scripting Getting ready How to do it… How it works… See also Shellshock checking Getting ready How to do it… How it works… 4. SQL Injection Introduction Checking jitter How to do it… How it works… There's more… Identifying URL-based SQLi How to do it… How it works… There's more… Exploiting Boolean SQLi How to do it… How it works… There's more… Exploiting Blind SQL Injection How to do it… How it works… There's more… Encoding payloads How to do it… How it works… There's more… 5. Web Header Manipulation Introduction Testing HTTP methods How to do it… How it works… There's more… Fingerprinting servers through HTTP headers How to do it… How it works… There's more… Testing for insecure headers Getting ready How to do it… How it works… Brute forcing login through the Authorization header Getting ready How to do it… How it works… There's more… See also Testing for clickjacking vulnerabilities How to do it… How it works… Identifying alternative sites by spoofing user agents How to do it… How it works… See also Testing for insecure cookie flags How to do it… How it works… There's more… Session fixation through a cookie injection Getting ready How to do it… How it works… There's more… 6.
The Art of Software Security Assessment: Identifying and Preventing Software Vulnerabilities
by Justin Schuh
Published 20 Nov 2006
WEB TECHNOLOGIES Introduction Web Services and Service-Oriented Architecture SOAP REST AJAX Web Application Platforms CGI Indexed Queries Environment Variables Path Confusion Perl SQL Injection File Access Shell Invocation File Inclusion Inline Evaluation Cross-Site Scripting Taint Mode PHP SQL Injection File Access Shell Invocation File Inclusion Inline Evaluation Cross-Site Scripting Configuration Java SQL Injection File Access Shell Invocation File Inclusion JSP File Inclusion Inline Evaluation Cross-Site Scripting Threading Issues Configuration ASP SQL Injection File Access Shell Invocation File Inclusion Inline Evaluation Cross-Site Scripting Configuration ASP.NET SQL Injection File Access Shell Invocation File Inclusion Inline Evaluation Cross-Site Scripting Configuration ViewState Summary BIBLIOGRAPHY INDEX About the Authors Mark Dowd is a principal security architect at McAfee, Inc. and an established expert in the field of application security.
…
Most security problems in programs written in these higher-level languages occur in the places where they interact with other systems or components, such as the database, file system, operating system, or network. Some of these technical problems are explained in the following sections. SQL Injection SQL injection, discussed in Chapter 8, is arguably one of the most common vulnerabilities in Web applications. To briefly recap, in SQL injection, a SQL query is constructed dynamically by using user input, and users are capable of inserting their own SQL commands into the query. When reviewing a Web application, try to find every interaction with the database engine to hunt down all potential SQL injection points. Sometimes, you need to augment your testing with black-box methods if the mapping to the underlying database is obscured by an object-oriented abstraction or is otherwise unclear.
…
The most common SQL-related vulnerability is SQL injection. It occurs when input is taken from request data (post variables, forms, or cookies) and concatenated into a query string issued against the database. Listing 8-20 is a simple example in PHP and MySQL. Listing 8-20. SQL Injection Vulnerability $username = $HTTP_POST_VARS['username']; $password = $HTTP_POST_VARS['passwd']; $query = "SELECT * FROM logintable WHERE user = '" . $username . "' AND pass = '" . $password. "'"; ... $result = mysql_query($query); if(!$result) die_bad_login(); ... This query is vulnerable to SQL injection because users can supply unfiltered input for the passwd and username variables.
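To make the consequence concrete, here is a rough Python and SQLite analogue of the pattern in Listing 8-20 (not the book's PHP), alongside a version that binds the inputs as parameters. The table contents and the function names check_login_unsafe and check_login_safe are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logintable (user TEXT, pass TEXT)")
conn.execute("INSERT INTO logintable VALUES ('admin', 's3cret')")

def check_login_unsafe(username, password):
    # Same shape as the listing: request data concatenated into the query.
    query = ("SELECT * FROM logintable WHERE user = '" + username +
             "' AND pass = '" + password + "'")
    return conn.execute(query).fetchone() is not None

def check_login_safe(username, password):
    # Placeholders keep both inputs out of the SQL text.
    query = "SELECT * FROM logintable WHERE user = ? AND pass = ?"
    return conn.execute(query, (username, password)).fetchone() is not None

# The quote ends the string early and '--' comments out the password test.
bypassed = check_login_unsafe("admin' --", "wrong")
blocked = check_login_safe("admin' --", "wrong")
print(bypassed, blocked)
```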
We Are Anonymous: Inside the Hacker World of LulzSec, Anonymous, and the Global Cyber Insurgency
by Parmy Olson
Published 5 Jun 2012
Though its job was to help other companies protect themselves from cyber attacks, HBGary Federal itself was vulnerable to a simple attack method called SQL injection, which targeted databases. Databases were one of the many key technologies powering the Internet. They stored passwords, corporate e-mails, and a wide variety of other types of data. The use of Structured Query Language (SQL, commonly mispronounced “sequel”) was a popular way to retrieve and manipulate the information in databases. SQL injection worked by “injecting” SQL commands into the server that hosted the site to retrieve information that should be hidden, essentially using the language against itself.
…
Soon she was immersing herself in scripting languages like Perl, Python, and PHP, learning how to attack Web databases with the SQL injection method. It was mostly harmless, but by the time she was fourteen, Kayla claimed she was writing scripts that could automate cyber attacks. It had all been harmless, “until I went looking for so-called hacking forums,” Kayla said. “I registered at some of them and they were all, ‘Go away little girl this isn’t for you.’ Fair enough I was only 14 but it made me so angry!” Using some of the skills she had picked up from her dad and online research, she claimed she hacked into one forum site and deleted much of its contents using SQL injection. It was an attack unlike any the regulars had seen before.
…
People close to Kayla say she set up tr0ll and filled it with skilled hackers that she had either chosen or trained. Kayla was a quick learner and liked to teach other hackers tips and tricks. She was patient but pushy. One student remembered Kayla teaching SQL injection by first explaining the theory and then telling the hackers to do it over and over again using different approaches for two days straight. “It was hell on your mind, but it worked,” the student said. Kayla understood the many complex layers to methods like SQL injection, a depth of knowledge that allowed her to exploit vulnerabilities that other hackers could not. On tr0ll, Kayla and her friends discussed the intricacies of Gawker’s servers, trying to figure out a way to steal some source code for the site.
Building Secure and Reliable Systems: Best Practices for Designing, Implementing, and Maintaining Systems
by Heather Adkins, Betsy Beyer, Paul Blankinship, Ana Oprea, Piotr Lewandowski and Adam Stubblefield
Published 29 Mar 2020
problem space's similarities to security problems, Foreword by Royal Hansen risk evaluation by, The Roles of Specialists Security Engineering's similarities to, Foreword by Michael Wildpaner SRE Security Exchange Program, Build Empathy software development (see testing (code); writing code) software errors, recovery from, Software Errors software supply chain, Concepts and Terminology-Concepts and Terminology software upgrades, recovery and, System Rebuilds and Software Upgrades Sony Pictures, Attacker Motivations Space Shuttle Columbia incident, Culture of Inevitably Spanner, The Build or Buy Decision, Example: Temporary Files-Example: Temporary Files SQL injection (SQLI), Frameworks to Enforce Security and Reliability, SQL Injection Vulnerabilities: TrustedSqlString-SQL Injection Vulnerabilities: TrustedSqlString SRE (see Site Reliability Engineer/Engineering) SRE Security Exchange Program, Build Empathy SSL key rotation, Credential and Secret Rotation SSO (single sign-on) services, Credential and Secret Rotation staged rollouts, Reduce Fear with Risk-Reduction Mechanisms stalkerware, Criminal Actors state: defined, Know Your Intended State, Down to the Bytes device firmware, Device firmware global services, Global services host management, Host management-Host management knowing intended state, Know Your Intended State, Down to the Bytes-Persistent data persistent data and, Persistent data state exhaustion attack, Defendable Architecture static analysis, Static Program Analysis-Formal Methods: abstract interpretation, Abstract Interpretation automated code inspection tools, Automated Code Inspection Tools-Automated Code Inspection Tools formal methods, Formal Methods integration into developer workflow, Integration of Static Analysis in the Developer Workflow-Integration of Static Analysis in the Developer Workflow reverse engineering and test input generation, Integration of Static Analysis in the Developer Workflow static type checking, Use strong typing and static type checking Stoll, Clifford, Understanding Adversaries stored XSS bugs, Understanding Complex Data Flows Strava, Protecting your systems from nation-state actors strongly typed language, Use strong typing and static type checking, Use Strong Types-Use Strong Types structured justification, Choosing an auditor supply chain, Unexpected Benefits (see also software supply chain) code deployment issues, Take It One Step at a Time security benefits of designing for recovery, Unexpected Benefits sustainability, culture of, Culture of Sustainability-Culture of Sustainability sustained velocity, initial velocity versus, Initial Velocity Versus Sustained Velocity-Initial Velocity Versus Sustained Velocity Syrian Electronic Army, Activists system invariants, System Invariants-Analyzing Invariants, Analyzing Invariants system logs, as attack target, Reliability Versus Security: Design Considerations system rebuilds, System Rebuilds and Software Upgrades systems (generally): in context of systems engineering, Preface investigating (see investigating systems) T tabletop exercises, Conducting Nonintrusive Tabletops-Conducting Nonintrusive Tabletops tactics, techniques, and procedures (TTPs), Tactics, Techniques, and Procedures, Scoping the Recovery talks, interactive, Culture of Awareness TCB (see trusted computing base) technical debt: compromised assets as, Isolating Assets (Quarantine) during recovery, What are your mitigation options?
…
For example, when using a SQL query API that accepts only values of type SafeSql (instead of String), you don’t have to worry about SQL injection vulnerabilities, since all values of type SafeSql are safe to use as a SQL query. Sinks may also accept values of basic types (such as strings), but in this case must not make any assumptions about the value’s safety in the sink’s injection context. Instead, the sink API is itself responsible for validating or encoding data, as appropriate, to ensure at runtime that the value is safe. With this design, you can support an assertion that an entire application is free of SQL injection or XSS vulnerabilities based solely on understanding the implementations of the types and the type-safe sink APIs.
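A type-safe sink of this kind might look roughly like the following. This is a minimal sketch in Python and is not the implementation the book describes: only the name SafeSql comes from the text, while run_query, the items table, and the rule that SafeSql values are built only from developer-written SQL text are assumptions. Python cannot enforce that rule at runtime, so here it remains a convention that a static type checker would have to back up.

```python
import sqlite3

class SafeSql:
    """SQL text that, by convention, comes only from developer-written code."""

    def __init__(self, trusted_text):
        self._text = trusted_text

    def __add__(self, other):
        # Only SafeSql can be appended to SafeSql; plain strings are refused.
        if not isinstance(other, SafeSql):
            return NotImplemented
        return SafeSql(self._text + other._text)

    @property
    def text(self):
        return self._text

def run_query(conn, query, params=()):
    # The sink accepts SafeSql only; user data travels as bound parameters.
    if not isinstance(query, SafeSql):
        raise TypeError("run_query accepts only SafeSql, got %s"
                        % type(query).__name__)
    return conn.execute(query.text, params).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER, name TEXT)")
conn.execute("INSERT INTO items VALUES (1, 'a')")

query = SafeSql("SELECT name FROM items") + SafeSql(" WHERE id = ?")
rows = run_query(conn, query, (1,))

try:
    run_query(conn, "SELECT name FROM items WHERE id = " + "1")
    rejected = False
except TypeError:
    rejected = True
print(rows, rejected)
```

The design choice is that a reviewer only needs to audit the places where SafeSql values are constructed, rather than every call that reaches the database.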
…
Of course, no framework can protect against all security vulnerabilities, and it is still possible for attackers to discover an unforeseen class of attacks or find mistakes in the implementation of the framework. But if you discover a new vulnerability, you can address it in one place (or a few) instead of throughout the codebase. To provide one concrete example: SQL injection (SQLI) holds the top spot on both the OWASP and SANS lists of common security vulnerabilities. In our experience, when you use a hardened data library such as TrustedSqlString (see “SQL Injection Vulnerabilities: TrustedSqlString”), these types of vulnerabilities become a nonissue. Types make these assumptions explicit, and are automatically enforced by the compiler. Benefits of Using Frameworks Most applications have similar building blocks for security (authentication and authorization, logging, data encryption) and reliability (rate limiting, load balancing, retry logic).
Essential SQLAlchemy
by
Rick Copeland
Published 4 Jun 2008
Using string manipulation to build up a query as done here can lead to logic errors and to vulnerabilities such as SQL injection. Generating the string to be executed by your database server verbatim also ties your code to the particular DB-API driver you are currently using, making migration to a different database server difficult. For instance, if we wished to migrate the previous example to the Oracle DB-API driver, we would need to write: sql="INSERT INTO user(user_name, password) VALUES (:1, :2)" cursor = conn.cursor() cursor.execute(sql, ('rick', 'parrot')) SQL Injection Attacks SQL injection is a type of programming error where carefully crafted user input can cause your application to execute arbitrary SQL code.
…
In this case, the SQL executed is INSERT INTO user(user_name, password) VALUES ('rick', 'parrot'); DELETE FROM user; --', which would probably delete all users from your database. The use of bind parameters (as in the first example in the text) is an effective defense against SQL injection, but as long as you are manipulating strings directly, there is always the possibility of introducing a SQL injection vulnerability into your code. In the SQLAlchemy SQL expression language, you could write the following instead: statement = user_table.insert(user_name='rick', password='parrot') statement.execute() To migrate this code to Oracle, you would write, well, exactly the same thing.
…
When the query is run, SQLAlchemy will send the query string (with bind parameters) and the actual variables (in this case, the string "rick") to the database engine. Using the SQLAlchemy SQL-generation layer has several advantages over hand-generating SQL strings: Security Application data (including user-generated data) is safely escaped via bind parameters, making SQL injection-style attacks extremely difficult. Performance The likelihood of reusing a particular query string (from the database server’s perspective) is increased. For instance, if we wanted to select another user from the table, the SQL generated would be identical, and a different bind parameter would be sent.
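The behaviour described above, where one query string is reused and only the bound values change, can be sketched with Python's standard `sqlite3` module. This is not SQLAlchemy's generated SQL. The table layout follows the excerpt's `user` example, and the hostile value is an assumption chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user (user_name TEXT, password TEXT)")

# One statement text, reused with different bound values.
insert_sql = "INSERT INTO user (user_name, password) VALUES (?, ?)"
conn.execute(insert_sql, ("rick", "parrot"))

# A hostile value is stored literally instead of being parsed as SQL.
conn.execute(insert_sql, ("rick', 'parrot'); DELETE FROM user; --", "x"))

select_sql = "SELECT user_name FROM user WHERE user_name = ?"
first = conn.execute(select_sql, ("rick",)).fetchall()
count = conn.execute("SELECT COUNT(*) FROM user").fetchone()[0]
```

Both rows survive the second insert, and the lookup for "rick" returns only the exact match, because the driver sends the statement text and the values separately.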
Fancy Bear Goes Phishing: The Dark History of the Information Age, in Five Extraordinary Hacks
by
Scott J. Shapiro
A hacker can retrieve all the information in a database by using an SQL injection. Instead of submitting data, the hacker injects code. In our example, Tom doesn’t submit his username: Tom (data); he injects a partial SQL query: Tom’ OR 1=’1 (code). The new snippet interacts with the original code to produce a result that the original coder had not intended. SQL injections can be devastating. Jacobsen had used an SQL injection to gain access to the entire database of T-Mobile customers. But while dangerous and quite common, SQL injections are easy to prevent. Web application developers should “sanitize” inputs.
…
To reset the password, hackers would also have to know her phone number. Since Paris Hilton had many friends, hackers could easily have learned of her personal phone number from some mutual contact. SQL Injection Still, the leading theory in the security community was not that hackers had exploited information about Paris Hilton’s Chihuahua. T-Mobile’s entire customer base had been compromised the previous year by a twenty-one-year-old hacker named Nicholas Jacobsen. Using a so-called SQL injection, Jacobsen compromised the accounts of 16 million T-Mobile customers. One of those customers was Peter Cavicchia, a Secret Service cybercrime agent in New York who used a Sidekick.
…
ongoing criminal investigations: Kevin Poulsen, “Hacker Breaches T-Mobile Systems, Reads US Secret Service Email and Downloads Candid Shots of Celebrities,” The Register, January 12, 2005, https://www.theregister.com/2005/01/12/hacker_penetrates_t-mobile/. deliver its file to my browser: kingthorin, “SQL Injection,” OWASP, accessed June 8, 2021, owasp.org/www-community/attacks/SQL_Injection. simple example: Example from Peter Yaworski, Real-World Bug Hunting: A Field Guide to Web Hacking (San Francisco: No Starch, 2019), 82–83. the following code: The snippets here are using the PHP server-side scripting language. “literally hundreds of injection vulnerabilities”: Paul Roberts, “Paris Hilton: Victim of T-Mobile’s Web Flaws?
Hands-On RESTful API Design Patterns and Best Practices
by
Harihara Subramanian
Published 31 Jan 2019
The following table captures a few common injection attack types, brief descriptions for each, and their potential impact:
Type of injection | A brief description | Potential impacts
Code injection/OS command injection | Execute operating system commands with application code | Gains higher privileges with higher privilege escalation vulnerabilities and lead to full-system compromise
CRLF injection | Injects an EOL/carriage return character in an input sequence | Results in splitting the HTTP header to facilitate arbitrary content injection in the response body, including XSS
Email (Mail command/SMTP) injection | Injects IMAP/SMTP statements to a mail server | Personal information disclosure and relay of SPAM emails
Host header injection | Abuses the trust of the HTTP Host Header by dynamically generating headers based on user input | Cache poisoning: manipulates the caching system and serves malicious pages. Password reset poisoning: exploits with password reset email and delivers malicious content directly to the target
LDAP injection | Injects Lightweight Directory Access Protocol (LDAP) statements and executes them | Modifies contents of LDAP tree and grants illegitimate permissions, privilege escalations, and bypass authentication
SQL injection | Injects fabricated SQL commands to exercise database read or modify data | Leads to data loss, data theft, data integrity loss, DoS, and can even result in full system compromise due to advanced variations of SQL injections
XPath injection | Executes fabricated XPath queries by injecting false data into an application | Results in information disclosure and bypass authentication
Insecure direct object references Insecure direct object references (IDOR) are equally as harmful as the other top API vulnerabilities; they occur when an application exposes direct access to internal objects based on user inputs such as ID and filename.
…
API design and development flaws Missing or not adhering to API security principles and best practices may lead to defects that expose business-critical data. Another aspect of design and development is to keep APIs as simple as possible, as complexity may lead to less coverage and vulnerability. Poor user input validation, SQL injection loopholes, and buffer overflows are a few other causes. Chapter 2, Design Strategy, Guidelines, and Best Practices, discussed various aspects of design strategies and RESTful API design practices. Understanding and implementing those design principles and practices in APIs helps reduce design and development flaws.
…
This section discusses penetration tests and fuzz tests in detail, along with the tools and frameworks that provide out-of-the-box support for security tests, so that API testers can use them to gain security assurance for the underlying APIs. Penetration tests or pen tests One of the imperatives in API testing is penetration tests, also known as pen tests. A pen test simulates a cyber attack against the system or API to expose exploitable vulnerabilities such as intra-network loopholes, XSS attacks, SQL injections, and code-injection attacks. Pen tests assess the threat vector from an external standpoint, covering supported functions, available resources, and the API's internal components. The following sections discuss pen testing in more detail: its stages, testing methods, frameworks that support pen testing, and a few criteria for selecting the best penetration tool.
Ajax: The Definitive Guide
by
Anthony T. Holdener
Published 25 Jan 2008
To give you a better idea of the scenario, consider the following code used to change a password: SELECT id FROM users WHERE username = '$username' AND password = '$password'; Now pretend that the user enters the following password and the system has no safeguards against this sort of thing: secret' OR 'x' = 'x You can see how a clever individual could enter a SQL script such as this and gain access to things she should not. In this case, the SQL injection would allow the user to log in to the system without actually knowing the password. The SQL that would be passed to the server would look like this: SELECT id FROM users WHERE username = 'Anthony' AND password = 'secret' OR 'x'='x'; To prevent this sort of scenario, many languages provide ways to strip out potential problem code before it becomes a problem. In PHP’s case, it provides the mysql_real_escape_string( ) function, which you can use like this: <?php /* Protect the query from a SQL Injection Attack */ $SQL = sprintf("SELECT id FROM users WHERE username='%s' AND password='%s'", mysql_real_escape_string($username), mysql_real_escape_string($password) ); ?
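The login bypass above can be reproduced in a few lines. The sketch below uses Python's `sqlite3` rather than the excerpt's PHP and MySQL, and it binds parameters instead of escaping them. The table contents and function names are assumptions made for illustration, and the plaintext password exists only to keep the demo short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT, password TEXT)")
conn.execute("INSERT INTO users (username, password) VALUES ('Anthony', 'correct-horse')")


def login_unsafe(username, password):
    # Vulnerable: user input is pasted straight into the SQL text.
    sql = ("SELECT id FROM users WHERE username = '" + username +
           "' AND password = '" + password + "'")
    return conn.execute(sql).fetchall()


def login_safe(username, password):
    # Placeholders keep the input in the data channel.
    sql = "SELECT id FROM users WHERE username = ? AND password = ?"
    return conn.execute(sql, (username, password)).fetchall()


payload = "secret' OR 'x' = 'x"
unsafe_rows = login_unsafe("Anthony", payload)  # AND binds tighter than OR, so the OR branch matches every row
safe_rows = login_safe("Anthony", payload)      # payload is compared as a literal password
```

The unsafe version returns the account row without the real password, while the parameterized version returns nothing.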
…
*/ if (@mysql_select_db(DB_NAME, $conn)) { /* Delete the username from the database */ $sql = sprintf('DELETE FROM users WHERE username = %s;', quote_smart($_REQUEST['username'])); @mysql_query($sql); /* Clear the session */ unset($_REQUEST['username']); print(1); } else print(0); /* Close the server connection */ @mysql_close($conn); } else print(0); } else print('0'); ?> This code is pretty self-explanatory, though I am introducing a little function to take care of quote issues with SQL injection attacks with the function quote_smart( ). The function looks like this: <?php /** * This function, quote_smart, tries to ensure that a SQL injection attack * cannot occur. * * @param {string} $p_value The string to quote correctly. * @return string The properly quoted string. */ function quote_smart($p_value) { /* Are magic quotes on? */ if (get_magic_quotes_gpc( )) $p_value = stripslashes($p_value); /* Is the value a string to quote?
…
> PROLOG; /* Set up the parameters to connect to the database */ $params = array ('host' => $host, 'username' => $username, 'password' => $password, 'dbname' => $db); try { /* Connect to the database */ $conn = Zend_Db::factory('PDO_MYSQL', $params); /* Get the parameter values from the query string */ $value1 = $conn->quote(($_GET['param1']) ? $_GET['param1'] : ''); $value2 = $conn->quote(($_GET['param2']) ? $_GET['param2'] : ''); $value3 = $conn->quote(($_GET['param3']) ? $_GET['param3'] : ''); /* * Create a SQL string and use the values that are protected from SQL injections * (double quotes are needed so that PHP interpolates the variables) */ $sql = "SELECT * FROM table1 WHERE condition1 = $value1 AND condition2 = $value2" ." AND condition3 = $value3"; /* Get the results of the query */ $result = $conn->query($sql); /* Are there results? */ if ($rows = $result->fetchAll( )) { /* Create the response XML string */ $xml .= '<results>'; foreach ($rows as $row) { $xml .= "<result>"; $xml .= "<column1>{$row['column1']}</column1>"; $xml .= "<column2>{$row['column2']}</column2>"; $xml .= "</result>"; } $xml .= '</results>'; } } catch (Exception $e) { $xml .= '<error>There was an error retrieving the data.
Bad Data Handbook
by
Q. Ethan McCallum
Published 14 Nov 2012
A common example of such an attack is the SQL injection attack, where the attacker tries to trick a form handler into running user-supplied SQL statements. There is a brilliant example of a SQL injection attack in the famous XKCD comic about “Little Bobby Tables” (http://xkcd.com/327/). The characters ', ;, --, and /* are often exploited in SQL injection attacks. They are used to terminate strings (') and statements (;), and to begin comments that span single (--) or multiple lines (/*). There are two main strategies for defending against SQL injection attacks. The first uses a database feature called “prepared statements” that separates a SQL statement from its parameters, eliminating the possibility that a maliciously crafted parameter could terminate the original statement and launch an attack.
…
Let’s take the bottleneck away: we’ll do the processing up front, rather than hitting the database with reads every time the website has a new request. Security Because there is no way to touch the database from the web content, our website will be resistant to intrusion, defacement, or worse. Many security vulnerabilities occur because the outside world can manipulate services that are running on the server. For example, SQL injection is a means to run arbitrary commands on the database via things like web forms. The static file architecture prevents this. (Granted, someone could hack into the server itself—either through another, unrelated service running on the machine, or by infiltrating the data center—but this is a vulnerability of any computer system.)
…
Social Security Administration (SSA) data example, Subtle Sources of Bias and Error–Sample Selection software, writing, Reading Data from an Awkward Format (see programming) spreadsheets, Understand the Data Structure, The Problem: Data Formatted for Human Consumption–Data Spread Across Multiple Files, Reading Data from an Awkward Format–Reading Data Spread Across Several Files, How Chemists Make Up Numbers–All Your Database Are Belong to Us, Rehab for Chemists (and Other Spreadsheet Abusers)–Rehab for Chemists (and Other Spreadsheet Abusers) disadvantages of, Rehab for Chemists (and Other Spreadsheet Abusers)–Rehab for Chemists (and Other Spreadsheet Abusers) importing tab-delimited data, Understand the Data Structure NCEA data using, The Problem: Data Formatted for Human Consumption–Data Spread Across Multiple Files reading with software, Reading Data from an Awkward Format–Reading Data Spread Across Several Files transferring data into databases, How Chemists Make Up Numbers–All Your Database Are Belong to Us SQL injection attacks, Problem: Application-Specific Characters Leaking into Plain Text SSA (Social Security Administration) data example, Subtle Sources of Bias and Error–Sample Selection statistics, Physical Interpretation of Simple Statistics, Recommendation Analysis–Recommendation Analysis, Summary, Summary, Example 3: When “Typical” Does Not Mean “Average”–Example 3: When “Typical” Does Not Mean “Average”, But First, Let’s Reflect on Graduate School …–But First, Let’s Reflect on Graduate School …, Service Call Data as an Applied Example classical training in, But First, Let’s Reflect on Graduate School …–But First, Let’s Reflect on Graduate School …, Service Call Data as an Applied Example for data validation, Physical Interpretation of Simple Statistics, Recommendation Analysis–Recommendation Analysis histograms, Summary (see histograms) of pattern matching, Summary (see NLP (natural language programming)) summary statistics, problems with, Example 3: 
When “Typical” Does Not Mean “Average”–Example 3: When “Typical” Does Not Mean “Average” stock market data example, When Data and Reality Don’t Match–Conclusion str.decode function, Python, Text Processing with Python structure of data, Understand the Data Structure (see data structure) survey data, compared to administrative data, Subtle Sources of Bias and Error–Subtle Sources of Bias and Error Survey of Income and Program Participation (SIPP) data example, Imputation Bias: General Issues, Proxy Reporting T tab-delimited data, Understand the Data Structure–Understand the Data Structure text data, Bad Data Lurking in Plain Text, Which Plain Text Encoding?
Beautiful security
by
Andy Oram
and
John Viega
Published 15 Dec 2009
Because the web server runs software that issues SQL commands to retrieve and modify the internal database (e.g., sensitive customer information), a successful SQL injection attack that fools the web server into passing arbitrary SQL commands to the database can fetch whatever data it chooses. A well-known women’s clothing store was recently informed by their web application firewall vendor that an SQL injection error in their web application could lead to the compromise of their entire customer database, including credit card numbers, PINs, and addresses. It is almost routine now for security vendors who engage in web application scanning to discover not one, not two, but many SQL injection attack vulnerabilities in existing web applications.
…
How will attackers utilize client software vulnerabilities? As far back as 2002, a paper titled How to 0wn the Internet In Your Spare Time* came up with a disturbing possible scenario: a contagion worm exploit that targets both server and client vulnerabilities. First, the attack uses typical Web server security flaws, such as buffer overflows or SQL injection, to upload malicious code that is then downloaded whenever a targeted browser visits the website. Then, the downloaded code exploits vulnerabilities on the browser client. * S. Staniford, V. Paxson, and N. Weaver, How to 0wn the Internet In Your Spare Time, http://www.icir.org/vern/papers/cdc-usenix-sec02 (last visited September 4, 2008).
…
Over time, development leaders focused on the issues that were raising their risk score and essentially competed with each other to achieve better results. This was not lost on the CIO, who would have no hesitation about calling a development leader up to his office to answer questions about an SQL injection vulnerability based on the monthly report. Further resistance to the controls diminished. Although there remained Acme developers and team leaders who believed that the security controls were burdensome and constraining, they accepted them over time as core requirements. FORCING FIRMS TO FOCUS: IS SECURE SOFTWARE IN YOUR FUTURE?
Kingpin: How One Hacker Took Over the Billion-Dollar Cybercrime Underground
by
Kevin Poulsen
Published 22 Feb 2011
Max found a copy of the certificate in one of JiLsi’s webmail accounts, protected by the carder’s usual password. From there, it was just a matter of logging in as JiLsi and leveraging his access to get at the database. On TalkCash and ScandinavianCarding, Max determined that the forum software’s search function was vulnerable to an “SQL injection” attack. It wasn’t a surprising discovery. SQL injection vulnerabilities are the Web’s most persistent weakness. SQL injection has to do with the behind-the-scenes architecture of most sophisticated websites. When you visit a website with dynamic content—news articles, blog posts, stock quotes, virtual shopping carts—the site’s software is pulling the content in raw form from a back-end database, usually running on a completely different computer than the host to which you’ve connected.
…
It’s a potentially perilous arrangement, because there are any number of situations where the software has to send a visitor’s input as part of an SQL command—in a search query, for example. If a visitor to a music site enters “Sinatra” in the search box, the website’s software will ask the database to look for matches. SELECT titles FROM music_catalog WHERE artist = ‘Sinatra’; An SQL injection vulnerability occurs when the software doesn’t properly sanitize the user’s input before including it in a database command. Punctuation is the real killer. If a user in the above scenario searches on “Sinatra’; DROP music_catalog;” it’s tremendously important that the apostrophe and semicolons not make it through.
…
Otherwise, the database server sees this. SELECT * FROM music_catalog WHERE artist = ‘Sinatra’; DROP music_catalog;’; As far as the database is concerned, that’s two commands in succession, separated by a semicolon. The first command finds Frank Sinatra albums, the second one “drops” the music catalog, destroying it. SQL injection is a standard weapon in every hacker’s arsenal—the holes, even today, plague websites of all stripes, including e-commerce and banking sites. And in 2005, the forum software used by TalkCash and ScandinavianCarding was a soft target. To exploit the bug on TalkCash, Max registered for a new account and posted a seemingly innocuous message on one of the discussion threads.
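The two-commands-in-succession scenario above can be sketched with Python's `sqlite3`. Several assumptions apply: the payload uses the standard `DROP TABLE` form, the table contents are invented, and because `sqlite3`'s `execute()` accepts only a single statement, `executescript()` stands in for a driver that permits stacked statements.

```python
import sqlite3


def make_catalog():
    # Fresh in-memory catalog for each scenario.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE music_catalog (artist TEXT, title TEXT)")
    conn.execute("INSERT INTO music_catalog VALUES ('Sinatra', 'My Way')")
    return conn


def table_exists(conn):
    row = conn.execute(
        "SELECT COUNT(*) FROM sqlite_master "
        "WHERE type = 'table' AND name = 'music_catalog'"
    ).fetchone()
    return row[0] == 1


search = "Sinatra'; DROP TABLE music_catalog; --"

# Vulnerable path: the search text becomes part of the SQL, so the
# database sees a SELECT followed by a DROP TABLE.
vulnerable = make_catalog()
vulnerable.executescript(
    "SELECT title FROM music_catalog WHERE artist = '" + search + "';"
)
dropped = not table_exists(vulnerable)

# Parameterized path: the whole search text is compared as an artist name.
safe = make_catalog()
matches = safe.execute(
    "SELECT title FROM music_catalog WHERE artist = ?", (search,)
).fetchall()
survived = table_exists(safe)
```

In the first case the table is gone after the search. In the second the search simply finds no artist with that odd name.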
Building Web Applications With Flask
by
Italo Maia
Published 25 Jun 2015
Each filter call to query will return a BaseQuery instance, which allows you to stack your filters, like this: queryset = Employee.query.filter_by(name='marcos mango') queryset = queryset.filter_by(birthday=datetime.strptime('1990-09-06', '%Y-%m-%d')) queryset.all() # <= returns the result of both filters applied together The possibilities here are many. Why don't you try a few examples on your own now? Note The most common security problem related to web applications and databases is the SQL Injection Attack, where an attacker injects SQL instructions into your queries to the database, gaining privileges he/she should not have. SQLAlchemy's engine object "auto-magically" escapes special characters in your queries; so, unless you explicitly bypass its quoting mechanism, you should be safe.
…
Besides differences in the imports, there is the MongoDB configuration, as MongoDB requires different parameters; we have the birthday field type as MongoEngine does not support DateField; there is birthday format overwrite as the default string format for datetimefield is different than what we want; and we have the changes in the index method. As we do not have to handle sessions with Flask-MongoEngine, we just remove all references to it. We also change how employee_list is built. Tip As MongoDB does not parse the data you send to it in an attempt to figure out what the query is about, you do not have SQL injection-like problems with it. Relational versus NoSQL You might be wondering when to use relational and when to use NoSQL. Well, given the techniques and technologies in existence today, I would recommend you work with the type you feel better working with. NoSQL brags about being schema-less, scalable, fast, and so on, but relational databases are also quite fast for most of your needs.
…
relational databaseversus NoSQL database / Relational versus NoSQL RESTful Web Service APIdata, recording to database / Beyond GET rowsabout / Concepts S Seleniumabout / Behavior testing serverabout / Placing your code in a server code, placing / Placing your code in a server sessionsabout / Sessions or storing user data between requests using / Sessions or storing user data between requests example / Sessions or storing user data between requests set statementabout / Control structures SQLreference link / SQLAlchemy SQLAlchemyabout / SQLAlchemy concepts / Concepts installing / Hands on example / Hands on reference link / Hands on Flask-SQLAlchemy / Flask-SQLAlchemy SQL Injection Attackabout / Flask-SQLAlchemy SQLiteabout / SQLAlchemy URL / SQLAlchemy StackOverflowabout / StackOverflow URL / StackOverflow T tablesabout / Concepts tagsabout / HTML forms for the faint of heart template contextmodifying / Messing with the template context testsabout / What kinds of test are there?
The Nature of Software Development: Keep It Simple, Make It Valuable, Build It Piece by Piece
by
Ron Jeffries
Published 14 Aug 2015
(Be warned, though; you may not want to put anything on the Net ever again!) Injection “Injection” is an attack on a parser or interpreter that relies on user-supplied input. The classic example is SQL injection, where ordinary user input is crafted to turn one SQL statement into more than one. This is the “Little Bobby Tables” attack.[55] In that classic XKCD strip, a school administrator asks if the character’s son is really named “Robert’); DROP TABLE Students;--”. While an odd moniker, Bobby Tables illustrates a typical SQL injection attack. If the application concatenates strings to make its query, then the database will see an early sequence of ’); to terminate whatever query the application really meant to do.
…
There’s no excuse for SQL injections in this day and age. It happens when code bashes strings together to make queries. But every SQL library allows the use of placeholders in query strings. Don’t do this: // Vulnerable to injection String query = "SELECT * FROM STUDENT WHERE NAME = '" + name + "';"; Instead do this: // Better String query = "SELECT * FROM STUDENT WHERE NAME = ?;" PreparedStatement stmt = connection.prepareStatement(query); stmt.setString(1, name); ResultSet results = stmt.executeQuery(); For more defenses, see the OWASP SQL Injection Prevention Cheat Sheet.[56] Other databases are also vulnerable to injection attacks.
…
Instead the attacker hopes that the error response from the endpoint will contain the offending input, with the external entity expanded. Most XML parsers are vulnerable to XXE injection by default. You need to configure them to be safe. No, the answer is not to parse the XML yourself with regular expressions! Just use the OWASP XXE Prevention Cheat Sheet to configure your parser for safety.[57] SQL injection and XXE are just two of the many ways user input can corrupt your service. Format string attacks, “Eval injection,” XPATH injection...Injection attacks have held their top spot on the OWASP Top 10 since 2010. Before that they were number two. Don’t let yourself fall prey. Broken Authentication and Session Management Authentication and session management covers a myriad of problems.
Shipping Greatness
by
Chris Vander Mey
Published 23 Aug 2012
A hypothetical postmortem, also known as a “Cause Of Error” (COE) report, appears in the sidebar “Sample COE Report.” Sample COE Report COE #1 – SQL injection hack causes humiliation. 03/07/12 - DRAFT - Chris Vander Mey (cvandermey@) TRACKING BUG: http://bugzilla/b=1234 WHAT was the problem? The Ads Optimizations team released an update to the frontend of our optimizer that didn’t correctly clean search statements. In parallel, the Database Operations team had updated our databases and rewritten some stored procedures that didn’t correctly protect against SQL injection either. An intern discovered this problem while working on a starter project. WHO did this impact?
…
The potential exposure, given where this break was featured, was ~10% of our user base, and required that the user have an account, which mitigated impact. WHEN did this occur? Issue started: 5/1/08 14:00 Issue discovered: 5/5/08 15:00 Rolled back to last-known-good server: 5/5/08 16:43 Issue resolved by pushing a new frontend: 5/6/08 16:00 WHY did this happen? We don’t have unit tests for SQL injection. Why? We can’t run builds against SQL servers effectively. Why? We aren’t mocking the SQL servers. Why? We had a hole in our essential test matrix. We didn’t coordinate with DB Ops. Why? DB Ops is intentionally separate from our frontend teams, to add autonomy. Why? Things got really slow when we had teams discussing.
…
Everyone had different opinions and we couldn’t make decisions. Why? There was no clear ownership and accountability around who is responsible for query security. HOW will we avoid this problem in the future? cvandermey@: Write unit tests to ensure that SQL injection fails. harry_the_db_lead@: Write predeployment checklist and get signoff from product leads on each team that relies on DB Ops. Run aforementioned tests against all DB release candidates. All TLs: Reinforce importance of code reviews. Charlie_tl@, in particular, write a checklist for things to look for in code reviews.
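A unit test of the kind the COE calls for might look like the sketch below. Everything in it is an assumption for illustration: `find_ads` is a hypothetical query helper, the payload list is a small sample rather than a complete corpus, and an in-memory SQLite database stands in for the mocked SQL server the report says was missing.

```python
import sqlite3
import unittest


def find_ads(conn, keyword):
    """Hypothetical query helper under test; binds the keyword as data."""
    return conn.execute(
        "SELECT title FROM ads WHERE keyword = ?", (keyword,)
    ).fetchall()


class SqlInjectionTest(unittest.TestCase):
    # Sample payloads only; a real suite would use a larger list.
    PAYLOADS = [
        "' OR '1'='1",
        "x'; DROP TABLE ads; --",
        "' UNION SELECT name FROM sqlite_master --",
    ]

    def setUp(self):
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute("CREATE TABLE ads (keyword TEXT, title TEXT)")
        self.conn.execute("INSERT INTO ads VALUES ('shoes', 'Buy shoes')")

    def tearDown(self):
        self.conn.close()

    def test_payloads_match_nothing(self):
        for payload in self.PAYLOADS:
            with self.subTest(payload=payload):
                self.assertEqual(find_ads(self.conn, payload), [])

    def test_table_survives_payloads(self):
        for payload in self.PAYLOADS:
            find_ads(self.conn, payload)
        self.assertEqual(find_ads(self.conn, "shoes"), [("Buy shoes",)])


suite = unittest.defaultTestLoader.loadTestsFromTestCase(SqlInjectionTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Running the same payloads against each database release candidate would cover the second action item as well.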
Engineering Security
by
Peter Gutmann
This type of configuration represents an example of a tunnelling threat in which the assumptions for each component, taken in isolation, are valid but for which an attacker can use the online-store application to tunnel untrusted data from an Internet-facing application to an internal application that assumes that it’s coming from a trusted source, effectively laundering the attack data via an intermediary. Figure 75: Threat tunnelling via SQL injection is everywhere Another example of threat tunnelling occurs with second-order SQL injection. In standard SQL injection the attacker submits a malicious string to a database back-end (for example via a carefully-crafted HTTP query) that the database then treats as part of the SQL command that it’s being sent rather than treating it as the data that it’s supposed to be (NoSQL databases are just as vulnerable to injection attacks as standard SQL databases, it’s just that we currently have very little experience in exploiting them and, conversely, almost no experience in defending them [163]).
…
The standard defence against SQL injection is to use parameterised queries that separate the SQL command(s) sent to the database from the data items that they refer to. Second-order SQL injection bypasses this by storing the attack string in the database (which is immune at this level, since it’s using parameterised queries) and then triggering a second operation that fetches back the data that was submitted in the first request.
…
Since it’s coming from a trusted source, it’s no longer regarded as potentially tainted and so may not be subject to the careful handling via parameterised queries that it was originally [164]. This kind of SQL injection is really hard to detect because the act of laundering it via the database has removed any obvious connection from the original, tainted source to the data that’s currently being acted on. Although SQL injection is the canonical example of this type of data laundering, another example of this that comes with a nice DFD-based threat analysis is the Firefox URI-handling vulnerability [165][166]. This vulnerability was created through Firefox registering a URI handler for firefoxuri: that would be invoked by specifying the web page to visit as its accompanying URL parameter.
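Second-order injection as described above can be sketched as follows. The scenario (a registration step followed by a password change) and all names are assumptions chosen for illustration; they do not come from the book.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (username TEXT, password TEXT)")
conn.execute("INSERT INTO accounts VALUES ('admin', 'original')")


def register(username, password):
    # First request: parameterised, so the hostile name is stored verbatim.
    conn.execute("INSERT INTO accounts VALUES (?, ?)", (username, password))


def change_password_unsafe(stored_username, new_password):
    # Second operation: the name came back from the database, so it is
    # wrongly treated as trusted and concatenated into the statement.
    sql = ("UPDATE accounts SET password = ? WHERE username = '"
           + stored_username + "'")
    conn.execute(sql, (new_password,))


register("admin' --", "attacker-pw")

# The application later reads the stored name back and reuses it.
stored = conn.execute(
    "SELECT username FROM accounts WHERE password = 'attacker-pw'"
).fetchone()[0]
change_password_unsafe(stored, "hijacked")

admin_pw = conn.execute(
    "SELECT password FROM accounts WHERE username = ?", ("admin",)
).fetchone()[0]
```

The stored name closes the string literal and comments out the rest of the statement, so the update lands on the `admin` row. Binding `stored_username` as a parameter in the second query as well would close the hole.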
The Art of SQL
by
Stephane Faroult
and
Peter Robson
Published 2 Mar 2006
Concatenating the entry field to the SQL statement means that in practice anybody will be able to download our full database without any subscription. And of course some information is more sensitive than movie databases. Binding variables protects from SQL injection. SQL injection is a very real security matter for anyone running an on-line database, and great care should be taken to protect against its malicious use. Important When using dynamically built queries, use parameter markers and pass values as bind variables, for both performance and security (SQL injection) reasons. A query with prepared joins and dynamically concatenated filtering conditions executes very quickly when the tables are properly indexed.
…
All the setup is done, the query can be run immediately, and the end user gets the response faster. Besides performance, there is also a very serious concern associated with dynamically built hardcoded queries, a security concern: such queries present a wide-open door to the technique known as SQL injection. What is SQL injection? Let's say that we run a commercial operation, and that only subscribers are allowed to query the full database while access to movies older than 1960 is free to everybody. Suppose that a malicious non-subscriber enters into the movie_title field something such as:

    X' or 1=1 or 'X' like 'X

When we simply concatenate entry fields to our query text we shall end up with a condition such as:

    where movie_title like 'X' or 1=1 or 'X' like 'X%' and movie_year < 1960

which is always true and will obviously filter nothing at all!
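To make the movie example concrete, here is a small sketch using Python's standard `sqlite3` module. The table contents and the function names (`search_concatenated`, `search_bound`) are assumptions made for illustration; only the column names `movie_title` and `movie_year` and the attack string come from the excerpt. It contrasts a concatenated filter with the same filter using a bind variable.

```python
import sqlite3

# Throwaway in-memory table with illustrative rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (movie_title TEXT, movie_year INTEGER)")
conn.executemany(
    "INSERT INTO movies VALUES (?, ?)",
    [("Xanadu", 1980), ("X the Unknown", 1956), ("Casablanca", 1942)],
)


def search_concatenated(title_prefix: str) -> list:
    # Unsafe: the entry field becomes part of the SQL text itself.
    query = (
        "SELECT movie_title FROM movies WHERE movie_title LIKE '"
        + title_prefix
        + "%' AND movie_year < 1960"
    )
    return sorted(row[0] for row in conn.execute(query))


def search_bound(title_prefix: str) -> list:
    # Safer: the statement text is fixed and the value is passed
    # separately as a bind variable.
    query = (
        "SELECT movie_title FROM movies "
        "WHERE movie_title LIKE ? AND movie_year < 1960"
    )
    return sorted(row[0] for row in conn.execute(query, (title_prefix + "%",)))


attack = "X' or 1=1 or 'X' like 'X"
everything = search_concatenated(attack)  # the year filter is bypassed
nothing = search_bound(attack)            # treated as an odd title prefix
free_tier = search_bound("X")             # normal use still works
```

With the bound version, the statement text never changes between calls, which is also what lets a database reuse the parsed statement, the performance benefit the excerpt mentions alongside security.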
Django Book
by
Matt Behrens
Published 24 Jan 2015
It’s trivial to spoof the request metadata that browsers usually add automatically. Every one of the vulnerabilities discussed in this chapter stems directly from trusting data that comes over the wire and then failing to sanitize that data before using it. You should make it a general practice to continuously ask, “Where does this data come from?”

SQL Injection

SQL injection is a common exploit in which an attacker alters Web page parameters (such as GET/POST data or URLs) to insert arbitrary SQL snippets that a naive Web application executes in its database directly. It’s probably the most dangerous – and, unfortunately, one of the most common – vulnerabilities out there.
…
If that header is unescaped when building the e-mail message, an attacker could submit something like "hello\ncc:spamvictim@example.com" (where "\n" is a newline character). That would make the constructed e-mail headers turn into:

    To: hardcoded@example.com
    Subject: hello
    cc: spamvictim@example.com

Like SQL injection, if we trust the subject line given by the user, we’ll allow him to construct a malicious set of headers, and he can use our contact form to send spam.

The Solution

We can prevent this attack in the same way we prevent SQL injection: always escape or validate user-submitted content. Django’s built-in mail functions (in django.core.mail) simply do not allow newlines in any fields used to construct headers (the from and to addresses, plus the subject).
…
If your site allows logged-in users to see any sort of sensitive data, you should always serve that site over HTTPS. Additionally, if you have an SSL-enabled site, you should set the SESSION_COOKIE_SECURE setting to True; this will make Django only send session cookies over HTTPS.

E-mail Header Injection

SQL injection’s less well-known sibling, e-mail header injection, hijacks Web forms that send e-mail. An attacker can use this technique to send spam via your mail server. Any form that constructs e-mail headers from Web form data is vulnerable to this kind of attack. Let’s look at the canonical contact form found on many sites.
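The header-injection problem in this excerpt can be illustrated with a framework-free sketch. The two functions below are hypothetical helpers written for this example, not Django APIs; the validating one only mirrors the rule the excerpt attributes to django.core.mail, namely refusing newlines in any value used to build a header.

```python
def build_headers_naive(subject: str) -> str:
    # Unsafe: the user-supplied subject is pasted straight into the
    # header block, so an embedded newline starts a new header line.
    return "To: hardcoded@example.com\nSubject: " + subject + "\n"


def build_headers_checked(subject: str) -> str:
    # Validate before use: a header value must stay on one line.
    if "\n" in subject or "\r" in subject:
        raise ValueError("header values must not contain newlines")
    return "To: hardcoded@example.com\nSubject: " + subject + "\n"


malicious = "hello\ncc:spamvictim@example.com"

# The naive builder produces three header lines instead of two.
injected_lines = build_headers_naive(malicious).splitlines()

# The checked builder rejects the same input.
try:
    build_headers_checked(malicious)
    rejected = False
except ValueError:
    rejected = True
```

Rejecting the input outright, rather than trying to strip or escape the newline, keeps the check simple and makes the attempted injection visible to the caller.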
Learning Android
by
Marko Gargenta
Published 11 Mar 2011
This chapter will show you the process for an insert, and the other operations work in similar ways. So, why not use SQL directly? There are three good reasons why. First, from a security point of view, an SQL statement is a prime candidate for a security attack on your application and data, known as an SQL injection attack. That is because the SQL statement takes user input, and unless you check and isolate it very carefully, this input could embed other SQL statements with undesirable effects. Second, from a performance point of view, executing SQL statements repeatedly is highly inefficient because you’d have to parse the SQL every time the statement runs.
…
Android's database framework only supports prepared statements for standard CRUD operations: INSERT, UPDATE, DELETE, and SELECT. For other SQL statements, we pass them directly to SQLite. That’s why we used execSQL() to run the code to CREATE TABLE.... This is OK because that code doesn’t depend on any user input, and as such SQL injection is not possible. Additionally, that code runs very rarely, so there’s no need to worry about the performance implications.

Cursors

A query returns a set of rows along with a pointer called a cursor. You can retrieve results one at a time from the cursor, causing it to advance each time to the next row.
Programming HTML5 Applications
by
Zachary Kessin
Published 9 May 2011
But any script running on the page that created the database has full access to the database, and of course the user has access to the data. IndexedDB has several advantages over SQLite. First, its native data storage format is a JavaScript object. There is no need to map a JavaScript object into a SQL table structure, which is always a poor fit and can allow for SQL injection attacks. Injection attacks can’t happen with IndexedDB, though in some cases XSS could be a problem, if a user manages to inject JavaScript into the store and have it put back into a page. The IndexedDB data store provides a set of interfaces to store JavaScript objects on a local disk. Each object must have a key by which objects can be retrieved, and may also have secondary keys.
Workers, A Pattern for Reuse of Multithread Processing, A Pattern for Reuse of Multithread Processing Powers, Shelley, Audio and Video PreloadStore object (JSON), Offline Loading with a Data Store pretty printer example, JavaScript Tools You Should Know primitives in JavaScript, Prototypes and How to Expand Objects Products specs, Microdata Programming Scala (Wampler & Payne), Functional Programming progress events, Events progress indicator examples, Putting It All Together, Tags for Applications <progress> tag, Tags for Applications prompt() method, Nonblocking I/O and Callbacks prototype object, Prototypes and How to Expand Objects prototypes, expanding functions with, Expanding Functions with Prototypes, Expanding Functions with Prototypes Q query access, IndexedDB QUnit, Testing JavaScript Applications, QUnit, Testing with QUnit, Running QUnit from Selenium R race conditions, IndexedDB, Splitting Up Work Through Web Workers raises() method, Testing with QUnit rar files, Drag-and-Drop RC server, Selenium, Selenium readAsArrayBuffer() method (FileReader), Working with Files readAsBinaryString() method (FileReader), Working with Files readAsText() method (FileReader), Working with Files readDataAsURL() method (FileReader), Working with Files Real World Haskell (Goerzen & Stewart), Functional Programming recall() method (DSt), DSt reduce(), reduceRight() methods, Array Iteration Operations refresh command (Selenium), Selenium Commands remove() method, Deleting Data replace() method, Prototypes and How to Expand Objects required attribute (forms), New Form Types Resig, John, JavaScript’s Triumph reuse of multithread processing, A Pattern for Reuse of Multithread Processing, A Pattern for Reuse of Multithread Processing Review-Aggregates specs, Microdata Reviews specs, Microdata revokeBlobURL() method, Blobs Rhino, JavaScript Tools You Should Know role attribute, Accessibility Through WAI-ARIA route finder, Maps Ruby, Web Socket Example Ruby Event Machine, Ruby 
Event Machine run function example, A Pattern for Reuse of Multithread Processing run() method, Web Worker Fractal Example running average example, Array Iteration Operations runtime model, JavaScript, Testing JavaScript Applications S Safari Nightly builds, Blobs Safari, Apple’s, JavaScript’s Triumph, A Pattern for Reuse of Multithread Processing, Libraries for Web Workers, Web Sockets same origin policy, The localStorage and sessionStorage Objects, IndexedDB sandboxed environment, Developing Web Applications, Filesystem save queue examples, Storing Changes for a Later Server Sync Scala, Web Socket Example Scalable Vector Graphics, Canvas and SVG scaling images, Functional Programming scope, Lambda Functions Are Powerful, Closures <script> tag, The Worker Environment search form input, New Form Types Selenese, Selenium Commands, Selenese Command Programming Interface, Selenese Command Programming Interface Selenium, Testing JavaScript Applications, Selenium, Selenium, Selenium, Selenium Commands, Selenium Commands, Selenium Commands, Constructing Tests with the Selenium IDE, Constructing Tests with the Selenium IDE, Automatically Running Tests, Automatically Running Tests, Selenese Command Programming Interface, Selenese Command Programming Interface, Running QUnit from Selenium, Drag-and-Drop automatically running tests, Automatically Running Tests, Automatically Running Tests commands, Selenium Commands, Selenium Commands constructing tests, Constructing Tests with the Selenium IDE example tests, Selenium IDE, Selenium, Constructing Tests with the Selenium IDE location options, Selenium Commands no drag-and-drop, Drag-and-Drop running QUnit from, Running QUnit from Selenium Selenese, Selenese Command Programming Interface, Selenese Command Programming Interface test table, Selenium Selenium Grid, Automatically Running Tests Selenium RC server, Selenium, Selenium RC and a Test Farm self object, The Worker Environment Sencha ExtJS library, JavaScript’s Triumph 
send(“data”) method, The Web Sockets Interface server delay, Developing Web Applications, Web Sockets server polling, Web Sockets server-side testing, Testing JavaScript Applications, Testing JavaScript Applications, Selenese Command Programming Interface sessionStorage object, The localStorage and sessionStorage Objects, Using localStorage in ExtJS setInterval() method, Expanding Functions with Prototypes, The Worker Environment setTimeout() method, Expanding Functions with Prototypes, Testing with QUnit, The Worker Environment, A Pattern for Reuse of Multithread Processing, A Pattern for Reuse of Multithread Processing setVersion transaction, Adding Indexes side effects, Functional Programming single-step mode, Selenium sleep() method, Selenese Command Programming Interface slice() method, Prototypes and How to Expand Objects, Blobs slider, form, New Form Types smartphones, Developing Web Applications some() method, Array Iteration Operations Souders, Steve, JavaScript Tools You Should Know speech input type, New Form Types Speed Tracer, JavaScript Tools You Should Know speed, data storage and, Local Storage split() method, Prototypes and How to Expand Objects SQL Injection attacks, IndexedDB SQLite versus IndexedDB, IndexedDB squaring numbers example, Prototypes and How to Expand Objects src attribute, Audio and Video StackOverflow website, The Power of JavaScript startWorker() method, Web Worker Fractal Example static data storage, Local Storage Stefanov, Stoyan, The Power of JavaScript step through, Selenium, Testing and Debugging Web Workers Stewart, Donald Bruce, Functional Programming stock price examples, Web Socket Example, Ruby Event Machine stop() method, A Pattern for Reuse of Multithread Processing storage events, The localStorage and sessionStorage Objects storage viewer widget, The localStorage and sessionStorage Objects $.storage() method (Hive API), Libraries for Web Workers store() method (DSt), DSt store_form() method (DSt), DSt strictEqual() 
method, Testing with QUnit string token replacement, Prototypes and How to Expand Objects strings, methods for, Prototypes and How to Expand Objects Structure and Interpretation of Computer Programs (Abelson & Sussman), Functional Programming structured data, query access to, IndexedDB subclassing, Currying and Object Parameters Sussman, Gerald Jay, Functional Programming SVG, Canvas and SVG <svg> tag, Graphics Symfony Yaml Library, Updates to the Manifest File T tar files, Drag-and-Drop TCP socket, Web Socket Protocol TCP/IP sockets, Web Sockets tel form input, New Form Types test machines, Automatically Running Tests test suites, Testing JavaScript Applications, QUnit, Testing with QUnit, Selenium, Selenium RC and a Test Farm, Selenium, Selenium, Running QUnit from Selenium programming language based, Selenium QUnit, QUnit, Testing with QUnit, Selenium, Running QUnit from Selenium Selenium, Selenium, Selenium RC and a Test Farm server-side, Testing JavaScript Applications Test-driven development, Testing JavaScript Applications, Testing JavaScript Applications thread safety, Splitting Up Work Through Web Workers threads, Adding Power to Web Applications time form input, New Form Types title attribute, Accessibility Through WAI-ARIA transaction object, database, IndexedDB transaction.abort() method (jQuery), IndexedDB transaction.done() method (jQuery), IndexedDB True Type font files, New CSS type command (Selenium), Selenium Commands U undefined value, Functional Programming unit testing, Testing JavaScript Applications update() method (IndexedDB), Adding and Updating Records updateEach() method (IndexedDB), Adding and Updating Records uploading files, Uploading Files URLs, QUnit, Blobs, Working with Files, Structure of the Manifest File, Events, Events, Debugging Manifest Files, The Web Sockets Interface adding ?
Going Dark: The Secret Social Lives of Extremists
by
Julia Ebner
Published 20 Feb 2020
‘For every one hundred young boys watching Pornhub.com, there would be one or two who could be recruited into their ranks,’ write former CIA agent Malcolm Nance and cyber-security expert Chris Sampson in Hacking ISIS.3 Cyber literacy was a welcome by-product that these young members would bring to jihadist organisations, and ironically it often came from their obsession with internet porn and online gaming.4 ‘Okay, some of you wanted to know more about device-hacking techniques,’ Mahed writes. SQL injections, he explains, are among the most commonly used hacking techniques. By injecting a code into a system, it allows you to take control of a database server. It can be used to steal and edit data or to destroy a database. Our teachers share a screenshot of a database, explaining that you could make the system spit out the following personal and financial details of its users: • Full name • Address • Email address • Payment-card number (PAN) • Expiration date • Card security code (CVV) A few days later, a frosty Monday in November 2017, I spill hot coffee over the keyboard of my laptop when opening the International Business Times to check the latest news.
…
A pro-ISIS group operating under the name Team System DZ has claimed responsibility for the cyberattacks. ‘The FBI is trying to determine who was behind the hack,’ according to the article.5 In a public statement the host of the compromised websites, SchoolDesk, speculated that the attack might have been an SQL injection. Our technical staff discovered that a small file had been injected into the root of one of the SchoolDesk websites, redirecting approximately 800 school and district websites to an iFramed YouTube page containing an audible Arabic message, unknown writing and a picture of Saddam Hussein. A few days later, the Prince Albert Police website is hacked as well, by the same group.
…
M. here, here b4bo here bin Laden, Osama here, here, here birthrates here, here Bissonnette, Alexandre here, here BitChute here bitcoin here, here, here Blissett, Luther here Bloc Identitaire here blockchain technology here bloggers here Blood & Honour here Bloom, Mia here Bloomberg, Michael here Böhmermann, Jan here Bowers, Robert here Breed Them Out here Breitbart here, here, here Breivik, Anders Behring here, here ‘Brentonettes’ here Brewer, Emmett here Brexit here, here Britain First here British National Party (BNP) here, here, here Broken Heart operation here Brown, Dan here Bubba Media here Bumble here, here Bundestag hack here, here BuzzFeed here C Star here, here ‘Call of Duty’ here, here Cambridge Analytica here, here Camus, Renaud here Carroll, Lewis here CBS here Channel programme here Charleston church shooting here Charlie Hebdo here Charlottesville rally here, here, here, here, here, here, here, here, here Chemnitz protests here, here Choudary, Anjem here Christchurch terror attacks here, here, here, here Christian identity here Chua, Amy here CIA here, here, here Clinton, Bill and Hillary here, here, here, here, here, here, here Cohn, Norman here Collett, Mark here Cologne rape crisis here Combat here, here Comey, James here Comvo here concentration camps here Conrad, Klaus here Conservative Political Action Conference here Constitution for the Ethno-State here Corem, Yochai here counter-extremism legislation here counter-trolling here Covington, Harold here Crash Override Network here Crusius, Patrick here cryptocurrencies here, here, here, here Cuevas, Joshua here Cyberbit here Cyborgology blog here ‘Daily Shoah’ podcast here Daily Stormer here, here, here, here, here, here, here, here, here Weev and here Damore, James here Dark Net here Data and Society Research Institute here Davey, Jacob here Dawkins, Richard here, here De La Rosa, Veronique here de Turris, Gianfranco here Dearden, Lizzie here deep fakes here, here DefCon here, here Der Spiegel 
here Deutsche Bahn here Diana, Princess of Wales here, here Die Linke here Die Rechte here ‘digital dualism’ here digital education here disinformation here, here, here Disney here Domestic Discipline here, here Donovan, Joan here Doomsday preppers here doubling here Dox Squad here, here doxxing here, here, here, here, here Doyle, Laura here, here Draugiem here DTube here Dugin, Alexander here Dunning–Kruger Effect here Dutch Leaks here Dylan, Bob here Earnest, John here 8chan here, here, here, here, here, here, here, here EKRE (Estonian fascist party) here El Paso shooting here Element AI here Emanuel, Rahm here encryption and steganography here Encyclopedia Dramatica here English Defence League here, here, here, here Enoch, Mike here environmentalism here, here ethno-pluralism here, here ‘Eurabia’ here, here ‘European Israel’ here European National here European Parliament elections here European Spring here Evola, Julius here executions here Facebook friends here fashions and lifestyles here, here Fawcett, Farah here Faye, Guillaume here FBI here, here, here, here, here Fearless Democracy here, here FedEx here Feldman, Matthew here Ferdinand II, King of Aragon here Fiamengo, Janice here Fields, James Alex here Fight Club here Finkelstein, Robert here Finsbury Mosque attack here, here, here Fisher, Robert here Foley, James here Follin, Marcus here football hooligans here, here Football Lads Alliance (FLA) here For Britain party here Fortnite here 4chan here, here, here, here, here, here, here, here, here FPÖ (Austrian Freedom Party) here, here, here, here, here Frankfurt School here Fransen, Jayda here Fraternal Order of Alt-Knights here Freedom Fighters, The here freedom of speech here, here, here, here F-Secure here FSN TV here Gab here, here, here, here, here, here Gamergate controversy here GamerGate Veterans here gamification here, here, here, here, here, here, here, here Ganser, Daniele here Gates of Vienna here Gateway Pundit here Gawker here GCHQ here GE 
here GellerReport here Generation Identity (GI) here, here, here, here, here, here, here, here Generation Islam here genetic testing here, here German elections here, here German Institute on Radicalization and De-Radicalization Studies here German National Cyber Defence Centre here Gervais, Ricky here Ghost Security here Giesea, Jeff here Gigih Rahmat Dewa here Gionet, Tim here gladiators here Global Cabal of the New World Order here global financial crisis here, here global warming here GNAA here Goatse Security here GOBBLES here Goebbels, Joseph here GoFundMe here Goldy, Faith here Goodhart, David here ‘Google’s Ideological Echo Chamber’ here Gorbachev, Mikhail here Graham, Senator Lindsey here Gratipay here Great Awakening here, here Great Replacement theory here, here, here, here, here ‘Grievance Studies’ here grooming gangs here, here Guardian here, here H., Daniel here Habeck, Robert here HackerOne here hackers and hacking here ‘capture the flag’ operations here, here denial of service operations here ethical hacking here memory-corruption operations here political hacking here ‘qwning’ here SQL injections here techniques here Halle shooting here Hamas here, here Hanks, Tom here Happn here Harris, DeAndre here ‘hashtag stuffing’ here Hate Library here HateAid here, here Hatreon here, here, here Heidegger, Martin here Heise, Thorsten here, here Hensel, Gerald here, here Herzliya International Institute for Counter-Terrorism here Heyer, Heather here, here, here Himmler, Heinrich here Hintsteiner, Edwin here Histiaeus here Hitler, Adolf here, here, here, here, here Mein Kampf here, here Hitler salutes here, here, here, here Hitler Youth here HIV here Hizb ut-Tahrir here, here, here Höcker, Karl-Friedrich here Hofstadter, Richard here Hollywood here Holocaust here Holocaust denial here, here, here, here, here Holy War Hackers Team here Home Office here homophobia here, here, here Hooton Plan here Hoover Dam here Hope Not Hate here, here, here Horgan, John here 
Horowitz Foundation here Hot or Not here House of Saud here Huda, Noor here human trafficking here, here Hussein, Saddam here, here Hutchins, Marcus here Hyppönen, Mikko here Identity Evropa here, here iFrames here Illuminati here Incels (Involuntary Celibacy) here, here Independent here Inkster, Nigel here Institute for Strategic Dialogue (ISD) here, here, here, here, here, here, here, here Intelius here International Business Times here International Centre for the Study of Radicalisation (ICSR) here International Federation of Journalists here International Holocaust Memorial Day here International Institute for Strategic Studies here Internet Research Agency (IRA) here iPads here iPhones here iProphet here Iranian revolution here Isabella I, Queen of Castile here ISIS here, here, here, here, here, here, here, here, here, here, here, here hackers and here, here, here, here, here Islamophobia here, here, here, here, here, here, here Tommy Robinson and here, here see also Finsbury Mosque attack Israel here, here, here, here, here Israel Defense Forces here, here Jackson, Michael here jahiliyya here Jakarta attacks here Jamaah Ansharud Daulah (JAD) here Japanese anime here Jemaah Islamiyah here Jesus Christ here Jewish numerology here Jews here, here, here, here, here, here, here, here, here see also anti-Semitism; ZOG JFG World here jihadi brides here, here JihadWatch here Jobs, Steve here Johnson, Boris here Jones, Alex here Jones, Ron here Junge Freiheit here Jurgenson, Nathan here JustPasteIt here Kafka, Franz here Kampf der Niebelungen here, here Kapustin, Denis ‘Nikitin’ here Kassam, Raheem here Kellogg’s here Kennedy, John F. 
here, here Kennedy family here Kessler, Jason here, here Khomeini, Ayataollah here Kim Jong-un here Kohl, Helmut here Köhler, Daniel here Kronen Zeitung here Kronos banking Trojan here Ku Klux Klan here, here Küssel, Gottfried here Lane, David here Le Loop here Le Pen, Marine here LeBretton, Matthew here Lebron, Michael here Lee, Robert E. here Li, Sean here Li family here Libyan Fighting Group here LifeOfWat here Lifton, Robert here Littman, Gisele here live action role play (LARP) here, here, here, here, here, here lobbying here Lokteff, Lana here loneliness here, here, here, here, here, here, here Lorraine, DeAnna here Lügenpresse here McDonald’s here McInnes, Gavin here McMahon, Ed here Macron, Emmanuel here, here, here, here MAGA (Make America Great Again) here ‘mainstream media’ here, here, here ‘Millennium Dawn’ here Manosphere here, here, here March for Life here Maria Theresa statue here, here Marighella, Carlos here Marina Bay Sands Hotel (Singapore) here Marx, Karl here Das Kapital here Masculine Development here Mason, James here MAtR (Men Among the Ruins) here, here Matrix, The here, here, here, here May, Theresa here, here, here Meechan, Mark here Meme Warfare here memes here, here, here, here and terrorist attacks here Men’s Rights Activists (MRA) here Menlo Park here Mercer Family Foundation here Merkel, Angela here, here, here, here MGTOW (Men Going Their Own Way) here, here, here MI6, 158, 164 migration here, here, here, here, here, here, here, here, here see also refugees millenarianism here Millennial Woes here millennials here Minassian, Alek here Mindanao here Minds here, here misogyny here, here, here, here, here see also Incels mixed martial arts (MMA) here, here, here, here Morgan, Nicky here Mounk, Yascha here Movement, The here Mueller, Robert here, here Muhammad, Prophet here, here, here mujahidat here Mulhall, Joe here MuslimCrypt here MuslimTec here, here Mussolini, Benito here Naim, Bahrun here, here Nance, Malcolm here Nasher App 
here National Action here National Bolshevism here National Democratic Party (NPD) here, here, here, here National Health Service (NHS) here National Policy Institute here, here National Socialism group here National Socialist Movement here National Socialist Underground here NATO DFR Lab here Naturalnews here Nawaz, Maajid here Nazi symbols here, here, here, here, here, here, here see also Hitler salutes; swastikas Nazi women here N-count here Neiwert, David here Nero, Emperor here Netflix here Network Contagion Research Institute here NetzDG legislation here, here Neumann, Peter here New Balance shoes here New York Times here News Corp here Newsnight here Nietzsche, Friedrich here, here Nikolai Alexander, Supreme Commander here, here, here, here, here, here 9/11 attacks here, here ‘nipsters’ here, here No Agenda here Northwest Front (NWF) here, here Nouvelle Droite here, here NPC meme here NSDAP here, here, here Obama, Barack and Michelle here, here, here, here, here Omas gegen Rechts here online harassment, gender and here OpenAI here open-source intelligence (OSINT) here, here Operation Name and Shame here Orbán, Viktor here, here organised crime here Orwell, George here, here Osborne, Darren here, here Oxford Internet Institute here Page, Larry here Panofsky, Aaron here Panorama here Parkland high-school shooting here Patreon here, here, here, here Patriot Peer here, here PayPal here PeopleLookup here Periscope here Peterson, Jordan here Pettibone, Brittany here, here, here Pew Research Center here, here PewDiePie here PewTube here Phillips, Whitney here Photofeeler here Phrack High Council here Pink Floyd here Pipl here Pittsburgh synagogue shooting here Pizzagate here Podesta, John here, here political propaganda here Popper, Karl here populist politicians here pornography here, here Poway synagogue shooting here, here Pozner, Lenny here Presley, Elvis here Prideaux, Sue here Prince Albert Police here Pro Chemnitz here ‘pseudo-conservatives’ here Putin, 
Vladimir here Q Britannia here QAnon here, here, here, here Quebec mosque shooting here Quilliam Foundation here, here, here Quinn, Zoë here Quran here racist slurs (n-word) here Radio 3Fourteen here Radix Journal here Rafiq, Haras here Ramakrishna, Kumar here RAND Corporation here Rasmussen, Tore here, here, here, here Raymond, Jolynn here Rebel Media here, here, here Reconquista Germanica here, here, here, here, here, here, here Reconquista Internet here Red Pill Women here, here, here, here, here Reddit here, here, here, here, here, here, here, here, here, here redpilling here, here, here, here refugees here, here, here, here, here Relotius, Claas here ‘Remove Kebab’ here Renault here Revolution Chemnitz here Rigby, Lee here Right Wing Terror Center here Right Wing United (RWU) here RMV (Relationship Market Value) here Robertson, Caolan here Robinson, Tommy here, here, here, here, here, here, here, here Rockefeller family here Rodger, Elliot here Roof, Dylann here, here Rosenberg, Alfred here Rothschilds here, here Rowley, Mark here Roy, Donald F. here Royal Family here Russia Today here, here S., Johannes here St Kilda Beach meeting here Salafi Media here Saltman, Erin here Salvini, Matteo here Sampson, Chris here, here Sandy Hook school shooting here Sargon of Akkad, see Benjamin, Carl Schild & Schwert rock festival (Ostritz) here, here, here Schilling, Curt here Schlessinger, Laura C. 
here Scholz & Friends here SchoolDesk here Schröder, Patrick here Sellner, Martin here, here, here, here, here, here, here, here, here, here Serrano, Francisco here ‘sexual economics’ here SGT Report here Shodan here, here Siege-posting here Sleeping Giants here SMV (Sexual Market Value) here, here, here Social Justice Warriors (SJW) here, here Solahütte here Soros, George here, here Sotloff, Steven here Southern, Lauren here Southfront here Spencer, Richard here, here, here, here, here, here Spiegel TV here spoofing technology here Sputnik here, here SS here, here Stadtwerke Borken here Star Wars here Steinmeier, Frank-Walter here Stewart, Ayla here STFU (Shut the Fuck Up) here Stormfront here, here, here Strache, H.
Humble Pi: A Comedy of Maths Errors
by
Matt Parker
Published 7 Mar 2019
It looks like I fell asleep on my keyboard but it is actually a fully functional computer program that will scan through a database without needing to know how it is arranged. It will simply hunt down all the entries in the database and make them available to whoever managed to sneak that code into the database in the first place. It’s yet another example of online humans being jerks. Typing it in as someone’s name is not a joke either. This is known as an SQL injection attack (named after the popular database system SQL; sometimes pronounced like ‘sequel’). It involves entering malicious code via the URL of an online form and hoping whoever is in charge of the database has not put enough precautions in place. It’s a way to hack and steal someone else’s data.
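The attack described in this excerpt can be illustrated with a minimal sketch. It assumes a hypothetical `users` table in an in-memory SQLite database, and the function names are invented for illustration; none of this comes from the book. The first lookup pastes user input straight into the SQL text, while the second passes it as a bound parameter, the kind of precaution the excerpt alludes to.

```python
import sqlite3


def build_demo_db():
    """Create a throwaway in-memory database with a hypothetical users table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, email TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "alice@example.com"), ("bob", "bob@example.com")],
    )
    return conn


def find_user_unsafe(conn, name):
    # Vulnerable: the input becomes part of the SQL text itself,
    # so a crafted "name" can change the meaning of the query.
    query = "SELECT name, email FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # Safer: the input is sent as a bound parameter and is only ever
    # treated as data, never as SQL.
    return conn.execute(
        "SELECT name, email FROM users WHERE name = ?", (name,)
    ).fetchall()


conn = build_demo_db()
malicious_name = "nobody' OR '1'='1"

leaked = find_user_unsafe(conn, malicious_name)   # condition is always true
blocked = find_user_safe(conn, malicious_name)    # no user has this literal name

print(len(leaked), "rows leaked by the unsafe query")
print(len(blocked), "rows returned by the parameterized query")
```

With the crafted input, the unsafe version returns every row in the table, whereas the parameterized version simply finds no user with that odd name.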
…
It’s really easy to convert between binary and base-16 hexadecimal numbers, which is why hexadecimal is used to make computer binary a bit more human-friendly; the hexadecimal 4C47 represents the full binary number 0100110001000111 but is much easier to read. You can think of hexadecimal as binary in disguise. It was used in the SQL injection example before, to hide computer code in plain sight. The mistake is to try to store computer data which uses hexadecimal values in Excel, a mistake I’m as guilty of as anyone. I had to store hexadecimal values in a spreadsheet of people who had crowdfunded my online videos, and Excel immediately turned them all to text.
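The conversion mentioned in this excerpt is easy because each hexadecimal digit stands for exactly four binary digits. A short sketch, using helper names invented here for illustration, reproduces the 4C47 example from the text:

```python
def hex_to_binary(hex_text):
    """Expand each hexadecimal digit into its four-bit binary group."""
    return "".join(format(int(digit, 16), "04b") for digit in hex_text)


def binary_to_hex(bits):
    """Collapse each group of four bits into one hexadecimal digit.

    Assumes the length of `bits` is a multiple of four.
    """
    groups = [bits[i:i + 4] for i in range(0, len(bits), 4)]
    return "".join(format(int(group, 2), "X") for group in groups)


bits = hex_to_binary("4C47")
print(bits)                 # 0100110001000111
print(binary_to_hex(bits))  # 4C47
```

Working digit by digit: 4 is 0100, C is 1100, 4 is 0100 and 7 is 0111, which joined together give the sixteen-bit number quoted in the excerpt.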
…
Advertising Standards Authority: 198.64179–199.41791 Air Canada: 85.20896–85.23881, 87.41791–88.20896 Air Force: 69.00000–73.94216, 286.62687–287.50746 air traffic control: 302.95522–304.00000 Ariane rocket: 20.41791, 26.56530–27.26866, 29.80597–30.59515 attractiveness scores: 68.35821–69.53731 average Australian: 74.62687–75.95522, 77.56716–77.82090 Avery Blank: 258.02985–258.17910 Benford’s Law: 36.80597–40.95709 Big data: 259.62687–259.74627 big enough: 27.17910–27.47761, 40.50746, 73.14925, 171.20896, 196.02799, 303.08955–304.44776, 313.86567 big number: 47.74627, 203.55224, 253.77612, 289.05970, 310.00000–311.72948, 313.80597 Bill Gates: 141.11940, 160.05970 billion seconds: 290.23881–291.44776, 310.23881–310.41791 binary number: 34.29851, 179.38806–180.50746, 182.35821, 185.35821, 189.41791, 191.08955–191.53731, 246.23881, 250.05970–251.92537, 290.35821, 292.65672–292.80597, 301.00000 brewing beer: 147.89552, 149.59701 Brian Test: 258.02985–258.86567 Casio fx: 11.74627, 175.00000 cause problems: 81.23881, 122.29851–122.56716, 144.20896, 249.56716 CEO pay: 131.53731–131.62687, 133.29851–133.41791 cheese slices: 4.74627, 102.95709 classic football: 235.80597, 238.44776 clay tablets: 149.50746–150.89552 clocks going: 112.08955–114.71642 clockwise anticlockwise: 216.26866 Cluster mission: 29.68657–30.86381 computer code: 29.89552, 123.59701, 187.57836, 189.17910, 191.23881–192.68657, 250.14925, 256.56716, 261.80597, 289.53731, 302.38806–303.02985 constant width: 221.59701–222.86567 cot death: 167.56716–167.68657 crescent moon: 229.11940–231.92537 Datasaurus Dozen: 56.53731, 67.95709–68.92537 Date Line: 286.02985–286.56716 daylight saving: 64.11940–65.92537, 112.05970–114.83582 deliberately vague: somewhere in 7 to 10 and maybe 74 Dice O Matic: 52.20896–52.44776 diehard package: 34.00000–35.86567 Dow Jones: 125.29851, 143.00000–144.47761 drug trial: 62.47761–63.53731 electron beam: 184.05970, 186.53731–186.83582 expensive book: 141.14925–142.86567 explain why: 
16.68657, 208.11940, 312.86567–312.89552 false positive: 64.62687, 154.74627, 247.08955, 252.50746, 276.86567, 301.20896, 308.26866–309.86567 fat fingers error: 143.44776, 150.11940 feedback loop: 144.35821, 268.08955–269.86567, 274.26866, 276.02985–277.80597 fence post problem: 208.83582–209.47761 Fenchurch Street: 280.20896, 282.35821–283.38806 fibre optic cable: 136.32836–136.89552, 138.08955 flash crash: 143.41791–144.38806 foot doughnut: 235.50933–235.71642 frigorific mixture: 92.44776–92.68657 fuel gauges: 85.05970–87.92537 full body workout: 213.71642–213.95522 functional sausage: 123.44776–123.47761 gene names: 247.00000–248.47761 Gimli Glider: 82.11940–83.68657 GOOD LUCK: 25.74627, 83.47761–84.83582, 288.74440 Gregorian calendar: 288.14925–288.41791, 293.08955–296.92537 Grime Dice: 160.56716–160.90672, 162.14925 Harrier Jet: 311.00000–313.65672 heart attacks: 64.11940–65.89552, 112.11940–114.92537 high frequency trading: 142.53731, 145.44776–146.71642 Hot Cheese: 1.92537, 4.59701 human brains: 149.05970, 159.00000, 266.29851, 308.02985–309.44776 Hyatt Regency: 264.80597, 267.95522 International Date Line: 286.02985–286.56716 JPMorgan Chase: 240.80597–241.68657 Julian calendar: 295.23881–297.59701 Julius Caesar: 206.32836–206.89552, 293.80597, 297.17910–298.56716 Kansas City: 264.80597–264.86567, 267.95522 lava lamps: 31.27612–32.82090 leap years: 206.29851, 288.47761, 295.26866, 297.00000–298.86567 Lego bricks: 201.05970–202.95522 long enough: 55.77425–56.83582, 99.20896–99.26866, 171.08955, 194.50746, 246.86567, 303.44776 Los Angeles: 255.17910–255.65672, 302.20896, 305.53731 magic squares: 6.11940–7.89552 Mars Climate Orbiter: 96.80410–97.87873 maths error: 9.91418, 11.29851, 28.56716, 79.62687, 97.75933, 147.50746, 175.00000, 181.56716, 245.68657, 304.29851–305.86567 maths mistake: 7.35821–8.11940, 89.65672, 149.80597, 174.11940–175.65672, 181.59701, 206.65672–206.71642, 210.14925, 259.71642, 264.41791, 300.08955, 305.47761, 308.74627–308.95522 McChoice 
Menu: 197.02985, 199.44776–200.00000 Millennium Bridge: 269.68657–269.83582, 274.26866–277.41791, 280.00000–281.68657 mobile phone masts: 57.19403–59.92537 most important: 6.00000, 190.62687 Mr Average: 74.14925–75.53731 NBA players: 166.17910–166.44776 non transitive dice: 141.17910, 160.00000–162.14925 non zero: 157.35821, 183.20896, 185.17910, 192.74627 Null Island: 254.50746–255.95522 null meal: 197.74627, 199.26866–200.50746 oddly specific: 117.2089552238806, 139.1194029850746, 190.5074626865672, 245.7462686567164 off-by-one errors: 206.53731, 209.65672–211.17910 Olympic Games: 293.26866, 300.14925 Parker Square: 5.05970–6.89552 Penney Ante: 163.32836–163.38806 Pepsi Points: 311.69963–313.71642 phone masts: 57.19403–59.92537 plug seal: 100.90672 Pseudorandom number generators: 43.74627–44.44776, 47.56716–48.86567 punch cards: 73.74627–73.77612, 75.08955–76.74627 real world: 38.34328–40.23881, 121.38806, 123.80597, 145.95522, 208.02985, 241.26866, 244.53731–245.95522 resonant frequencies: 269.68657–269.74627, 274.23881–280.95522 Richard Feynman: 154.95522, 222.00000–223.29851 rm -rf: 23.08955–23.47761 Rock Paper Scissors: 163.50746–163.71642 roll over errors: 25.38806, 180.32836, 184.53731, 190.26866–191.20896 Royal Mint: 215.92537–216.85075 salami slicing: 121.35821–123.80597 salt mines: 232.00000–233.80597 scientific notation: 249.32836–250.77612, 252.00000–253.83582 Scud missile: 179.00000–181.65672 sea level: 89.17910–89.83582 seemingly arbitrary: 53.20896, 190.38806, 296.86567, 302.23881 Sesame Street: 230.00000–230.26866 should open: 225.53731–225.62687, 227.41791–227.44776 skin tight garments: 73.23881–73.26866 something else: 57.29851, 64.65672, 247.02985, 270.11940, 301.14925 Space Invaders: 19.59701–21.83396 space shuttle: 4.10448–4.34328, 152.26866–153.50746, 223.14925–223.35821, 230.80597–230.89552 SQL injection: 250.11940, 256.02985 standard deviation: 34.35821–34.53731, 66.93097–68.83582, 73.65672, 132.86381 Steve Null: 258.00000–259.80597, 
261.47761 Stock Exchange: 123.71642, 125.08955–125.65672, 145.29851–145.62687, 150.26866–151.92351 stock options: 131.14925–133.92537 street signs: 233.29851–234.38806, 238.35634–238.50746 survivor bias: 13.89552, 21.32836, 65.53731–66.77612 Swiss Cheese model: 4.56716–4.83582, 13.23881, 103.80597 synchronous lateral excitation: 276.38806–277.38806, 280.62687 T shirt: 5.00000–5.02985, 313.41791–313.62687 tabulating machines: 73.74627, 75.08955–76.65672 Tacoma Narrows Bridge: 268.47761–270.29851 tallest mountain: 115.11940–116.89552 tax return: 36.71642–36.80597, 38.61194, 41.05970–42.92537 Therac machine: 13.00000, 185.05970–185.80597 three cogs: 214.00000–215.74813, 219.02985–220.35821 Tokyo Stock Exchange: 125.38806, 150.26866–151.92351 torsional instability: 268.17910–271.89552 trading algorithms: 138.77612, 142.53731–142.74627, 144.23881–146.68657 Traffic Control: 302.20896–305.68657 Trump administration: 128.17910–129.89552 UK government: 52.47761, 233.23881–234.83582, 238.86567, 256.44776 UK lottery: 155.38806–155.86567, 159.41791–159.92537, 308.08955–308.14925 UK street signs: 238.35634–238.50746 US army: 179.77612, 181.68657–181.71642 USS Yorktown: 175.20896–175.77425 Vancouver Stock Exchange: 123.71642, 125.08955–125.65672 waka waka: 189.95709 went wrong: 27.62687, 29.80597, 84.41791, 134.80597–134.92537, 145.77612, 265.20896, 267.50746, 280.62687, 286.65672, 305.89552 Wobbly Bridge: 97.66978, 280.11940–280.41791 Woolworths locations: 60.00000–60.62687 world record: 119.11940–121.77612, 135.35261, 298.77612 wrong bolts: 99.02985–99.74627, 101.17910–102.70149, 104.77612 X rays: 185.86567–186.80597
Beautiful Testing: Leading Professionals Reveal How They Improve Software (Theory in Practice)
by
Adam Goucher
and
Tim Riley
Published 13 Oct 2009
When actual HTML, JavaScript, or CSS needs to be stored, whitelists are used instead of blacklists, as they control what is allowed rather than what is disallowed. SQL injection†: This is another issue of trusting the client. Using raw user input to access your database directly is a Bad Idea. I will either manually correlate the database interactions that I can control from the frontend with the actual code, or use a script to pull out all the database calls and inspect them. SQL injection is a problem that has a known solution (parameterized SQL and/or diligent escaping), so code inspection is a quick and efficient way of identifying this type of problem.
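As a rough illustration of the "known solution" mentioned above, the sketch below contrasts a query built by string concatenation with a parameterized one. It uses Python's standard sqlite3 module and an in-memory database; the `users` table, the function names, and the payload are hypothetical and are not taken from the book.

```python
import sqlite3


def find_user_unsafe(conn, name):
    # Vulnerable: the input is spliced into the SQL text itself.
    sql = "SELECT name FROM users WHERE name = '" + name + "'"
    return [row[0] for row in conn.execute(sql)]


def find_user_safe(conn, name):
    # Parameterized: the driver passes the value separately from the SQL text.
    sql = "SELECT name FROM users WHERE name = ?"
    return [row[0] for row in conn.execute(sql, (name,))]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")  # hypothetical table
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

payload = "nobody' OR '1'='1"  # hypothetical hostile input

leaked = find_user_unsafe(conn, payload)   # the OR clause matches every row
matched = find_user_safe(conn, payload)    # treated as one literal name
```

When inspecting code for this class of problem, the concatenation in `find_user_unsafe` is the pattern to look for.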
…
Appropriate permissions: Can a user of a certain permission class do what they should be able to do? And only what they should be able to do? Information leakage: Can a user access/view/modify information they should not be able to access? Consider a multitenant system with Coke and Pepsi as two of your clients. Clearly, Coke should not be able to see Pepsi’s information, and vice versa. L: Languages. The next letter and layer of testing that is done centers around Languages.
* http://en.wikipedia.org/wiki/Cross-site_scripting
† http://en.wikipedia.org/wiki/SQL_injection
…
Send email to index@oreilly.com. 323 central limit theorem, 135 change-centric testing, 143 caller and callee function dependencies, 147 code coverage and gap analysis, 151 complex code development models, 146–152 document-driven change-centric testing frameworks, 145 mapping of source files to test cases, 147 summary, 152–154 character palette, 237 Charmap, 237 chi-square test, 138 ClamAV (Clam Anti-Virus), 269–283 compatibility testing, 279 fuzz testing, 276 performance testing, 279 testing methods, 270–283 Autoconf, 278 black box and white box testing, 270 Buildbot, 278 collecting problem files, 278 false alerts, 281 memory checkers, 273–275 static analysis, 271–273 test scripts, 276 unit testing, 275 usability and acceptance testing, 282 clamd, 270 clamscan, 270 Clang Static Analyzer, 271 code coverage, 151, 247 code examples, xvii collaboration, 38, 190–193 command line, 108 communication, 27 compatibility testing of ClamAV, 279 compiler warnings, 125 condition coverage, 248 configuration of automated tests, planning, 108 conflict, 19 continuous integration, 106, 198, 288 Windmill testing framework, development for, 285 coordination, 29 Cosmo Web User Interface, 289 coverage metrics, 247 Coverity, 120 critical incident analysis, 51 cross-site scripting (XSS), 236 CruiseControl, 200 Cunningham, Ward, 177 D Dashboard (Socialtext product), 230 data, good versus bad, 11 324 INDEX DBA (dynamic binary analysis), 149 debriefings, 161 defects, 69 categories and examples, 215 defect reports, 70–77 structure, 71 defining, 70 development defects, 72 early development versus deployed product bugs, 71 location, finding, 72 measuring severity, 78 seeding for testing purposes, 247 tagged defect sets, 77 tagging, 76 test escapes, 78 Diderot Effect, 181 doc testing (Python), 123 DUMA, 274 dynamic analysis, 124 E eBox, 303–315 AJAX usage, 311 ANSTE, 304–307 modules, 303 new modules, difficulties of testing, 304 testing in new Linux releases, 304 efficient testing, 235 developer 
notes, accessing, 240 foreign languages, accommodating, 237, 240 measurement, 238 mindmaps, 242 mutation testing, 251 oracles, 241 regression testing, 239 requirements, 238 scripting, 239 security, 236 SLIME, 235 test data generation, 241 Electric Fence, 274 elegance, 18 EnableModules script, 310 engaging volunteers, 32 equivalent mutants, 253 events, 32 location and scheduling, 33 publicity, 33 exploratory testing, 161–163 eXtensible Messaging and Presence Protocol (see XMPP) Extreme Programming (see XP) F failure analysis, 114 false negatives, 281 false positives, 281 Firebug Lite, 291 Firefox, testing with Sisyphus, 297 Fit, 177 FitNesse, 201 FIXME developer notes, 240 foreign languages, 237 Fusil tool, 125, 277 fuzzing (fuzz testing), 57 ClamAV, 276 custom fuzzers, 63 Fusil tool, 125 general fuzzers, 61 interoperability, improving, 57 limitations, 65 ongoing testing, considerations, 65 preparation, 60 process, 60–65 purpose, 57 random fuzzing, 64 security flaws, detecting, 59 user satisfaction, improving, 58 using corrupted input, 61 working with known bugs, 60 G Gaussian distribution, 132 GCC, 271 Gecko rendering engine, 258 goal-question-metric approach, 23 H Huggins, Jason, 177 I incremental automation, 201 info/query stanzas, 86 information leakage, 237 initiator, 89 instrumented builds, 151 interoperability, 57 invalid input, testing with, 61 invalidation testing, 265 IRC (Internet relay chat), 27 J Jabber, 85 Javalanche, 255 JavaScript, 287 testing scripts, 297 JID (Jabber ID), 86 jsfunfuzz (JavaScript fuzzer), 63 JUnit, 200 K Kaner, Cem, 161 KCachegrind, 149, 150 Klocwork, 120 Knuth, Donald, 211 Komogorov-Smirnov test, 139 L large-scale automated testing, 104–106 choosing which tests to automate, 104 failure analysis, 114 reasonable goals, 115 reporting, 114 test automation systems, 105 test case management, 107 test collateral, 107 test distribution, 112 test infrastructure, 107 test labs, 111 test writing, 106 leadership (see coordination) libclamav, 
270 Lightning, Thunderbird add-on, 27 Lint tool, 125 Lipton, Richard, 250 Lithium tool, 301 load testing, 238 lookupNode, 292 LOUD, 240 M manifest files, 259 Marick, Brian, 174, 195 Mark II computer, first bug, 68 mean test, 135 measurement, 238 medical software testing, 156 ad-hoc testing, 162 adding perspectives, 159 communication, 158 exploratory testing, 159, 161 multiuser testing, 160, 163–165 science labs, 165 scripted testing, 162, 165 simulation of real use, 166 teamwork, 157 testing according to regulations, 168–169 memory checkers, 273–275 Electric Fence and DUMA, 274 INDEX 325 limitations of, 275 Mudflap, 274 Valgrind, 273 memory leaks, 124 Mersenne Twister algorithm, 131 message stanzas, 86 Microsoft Office customer improvement program, 56 mindmaps, 242 Mothra, 250 Mozilla Calendar Project, 27 publicity, 33 quality assurance events, 34 Mozilla Project, 266 evolution of testing strategies, 257 Sisyphus and the Spider tool, 295–301 Mudflap pointer debugging tool, 274 µJava, 250 multi-code-base defect tracking, 74 multiuser testing, 163–165 mutation testing, 250–256 AspectJ example, 252 equivalent mutants, 253 evaluating mutation impacts, 254 growth in use, 255 Javalanche framework, 255 selecting mutation operators, 251 mutations, 250 MySQL, testing with ANSTE, 315 P network services testing, 303 (see also eBox; ANSTE) nonuniform random number generators, 132 normal distribution, 132 Pareto effect, 248 People (Socialtext product), 230 performance test cases, 41 performance testing, 37, 238 ClamAV, 279 collaboration, 38 examples, 42, 43, 45 value of different viewpoints, 46 defining a test model, 38–40 defining testing requirements, 38–45 documentation and problem solving, 48 UAT, 49 user interface and system load, 46 Pettichord, Bret, 176, 177 pipe functions, 150, 154 presence stanzas, 86 printing tests, 263 programming languages, 119 stability, 120 proportion, 181 proxy.xml, 308 pseudorandom number generators (see RNGs) publicity, 33 Python programming 
language, 120, 287 bug list, 128 testing, 121–127 bug fix releases, 124 Buildbot, 121 documentation testing, 123 dynamic analysis, 124 refleak testing, 122 release testing, 123 static analysis, 125 testing philosophy, 120 O Q N Occam’s Razor, 67 Office 2007, fuzz testing of, 59 office software, 55 user expectations, 55 open source communities communication, 27 coordination, 29 engaging, 32 events, 32 goals and rewards, 35 quality assurance using, 27 recruiting, 31 volunteers, 28 OpenSolaris Desktop case study, 79–83 opinion polling, 282 oracles, 241 326 INDEX QA (quality assurance), 178 open source communities, using, 27 queuing theory, 132 R random fuzzing, 64 random number generators (see RNGs) range tests, 134 recruiting, 31 reference testing, 257–266 manifest files, 259 test extensibility, 261–266 asynchronous tests, 262 invalidation tests, 265 printing tests, 263 test structure, 258–261 refleak testing, 122 reftest-print tests, 263 reftest-wait tests, 262 regression testing, 239 automated regression testing (see reference testing) manual versus automated, 23 release candidates, 282 release testing, 123 reporting test results, 114 responder, 89 RNGs (random number generators), 130 difficulty of testing, 130 nonuniform RNGs, 132 test progression, 134–140 bucket test, 138 Komogorov-Smirnov test, 139 mean test, 135 range tests, 134 variance test, 136 uniform RNGs, 131 Rogers, Paul, 177 Ruderman, Jesse, 63 S scalability testing, 239 scenario testing, 97 scripted testing, 162, 165 scripting, 239 security, 59, 236 Selenium, 177, 293, 306 Selenium IDE, 311 session initialization, 97 Sisyphus, 297 extension testing with, 298 Firefox, operation on, 299 Slideshow, 229 SocialCalc, 230 Socialtext, 215 business purpose, 216 software process, 218 software development, 171–176 aesthetics and, 176 agile teams, 172 checking versus investigating, 210 complexity, 175 as a creative process, 174 intensity, 175 joy, 175 multiple character sets, handling, 237 musical performance and, 
172 practice, rehearsal, performance, 173 requirements that demand testing, 238 security, coding for, 236 test-driven development (see TDD) software test suites, evaluating, 247 simulating defects, 249 software testing antivirus software (see ClamAV) measurement of behavior, 238 regression testing, 239 seeding with defects, 247 web applications (see Windmill testing framework) software testing movements, 176 source functions, 150, 153 Spider tool, 296, 301 database, 298 spidering, 295 Splint tool, 272 Spolsky, Joel, 212 SQL injection, 236 stanzas, 86 payloads, 88 static analysis, 125 static analyzers, 271–273 Clang Static Analyzer, 271 GCC, 271 Splint, 272 stories, 218 illustrative example, 219–223 stress testing, 238 Sunbird calendar application, 27 Swift IM client, 85 sync functions, 150, 153 T tag clouds, 77 tags, 109 TCE (Test-Case Effectiveness) metric, 78 TDD (test-driven development), 182–194, 202– 206 author, 191 beauty, 193 delivering value, 206–208 incremental testing and coding, 207 planning, 207 team ownership of problems, 206 driving code with examples, 204 examples, 184 as permanent requirement artifacts, 186 automated examples, 189 readable examples, 185 executors, 191 experimentation, 203 facing problems, 205 full TDD process, 190 planning, 203 reader, 191 red-green-refactor cycle, 182 INDEX 327 report consumer, 192 requirements, 184 result consumer, 191 TDD triangle, 184 team collaboration, 192 testable designs, 187 tests, 185 tool support, 189–192 unit test development, 221 test automation pyramid, 200 test automation systems, 105 test bed preparation, 112 test binary or script, 108 test case ID, 108 test case management, 107–111 test collateral, 107 test data generation, 241 test days, 34 test distribution, 112 test escapes, 78 analyzing, 79 test ID numbers, 107 test infrastructure, 107 test results, reporting, 114 test scripts for ClamAV, 276 test stakeholders, 16 objectives and expectations, 18 test writing, 106 a common approach, 109 
test-driven development (see TDD) testers, 3 experience and training, 8 qualities of testers, 5 roles and purpose, 8 testing agile testing, 177 balanced breakfast approach, 227 beauty, importance of, 210 continuous improvement, rules for, 198 delivering value, 198 reasonable goals, 115 as risk management, 210 teamwork, 195 testability, 199 TestRunner suite, 229 TODO developer notes, 240 TraceMonkey (Mozilla), 298 transparency, 57 Twill tool, 293 U UAT (user acceptance testing), 49 uniform random number generators, 131 unit testing, 89 ClamAV, 275 328 INDEX XMPP multistage protocols, 94–97 XMPP request-response protocols, 89–93 upstream bug tracking, 72 User Community Modeling Language (UCML), 45 user expectations, 55 user satisfaction, 58 V Valgrind, 120, 124, 149, 273 variance test, 136 VersionResponder test, 90 logic test, 92 with structured XML, 91 volunteers, 28 keeping engaged, 32 recruiting, 31 W Watir (Web Application Testing in Ruby), 177 web application testing (see Windmill testing framework) web page testing, 295 evolution of tools for, 296–299 JavaScript automated testing, 296 Spider tool, 296 white box testing, 270 Whittaker, James, 213 wikitests, 223–227 Windmill testing framework, 285–292 debugging tests, 291 Firebug Lite, 291 lookupNode, 292 other web application testing utilities, compared to, 293 running tests, 289–290 Windmill recorder, 286 Windmill website, 292 writing tests in Windmill’s IDE, 286–289 X XML schemas, 101 XMPP (eXtensible Messaging and Presence Protocol), 85–88 automated interoperability testing, 99–101 client-to-client versus client-to-server testing, 99 session initialization testing, 97–99 testing of protocols, 88 unit testing, 85 multistage protocols, 94–97 request-response protocols, 89–93 XML validation, 101 XMPP Extension Protocols (XEPs), 88 XMPP network, 86 XP (Extreme Programming), 182, 218 team collaboration, 192 XSS (cross-site scripting), 236 XUL markup language, 258 Z zzuf fuzzer, 61 INDEX 329 COLOPHON The cover 
image is from Getty Images.
Essential SQLAlchemy
by
Jason Myers
and
Rick Copeland
Published 27 Nov 2015
SQLAlchemy leverages powerful common statements and types to ensure its SQL statements are crafted efficiently and properly for each database type and vendor without you having to think about it. This makes it easy to migrate logic from Oracle to PostgreSQL or from an application database to a data warehouse. It also helps ensure that database input is sanitized and properly escaped prior to being submitted to the database. This prevents common issues like SQL injection attacks. SQLAlchemy also provides a lot of flexibility by supplying two major modes of usage: SQL Expression Language (commonly referred to as Core) and ORM. These modes can be used separately or together depending on your preference and the needs of your application. SQLAlchemy Core and the SQL Expression Language The SQL Expression Language is a Pythonic way of representing common SQL statements and expressions, and is only a mild abstraction from the typical SQL language.
…
Single insert as a method ins = cookies.insert().values( cookie_name="chocolate chip", cookie_recipe_url="http://some.aweso.me/cookie/recipe.html", cookie_sku="CC01", quantity="12", unit_cost="0.50" ) print(str(ins)) In Example 2-1, print(str(ins)) shows us the actual SQL statement that will be executed: INSERT INTO cookies (cookie_name, cookie_recipe_url, cookie_sku, quantity, unit_cost) VALUES (:cookie_name, :cookie_recipe_url, :cookie_sku, :quantity, :unit_cost) Our supplied values have been replaced with :column_name in this SQL statement, which is how SQLAlchemy represents parameters displayed via the str() function. Parameters are used to help ensure that our data has been properly escaped, which mitigates security issues such as SQL injection attacks. It is still possible to view the parameters by looking at the compiled version of our insert statement, because each database backend can handle the parameters in a slightly different manner (this is controlled by the dialect). The compile() method on the ins object returns a SQLCompiler object that gives us access to the actual parameters that will be sent with the query via the params attribute: ins.compile().params This compiles the statement via our dialect but does not execute it, and we access the params attribute of that statement.
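The separation between statement text and parameter values can also be seen without SQLAlchemy. The sketch below is a stand-in that uses Python's standard sqlite3 module, which accepts the same `:name` placeholder style that `str(ins)` prints; it is not SQLAlchemy's own API. The table definition and the hostile cookie name are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema mirroring the column names used in the excerpt.
conn.execute(
    "CREATE TABLE cookies ("
    "cookie_name TEXT, cookie_recipe_url TEXT, cookie_sku TEXT, "
    "quantity INTEGER, unit_cost NUMERIC)"
)

# The statement text contains only named placeholders, never the values.
insert_sql = (
    "INSERT INTO cookies "
    "(cookie_name, cookie_recipe_url, cookie_sku, quantity, unit_cost) "
    "VALUES (:cookie_name, :cookie_recipe_url, :cookie_sku, :quantity, :unit_cost)"
)

# The values travel separately, as a mapping keyed by placeholder name.
params = {
    "cookie_name": "chocolate chip'); DROP TABLE cookies; --",  # hostile on purpose
    "cookie_recipe_url": "http://some.aweso.me/cookie/recipe.html",
    "cookie_sku": "CC01",
    "quantity": 12,
    "unit_cost": 0.50,
}

conn.execute(insert_sql, params)
stored = conn.execute("SELECT cookie_name, quantity FROM cookies").fetchall()
```

Because the hostile name is bound as a value, it is stored verbatim and the table survives.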
Professional Node.js: Building JavaScript Based Scalable Software
by
Pedro Teixeira
Published 30 Sep 2012
If queries are constructed this way and you have no direct control over the values used for the query (maybe because the value for content is provided by users inputting it through a form on a web page), your application becomes vulnerable to SQL injection. A malicious user can submit data to your application that, if passed on to the database unfiltered, breaks out of the intended SQL statement and allows the attacker to execute arbitrary statements that could cause harm. Here is an example of SQL injection in action: client.query('USE node'); var userInput = '"); DELETE FROM test WHERE id = 1; -- '; client.query('INSERT INTO test (content) VALUES ("' + userInput + '")'); client.end(); It’s a bit complicated to see what’s actually going on, but this is what happens.
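A rough Python analogue of the attack above may make the mechanics easier to follow. It is a sketch under stated assumptions, not the book's code: it uses the standard sqlite3 module instead of the Node.js MySQL client, single quotes instead of double quotes, and `executescript` because sqlite3's `execute` refuses to run more than one statement per call. The table contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INTEGER PRIMARY KEY, content TEXT)")
conn.execute("INSERT INTO test (content) VALUES ('important row')")  # gets id 1

# Same idea as the payload in the excerpt, adapted to single-quoted strings.
user_input = "'); DELETE FROM test WHERE id = 1; -- "

# Concatenation turns one intended statement into two, plus a trailing comment:
#   INSERT INTO test (content) VALUES (''); DELETE FROM test WHERE id = 1; -- ')
unsafe_sql = "INSERT INTO test (content) VALUES ('" + user_input + "')"
conn.executescript(unsafe_sql)
rows_after_attack = conn.execute(
    "SELECT id, content FROM test ORDER BY id"
).fetchall()

# With a placeholder, the same input is stored as plain text and nothing is deleted.
conn.execute("INSERT INTO test (content) VALUES (?)", (user_input,))
rows_after_safe_insert = conn.execute(
    "SELECT id, content FROM test ORDER BY id"
).fetchall()
```

The quote in the payload closes the string literal early, the semicolon ends the INSERT, and the comment marker swallows the leftover `')`.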
The Debian Administrator's Handbook, Debian Wheezy From Discovery to Mastery
by
Raphaël Hertzog
and
Roland Mas
Published 24 Dec 2013
Many of the most obvious problems have been fixed as time has passed, but new security problems pop up regularly. VOCABULARY: SQL injection. When a program inserts data into SQL queries in an insecure manner, it becomes vulnerable to SQL injections; this name covers the act of changing a parameter in such a way that the actual query executed by the program is different from the intended one, either to damage the database or to access data that should normally not be accessible. → http://en.wikipedia.org/wiki/SQL_Injection Updating web applications regularly is therefore a must, lest any cracker (whether a professional attacker or a script kiddy) can exploit a known vulnerability.
Node.js in Action
by
Mike Cantelon
,
Marc Harter
,
Tj Holowaychuk
and
Nathan Rajlich
Published 27 Jul 2013
Add the code in the next listing to timetrack.js. Listing 5.11. Adding a work record Note that you use the question mark character (?) as a placeholder to indicate where a parameter should be placed. Each parameter is automatically escaped by the query method before being added to the query, preventing SQL injection attacks. Note also that the second argument of the query method is now a list of values to substitute for the placeholders. Deleting MySQL data Next, you need to add the following code to timetrack.js. This logic will delete a work record. Listing 5.12. Deleting a work record Updating MySQL data To add logic that will update a work record, flagging it as archived, add the following code to timetrack.js.
…
Install node-postgres via npm using the following command: npm install pg Connecting to PostgreSQL Once you’ve installed the node-postgres module, you can connect to PostgreSQL and select a database to query using the following code (omit the :mypassword portion of the connection string if no password is set): var pg = require('pg'); var conString = "tcp://myuser:mypassword@localhost:5432/mydatabase"; var client = new pg.Client(conString); client.connect(); Inserting a row into a database table The query method performs queries. The following example code shows how to insert a row into a database table: client.query( 'INSERT INTO users ' + "(name) VALUES ('Mike')" ); Placeholders ($1, $2, and so on) indicate where to place a parameter. Each parameter is escaped before being added to the query, preventing SQL injection attacks. The following example shows the insertion of a row using placeholders: client.query( "INSERT INTO users " + "(name, age) VALUES ($1, $2)", ['Mike', 39] ); To get the primary key value of a row after an insert, you can use a RETURNING clause to specify the name of the column whose value you’d like to return.
Python Web Development With Django
by
Jeff Forcier
Most ORM platforms support multiple database backends, and Django’s is no exception. At the time of this writing, code utilizing Django’s model layer runs on PostgreSQL, MySQL, SQLite, and Oracle, and this list is likely to grow as more database backend plugins are written. Safety: Because you are rarely executing your own SQL queries when using an ORM, you don’t have to worry as much about the issues caused by malformed or poorly protected query strings, which often lead to problems such as SQL injection attacks. ORMs also provide a central mechanism for intelligent quoting and escaping of input variables, freeing up time otherwise spent dealing with that sort of minutiae. This sort of benefit is common with modularized or layered software, of which MVC frameworks are a good example. When all the code responsible for a specific problem domain is well-organized and self-contained, it can often be a huge time-saver and increase overall safety.
…
One of the “best practices” of performing database queries from higher-level programming languages is to properly escape or insert dynamic parameters. A common mistake among beginners is to do simple string concatenation or interpolation to get their variables into the SQL query, but this opens up a whole host of potential security holes and bugs. Instead, when using extra, make use of the params keyword, which is simply a list of the values to use when replacing %s string placeholders in the where strings, such as:

    from myproject.myapp.models import Person
    from somewhere import unknown_input

    # Incorrect: will "work", but is open to SQL injection attacks and related problems.
    # Note that the '%s' is being replaced through normal Python string interpolation.
    matches = Person.objects.all().extra(where=["first = '%s'" % unknown_input()])

    # Correct: will escape quotes and other special characters, depending on
    # the database backend. Note that the %s is left unquoted and is not replaced
    # with normal string interpolation but is filled in from the 'params' argument.
    matches = Person.objects.all().extra(where=["first = %s"], params=[unknown_input()])

Utilizing SQL Features Django Doesn’t Provide: The final word on Django’s model/query framework is that, as an ORM, it simply can’t cover all the possibilities.
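The "bugs" half of that warning can be shown with perfectly innocent input. The sketch below uses Python's standard sqlite3 module instead of Django, so the driver-side placeholder is `?` rather than `%s`; the `person` table and the sample name are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (first TEXT)")  # hypothetical table
conn.execute("INSERT INTO person (first) VALUES (?)", ("D'Arcy",))

name = "D'Arcy"  # a legitimate name that happens to contain a quote

# Python string interpolation: the apostrophe ends the SQL string early,
# so the database rejects the statement as malformed.
try:
    conn.execute("SELECT first FROM person WHERE first = '%s'" % name).fetchall()
    interpolation_error = None
except sqlite3.OperationalError as exc:
    interpolation_error = exc

# Driver-side parameter: the value is bound as data, quote and all.
matches = conn.execute(
    "SELECT first FROM person WHERE first = ?", (name,)
).fetchall()
```

The same mechanism that breaks on an honest apostrophe is what an attacker exploits deliberately.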
Cybersecurity: What Everyone Needs to Know
by
P. W. Singer
and
Allan Friedman
Published 3 Jan 2014
Secure Internet Protocol Router Network (SIPRNet): The US military’s classified network, used to communicate secret information following the same basic protocols as the broader Internet. social engineering: The practice of manipulating people into revealing confidential information online. SQL injection: A common attack vector against web servers. The attacker attempts to trick a website into passing a “rogue” Structured Query Language (SQL) command to the database. If the database program can be compromised, the attacker may be able to gain access to other files or permissions on the server.
…
See Unit 61398 Silk Road, 109 Snowden, Edward, 50, 93, 104, 140, 249 social engineering, 32–34, 40, 57–58, 101–102, 244 social media Facebook, 15–16, 40, 58, 68, 82, 89, 170, 211, 215 Sina, 15 Tencent, 15 Twitter, 15–16, 32, 80–82, 170, 246, 248 Weibo, 15 YouTube, 25, 52, 82–83, 100 Social Security number, 31, 34, 81, 228 socialing. See social engineering Sommer, Peter, 163 Spafford, Eugene, 90 spear phishing. See phishing SQL injection. See Structured Query Language (SQL) State Department, 53–54, 107 Stop Huntingdon Animal Cruelty (SHAC), 79–80 Stop Online Piracy Act (SOPA), 107 Structured Query Language (SQL), 42 Stuxnet copycats of, 158–159 creation and effects of, 35, 38, 98, 114–118, 213 lessons of, 118–120, 132, 152, 156–158 supercomputer, 247–248 supervisory control and data acquisition (SCADA), 15, 115, 130, 159 supply chain, 203–204 Syria as cyberattack target, 55, 126–128 offensive cyber operations, 45, 112 (see also Operation Orchard) test beds, 212 The Cabal, 196 The Economist, 108, 120, 183 The Jetsons, 2 threat assessment, 148–150 Tor, 7, 107–110 Transport Control Protocol (TCP), 18 Trojan, 126–127 Twitter.
This Machine Kills Secrets: Julian Assange, the Cypherpunks, and Their Fight to Empower Whistleblowers
by
Andy Greenberg
Published 12 Sep 2012
And the id of the Internet wasn’t content merely causing his website some downtime. HBGary Federal used custom software for managing its website, and the Anonymous hackers quickly combed the site and found a critical flaw in the code. When the software stored data in a database, it didn’t always differentiate that information from executable commands—with a trick called SQL injection, a user could pretend to be entering something as innocuous as a username and password, but in fact include characters that triggered actions on the website’s back end—even actions like coughing up sensitive data. HBGary’s attackers sussed out the flaw immediately and used it to access the company’s password database.
…
See also Tor Opasnite Novini (“Dangerous News”) blog, 260–61 OpenLeaks and the Architect, 292–94, 297 and Chaos Communication Camp, 299 digital custody dispute of, 310 future of, 311 goal of, 275 headquarters of, 309–10 Müller-Maguhn on, 302 origins of, 297 penetration test of, 274, 276, 278, 292, 303 security of, 274–75, 278, 293–94 test launch of, 273–76, 277–82, 310 “owning” a target, 137–38 Paller, Alan, 188 Paradox quarterly magazine, 94, 98 Paranoia Meter, 218–20 Pardew, James, 262 Patriot Act, 268 Penet remailer, 115–17, 321 Pentagon Papers access to, 24–25, 36, 45 contents of, 24–26 copying of, 11–13 legacy of, 270, 321 release of, 34–37 See also Ellsberg, Daniel PGP (pretty-good privacy) and activism, 71, 74–75 distribution of, 74–75 investigation of, 75, 83–84, 86–87 MIT Press’s publication of, 87 origins of, 70–75 passwords for, 306 in remailers, 118 and Young’s Cryptome site, 101 PGP: Source Code and Internals (Zimmermann), 53, 83, 87 Pietrosanti, Fabio, 318–19 Pirate Bay, 238–39, 256, 305, 306, 318 political dissidents, 136–38, 140 PRQ (PeRiQuito AB), 238–40, 294 public key encryption described, 62–64 Manning’s use of, 34 and May’s BlackNet concept, 89–92 and Mix Network concept, 79–80 RSA, 64, 70, 71, 79 signatures in, 67 and Tor, 145–46 and Young’s Cryptome site, 101 See also PGP (pretty-good privacy) Qaddafi, Muammar, 3, 137 Radack, Jesselyn, 225 RAND, 11, 22, 24, 34, 36 Reader’s Digest, 101 Reagan administration, 55 remailers and Chaum’s Mix Network encryption, 80–81, 82, 117 Cottrell’s Mixmaster remailer, 118–19, 144 cypherpunk remailer, 82, 92, 118 Helsingius’s Penet remailer, 115–17 improvements to, 117–18 and Young’s Cryptome site, 102 Rolling Stone, 138 Rosenkranz, Richard, 109, 110 RSA (MIT’s public key encryption), 64, 70, 71, 79 Rubberhose, 126–27, 163, 164 Russo, Tony, 11–13, 37 Salin, Phil, 59–60, 64–65, 89 satellite modems, 135–37 secrets and crypto-anarchy, 90–91 culture of secrecy, 311 Domscheit-Berg on role of, 312 globalization of 
secrets control, 236 psychological effects of, 18, 225 Secrets (Ellsberg), 12, 24 Secure Sockets Layer encryption, 157 “Security without Identification” (Chaum), 66–67 Shamir, Israel, 262–63, 264, 300 Sheehan, Neil, 35 Shirky, Clay, 7, 235–36 Smith, Vaughan, 177 Soghoian, Chris, 141–42 Somalian leak, 164–65 Soviet Union, 55 SQL injection, 211 Steckman, Matthew, 181–82 Stefanov, Ognyan, 234, 260–61 Sterling, Bruce, 98 Stoev, Georgi, 234 S.266, 73–74, 84 Suburbia ISP, 112–13, 114 Sunde, Peter, 239 Svartholm, Gottfrid, 238–39 Sweden, 236–40, 292 Syverson, Paul, 143–44, 146–47, 156 Tamm, Thomas, 223 Tchobanov, Atanas, 232–35, 241–42, 252–53, 259–65, 269–70 TEMPEST tool of the NSA, 123, 130 text-message terrorism, 252–53 Thinthread, 220–24 Time magazine, 37 Tor anonymity ensured by, 7, 139–40 and Anonymous, 184 and Appelbaum, 136, 150–51, 155, 167 the Architect on, 292–93 Bivol’s use of, 261 Browser Bundle program, 150 and copycat sites, 230, 231, 233 development of, 139, 144–47, 149 distribution of, 149 government backing of, 139, 140, 144–45, 146, 149, 150–51 Hidden Service feature, 140, 142, 157, 278, 318–19 leakers’ acclimation to, 233 Manning’s use of, 39, 139 mechanics of, 141 MIT Hackfest, 135–36, 139 name of, 141 nodes of, 149, 150, 158–60 and OpenLeaks, 278 political applications of, 149 and satellite modems, 135–37 security of, 141–42 and unencrypted files, 158–59 uses of, 140–41 and U.S. government, 139–43 and WikiLeaks, 138, 157, 158–60, 168 Trailblazer, 221–24 Trax, 107, 112 Tryggvadóttir, Margrét, 252 Trynor, Mark, 192–94 Tsonev, Tsoni, 231 Twitter, 138–39, 266–67 UKUSA agreement, 235, 236 Underground (Assange and Dreyfus), 103, 106–7, 111, 112, 129 University of Melbourne, 94–96, 106, 127 U.S.
JavaScript Cookbook
by
Shelley Powers
Published 23 Jul 2010
You can package this functionality for reuse: function postEncodeURIComponent(str) { str=encodeURIComponent(str); return str.replace(/%20/g,"+"); } The escaping ensures that the Ajax request will be successfully communicated, but it doesn’t ensure that the Ajax request is safe. All data input by unknown persons should always be scrubbed to prevent SQL injection or cross-site scripting (XSS) attacks. However, this type of security should be implemented in the server-side application, because if people can put together a GET request in JavaScript, they can put together a GET request directly in a browser’s location bar and bypass the script altogether.
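The point that escaping is a transport concern rather than a security control can be sketched in a few lines of Python. This is a minimal illustration, not code from the book: `encode_form_value` is a hypothetical helper that mirrors the excerpt's `postEncodeURIComponent`, and the payload string is made up.

```python
from urllib.parse import parse_qs, quote_plus


def encode_form_value(value: str) -> str:
    """Percent-encode a value for a form-style request, with spaces sent as '+'."""
    return quote_plus(value)


# A made-up hostile input, as a client might type into a form field.
payload = "x'; DROP TABLE users; --"

# Encoding makes the value safe to transmit...
encoded = encode_form_value(payload)

# ...but the server decodes it back to exactly what the client sent.
decoded = parse_qs("name=" + encoded)["name"][0]

print(encoded)
print(decoded == payload)
```

The decoded value is byte-for-byte the hostile input, which is why the scrubbing the excerpt calls for has to happen on the server, after decoding.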
…
Only WebKit (and Chrome and Safari) and Opera have made any progress with this implementation, and there’s no guarantee that Mozilla or Microsoft will pick up on it, especially since the specification is blocked. It is an interesting concept, but it has significant problems. One is security, naturally. In our current applications, the client part of the applications handles one form of security, and the server component handles the other, including database security and protection against SQL injection: attaching text on to a data field value that actually triggers a SQL command—such as drop all tables, or expose private data. Now, with client-side relational database support, we’re introducing a new set of security concerns on the client. Another concern is the increasing burden we put on client-side storage.
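The attack described above, text attached to a field value that changes the SQL command, can be reproduced with Python's standard `sqlite3` module. This is a sketch under stated assumptions: the `users` table, its rows, and the two lookup functions are hypothetical, and the bound-parameter form is shown only as one common server-side defense.

```python
import sqlite3

# Hypothetical in-memory table used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "a-secret"), ("bob", "b-secret")],
)


def find_unsafe(name: str) -> list:
    # Vulnerable: the input is spliced into the SQL text itself.
    query = "SELECT name, secret FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_safe(name: str) -> list:
    # The input is passed as a bound parameter and is never parsed as SQL.
    return conn.execute(
        "SELECT name, secret FROM users WHERE name = ?", (name,)
    ).fetchall()


attack = "nobody' OR '1'='1"
leaked = find_unsafe(attack)   # the WHERE clause becomes always true
blocked = find_safe(attack)    # no user has this literal name

print(len(leaked), len(blocked))
```

In the unsafe version the quote in the input closes the string literal early, so the rest of the input is read as part of the query and every row is returned.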
PostgreSQL: Up and Running
by
Regina Obe
and
Leo Hsu
Published 5 Jul 2012
The postgres system account should always be created as a regular system user in the OS with just rights to the data cluster and additional tablespace folders. Most installers will set up the correct permissions for postgres. Don’t try to do postgres any favors by giving it more rights than it needs. Granting unnecessary rights leaves your system vulnerable should you fall under an SQL injection attack. There are cases where you’ll need to give the postgres account write/delete/read rights to folders or executables outside of the data cluster. With scheduled jobs that execute batch files, this need often arises. We advise you to practice restraint and only grant the minimum rights necessary to get the job done.
Learning Flask Framework
by
Matt Copperwaite
and
Charles Leifer
Published 26 Nov 2015
Suppose you add a column to a model. With SQLAlchemy it will be available whenever you use that model. If, on the other hand, you had hand-written SQL queries strewn throughout your app, you would need to update each query, one at a time, to ensure that you were including the new column. • SQLAlchemy can help you avoid SQL injection vulnerabilities. • Excellent library support: As you will see in later chapters, there are a multitude of useful libraries that can work directly with your SQLAlchemy models to provide things such as maintenance interfaces and RESTful APIs. I hope you're excited after reading this list. If all the items in this list don't make sense to you right now, don't worry.
Speaking Code: Coding as Aesthetic and Political Expression
by
Geoff Cox
and
Alex McLean
Published 9 Nov 2012
He also refers to the hegemonic operations of language in the work of Laclau and the ontological violence of language in Heidegger. 9. Walter Benjamin, “Critique of Violence” (1921), in Walter Benjamin: Selected Writings, vol. 1, 1913–1926, ed. Marcus Bullock and Michael W. Jennings (Cambridge, Mass.: Harvard University Press, 1996), 236–252. 10. Žižek, Violence, 168. 11. SQL injection techniques exploit a security vulnerability occurring in the database layer of an application (like queries). 12. Some ideas related to this were developed by Geoff Cox and Martin Knahl in “Critique of Software Security,” in Geoff Cox and Wolfgang Sützl, eds., Creating Insecurity (New York: Autonomedia, 2009), 27–43. 13.
Realtime Web Apps: HTML5 WebSocket, Pusher, and the Web’s Next Big Thing
by
Jason Lengstorf
and
Phil Leggetter
Published 20 Feb 2013
."); } return TRUE; } } The __construct() method attempts to create a new MySQL connection using the values you stored in system/config/config.inc.php and throws an Exception if the connection fails. Note: We’re using PHP Data Objects (PDO) for database access because it provides an easy interface and makes SQL injection virtually impossible when used properly. Adding the Header and Footer Markup The last step before actually building one of the app’s pages is to get the header and footer markup added to the app for common use. Starting with the simplest file, create a new file in system/inc/ called footer.inc.php and insert the footer markup you built in Chapter 7: <footer> <ul> <li class="copyright"> © 2013 Jason Lengstorf & Phil Leggetter </li><!
PostgreSQL: Up and Running, 3rd Edition
by
Unknown
The postgres account should always be created as a regular system user in the OS with privileges just to the data cluster and additional tablespace folders. Most installers will set up the correct permissions without you needing to worry. Don’t try to do postgres any favors by giving it more access than it needs. Granting unnecessary access leaves your system vulnerable if you fall victim to an SQL injection attack. There are cases where you’ll need to give the postgres account write/delete/read rights to folders or executables outside of the data cluster. With scheduled jobs that execute batch files and foreign data wrappers that have foreign tables in files, this need often arises. Practice restraint and bestow only the minimum access necessary to get the job done.
Principles of Web API Design: Delivering Value with APIs and Microservices
by
James Higginbotham
Published 20 Dec 2021
This includes rejecting invalid combinations of the HTTP method and path, enforcing the use of secure HTTP via TLS for encrypted communications, and blocking known malicious clients ■ Message validation: Performs input validation to prevent submitting invalid data or overriding protected fields. This may also include parser attack prevention such as XML entity parser exploits, SQL injection, and JavaScript injection attacks sent via requests to gain access to unauthorized data ■ Data scraping and botnet protection: Detects intentional data scraping via APIs, online fraud, spam, and distributed denial-of-service (DDoS) attacks from malicious botnets. These attacks tend to be sophisticated and require specialized detection and remediation ■ Review and scanning: Manual and/or automated review and testing of API security vulnerabilities within source code (static reviews) and network traffic patterns (real-time reviews) Not all of these practices are included in a single solution.
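The message validation practice in the list above, rejecting invalid data and attempts to override protected fields, might look roughly like the following allowlist check. Everything here is an assumption made for illustration: the field names, the length limit, and the `validate_update` function are hypothetical and do not come from the book or from any particular API gateway.

```python
# Hypothetical allowlist: the only fields a client may set, with expected types.
ALLOWED_FIELDS = {"display_name": str, "email": str}
MAX_LENGTH = 200  # assumed limit for this sketch


class ValidationError(ValueError):
    """Raised when a request body fails validation."""


def validate_update(payload: dict) -> dict:
    """Return a copy of the payload if every field is allowed and well-formed."""
    if not isinstance(payload, dict):
        raise ValidationError("body must be a JSON object")

    # Anything outside the allowlist (for example "role" or "id") is refused,
    # so protected fields cannot be overridden by the client.
    unexpected = set(payload) - set(ALLOWED_FIELDS)
    if unexpected:
        raise ValidationError(f"unexpected fields: {sorted(unexpected)}")

    for field, value in payload.items():
        if not isinstance(value, ALLOWED_FIELDS[field]):
            raise ValidationError(f"{field} has the wrong type")
        if len(value) > MAX_LENGTH:
            raise ValidationError(f"{field} is too long")

    return dict(payload)


print(validate_update({"display_name": "Ada"}))
```

An allowlist is used here instead of a blocklist of protected names so that a field added to the data model later is refused by default. Validation of this kind complements, and does not replace, parameterized queries at the database layer.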
Learning Ext Js
by
Shea Frederick
Published 19 Dec 2008
The PHP code used in these examples is meant to be the bare minimum needed to get the job done. In a production environment you would want to account for security against SQL injection attacks, other error checking, and probably user authentication—which the example code does not account for. Programming the grid Most of the code we have written so far concerns configuring the grid prior to it being displayed. Often, we will want the grid to do something in response to user input.
DarkMarket: Cyberthieves, Cybercops and You
by
Misha Glenny
Published 3 Oct 2011
This jumbled crossroads of imperial ambition, peculiar modern cultural icons and the dreamy nature of light form an ideal backdrop for the annual gathering of the Cooperative Cyber Defence Centre of Excellence (CCDOE), the NATO-backed complex that researches all aspects of cyber warfare. The characters at this conference live in a contemporary Wonderland where convention is oft disregarded – ponytails and wire-rimmed glasses earnestly exchange information with starched military uniforms about ‘SQL injection vulnerabilities’. Besuited civil servants are deep in conversation with young men in jeans and T-shirts detailing the iniquities of ‘man-in-the-middle attacks’. To grasp even the very basics of cyber security in its rich variety, one must be prepared to learn countless new idioms that are being constantly added to or amended.
Programming TypeScript
by
Boris Cherny
Published 16 Apr 2019
At the time of writing, Umed Khudoiberdiev’s excellent TypeORM is the most complete ORM for TypeScript, and supports MySQL, PostgreSQL, Microsoft SQL Server, Oracle, and even MongoDB. Using TypeORM, your query to get a user’s first name might look like this: let user = await UserRepository .findOne({id: 739311}) // User | undefined Notice the high-level API, which is both safe (in that it prevents things like SQL injection attacks) and typesafe by default (in that we know what type findOne returns without having to manually annotate it). Always use an ORM when working with databases—it’s more convenient, and it will save you from getting woken up at four in the morning because the saleAmount field is null because you updated it to orderAmount the night before and your coworker decided to run your database migration for you in anticipation of your pull request landing while you were out, but then around midnight your pull request failed even though the migration succeeded, and your sales team in New York woke up to realize that all your clients’ orders were for exactly null dollars (this happened to… a friend).
Masterminds of Programming: Conversations With the Creators of Major Programming Languages
by
Federico Biancuzzi
and
Shane Warden
Published 21 Mar 2009
They had to do with things like whether strings of data were enclosed in quotes or not, or whether the data was capitalized or not—things that you might consider to be trivial or inconsequential errors, not really related to the structure of the language or the data. Nevertheless those kinds of details were hard for users to get right. Today there are a lot of SQL injection attacks against web services that don’t correctly filter the input before it’s included in the queries to their databases. Any thoughts? Don: SQL injection attacks are a good example of something that we never dreamed of in the early days. We didn’t anticipate that queries would be constructed from user input from web browsers. I guess the lesson here is that software should always take a careful look at user input before processing it.
Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems
by
Martin Kleppmann
Published 17 Apr 2017
In most server-side data systems, the cost of deploying Byzantine fault-tolerant solutions makes them impractical. Web applications do need to expect arbitrary and malicious behavior of clients that are under end-user control, such as web browsers. This is why input validation, sanitization, and output escaping are so important: to prevent SQL injection and cross-site scripting, for example. However, we typically don’t use Byzantine fault-tolerant protocols here, but simply make the server the authority on deciding what client behavior is and isn’t allowed. In peer-to-peer networks, where there is no such central authority, Byzantine fault tolerance is more relevant.
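Of the three defenses named above, output escaping is the one aimed at cross-site scripting, and it can be sketched with Python's standard `html.escape`. The `render_comment` function and the sample input are hypothetical, chosen only to show the server treating client-supplied text as data, not markup.

```python
from html import escape


def render_comment(author: str, body: str) -> str:
    """Build an HTML fragment, escaping every client-supplied value."""
    return f"<p><b>{escape(author)}</b>: {escape(body)}</p>"


# A made-up hostile comment body.
rendered = render_comment("mallory", "<script>alert(1)</script>")
print(rendered)
```

The angle brackets in the input are emitted as `&lt;` and `&gt;`, so a browser displays the text and does not run it as a script.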
…
The opposite of bounded. 558 | Glossary Index A aborts (transactions), 222, 224 in two-phase commit, 356 performance of optimistic concurrency con‐ trol, 266 retrying aborted transactions, 231 abstraction, 21, 27, 222, 266, 321 access path (in network model), 37, 60 accidental complexity, removing, 21 accountability, 535 ACID properties (transactions), 90, 223 atomicity, 223, 228 consistency, 224, 529 durability, 226 isolation, 225, 228 acknowledgements (messaging), 445 active/active replication (see multi-leader repli‐ cation) active/passive replication (see leader-based rep‐ lication) ActiveMQ (messaging), 137, 444 distributed transaction support, 361 ActiveRecord (object-relational mapper), 30, 232 actor model, 138 (see also message-passing) comparison to Pregel model, 425 comparison to stream processing, 468 Advanced Message Queuing Protocol (see AMQP) aerospace systems, 6, 10, 305, 372 aggregation data cubes and materialized views, 101 in batch processes, 406 in stream processes, 466 aggregation pipeline query language, 48 Agile, 22 minimizing irreversibility, 414, 497 moving faster with confidence, 532 Unix philosophy, 394 agreement, 365 (see also consensus) Airflow (workflow scheduler), 402 Ajax, 131 Akka (actor framework), 139 algorithms algorithm correctness, 308 B-trees, 79-83 for distributed systems, 306 hash indexes, 72-75 mergesort, 76, 402, 405 red-black trees, 78 SSTables and LSM-trees, 76-79 all-to-all replication topologies, 175 AllegroGraph (database), 50 ALTER TABLE statement (SQL), 40, 111 Amazon Dynamo (database), 177 Amazon Web Services (AWS), 8 Kinesis Streams (messaging), 448 network reliability, 279 postmortems, 9 RedShift (database), 93 S3 (object storage), 398 checking data integrity, 530 amplification of bias, 534 of failures, 364, 495 Index | 559 of tail latency, 16, 207 write amplification, 84 AMQP (Advanced Message Queuing Protocol), 444 (see also messaging systems) comparison to log-based messaging, 448, 451 message ordering, 446 
analytics, 90 comparison to transaction processing, 91 data warehousing (see data warehousing) parallel query execution in MPP databases, 415 predictive (see predictive analytics) relation to batch processing, 411 schemas for, 93-95 snapshot isolation for queries, 238 stream analytics, 466 using MapReduce, analysis of user activity events (example), 404 anti-caching (in-memory databases), 89 anti-entropy, 178 Apache ActiveMQ (see ActiveMQ) Apache Avro (see Avro) Apache Beam (see Beam) Apache BookKeeper (see BookKeeper) Apache Cassandra (see Cassandra) Apache CouchDB (see CouchDB) Apache Curator (see Curator) Apache Drill (see Drill) Apache Flink (see Flink) Apache Giraph (see Giraph) Apache Hadoop (see Hadoop) Apache HAWQ (see HAWQ) Apache HBase (see HBase) Apache Helix (see Helix) Apache Hive (see Hive) Apache Impala (see Impala) Apache Jena (see Jena) Apache Kafka (see Kafka) Apache Lucene (see Lucene) Apache MADlib (see MADlib) Apache Mahout (see Mahout) Apache Oozie (see Oozie) Apache Parquet (see Parquet) Apache Qpid (see Qpid) Apache Samza (see Samza) Apache Solr (see Solr) Apache Spark (see Spark) 560 | Index Apache Storm (see Storm) Apache Tajo (see Tajo) Apache Tez (see Tez) Apache Thrift (see Thrift) Apache ZooKeeper (see ZooKeeper) Apama (stream analytics), 466 append-only B-trees, 82, 242 append-only files (see logs) Application Programming Interfaces (APIs), 5, 27 for batch processing, 403 for change streams, 456 for distributed transactions, 361 for graph processing, 425 for services, 131-136 (see also services) evolvability, 136 RESTful, 133 SOAP, 133 application state (see state) approximate search (see similarity search) archival storage, data from databases, 131 arcs (see edges) arithmetic mean, 14 ASCII text, 119, 395 ASN.1 (schema language), 127 asynchronous networks, 278, 553 comparison to synchronous networks, 284 formal model, 307 asynchronous replication, 154, 553 conflict detection, 172 data loss on failover, 157 reads from asynchronous 
follower, 162 Asynchronous Transfer Mode (ATM), 285 atomic broadcast (see total order broadcast) atomic clocks (caesium clocks), 294, 295 (see also clocks) atomicity (concurrency), 553 atomic increment-and-get, 351 compare-and-set, 245, 327 (see also compare-and-set operations) replicated operations, 246 write operations, 243 atomicity (transactions), 223, 228, 553 atomic commit, 353 avoiding, 523, 528 blocking and nonblocking, 359 in stream processing, 360, 477 maintaining derived data, 453 for multi-object transactions, 229 for single-object writes, 230 auditability, 528-533 designing for, 531 self-auditing systems, 530 through immutability, 460 tools for auditable data systems, 532 availability, 8 (see also fault tolerance) in CAP theorem, 337 in service level agreements (SLAs), 15 Avro (data format), 122-127 code generation, 127 dynamically generated schemas, 126 object container files, 125, 131, 414 reader determining writer’s schema, 125 schema evolution, 123 use in Hadoop, 414 awk (Unix tool), 391 AWS (see Amazon Web Services) Azure (see Microsoft) B B-trees (indexes), 79-83 append-only/copy-on-write variants, 82, 242 branching factor, 81 comparison to LSM-trees, 83-85 crash recovery, 82 growing by splitting a page, 81 optimizations, 82 similarity to dynamic partitioning, 212 backpressure, 441, 553 in TCP, 282 backups database snapshot for replication, 156 integrity of, 530 snapshot isolation for, 238 use for ETL processes, 405 backward compatibility, 112 BASE, contrast to ACID, 223 bash shell (Unix), 70, 395, 503 batch processing, 28, 389-431, 553 combining with stream processing lambda architecture, 497 unifying technologies, 498 comparison to MPP databases, 414-418 comparison to stream processing, 464 comparison to Unix, 413-414 dataflow engines, 421-423 fault tolerance, 406, 414, 422, 442 for data integration, 494-498 graphs and iterative processing, 424-426 high-level APIs and languages, 403, 426-429 log-based messaging and, 451 maintaining derived 
state, 495 MapReduce and distributed filesystems, 397-413 (see also MapReduce) measuring performance, 13, 390 outputs, 411-413 key-value stores, 412 search indexes, 411 using Unix tools (example), 391-394 Bayou (database), 522 Beam (dataflow library), 498 bias, 534 big ball of mud, 20 Bigtable data model, 41, 99 binary data encodings, 115-128 Avro, 122-127 MessagePack, 116-117 Thrift and Protocol Buffers, 117-121 binary encoding based on schemas, 127 by network drivers, 128 binary strings, lack of support in JSON and XML, 114 BinaryProtocol encoding (Thrift), 118 Bitcask (storage engine), 72 crash recovery, 74 Bitcoin (cryptocurrency), 532 Byzantine fault tolerance, 305 concurrency bugs in exchanges, 233 bitmap indexes, 97 blockchains, 532 Byzantine fault tolerance, 305 blocking atomic commit, 359 Bloom (programming language), 504 Bloom filter (algorithm), 79, 466 BookKeeper (replicated log), 372 Bottled Water (change data capture), 455 bounded datasets, 430, 439, 553 (see also batch processing) bounded delays, 553 in networks, 285 process pauses, 298 broadcast hash joins, 409 Index | 561 brokerless messaging, 442 Brubeck (metrics aggregator), 442 BTM (transaction coordinator), 356 bulk synchronous parallel (BSP) model, 425 bursty network traffic patterns, 285 business data processing, 28, 90, 390 byte sequence, encoding data in, 112 Byzantine faults, 304-306, 307, 553 Byzantine fault-tolerant systems, 305, 532 Byzantine Generals Problem, 304 consensus algorithms and, 366 C caches, 89, 553 and materialized views, 101 as derived data, 386, 499-504 database as cache of transaction log, 460 in CPUs, 99, 338, 428 invalidation and maintenance, 452, 467 linearizability, 324 CAP theorem, 336-338, 554 Cascading (batch processing), 419, 427 hash joins, 409 workflows, 403 cascading failures, 9, 214, 281 Cascalog (batch processing), 60 Cassandra (database) column-family data model, 41, 99 compaction strategy, 79 compound primary key, 204 gossip protocol, 216 hash 
partitioning, 203-205 last-write-wins conflict resolution, 186, 292 leaderless replication, 177 linearizability, lack of, 335 log-structured storage, 78 multi-datacenter support, 184 partitioning scheme, 213 secondary indexes, 207 sloppy quorums, 184 cat (Unix tool), 391 causal context, 191 (see also causal dependencies) causal dependencies, 186-191 capturing, 191, 342, 494, 514 by total ordering, 493 causal ordering, 339 in transactions, 262 sending message to friends (example), 494 562 | Index causality, 554 causal ordering, 339-343 linearizability and, 342 total order consistent with, 344, 345 consistency with, 344-347 consistent snapshots, 340 happens-before relationship, 186 in serializable transactions, 262-265 mismatch with clocks, 292 ordering events to capture, 493 violations of, 165, 176, 292, 340 with synchronized clocks, 294 CEP (see complex event processing) certificate transparency, 532 chain replication, 155 linearizable reads, 351 change data capture, 160, 454 API support for change streams, 456 comparison to event sourcing, 457 implementing, 454 initial snapshot, 455 log compaction, 456 changelogs, 460 change data capture, 454 for operator state, 479 generating with triggers, 455 in stream joins, 474 log compaction, 456 maintaining derived state, 452 Chaos Monkey, 7, 280 checkpointing in batch processors, 422, 426 in high-performance computing, 275 in stream processors, 477, 523 chronicle data model, 458 circuit-switched networks, 284 circular buffers, 450 circular replication topologies, 175 clickstream data, analysis of, 404 clients calling services, 131 pushing state changes to, 512 request routing, 214 stateful and offline-capable, 170, 511 clocks, 287-299 atomic (caesium) clocks, 294, 295 confidence interval, 293-295 for global snapshots, 294 logical (see logical clocks) skew, 291-294, 334 slewing, 289 synchronization and accuracy, 289-291 synchronization using GPS, 287, 290, 294, 295 time-of-day versus monotonic clocks, 288 timestamping 
events, 471 cloud computing, 146, 275 need for service discovery, 372 network glitches, 279 shared resources, 284 single-machine reliability, 8 Cloudera Impala (see Impala) clustered indexes, 86 CODASYL model, 36 (see also network model) code generation with Avro, 127 with Thrift and Protocol Buffers, 118 with WSDL, 133 collaborative editing multi-leader replication and, 170 column families (Bigtable), 41, 99 column-oriented storage, 95-101 column compression, 97 distinction between column families and, 99 in batch processors, 428 Parquet, 96, 131, 414 sort order in, 99-100 vectorized processing, 99, 428 writing to, 101 comma-separated values (see CSV) command query responsibility segregation (CQRS), 462 commands (event sourcing), 459 commits (transactions), 222 atomic commit, 354-355 (see also atomicity; transactions) read committed isolation, 234 three-phase commit (3PC), 359 two-phase commit (2PC), 355-359 commutative operations, 246 compaction of changelogs, 456 (see also log compaction) for stream operator state, 479 of log-structured storage, 73 issues with, 84 size-tiered and leveled approaches, 79 CompactProtocol encoding (Thrift), 119 compare-and-set operations, 245, 327 implementing locks, 370 implementing uniqueness constraints, 331 implementing with total order broadcast, 350 relation to consensus, 335, 350, 352, 374 relation to transactions, 230 compatibility, 112, 128 calling services, 136 properties of encoding formats, 139 using databases, 129-131 using message-passing, 138 compensating transactions, 355, 461, 526 complex event processing (CEP), 465 complexity distilling in theoretical models, 310 hiding using abstraction, 27 of software systems, managing, 20 composing data systems (see unbundling data‐ bases) compute-intensive applications, 3, 275 concatenated indexes, 87 in Cassandra, 204 Concord (stream processor), 466 concurrency actor programming model, 138, 468 (see also message-passing) bugs from weak transaction isolation, 233 conflict 
resolution, 171, 174 detecting concurrent writes, 184-191 dual writes, problems with, 453 happens-before relationship, 186 in replicated systems, 161-191, 324-338 lost updates, 243 multi-version concurrency control (MVCC), 239 optimistic concurrency control, 261 ordering of operations, 326, 341 reducing, through event logs, 351, 462, 507 time and relativity, 187 transaction isolation, 225 write skew (transaction isolation), 246-251 conflict-free replicated datatypes (CRDTs), 174 conflicts conflict detection, 172 causal dependencies, 186, 342 in consensus algorithms, 368 in leaderless replication, 184 Index | 563 in log-based systems, 351, 521 in nonlinearizable systems, 343 in serializable snapshot isolation (SSI), 264 in two-phase commit, 357, 364 conflict resolution automatic conflict resolution, 174 by aborting transactions, 261 by apologizing, 527 convergence, 172-174 in leaderless systems, 190 last write wins (LWW), 186, 292 using atomic operations, 246 using custom logic, 173 determining what is a conflict, 174, 522 in multi-leader replication, 171-175 avoiding conflicts, 172 lost updates, 242-246 materializing, 251 relation to operation ordering, 339 write skew (transaction isolation), 246-251 congestion (networks) avoidance, 282 limiting accuracy of clocks, 293 queueing delays, 282 consensus, 321, 364-375, 554 algorithms, 366-368 preventing split brain, 367 safety and liveness properties, 365 using linearizable operations, 351 cost of, 369 distributed transactions, 352-375 in practice, 360-364 two-phase commit, 354-359 XA transactions, 361-364 impossibility of, 353 membership and coordination services, 370-373 relation to compare-and-set, 335, 350, 352, 374 relation to replication, 155, 349 relation to uniqueness constraints, 521 consistency, 224, 524 across different databases, 157, 452, 462, 492 causal, 339-348, 493 consistent prefix reads, 165-167 consistent snapshots, 156, 237-242, 294, 455, 500 (see also snapshots) 564 | Index crash recovery, 82 
enforcing constraints (see constraints) eventual, 162, 322 (see also eventual consistency) in ACID transactions, 224, 529 in CAP theorem, 337 linearizability, 324-338 meanings of, 224 monotonic reads, 164-165 of secondary indexes, 231, 241, 354, 491, 500 ordering guarantees, 339-352 read-after-write, 162-164 sequential, 351 strong (see linearizability) timeliness and integrity, 524 using quorums, 181, 334 consistent hashing, 204 consistent prefix reads, 165 constraints (databases), 225, 248 asynchronously checked, 526 coordination avoidance, 527 ensuring idempotence, 519 in log-based systems, 521-524 across multiple partitions, 522 in two-phase commit, 355, 357 relation to consensus, 374, 521 relation to event ordering, 347 requiring linearizability, 330 Consul (service discovery), 372 consumers (message streams), 137, 440 backpressure, 441 consumer offsets in logs, 449 failures, 445, 449 fan-out, 11, 445, 448 load balancing, 444, 448 not keeping up with producers, 441, 450, 502 context switches, 14, 297 convergence (conflict resolution), 172-174, 322 coordination avoidance, 527 cross-datacenter, 168, 493 cross-partition ordering, 256, 294, 348, 523 services, 330, 370-373 coordinator (in 2PC), 356 failure, 358 in XA transactions, 361-364 recovery, 363 copy-on-write (B-trees), 82, 242 CORBA (Common Object Request Broker Architecture), 134 correctness, 6 auditability, 528-533 Byzantine fault tolerance, 305, 532 dealing with partial failures, 274 in log-based systems, 521-524 of algorithm within system model, 308 of compensating transactions, 355 of consensus, 368 of derived data, 497, 531 of immutable data, 461 of personal data, 535, 540 of time, 176, 289-295 of transactions, 225, 515, 529 timeliness and integrity, 524-528 corruption of data detecting, 519, 530-533 due to pathological memory access, 529 due to radiation, 305 due to split brain, 158, 302 due to weak transaction isolation, 233 formalization in consensus, 366 integrity as absence of, 524 network 
packets, 306 on disks, 227 preventing using write-ahead logs, 82 recovering from, 414, 460 Couchbase (database) durability, 89 hash partitioning, 203-204, 211 rebalancing, 213 request routing, 216 CouchDB (database) B-tree storage, 242 change feed, 456 document data model, 31 join support, 34 MapReduce support, 46, 400 replication, 170, 173 covering indexes, 86 CPUs cache coherence and memory barriers, 338 caching and pipelining, 99, 428 increasing parallelism, 43 CRDTs (see conflict-free replicated datatypes) CREATE INDEX statement (SQL), 85, 500 credit rating agencies, 535 Crunch (batch processing), 419, 427 hash joins, 409 sharded joins, 408 workflows, 403 cryptography defense against attackers, 306 end-to-end encryption and authentication, 519, 543 proving integrity of data, 532 CSS (Cascading Style Sheets), 44 CSV (comma-separated values), 70, 114, 396 Curator (ZooKeeper recipes), 330, 371 curl (Unix tool), 135, 397 cursor stability, 243 Cypher (query language), 52 comparison to SPARQL, 59 D data corruption (see corruption of data) data cubes, 102 data formats (see encoding) data integration, 490-498, 543 batch and stream processing, 494-498 lambda architecture, 497 maintaining derived state, 495 reprocessing data, 496 unifying, 498 by unbundling databases, 499-515 comparison to federated databases, 501 combining tools by deriving data, 490-494 derived data versus distributed transac‐ tions, 492 limits of total ordering, 493 ordering events to capture causality, 493 reasoning about dataflows, 491 need for, 385 data lakes, 415 data locality (see locality) data models, 27-64 graph-like models, 49-63 Datalog language, 60-63 property graphs, 50 RDF and triple-stores, 55-59 query languages, 42-48 relational model versus document model, 28-42 data protection regulations, 542 data systems, 3 about, 4 Index | 565 concerns when designing, 5 future of, 489-544 correctness, constraints, and integrity, 515-533 data integration, 490-498 unbundling databases, 499-515 
heterogeneous, keeping in sync, 452 maintainability, 18-22 possible faults in, 221 reliability, 6-10 hardware faults, 7 human errors, 9 importance of, 10 software errors, 8 scalability, 10-18 unreliable clocks, 287-299 data warehousing, 91-95, 554 comparison to data lakes, 415 ETL (extract-transform-load), 92, 416, 452 keeping data systems in sync, 452 schema design, 93 slowly changing dimension (SCD), 476 data-intensive applications, 3 database triggers (see triggers) database-internal distributed transactions, 360, 364, 477 databases archival storage, 131 comparison of message brokers to, 443 dataflow through, 129 end-to-end argument for, 519-520 checking integrity, 531 inside-out, 504 (see also unbundling databases) output from batch workflows, 412 relation to event streams, 451-464 (see also changelogs) API support for change streams, 456, 506 change data capture, 454-457 event sourcing, 457-459 keeping systems in sync, 452-453 philosophy of immutable events, 459-464 unbundling, 499-515 composing data storage technologies, 499-504 designing applications around dataflow, 504-509 566 | Index observing derived state, 509-515 datacenters geographically distributed, 145, 164, 278, 493 multi-tenancy and shared resources, 284 network architecture, 276 network faults, 279 replication across multiple, 169 leaderless replication, 184 multi-leader replication, 168, 335 dataflow, 128-139, 504-509 correctness of dataflow systems, 525 differential, 504 message-passing, 136-139 reasoning about, 491 through databases, 129 through services, 131-136 dataflow engines, 421-423 comparison to stream processing, 464 directed acyclic graphs (DAG), 424 partitioning, approach to, 429 support for declarative queries, 427 Datalog (query language), 60-63 datatypes binary strings in XML and JSON, 114 conflict-free, 174 in Avro encodings, 122 in Thrift and Protocol Buffers, 121 numbers in XML and JSON, 114 Datomic (database) B-tree storage, 242 data model, 50, 57 Datalog query language, 60 
excision (deleting data), 463 languages for transactions, 255 serial execution of transactions, 253 deadlocks detection, in two-phase commit (2PC), 364 in two-phase locking (2PL), 258 Debezium (change data capture), 455 declarative languages, 42, 554 Bloom, 504 CSS and XSL, 44 Cypher, 52 Datalog, 60 for batch processing, 427 recursive SQL queries, 53 relational algebra and SQL, 42 SPARQL, 59 delays bounded network delays, 285 bounded process pauses, 298 unbounded network delays, 282 unbounded process pauses, 296 deleting data, 463 denormalization (data representation), 34, 554 costs, 39 in derived data systems, 386 materialized views, 101 updating derived data, 228, 231, 490 versus normalization, 462 derived data, 386, 439, 554 from change data capture, 454 in event sourcing, 458-458 maintaining derived state through logs, 452-457, 459-463 observing, by subscribing to streams, 512 outputs of batch and stream processing, 495 through application code, 505 versus distributed transactions, 492 deterministic operations, 255, 274, 554 accidental nondeterminism, 423 and fault tolerance, 423, 426 and idempotence, 478, 492 computing derived data, 495, 526, 531 in state machine replication, 349, 452, 458 joins, 476 DevOps, 394 differential dataflow, 504 dimension tables, 94 dimensional modeling (see star schemas) directed acyclic graphs (DAGs), 424 dirty reads (transaction isolation), 234 dirty writes (transaction isolation), 235 discrimination, 534 disks (see hard disks) distributed actor frameworks, 138 distributed filesystems, 398-399 decoupling from query engines, 417 indiscriminately dumping data into, 415 use by MapReduce, 402 distributed systems, 273-312, 554 Byzantine faults, 304-306 cloud versus supercomputing, 275 detecting network faults, 280 faults and partial failures, 274-277 formalization of consensus, 365 impossibility results, 338, 353 issues with failover, 157 limitations of distributed transactions, 363 multi-datacenter, 169, 335 network problems, 277-286 
quorums, relying on, 301 reasons for using, 145, 151 synchronized clocks, relying on, 291-295 system models, 306-310 use of clocks and time, 287 distributed transactions (see transactions) Django (web framework), 232 DNS (Domain Name System), 216, 372 Docker (container manager), 506 document data model, 30-42 comparison to relational model, 38-42 document references, 38, 403 document-oriented databases, 31 many-to-many relationships and joins, 36 multi-object transactions, need for, 231 versus relational model convergence of models, 41 data locality, 41 document-partitioned indexes, 206, 217, 411 domain-driven design (DDD), 457 DRBD (Distributed Replicated Block Device), 153 drift (clocks), 289 Drill (query engine), 93 Druid (database), 461 Dryad (dataflow engine), 421 dual writes, problems with, 452, 507 duplicates, suppression of, 517 (see also idempotence) using a unique ID, 518, 522 durability (transactions), 226, 554 duration (time), 287 measurement with monotonic clocks, 288 dynamic partitioning, 212 dynamically typed languages analogy to schema-on-read, 40 code generation and, 127 Dynamo-style databases (see leaderless replication) E edges (in graphs), 49, 403 property graph model, 50 edit distance (full-text search), 88 effectively-once semantics, 476, 516 (see also exactly-once semantics) preservation of integrity, 525 elastic systems, 17 Elasticsearch (search server) document-partitioned indexes, 207 partition rebalancing, 211 percolator (stream search), 467 usage example, 4 use of Lucene, 79 ElephantDB (database), 413 Elm (programming language), 504, 512 encodings (data formats), 111-128 Avro, 122-127 binary variants of JSON and XML, 115 compatibility, 112 calling services, 136 using databases, 129-131 using message-passing, 138 defined, 113 JSON, XML, and CSV, 114 language-specific formats, 113 merits of schemas, 127 representations of data, 112 Thrift and Protocol Buffers, 117-121 end-to-end argument, 277, 519-520 checking integrity, 531 
publish/subscribe streams, 512 enrichment (stream), 473 Enterprise JavaBeans (EJB), 134 entities (see vertices) epoch (consensus algorithms), 368 epoch (Unix timestamps), 288 equi-joins, 403 erasure coding (error correction), 398 Erlang OTP (actor framework), 139 error handling for network faults, 280 in transactions, 231 error-correcting codes, 277, 398 Esper (CEP engine), 466 etcd (coordination service), 370-373 linearizable operations, 333 locks and leader election, 330 quorum reads, 351 service discovery, 372 use of Raft algorithm, 349, 353 Ethereum (blockchain), 532 Ethernet (networks), 276, 278, 285 packet checksums, 306, 519 Etherpad (collaborative editor), 170 ethics, 533-543 code of ethics and professional practice, 533 legislation and self-regulation, 542 predictive analytics, 533-536 amplifying bias, 534 feedback loops, 536 privacy and tracking, 536-543 consent and freedom of choice, 538 data as assets and power, 540 meaning of privacy, 539 surveillance, 537 respect, dignity, and agency, 543, 544 unintended consequences, 533, 536 ETL (extract-transform-load), 92, 405, 452, 554 use of Hadoop for, 416 event sourcing, 457-459 commands and events, 459 comparison to change data capture, 457 comparison to lambda architecture, 497 deriving current state from event log, 458 immutability and auditability, 459, 531 large, reliable data systems, 519, 526 Event Store (database), 458 event streams (see streams) events, 440 deciding on total order of, 493 deriving views from event log, 461 difference to commands, 459 event time versus processing time, 469, 477, 498 immutable, advantages of, 460, 531 ordering to capture causality, 493 reads as, 513 stragglers, 470, 498 timestamp of, in stream processing, 471 EventSource (browser API), 512 eventual consistency, 152, 162, 308, 322 (see also conflicts) and perpetual inconsistency, 525 evolvability, 21, 111 calling services, 136 graph-structured data, 52 of databases, 40, 129-131, 461, 497 of message-passing, 
138 reprocessing data, 496, 498 schema evolution in Avro, 123 schema evolution in Thrift and Protocol Buffers, 120 schema-on-read, 39, 111, 128 exactly-once semantics, 360, 476, 516 parity with batch processors, 498 preservation of integrity, 525 exclusive mode (locks), 258 eXtended Architecture transactions (see XA transactions) extract-transform-load (see ETL) F Facebook Presto (query engine), 93 React, Flux, and Redux (user interface libraries), 512 social graphs, 49 Wormhole (change data capture), 455 fact tables, 93 failover, 157, 554 (see also leader-based replication) in leaderless replication, absence of, 178 leader election, 301, 348, 352 potential problems, 157 failures amplification by distributed transactions, 364, 495 failure detection, 280 automatic rebalancing causing cascading failures, 214 perfect failure detectors, 359 timeouts and unbounded delays, 282, 284 using ZooKeeper, 371 faults versus, 7 partial failures in distributed systems, 275-277, 310 fan-out (messaging systems), 11, 445 fault tolerance, 6-10, 555 abstractions for, 321 formalization in consensus, 365-369 use of replication, 367 human fault tolerance, 414 in batch processing, 406, 414, 422, 425 in log-based systems, 520, 524-526 in stream processing, 476-479 atomic commit, 477 idempotence, 478 maintaining derived state, 495 microbatching and checkpointing, 477 rebuilding state after a failure, 478 of distributed transactions, 362-364 transaction atomicity, 223, 354-361 faults, 6 Byzantine faults, 304-306 failures versus, 7 handled by transactions, 221 handling in supercomputers and cloud computing, 275 hardware, 7 in batch processing versus distributed databases, 417 in distributed systems, 274-277 introducing deliberately, 7, 280 network faults, 279-281 asymmetric faults, 300 detecting, 280 tolerance of, in multi-leader replication, 169 software errors, 8 tolerating (see fault tolerance) federated databases, 501 fence (CPU instruction), 338 fencing (preventing split brain), 158, 
302-304 generating fencing tokens, 349, 370 properties of fencing tokens, 308 stream processors writing to databases, 478, 517 Fibre Channel (networks), 398 field tags (Thrift and Protocol Buffers), 119-121 file descriptors (Unix), 395 financial data, 460 Firebase (database), 456 Flink (processing framework), 421-423 dataflow APIs, 427 fault tolerance, 422, 477, 479 Gelly API (graph processing), 425 integration of batch and stream processing, 495, 498 machine learning, 428 query optimizer, 427 stream processing, 466 flow control, 282, 441, 555 FLP result (on consensus), 353 FlumeJava (dataflow library), 403, 427 followers, 152, 555 (see also leader-based replication) foreign keys, 38, 403 forward compatibility, 112 forward decay (algorithm), 16 Fossil (version control system), 463 shunning (deleting data), 463 FoundationDB (database) serializable transactions, 261, 265, 364 fractal trees, 83 full table scans, 403 full-text search, 555 and fuzzy indexes, 88 building search indexes, 411 Lucene storage engine, 79 functional reactive programming (FRP), 504 functional requirements, 22 futures (asynchronous operations), 135 fuzzy search (see similarity search) G garbage collection immutability and, 463 process pauses for, 14, 296-299, 301 (see also process pauses) genome analysis, 63, 429 geographically distributed datacenters, 145, 164, 278, 493 geospatial indexes, 87 Giraph (graph processing), 425 Git (version control system), 174, 342, 463 GitHub, postmortems, 157, 158, 309 global indexes (see term-partitioned indexes) GlusterFS (distributed filesystem), 398 GNU Coreutils (Linux), 394 GoldenGate (change data capture), 161, 170, 455 (see also Oracle) Google Bigtable (database) data model (see Bigtable data model) partitioning scheme, 199, 202 storage layout, 78 Chubby (lock service), 370 Cloud Dataflow (stream processor), 466, 477, 498 (see also Beam) Cloud Pub/Sub (messaging), 444, 448 Docs (collaborative editor), 170 Dremel (query engine), 93, 96 
FlumeJava (dataflow library), 403, 427 GFS (distributed file system), 398 gRPC (RPC framework), 135 MapReduce (batch processing), 390 (see also MapReduce) building search indexes, 411 task preemption, 418 Pregel (graph processing), 425 Spanner (see Spanner) TrueTime (clock API), 294 gossip protocol, 216 government use of data, 541 GPS (Global Positioning System) use for clock synchronization, 287, 290, 294, 295 GraphChi (graph processing), 426 graphs, 555 as data models, 49-63 example of graph-structured data, 49 property graphs, 50 RDF and triple-stores, 55-59 versus the network model, 60 processing and analysis, 424-426 fault tolerance, 425 Pregel processing model, 425 query languages Cypher, 52 Datalog, 60-63 recursive SQL queries, 53 SPARQL, 59-59 Gremlin (graph query language), 50 grep (Unix tool), 392 GROUP BY clause (SQL), 406 grouping records in MapReduce, 406 handling skew, 407 H Hadoop (data infrastructure) comparison to distributed databases, 390 comparison to MPP databases, 414-418 comparison to Unix, 413-414, 499 diverse processing models in ecosystem, 417 HDFS distributed filesystem (see HDFS) higher-level tools, 403 join algorithms, 403-410 (see also MapReduce) MapReduce (see MapReduce) YARN (see YARN) happens-before relationship, 340 capturing, 187 concurrency and, 186 hard disks access patterns, 84 detecting corruption, 519, 530 faults in, 7, 227 sequential write throughput, 75, 450 hardware faults, 7 hash indexes, 72-75 broadcast hash joins, 409 partitioned hash joins, 409 hash partitioning, 203-205, 217 consistent hashing, 204 problems with hash mod N, 210 range queries, 204 suitable hash functions, 203 with fixed number of partitions, 210 HAWQ (database), 428 HBase (database) bug due to lack of fencing, 302 bulk loading, 413 column-family data model, 41, 99 dynamic partitioning, 212 key-range partitioning, 202 log-structured storage, 78 request routing, 216 size-tiered compaction, 79 use of HDFS, 417 use of ZooKeeper, 370 HDFS 
(Hadoop Distributed File System), 398-399 (see also distributed filesystems) checking data integrity, 530 decoupling from query engines, 417 indiscriminately dumping data into, 415 metadata about datasets, 410 NameNode, 398 use by Flink, 479 use by HBase, 212 use by MapReduce, 402 HdrHistogram (numerical library), 16 head (Unix tool), 392 head vertex (property graphs), 51 head-of-line blocking, 15 heap files (databases), 86 Helix (cluster manager), 216 heterogeneous distributed transactions, 360, 364 heuristic decisions (in 2PC), 363 Hibernate (object-relational mapper), 30 hierarchical model, 36 high availability (see fault tolerance) high-frequency trading, 290, 299 high-performance computing (HPC), 275 hinted handoff, 183 histograms, 16 Hive (query engine), 419, 427 for data warehouses, 93 HCatalog and metastore, 410 map-side joins, 409 query optimizer, 427 skewed joins, 408 workflows, 403 Hollerith machines, 390 hopping windows (stream processing), 472 (see also windows) horizontal scaling (see scaling out) HornetQ (messaging), 137, 444 distributed transaction support, 361 hot spots, 201 due to celebrities, 205 for time-series data, 203 in batch processing, 407 relieving, 205 hot standbys (see leader-based replication) HTTP, use in APIs (see services) human errors, 9, 279, 414 HyperDex (database), 88 HyperLogLog (algorithm), 466 I I/O operations, waiting for, 297 IBM DB2 (database) distributed transaction support, 361 recursive query support, 54 serializable isolation, 242, 257 XML and JSON support, 30, 42 electromechanical card-sorting machines, 390 IMS (database), 36 imperative query APIs, 46 InfoSphere Streams (CEP engine), 466 MQ (messaging), 444 distributed transaction support, 361 System R (database), 222 WebSphere (messaging), 137 idempotence, 134, 478, 555 by giving operations unique IDs, 518, 522 idempotent operations, 517 immutability advantages of, 460, 531 deriving state from event log, 459-464 for crash recovery, 75 in B-trees, 82, 242 
in event sourcing, 457 inputs to Unix commands, 397 limitations of, 463 Impala (query engine) for data warehouses, 93 hash joins, 409 native code generation, 428 use of HDFS, 417 impedance mismatch, 29 imperative languages, 42 setting element styles (example), 45 in doubt (transaction status), 358 holding locks, 362 orphaned transactions, 363 in-memory databases, 88 durability, 227 serial transaction execution, 253 incidents cascading failures, 9 crashes due to leap seconds, 290 data corruption and financial losses due to concurrency bugs, 233 data corruption on hard disks, 227 data loss due to last-write-wins, 173, 292 data on disks unreadable, 309 deleted items reappearing, 174 disclosure of sensitive data due to primary key reuse, 157 errors in transaction serializability, 529 gigabit network interface with 1 Kb/s throughput, 311 network faults, 279 network interface dropping only inbound packets, 279 network partitions and whole-datacenter failures, 275 poor handling of network faults, 280 sending message to ex-partner, 494 sharks biting undersea cables, 279 split brain due to 1-minute packet delay, 158, 279 vibrations in server rack, 14 violation of uniqueness constraint, 529 indexes, 71, 555 and snapshot isolation, 241 as derived data, 386, 499-504 B-trees, 79-83 building in batch processes, 411 clustered, 86 comparison of B-trees and LSM-trees, 83-85 concatenated, 87 covering (with included columns), 86 creating, 500 full-text search, 88 geospatial, 87 hash, 72-75 index-range locking, 260 multi-column, 87 partitioning and secondary indexes, 206-209, 217 secondary, 85 (see also secondary indexes) problems with dual writes, 452, 491 SSTables and LSM-trees, 76-79 updating when data changes, 452, 467 Industrial Revolution, 541 InfiniBand (networks), 285 InfiniteGraph (database), 50 InnoDB (storage engine) clustered index on primary key, 86 not preventing lost updates, 245 preventing write skew, 248, 257 serializable isolation, 257 snapshot isolation 
support, 239 inside-out databases, 504 (see also unbundling databases) integrating different data systems (see data integration) integrity, 524 coordination-avoiding data systems, 528 correctness of dataflow systems, 525 in consensus formalization, 365 integrity checks, 530 (see also auditing) end-to-end, 519, 531 use of snapshot isolation, 238 maintaining despite software bugs, 529 Interface Definition Language (IDL), 117, 122 intermediate state, materialization of, 420-423 internet services, systems for implementing, 275 invariants, 225 (see also constraints) inversion of control, 396 IP (Internet Protocol) unreliability of, 277 ISDN (Integrated Services Digital Network), 284 isolation (in transactions), 225, 228, 555 correctness and, 515 for single-object writes, 230 serializability, 251-266 actual serial execution, 252-256 serializable snapshot isolation (SSI), 261-266 two-phase locking (2PL), 257-261 violating, 228 weak isolation levels, 233-251 preventing lost updates, 242-246 read committed, 234-237 snapshot isolation, 237-242 iterative processing, 424-426 J Java Database Connectivity (JDBC) distributed transaction support, 361 network drivers, 128 Java Enterprise Edition (EE), 134, 356, 361 Java Message Service (JMS), 444 (see also messaging systems) comparison to log-based messaging, 448, 451 distributed transaction support, 361 message ordering, 446 Java Transaction API (JTA), 355, 361 Java Virtual Machine (JVM) bytecode generation, 428 garbage collection pauses, 296 process reuse in batch processors, 422 JavaScript in MapReduce querying, 46 setting element styles (example), 45 use in advanced queries, 48 Jena (RDF framework), 57 Jepsen (fault tolerance testing), 515 jitter (network delay), 284 joins, 555 by index lookup, 403 expressing as relational operators, 427 in relational and document databases, 34 MapReduce map-side joins, 408-410 broadcast hash joins, 409 merge joins, 410 partitioned hash joins, 409 MapReduce reduce-side joins, 403-408 handling 
skew, 407 sort-merge joins, 405 parallel execution of, 415 secondary indexes and, 85 stream joins, 472-476 stream-stream join, 473 stream-table join, 473 table-table join, 474 time-dependence of, 475 support in document databases, 42 JOTM (transaction coordinator), 356 JSON Avro schema representation, 122 binary variants, 115 for application data, issues with, 114 in relational databases, 30, 42 representing a résumé (example), 31 Juttle (query language), 504 K k-nearest neighbors, 429 Kafka (messaging), 137, 448 Kafka Connect (database integration), 457, 461 Kafka Streams (stream processor), 466, 467 fault tolerance, 479 leader-based replication, 153 log compaction, 456, 467 message offsets, 447, 478 request routing, 216 transaction support, 477 usage example, 4 Ketama (partitioning library), 213 key-value stores, 70 as batch process output, 412 hash indexes, 72-75 in-memory, 89 partitioning, 201-205 by hash of key, 203, 217 by key range, 202, 217 dynamic partitioning, 212 skew and hot spots, 205 Kryo (Java), 113 Kubernetes (cluster manager), 418, 506 L lambda architecture, 497 Lamport timestamps, 345 Large Hadron Collider (LHC), 64 last write wins (LWW), 173, 334 discarding concurrent writes, 186 problems with, 292 prone to lost updates, 246 late binding, 396 latency instability under two-phase locking, 259 network latency and resource utilization, 286 response time versus, 14 tail latency, 15, 207 leader-based replication, 152-161 (see also replication) failover, 157, 301 handling node outages, 156 implementation of replication logs change data capture, 454-457 (see also changelogs) statement-based, 158 trigger-based replication, 161 write-ahead log (WAL) shipping, 159 linearizability of operations, 333 locking and leader election, 330 log sequence number, 156, 449 read-scaling architecture, 161 relation to consensus, 367 setting up new followers, 155 synchronous versus asynchronous, 153-155 leaderless replication, 177-191 (see also replication) 
detecting concurrent writes, 184-191 capturing happens-before relationship, 187 happens-before relationship and concurrency, 186 last write wins, 186 merging concurrently written values, 190 version vectors, 191 multi-datacenter, 184 quorums, 179-182 consistency limitations, 181-183, 334 sloppy quorums and hinted handoff, 183 read repair and anti-entropy, 178 leap seconds, 8, 290 in time-of-day clocks, 288 leases, 295 implementation with ZooKeeper, 370 need for fencing, 302 ledgers, 460 distributed ledger technologies, 532 legacy systems, maintenance of, 18 less (Unix tool), 397 LevelDB (storage engine), 78 leveled compaction, 79 Levenshtein automata, 88 limping (partial failure), 311 linearizability, 324-338, 555 cost of, 335-338 CAP theorem, 336 memory on multi-core CPUs, 338 definition, 325-329 implementing with total order broadcast, 350 in ZooKeeper, 370 of derived data systems, 492, 524 avoiding coordination, 527 of different replication methods, 332-335 using quorums, 334 relying on, 330-332 constraints and uniqueness, 330 cross-channel timing dependencies, 331 locking and leader election, 330 stronger than causal consistency, 342 using to implement total order broadcast, 351 versus serializability, 329 LinkedIn Azkaban (workflow scheduler), 402 Databus (change data capture), 161, 455 Espresso (database), 31, 126, 130, 153, 216 Helix (cluster manager) (see Helix) profile (example), 30 reference to company entity (example), 34 Rest.li (RPC framework), 135 Voldemort (database) (see Voldemort) Linux, leap second bug, 8, 290 liveness properties, 308 LMDB (storage engine), 82, 242 load approaches to coping with, 17 describing, 11 load testing, 16 load balancing (messaging), 444 local indexes (see document-partitioned indexes) locality (data access), 32, 41, 555 in batch processing, 400, 405, 421 in stateful clients, 170, 511 in stream processing, 474, 478, 508, 522 location transparency, 134 in the actor model, 138 locks, 556 deadlock, 258 
distributed locking, 301-304, 330 fencing tokens, 303 implementation with ZooKeeper, 370 relation to consensus, 374 for transaction isolation in snapshot isolation, 239 in two-phase locking (2PL), 257-261 making operations atomic, 243 performance, 258 preventing dirty writes, 236 preventing phantoms with index-range locks, 260, 265 read locks (shared mode), 236, 258 shared mode and exclusive mode, 258 in two-phase commit (2PC) deadlock detection, 364 in-doubt transactions holding locks, 362 materializing conflicts with, 251 preventing lost updates by explicit locking, 244 log sequence number, 156, 449 logic programming languages, 504 logical clocks, 293, 343, 494 for read-after-write consistency, 164 logical logs, 160 logs (data structure), 71, 556 advantages of immutability, 460 compaction, 73, 79, 456, 460 for stream operator state, 479 creating using total order broadcast, 349 implementing uniqueness constraints, 522 log-based messaging, 446-451 comparison to traditional messaging, 448, 451 consumer offsets, 449 disk space usage, 450 replaying old messages, 451, 496, 498 slow consumers, 450 using logs for message storage, 447 log-structured storage, 71-79 log-structured merge tree (see LSM-trees) replication, 152, 158-161 change data capture, 454-457 (see also changelogs) coordination with snapshot, 156 logical (row-based) replication, 160 statement-based replication, 158 trigger-based replication, 161 write-ahead log (WAL) shipping, 159 scalability limits, 493 loose coupling, 396, 419, 502 lost updates (see updates) LSM-trees (indexes), 78-79 comparison to B-trees, 83-85 Lucene (storage engine), 79 building indexes in batch processes, 411 similarity search, 88 Luigi (workflow scheduler), 402 LWW (see last write wins) M machine learning ethical considerations, 534 (see also ethics) iterative processing, 424 models derived from training data, 505 statistical and numerical algorithms, 428 MADlib (machine learning toolkit), 428 magic scaling sauce, 18 Mahout 
(machine learning toolkit), 428 maintainability, 18-22, 489 defined, 23 design principles for software systems, 19 evolvability (see evolvability) operability, 19 simplicity and managing complexity, 20 many-to-many relationships in document model versus relational model, 39 modeling as graphs, 49 many-to-one and many-to-many relationships, 33-36 many-to-one relationships, 34 MapReduce (batch processing), 390, 399-400 accessing external services within job, 404, 412 comparison to distributed databases designing for frequent faults, 417 diversity of processing models, 416 diversity of storage, 415 comparison to stream processing, 464 comparison to Unix, 413-414 disadvantages and limitations of, 419 fault tolerance, 406, 414, 422 higher-level tools, 403, 426 implementation in Hadoop, 400-403 the shuffle, 402 implementation in MongoDB, 46-48 machine learning, 428 map-side processing, 408-410 broadcast hash joins, 409 merge joins, 410 partitioned hash joins, 409 mapper and reducer functions, 399 materialization of intermediate state, 419-423 output of batch workflows, 411-413 building search indexes, 411 key-value stores, 412 reduce-side processing, 403-408 analysis of user activity events (example), 404 grouping records by same key, 406 handling skew, 407 sort-merge joins, 405 workflows, 402 marshalling (see encoding) massively parallel processing (MPP), 216 comparison to composing storage technologies, 502 comparison to Hadoop, 414-418, 428 master-master replication (see multi-leader replication) master-slave replication (see leader-based replication) materialization, 556 aggregate values, 101 conflicts, 251 intermediate state (batch processing), 420-423 materialized views, 101 as derived data, 386, 499-504 maintaining, using stream processing, 467, 475 Maven (Java build tool), 428 Maxwell (change data capture), 455 mean, 14 media monitoring, 467 median, 14 meeting room booking (example), 249, 259, 521 membership services, 372 Memcached 
(caching server), 4, 89 memory in-memory databases, 88 durability, 227 serial transaction execution, 253 in-memory representation of data, 112 random bit-flips in, 529 use by indexes, 72, 77 memory barrier (CPU instruction), 338 MemSQL (database) in-memory storage, 89 read committed isolation, 236 memtable (in LSM-trees), 78 Mercurial (version control system), 463 merge joins, MapReduce map-side, 410 mergeable persistent data structures, 174 merging sorted files, 76, 402, 405 Merkle trees, 532 Mesos (cluster manager), 418, 506 message brokers (see messaging systems) message-passing, 136-139 advantages over direct RPC, 137 distributed actor frameworks, 138 evolvability, 138 MessagePack (encoding format), 116 messages exactly-once semantics, 360, 476 loss of, 442 using total order broadcast, 348 messaging systems, 440-451 (see also streams) backpressure, buffering, or dropping messages, 441 brokerless messaging, 442 event logs, 446-451 comparison to traditional messaging, 448, 451 consumer offsets, 449 replaying old messages, 451, 496, 498 slow consumers, 450 message brokers, 443-446 acknowledgements and redelivery, 445 comparison to event logs, 448, 451 multiple consumers of same topic, 444 reliability, 442 uniqueness in log-based messaging, 522 Meteor (web framework), 456 microbatching, 477, 495 microservices, 132 (see also services) causal dependencies across services, 493 loose coupling, 502 relation to batch/stream processors, 389, 508 Microsoft Azure Service Bus (messaging), 444 Azure Storage, 155, 398 Azure Stream Analytics, 466 DCOM (Distributed Component Object Model), 134 MSDTC (transaction coordinator), 356 Orleans (see Orleans) SQL Server (see SQL Server) migrating (rewriting) data, 40, 130, 461, 497 modulus operator (%), 210 MongoDB (database) aggregation pipeline, 48 atomic operations, 243 BSON, 41 document data model, 31 hash partitioning (sharding), 203-204 key-range partitioning, 202 lack of join support, 34, 42 leader-based replication, 153 
MapReduce support, 46, 400 oplog parsing, 455, 456 partition splitting, 212 request routing, 216 secondary indexes, 207 Mongoriver (change data capture), 455 monitoring, 10, 19 monotonic clocks, 288 monotonic reads, 164 MPP (see massively parallel processing) MSMQ (messaging), 361 multi-column indexes, 87 multi-leader replication, 168-177 (see also replication) handling write conflicts, 171 conflict avoidance, 172 converging toward a consistent state, 172 custom conflict resolution logic, 173 determining what is a conflict, 174 linearizability, lack of, 333 replication topologies, 175-177 use cases, 168 clients with offline operation, 170 collaborative editing, 170 multi-datacenter replication, 168, 335 multi-object transactions, 228 need for, 231 Multi-Paxos (total order broadcast), 367 multi-table index cluster tables (Oracle), 41 multi-tenancy, 284 multi-version concurrency control (MVCC), 239, 266 detecting stale MVCC reads, 263 indexes and snapshot isolation, 241 mutual exclusion, 261 (see also locks) MySQL (database) binlog coordinates, 156 binlog parsing for change data capture, 455 circular replication topology, 175 consistent snapshots, 156 distributed transaction support, 361 InnoDB storage engine (see InnoDB) JSON support, 30, 42 leader-based replication, 153 performance of XA transactions, 360 row-based replication, 160 schema changes in, 40 snapshot isolation support, 242 (see also InnoDB) statement-based replication, 159 Tungsten Replicator (multi-leader replication), 170 conflict detection, 177 N nanomsg (messaging library), 442 Narayana (transaction coordinator), 356 NATS (messaging), 137 near-real-time (nearline) processing, 390 (see also stream processing) Neo4j (database) Cypher query language, 52 graph data model, 50 Nephele (dataflow engine), 421 netcat (Unix tool), 397 Netflix Chaos Monkey, 7, 280 Network Attached Storage (NAS), 146, 398 network model, 36 graph databases versus, 60 imperative query APIs, 46 Network Time Protocol 
(see NTP) networks congestion and queueing, 282 datacenter network topologies, 276 faults (see faults) linearizability and network delays, 338 network partitions, 279, 337 timeouts and unbounded delays, 281 next-key locking, 260 nodes (in graphs) (see vertices) nodes (processes), 556 handling outages in leader-based replication, 156 system models for failure, 307 noisy neighbors, 284 nonblocking atomic commit, 359 nondeterministic operations accidental nondeterminism, 423 partial failures in distributed systems, 275 nonfunctional requirements, 22 nonrepeatable reads, 238 (see also read skew) normalization (data representation), 33, 556 executing joins, 39, 42, 403 foreign key references, 231 in systems of record, 386 versus denormalization, 462 NoSQL, 29, 499 transactions and, 223 Notation3 (N3), 56 npm (package manager), 428 NTP (Network Time Protocol), 287 accuracy, 289, 293 adjustments to monotonic clocks, 289 multiple server addresses, 306 numbers, in XML and JSON encodings, 114 O object-relational mapping (ORM) frameworks, 30 error handling and aborted transactions, 232 unsafe read-modify-write cycle code, 244 object-relational mismatch, 29 observer pattern, 506 offline systems, 390 (see also batch processing) stateful, offline-capable clients, 170, 511 offline-first applications, 511 offsets consumer offsets in partitioned logs, 449 messages in partitioned logs, 447 OLAP (online analytic processing), 91, 556 data cubes, 102 OLTP (online transaction processing), 90, 556 analytics queries versus, 411 workload characteristics, 253 one-to-many relationships, 30 JSON representation, 32 online systems, 389 (see also services) Oozie (workflow scheduler), 402 OpenAPI (service definition format), 133 OpenStack Nova (cloud infrastructure) use of ZooKeeper, 370 Swift (object storage), 398 operability, 19 operating systems versus databases, 499 operation identifiers, 518, 522 operational transformation, 174 operators, 421 flow of data between, 424 in stream 
processing, 464 optimistic concurrency control, 261 Oracle (database) distributed transaction support, 361 GoldenGate (change data capture), 161, 170, 455 lack of serializability, 226 leader-based replication, 153 multi-table index cluster tables, 41 not preventing write skew, 248 partitioned indexes, 209 PL/SQL language, 255 preventing lost updates, 245 read committed isolation, 236 Real Application Clusters (RAC), 330 recursive query support, 54 snapshot isolation support, 239, 242 TimesTen (in-memory database), 89 WAL-based replication, 160 XML support, 30 ordering, 339-352 by sequence numbers, 343-348 causal ordering, 339-343 partial order, 341 limits of total ordering, 493 total order broadcast, 348-352 Orleans (actor framework), 139 outliers (response time), 14 Oz (programming language), 504 P package managers, 428, 505 packet switching, 285 packets corruption of, 306 sending via UDP, 442 PageRank (algorithm), 49, 424 paging (see virtual memory) ParAccel (database), 93 parallel databases (see massively parallel processing) parallel execution of graph analysis algorithms, 426 queries in MPP databases, 216 Parquet (data format), 96, 131 (see also column-oriented storage) use in Hadoop, 414 partial failures, 275, 310 limping, 311 partial order, 341 partitioning, 199-218, 556 and replication, 200 in batch processing, 429 multi-partition operations, 514 enforcing constraints, 522 secondary index maintenance, 495 of key-value data, 201-205 by key range, 202 skew and hot spots, 205 rebalancing partitions, 209-214 automatic or manual rebalancing, 213 problems with hash mod N, 210 using dynamic partitioning, 212 using fixed number of partitions, 210 using N partitions per node, 212 replication and, 147 request routing, 214-216 secondary indexes, 206-209 document-based partitioning, 206 term-based partitioning, 208 serial execution of transactions and, 255 Paxos (consensus algorithm), 366 ballot number, 368 Multi-Paxos (total order broadcast), 367 percentiles, 14, 
556 calculating efficiently, 16 importance of high percentiles, 16 use in service level agreements (SLAs), 15 Percona XtraBackup (MySQL tool), 156 performance describing, 13 of distributed transactions, 360 of in-memory databases, 89 of linearizability, 338 of multi-leader replication, 169 perpetual inconsistency, 525 pessimistic concurrency control, 261 phantoms (transaction isolation), 250 materializing conflicts, 251 preventing, in serializability, 259 physical clocks (see clocks) pickle (Python), 113 Pig (dataflow language), 419, 427 replicated joins, 409 skewed joins, 407 workflows, 403 Pinball (workflow scheduler), 402 pipelined execution, 423 in Unix, 394 point in time, 287 polyglot persistence, 29 polystores, 501 PostgreSQL (database) BDR (multi-leader replication), 170 causal ordering of writes, 177 Bottled Water (change data capture), 455 Bucardo (trigger-based replication), 161, 173 distributed transaction support, 361 foreign data wrappers, 501 full text search support, 490 leader-based replication, 153 log sequence number, 156 MVCC implementation, 239, 241 PL/pgSQL language, 255 PostGIS geospatial indexes, 87 preventing lost updates, 245 preventing write skew, 248, 261 read committed isolation, 236 recursive query support, 54 representing graphs, 51 serializable snapshot isolation (SSI), 261 snapshot isolation support, 239, 242 WAL-based replication, 160 XML and JSON support, 30, 42 pre-splitting, 212 Precision Time Protocol (PTP), 290 predicate locks, 259 predictive analytics, 533-536 amplifying bias, 534 ethics of (see ethics) feedback loops, 536 preemption of datacenter resources, 418 of threads, 298 Pregel processing model, 425 primary keys, 85, 556 compound primary key (Cassandra), 204 primary-secondary replication (see leader-based replication) privacy, 536-543 consent and freedom of choice, 538 data as assets and power, 540 deleting data, 463 ethical considerations (see ethics) legislation and self-regulation, 542 meaning of, 539 
surveillance, 537 tracking behavioral data, 536 probabilistic algorithms, 16, 466 process pauses, 295-299 processing time (of events), 469 producers (message streams), 440 programming languages dataflow languages, 504 for stored procedures, 255 functional reactive programming (FRP), 504 logic programming, 504 Prolog (language), 61 (see also Datalog) promises (asynchronous operations), 135 property graphs, 50 Cypher query language, 52 Protocol Buffers (data format), 117-121 field tags and schema evolution, 120 provenance of data, 531 publish/subscribe model, 441 publishers (message streams), 440 punch card tabulating machines, 390 580 | Index pure functions, 48 putting computation near data, 400 Q Qpid (messaging), 444 quality of service (QoS), 285 Quantcast File System (distributed filesystem), 398 query languages, 42-48 aggregation pipeline, 48 CSS and XSL, 44 Cypher, 52 Datalog, 60 Juttle, 504 MapReduce querying, 46-48 recursive SQL queries, 53 relational algebra and SQL, 42 SPARQL, 59 query optimizers, 37, 427 queueing delays (networks), 282 head-of-line blocking, 15 latency and response time, 14 queues (messaging), 137 quorums, 179-182, 556 for leaderless replication, 179 in consensus algorithms, 368 limitations of consistency, 181-183, 334 making decisions in distributed systems, 301 monitoring staleness, 182 multi-datacenter replication, 184 relying on durability, 309 sloppy quorums and hinted handoff, 183 R R-trees (indexes), 87 RabbitMQ (messaging), 137, 444 leader-based replication, 153 race conditions, 225 (see also concurrency) avoiding with linearizability, 331 caused by dual writes, 452 dirty writes, 235 in counter increments, 235 lost updates, 242-246 preventing with event logs, 462, 507 preventing with serializable isolation, 252 write skew, 246-251 Raft (consensus algorithm), 366 sensitivity to network problems, 369 term number, 368 use in etcd, 353 RAID (Redundant Array of Independent Disks), 7, 398 railways, schema migration on, 496 RAMCloud 
(in-memory storage), 89 ranking algorithms, 424 RDF (Resource Description Framework), 57 querying with SPARQL, 59 RDMA (Remote Direct Memory Access), 276 read committed isolation level, 234-237 implementing, 236 multi-version concurrency control (MVCC), 239 no dirty reads, 234 no dirty writes, 235 read path (derived data), 509 read repair (leaderless replication), 178 for linearizability, 335 read replicas (see leader-based replication) read skew (transaction isolation), 238, 266 as violation of causality, 340 read-after-write consistency, 163, 524 cross-device, 164 read-modify-write cycle, 243 read-scaling architecture, 161 reads as events, 513 real-time collaborative editing, 170 near-real-time processing, 390 (see also stream processing) publish/subscribe dataflow, 513 response time guarantees, 298 time-of-day clocks, 288 rebalancing partitions, 209-214, 556 (see also partitioning) automatic or manual rebalancing, 213 dynamic partitioning, 212 fixed number of partitions, 210 fixed number of partitions per node, 212 problems with hash mod N, 210 recency guarantee, 324 recommendation engines batch process outputs, 412 batch workflows, 403, 420 iterative processing, 424 statistical and numerical algorithms, 428 records, 399 events in stream processing, 440 recursive common table expressions (SQL), 54 redelivery (messaging), 445 Redis (database) atomic operations, 243 durability, 89 Lua scripting, 255 single-threaded execution, 253 usage example, 4 redundancy hardware components, 7 of derived data, 386 (see also derived data) Reed–Solomon codes (error correction), 398 refactoring, 22 (see also evolvability) regions (partitioning), 199 register (data structure), 325 relational data model, 28-42 comparison to document model, 38-42 graph queries in SQL, 53 in-memory databases with, 89 many-to-one and many-to-many relation‐ ships, 33 multi-object transactions, need for, 231 NoSQL as alternative to, 29 object-relational mismatch, 29 relational algebra and SQL, 42 versus 
document model convergence of models, 41 data locality, 41 relational databases eventual consistency, 162 history, 28 leader-based replication, 153 logical logs, 160 philosophy compared to Unix, 499, 501 schema changes, 40, 111, 130 statement-based replication, 158 use of B-tree indexes, 80 relationships (see edges) reliability, 6-10, 489 building a reliable system from unreliable components, 276 defined, 6, 22 hardware faults, 7 human errors, 9 importance of, 10 of messaging systems, 442 Index | 581 software errors, 8 Remote Method Invocation (Java RMI), 134 remote procedure calls (RPCs), 134-136 (see also services) based on futures, 135 data encoding and evolution, 136 issues with, 134 using Avro, 126, 135 using Thrift, 135 versus message brokers, 137 repeatable reads (transaction isolation), 242 replicas, 152 replication, 151-193, 556 and durability, 227 chain replication, 155 conflict resolution and, 246 consistency properties, 161-167 consistent prefix reads, 165 monotonic reads, 164 reading your own writes, 162 in distributed filesystems, 398 leaderless, 177-191 detecting concurrent writes, 184-191 limitations of quorum consistency, 181-183, 334 sloppy quorums and hinted handoff, 183 monitoring staleness, 182 multi-leader, 168-177 across multiple datacenters, 168, 335 handling write conflicts, 171-175 replication topologies, 175-177 partitioning and, 147, 200 reasons for using, 145, 151 single-leader, 152-161 failover, 157 implementation of replication logs, 158-161 relation to consensus, 367 setting up new followers, 155 synchronous versus asynchronous, 153-155 state machine replication, 349, 452 using erasure coding, 398 with heterogeneous data systems, 453 replication logs (see logs) reprocessing data, 496, 498 (see also evolvability) from log-based messaging, 451 request routing, 214-216 582 | Index approaches to, 214 parallel query execution, 216 resilient systems, 6 (see also fault tolerance) response time as performance metric for services, 13, 389 
guarantees on, 298 latency versus, 14 mean and percentiles, 14 user experience, 15 responsibility and accountability, 535 REST (Representational State Transfer), 133 (see also services) RethinkDB (database) document data model, 31 dynamic partitioning, 212 join support, 34, 42 key-range partitioning, 202 leader-based replication, 153 subscribing to changes, 456 Riak (database) Bitcask storage engine, 72 CRDTs, 174, 191 dotted version vectors, 191 gossip protocol, 216 hash partitioning, 203-204, 211 last-write-wins conflict resolution, 186 leaderless replication, 177 LevelDB storage engine, 78 linearizability, lack of, 335 multi-datacenter support, 184 preventing lost updates across replicas, 246 rebalancing, 213 search feature, 209 secondary indexes, 207 siblings (concurrently written values), 190 sloppy quorums, 184 ring buffers, 450 Ripple (cryptocurrency), 532 rockets, 10, 36, 305 RocksDB (storage engine), 78 leveled compaction, 79 rollbacks (transactions), 222 rolling upgrades, 8, 112 routing (see request routing) row-oriented storage, 96 row-based replication, 160 rowhammer (memory corruption), 529 RPCs (see remote procedure calls) Rubygems (package manager), 428 rules (Datalog), 61 S safety and liveness properties, 308 in consensus algorithms, 366 in transactions, 222 sagas (see compensating transactions) Samza (stream processor), 466, 467 fault tolerance, 479 streaming SQL support, 466 sandboxes, 9 SAP HANA (database), 93 scalability, 10-18, 489 approaches for coping with load, 17 defined, 22 describing load, 11 describing performance, 13 partitioning and, 199 replication and, 161 scaling up versus scaling out, 146 scaling out, 17, 146 (see also shared-nothing architecture) scaling up, 17, 146 scatter/gather approach, querying partitioned databases, 207 SCD (slowly changing dimension), 476 schema-on-read, 39 comparison to evolvable schema, 128 in distributed filesystems, 415 schema-on-write, 39 schemaless databases (see schema-on-read) schemas, 557 Avro, 
122-127 reader determining writer’s schema, 125 schema evolution, 123 dynamically generated, 126 evolution of, 496 affecting application code, 111 compatibility checking, 126 in databases, 129-131 in message-passing, 138 in service calls, 136 flexibility in document model, 39 for analytics, 93-95 for JSON and XML, 115 merits of, 127 schema migration on railways, 496 Thrift and Protocol Buffers, 117-121 schema evolution, 120 traditional approach to design, fallacy in, 462 searches building search indexes in batch processes, 411 k-nearest neighbors, 429 on streams, 467 partitioned secondary indexes, 206 secondaries (see leader-based replication) secondary indexes, 85, 557 partitioning, 206-209, 217 document-partitioned, 206 index maintenance, 495 term-partitioned, 208 problems with dual writes, 452, 491 updating, transaction isolation and, 231 secondary sorts, 405 sed (Unix tool), 392 self-describing files, 127 self-joins, 480 self-validating systems, 530 semantic web, 57 semi-synchronous replication, 154 sequence number ordering, 343-348 generators, 294, 344 insufficiency for enforcing constraints, 347 Lamport timestamps, 345 use of timestamps, 291, 295, 345 sequential consistency, 351 serializability, 225, 233, 251-266, 557 linearizability versus, 329 pessimistic versus optimistic concurrency control, 261 serial execution, 252-256 partitioning, 255 using stored procedures, 253, 349 serializable snapshot isolation (SSI), 261-266 detecting stale MVCC reads, 263 detecting writes that affect prior reads, 264 distributed execution, 265, 364 performance of SSI, 265 preventing write skew, 262-265 two-phase locking (2PL), 257-261 index-range locks, 260 performance, 258 Serializable (Java), 113 Index | 583 serialization, 113 (see also encoding) service discovery, 135, 214, 372 using DNS, 216, 372 service level agreements (SLAs), 15 service-oriented architecture (SOA), 132 (see also services) services, 131-136 microservices, 132 causal dependencies across services, 493 loose 
coupling, 502 relation to batch/stream processors, 389, 508 remote procedure calls (RPCs), 134-136 issues with, 134 similarity to databases, 132 web services, 132, 135 session windows (stream processing), 472 (see also windows) sessionization, 407 sharding (see partitioning) shared mode (locks), 258 shared-disk architecture, 146, 398 shared-memory architecture, 146 shared-nothing architecture, 17, 146-147, 557 (see also replication) distributed filesystems, 398 (see also distributed filesystems) partitioning, 199 use of network, 277 sharks biting undersea cables, 279 counting (example), 46-48 finding (example), 42 website about (example), 44 shredding (in relational model), 38 siblings (concurrent values), 190, 246 (see also conflicts) similarity search edit distance, 88 genome data, 63 k-nearest neighbors, 429 single-leader replication (see leader-based rep‐ lication) single-threaded execution, 243, 252 in batch processing, 406, 421, 426 in stream processing, 448, 463, 522 size-tiered compaction, 79 skew, 557 584 | Index clock skew, 291-294, 334 in transaction isolation read skew, 238, 266 write skew, 246-251, 262-265 (see also write skew) meanings of, 238 unbalanced workload, 201 compensating for, 205 due to celebrities, 205 for time-series data, 203 in batch processing, 407 slaves (see leader-based replication) sliding windows (stream processing), 472 (see also windows) sloppy quorums, 183 (see also quorums) lack of linearizability, 334 slowly changing dimension (data warehouses), 476 smearing (leap seconds adjustments), 290 snapshots (databases) causal consistency, 340 computing derived data, 500 in change data capture, 455 serializable snapshot isolation (SSI), 261-266, 329 setting up a new replica, 156 snapshot isolation and repeatable read, 237-242 implementing with MVCC, 239 indexes and MVCC, 241 visibility rules, 240 synchronized clocks for global snapshots, 294 snowflake schemas, 95 SOAP, 133 (see also services) evolvability, 136 software bugs, 8 
maintaining integrity, 529 solid state drives (SSDs) access patterns, 84 detecting corruption, 519, 530 faults in, 227 sequential write throughput, 75 Solr (search server) building indexes in batch processes, 411 document-partitioned indexes, 207 request routing, 216 usage example, 4 use of Lucene, 79 sort (Unix tool), 392, 394, 395 sort-merge joins (MapReduce), 405 Sorted String Tables (see SSTables) sorting sort order in column storage, 99 source of truth (see systems of record) Spanner (database) data locality, 41 snapshot isolation using clocks, 295 TrueTime API, 294 Spark (processing framework), 421-423 bytecode generation, 428 dataflow APIs, 427 fault tolerance, 422 for data warehouses, 93 GraphX API (graph processing), 425 machine learning, 428 query optimizer, 427 Spark Streaming, 466 microbatching, 477 stream processing on top of batch process‐ ing, 495 SPARQL (query language), 59 spatial algorithms, 429 split brain, 158, 557 in consensus algorithms, 352, 367 preventing, 322, 333 using fencing tokens to avoid, 302-304 spreadsheets, dataflow programming capabili‐ ties, 504 SQL (Structured Query Language), 21, 28, 43 advantages and limitations of, 416 distributed query execution, 48 graph queries in, 53 isolation levels standard, issues with, 242 query execution on Hadoop, 416 résumé (example), 30 SQL injection vulnerability, 305 SQL on Hadoop, 93 statement-based replication, 158 stored procedures, 255 SQL Server (database) data warehousing support, 93 distributed transaction support, 361 leader-based replication, 153 preventing lost updates, 245 preventing write skew, 248, 257 read committed isolation, 236 recursive query support, 54 serializable isolation, 257 snapshot isolation support, 239 T-SQL language, 255 XML support, 30 SQLstream (stream analytics), 466 SSDs (see solid state drives) SSTables (storage format), 76-79 advantages over hash indexes, 76 concatenated index, 204 constructing and maintaining, 78 making LSM-Tree from, 78 staleness (old data), 
162 cross-channel timing dependencies, 331 in leaderless databases, 178 in multi-version concurrency control, 263 monitoring for, 182 of client state, 512 versus linearizability, 324 versus timeliness, 524 standbys (see leader-based replication) star replication topologies, 175 star schemas, 93-95 similarity to event sourcing, 458 Star Wars analogy (event time versus process‐ ing time), 469 state derived from log of immutable events, 459 deriving current state from the event log, 458 interplay between state changes and appli‐ cation code, 507 maintaining derived state, 495 maintenance by stream processor in streamstream joins, 473 observing derived state, 509-515 rebuilding after stream processor failure, 478 separation of application code and, 505 state machine replication, 349, 452 statement-based replication, 158 statically typed languages analogy to schema-on-write, 40 code generation and, 127 statistical and numerical algorithms, 428 StatsD (metrics aggregator), 442 stdin, stdout, 395, 396 Stellar (cryptocurrency), 532 Index | 585 stock market feeds, 442 STONITH (Shoot The Other Node In The Head), 158 stop-the-world (see garbage collection) storage composing data storage technologies, 499-504 diversity of, in MapReduce, 415 Storage Area Network (SAN), 146, 398 storage engines, 69-104 column-oriented, 95-101 column compression, 97-99 defined, 96 distinction between column families and, 99 Parquet, 96, 131 sort order in, 99-100 writing to, 101 comparing requirements for transaction processing and analytics, 90-96 in-memory storage, 88 durability, 227 row-oriented, 70-90 B-trees, 79-83 comparing B-trees and LSM-trees, 83-85 defined, 96 log-structured, 72-79 stored procedures, 161, 253-255, 557 and total order broadcast, 349 pros and cons of, 255 similarity to stream processors, 505 Storm (stream processor), 466 distributed RPC, 468, 514 Trident state handling, 478 straggler events, 470, 498 stream processing, 464-481, 557 accessing external services within job, 
474, 477, 478, 517 combining with batch processing lambda architecture, 497 unifying technologies, 498 comparison to batch processing, 464 complex event processing (CEP), 465 fault tolerance, 476-479 atomic commit, 477 idempotence, 478 microbatching and checkpointing, 477 rebuilding state after a failure, 478 for data integration, 494-498 586 | Index maintaining derived state, 495 maintenance of materialized views, 467 messaging systems (see messaging systems) reasoning about time, 468-472 event time versus processing time, 469, 477, 498 knowing when window is ready, 470 types of windows, 472 relation to databases (see streams) relation to services, 508 search on streams, 467 single-threaded execution, 448, 463 stream analytics, 466 stream joins, 472-476 stream-stream join, 473 stream-table join, 473 table-table join, 474 time-dependence of, 475 streams, 440-451 end-to-end, pushing events to clients, 512 messaging systems (see messaging systems) processing (see stream processing) relation to databases, 451-464 (see also changelogs) API support for change streams, 456 change data capture, 454-457 derivative of state by time, 460 event sourcing, 457-459 keeping systems in sync, 452-453 philosophy of immutable events, 459-464 topics, 440 strict serializability, 329 strong consistency (see linearizability) strong one-copy serializability, 329 subjects, predicates, and objects (in triplestores), 55 subscribers (message streams), 440 (see also consumers) supercomputers, 275 surveillance, 537 (see also privacy) Swagger (service definition format), 133 swapping to disk (see virtual memory) synchronous networks, 285, 557 comparison to asynchronous networks, 284 formal model, 307 synchronous replication, 154, 557 chain replication, 155 conflict detection, 172 system models, 300, 306-310 assumptions in, 528 correctness of algorithms, 308 mapping to the real world, 309 safety and liveness, 308 systems of record, 386, 557 change data capture, 454, 491 treating event log as, 460 
systems thinking, 536 T t-digest (algorithm), 16 table-table joins, 474 Tableau (data visualization software), 416 tail (Unix tool), 447 tail vertex (property graphs), 51 Tajo (query engine), 93 Tandem NonStop SQL (database), 200 TCP (Transmission Control Protocol), 277 comparison to circuit switching, 285 comparison to UDP, 283 connection failures, 280 flow control, 282, 441 packet checksums, 306, 519, 529 reliability and duplicate suppression, 517 retransmission timeouts, 284 use for transaction sessions, 229 telemetry (see monitoring) Teradata (database), 93, 200 term-partitioned indexes, 208, 217 termination (consensus), 365 Terrapin (database), 413 Tez (dataflow engine), 421-423 fault tolerance, 422 support by higher-level tools, 427 thrashing (out of memory), 297 threads (concurrency) actor model, 138, 468 (see also message-passing) atomic operations, 223 background threads, 73, 85 execution pauses, 286, 296-298 memory barriers, 338 preemption, 298 single (see single-threaded execution) three-phase commit, 359 Thrift (data format), 117-121 BinaryProtocol, 118 CompactProtocol, 119 field tags and schema evolution, 120 throughput, 13, 390 TIBCO, 137 Enterprise Message Service, 444 StreamBase (stream analytics), 466 time concurrency and, 187 cross-channel timing dependencies, 331 in distributed systems, 287-299 (see also clocks) clock synchronization and accuracy, 289 relying on synchronized clocks, 291-295 process pauses, 295-299 reasoning about, in stream processors, 468-472 event time versus processing time, 469, 477, 498 knowing when window is ready, 470 timestamp of events, 471 types of windows, 472 system models for distributed systems, 307 time-dependence in stream joins, 475 time-of-day clocks, 288 timeliness, 524 coordination-avoiding data systems, 528 correctness of dataflow systems, 525 timeouts, 279, 557 dynamic configuration of, 284 for failover, 158 length of, 281 timestamps, 343 assigning to events in stream processing, 471 for read-after-write 
consistency, 163 for transaction ordering, 295 insufficiency for enforcing constraints, 347 key range partitioning by, 203 Lamport, 345 logical, 494 ordering events, 291, 345 Titan (database), 50 tombstones, 74, 191, 456 topics (messaging), 137, 440 total order, 341, 557 limits of, 493 sequence numbers or timestamps, 344 total order broadcast, 348-352, 493, 522 consensus algorithms and, 366-368 Index | 587 implementation in ZooKeeper and etcd, 370 implementing with linearizable storage, 351 using, 349 using to implement linearizable storage, 350 tracking behavioral data, 536 (see also privacy) transaction coordinator (see coordinator) transaction manager (see coordinator) transaction processing, 28, 90-95 comparison to analytics, 91 comparison to data warehousing, 93 transactions, 221-267, 558 ACID properties of, 223 atomicity, 223 consistency, 224 durability, 226 isolation, 225 compensating (see compensating transac‐ tions) concept of, 222 distributed transactions, 352-364 avoiding, 492, 502, 521-528 failure amplification, 364, 495 in doubt/uncertain status, 358, 362 two-phase commit, 354-359 use of, 360-361 XA transactions, 361-364 OLTP versus analytics queries, 411 purpose of, 222 serializability, 251-266 actual serial execution, 252-256 pessimistic versus optimistic concur‐ rency control, 261 serializable snapshot isolation (SSI), 261-266 two-phase locking (2PL), 257-261 single-object and multi-object, 228-232 handling errors and aborts, 231 need for multi-object transactions, 231 single-object writes, 230 snapshot isolation (see snapshots) weak isolation levels, 233-251 preventing lost updates, 242-246 read committed, 234-238 transitive closure (graph algorithm), 424 trie (data structure), 88 triggers (databases), 161, 441 implementing change data capture, 455 implementing replication, 161 588 | Index triple-stores, 55-59 SPARQL query language, 59 tumbling windows (stream processing), 472 (see also windows) in microbatching, 477 tuple spaces (programming 
model), 507 Turtle (RDF data format), 56 Twitter constructing home timelines (example), 11, 462, 474, 511 DistributedLog (event log), 448 Finagle (RPC framework), 135 Snowflake (sequence number generator), 294 Summingbird (processing library), 497 two-phase commit (2PC), 353, 355-359, 558 confusion with two-phase locking, 356 coordinator failure, 358 coordinator recovery, 363 how it works, 357 issues in practice, 363 performance cost, 360 transactions holding locks, 362 two-phase locking (2PL), 257-261, 329, 558 confusion with two-phase commit, 356 index-range locks, 260 performance of, 258 type checking, dynamic versus static, 40 U UDP (User Datagram Protocol) comparison to TCP, 283 multicast, 442 unbounded datasets, 439, 558 (see also streams) unbounded delays, 558 in networks, 282 process pauses, 296 unbundling databases, 499-515 composing data storage technologies, 499-504 federation versus unbundling, 501 need for high-level language, 503 designing applications around dataflow, 504-509 observing derived state, 509-515 materialized views and caching, 510 multi-partition data processing, 514 pushing state changes to clients, 512 uncertain (transaction status) (see in doubt) uniform consensus, 365 (see also consensus) uniform interfaces, 395 union type (in Avro), 125 uniq (Unix tool), 392 uniqueness constraints asynchronously checked, 526 requiring consensus, 521 requiring linearizability, 330 uniqueness in log-based messaging, 522 Unix philosophy, 394-397 command-line batch processing, 391-394 Unix pipes versus dataflow engines, 423 comparison to Hadoop, 413-414 comparison to relational databases, 499, 501 comparison to stream processing, 464 composability and uniform interfaces, 395 loose coupling, 396 pipes, 394 relation to Hadoop, 499 UPDATE statement (SQL), 40 updates preventing lost updates, 242-246 atomic write operations, 243 automatically detecting lost updates, 245 compare-and-set operations, 245 conflict resolution and replication, 246 using explicit 
locking, 244 preventing write skew, 246-251 V validity (consensus), 365 vBuckets (partitioning), 199 vector clocks, 191 (see also version vectors) vectorized processing, 99, 428 verification, 528-533 avoiding blind trust, 530 culture of, 530 designing for auditability, 531 end-to-end integrity checks, 531 tools for auditable data systems, 532 version control systems, reliance on immutable data, 463 version vectors, 177, 191 capturing causal dependencies, 343 versus vector clocks, 191 Vertica (database), 93 handling writes, 101 replicas using different sort orders, 100 vertical scaling (see scaling up) vertices (in graphs), 49 property graph model, 50 Viewstamped Replication (consensus algo‐ rithm), 366 view number, 368 virtual machines, 146 (see also cloud computing) context switches, 297 network performance, 282 noisy neighbors, 284 reliability in cloud services, 8 virtualized clocks in, 290 virtual memory process pauses due to page faults, 14, 297 versus memory management by databases, 89 VisiCalc (spreadsheets), 504 vnodes (partitioning), 199 Voice over IP (VoIP), 283 Voldemort (database) building read-only stores in batch processes, 413 hash partitioning, 203-204, 211 leaderless replication, 177 multi-datacenter support, 184 rebalancing, 213 reliance on read repair, 179 sloppy quorums, 184 VoltDB (database) cross-partition serializability, 256 deterministic stored procedures, 255 in-memory storage, 89 output streams, 456 secondary indexes, 207 serial execution of transactions, 253 statement-based replication, 159, 479 transactions in stream processing, 477 W WAL (write-ahead log), 82 web services (see services) Web Services Description Language (WSDL), 133 webhooks, 443 webMethods (messaging), 137 WebSocket (protocol), 512 Index | 589 windows (stream processing), 466, 468-472 infinite windows for changelogs, 467, 474 knowing when all events have arrived, 470 stream joins within a window, 473 types of windows, 472 winners (conflict resolution), 173 WITH 
RECURSIVE syntax (SQL), 54 workflows (MapReduce), 402 outputs, 411-414 key-value stores, 412 search indexes, 411 with map-side joins, 410 working set, 393 write amplification, 84 write path (derived data), 509 write skew (transaction isolation), 246-251 characterizing, 246-251, 262 examples of, 247, 249 materializing conflicts, 251 occurrence in practice, 529 phantoms, 250 preventing in snapshot isolation, 262-265 in two-phase locking, 259-261 options for, 248 write-ahead log (WAL), 82, 159 writes (database) atomic write operations, 243 detecting writes affecting prior reads, 264 preventing dirty writes with read commit‐ ted, 235 WS-* framework, 133 (see also services) WS-AtomicTransaction (2PC), 355 590 | Index X XA transactions, 355, 361-364 heuristic decisions, 363 limitations of, 363 xargs (Unix tool), 392, 396 XML binary variants, 115 encoding RDF data, 57 for application data, issues with, 114 in relational databases, 30, 41 XSL/XPath, 45 Y Yahoo!
Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems
by
Martin Kleppmann
Published 16 Mar 2017
In most server-side data systems, the cost of deploying Byzantine fault-tolerant solutions makes them impractical. Web applications do need to expect arbitrary and malicious behavior of clients that are under end-user control, such as web browsers. This is why input validation, sanitization, and output escaping are so important: to prevent SQL injection and cross-site scripting, for example. However, we typically don’t use Byzantine fault-tolerant protocols here, but simply make the server the authority on deciding what client behavior is and isn’t allowed. In peer-to-peer networks, where there is no such central authority, Byzantine fault tolerance is more relevant.
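As a rough illustration of the server acting as the authority over untrusted client input, the sketch below contrasts a query built by string concatenation with a parameterized one, and escapes a value before placing it in HTML. It is a minimal sketch, not taken from the book: the `users` table and the function names are hypothetical, and it assumes Python's standard `sqlite3` and `html` modules.

```python
import html
import sqlite3


def find_user_unsafe(conn, name):
    # Vulnerable: the client-supplied value becomes part of the SQL text.
    query = "SELECT id, name FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # The value is passed as a bound parameter, so it is never parsed as SQL.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()


def render_greeting(name):
    # Output escaping: markup characters in the value are neutralized
    # before the value is embedded in an HTML page.
    return "<p>Hello, " + html.escape(name) + "</p>"


# Hypothetical in-memory table used only for this demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

malicious = "x' OR '1'='1"
unsafe_rows = find_user_unsafe(conn, malicious)  # condition is always true
safe_rows = find_user_safe(conn, malicious)      # no user has this literal name
```

With the concatenated query, the injected `OR '1'='1'` clause matches every row; with the bound parameter, the same input is compared as an ordinary string and matches nothing.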
…
setting up a new replica, Setting Up New Followers snapshot isolation and repeatable read, Snapshot Isolation and Repeatable Read-Repeatable read and naming confusion implementing with MVCC, Implementing snapshot isolation indexes and MVCC, Indexes and snapshot isolation visibility rules, Visibility rules for observing a consistent snapshot synchronized clocks for global snapshots, Synchronized clocks for global snapshots snowflake schemas, Stars and Snowflakes: Schemas for Analytics SOAP, Web services (see also services) evolvability, Data encoding and evolution for RPC software bugs, Software Errors maintaining integrity, Maintaining integrity in the face of software bugs solid state drives (SSDs) access patterns, Advantages of LSM-trees detecting corruption, The end-to-end argument, Don’t just blindly trust what they promise faults in, Durability sequential write throughput, Hash Indexes Solr (search server) building indexes in batch processes, Building search indexes document-partitioned indexes, Partitioning Secondary Indexes by Document request routing, Request Routing usage example, Thinking About Data Systems use of Lucene, Making an LSM-tree out of SSTables sort (Unix tool), Simple Log Analysis, Sorting versus in-memory aggregation, The Unix Philosophy sort-merge joins (MapReduce), Sort-merge joins Sorted String Tables (see SSTables) sorting sort order in column storage, Sort Order in Column Storage source of truth (see systems of record) Spanner (database) data locality, Data locality for queries snapshot isolation using clocks, Synchronized clocks for global snapshots TrueTime API, Clock readings have a confidence interval Spark (processing framework), Dataflow engines-Discussion of materialization bytecode generation, The move toward declarative query languages dataflow APIs, High-Level APIs and Languages fault tolerance, Fault tolerance for data warehouses, The divergence between OLTP databases and data warehouses GraphX API (graph processing), The Pregel
processing model machine learning, Specialization for different domains query optimizer, The move toward declarative query languages Spark Streaming, Stream analytics microbatching, Microbatching and checkpointing stream processing on top of batch processing, Batch and Stream Processing SPARQL (query language), The SPARQL query language spatial algorithms, Specialization for different domains split brain, Leader failure: Failover, Glossary in consensus algorithms, Distributed Transactions and Consensus, Single-leader replication and consensus preventing, Consistency and Consensus, Implementing Linearizable Systems using fencing tokens to avoid, The leader and the lock-Fencing tokens spreadsheets, dataflow programming capabilities, Designing Applications Around Dataflow SQL (Structured Query Language), Simplicity: Managing Complexity, Relational Model Versus Document Model, Query Languages for Data advantages and limitations of, Diversity of processing models distributed query execution, MapReduce Querying graph queries in, Graph Queries in SQL isolation levels standard, issues with, Repeatable read and naming confusion query execution on Hadoop, Diversity of processing models résumé (example), The Object-Relational Mismatch SQL injection vulnerability, Byzantine Faults SQL on Hadoop, The divergence between OLTP databases and data warehouses statement-based replication, Statement-based replication stored procedures, Pros and cons of stored procedures SQL Server (database) data warehousing support, The divergence between OLTP databases and data warehouses distributed transaction support, XA transactions leader-based replication, Leaders and Followers preventing lost updates, Automatically detecting lost updates preventing write skew, Characterizing write skew, Implementation of two-phase locking read committed isolation, Implementing read committed recursive query support, Graph Queries in SQL serializable isolation, Implementation of two-phase locking snapshot isolation
support, Snapshot Isolation and Repeatable Read T-SQL language, Pros and cons of stored procedures XML support, The Object-Relational Mismatch SQLstream (stream analytics), Complex event processing SSDs (see solid state drives) SSTables (storage format), SSTables and LSM-Trees-Performance optimizations advantages over hash indexes, SSTables and LSM-Trees concatenated index, Partitioning by Hash of Key constructing and maintaining, Constructing and maintaining SSTables making LSM-Tree from, Making an LSM-tree out of SSTables staleness (old data), Reading Your Own Writes cross-channel timing dependencies, Cross-channel timing dependencies in leaderless databases, Writing to the Database When a Node Is Down in multi-version concurrency control, Detecting stale MVCC reads monitoring for, Monitoring staleness of client state, Pushing state changes to clients versus linearizability, Linearizability versus timeliness, Timeliness and Integrity standbys (see leader-based replication) star replication topologies, Multi-Leader Replication Topologies star schemas, Stars and Snowflakes: Schemas for Analytics-Stars and Snowflakes: Schemas for Analytics similarity to event sourcing, Event Sourcing Star Wars analogy (event time versus processing time), Event time versus processing time state derived from log of immutable events, State, Streams, and Immutability deriving current state from the event log, Deriving current state from the event log interplay between state changes and application code, Dataflow: Interplay between state changes and application code maintaining derived state, Maintaining derived state maintenance by stream processor in stream-stream joins, Stream-stream join (window join) observing derived state, Observing Derived State-Multi-partition data processing rebuilding after stream processor failure, Rebuilding state after a failure separation of application code and, Separation of application code and state state machine replication, Using total order broadcast,
Databases and Streams statement-based replication, Statement-based replication statically typed languagesanalogy to schema-on-write, Schema flexibility in the document model code generation and, Code generation and dynamically typed languages statistical and numerical algorithms, Specialization for different domains StatsD (metrics aggregator), Direct messaging from producers to consumers stdin, stdout, A uniform interface, Separation of logic and wiring Stellar (cryptocurrency), Tools for auditable data systems stock market feeds, Direct messaging from producers to consumers STONITH (Shoot The Other Node In The Head), Leader failure: Failover stop-the-world (see garbage collection) storagecomposing data storage technologies, Composing Data Storage Technologies-What’s missing?
Heart of the Machine: Our Future in a World of Artificial Emotional Intelligence
by
Richard Yonck
Published 7 Mar 2017
Its developers will certainly do what they can to make their work and devices user-friendly, but beyond this there will be the hackers, the entrepreneurs, the DIY innovators who will seek to unravel the mysteries of the technology and in doing so bestow far more of its awesome power upon anyone who wants it, including the technically unskilled. It sounds ridiculous, but this is exactly what we’ve seen in recent years as hackers have made what was once hard-won knowledge and skill available to all at very affordable prices. Distributed denial of service (DDOS) attacks, SQL injections, brute force password cracking, botnet services, and zero-day exploits are all hacking methods that once required sophisticated expertise to perform. Today anyone with money and an Internet connection can access the “Dark Web” and find these tools available for purchase—complete with user-friendly interfaces.
Map Scripting 101: An Example-Driven Guide to Building Interactive Maps With Bing, Yahoo!, and Google Maps
by
Adam Duvander
Published 14 Aug 2010
If you're going to use the value within a database, make sure you verify the data is good. If you're expecting a postal code, make sure the data is in the correct format. If you have an address, watch out for strange characters that don't belong in addresses (semicolons come to mind). You don't want to be the victim of a SQL injection attack, where user input is co-opted to create a hazardous query. If you're going to use the input elsewhere, such as in #13: Geocode with an HTTP Web Service, you might be able to rely on its security. But most of the time, do as much as you can on your side to ensure data integrity.
Solr 1.4 Enterprise Search Server
by
David Smiley
and
Eric Pugh
Published 15 Nov 2009
• Do not affect the scores of matched documents (nor would you want them to).
• Are easier to apply rather than modifying the user's query, which is error prone. Making a mistake could even expose data you are trying to hide (similar in spirit to SQL injection attacks).
• Clarify the logs, which show what the user queried for without it being confused with the filters.
In general, raw user query text doesn't wind up being part of a filter-query. Instead, the filters are usually known by your application in advance. Although it wouldn't necessarily be a problem for user query text to become a filter, there may be scalability issues if many unique filter queries end up being performed that don't get re-used and so consume needless memory.
Apache Solr 3 Enterprise Search Server
by
Unknown
Published 13 Jan 2012
stream.file=/Users/epugh/.ssh/authorized_keys If you have this turned on, then make sure that you are monitoring the log files, and also that access to Solr is tightly controlled. The example application has this function turned on by default. In addition, in a production environment, you want to comment out the /debug/dump request handler, unless you are actively debugging an issue. Just as you need to be wary of a SQL injection attack for a relational database, there is a similar concern for Solr. Solr should not be exposed to untrusted clients if you are concerned about the risk of a denial of service attack. This is also a concern if you are lax in how your application acts as a broker to Solr. It's fairly easy to bring down Solr by, say, asking it to sort by every field in the schema, which would result in sudden exorbitant memory usage.
Hacker, Hoaxer, Whistleblower, Spy: The Story of Anonymous
by
Gabriella Coleman
Published 4 Nov 2014
You have accomplished nothing except inflaming ‘cyberwar’ rhetoric and fueling legislation that will end up with hackers getting 50 years in prison. The most retarded part is that you dont even realize you are the cause of the very thing you hate; Every time you DDoS a company Prolexic or DOSarrest sign up a new customer. Every time you SQL inject some irrelevant site a pentesting company gets a new contract. Every time you declare cyberwar on the government federal contractors get drowned in grant money. Other hackers and netizens also accused Anonymous of fortifying the cyberwar industrial complex. But it’s worth noting that long before Anonymous came to prominence, national governments around the world already aspired to control the Internet and were already developing statutes that eroded individual rights and privacies.
The Joy of Clojure
by
Michael Fogus
and
Chris Houser
Published 28 Nov 2010
[5]] Note that some words such as FROM and ON are taken directly from the input expression, whereas others such as ~max and AND are treated specially. The max that was given the value 5 when the query was called is extracted from the literal SQL string and provided in a separate vector, perfect for using in a prepared query in a way that will guard against SQL-injection attacks. The AND form was converted from the prefix notation of Clojure to the infix notation required by SQL. Listing 1.1. A domain-specific language for embedding SQL queries in Clojure But the point here isn’t that this is a particularly good SQL DSL—more complete ones are available.[4] Our point is that once you have the skill to easily create a DSL like this, you’ll recognize opportunities to define your own that solve much narrower, application-specific problems than SQL does.
Advanced Software Testing—Vol. 3, 2nd Edition
by
Jamie L. Mitchell
and
Rex Black
Published 15 Feb 2015
And every tester at every level of test had better consider security issues during analysis and design of their tests. If every person on the project does not take ownership of security, our systems are just not going to be secure. 4.2.1.1 Piracy There are a lot of ways that an intruder may get unauthorized access to data. SQL injection is a hacker technique that causes a system to run an SQL query in a case where it is not expected. Buffer overflow bugs, which we will discuss in the next section, may allow this, but so might intercepting an authorized SQL statement that is going to be sent to a web server and modifying it. For example, a query is sent to the server to populate a certain page, but a hacker modifies the underlying SQL to get other data back.
Programming Android
by
Zigurd Mednieks
,
Laird Dornin
,
G. Blake Meike
and
Masumi Nakamura
Published 15 Jul 2011
Executes the SQL command by passing the SQL template string and the bind arguments to execSQL. Using an SQL template and bind arguments is much preferred over building up the SQL statement, complete with parameters, into a String or StringBuilder. By using a template with parameters, you protect your application from SQL injection attacks. These attacks occur when a malicious user enters information into a form that is deliberately meant to modify the database in a way that was not intended by the developer. Intruders normally do this by ending the current SQL command prematurely, using SQL syntax characters, and then adding new SQL commands directly in the form field.
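As a rough illustration of the difference, here is a minimal Python sketch that uses the standard-library sqlite3 module in place of Android's execSQL. The users table, the sample credentials, and both function names are hypothetical and exist only for this example.

```python
import sqlite3

# Hypothetical in-memory table used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES (?, ?)", ("alice", "s3cret"))


def login_unsafe(name, password):
    # Vulnerable: user input is spliced into the SQL text itself.
    sql = ("SELECT COUNT(*) FROM users WHERE name = '" + name +
           "' AND password = '" + password + "'")
    return conn.execute(sql).fetchone()[0] > 0


def login_safe(name, password):
    # Template plus bind arguments: input is only ever treated as data.
    sql = "SELECT COUNT(*) FROM users WHERE name = ? AND password = ?"
    return conn.execute(sql, (name, password)).fetchone()[0] > 0


# Closes the string literal early, adds a condition that is always
# true, and comments out the rest of the statement.
payload = "' OR 1=1 --"
```

With the concatenated version the payload rewrites the WHERE clause so that every row matches; with bind arguments the same text is simply compared against the name column and matches nothing.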
Daemon
by
Daniel Suarez
Published 1 Dec 2006
Sobol wanted him to get this far—that’s what this was all about. Gragg double-clicked on the file. A plain white Web page appeared in a browser window. It had logon and password text boxes and a SUBMIT button—nothing more. There were options here. Unicode directory traversal? Gragg smiled. Logon. Sobol was encouraging him. This had all the earmarks of an SQL-injection attack, and he had a favorite one. In the logon and password boxes he entered: ‘or 1=1-- He clicked the SUBMIT button. After a moment’s pause an animation appeared with the words “Logon successful. Please wait….” Gragg felt a rush of endorphins. He’d just received high praise from his new mentor.
Your Computer Is on Fire
by
Thomas S. Mullaney
,
Benjamin Peters
,
Mar Hicks
and
Kavita Philip
Published 9 Mar 2021
If, to use Lawrence Lessig’s famous analogy, “code is law,” Ken Thompson had the power to write and alter the digital constitution by personal fiat.[33] Thompson’s possession of power gave him the authorization to play (and to play irresponsibly) that the marginal Dalton gang lacked. It is worthwhile at this point to note again that the methodology used in the Thompson hack, although impressive, is more useful as a proof of the nonsecure nature of computing than it is in the building of practical attacks. Simpler techniques (buffer overrun attacks, SQL-injection attacks, and so forth) are used in most real-world viruses. And most authors of Trojan horse programs do not need to conceal the evidence of their Trojans left behind in the source code, because most programs are closed source. Nevertheless, the hack is troubling. We find ourselves gathered around the distributed campfire of our laptop screens and our glowing mobile phones.
The TypeScript Workshop: A Practical Guide to Confident, Effective TypeScript Programming
by
Ben Grynhaus
,
Jordan Hudgens
,
Rayon Hunte
,
Matthew Thomas Morgan
and
Wekoslav Stefanovski
Published 28 Jul 2021
This will be pretty simple since our table only has two columns: create = (payload: PromiseModel) => this.db.run("INSERT INTO promise (desc) VALUES (?);", payload.desc); This method takes an object of type PromiseModel as an argument, sends a prepared statement (a parameterized SQL statement that is safe from SQL injection attacks), and then returns RunResult, which contains some metadata about the operation that took place. Since the sqlite library ships with typings, we're able to infer the return type without needing to specify it. The return type in this case is Promise<ISqlite.RunResult<sqlite.Statement>>. We could paste all of that into our code, but it's much cleaner the way it is.
Python Cookbook
by
David Beazley
and
Brian K. Jones
Published 9 May 2013
Another extremely critical complication concerns the formation of SQL statement strings. You should never use Python string formatting operators (e.g., %) or the .format() method to create such strings. If the values provided to such formatting operators are derived from user input, this opens up your program to an SQL-injection attack (see http://xkcd.com/327). The special ? wildcard in queries instructs the database backend to use its own string substitution mechanism, which (hopefully) will do it safely. Sadly, there is some inconsistency across database backends with respect to the wildcard. Many modules use ? or %s, while others may use a different symbol, such as :0 or :1, to refer to parameters.
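To make the placeholder point concrete, the following sketch uses the standard-library sqlite3 module, which reports its placeholder style through the DB-API `paramstyle` attribute and also accepts named placeholders. The portfolio table and its rows are assumptions made for this example, not data from the book.

```python
import sqlite3

# Hypothetical table and rows, used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE portfolio (symbol TEXT, shares INTEGER, price REAL)")
conn.execute("INSERT INTO portfolio VALUES (?, ?, ?)", ("GOOG", 100, 490.1))
conn.execute("INSERT INTO portfolio VALUES (?, ?, ?)", ("AAPL", 50, 545.75))

# DB-API modules advertise their placeholder style in `paramstyle`.
style = sqlite3.paramstyle

min_price = 500

# Positional "?" placeholders: values are passed as a sequence.
qmark_rows = conn.execute(
    "SELECT symbol FROM portfolio WHERE price >= ?", (min_price,)
).fetchall()

# sqlite3 also accepts named placeholders, with values passed as a mapping.
named_rows = conn.execute(
    "SELECT symbol FROM portfolio WHERE price >= :min_price",
    {"min_price": min_price},
).fetchall()
```

Other database modules may report a different `paramstyle`, so checking that attribute (or the module's documentation) is a reasonable first step before writing queries against a new backend.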
Clojure Programming
by
Chas Emerick
,
Brian Carper
and
Christophe Grand
Published 15 Aug 2011
with-query-results supports parameterized queries, a common feature of SQL libraries where string query templates contain placeholders for query parameters that are provided separately and then interpolated by the database. Parameterized queries promote query reuse, which can increase the performance of queries that are run multiple times, and are also a boon to security compared to the dangerous equivalent: building queries via string concatenation, thereby opening the door to SQL injection attacks. Queries should be a vector where the first item is a string of SQL, and subsequent values correspond to each parameter placeholder in the string query. For example: (jdbc/with-connection db-spec (jdbc/with-query-results res ["SELECT * FROM authors WHERE id = ?" 2] (doall res))) ;= ({:id 2, :first_name "Christophe", :last_name "Grand"})
PostGIS in Action
by
Regina O. Obe
and
Leo S. Hsu
Published 2 May 2015
Note that you could revise the code to read the SRID from geometry_columns, but for views based on tables with constraints, this information might not be available in geometry_columns. The string_to_array PostgreSQL function is then used to convert columns to an array of elements, and then the quote_ident function is used on each element of the array, and they are reconcatenated back into a comma-separated list with string_agg. This is done to prevent an SQL injection attack. Next you build a parameterized SQL statement that uses the table and column names. The executed SQL returns a GeoJSON feature collection when parameters var_geo, var_srid, param_limit, and var_input_srid are provided and replaced in the corresponding slots ($1,$2,$3,$4) of the parameterized SQL statement.
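The split, quote, and rejoin step can be sketched outside the database as well. The Python below is only an approximation of the idea: `quote_ident` here is a hypothetical stand-in that always adds double quotes, whereas PostgreSQL's own quote_ident quotes only when needed, and `build_query` and its single `$1` slot are likewise assumptions for illustration.

```python
def quote_ident(name: str) -> str:
    # Hypothetical stand-in for PostgreSQL's quote_ident: always wraps the
    # identifier in double quotes and doubles any embedded double quote.
    return '"' + name.replace('"', '""') + '"'


def quote_column_list(columns: str) -> str:
    # Split on commas, quote each element, rejoin as a comma-separated list.
    parts = [c.strip() for c in columns.split(",")]
    return ", ".join(quote_ident(p) for p in parts)


def build_query(table: str, columns: str) -> str:
    # Identifiers are quoted into the SQL text; values stay as placeholders.
    return (f"SELECT {quote_column_list(columns)} "
            f"FROM {quote_ident(table)} LIMIT $1")
```

Identifiers cannot be passed as bind parameters, which is why they are quoted into the statement text while the values still travel separately through the numbered slots.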
PostGIS in Action, 2nd Edition
by
Regina O. Obe
and
Leo S. Hsu
Published 2 May 2015
Note that you could revise the code to read the SRID from geometry_columns, but for views based on tables with constraints, this information might not be available in geometry_columns. The string_to_array PostgreSQL function is then used to convert columns to an array of elements, and then the quote_ident function is used on each element of the array, and they are reconcatenated back into a comma-separated list with string_agg. This is done to prevent an SQL injection attack. Next you build a parameterized SQL statement that uses the table and column names. The executed SQL returns a GeoJSON feature collection when parameters var_geo, var_srid, param_limit, and var_input_srid are provided and replaced in the corresponding slots ($1,$2,$3,$4) of the parameterized SQL statement.
Real World Haskell
by
Bryan O'Sullivan
,
John Goerzen
and
Donald Stewart
Published 2 Dec 2008
If you care about the precise representation of data, you can still manually construct SqlValue data if you need to. Query Parameters HDBC, like most databases, supports a notion of replaceable parameters in queries. There are three primary benefits of using replaceable parameters: they prevent SQL injection attacks or trouble when the input contains quote characters, they improve performance when executing similar queries repeatedly, and they permit easy and portable insertion of data into queries. Let’s say you want to add thousands of rows into our new table test. You could issue queries that look like INSERT INTO test VALUES (0, 'zero') and INSERT INTO test VALUES (1, 'one').
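The same idea can be shown in Python rather than with HDBC. This sketch uses the standard-library sqlite3 module; the column names of the test table are assumed for the example, and the third row is added only to show a value containing a quote character.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column names are assumptions; the excerpt only names the table `test`.
conn.execute("CREATE TABLE test (id INTEGER, description TEXT)")

# One statement template, many parameter tuples. The apostrophe in
# "it's two" needs no manual escaping.
rows = [(0, "zero"), (1, "one"), (2, "it's two")]
conn.executemany("INSERT INTO test VALUES (?, ?)", rows)

stored = conn.execute(
    "SELECT id, description FROM test ORDER BY id"
).fetchall()
```

Reusing one template for every row covers the three benefits the excerpt lists: the values never become part of the SQL text, the statement does not have to be rebuilt per row, and the calling code stays the same regardless of what the data contains.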
Code Complete (Developer Best Practices)
by
Steve McConnell
Published 8 Jun 2004
Are all exceptions at the appropriate levels of abstraction for the routines that throw them? Does each exception include all relevant exception background information? Is the code free of empty catch blocks? (Or if an empty catch block truly is appropriate, is it documented?) Security Issues Does the code that checks for bad input data check for attempted buffer overflows, SQL injection, HTML injection, integer overflows, and other malicious inputs? Are all error-return codes checked? Are all exceptions caught? Do error messages avoid providing information that would help an attacker break into the system? Additional Resources cc2e.com/0875 Take a look at the following defensive-programming resources: Security Howard, Michael, and David LeBlanc.
Reamde
by
Neal Stephenson
Published 19 Sep 2011
“I’m working with a network security consultancy. You already know that. We got hired by a clothing store chain to do a pen test.” “What, their pens weren’t writing?” “Penetration test. Our job was to find ways of penetrating their corporate networks. We found that one part of their website was vulnerable to a SQL injection attack. By exploiting that, we were able to install a rootkit on one of their servers and then use that as a beachhead on their internal network to—to make a long story short—get root on the servers where they stored customer data and then prove that their credit card data was vulnerable.” “Sounds complicated.”