description: visual representation of database system relationships
81 results
Catalyst 5.8: The Perl MVC Framework
by
Antano Solar John
,
Jonathan Rockway
and
Solar John Antano
Published 15 Nov 2010
Installing SQLite The first step is to install SQLite from the SQLite website at http://sqlite.org/ (or through your distribution's package manager, the package is usually called "sqlite3"). You'll also need to install the Perl version of SQLite, called DBD::SQLite, and DBIx::Class itself as well as an adapter that will link our database schema definition to the Catalyst application. You can install these, except for SQLite, using cpan with the following: $ cpan -i DBD::SQLite DBIx::Class Catalyst::Model::DBIC::Schema Creating a database schema Once the dependencies are installed, we'll create a sample database with the sqlite3 command-line utility: $ sqlite3 tmp/database SQLite version 3.3.8 Enter ".help" for instructions sqlite> CREATE TABLE test (id INTEGER PRIMARY KEY, subject TEXT, message TEXT, date DATETIME); sqlite> INSERT INTO test (id, subject, message, date) VALUES(NULL, "test message", "this is a test message", "2005-01-01 00:00"); sqlite> INSERT INTO test (id, subject, message, date) VALUES(NULL, "another test message", "this is a another test message, hooray", "200501-01 00:01"); sqlite> SELECT * FROM test; 1|test message|this is a test message|2005-01-01 00:00 2|another test message|this is a another test message, hooray|2005-01-01 00:01 sqlite> .quit $ Here we created a test table in a database file called tmp/database and added two rows to it
…
The header, footer, messages, and stylesheets will all be handled automatically and will be easy to customize later. Database design The first real step will be to think about what kind of data we need to store and then design a database schema to efficiently store that data. To keep things simple (but realistic), let's set our specification as follows: • The address book should keep track of multiple addresses for a person • Each person can have a first and last name • Each address can have a street address, a phone number, and an e-mail address Translated into a relational database schema, that means we'll have a people table that will assign a unique identifier to each firstname and lastname pair. Then we'll have an addresses table that will allow each person to have multiple addresses, each consisting of a unique ID (so it's easy to reference later), a location ("Office", "Home"), a free-form postal address (so we don't have to worry about the complexity of having a city, country, state, postal code, and so on), a phone number, and an e-mail address.
…
See CPAN config method 68 configuration management 192 Content-type: header 157 Controller 182 Controller class 8 count_users function 142 COUNT function 117 CPAN about 10 Catalyst, installing from 12 disadvantages 12 modules, installing 41 CRUD 55 D Data::Page object 82 database installing 198 database, accessing ways Object-relational mapper (ORM) 35 database, skeleton application database model, creating 37 [ 223 ] database model, using 37-39 database schema, installing 36 SQLite, installing 36 database model creating 37, 39, 141, 142, 143 using 37, 39 database schema creating 36 data putting on web, ChatStat application _controversial function 126 action, adding 136 actions, adding to display ups or downs on page 139 actions, modifying 136, 138 all_things method 132 custom ResultSets method, writing 124 data, putting on web 121-123 everything query 131 everything ResultSet method 131 having attribute 127 highest_rated, custom ResultSets method 124, 125 index controller method, changing 134, 135 issues 128, 130 least_controversial, custom ResultSets method 124, 126 list_things macro, reusing 136 lowest_rated, custom ResultSets method 124, 125 macros, creating 121 macros, using 123 main page 120 main page, writing 122 most_controversial, custom ResultSets method 124, 126 one_thing method 132 person() links 134 person, macros 121 ResultSet, adding 130 root/src/thing.tt2, template 133, 134 score, macros 121 template root/src/pair.tt2, creating 121, 122 thing, macros 121 uri_for_action, using for link 130 data types 186 dbh_do method 144, 145 DBI::Class 69 dbh_do method 144 DBIC 35 DBIC schema ($self) 113 DBIx::Class 15.
Essential Sqlalchemy
by
Jason Myers
and
Rick Copeland
Published 27 Nov 2015
In addition to the generic types listed in Table 1-1, both SQL standard and vendor-specific types are available and are often used when a generic type will not operate as needed within the database schema due to its type or the specific type specified in an existing schema. A few good illustrations of this are the CHAR and NVARCHAR types, which benefit from using the proper SQL type instead of just the generic type. If we are working with a database schema that was defined prior to using SQLAlchemy, we would want to match types exactly. It’s important to keep in mind that SQL standard type behavior and availability can vary from database to database.
…
These modes can be used separately or together depending on your preference and the needs of your application. SQLAlchemy Core and the SQL Expression Language The SQL Expression Language is a Pythonic way of representing common SQL statements and expressions, and is only a mild abstraction from the typical SQL language. It is focused on the actual database schema; however, it is standardized in such a way that it provides a consistent language across a large number of backend databases. The SQL Expression Language also acts as the foundation for the SQLAlchemy ORM. ORM The SQLAlchemy ORM is similar to many other object relational mappers (ORMs) you may have encountered in other languages.
…
If we use hard references, such as cookies.c.cookie_id, in our ForeignKey definitions it will perform that resolution during module initialization and could fail depending on the order in which the tables are loaded. You can also define a ForeignKeyConstraint explicitly, which can be useful if trying to match an existing database schema so it can be used with SQLAlchemy. This works in the same way as before when we created keys, constraints, and indexes to match name schemes and so on. You will need to import the ForeignKeyConstraint from the sqlalchemy module prior to defining one in your table definition. The following code shows how to create the ForeignKeyConstraint for the order_id field between the line_items and orders table: ForeignKeyConstraint(['order_id'], ['orders.order_id']) Up until this point, we’ve been defining tables in such a way that SQLAlchemy can understand them.
Essential SQLAlchemy
by
Rick Copeland
Published 4 Jun 2008
Engines and MetaData This chapter introduces SQLAlchemy’s Engine and MetaData classes. The Engine class provides database connectivity, including a connection pool with various strategies for acquiring connections from the pool. The MetaData class maintains information about your database schema, including any tables and indexes defined. In this chapter, you will learn how to define a new database schema using MetaData as well as how to connect a MetaData instance to an existing schema. Engines and Connectables The SQLAlchemy-provided Engine class is responsible for managing the connection to the database. It does this by incorporating a database connection pool and a database-specific Dialect layer to translate the SQL expression language (Chapter 5) into database-specific SQL.
…
For instance, to retrieve all the users created in 2007, you would write: statement = user_table.select(and_( user_table.c.created >= date(2007,1,1), user_table.c.created < date(2008,1,1)) result = statement.execute() In order to use the SQL expression language, you need to provide SQLAlchemy with information about your database schema. For instance, if you are using the user table mentioned previously, your schema definition might be the following: metadata=MetaData('sqlite://') # use an in-memory SQLite database user_table = Table( 'tf_user', metadata, Column('id', Integer, primary_key=True), Column('user_name', Unicode(16), unique=True, nullable=False), Column('email_address', Unicode(255), unique=True, nullable=False), Column('password', Unicode(40), nullable=False), Column('first_name', Unicode(255), default=''), Column('last_name', Unicode(255), default=''), Column('created', DateTime, default=datetime.now)) If you would rather use an existing database schema definition, you still need to tell SQLAlchemy which tables you have, but SQLAlchemy can reflect the tables using the database server’s introspection capabilities.
…
For instance, if you are using the user table mentioned previously, your schema definition might be the following: metadata=MetaData('sqlite://') # use an in-memory SQLite database user_table = Table( 'tf_user', metadata, Column('id', Integer, primary_key=True), Column('user_name', Unicode(16), unique=True, nullable=False), Column('email_address', Unicode(255), unique=True, nullable=False), Column('password', Unicode(40), nullable=False), Column('first_name', Unicode(255), default=''), Column('last_name', Unicode(255), default=''), Column('created', DateTime, default=datetime.now)) If you would rather use an existing database schema definition, you still need to tell SQLAlchemy which tables you have, but SQLAlchemy can reflect the tables using the database server’s introspection capabilities. In this case, the schema definition reduces to the following: users_table = Table('users', metadata, autoload=True) Although the SQLAlchemy SQL expression language is quite powerful, it can still be tedious to manually specify the queries and updates necessary to work with your tables.
The Pragmatic Programmer
by
Andrew Hunt
and
Dave Thomas
Published 19 Oct 1999
With Perl you can manipulate text, interact with programs, talk over networks, drive Web pages, perform arbitrary precision arithmetic, and write programs that look like Snoopy swearing. Tip 28 Learn a Text Manipulation Language To show the wide-ranging applicability of text manipulation languages, here's a sample of some applications we've developed over the last few years. Database schema maintenance. A set of Perl scripts took a plain text file containing a database schema definition and from it generated: – The SQL statements to create the database – Flat data files to populate a data dictionary – C code libraries to access the database – Scripts to check database integrity – Web pages containing schema descriptions and diagrams – An XML version of the schema Java property access.
…
Of course, this scheme works only if you make the code generation part of the build process itself.[9] [9] Just how do you go about building code from a database schema? There are several ways. If the schema is held in a flat file (for example, as create table statements), then a relatively simple script can parse it and generate the source. Alternatively, if you use a tool to create the schema directly in the database, then you should be able to extract the information you need directly from the database's data dictionary. Perl provides libraries that give you access to most major databases. Figure 3.3. Active code generator creates code from a database schema Another example of melding environments using code generators happens when different programming languages are used in the same application.
…
This is a clear violation of the DRY principle. To correct this problem, we need to choose the authoritative source of information. This may be the specification, it may be a database schema tool, or it may be some third source altogether. Let's choose the specification document as the source. It's now our model for this process. We then need to find a way to export the information it contains as different views—a database schema and a high-level language record, for example.[10] [10] See It's Just a View, page 157, for more on models and views. If your document is stored as plain text with markup commands (using HTML, LATEX, or troff, for example), then you can use tools such as Perl to extract the schema and reformat it automatically.
Learning Android
by
Marko Gargenta
Published 11 Mar 2011
buttons, Programmatic User Interface, The StatusActivity Layout, Creating Your Application-Specific Object and Initialization Code, Update StatusActivity to Load the Menu, Add Menu Items, Toggle Service, Intent Service C C implementation, The APK, The Native Development Kit (NDK), Documentation and Standardized Headers, The JNI Header File, C Implementation Canvas, Custom Rose Widget classpaths, Adding the jtwitter.jar Library close() method, Refactoring Status Data coarse location permissions, The manifest file color, adding, Adding Color and Graphics, Adding Color Compass demo, Compass Demo, Custom Rose Widget compiling code, Compiling Code and Building Your Projects: Saving Files, Updating the Manifest File for Internet Permission comprehensive platform, Comprehensive, Yamba Project Overview content providers, Content Providers, Project Design, Creating a Content Provider, Getting the Data Type, Using Content Providers Through Widgets, Summary creating, Creating a Content Provider, Getting the Data Type overview, Content Providers, Project Design using through widgets, Using Content Providers Through Widgets, Summary CRUD principle, Content Providers cursors, Cursors, Creating the TimelineActivity Class, Querying Data custom permissions, Adding Custom Permissions to Send and Receive Broadcasts, Declaring Permissions in the Manifest File D .d() severity level, Logging in Android Dalvik, Dalvik databases, The Database, The Database Schema and Its Creation, Database Constraints, Creating the TimelineActivity Class constraints on, Database Constraints overview, The Database schema, creating, The Database Schema and Its Creation working with, Creating the TimelineActivity Class DbHelper, DbHelper DDMS, Setting Up Android Development Tools, LogCat from the Eclipse DDMS perspective, Exploring the Filesystem, The User Data Partition, Verify that the database was created debugging, Debugging Android Apps declarative user interface, Declarative User Interface delete() method, Four Major Operations, Creating a Content Provider design philosophy, Design Philosophy Destroyed State, Destroyed state development tools, Setting Up Android Development Tools distribution of applications, Application Distribution doInBackground() method, AsyncTask E .e() severity level, Logging in Android Eclipse Android Development Tools (ADT), Declarative User Interface Eclipse IDE, Installing Eclipse, Logging in Android, Using the WYSIWYG editor in Eclipse, Prefs Resource, PrefsActivity editing code in, Prefs Resource, PrefsActivity installing, Installing Eclipse Organize Imports tool, Logging in Android WYSIWYG Editor, Using the WYSIWYG editor in Eclipse emulators, The Emulator events, Other UI Events, Other UI Events execSQL() method, Four Major Operations, First Example F fetchStatusUpdates() method, Refactoring Status Data, Broadcasting Intents Fibonacci demo, An NDK Example: Fibonacci, Summary file system, Filesystem, The Filesystem Explained, Summary files, saving, Compiling Code and Building Your Projects: Saving Files fine location permissions, The manifest file format() method, Creating the TimelineActivity Class FrameLayout, FrameLayout G garbage collector, Creating the TimelineActivity Class Geocoder, Where Am I?
…
To use File Explorer in Eclipse, either open the DDMS perspective in the top-right corner of your Eclipse or go to Windows→Show View→Other…→Android→File Explorer. This will open the view of the filesystem of the device you are currently looking at. So far, you know that the database file is there, but don’t really know whether the database schema was created properly. The next section addresses that. Using sqlite3 Android ships with the command-line tool sqlite3. This tool gives you access to the database itself. To see whether your database schema was created properly: Open up your terminal or command-line window. Type adb shell to connect to your running emulator or physical phone. Change the directory to the location of your database file by typing cd /data/data/com.marakana.yamba/databases/.
…
startActivity() method, Update StatusActivity to Handle Menu Events, Initial App Setup, Pending intents Starting State, Starting state startManagingCursor() method, Creating the TimelineActivity Class, Querying Data startService() method, Update the Options Menu Handling, Pending intents Status activity, Updating the Status Activity status data, Looping in the Service, Update UpdaterService, Refactoring Status Data, Refactoring Status Data, Defining the URI, Implementing the YambaWidget class status updates, The StatusActivity Layout, Looping in the Service, The Database, Broadcast Receivers, The TimelineReceiver, Updating the Services to Enforce Permissions, Using Content Providers Through Widgets, Summary, Intent Service, Sending Notifications, Summary checking for new, Looping in the Service using IntentService to run, Intent Service notification of, The TimelineReceiver, Sending Notifications, Summary permissions for, Updating the Services to Enforce Permissions screen, The StatusActivity Layout sending, Broadcast Receivers storing locally, The Database widget for displaying, Using Content Providers Through Widgets, Summary status.xml, The StatusActivity Layout StatusActivity, The StatusActivity Layout, The StatusActivity Java Class, AsyncTask, Update StatusActivity to Load the Menu, Simplifying StatusActivity Stopped State, Stopped state stopService() method, Update the Options Menu Handling, The Network Receiver strings resource, Important Widget Properties Stub() method, Implementing the Service, Binding to the Remote Service subscribers, About Broadcast Receivers system services, Services, Project Design, Services, UpdaterService, Update the Manifest File, Add Menu Items, Testing the Service, Looping in the Service, Testing the Service, System Services, Compass Demo, Custom Rose Widget, Common Steps in Using System Services, Location Service, Updating the Status Activity, Intent Service, Intent Service, Sending Notifications adding and handling menus, Add Menu Items Android Services vs. native services, Services common steps in using, Common Steps in Using System Services Compass demo, Compass Demo, Custom Rose Widget creating, UpdaterService defining in manifest file, Update the Manifest File intents, Intent Service, Intent Service location service, Location Service, Updating the Status Activity looping in, Looping in the Service notification service, Sending Notifications overview, Services, System Services and project design, Project Design testing, Testing the Service, Testing the Service T TableLayout, TableLayout TableRow, TableLayout testing, Testing the Service, Testing the Service, Testing the Service, Testing That It All Works text property, Important Widget Properties TextWatcher, Other UI Events Thread.sleep() method, Looping in the Service, Looping in the Service threading, Threading in Android timeline activity example, TimelineActivity, Creating an Adapter in TimelineActivity.java, Toggle Service, The TimelineReceiver timeline adapter example, TimelineAdapter, ViewBinder: A Better Alternative to TimelineAdapter timeline receivers, The TimelineReceiver, The TimelineReceiver, Updating the Services to Enforce Permissions toggle service, Toggle Service Twitter, A Real-World Example, Broadcast Receivers, The Yamba Application, Creating Your Application-Specific Object and Initialization Code, Updating the Manifest File for Internet Permission, AsyncTask, Other UI Events, Other UI Events, Other UI Events, Shared Preferences, UpdaterService, Pulling Data from Twitter, Pulling Data from Twitter, The Database Schema and Its Creation, Update UpdaterService, Lists and Adapters 140-character counter, Other UI Events, Other UI Events creating compatible apps, Creating Your Application-Specific Object and Initialization Code, Updating the Manifest File for Internet Permission, AsyncTask, Other UI Events, Shared Preferences, Lists and Adapters example of app, A Real-World Example, Broadcast Receivers pulling data from, UpdaterService, Pulling Data from Twitter, Pulling Data from Twitter, The Database Schema and Its Creation, Update UpdaterService and Yamba, The Yamba Application U UI (user interface), Part 1: Android User Interface, Two Ways to Create a User Interface, Creating Your Application-Specific Object and Initialization Code, Optimizing the User Interface Android objects, Creating Your Application-Specific Object and Initialization Code optimizing, Optimizing the User Interface two ways to create, Two Ways to Create a User Interface Uniform Resource Identifier (URI), Defining the URI unmarshaling, The Android Interface Definition Language update() method, Four Major Operations, Creating a Content Provider updateAppWidget() method, Implementing the YambaWidget class UpdaterService, UpdaterService, Pulling Data from Twitter, Update UpdaterService, Update UpdaterService, BootReceiver, The TimelineReceiver, Broadcasting Intents, Updating the Services to Enforce Permissions updateStatus() method, Single Thread URI (Uniform Resource Identifier), Defining the URI user data partition, Filesystem Partitions, The User Data Partition user interface, Two Ways to Create a User Interface (see UI) user preferences, Preferences (see preferences) username, The User Data Partition V ViewBinder, ViewBinder: A Better Alternative to TimelineAdapter views and layouts, Views and Layouts viruses, What about viruses, malware, spyware, and other bad things?
The Productive Programmer
by
Neal Ford
Published 8 Dec 2008
Using an O/R mapper introduces repetition in projects where we have essentially the same information in three places: in the database schema, in an XML mapping document, and in a class file. This represents two violations of the DRY principle. The solution to this problem is to create a single representation and generate the other two. The first step is deciding who is the “official” holder of this knowledge. For example, if the database is the canonical source, generate the XML mapping and the corresponding class files. In this example, I used Groovy (the scripting language dialect of Java) to solve the mismatch. On the project that spawned this example, the developers had no control over the database schema. Thus, I decided that the database is the canonical representation of the data.
…
The Groovy script runs as part of the build process so that any time the database schema changes, it automatically generates the corresponding mapping file and the Java class to which it maps. Migrations Another situation where duplication creeps into projects also arises from the impedance mismatch between code and SQL. Lots of projects treat source code and SQL as completely separate artifacts, sometimes created by entirely separate groups of developers. Yet, for the source code to function correctly, it must rely on a certain version of the database schema and data. There are a couple of ways to solve this problem, one framework is specific and the other was designed to work across frameworks and languages.
…
NOTE Never keep two copies of the same thing (like code and a diagram describing it). If you are going to draw your diagrams first, use the tool to generate the code. If you create informal diagrams using the whiteboard approach and need more formal diagrams later, generate them from code. Otherwise, they will always get out of sync. Database Schemas Just like class diagrams, database schemas are a dangerous area for needless repetition. SchemaSpy§ is an open source tool that does for database entity/relationship diagrams what yDoc does for code. It attaches to the database and produces table information (including metadata) and relationship diagrams like the one shown in Figure 5-6
Deploying OpenStack
by
Ken Pepple
Published 26 Jul 2011
Database The database stores most of the configuration and run-time state for a cloud infrastructure. This includes the instance types that are available for use, instances in use, networks available, and projects. Table 4-4 details all the tables in the current Nova database scheme. Table 4-4. Nova Database Schema Table NameDescription migrate_version Stores current version of the database schema as well as other migration-related info. Only used internally and by developers during upgrades. migrations Used for running host-to-host migration. auth_tokens Maps Authorization tokens (for all API transactions) to actual users (via the user id field).
…
The sync subcommand will upgrade the database scheme for new versions of Nova and the version will report the current version. Note Nova uses a database abstraction library called SQL-Alchemy to interact with its database. A complimentary package called sqlalchemy-migrate is used to manage the database schema. Inspired by Ruby on Rails’ migrations feature, it provides a programmatic way to handle database schema changes. For Nova administrators, this only applies when they are upgrading versions. To upgrade scheme versions, use the nova-manage db sync. This should be rarely used unless you are installing from source or upgrading your installation.
…
First up is adding the network manager related options to /etc/nova.conf: --network_manager=nova.network.manager.FlatDHCPManager --fixed_range=10.0.0.0/8 --network_size=8 Now, let’s add our network into the database with the nova-manage command: # nova-manage network create 10.0.0.0/8 1 8 Finally, we will create the floating addresses using a range that will not overlap with the StackOps installation. # nova-manage floating create cactus 192.168.1.136/29 With all the /etc/nova.conf changes made, we can create the database schema by running the nova-manage db commands. First, we will sync the database (create the schema), then we will version the database to make sure that the schema creation succeeded. $ sudo nova-manage db sync $ sudo nova-manage db version 14 And we are done with the configuration. All that is left to do now is restart the Nova daemons to pick the new configuration options
Build Awesome Command-Line Applications in Ruby: Control Your Computer, Simplify Your Life
by
David B. Copeland
Published 6 Apr 2012
(Recall that we must use bundle exec since we are running out of our source tree for this example; users of todo will, of course, be able to run todo on its own, as discussed in Chapter 9, Be Easy to Maintain.) $ bundle exec bin/todo new Reading new tasks from stdin... Design database schema Get access to production logs Code Review Implement model objects for new scheme ^D $ bundle exec bin/todo done 3 $ bundle exec bin/todo list 1 - Design database schema Created: Sun Oct 30 12:53:11 -0400 2011 2 - Get access to production logs Created: Sun Oct 30 12:53:11 -0400 2011 3 - Code Review Created: Sun Oct 30 12:53:11 -0400 2011 Completed: Sun Oct 30 13:00:05 -0400 2011 4 - Implement model objects for new schema Created: Sun Oct 30 12:53:11 -0400 2011 Let’s enhance the output of list as follows: Show the task name in a brighter/bolder font (the ANSI escape sequences provide two version of each color: a normal color and a brighter/bolder version).
…
Since the location of the “home directory” is really just shorthand for “whatever directory is in ENV[’HOME’],” we now have test coverage without worrying that our personal task list will be touched. Let’s run our new test and make sure it passes. To prove that we aren’t touching our home directory, we’ll list our actual task list before and after running our feature. $ todo list 1 - Design database schema Created: Sun Oct 02 08:06:12 -0500 2011 2 - Get access to production logs Created: Sun Oct 02 08:06:12 -0500 2011 3 - Code Review Created: Sun Oct 02 08:06:12 -0500 2011 $ rake features Feature: We can add new tasks As a busy developer with a lot of things to do I want to keep a list of tasks I need to work on Scenario: The task list is in our home directory by default Given there is no task list in my home directory When I successfully run `todo new 'Some new todo item'` Then the task list should exist in my home directory When I successfully run `todo list` Then the stdout should contain "Some new todo item" 1 scenarios (1 passed) 5 steps (5 passed) 0m0.793s $ todo list 1 - Design database schema Created: Sun Oct 02 08:06:12 -0500 2011 2 - Get access to production logs Created: Sun Oct 02 08:06:12 -0500 2011 3 - Code Review Created: Sun Oct 02 08:06:12 -0500 2011 Our test passed, but our task list wasn’t modified—everything worked!
…
$ todo list 1 - Design database schema Created: Sun Oct 02 08:06:12 -0500 2011 2 - Get access to production logs Created: Sun Oct 02 08:06:12 -0500 2011 3 - Code Review Created: Sun Oct 02 08:06:12 -0500 2011 $ rake features Feature: We can add new tasks As a busy developer with a lot of things to do I want to keep a list of tasks I need to work on Scenario: The task list is in our home directory by default Given there is no task list in my home directory When I successfully run `todo new 'Some new todo item'` Then the task list should exist in my home directory When I successfully run `todo list` Then the stdout should contain "Some new todo item" 1 scenarios (1 passed) 5 steps (5 passed) 0m0.793s $ todo list 1 - Design database schema Created: Sun Oct 02 08:06:12 -0500 2011 2 - Get access to production logs Created: Sun Oct 02 08:06:12 -0500 2011 3 - Code Review Created: Sun Oct 02 08:06:12 -0500 2011 Our test passed, but our task list wasn’t modified—everything worked! Manipulating the environment is a great technique for testing behavior like this; the environment works as a “middleman” that allows us to change things (like the location of the user’s home directory) without affecting the code of our tests or our app.
Intrusion Detection With Snort, Apache, Mysql, Php, and Acid
by
Rafeeq Ur Rehman
Published 7 May 2003
The following command shows the structure of the iphdr table: mysql> describe iphdr; +----------+----------------------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +----------+----------------------+------+-----+---------+-------+ | sid | int(10) unsigned | | PRI | 0 | | | cid | int(10) unsigned | | PRI | 0 | | | ip_src | int(10) unsigned | | MUL | 0 | | | ip_dst | int(10) unsigned | | MUL | 0 | | | ip_ver | tinyint(3) unsigned | YES | | NULL | | | ip_hlen | tinyint(3) unsigned | YES | | NULL | | | ip_tos | tinyint(3) unsigned | YES | | NULL | | | ip_len | smallint(5) unsigned | YES | | NULL | | | ip_id | smallint(5) unsigned | YES | | NULL | | | ip_flags | tinyint(3) unsigned | YES | | NULL | | | ip_off | smallint(5) unsigned | YES | | NULL | | | ip_ttl | tinyint(3) unsigned | YES | | NULL | | | ip_proto | tinyint(3) unsigned | | | 0 | | | ip_csum | smallint(5) unsigned | YES | | NULL | | +----------+----------------------+------+-----+---------+-------+ 14 rows in set (0.00 sec) mysql> For people who want to go into details of how data is stored, database schema provides great information. You can view complete database schema at http://www.incident.org/snortdb/. 5.1.5.1 Creating Extra Tables When you are using some other programs with database and Snort to map service numbers to service names, additional mapping information is needed. For example, TCP port 23 is used for Telnet.
…
Fortunately you can use the script create_mysql in the contrib directory and it will create all of the necessary tables for you. The contrib directory is present when you download Snort in the source code form from its web site http://www.snort.org and extract its source files. The create_mysql script is present along with other useful stuff in this directory. For example, scripts to create database schema in other types of database servers are also found in this directory. The following command uses this script to create all database tables in the snort database. [root@laptop]# mysql -h localhost -u rr -p snort < contrib/create_mysql Enter password: [root@laptop]# Different command line options are used with this command.
…
The opt table contains options. The reference and reference_system tables contain information about reference sites used to get more information about a vulnerability. This is the same information that is used inside Snort rules using the ref keyword as discussed in Chapter 3. The schema tables shows the version of database schema. The sensor table contains information about different sensors that are logging data to the Snort database. If there is only one Snort sensor, the table contains only one row. Similarly, the table contains one row for each sensor. The sig_class contains information about different classes of Snort rules as discussed in Chapter 3.
The Practice of Cloud System Administration: DevOps and SRE Practices for Web Services, Volume 2
by
Thomas A. Limoncelli
,
Strata R. Chalup
and
Christina J. Hogan
Published 27 Aug 2014
) * * * 11.8 Live Schema Changes Sometimes a new software release expects a different database schema from the previous release. If the service could withstand downtime, one would bring down the service, upgrade the software and change the database schema, and then restart the service. However, downtime is almost always unacceptable in a web environment. In a typical web environment there are many web server frontends, or replicas, that all talk to the same database server. The older software release is unable to understand the new database schema and will malfunction or crash. The newer software release is unable to understand the old database schema and will also malfunction.
…
The newer software release is unable to understand the old database schema and will also malfunction. Therefore you cannot change the database schema and then do the software upgrade: the older replicas will fail as soon as the database is modified. There will be no service until replicas are upgraded. You cannot upgrade the replicas and then change the database schema because any upgraded replica will fail. The schema cannot be upgraded during the rolling upgrade: new replicas will fail, and then at the moment the database schema changes, all the failing replicas will start to work and the working replicas will start to fail. What chaos! Plus, none of these methods has a decent way to roll back changes if there is a problem.
…
Flag flips are a technique whereby new features are included in a release but are disabled. A flag or software toggle enables the feature at a designated time. This makes it easy to revert a change if problems are found: simply turn the toggle off. Database schema changes on live systems can be difficult to coordinate. It is not possible to change the software of all clients at the exact same time as the database schema. Instead, the McHenry Technique is used to decouple the changes. New fields are added, software is upgraded to use the old and new fields, software is upgraded to use only the new fields, and any old fields are then removed from the schema.
PostgreSQL 9 Admin Cookbook: Over 80 Recipes to Help You Run an Efficient PostgreSQL 9. 0 Database
by
Simon Riggs
and
Hannu Krosing
Published 23 Oct 2010
After all of these have executed, the databases are fully replicated with slon daemons running. ff 01_create_init.sh creates the bash scripts for later use. You can edit $SLON_ TMP/my_replica.sh and add/remove tables, if needed. So selective replication is a simple matter of editing those files. ff create_struct.sh performs a dump from Master and a restore to the slave of the database schema. This dumps the whole database schema. ff 02_exec_init.sh executes the scripts. This copies the data, so may take a while. Maintaining replication Slony doesn't replicate the following items: ff Table definition changes (DDL commands) ff Changes to users and roles ff Changes to large objects (BLOBS) ff Changes made by the TRUNCATE command (may change in future) So, once you've set up Slony, you'll need to make any further changes to the replicated tables onto each of the nodes in the configuration.
…
A recommended minimal log_line_prefix format string for auditing DDL is '%t %u %d', which tells postgresql to log timestamp, database user, and database name at the start of every log line. Can't I find out this information from the database If you don't have logging enabled, or don't have all the logs, then you can get only very limited information on who changed the database schema and when from system tables, and even that is not reliable. What you can can get is the "owner" of the database object (table, sequence, function, and so on) but this may have been changed by "ALTER TABLE … SET OWNER to yyyy". You may be able to guess the approximate time of object creation or of the latest modification by looking up the transaction id in xmin system column in pg_class and pg_attrib system tables.
…
HOT updates not only avoid creating new index entries but also can perform a fast mini-vacuum inside the page to make room for new rows. ALTER TABLE t1 SET (fillfactor = 70); Tells PostgreSQL to fill only 70% of each page in table t1 when doing inserts, so that 30% is left for use by in-page (HOT) updates. Rewriting the schema—a more radical approach In some occasions, it may make sense to rewrite the database schema, and provide an old view for unchanged queries using views, triggers, rules, and functions. One such case occurs when refactoring the database, and you want old queries to keep running while changes are made. Another is an external application that is unusable with the provided schema, but can be made to perform OK with a different distribution of data between tables. 259 Performance & Concurrency Why is my query not using an index?
Monolith to Microservices: Evolutionary Patterns to Transform Your Monolith
by
Sam Newman
Published 14 Nov 2019
This is clearly not a great solution to a problem we shouldn’t have had in the first place, but it worked—mostly. However, we soon realized that most of these applications weren’t undergoing active maintenance, meaning there was no chance that they would be updated to reflect a new schema design.2 In effect, our database schema had become a public-facing contract that couldn’t change: we had to maintain that schema structure going forward. Views to Present Our solution was to first resolve those situations where external systems were writing into our schema. Luckily, in our case they were easy to resolve. For all those clients who wanted to read data, though, we created a dedicated schema hosting views that looked like the old schema, and had clients point at that schema instead, as Figure 4-3 shows.
…
Well, fundamentally, logical and physical decomposition achieve different goals. Logical decomposition allows for simpler independent change and information hiding, whereas physical decomposition potentially improves system robustness, and could help remove resource contention allowing for improved throughput or latency. When we logically decompose our database schemas but keep them on the same physical database engine, as in Figure 4-24, we have a potential single point of failure. If the database engine goes down, both services are affected. However, the world isn’t that simple. Many database engines have mechanisms to avoid single points of failure, such as multiprimary database modes, warm failover mechanisms, and the like.
…
In the end, they settled on a model in which the new revenue functionality was effectively deployed as a single service, containing three isolated bounded contexts (each of which ended up as separate JAR files), as shown in Figure 4-28. Figure 4-28. Each bounded context in the Revenue service had its own separate database schema, allowing separation later on Each bounded context had its own, totally separate databases. The idea was that if there was a need to separate them into microservices later, this would be much easier. However, it turned out that this was never needed. Several years later, this revenue service remains as it is, a monolith with multiple associated databases—a great example of a modular monolith.
Programming Android
by
Zigurd Mednieks
,
Laird Dornin
,
G. Blake Meike
and
Masumi Nakamura
Published 15 Jul 2011
Note The tables created by SQL CREATE TABLE statements and the attributes they contain are called a database schema. DROP TABLE This removes a table added with the CREATE TABLE statement. It takes the name of the table to be deleted. On completion, any data that was stored in the table may not be retrieved. Here is some SQL code that will create and then delete a simple table for storing contacts: CREATE TABLE contacts ( first_name TEXT, last_name TEXT, phone_number TEXT, height_in_meters REAL); DROP TABLE contacts; When entering commands through sqlite3, you must terminate each command with a semicolon. You may change the database schema after you create tables (which you may want to do to add a column or change the default value of a column) by entering the ALTER TABLE command.
…
Detailed information on Monkey can be found at http://developer.android.com/guide/developing/tools/monkey.html. sqlite3 Android uses SQLite as the database system for many system databases and provides APIs for applications to make use of SQLite, which is convenient for data storage and presentation. SQLite also has a command-line interface, and the sqlite3 command enables developers to dump database schemas and perform other operations on Android databases. These databases are, of course, in an Android device, or they are contained in an AVD, and therefore the sqlite3 command is available in the adb shell. Detailed directions for how to access the sqlite3 command line from inside the adb shell are available at http://developer.android.com/guide/developing/tools/adb.html#shellcommands.
…
This may help you to avoid orphaned references, also known as enforcement of foreign keys. However, the foreign key constraint in SQLite is optional, and is turned off in Android. As of Android 2.2, you cannot rely on a foreign key constraint to catch incorrect foreign key references, so you will need to take care when creating database schemas that use foreign keys. There are several other constraints with less far-reaching effects: UNIQUE Forces the value of the given column to be different from the values in that column in all existing rows, whenever a row is inserted or updated. Any insert or update operation that attempts to insert a duplicate value will result in an SQLite constraint violation.
Refactoring: Improving the Design of Existing Code
by
Martin Fowler
,
Kent Beck
,
John Brant
,
William Opdyke
and
Don Roberts
Published 9 Mar 2012
As we learn more about refactoring, we will come up with more solutions to problems and learn about what problems are difficult to solve. Databases One problem area for refactoring is databases. Most business applications are tightly coupled to the database schema that supports them. That's one reason that the database is difficult to change. Another reason is data migration. Even if you have carefully layered your system to minimize the dependencies between the database schema and the object model, changing the database schema forces you to migrate the data, which can be a long and fraught task. With nonobject databases a way to deal with this problem is to place a separate layer of software between your object model and your database model.
…
Whereas applying (and possibly backing out) a series of incremental refactorings is fairly natural for Smalltalk and CLOS, the cost per iteration (in terms of recompilation and retesting) is higher for C++ programs; thus programmers tend to be less willing to make these small changes. Many applications use a database. Changes to the structure of objects in a C++ program may require that corresponding schematic changes be made to the database. (Many of the ideas I applied in my refactoring work came from research into object-oriented database schema 317 evolution.) Another limitation, which may interest software researchers more than many software practitioners, is that C++ does not provide support for metalevel program analysis and change. There is nothing analogous to the metaobject protocol available for CLOS. The metaobject protocol of CLOS, for example, supports a sometimes useful refactoring for changing selected instances of a class into instances of a different class while having all references to the old objects automatically point to the new objects.
…
There are many other checks, most as simple as this, some more complex. In my research, I defined safety in terms of program properties (related to activities such as scoping and typing) that need to continue to hold after refactoring. Many of these program properties are similar to integrity constraints that must be maintained when database schemas change.[17] Each refactoring has associated with it a set of necessary preconditions that if true would ensure that the program properties are preserved. Only if the tool were to determine that everything is safe would the tool perform the refactoring. Fortunately, determining whether a refactoring is safe often is trivial, especially for the low-level refactorings that constitute most of our refactoring.
Beautiful Visualization
by
Julie Steele
Published 20 Apr 2010
So, I wrote some scripts to extract only the part of the data I needed for my visualization and stored it in a SQLite database. The schema is shown in Figure 8-1. In the interests of simplicity, I assigned a party based on only the most recent <role>, a decision that would come back to haunt me. Figure 8-1. Simple database schema for representing the raw data required for the visualization Computing the Voting Affinity Matrix With the raw data munged into a more pliant format, I was ready to tackle the problem of computing the affinities that would represent the edges in the graph. This entailed building an affinity matrix (Figure 8-2) that tallied the number of times different senators voted the same way on the same bills.
…
Most databases have a special cube or rollup keyword to create an aggregation from a regular table. This has the advantage that no special processing is needed beforehand, but the disadvantage of being slower and requiring more disk space to store all the original values. Structuring the data specifically for fast read and aggregation performance (as is done in data warehouses and our database schema) considerably speeds up the most common operation at the expense of more processing being required when new data is added. While the ParSets program does not currently show numerical dimensions, it does store them in the database. They are stored in a separate table, containing the key of the row the values correspond to and one column per numerical dimension.
…
For instance, it was assumed that we might want to split winners by country (and compare this data to the overall submission statistics) and look at the relationships between authors and categories. Figure 13-1. Matrix of initial interest in attribute combinations Understanding the Data Situation Next, I began to look into the available data, together with Sandor Herramhof. Over the years, a number of database schemas with different conventions and varying degrees of modeling detail had been used, which made it very difficult to get an early overview of the existing data. For instance, one database featured additional information stored in an XML format inside a text field, but only for some of the submissions.
PostgreSQL: Up and Running, 3rd Edition
by
Unknown
Figure 1-2. phpPgAdmin Adminer If you manage other databases besides PostgreSQL and are looking for a unified tool, Adminer might fit the bill. Adminer is a lightweight, open source PHP application with options for PostgreSQL, MySQL, SQLite, SQL Server, and Oracle, all delivered through a single interface. One unique feature of Adminer we’re impressed with is the relational diagrammer that can produce a schematic layout of your database schema, along with a linear representation of foreign key relationships. Another hasslereducing feature is that you can deploy Adminer as a single PHP file. Figure 1-3 is a screenshot of the login screen and a snippet from the diagrammer output. Many users stumble in the login screen of Adminer because it doesn’t include a separate text box for indicating the port number.
…
This book is not even going to attempt to describe all that you’ll find in a standard PostgreSQL install. With PostgreSQL churning out features at breakneck speed, we can’t imagine any book that could possibly do this. We limit our quick overview to those objects that you should be familiar with. Databases Each PostgreSQL service houses many individual databases. Schemas Schemas are part of the ANSI SQL standard. They are the immediate next level of organization within each database. If you think of the database as a country, schemas would be the individual states (or provinces, prefectures, or departments, depending on the country.) Most database objects first belong to a schema, which belongs to a database.
…
It can even export results as HTML, providing you with a turn-key reporting engine, albeit a bit crude. Backup and restore wizard Can’t remember the myriad of commands and switches to perform a backup or restore using pg_restore and pg_dump? pgAdmin has a nice interface that lets you selectively back up and restore databases, schemas, single tables, and globals. You can view and copy the underlying pg_dump or pg_restore command that pgAdmin used in the Message tab. Grant wizard This time-saver allows you to change privileges on many database objects in one fell swoop. pgScript engine This is a quick-and-dirty way to run scripts that don’t have to complete as a transaction.
Ansible for DevOps: Server and Configuration Management for Humans
by
Jeff Geerling
Published 9 Oct 2015
Add a line to the main.yml file to include a new deploy.yml playbook: 1 --- 2 - include: provision.yml 3 - include: deploy.yml Now we’re ready to create a the deploy.yml playbook, which will do the following: Use git to check out the latest production release of the Rails app. Copy over a secrets.yml template that holds some secure app data required for running the app. Make sure all the gems required for the app are installed (via Bundler). Create the database (if it doesn’t already exist). Run rake tasks to make sure the database schema is up-to-date and all assets (like JS and CSS) are compiled. Make sure the app files’ ownership is set correctly so Passenger and Nginx can serve them without error. If any changes or updates were made, restart Passenger and Nginx. Most of these tasks will use Ansible’s modules, but for a few, we’ll just wrap the normal deployment-related commands in shell since there aren’t pre-existing modules to take care of them for us: 1 --- 2 - hosts: all 3 sudo: yes 4 5 vars_files: 6 - vars.yml 7 8 roles: 9 - geerlingguy.passenger 10 11 tasks: 12 - name: Ensure demo application is at correct release. 13 git: 14 repo: https://github.com/geerlingguy/demo-rails-app.git 15 version: "{{ app_version }}" 16 dest: "{{ app_directory }}" 17 accept_hostkey: true 18 register: app_updated 19 notify: restart nginx 20 21 - name: Ensure secrets file is present. 22 template: 23 src: templates/secrets.yml.j2 24 dest: "{{ app_directory }}/config/secrets.yml" 25 owner: "{{ app_user }}" 26 group: "{{ app_user }}" 27 mode: 0664 28 notify: restart nginx 29 30 - name: Install required dependencies with bundler. 31 shell: "bundle install --path vendor/bundle chdir={{ app_directory }}" 32 when: app_updated.changed == true 33 notify: restart nginx 34 35 - name: Check if database exists. 36 stat: "path={{ app_directory }}/db/{{ app_environment.RAILS_ENV }}.sqlite3" 37 register: app_db_exists 38 39 - name: Create database. 40 shell: "bundle exec rake db:create chdir={{ app_directory }}" 41 when: app_db_exists.stat.exists == false 42 notify: restart nginx 43 44 - name: Perform deployment-related rake tasks. 45 shell: "{{ item }} chdir={{ app_directory }}" 46 with_items: 47 - bundle exec rake db:migrate 48 - bundle exec rake assets:precompile 49 environment: app_environment 50 when: app_updated.changed == true 51 notify: restart nginx 52 53 - name: Ensure demo application has correct user for files. 54 file: 55 path: "{{ app_directory }}" 56 state: directory 57 owner: "{{ app_user }}" 58 group: "{{ app_user }}" 59 recurse: yes 60 notify: restart nginx The first thing you’ll notice (besides the fact that we’ve included the vars.yml file again, since we need those variables in this playbook as well) is that we’ve added the geerlingguy.passenger role in this playbook.
…
Since this application uses a simple SQLite database, it’s a matter of checking if the .sqlite3 file exists, and if not, running the db:create task. (Lines 44-51) If the app_updated variable shows that a change occurred as part of the first git task, we’ll also run a couple rake tasks to make sure the database schema is up to date, and all assets (like scripts and stylesheets) are compiled. (Lines 53-60) Make sure all app files have the correct permissions for Passenger/Nginx to serve them correctly. Because many of the tasks result in filesystem changes that could change the behavior of the application, they all notify the restart nginx handler provided by the geerlingguy.passenger role, so Passenger reloads the configuration and restarts the app.
…
This way the entire state of your infrastructure is encapsulated in your playbook files, which ideally should be version controlled and managed similarly to the application they deploy. Plus, who wants to enter any more information on the command line than is absolutely required? Our application is a fairly generic web application that has updates to application code (which require a webserver reload), styles (which need recompiling), and possibly the database schema (which needs rake migrate tasks to be run). Any time app_version is changed inside playbooks/vars.yml, the deploy playbook will automatically run all the required tasks to get our app running with the latest code. Update app_version to 1.3.0, and then run the following command again: $ ansible-playbook deploy.yml -i inventory-ansible After a minute or so, the deployment should complete, and once that’s done, you’ll see the much improved new version of the Demonstration Ruby on Rails Application: Rails app - version 1.3.0 with a responsive UI.
Principles of Web API Design: Delivering Value with APIs and Microservices
by
James Higginbotham
Published 20 Dec 2021
A Word of Caution on Resource Identification While it may be tempting to use a database schema as the starting point for resource identification, keep in mind that API designs should not leak internal implementation details. A database schema reflects optimizations for transactional read and write operations rather than exposing business domain concepts over a network API. It is best to work top-down when modeling an API to avoid leaking internal data model decisions into the API design. If the implementation phase demonstrates a one-to-one relationship between resources and a database schema, then it becomes a “happy accident”, as painter Bob Ross would say.
…
Relational databases are still heavily used, but additional choices including document and columnar databases have gained recent popularity. While APIs may have an impact on the data model, an API design should evolve independently from the latest database trends. What happens when teams expose a data model as an API? Constant code changes: Database schema changes will result in a constantly changing API, as the API must keep in lockstep with the underlying database. This change to the data model will force consumers into a complex conformist relationship in which they must rewrite their API integration code every time the underlying data model changes.
Django Book
by
Matt Behrens
Published 24 Jan 2015
Middleware modules running inside it (coming later in the stack) will be under the same transaction control as the view functions. See Appendix B for more about information about database transactions. What’s Next? Web developers and database-schema designers don’t always have the luxury of starting from scratch. In the next chapter, we’ll cover how to integrate with legacy systems, such as database schemas you’ve inherited from the 1980s. © Copyright 2012, Matt Behrens. Created using Sphinx 1.2.2. index next | previous | Django Book 0.1 documentation » Chapter 18: Integrating with Legacy Databases and Applications Django is best suited for so-called green-field development – that is, starting projects from scratch, as if you were constructing a building on a fresh field of green grass.
…
It looks at all of the models in each app in your INSTALLED_APPS setting, checks the database to see whether the appropriate tables exist yet, and creates the tables if they don’t yet exist. Note that syncdb does not sync changes in models or deletions of models; if you make a change to a model or delete a model, and you want to update the database, syncdb will not handle that. (More on this in the “Making Changes to a Database Schema” section in Chapter 10.) If you run python manage.py syncdb again, nothing happens, because you haven’t added any models to the books app or added any apps to INSTALLED_APPS. Ergo, it’s always safe to run python manage.py syncdb – it won’t clobber things. If you’re interested, take a moment to dive into your database server’s command-line client and see the database tables Django created.
…
Here’s the revised code: class Book(models.Model): title = models.CharField(max_length=100) authors = models.ManyToManyField(Author) publisher = models.ForeignKey(Publisher) publication_date = models.DateField(blank=True, null=True) Adding null=True is more complicated than adding blank=True, because null=True changes the semantics of the database – that is, it changes the CREATE TABLE statement to remove the NOT NULL from the publication_date field. To complete this change, we’ll need to update the database. For a number of reasons, Django does not attempt to automate changes to database schemas, so it’s your own responsibility to execute the appropriate ALTER TABLE statement whenever you make such a change to a model. Recall that you can use manage.py dbshell to enter your database server’s shell. Here’s how to remove the NOT NULL in this particular case: ALTER TABLE books_book ALTER COLUMN publication_date DROP NOT NULL; (Note that this SQL syntax is specific to PostgreSQL.)
PostgreSQL: Up and Running
by
Regina Obe
and
Leo Hsu
Published 5 Jul 2012
What is unique about PostgreSQL tables is the inheritance support and the fact that every table automatically begets an accompanying custom data type. Tables can inherit from other tables and querying can bring up child records from child tables. schema Schemas are part of the ANSI-SQL standards, so you’ll see them in other databases. Schemas are the logical containers of tables and other objects. Each database can have multiple schemas. tablespace Tablespace is the physical location where data is stored. PostgreSQL allows tablespaces to be independently managed, which means you can easily move databases to different drives with just a few commands.
…
It can even export as HTML, providing you with a turn-key reporting engine, albeit a bit crude. Backup and restore wizard. Can’t remember the myriad of commands and switches to perform a backup or restore using pg_restore and pg_dump? pgAdmin has a nice interface that’ll let you selectively back up and restore databases, schemas, single tables, and globals, and the message tab shows you the command line pg_dump or pg_restore it used to do it. Grant Wizard. This time-saver will allow you to change permissions on many database objects in one fell swoop. pgScript engine. This is a quick and dirty way to run scripts that don’t have to complete as a transaction.
…
Selective Backup of Database Objects pgAdmin provides a graphical interface to pg_dump that we covered in Selective Backup Using pg_dump for doing selective backup of objects. To back up selective objects right-mouse click on the object you want to back up and select Backup .... You can back up a whole database, schema, table, or anything else. Figure 4-11. PgAdmin Right-click Backup Schema If all you wanted to back up was that one object, you can forgo the other tabs and just do as we did in Figure 4-10. However, you can selectively pick or uncheck some more items by clicking on the objects tab as shown in Figure 4-12.
Data Mining: Concepts and Techniques: Concepts and Techniques
by
Jiawei Han
,
Micheline Kamber
and
Jian Pei
Published 21 Jun 2011
Figure 3.12 A concept hierarchy for the attribute price, where an interval ($X … $Y] denotes the range from $X (exclusive) to $Y (inclusive). 6. Concept hierarchy generation for nominal data, where attributes such as street can be generalized to higher-level concepts, like city or country. Many hierarchies for nominal attributes are implicit within the database schema and can be automatically defined at the schema definition level. Recall that there is much overlap between the major data preprocessing tasks. The first three of these strategies were discussed earlier in this chapter. Smoothing is a form of data cleaning and was addressed in Section 3.2.2.
…
Nominal attributes have a finite (but possibly large) number of distinct values, with no ordering among the values. Examples include geographic_location, job_category, and item_type. Manual definition of concept hierarchies can be a tedious and time-consuming task for a user or a domain expert. Fortunately, many hierarchies are implicit within the database schema and can be automatically defined at the schema definition level. The concept hierarchies can be used to transform the data into multiple levels of granularity. For example, data mining patterns regarding sales may be found relating to specific regions or countries, in addition to individual branch locations.
…
Consequently, the user may have included only a small subset of the relevant attributes in the hierarchy specification. For example, instead of including all of the hierarchically relevant attributes for location, the user may have specified only street and city. To handle such partially specified hierarchies, it is important to embed data semantics in the database schema so that attributes with tight semantic connections can be pinned together. In this way, the specification of one attribute may trigger a whole group of semantically tightly linked attributes to be “dragged in” to form a complete hierarchy. Users, however, should have the option to override this feature, as necessary.
API Design Patterns
by
Jj Geewax
Published 19 Jul 2021
Who should read this book API Design Patterns is for anyone who is building or plans to build a web API, particularly when that API will be exposed to the public. Familiarity with some serialization formats (e.g., JSON, Google Protocol Buffers, or Apache Thrift) or common storage paradigms (e.g., relational database schemas) is certainly nice to have but is in no way required. It’s also a nice bonus if you’re already familiar with HTTP and its various methods (e.g., GET and POST), as it is the transport of choice throughout the examples in this book. If you find yourself designing an API and running into problems and thinking, “I’m sure someone must have figured this out already,” this book is for you.
…
For example, if we were to build an API for a chat room, the resource layout refers to the choices we make that might result in a ChatRoom resource along with a User resource that might associate somehow with a room. This manner, in which User and ChatRoom resources associate with one another, is what we’re interested in. If you’ve ever designed a relational database with various tables, this should feel familiar: the database schema you design is often very similar in nature to how the API is represented. While it might be tempting to say that the relationship is all that matters, it’s actually a bit more involved than this. While the relationship itself is the only thing with a direct influence in the resource layout we end up with, there are many other factors that influence the layout indirectly.
…
Let’s start by looking at an important question: do we need a relationship at all? 4.2.1 Do you need a relationship at all? When building an API, after we’ve chosen the list of things or resources that matter to us, the next step is to decide how these resources relate to one another. Sort of like normalizing a database schema, it can often be tempting to connect everything that might ever need to be connected in order to build a rich web of interconnected resources that we can easily navigate through a variety of pointers and references. While this will certainly lead to a very rich and descriptive API, it can also become untenable and run into serious performance degradation as the API receives more and more traffic and stores more and more data.
Software Design Decoded: 66 Ways Experts Think
by
Marian Petre
,
André van Der Hoek
and
Yen Quach
Published 5 Oct 2016
When the situation calls for it, they employ much more formal diagrams than the sketches they typically create. To model certain phenomena more clearly—with more precision and completeness—they may well work out a state machine in all of its detail, edit pseudocode on a whiteboard, or meticulously specify all of the entities, fields, keys, and relationships in a database schema. Once done, however, they will quickly return to sketching, employing a more informal style that abandons much notational detail. 27 Experts invent notations Experts choose a notation that suits the problem, even if the notation does not exist. New notations arise when, in the heat of design, shorthand symbols are used that take on a meaning of their own.
RDF Database Systems: Triples Storage and SPARQL Query Processing
by
Olivier Cure
and
Guillaume Blin
Published 10 Dec 2014
A set of triples is characterized by the omnipresence of URIs at the three positions of RDF statements, although over forms can also be used, e.g., literals at the object position.The fact that a given URI can be repeated multiple times in a data set and that URIs are generally long strings of characters raises some memory footprint issues. Therefore, the use of efficient dictionaries—encoding URIs with identifiers that are used for the representation of triples—is crucial in triple stores, while in relational models a suitable database schema minimizes such data redundancy. The second feature is that a data management system requires a language to query the data it contains. For this purpose, the Semantic Web group at the W3C has published the SPARQL Protocol and RDF Query Language (SPARQL) recommendation; one of its main aspects is to support a query language.
…
At least three out of four NoSQL system families adopt this approach, and they are mainly differentiated by the type of information they can store at the value position.The other quarter of NoSQL systems rely on the graph model and have a set of totally different properties and use cases. An important aspect of the application domains motivating the emergence of NoSQL stores was the need for flexibility in terms of database schema. In fact, the designers of these systems pushed this characteristic to its logical limit by not providing facilities to define a schema. This schemaless property facilitates an important feature needed in many data management approaches: the integration of novel data into the database. For example, consider a relation where one can add as many attributes as needed.
…
Moreover, if the RDF data set is associated to an RDFS or OWL ontology, then the possibility of cleverly handling query processing is increased. In this chapter, we do not consider deep reasoning mechanisms, but defer that presentation to Chapter 8. Nonetheless, the expressivity of Semantic Web ontologies supports some query simplifications that cannot be performed using the metadata contained in a relational database schema. For instance, we present a semantic simplification of a SPARQL basic graph pattern (BGP), a set of triples in a WHERE clause, that can be integrated in the rewriting component. Most of the queries we are dealing with in this chapter involve selections, or SPARQL queries with SELECT, ASK, DESCRIBE, or CONSTRUCT clauses.
Architecting Modern Data Platforms: A Guide to Enterprise Hadoop at Scale
by
Jan Kunigk
,
Ian Buss
,
Paul Wilkinson
and
Lars George
Published 8 Jan 2019
For example, newer versions of Hive or Hue may need to update their internal database schemas. Since the recommended way of dealing with Hadoop is having multiple environments (see Chapter 2), you can stage the service database-related schema updates first and then roll them out, as supported by each Hadoop subsystem. For example, Oozie can be shut down briefly to update the software and database schemas, without interrupting data pipelines that are in flight. Instead of manually dealing with these maintenance tasks, the Hadoop management systems support the automated upgrade of components and their internal database schemas. Therefore, you can simply press a button, wait, and see all of the systems return to operational again.
…
Most analysts are able to process and explore data in statistical or scientific programming languages such as R or Python. For scalable productive BI reports, the analyst uses a range of algorithms on multidimensional views on the data in tabular form (also known as cubes) based on star or snowflake database schemas in the data warehouse. This achieves a high degree of abstraction for the analyst, who focuses on the data algorithms that often have been optimized for cubes in the BI solution. Software developer Software developers not only write, build, and maintain code, but also are involved with the continuous integration and operation of the underlying infrastructure.
…
scaling clusters up and down, automating in cloud deployments, Scaling Up and Down scaling data warehouses, The Traditional Approach scaling networks, Modular switches selecting instances for, Instances spine-leaf networks, Scalability scaling, Big Data Technology Primer schemasHive, Service Databases schema on read, Apache Hive Metastore schemaless mode for Solr collections, Apache Solr updating for service databases, Database Considerations schemes (ZooKeeper), ZooKeeper scopes (HBase), HBase SCR (see short-circuit reads) SCSI SYNCHRONIZE CACHE command, Disk cache SDN (see software-defined networking) search bind (LDAP, LDAP Authentication searches, Big Data Technology Primer, Storage Enginesin Solr, Apache Solr secret keys for Amazon S3, Amazon Simple Storage Service Secure Sockets Layer (SSL) (see Transport Layer Security) security, Security-Summaryaccess security for clusters, Access Security at-rest encryption, At-Rest Encryption-Encrypting Temporary FilesHDFS Transparent Data Encryption, HDFS Transparent Data Encryption-KMS implementations options for Hadoop services and data, At-Rest Encryption volume encryption with Cloudera NE and KTS, Volume Encryption with Cloudera Navigator Encrypt and Key Trustee Server authentication, Authentication-Impersonationdelegation tokens, Delegation Tokens impersonation, Impersonation Kerberos, Kerberos LDAP, LDAP Authentication authorization, Authorization-Sentrycentralized security management, Centralized Security Management group resolution, Group Resolution Hadoop service level authorization, Hadoop Service Level Authorization HBase, HBase HDFS, HDFS Hive, Hive Hue, Hue Kafka, Kafka Kudu, Kudu Sentry, Sentry Solr, Solr superusers and supergroups, Superusers and Supergroups-Supergroups YARN, YARN ZooKeeper, ZooKeeper cluster security and backups, Hadoop Cluster Backups configuring for high availability in Hue, Deployment options configuring for high availability in Oozie, Deployment considerations considerations when using load balancing, Security considerations disabling for platform benchmarks, Hadoop Validation HDFS and high availability, Security in cloud auto provisioning of clusters, Deploying with Security-TLSintegrating with Kerberos KDC, Integrating with a Kerberos KDC TLS, TLS in OpenShift containers, Isolation in public cloud solutions, Key Things to Know in the cloud, Security in the Cloud-Summaryassessing the risk, Assessing the Risk-Assessing the Risk auditing, Auditing encryption for data at rest, Encryption for Data at Rest-Recommendations and Summary for Cloud Encryption identity provider options for Hadoop, Identity Provider Options for Hadoop-Option C: On-Premises ID Services object storage security and Hadoop, Object Storage Security and Hadoop-Auditing perimeter controls and firewalling, Perimeter Controls and Firewalling-Summary risk model, Risk Model in-flight encryption, In-Flight Encryption-Authenticationenabling, Enabling in-Flight Encryption SASL quality of protection, SASL Quality of Protection TLS encryption, TLS Encryption internet-facing clusters, Layer 1 Recommendations LDAP, LDAP Security operating system security adjustments, OS Configuration for Hadoop Security Assertion Markup Language (SAML), Authentication, Hue security groups, AWSnetwork security groups, Azure Security-Enhanced Linux (SELinux)disabling, OS Configuration for Hadoop in OpenShift private cloud, Isolation seek system call, Important System Calls segmentation fault (Linux), Nonuniform Memory Access self-signed certificates, Certificate Management, TLSuse cases, Certificate Management Sentry, Required Databases, Other Servicesapplying HDFS ACLs, HDFS authorization in, Sentry backing up Apache Sentry, Apache Sentry centralized authorization control for Kafka, Kafka centralized authorization with, Centralized Security Management protecting all user interaction with Hive, Hive protection for reads/writes to Kudu tables through Impala integration, Kudu used for Kafka authorization, Deployment considerations using for Impala authorization, Impala using in Solr, Solr separate database mode, Database Integration Options server form factors, Server Form Factors-Guidancecomparison of, Form Factor Comparisonconfigurations and price sampling, Form Factor Comparison guidelines for choosing, Guidance Server Message (SMB) storage, Network-attached storage server-server traffic pattern, Scalability server-side encryption (SSE), Server-side and client-side encryptionsupport by Amazon S3, Encryption in AWS serversbootstrapping, Operating Systems certificates, TLS commodity, Commodity Servers-Server CPUs and RAM for ZooKeeper deployments, Deployment considerations optimized configurations, Optimized Server Configurations server CPUs and RAMcores and threads in Hadoop, Threads and cores in Hadoop role of x86 architecture, The role of the x86 architecture service accountsaccess in Azure Data Lake Store, ADLS in GCP Cloud Storage, GCP Cloud Storage, Service account service databases, Service Databases-Database Integration Optionsintegration options, Database Integration Options required databases, Required Databases service edge nodes, Interaction Patterns service level authorization, Authorization, Hadoop Service Level Authorization service principal name (SPN), Layer 3 Recommendations service principals (Kerberos), Principals service proxies, Service proxies service tags, Azure service tickets (Kerberos), Kerberos service-level agreements (SLAs), Multiple Clusters for Workload Isolation, Measuring Availability, Policies and Objectives service-level objectives (SLOs), Measuring Availability servicesavailability in public cloud solutions, Service availability high availability, Lateral/Service HA resource contention between cluster services, Multiple Clusters for Workload Isolation separation of identical service roles, Separation of Identical Service Roles sessions, LDAP Authenticationsession persistence, Session persistencefor Hue, Hue in Impala, Architecting for HA load balancer with, Deployment considerations shared access signatures (SASs), Blob storage Shared Data Experience (SDX), Multiple Clusters and Independent Storage and Compute shared database mode, Database Integration Options shells (Linux), Command-Line Access short-circuit reads (HDFS), Requirements for Multitenancy, Short-Circuit and Zero-Copy Readsbenefits of, Short-Circuit and Zero-Copy Reads short-lived (or transient) life cycle, Cluster Life Cycle Models Shortest Path Bridging (SPB), Implementation shuffles, Shuffles Simple and Protected GSSAPI Negotiation Mechanism (SPNEGO), Kerberos over HTTP, Hue Simple Network Management Protocol (SNMP) traps, Monitoring Simple Storage Service (S3) (see Amazon Simple Storage Service) SimpleAWSCredentialsProvider, Persistent credentials simultaneous multithreading (SMT), Threads and cores in Hadoop single endpoint interaction pattern, Interaction Patterns single points of failure (SPOFs), Consensus, High Availability Definedusing a single load balancer, Dedicated load balancers single sign-on (SSO) technologies, Authenticationsupport by Hue, Hue single switch network architecture, Single switchimplementation, Implementation small cluster configurations, Small Cluster Configurations Small Computer Systems Interface (SCSI) disk blocks, SANsiSCSI, SANs SMART, Disk health smoke testing, Platform Validation SMP (symmetric multiprocessing), Threads and cores in Hadoop SMT (simultaneous multithreading), Threads and cores in Hadoop snapshots, Snapshotsin HBase, HBase, Subflow: HBase of Amazon S3 object storage, AWS storage options SOCKS proxies, SOCKS proxies, YARN softwarein datacenters, effects on ingest and extraction speed, Software load balancers, Dedicated load balancers software configuration management (SCM) tools, Operating Systemsusing in platform validation testing, Useful Tools using to automate installation of Hadoop binaries, Installation Choices software developers, Software developer software license efficiency, Server Form Factors software overlay networks, Network Virtualization software-defined networking (SDN), Network Virtualization-Network Virtualization, Solutions for Private Cloudsimportance for Hadoop, Network Virtualization in OpenShift Kubernetes, Isolation software-only SDN, Network Virtualization solid state drives (SSDs), SAS, Nearline SAS, or SATA (or SSDs)?
Learning Flask Framework
by
Matt Copperwaite
and
Charles Leifer
Published 26 Nov 2015
We covered the benefits of using a relational database and an ORM, configured a Flask application to connect to a relational database, and created SQLAlchemy models. All this allowed us to create relationships between our data and perform queries. To top it off, we also used a migration tool to handle future database schema changes. In Chapter 3, Templates and Views we will set aside the interactive interpreter and start creating views to display blog entries in the web browser. We will put all our SQLAlchemy knowledge to work by creating interesting lists of blog entries, as well as a simple search feature. We will build a set of templates to make the blogging site visually appealing, and learn how to use the Jinja2 templating language to eliminate repetitive HTML coding.
…
Below the tags relationship, add the following code to the Entry model definition: class Entry(db.Model): # ... tags = db.relationship('Tag', secondary=entry_tags, backref=db.backref('entries', lazy='dynamic')) comments = db.relationship('Comment', backref='entry', lazy='dynamic') [ 148 ] Chapter 7 We've specified the relationship as lazy='dynamic', which, as you will recall from Chapter 5, Authenticating Users, means that the comments attribute on any given Entry instance will be a filterable query. Creating a schema migration In order to start using our new model, we need to update our database schema. Using the manage.py helper, create a schema migration for the Comment model: (blog) $ python manage.py db migrate INFO [alembic.migration] Context impl SQLiteImpl. INFO [alembic.migration] Will assume non-transactional DDL. INFO [alembic.autogenerate.compare] Detected added table 'comment' Generating /home/charles/projects/blog/app/migrations/ versions/490b6bc5f73c_.py ... done Then apply the migration by running upgrade: (blog) $ python manage.py db upgrade INFO [alembic.migration] Context impl SQLiteImpl.
97 Things Every Programmer Should Know
by
Kevlin Henney
Published 5 Feb 2010
If an object has many possible variations in its behavior, these behaviors can be injected using the Strategy pattern rather than large if-then structures. In fact, the formulation of design patterns themselves is an attempt to reduce the duplication of effort required to solve common problems and discuss such solutions. In addition, DRY can be applied to structures, such as database schema, resulting in normalization. A Matter of Principle Other software principles are also related to DRY. The Once and Only Once principle, which applies only to the functional behavior of code, can be thought of as a subset of DRY. The Open/Closed Principle, which states that "software entities should be open for extension, but closed for modification," only works in practice when DRY is followed.
…
However, the problem is that the three functions change for entirely different reasons. The calculatePay function will change whenever the business rules for calculating pay do. The reportHours function will change whenever someone wants a different format for the report. The save function will change whenever the DBAs change the database schema. These three reasons to change combine to make Employee very volatile. It will change for any of those reasons. More importantly, any classes that depend upon Employee will be affected by those changes. Good system design means that we separate the system into components that can be independently deployed.
Hadoop: The Definitive Guide
by
Tom White
Published 29 May 2009
Start it using the following commands: % export ANT_LIB=/path/to/ant/lib % hive --service hwi (You only need to set the ANT_LIB environment variable if Ant’s library is not found in /opt/ant/lib on your system.) Then navigate to http://localhost:9999/hwi in your browser. From there, you can browse Hive database schemas and create sessions for issuing commands and queries. It’s possible to run the web interface as a shared service to give users within an organization access to Hive without having to install any client software. There are more details on the Hive Web Interface on the Hive wiki at https://cwiki.apache.org/confluence/display/Hive/HiveWebInterface.
…
The data typically resides in HDFS, although it may reside in any Hadoop filesystem, including the local filesystem or S3. Hive stores the metadata in a relational database—and not in HDFS, say (see The Metastore). In this section, we shall look in more detail at how to create tables, the different physical storage formats that Hive offers, and how to import data into them. Multiple Database/Schema Support Many relational databases have a facility for multiple namespaces, which allow users and applications to be segregated into different databases or schemas. Hive supports the same facility, and provides commands such as CREATE DATABASE dbname, USE dbname, and DROP DATABASE dbname. You can fully qualify a table by writing dbname.tablename.
…
Chapter 2 (“Installing and Upgrading MySQL”) in particular should help. Users of Debian-based Linux systems (e.g., Ubuntu) can type sudo apt-get install mysql-client mysql-server. RedHat users can type sudo yum install mysql mysql-server. Now that MySQL is installed, let’s log in and create a database (Example 15-1). Example 15-1. Creating a new MySQL database schema % mysql -u root -p Enter password: Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 349 Server version: 5.1.37-1ubuntu5.4 (Ubuntu) Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> CREATE DATABASE hadoopguide; Query OK, 1 row affected (0.02 sec) mysql> GRANT ALL PRIVILEGES ON hadoopguide.* TO '%'@'localhost'; Query OK, 0 rows affected (0.00 sec) mysql> GRANT ALL PRIVILEGES ON hadoopguide.* TO ''@'localhost'; Query OK, 0 rows affected (0.00 sec) mysql> quit; Bye The password prompt above asks for your root user password.
Apache Solr 3 Enterprise Search Server
by
Unknown
Published 13 Jan 2012
In order to make it easier for you to play with this data, the online code supplement to this book includes the data in formats that can readily be imported into Solr. Alternatively, if you already have your own data then I recommend starting with that, using this book as a guide. The MusicBrainz.org database is highly relational. Therefore, it will serve as an excellent instructional data set to discuss Solr schema choices. The MusicBrainz database schema is quite complex, and it would be a distraction to go over even half of it. I'm going to use a subset of it and express it in a way that has a straightforward mapping to the user interface seen on the MusicBrainz website. Each of these tables depicted in the following diagram can be easily constructed through SQL sub-queries or views from the actual MusicBrainz tables.
…
Even if you have one type of data to search for in an application, you might still use multiple cores (with the same configuration) and shard the data for scaling. Managing Solr Cores is discussed further in the deployment chapter. One combined index A combined index might also be called an aggregate index. As mentioned in the first chapter, an index is conceptually like a single-table relational database schema, thus sharing similarities with some NoSQL (non-relational) databases. In spite of this limitation, there is nothing to stop you from putting different types of data (say, artists and releases from MusicBrainz) into a single index. All you have to do is use different fields for the different document types, and use a field to discriminate between the types.
…
If it is not specified, then it defaults to the column name. When a column in the result can be placed directly into Solr without further processing, there is no need to specify the field declaration, because it is implied. When importing from a database, use the SQL AS keyword to use the same names as the Solr schema instead of the database schema. This reduces the number of<field/> elements and shortens existing ones. An attribute of the entity declaration that we didn't mention yet is transformer. This declares a comma-separated list of transformers that create, modify, and delete fields and even entire documents. The transformers are evaluated in-order, which can be significant.
Developing Web Applications with Haskell and Yesod
by
Michael Snoyman
Published 22 Apr 2012
Your table might look something like: CREATE TABLE Person(id SERIAL PRIMARY KEY, name VARCHAR NOT NULL, age INTEGER) And if you are using a database like PostgreSQL, you can be guaranteed that the database will never store some arbitrary text in your age field. (The same cannot be said of SQLite, but let’s forget about that for now.) To mirror this database table, you would likely create a Haskell data type that looks something like: data Person = Person { personName :: Text , personAge :: Int } It looks like everything is type safe: the database schema matches our Haskell data types, the database ensures that invalid data can never make it into our data store, and everything is generally awesome. Well, until: You want to pull data from the database, and the database layer gives you the data in an untyped format. You want to find everyone older than 32, and you accidently write “thirtytwo” in your SQL statement.
…
Suppose that we now want to add a field for a person’s favorite programming language: {-# LANGUAGE QuasiQuotes, TypeFamilies, GeneralizedNewtypeDeriving, TemplateHaskell, OverloadedStrings, GADTs, FlexibleContexts #-} import Database.Persist import Database.Persist.Sqlite import Database.Persist.TH import Data.Time share [mkPersist sqlSettings, mkMigrate "migrateAll"] [persist| Person name String age Int Maybe created UTCTime default=now() language String default='Haskell' |] main = withSqliteConn ":memory:" $ runSqlConn $ do runMigration migrateAll Note The default attribute has absolutely no impact on the Haskell code itself; you still need to fill in all values. This will only affect the database schema and automatic migrations. We need to surround the string with single quotes so that the database can properly interpret it. Finally, Persistent can use double quotes for containing white space, so if we want to set someone’s default home country to be El Salvador: {-# LANGUAGE QuasiQuotes, TypeFamilies, GeneralizedNewtypeDeriving, TemplateHaskell, OverloadedStrings, GADTs, FlexibleContexts #-} import Database.Persist import Database.Persist.Sqlite import Database.Persist.TH import Data.Time share [mkPersist sqlSettings, mkMigrate "migrateAll"] [persist| Person name String age Int Maybe created UTCTime default=now() language String default='Haskell' country String "default='El Salvador'" |] main = withSqliteConn ":memory:" $ runSqlConn $ do runMigration migrateAll One last trick you can do with attributes is to specify the names to be used for the SQL tables and columns.
Programming Computer Vision with Python
by
Jan Erik Solem
Published 26 Jun 2012
Indexing images in this context means extracting descriptors from the images, converting them to visual words using a vocabulary, and storing the visual words and word histograms with information about which image they belong to. This will make it possible to query the database using an image and get the most similar images back as search result. Table 7-1. A simple database schema for storing images and visual words. imlist imwords imhistograms rowid imid imid filename wordid histogram vocname vocname Here we will use SQLite as database. SQLite is a database that stores everything in a single file and is very easy to set up and use. We are using it here since it is the easiest way to get started without having to go into database and server configurations and other details way outside the scope of this book.
…
SQLite is imported from the pysqlite2 module (see Appendix A for installation details). The Indexer class connects to a database and stores a vocabulary object upon creation (where the __init__() method is called). The __del__() method makes sure to close the database connection and db_commit() writes the changes to the database file. We only need a very simple database schema of three tables. The table imlist contains the filenames of all indexed images, and imwords contains a word index of the words, which vocabulary was used, and which images the words appear in. Finally, imhistograms contains the full word histograms for each image. We need those to compare images according to our vector space model.
Team Topologies: Organizing Business and Technology Teams for Fast Flow
by
Matthew Skelton
and
Manuel Pais
Published 16 Sep 2019
Such applications are typically deployed as a unit, often causing headaches for users (the application is not available during deployment) and operators (unexpected issues because the production environment is a moving target; even if we tested the monolith in an environment similar to production, it has surely drifted since then). Joined-at-the-Database Monolith A joined-at-the-database monolith is composed of several applications or services, all coupled to the same database schema, making them difficult to change, test, and deploy separately. This monolith often results from the organization viewing the database, not the services, as the core business engine. It’s common to find that one or more database-administration (DBA) teams were put in place to not only maintain the database but also coordinate changes to the database—a task they are often understaffed for—and they become a large bottleneck to delivery.
…
germane cognitive load: relates to aspects of the task that need special attention for learning or high performance (e.g., “How should this service interact with the ABC service?”). intrinsic cognitive load: relates to aspects of the task fundamental to the problem space (e.g., “What is the structure of a Java class?” “How do I create a new method?”). joined-at-the-database monolith: composed of several applications or services all coupled to the same database schema, making them difficult to change, test, and deploy separately. monolithic build: uses one gigantic continuous integration (CI) build to get a new version of a component. monolithic model: software that attempts to force a single domain language and representation (format) across many different contexts.
Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems
by
Martin Kleppmann
Published 16 Mar 2017
If you use Avro, you can fairly easily generate an Avro schema (in the JSON representation we saw earlier) from the relational schema and encode the database contents using that schema, dumping it all to an Avro object container file [25]. You generate a record schema for each database table, and each column becomes a field in that record. The column name in the database maps to the field name in Avro. Now, if the database schema changes (for example, a table has one column added and one column removed), you can just generate a new Avro schema from the updated database schema and export data in the new Avro schema. The data export process does not need to pay any attention to the schema change—it can simply do the schema conversion every time it runs. Anyone who reads the new data files will see that the fields of the record have changed, but since the fields are identified by name, the updated writer’s schema can still be matched up with the old reader’s schema.
…
In a document database, you would just start writing new documents with the new fields and have code in the application that handles the case when old documents are read. For example: if (user && user.name && !user.first_name) { // Documents written before Dec 8, 2013 don't have first_name user.first_name = user.name.split(" ")[0]; } On the other hand, in a “statically typed” database schema, you would typically perform a migration along the lines of: ALTER TABLE users ADD COLUMN first_name text; UPDATE users SET first_name = split_part(name, ' ', 1); -- PostgreSQL UPDATE users SET first_name = substring_index(name, ' ', 1); -- MySQL Schema changes have a bad reputation of being slow and requiring downtime.
…
Anyone who reads the new data files will see that the fields of the record have changed, but since the fields are identified by name, the updated writer’s schema can still be matched up with the old reader’s schema. By contrast, if you were using Thrift or Protocol Buffers for this purpose, the field tags would likely have to be assigned by hand: every time the database schema changes, an administrator would have to manually update the mapping from database column names to field tags. (It might be possible to automate this, but the schema generator would have to be very careful to not assign previously used field tags.) This kind of dynamically generated schema simply wasn’t a design goal of Thrift or Protocol Buffers, whereas it was for Avro.
Python Web Development With Django
by
Jeff Forcier
Introduction Ichange f you’re a Web developer, a programmer who creates Web sites, then Django just might your life. It has certainly changed ours. Anyone with even a little experience building dynamic Web sites knows the pain of reinventing certain standard features over and over.You need to create database schemas. You need to get data into and out of the database.You need to parse URLs.You need to sanitize input.You need to provide content-editing tools.You need to attend to security and usability.And so on. Where Web Frameworks Come From At some point you realize the wastefulness of reimplementing all these features on every new project; you decide to code your own libraries from scratch to provide them—or, more likely, you extract those libraries from your latest and greatest creation.Thereafter, when you start a new project, the first thing you do is install your library code.
…
Modeling a real-world problem in an object-oriented system is often a comparatively simple task, but for high-traffic Web sites the most realistic model isn’t always the most efficient. The model encompasses a wide range of potential pitfalls, one of which is changing the model code after your application has been deployed.Although you are “just changing code,” you have actually altered your database schema under the covers, and this often causes side effects to the preexisting data stored in the database.We go over many of these real-life concerns in the chapters ahead when exploring the design of some example applications. Views Views form much (sometimes most or all) of the logic in Django applications.Their definition is deceptively simple: Python functions are linked to one or more defined URLs, which return HTTP response objects.What happens in-between those two endpoints of Django’s HTTP mechanisms is entirely up to you.
…
Instead, it makes sure all model classes are represented as database tables, creating new tables as necessary— but not altering existing ones. Database “Synchronization” The reasoning behind syncdb’s behavior is Django’s core development team strongly believes one’s production data should never be at the mercy of an automated process. Additionally, it is a commonly held belief that changes to database schemas should only be performed when a developer understands SQL well enough to execute those changes by hand. The authors tend to agree with this approach; a better understanding of underlying technology is always preferable when developing with higher-layer tools. At the same time, an automatic or semi-automatic schema change-set mechanism (such as Rails’ migrations) can often speed up the development process.
Kanban: Successful Evolutionary Change for Your Technology Business
by
David J. Anderson
Published 6 Apr 2010
While this may sound trivial, it often isn’t. Often it is necessary to plan an elaborate procedure to switch off databases, application servers, and other systems gracefully, and then upgrade them and bring them all back again. One of the biggest issues is data migration from one generation of a database schema to another. Databases can get very large. The process of serializing the data to a file, parsing it, unpacking it, perhaps embellishing or augmenting it with other data, then re-parsing it and unpacking it into a new schema can take hours—perhaps even days. In some environments, software deployment can take hours or days.
…
More often than that is common in fast-moving domains such as media, where the release cycles may be very frequent. Delivery/Release Now we must agree to a similar thing with downstream partners. A delivery cadence that makes sense is very specific to a domain or situation. If it’s web-based software, we have to deploy to a server farm. Deployment involves copying files and perhaps upgrading a database schema and then migrating data from one version of the schema to another. This data migration will probably have its own code and it will take its own time to execute. To calculate the total deployment time, you will need to factor in how many servers, how many files to copy, how long it takes to pull systems down gracefully and reboot them, how long data takes to migrate, and so forth.
PostGIS in Action
by
Regina O. Obe
and
Leo S. Hsu
Published 2 May 2015
CREATE SCHEMA ch01; In PostgreSQL it’s very easy to back up selected schemas and also to set up permissions based on schemas. You could, for example, have a big schema of fairly static data that you exclude from your daily backups, and also divide schemas along user groups so that you can allow each group to manage their own schema set of data. The postgis_in _action database schemas are chapter-themed so that it’s easy to download just the set of data you need for a specific chapter. Refer to appendix D for more details about schemas and security management. Restaurants table Next, you need to create a lookup table to map franchise codes to meaningful names. You can then add all the franchises you’ll be dealing with.
…
The main downside of Imposm is that it only runs on Linux and OS X, and not on Windows, whereas osm2pgsql has support for all the platforms PostGIS runs on. Imposm also currently doesn’t support OSM differential loads, so it’s not a good solution if you have a large OSM data set and want to perform differential updates as things get updated on OSM. One key benefit of Imposm is that it supports custom database schemas, so you can better control the table structures of your data as part of the load process. You can find out more about Imposm at http://imposm.org. Visit www.openstreetmap.org/export to download both utilities. Now that we’ve covered the common free options available for loading data, let’s test these tools. 4.4.1.
…
Open up the file in an editor and make changes to the path and database connection as appropriate. Once you’ve made the changes and double-checked your typing, go ahead and execute the script file from your OS command line. After you’ve executed the script, you should see a couple of new tables in your tiger_data database schema. The two of most importance are states_all and county_all. Now that you’ve had some practice at loading data, let’s generate a script to load individual state data. If you’re on Linux, Unix, or Mac, run listing 8.4. If you’re on Windows, run listing 8.5. If you created a custom profile in the tiger.loader_platform table, use that name instead of sh or windows.
Testing Extreme Programming
by
Lisa Crispin
and
Tip House
Published 15 Apr 2003
This is definitely an area where you may need to wait for the system to be available and then play with it to get an idea of how to abuse it. These kinds of tests require a lot of creativity; have fun with them. Sometimes it's tough to even think how to test a particular item. For example, what if a story implements a data model within the database schema but no story uses it in a user interface? It may be that this type of test is better handled by the unit testsit may not even be appropriate to have a separate acceptance test. Consider whether it impinges on the customer in some indirect way, and design tests for that. After all, if the customer really can't tell whether the story has been implemented or not, that should be a pretty quick story to implement, eh?
…
Each test was unique code from start to finish. A SubmitRequest for content item A would not work for submitting a request for content item B. Very little code could be shared between test scripts. The team was next asked to develop a similar application for a different Web site but was allowed to design the database schema. They were able to make this application much more generic, using only four tables. The object model needed only nine classes. Now the SubmitRequest test script worked for every type of content. This made test script development go much faster, and the scripts were vastly easier to maintain. Rethinking the design for all the content management, the team came up with a design needing just two tables, where a single definition of content could be used to represent all types of information.
Beautiful Data: The Stories Behind Elegant Data Solutions
by
Toby Segaran
and
Jeff Hammerbacher
Published 1 Jul 2009
This is perhaps oversimplifying our database design process, however. I should back up a bit. We are a group of 10 or so graduate students with our own research interests, and as expected, work on individual components of PEIR. This affected how we work a great deal. PEIR data was very scattered to begin with. We did not use a unified database schema; we created multiple databases as we needed them, and did not follow any specific design patterns. If anyone joined PEIR during this mid-early stage, he would have been confused by where and what all the data was and who to contact to find out. I say this because I joined the PEIR project midway.
…
It became quite clear that this consolidation of code and schemas was necessary once user experience development began. In retrospect, it would have been worth the extra effort to take a more calculated approach to data storage in the early goings, but such is the nature of graduate studies. Coordination and code consolidation are not an issue with YFD, since there is only one developer. I can change the database schema, user interface, and data collection mechanism with little fuss. I also use Django, a Python web framework, which uses a modelview-control approach and allows for rapid and efficient development. I do, however, have to do everything myself. Because of the group’s diversity in statistics, computer science, engineering, GIS, and environmental science, PEIR is able to accomplish more— most notably in the area of data processing, as discussed in the next section.
…
To balance the need for effective provenance collection with a time-sensitive, highly dynamic workflow, modern approaches to data capture are required: we never want software to be the rate-limiting step in a workflow. Flexible Data Capture Instead of classical database refactoring, typically involving adding, removing, or renaming fields on a database schema, the high-throughput production pipeline at Sanger uses a collection of data modeling tools that allow data descriptions to be changed at runtime. When a laboratory protocol is updated to include a new set of instruments or a faster approach to an existing step, the provenance information captured that relates to that task can be restructured on the fly by laboratory staff.
Bad Data Handbook
by
Q. Ethan McCallum
Published 14 Nov 2012
For example, in a comma-separated file, the newline character separates records, and commas separate fields. Picking a delimiter lives in a class of problems that sucks more energy than is possible to imagine. I know of a production system that uses four bars (||||) to delimit strings, to work around not being able to modify the database schema and a fear of modifying the underlying data to escape text. Paying homage to http://xkcd.com/927/, the ASCII standard includes delimiter characters that are almost never used. ASCII 30 is the record separator and ASCII 31 is designated for delimiting fields. In principle, user-facing systems reading in this data could neatly align things together in a tabular form.
…
Whatever our schema, using SQL and RDBMS technology to solve graph problems courts an impedance mismatch often seen in object-oriented systems backed by relational databases. Impedance matching originates in the field of electrical engineering and is the process of matching source and load impedance to maximize power transfer and minimize signal reflection in the transmission line. The object-relational impedance mismatch is a metaphor applying when a relational database schema (source) is at odds with the target object model in object-oriented languages such as Java or Ruby (load), necessitating a mapping between the two (signal reflection/power loss). Object-relational mapping (ORM) frameworks such as Hibernate, MyBatis, and ActiveRecord attempt to abstract away this complexity.
Seven Databases in Seven Weeks: A Guide to Modern Databases and the NoSQL Movement
by
Eric Redmond
,
Jim Wilson
and
Jim R. Wilson
Published 7 May 2012
We’ll begin with the many ways that PostgreSQL can search actor/movie names using fuzzy string matching. Then we’ll discover the cube package by creating a movie suggestion system based on similar genres of movies we already like. Since these are all contributed packages, the implementations are special to PostgreSQL and not part of the SQL standard. Commonly, when designing a relational database schema, you’ll start with an entity diagram. We’ll be writing a personal movie suggestion system that keeps track of movies, their genres, and their actors, as modeled in Figure 6, Our movie suggestion system. * * * Figure 6. Our movie suggestion system As a reminder, on Day 1 we installed several contributed packages.
…
Also, because of the nature of documents, they often map well to object-oriented programming models. This means less impedance mismatch when moving data between the database model and application model. Not-So-Good For: If you’re used to performing elaborate join queries on highly normalized relational database schemas, you’ll find the capabilities of document databases lacking. A document should generally contain most or all of the relevant information required for normal use. So while in a relational database you’d naturally normalize your data to reduce or eliminate copies that can get out of sync, with document databases, denormalized data is the norm.
PostGIS in Action, 2nd Edition
by
Regina O. Obe
and
Leo S. Hsu
Published 2 May 2015
CREATE SCHEMA ch01; In PostgreSQL it’s very easy to back up selected schemas and also to set up permissions based on schemas. You could, for example, have a big schema of fairly static data that you exclude from your daily backups, and also divide schemas along user groups so that you can allow each group to manage their own schema set of data. The postgis_in _action database schemas are chapter-themed so that it’s easy to download just the set of data you need for a specific chapter. Refer to appendix D for more details about schemas and security management. RESTAURANTS TABLE Next, you need to create a lookup table to map franchise codes to meaningful names. You can then add all the franchises you’ll be dealing with.
…
The main downside of Imposm is that it only runs on Linux and OS X, and not on Windows, whereas osm2pgsql has support for all the platforms PostGIS runs on. Imposm also currently doesn’t support OSM differential loads, so it’s not a good solution if you have a large OSM data set and want to perform differential updates as things get updated on OSM. One key benefit of Imposm is that it supports custom database schemas, so you can better control the table structures of your data as part of the load process. You can find out more about Imposm at http://imposm.org. Visit www.openstreetmap .org/export to download both utilities. Now that we’ve covered the common free options available for loading data, let’s test these tools. 4.4.1 Getting OSM data You can choose to download and load the whole OSM database, which is about 16 GB in size, or you can download extracts of key areas such as those available at http:// download.geofabrik.de/ or http://metro.teczno.com.
…
Open up the file in an editor and make changes to the path and database connection as appropriate. Once you’ve made the changes and double-checked your typing, go ahead and execute the script file from your OS command line. After you’ve executed the script, you should see a couple of new tables in your tiger_data database schema. The two of most importance are states_all and county_all. Now that you’ve had some practice at loading data, let’s generate a script to load individual state data. If you’re on Linux, Unix, or Mac, run listing 8.4. If you’re on Windows, run listing 8.5. If you created a custom profile in the tiger.loader_platform table, use that name instead of sh or windows.
Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems
by
Martin Kleppmann
Published 17 Apr 2017
If you use Avro, you can fairly easily generate an Avro schema (in the JSON representation we saw earlier) from the relational schema and encode the database contents using that schema, dumping it all to an Avro object container file [25]. You generate a record schema for each database table, and each column becomes a field in that record. The column name in the database maps to the field name in Avro. Now, if the database schema changes (for example, a table has one column added and one column removed), you can just generate a new Avro schema from the updated database schema and export data in the new Avro schema. The data export process does not need to pay any attention to the schema change—it can simply do the schema conversion every time it runs. Anyone who reads the new data files will see that the fields of the record have changed, but since the fields are identified by name, the updated writer’s schema can still be matched up with the old reader’s schema.
…
In a document database, you would just start writing new documents with the new fields and have code in the application that handles the case when old documents are read. For example: if (user && user.name && !user.first_name) { // Documents written before Dec 8, 2013 don't have first_name user.first_name = user.name.split(" ")[0]; } On the other hand, in a “statically typed” database schema, you would typically per‐ form a migration along the lines of: ALTER TABLE users ADD COLUMN first_name text; UPDATE users SET first_name = split_part(name, ' ', 1); -- PostgreSQL UPDATE users SET first_name = substring_index(name, ' ', 1); -- MySQL Schema changes have a bad reputation of being slow and requiring downtime.
…
Anyone who reads the new data files will see that the fields of the record have changed, but since the fields are identified by name, the updated writer’s schema can still be matched up with the old reader’s schema. By contrast, if you were using Thrift or Protocol Buffers for this purpose, the field tags would likely have to be assigned by hand: every time the database schema changes, an administrator would have to manually update the mapping from data‐ base column names to field tags. (It might be possible to automate this, but the schema generator would have to be very careful to not assign previously used field 126 | Chapter 4: Encoding and Evolution tags.) This kind of dynamically generated schema simply wasn’t a design goal of Thrift or Protocol Buffers, whereas it was for Avro.
REST API Design Rulebook
by
Mark Masse
Published 19 Oct 2011
For example, this URI design: http://api.soccer.restapi.org/leagues/seattle/teams/trebuchet indicates that each of these URIs should also identify an addressable resource: http://api.soccer.restapi.org/leagues/seattle/teams http://api.soccer.restapi.org/leagues/seattle http://api.soccer.restapi.org/leagues http://api.soccer.restapi.org Resource modeling is an exercise that establishes your API’s key concepts. This process is similar to the data modeling for a relational database schema or the classical modeling of an object-oriented system. Before diving directly into the design of URI paths, it may be helpful to first think about the REST API’s resource model. Resource Archetypes When modeling an API’s resources, we can start with the some basic resource archetypes.
Solr 1.4 Enterprise Search Server
by
David Smiley
and
Eric Pugh
Published 15 Nov 2009
Another interesting bit of data MusicBrainz stores is the PUID "lookup count", which is how often it has been requested by their servers—a decent measure of popularity. MusicBrainz uses the proprietary "MusicDNS" audio fingerprinting technology, which was donated to the project by MusicIP. It is not open source. One combined index or multiple indices As mentioned in the first chapter, technically, an index is basically like a single-table database schema. Imagine a massive spreadsheet, if you will. Inspite of this limitation, there is nothing to stop you from putting different types of data (say, artists and tracks from MusicBrainz) into a single index, thereby, in effect mitigating this limitation. All you have to do is use different fields for the different document types, and use a field to discriminate between the types.
…
The field element must have a column attribute that matches the corresponding named column in the SQL query. The name attribute is the Solr schema field name that the column is going into. If it is not specified (and it never is for our example), then it defaults to the column name. Use the SQL as a keyword as we've done to use the same names as the Solr schema instead of the database schema. This reduces the number of explicit mappings needed in <field/> elements and shortens existing ones. When a column in the result can be placed directly into Solr without further processing, there is no need to specify the field declaration, because it is implied. An attribute of the entity declaration that we didn't mention yet is transformer.
SQL Hacks
by
Andrew Cumming
and
Gordon Russell
Published 28 Nov 2006
The marshalling of columns for transport, sending them to your code, and then splitting them up again all require CPU time and effort. You could perform the addition in the database system and save all that code. If lots of code needs those rows added, you could put that processing into a view. This might be more reliable. It would certainly promote code reuse, and would offer a single point of control so that when the database schema is altered, you've just got to modify the view. Put another way, the choice comes down to this: mysql> CREATE TABLE widetable (a INT,b INT, c INT, d INT, e INT); Query OK, 0 rows affected (0.00 sec) mysql> INSERT INTO widetable VALUES (5,10,19,11,3); Query OK, 1 row affected (0.00 sec) mysql> SELECT a,b,c,d,e FROM widetable; +------+------+------+------+------+ | a | b | c | d | e | +------+------+------+------+------+ | 5 | 10 | 19 | 11 | 3 | +------+------+------+------+------+ 1 row in set (0.00 sec) Then, in your programming language, do (5+10+19+11+3), throw away a,b,c,d,e, and just use the value 48 or this: mysql> SELECT a+b+c+d+e AS v FROM widetable; +------+ | v | +------+ | 48 | +------+ 1 row in set (0.00 sec) In addition to arithmetic, other calculations may best be done in the database.
…
Another approach is to have the table names combined with a program variable, and then to set the variable to the new prefix. 11.3.2. Keep Your CREATE Script When I am working on a new application one of the first things I do is to create a file called mktable.sql that has the CREATE script for every table being used. Of course, I never get the database schema right the first time, so I have to edit and run the mktable.sql script many times before I have it the way I want it: DROP TABLE casting; DROP TABLE movie; DROP TABLE actor; CREATE TABLE actor (id INTEGER NOT NULL ,name VARCHAR(35) ,PRIMARY KEY (id) ); CREATE INDEX actor_name ON actor(name); CREATE TABLE movie (id INTEGER NOT NULL ,title VARCHAR(70) ,yr DECIMAL(4) ,score FLOAT ,votes INTEGER ,director INTEGER ,PRIMARY KEY (id) ,FOREIGN KEY (director) REFERENCES actor(id) ); CREATE INDEX movie_title ON movie(title); CREATE TABLE casting (movieid INTEGER NOT NULL ,actorid INTEGER NOT NULL ,ord INTEGER ,PRIMARY KEY (movieid, actorid) ,FOREIGN KEY (movieid) REFERENCES movie(id) ,FOREIGN KEY (actorid) REFERENCES actor(id) ); Notice that the DROP commands come before the CREATE commands.
Clojure Programming
by
Chas Emerick
,
Brian Carper
and
Christophe Grand
Published 15 Aug 2011
As a bonus, eliminating synthetic identifiers means that different threads or even processes no longer need to communicate to assign unique identifiers to chunks of data, because you can derive natural identifiers from the data at hand. The debate of the respective virtues of synthetic keys versus natural keys is not new, but is still very much alive when it comes to database schema design. The arguments for synthetic identifiers revolve around two preoccupations: that data is going to survive the process that created it (and the rules it embodies), and that data is hard to refactor. The implication is that synthetic identifiers are of little use for in-process or otherwise application-level computation.
…
If all we want is to run a query and fetch the results in their entirety, we can easily set up a utility function to do this for us: (defn fetch-results [db-spec query] (jdbc/with-connection db-spec (jdbc/with-query-results res query (doall res)))) ;= #'user/fetch-results (fetch-results db-spec ["SELECT * FROM authors"]) ;= ({:id 1, :first_name "Chas", :last_name "Emerick"} ;= {:id 2, :first_name "Christophe", :last_name "Grand"} ;= {:id 3, :first_name "Brian", :last_name "Carper"}) Transactions Performing database operations as a single transaction is a simple matter of wrapping your code in a transaction form. transaction accepts any number of forms and executes each in turn as part of a single transaction. If an exception is thrown, or if the operation(s) performed will violate a constraint established in your database schema, the transaction will be rolled back. If the body of transaction finishes executing without error, the transaction is committed. (jdbc/with-connection db-spec (jdbc/transaction (jdbc/delete-rows :authors ["id = ?" 1]) (throw (Exception. "Abort transaction!")))) ;= ; Exception Abort transaction!
…
DOCTYPE hibernate-configuration SYSTEM "http://hibernate.sourceforge.net/hibernate-configuration-3.0.dtd"> <hibernate-configuration> <session-factory> <property name="hibernate.connection.driver_class">org.sqlite.JDBC</property> <property name="hibernate.connection.url">jdbc:sqlite::memory:</property> <property name="hibernate.dialect">org.hibernate.dialect.HSQLDialect</property> <!-- Drop and re-create the database schema on startup --> <property name="hbm2ddl.auto">create</property> <mapping class="com.clojurebook.hibernate.Author"/> </session-factory> </hibernate-configuration> Finally, if we’re using Leiningen, we just need to add a couple of keys to our project.clj; one to indicate the source root for Java code we’d like it to compile (i.e., Author.java), and another to indicate the resources root, where we placed our hibernate.cfg.xml file: :java-source-path "java" :resources-path "rsrc" This leaves our project structure looking like so: |-- project.clj |-- rsrc | `-- hibernate.cfg.xml |-- java | `-- com | `-- clojurebook | `-- hibernate | |-- Author.java We can now compile our Java class and start a REPL to start seeing how we can use Hibernate from Clojure: % lein javac ... % lein repl We first need to import the required Java classes from Hibernate, and our new Author class: (import 'org.hibernate.SessionFactory 'org.hibernate.cfg.Configuration 'com.clojurebook.hibernate.Author) Hibernate requires setting up a session factory object to allow us to open and close database sessions and run queries.
The Art of SQL
by
Stephane Faroult
and
Peter Robson
Published 2 Mar 2006
All technological solutions are merely means to an end; the great danger for the inexperienced developer is that the attractions of the latest technology become an end in themselves. And the danger is all the greater for enthusiastic, curious, and technically minded individuals! Important Foundations before Fashion: learn your craft before playing with the latest tools. Stable Database Schema The use of data definition language (DDL) to create, alter, or drop database objects inside an application is a very bad practice that in most cases should be banned. There is no reason to dynamically create, alter, or drop objects, with the possible exception of partitions—which I describe in Chapter 5--and temporary tables that are known to the DBMS to be temporary tables.
…
We therefore need three more tables: people to store information such as name, first name, sex, year of birth, and so on; roles to define how people may contribute to a movie (actor, director, but also composer, director of photography, and the like); and movie_credits to state who was doing what in which movie. Figure 8-3 shows our complete movie schema. Figure 8-3. The movie database schema Let's suppose now that we want to let people search movies in our database by specifying either: words from the title, the name of the director, or up to three names of any of the actors. Following is the source of our prototype page, which I have built in HTML to act as our screen display: <html> <head> <title>Movie Database</title> </head> <body> <CENTER> <HR> <BR> Please fill the form to query our database and click on <b>Search</b> when you are done...
Seeking SRE: Conversations About Running Production Systems at Scale
by
David N. Blank-Edelman
Published 16 Sep 2018
Instead, the go-to solution should be to deploy self-service capabilities at those handoff points. These self-service capabilities should provide pull-based interfaces to whatever was previously needed to be done by someone on the other end of a request queue (investigating performance issues, changing network/firewall settings, adding capacity, updating database schemas, restarts, etc.). The point of self-service is to stay out of the way of people who need an operations task completed. Rather than having someone fill out a ticket and sitting in a request queue, you give them a GUI, API, or command-line tool to do it themselves, when they need to do it. This capability eliminates wait time, shortens feedback loops, avoids miscommunication, and improves the labor capacity of the teams that previously had to field those requests, freeing them from repetitive requests so that they can focus on value-adding engineering work.
…
The team had always blamed the customers for going over its contracted rates, which is like a parent blaming their infant for eating dirt. Karen rightly expected that her nonoperations background would have been a benefit to the team. It’s not always clear whether a problem will require understanding a database schema, Ruby debugging, C++ performance understanding, product knowledge, or people skills. Karen proposed a new design based on technology she’d used during her internship. Her coworkers were unfamiliar with the new technology and immediately considered it too risky. Karen dropped her proposal without discussion.
…
Teams faced with supporting software written in multiple languages, with different underlying technologies and frameworks, spend a huge amount of time trying to understand the system and consequently have less time to improve it. To reduce complexity, software engineers deploy more and more abstractions. Abstractions can be like quicksand. Object/relational mappers (ORMs) are a wonderful example of a tool that can make a developer’s life easy by reducing the amount of time thinking about database schemas. By obviating the need for developers to understand the underlying schema, developers no longer consider how ORM changes impact production performance. Operations now need to understand the ORM layer and why it impacts the database. Monolithic designs are often easier to develop and extend than microservices.
Backup & Recovery
by
W. Curtis Preston
Published 9 Feb 2009
Finding out if the datafile contains a rollback segment is a little more difficult, but it is still possible. If you use rman with a recovery catalog, you can connect to the catalog, set the DBID of the database, and issue the report schema command. This command displays the files, locations, and whether they contain rollback segments. RMAN> report schema; Report of database schema File Size(MB) Tablespace RB segs Datafile Name ---- ---------- ---------------- ------- ------------------- 1 307200 SYSTEM NO /oracle/oradata/trgt/system01.dbf 2 20480 UNDOTBS YES /oracle/oradata/trgt/undotbs01.dbf 3 10240 CWMLITE NO /oracle/oradata/trgt/cwmlite01.dbf 4 10240 DRSYS NO /oracle/oradata/trgt/drsys01.dbf 5 10240 EXAMPLE NO /oracle/oradata/trgt/example01.dbf 6 10240 INDX NO /oracle/oradata/trgt/indx01.dbf 7 10240 TOOLS NO /oracle/oradata/trgt/tools01.dbf 8 10240 USERS NO /oracle/oradata/trgt/users01.dbf To find out which datafiles are in the system tablespace, you’ll need to query sys.dba_data_files: SQL> select file_name, tablespace_name from sys.dba_data_files; /oracle/oradata/trgt/system01.dbf SYSTEM /oracle/oradata/trgt/undotbs01.dbf UNDOTBS /oracle/oradata/trgt/cwmlite01.dbf CWMLITE /oracle/oradata/trgt/drsys01.dbf DRSYS /oracle/oradata/trgt/example01.dbf EXAMPLE /oracle/oradata/trgt/indx01.dbf INDX /oracle/oradata/trgt/tools01.dbf TOOLS /oracle/oradata/trgt/users01.dbf USERS This example report shows three datafiles that are members of the system , sysaux, and undo tablespaces.
…
restoring, Restoring an RDBMS, Loss of Any Nondata Disk, Loss of a Data Disk, Loss of a Data Disk, Online Partial Restores, Documentation and Testing documentation and testing, Documentation and Testing loss of a data disk, Loss of a Data Disk loss of a nondata disk, Loss of Any Nondata Disk loss of the master database, Loss of a Data Disk online partial restores, Online Partial Restores rollback log, Rollback log row (tuple), Row segments, Tablespace set up information, recording, Take an Inventory tables, Table tablespace, Extents transaction, Transaction transaction log, Transaction log transaction logs, importance of, Don’t Go Overboard view, Table what can go wrong, What Can Happen to an RDBMS? DB2, DB2 Architecture, Managing archive logs, Instance, Databases, Schemas, Tables, Views, Indexes, DB2 engine dispatch units, System catalog tables, Database partition, Database partition, Containers, Tablespaces, Tablespaces, Tablespaces, Large objects (LOBs), Large objects (LOBs), Transaction logs, Transaction logs, Managing archive logs, Managing archive logs, Managing archive logs, The backup Command, Backup levels, Backup levels, Backup levels, Backup path and filenaming convention, Backup path and filenaming convention, Discovering the history of your backup operations, Automatic maintenance, Automatic maintenance, Crash recovery, Version recovery, Rollforward recovery, The restore Command, The rollforward Command, The recover Command, Performing an In-Place Version Recovery, Performing a Redirected Version Recovery, Performing a Rollforward Recovery, Performing a Rollforward Recovery, Reorganizing Data and Collecting Statistics architecture, DB2 Architecture, Managing archive logs, Instance, Databases, Schemas, Tables, Views, Indexes, DB2 engine dispatch units, System catalog tables, Database partition, Database partition, Containers, Tablespaces, Tablespaces, Tablespaces, Large objects (LOBs), Large objects (LOBs), Transaction logs, Transaction logs, Managing archive logs, Managing archive logs archive logging, Managing archive logs circular logging, Managing archive logs container, Containers database, Databases database managed spaces (DMS), Tablespaces engine dispatch units (EDUs), DB2 engine dispatch units index, Indexes instance, Instance large (or long) tablespace, Large objects (LOBs) large objects (LOBs), Large objects (LOBs) partition, Database partition partition group, Database partition schema, Schemas system catalog tables, System catalog tables system managed spaces (SMS), Tablespaces tables, Tables tablespaces, Tablespaces transaction logs, Transaction logs view, Views write ahead logging, Transaction logs automated backup utilities, Automatic maintenance backup command, The backup Command, Backup levels, Backup levels, Backup levels, Backup path and filenaming convention, Discovering the history of your backup operations delta backup, Backup levels full backup, Backup levels incremental backup, Backup levels pathname and filename, Backup path and filenaming convention recovery history file, Discovering the history of your backup operations crash recovery, Crash recovery health monitor, Automatic maintenance in-place version recovery, Performing an In-Place Version Recovery offline backup, Managing archive logs partial backups, Backup path and filenaming convention recover command, The recover Command redirected version recovery, Performing a Redirected Version Recovery restore command, The restore Command rollforward command, The rollforward Command rollforward recovery, Rollforward recovery, Performing a Rollforward Recovery, Performing a Rollforward Recovery point in time (PIT), Performing a Rollforward Recovery statistics, gathering, Reorganizing Data and Collecting Statistics version recovery, Version recovery DB2 UDB, IBM DB2 Backup and Recovery, Reorganizing Data and Collecting Statistics dd utility, Backups change atime, Backing Up and Restoring with the dd Utility, Basic dd Options, Using dd to Copy a File or Raw Device, Using dd to Convert Data, Using dd to Determine the Block Size of a Tape, Using dd to Figure out the Backup Format atime, changing, Backups change atime backup format, Using dd to Figure out the Backup Format convert data with, Using dd to Convert Data copy a file or raw device with, Using dd to Copy a File or Raw Device determine block size of a tape with, Using dd to Determine the Block Size of a Tape options, Basic dd Options de-duplication, De-Duplication Backup Systems, De-duplication backup systems, De-Duplication Backup Systems density versus compression, Density Versus Compression development procedures (backups), following proper, Following Proper Development Procedures differential backups, Backup Levels digital audio tape (DAT), DDS drive Digital Data Storage (DDS), DDS drive Digital Linear Tape (DLT), DLT drives (end-of-lifed) directories, cpio’s Restore Options, Torture-Testing Backup Programs, Pass I, Pass III cpio restore, making needed, cpio’s Restore Options data, writing (dump utility), Pass III dump utility, evaluating for backup, Pass I files, becoming (volatile filesystem backups), Torture-Testing Backup Programs disasters, types of, Be Ready for Anything: 10 Types of Disasters, Be Ready for Anything: 10 Types of Disasters disk drives, Be Ready for Anything: 10 Types of Disasters, Reliability failure, losing data from, Be Ready for Anything: 10 Types of Disasters versus tape media, Reliability disk mirroring, System Recovery and Disk Mirroring disk staging, Disk Targets disk-as-disk targets, Disk-As-Disk Targets disk-as-tape units (virtual tape libraries), Disk-As-Tape: Virtual Tape Libraries disk-based backup, You Want to Learn About Disk-Based Backup disk-to-disk-to-tape (D2D2T) backup, You Want to Learn About Disk-Based Backup disks left powered off, Re-presentation ditto utility, Backing Up and Restoring with the ditto Utility, Syntax of ditto When Backing Up, The Options to the ditto Command, Syntax of ditto when Restoring options, The Options to the ditto Command restoring with, Syntax of ditto when Restoring syntax, Syntax of ditto When Backing Up documentation, Take an Inventory, 12,000 gold pieces, Self-preservation: Document, document, document backups, 12,000 gold pieces importance in backup plan, Take an Inventory restore program, importance of, Self-preservation: Document, document, document downtime, cost of, What Will Downtime Cost You?
…
DB2, DB2 Architecture, Managing archive logs, Instance, Databases, Schemas, Tables, Views, Indexes, DB2 engine dispatch units, System catalog tables, Database partition, Database partition, Containers, Tablespaces, Tablespaces, Tablespaces, Large objects (LOBs), Large objects (LOBs), Transaction logs, Transaction logs, Managing archive logs, Managing archive logs, Managing archive logs, The backup Command, Backup levels, Backup levels, Backup levels, Backup path and filenaming convention, Backup path and filenaming convention, Discovering the history of your backup operations, Automatic maintenance, Automatic maintenance, Crash recovery, Version recovery, Rollforward recovery, The restore Command, The rollforward Command, The recover Command, Performing an In-Place Version Recovery, Performing a Redirected Version Recovery, Performing a Rollforward Recovery, Performing a Rollforward Recovery, Reorganizing Data and Collecting Statistics architecture, DB2 Architecture, Managing archive logs, Instance, Databases, Schemas, Tables, Views, Indexes, DB2 engine dispatch units, System catalog tables, Database partition, Database partition, Containers, Tablespaces, Tablespaces, Tablespaces, Large objects (LOBs), Large objects (LOBs), Transaction logs, Transaction logs, Managing archive logs, Managing archive logs archive logging, Managing archive logs circular logging, Managing archive logs container, Containers database, Databases database managed spaces (DMS), Tablespaces engine dispatch units (EDUs), DB2 engine dispatch units index, Indexes instance, Instance large (or long) tablespace, Large objects (LOBs) large objects (LOBs), Large objects (LOBs) partition, Database partition partition group, Database partition schema, Schemas system catalog tables, System catalog tables system managed spaces (SMS), Tablespaces tables, Tables tablespaces, Tablespaces transaction logs, Transaction logs view, Views write ahead logging, Transaction logs automated backup utilities, Automatic maintenance backup command, The backup Command, Backup levels, Backup levels, Backup levels, Backup path and filenaming convention, Discovering the history of your backup operations delta backup, Backup levels full backup, Backup levels incremental backup, Backup levels pathname and filename, Backup path and filenaming convention recovery history file, Discovering the history of your backup operations crash recovery, Crash recovery health monitor, Automatic maintenance in-place version recovery, Performing an In-Place Version Recovery offline backup, Managing archive logs partial backups, Backup path and filenaming convention recover command, The recover Command redirected version recovery, Performing a Redirected Version Recovery restore command, The restore Command rollforward command, The rollforward Command rollforward recovery, Rollforward recovery, Performing a Rollforward Recovery, Performing a Rollforward Recovery point in time (PIT), Performing a Rollforward Recovery statistics, gathering, Reorganizing Data and Collecting Statistics version recovery, Version recovery DB2 UDB, IBM DB2 Backup and Recovery, Reorganizing Data and Collecting Statistics dd utility, Backups change atime, Backing Up and Restoring with the dd Utility, Basic dd Options, Using dd to Copy a File or Raw Device, Using dd to Convert Data, Using dd to Determine the Block Size of a Tape, Using dd to Figure out the Backup Format atime, changing, Backups change atime backup format, Using dd to Figure out the Backup Format convert data with, Using dd to Convert Data copy a file or raw device with, Using dd to Copy a File or Raw Device determine block size of a tape with, Using dd to Determine the Block Size of a Tape options, Basic dd Options de-duplication, De-Duplication Backup Systems, De-duplication backup systems, De-Duplication Backup Systems density versus compression, Density Versus Compression development procedures (backups), following proper, Following Proper Development Procedures differential backups, Backup Levels digital audio tape (DAT), DDS drive Digital Data Storage (DDS), DDS drive Digital Linear Tape (DLT), DLT drives (end-of-lifed) directories, cpio’s Restore Options, Torture-Testing Backup Programs, Pass I, Pass III cpio restore, making needed, cpio’s Restore Options data, writing (dump utility), Pass III dump utility, evaluating for backup, Pass I files, becoming (volatile filesystem backups), Torture-Testing Backup Programs disasters, types of, Be Ready for Anything: 10 Types of Disasters, Be Ready for Anything: 10 Types of Disasters disk drives, Be Ready for Anything: 10 Types of Disasters, Reliability failure, losing data from, Be Ready for Anything: 10 Types of Disasters versus tape media, Reliability disk mirroring, System Recovery and Disk Mirroring disk staging, Disk Targets disk-as-disk targets, Disk-As-Disk Targets disk-as-tape units (virtual tape libraries), Disk-As-Tape: Virtual Tape Libraries disk-based backup, You Want to Learn About Disk-Based Backup disk-to-disk-to-tape (D2D2T) backup, You Want to Learn About Disk-Based Backup disks left powered off, Re-presentation ditto utility, Backing Up and Restoring with the ditto Utility, Syntax of ditto When Backing Up, The Options to the ditto Command, Syntax of ditto when Restoring options, The Options to the ditto Command restoring with, Syntax of ditto when Restoring syntax, Syntax of ditto When Backing Up documentation, Take an Inventory, 12,000 gold pieces, Self-preservation: Document, document, document backups, 12,000 gold pieces importance in backup plan, Take an Inventory restore program, importance of, Self-preservation: Document, document, document downtime, cost of, What Will Downtime Cost You?
Natural Language Processing with Python and spaCy
by
Yuli Vasiliev
Published 2 Apr 2020
This time, you should get the following result: {'product': 'pizza', 'ptype': 'Greek', 'qty': 2} The value of the 'qty' field is now a digit, and we have a consistent format to send to the database. Preparing Your Database Environment To prepare your database environment, you need to install or obtain access to a database; create the components you’ll need in the database, such as a database schema, a table, and so on; and install a Python module that enables you to interact with the database. Although you can choose any database that can receive and process JSON data, such as an Oracle database, this section uses MySQL. The MySQL database has a long history of supporting the most popular data interchange formats, which are XML and JSON.
PostgreSQL Cookbook
by
Chitij Chauhan
Published 30 Jan 2015
This is useful for comparing what is stored in the database against the definitions in a data or object modeling tool. It is also useful to make sure that you can recreate objects in exactly the correct schema, tablespace, and database with the correct ownership and permissions. To back up all object definitions in all the databases, including roles, tablespaces, databases, schemas, tables, indexes, triggers, functions, constraints, views, ownership, and privileges, you can use the following command in Windows: pg_dumpall --schema-only > c:\pgdump\definitiononly.sql If you want to back up the role definition only, use the following command: pg_dumpall --roles-only > c:\pgdump\myroles.sql If you want to back up tablespace definitions, use the following command: pg_dumpall --tablespaces-only > c:\pgdump\mytablespaces.sql You can also use the pgAdmin tool to backup all the databases on the server, including the roles, users, groups, and tablespaces.
Learning Node.js: A Hands-On Guide to Building Web Applications in JavaScript
by
Marc Wandschneider
Published 18 Jun 2013
Again, one of the trickier parts of Node.js is that it’s constantly under development and improvement; things are still changing quite a bit. You use the 2.x series in this book because it’s much more robust than what was there before. Creating a Schema for the Database When working with MySQL, you need to create a database schema for the application, which you place in schema.sql. The first part of this is to create a database with UTF-8 as the default character set and sorting order, as follows: Click here to view code image DROP DATABASE IF EXISTS PhotoAlbums; CREATE DATABASE PhotoAlbums DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci; USE PhotoAlbums; You then have to create a schema for your Users table, where you place information for registered users of your photo-sharing app.
Flask Web Development: Developing Web Applications With Python
by
Miguel Grinberg
Published 12 May 2014
Flask-SQLAlchemy creates database tables from models only when they do not exist already, so the only way to make it update tables is by destroying the old tables first, but of course this causes all the data in the database to be lost. A better solution is to use a database migration framework. In the same way source code version control tools keep track of changes to source code files, a database migration framework keeps track of changes to a database schema, and then incremental changes can be applied to the database. The lead developer of SQLAlchemy has written a migration framework called Alembic, but instead of using Alembic directly, Flask applications can use the Flask-Migrate extension, a lightweight Alembic wrapper that integrates with Flask-Script to provide all operations through Flask-Script commands.
Kill It With Fire: Manage Aging Computer Systems
by
Marianne Bellotti
Published 17 Mar 2021
If you configure them to talk to the same database, they are tightly coupled (Figure 3-2). Figure 3-2: Tightly coupled services Such coupling creates a few potential problems. To begin with, any of the three services could make a change to the data that breaks the other two services. Any changes to the database schema have to be coordinated across all three services. By sharing a database, you lose the scaling benefit of having three separate services, because as load increases on one service, it is passed down to the database, and the other services see a dip in performance. However, giving each service its own database trades those problems for other potential problems.
Learning SPARQL
by
Bob Ducharme
Published 15 Jul 2011
If you’ve followed along with the examples in this chapter, then you’ve already used one of these combinations of an RDBMS and SPARQL middleware without knowing it: the Linked Moved Database stores its data using the MySQL database and uses the D2RQ server as a middleware layer. D2RQ is free, and once you give it read access to a relational database, it can read the database schema and generate the files it needs to map SPARQL queries to SQL for that database. (These include the resource and property names that you’ll use in your queries, and can be customized.) If the same D2RQ server has this kind of access to multiple databases, a single SPARQL query can ask for data from multiple databases at once, which is not possible with a standard SQL query.
Practical Ext JS Projects With Gears
by
Frank Zammetti
Published 7 Jul 2009
UML class diagram of the DAO class The class starts out with the databaseName field that names the Gears database we’ll be using. The value in this case is TimekeeperExt. Following that we find 15 fields, the value of each of which is an SQL statement. There are three types of entities this application deals with: projects, tasks, and resources. The database schema is simple, as you can see in Figure 4-5, Figure 4-6, and Figure 4-7—one for each of the three tables corresponding to the three entities involved. C ha p ter 4 ■ M a K I N G p r O Je C t M a N a G e M e N t C O O L: t I M e K e e p e r e X t Figure 4-5. Table structure of the projects table Figure 4-6.
…
To do that we’ll have to play with Gears and SQLite a bit: var db = google.gears.factory.create("beta.database"); db.open(inDatabaseName); var rs = db.execute ( "SELECT name, sql FROM sqlite_master where type='table';" ); 393 394 Ch apt er 7 ■ YOU r Da D ha D a W Or K B eNC h , N O W S O D O Y O U : S Q L W O r K B e N C h Figure 7-10. The Tables Window Every SQLite database has a special read-only table named sqlite_master in it. The data in this table describes the database schema. The structure of this table is always what is shown in Table 7-1. Table 7-1. The Structure of the sqlite_master Table Field Description type Tells the type of entity the record describes. For our purposes, the only value we care about is table. name Gives you the name of the entity the record describes.
Business Metadata: Capturing Enterprise Knowledge
by
William H. Inmon
,
Bonnie K. O'Neil
and
Lowell Fryman
Published 15 Feb 2008
For more detail in this regard, see his book Data Model Patterns: Conventions of Thought1. It is therefore the consensus of the authors that an ER model constitutes business metadata because it communicates relationships between concepts in both a graphical and linguistic way. Parenthetically, ER models are also used to generate technical metadata (database schema), and to communicate relationships between data elements to technical people: database designers, database administrators (DBAs), and developers. 11.4.5 Conceptual Model, RDF and OWL, Topic Map, UML 11.4.5.1 Conceptual Model The techniques we have just discussed (glossaries, thesauri, and data models) are used to capture business metadata in ways that can be understood by human beings.
Blitzscaling: The Lightning-Fast Path to Building Massively Valuable Companies
by
Reid Hoffman
and
Chris Yeh
Published 14 Apr 2018
You see, Minna specializes in taking a successful software product and helping it go global. She has a very particular set of skills that she has acquired over a long career dating back to the dot-com era. She knows exactly what a software development and product team needs to do in order to make Internet software work in different languages and markets, in areas ranging from database schemas to user interface. She then works with a cross-functional team to implement these changes in advance of a global rollout. It’s not easy to find people who fit your needs so perfectly; you can’t just go on LinkedIn and filter by “preferred stage of blitzscaling.” (Though come to think of it, that might not be a bad idea…) You’ll probably have to rely on your network for recommendations, which is where your investors and board of directors can help.
Programming TypeScript
by
Boris Cherny
Published 16 Apr 2019
Backend Frameworks When you build an application that interacts with a database, you might start with raw SQL or API calls, which are inherently untyped: // PostgreSQL, using node-postgres let client = new Client let res = await client.query( 'SELECT name FROM users where id = $1', [739311] ) // any // MongoDB, using node-mongodb-native db.collection('users') .find({id: 739311}) .toArray((err, user) => // user is any ) With a bit of manual typing you can make these APIs safer and get rid of most of your anys: db.collection('users') .find({id: 739311}) .toArray((err, user: User) => // user is any ) However, raw SQL APIs are still fairly low-level, and it’s still easy to use the wrong type, or forget a type and accidentally end up with anys. That’s where object-relational mappers (ORMs) come in. ORMs generate code from your database schema, giving you high-level APIs to express queries, updates, deletions, and so on. In statically typed languages, these APIs are typesafe, so you don’t have to worry about typing things correctly and manually binding generic type parameters. When accessing your database from TypeScript, consider using an ORM.
Hands-On RESTful API Design Patterns and Best Practices
by
Harihara Subramanian
Published 31 Jan 2019
Metrics This Java library acts as a powerful toolkit and offers ways to measure the behavior of components that are deployed in production. Combined with other prominent libraries, such as Ehcache and Graphite, Metrics stands as it provides full-stack visibility to our RESTful API and web applications. Liquibase Dropwizard includes the open source solution for managing revisions of database-schema scripts. Liquibase supports various types of database, and various file formats for defining the DB structure. The highlight of Liquibase is its ability to roll back changes (forward and backward) from a specific point. Other noteworthy features Before we get into its advantages and disadvantages, let's take a glimpse at other noteworthy libraries in Dropwizard: Logging: Logback and slf4j for performant and flexible logging Hibernate validator: Offers easy ways for user input validation, along with capabilities of generating i18n-friendly error messages Http interactions: Bundled with Apache HttpClient and Jersey client libraries, it helps low-level and high-level HTTP interactions with other web services JDBI: Provides a simple, comfortable, and straightforward way to establish Relational database connectivity with Java backends Templates: Dropwizard supports Freemarker and Mustache, the simplest templating systems for consumer or user-facing applications Benefits of Dropwizard As we are covering the advantages and disadvantages of each framework, let's touch upon them for Dropwizard as well, so that you can decide why or why not to use Dropwizard.
The Art of Scalability: Scalable Web Architecture, Processes, and Organizations for the Modern Enterprise
by
Martin L. Abbott
and
Michael T. Fisher
Published 1 Dec 2009
R OLLBACK C APABILITIES Rollback Technology Considerations We often hear during our discussions around the rollback insurance policy that clients in general agree that being able to roll back would be great but that it is technically not feasible for them. Our answer to this is that it is almost always possible; it just may not be possible with your current team, processes, or architecture. The most commonly cited reason for an inability to roll back in Web enabled platforms and back office IT systems is database schema incompatibility. The argument usually goes that for any major development effort, there may be significant changes to the schema resulting in an incompatibility with the way old and new data is stored. This modification may result in table relationships changing, candidate keys changing, table columns changing, tables added, tables merged, tables disaggregated, and tables removed.
…
X-axis implementations are limited by the aforementioned replication technology limitations and the size of data that is being replicated. In general, x-axis implementations do not scale well with data size and growth. The Y-Axis of the AKF Database Scale Cube The y-axis of the AKF Database Scale Cube represents a separation of data meaning within your database schema or data storage system. When discussing database scale, we are usually aligning data with a predetermined application y-axis split. The y-axis split addresses the monolithic nature of the data architecture by separating the data into schemas that have meaning relative to the applications performing work on that T HE Y-A XIS OF THE AKF D ATABASE S CALE C UBE data or reading from that data.
Chaos Engineering: System Resiliency in Practice
by
Casey Rosenthal
and
Nora Jones
Published 27 Apr 2020
For example, in the absence of unit or functional testing, some code changes may warrant greater scrutiny via code review. Or at particular times (maybe during a daily peak of traffic, or the holiday season experienced by ecommerce sites) a more conservative rate of “ramp-up” for new code paths may be considered. Database schema changes, search index rebuilding, lower-layer network routing, and “cosmetic” markup changes are all examples of activities that carry different uncertainties, different consequences if things go “sideways,” and different contingencies to take when they do. Chaos Engineering is no different; to experiment effectively is to be context-sensitive.
Clean Code: A Handbook of Agile Software Craftsmanship
by
Robert C. Martin
Published 1 Jan 2007
This is still duplication and should be addressed by using the TEMPLATE METHOD,4 or STRATEGY5 pattern. 4. [GOF]. 5. [GOF]. Indeed, most of the design patterns that have appeared in the last fifteen years are simply well-known ways to eliminate duplication. So too the Codd Normal Forms are a strategy for eliminating duplication in database schemae. OO itself is a strategy for organizing modules and eliminating duplication. Not surprisingly, so is structured programming. I think the point has been made. Find and eliminate duplication wherever you can. G6: Code at Wrong Level of Abstraction It is important to create abstractions that separate higher level general concepts from lower level detailed concepts.
Python Geospatial Development - Second Edition
by
Erik Westra
Published 23 May 2013
Shoreline and lake boundaries (as well as other land-water boundaries such as islands within lakes) are readily available using the GSHHS shoreline database.City and town data can be found in two places: The GNIS Database (http://geonames.usgs.gov/domestic) provides official place-name data for the United States, while the GEOnet Names Server (http://earth-info.nga.mil/gns/html) provides similar data for the rest of the world. Looking at these data sources, we can start to design the database schema for the DISTAL system: Tip The level field in the shorelines table corresponds to the level value in the GSHHS database: a value of 1 represents a coastline, 2 represents a lake, 3 represents an island within a lake, and 4 represents a pond on an island in a lake. While this is very simple, it's enough to get us started.
Docker in Action
by
Jeff Nickoloff
and
Stephen Kuenzli
Published 10 Dec 2019
We’ll use the long form to enhance databases.yml and add a volume to store the postgres data: version: "3.7" volumes: pgdata: # empty definition uses volume defaults services: postgres: image: dockerinaction/postgres:11-alpine volumes: - type: volume source: pgdata # The named volume above target: /var/lib/postgresql/data environment: POSTGRES_PASSWORD: example adminer: image: dockerinaction/adminer:4 ports: - 8080:8080 deploy: replicas: 1 # Scale down to 1 replica so you can test In this example, the file defines a volume named pgdata, and the postgres service mounts that volume at /var/lib/postgresql/data. That location is where the PostgreSQL software will store any database schema or data. Deploy the stack and inspect the results: docker stack deploy \ -c databases.yml \ --prune \ my-databases Run docker volume ls after the changes have been applied to verify that the operation was successful: DRIVER VOLUME NAME local my-databases_pgdata The name of the stack prefixes any resources created for it such as services or volumes.
The Phoenix Project: A Novel About IT, DevOps, and Helping Your Business Win
by
Gene Kim
,
Kevin Behr
and
George Spafford
Published 14 Jul 2013
It conflicts with some of the Phoenix database changes, so Chris’ guys are going to need to do some recoding.” “Shit,” I say. “Wait. Which Phoenix module?” “It’s one of Sarah’s projects that we released after the project freeze was lifted,” he replies. “It was before we put the kanban around Brent. It was a database schema change that slipped through the cracks.” I swear under my breath. Sarah again? Chris has a pinched expression on his face. “This is going to be tricky. We’re going to have to rename a bunch of database columns, which will affect, who knows, maybe hundreds of files. And all the support scripts.
The Nature of Software Development: Keep It Simple, Make It Valuable, Build It Piece by Piece
by
Ron Jeffries
Published 14 Aug 2015
Adjust connection pool sizes and timeouts. Disable specific outbound integrations. Reload configuration. Start or stop accepting load. Feature toggles. Not every service will need all of these controls. They should give you a place to start, though. Many services also expose controls to update the database schema, or even to delete all data and reseed it. These are presumably helpful in test environments but extremely hazardous in production. These controls result from a breakdown in roles. Developers don’t trust operations to deploy the software and run the scripts correctly. Operations doesn’t allow developers to log in to the production machines to update the schemata.
Test-Driven Development With Python
by
Harry J. W. Percival
Published 10 Jun 2014
Test fixtures Test fixtures refers to test data that needs to be set up as a precondition before a test is run—often this means populating the database with some information, but as we’ve seen (with browser cookies), it can involve other types of preconditions. Avoid JSON fixtures Django makes it easy to save and restore data from the database in JSON format (and others) using the dumpdata and loaddata management commands. Most people recommend against using these for test fixtures, as they are painful to man‐ age when your database schema changes. Use the ORM, or a tool like factory_boy. Fixtures also have to work remotely LiveServerTestCase makes it easy to interact with the test database using the Django ORM for tests running locally. Interacting with the database on the staging server is not so straightforward—one solution is Django management commands, as I’ve shown, but you should explore what works for you, and be careful!
Scala in Action
by
Nilanjan Raychaudhuri
Published 27 Mar 2012
The following is how you represent a story class in Scala: class Story(val number: String, val title: String, val phase: String) To make this class work with Squeryl, you have to do a couple of simple setups. First you have to tell Squeryl you need a table that will store all the stories for you in the database. The way to do that in Squeryl is to create a subclass of org.squeryl.Schema. Think of this class as equivalent to a database schema where you’ll keep all the data definitions of an application. The following code defines the schema with a table called “STORIES” for your Story class: package com.kanban.models import org.squeryl._ object KanbanSchema extends Schema { val stories = table[Story]("STORIES") } Save this as the file KanbanSchema.scala under src/main/scala/com/kanban/models.
Advanced Software Testing—Vol. 3, 2nd Edition
by
Jamie L. Mitchell
and
Rex Black
Published 15 Feb 2015
—Gustave Flaubert The fifth chapter of the Advanced syllabus is concerned with reviews. As you will recall from the Foundation syllabus, reviews are a form of static testing where people, rather than tools, analyze the project or one of the project’s work products, such as a requirements specification, design specification, database schema, or code. The primary goal is typically to find defects in that work product before it serves as a basis for further project activity, though other goals can also apply. In the Advanced Technical Test Analyst syllabus, we focus on reviews of technical work products. Chapter 5 of the Advanced Technical Test Analyst syllabus has two sections. 1.
Code Complete (Developer Best Practices)
by
Steve McConnell
Published 8 Jun 2004
A telecommunications organization went from 86 percent correct before reviewing code changes to 99.6 percent afterward (Perrott 2004). Adjust your approach depending on the risk level of the refactoring. Some refactorings are riskier than others. A refactoring like "Replace a magic number with a named constant" is relatively risk-free. Refactorings that involve class or routine interface changes, database schema changes, or changes to boolean tests, among others, tend to be more risky. For easier refactorings, you might streamline your refactoring process to do more than one refactoring at a time and to simply retest, without going through an official review. For riskier refactorings, err on the side of caution.
…
You can deal with proliferating software versions by using version-control tools for Source-code control Dependency control like that offered by the make utility associated with UNIX Project documentation versioning Relating project artifacts like requirements, code, and test cases so that when a requirement changes, you can find the code and tests that are affected Data Dictionaries A data dictionary is a database that describes all the significant data in a project. In many cases, the data dictionary focuses primarily on database schemas. On large projects, a data dictionary is also useful for keeping track of the hundreds or thousands of class definitions. On large team projects, it's useful for avoiding naming clashes. A clash might be a direct, syntactic clash, in which the same name is used twice, or it might be a more subtle clash (or gap) in which different names are used to mean the same thing or the same name is used to mean subtly different things.
Python for Unix and Linux System Administration
by
Noah Gift
and
Jeremy M. Jones
Published 29 Jun 2009
You can see that we have simply removed the # character from the beginning of the line to include the admin URLs config file. Now that we have configured a database, added the admin and inventory applications, and added the admin interface to the URLs config file, we are ready to start defining the database schema. In Django, each application has its own schema definition. In each application directory, “inventory” in this case, there is a file named models.py that contains definitions for the tables and columns that your application will use. With Django, as well as many other web frameworks that rely on ORMs, it is possible to create and use a database without having to write a single SQL expression.
Data Wrangling With Python: Tips and Tools to Make Your Life Easier
by
Jacqueline Kazil
Published 4 Feb 2016
Setting up and getting data into a relational database can involve many steps, but if your datasets are complex with many different relationships, it shouldn’t take more than a few steps to figure out how to join them and get the information you desire. When building relational databases, spend time mapping out the relations and their attributes, similar to what we did with the friend database. What are the different types of data, and how can they be mapped to one another? In relational database schema, we figure out how we want to match data by thinking about how we will most often use the data. You want the queries you ask the database to be easy to answer. Because we thought we might use occupation to help identify a friend, we put the occupation_id in the friend-table. Another thing to note is there are several different kinds of relationships.
Beautiful Architecture: Leading Thinkers Reveal the Hidden Beauty in Software Design
by
Diomidis Spinellis
and
Georgios Gousios
Published 30 Dec 2008
Second, the operations group greatly appreciated the easy deployment of new releases. Previous systems had required the studios to ship removable hard drives back and forth, with all the attendant logistics problems. Finally, having the update mechanism allowed us to focus on “just sufficient” database design. We did not peer into the crystal ball or overengineer the database schema. Instead, we just designed enough of the schema to support the current iteration. Immutable Data and Ubiquitous GUIDs In working with customers, the studio associate creates some compositions that use multiple photographs, inset into a design. These designs come from a design group at company headquarters.
Site Reliability Engineering: How Google Runs Production Systems
by
Betsy Beyer
,
Chris Jones
,
Jennifer Petoff
and
Niall Richard Murphy
Published 15 Apr 2016
In particular, some of this chapter’s authors were in charge of a portion of the serving system (shown in Figure 31-1) that continually extracts and processes data from the database, in order to generate a set of indexed files that are then loaded and served around the world. This system was distributed over several datacenters and used about 1,000 CPUs and 8 TB of RAM to index 100 TB of data every day. Figure 31-1. A generic ads serving system The migration was nontrivial: in addition to migrating to a new technology, the database schema was significantly refactored and simplified thanks to the ability of F1 to store and index protocol buffer data in table columns. The goal was to migrate the processing system so that it could produce an output perfectly identical to the existing system. This allowed us to leave the serving system untouched and to perform, from the user’s perspective, a seamless migration.
Building Secure and Reliable Systems: Best Practices for Designing, Implementing, and Maintaining Systems
by
Heather Adkins
,
Betsy Beyer
,
Paul Blankinship
,
Ana Oprea
,
Piotr Lewandowski
and
Adam Stubblefield
Published 29 Mar 2020
You can consider each of the following to be a deployment: Pushing code: Issuing a command to cause a server to download and run a new binary Updating a Kubernetes Deployment object to pick up a new Docker image Booting a VM or physical machine, which loads initial software or firmware Updating configuration: Running a SQL command to change a database schema Updating a Kubernetes Deployment object to change a command-line flag Publishing a package or other data, which will be consumed by other users: Uploading a deb package to an apt repository Uploading a Docker image to a container registry Uploading an APK to the Google Play Store Post-deployment changes are out of scope for this chapter.
Real World Haskell
by
Bryan O'Sullivan
,
John Goerzen
,
Donald Stewart
and
Donald Bruce Stewart
Published 2 Dec 2008
With the lazy version, results will be evaluated in chunks; with the strict version, all results are read up front, stored in RAM, and then printed. Database Metadata Sometimes it can be useful for a program to learn information about the database itself. For instance, a program may want to see what tables exist so that it can automatically create missing tables or upgrade the database schema. In some cases, a program may need to alter its behavior depending on the database backend in use. First, there is a getTables function that will obtain a list of defined tables in a database. You can also use the describeTable function, which will provide information about the defined columns in a given table.
Ajax: The Definitive Guide
by
Anthony T. Holdener
Published 25 Jan 2008
The star schema is more popular than the snowflake schema, but you can find good information on both. Principles and Implementations of Datawarehousing by Rajiv Parida (Laxmi Publications) and The Art of SQL by Stéphane Faroult and Peter Robson (O’Reilly) are good places to start for information on database schemas. Other resources include Advanced Topics in Database Research by Keng Siau (Ed.) (Idea Group Publishing) and Oracle Essentials: Oracle Database 10g, Third Edition, by Rick Greenwald et al. (O’Reilly). Databases | 47 Object databases represent information in the form of objects, essentially in the same way as objects are used in object-oriented programming.
The Web Application Hacker's Handbook: Finding and Exploiting Security Flaws
by
Dafydd Stuttard
and
Marcus Pinto
Published 30 Sep 2007
NoSQL data stores represent data using key/value mappings and do not rely on a fixed schema such as a conventional database table. Keys and values can be arbitrarily defined, and the format of the value generally is not relevant to the data store. A further feature of key/ value storage is that a value may be a data structure itself, allowing hierarchical storage, unlike the flat data structure inside a database schema. Chapter 9 ■ Attacking Data Stores 545 NoSQL advocates claim this has several advantages, mainly in handling very large data sets, where the data store's hierarchical structure can be optimized exactly as required to reduce the overhead in retrieving data sets. In these instances a conventional database may require complex cross-referencing of tables to retrieve information on behalf of an application.
The Art of Software Security Assessment: Identifying and Preventing Software Vulnerabilities
by
Justin Schuh
Published 20 Nov 2006
ViewState The ViewState, stored in a client-side cookie, contains information on form parameter content, control status, and other display-specific information. By default, ViewState is protected with a secure message digest by using a secret in the validationKey attribute of the machineKey field in web.config. However, some controls can be bound to data sources that reveal column and table names along with other potential database schema. To address this problem, ViewState can also be encrypted by setting the validation attribute to AES or DES and providing a value for decryptionKey. If ViewState isn’t encrypted, you can use one of many ViewState decoder tools to search for interesting information (a ViewState decoder is available from www.pluralsight.com/tools.aspx).
The C++ Programming Language
by
Bjarne Stroustrup
Published 2 Jan 1986
These techniques work well in specific areas where there is either a sound theoretical foundation (e.g., math, state machines, and relational databases) or where a general framework exists into which small application fragments can be embedded (e.g., graphical user interfaces, network simulations, and database schema). The obvious usefulness of these techniques in limited – and typically crucial – areas can tempt people to think that the elimination of traditional programming by these techniques is ‘‘just around the corner.’’ It is not. The reason is that expanding specification techniques outside areas with sound theoretical frameworks implies that the complexity of a general-purpose programming language would be needed in the specification language.