Tuesday, March 22, 2011

Referential Integrity

Referential integrity is a relational database concept in which multiple tables share a relationship based on the data stored in the tables, and that relationship must remain consistent.


Referential integrity is the state of a database in which all values of all foreign keys are valid. 


Example:




Principle of Referential Integrity:


The principle of referential integrity may be stated as: Every non-null foreign key value must match an existing primary key value.




The non-null part means that a value is not required in the foreign key field, but every one that does exist must match a value in the primary key field of the table to which it is related. Referential integrity is all about preserving the
validity of foreign key values.









  • Rules of Referential Integrity:



  • 1. Comprehend that when referential integrity is enforced, you will not be able to enter a value in the foreign key field of the related table unless that value exists in the primary key of the primary table. For example, in your CD database, you cannot assign a CD to an artist that doesn't exist in your Artist table. You could assign a null value to the CD however. This would indicate that the CD is assigned to no one.



  • 2. 



  • Know that when referential integrity is enforced, you will not be able to delete a record from a table if there is a matching record in a related table. For example, you could not delete a record from your CD table if there is a matching record in your Genre table.



  • 3. 



  • Know that you cannot change a primary key value in the primary table if that record has related records. For example, you cannot change the ID of a record in your CD database if the User table shows that this particular ID has been borrowed by a friend.



  • The rules of referential integrity involve the following concepts and terminology:
    Parent key: 
    A primary key or unique key of a referential constraint.



    Parent row: A row that has at least one dependent row.



    Parent table: A table that is a parent in at least one referential constraint. A table can be defined as a parent in an arbitrary number of referential constraints.



    Dependent table: A table that is a dependent in at least one referential constraint. A table can be defined as a dependent in an arbitrary number of referential constraints. A dependent table can also be a parent table.



    Descendent table: A table is a descendent of table T if it is a dependent of T or a descendent of a dependent of T



    Dependent row: A row that has at least one parent row.



    Descendent row: A row is a descendent of row p if it is a dependent of p or a descendent of a dependent of p.



    Referential cycle: A set of referential constraints such that each table in the set is a descendent of itself.



    Self-referencing row: A row that is a parent of itself.



    Self-referencing table: A table that is a parent and a dependent in the same referential constraint. The constraint is called a self-referencing constraint.


    The insert rule of a referential constraint is that a nonnull insert value of the foreign key must match some value of the parent key of the parent table. The value of a composite foreign key is null if any component of the value is null.

    The update rule of a referential constraint is specified when the referential constraint is defined. The choices are NO ACTION and RESTRICT. The update rule applies when a row of the parent or dependent table is updated. The update rule is that a nonnull update value of a foreign key must match some value of the parent key of the parent table. The value of a composite foreign key is null if any component of the value is null.

    The delete rule of a referential constraint is specified when the referential constraint is defined. The choices are NO ACTION, RESTRICT, CASCADE, SET NULL or SET DEFAULT. SET NULL can be specified only if some column of the foreign key allows null values.

    The delete rule of a referential constraint applies when a row of the parent table is deleted. More precisely, the rule applies when a row of the parent table is the object of a delete or propagated delete operation (defined below) and that row has dependents in the dependent table of the referential constraint. Let P denote the parent table, let D denote the dependent table, and let p denote a parent row that is the object of a delete or propagated delete operation. If the delete rule is:
    • RESTRICT or NO ACTION, an error occurs and no rows are deleted
    • CASCADE, the delete operation is propagated to the dependents of p in D
    • SET NULL, each nullable column of the foreign key of each dependent of p in D is set to null
    • SET DEFAULT, each column of the foreign key of each dependent of p in D is set to its default value
    Each referential constraint in which a table is a parent has its own delete rule, and all applicable delete rules are used to determine the result of a delete operation. Thus, a row cannot be deleted if it has dependents in a referential constraint with a delete rule of RESTRICT or NO ACTION, or if the deletion cascades to any of its descendants that are dependents in a referential constraint with the delete rule of RESTRICT or NO ACTION.
    The deletion of a row from parent table P involves other tables and may affect rows of these tables:
    • If table D is a dependent of P and the delete rule is RESTRICT or NO ACTION, D is involved in the operation but is not affected by the operation.
    • If D is a dependent of P and the delete rule is SET NULL, D is involved in the operation, and rows of D may be updated during the operation.
    • If D is a dependent of P and the delete rule is SET DEFAULT, D is involved in the operation, and rows of D may be updated during the operation.
    • If D is a dependent of P and the delete rule is CASCADE, D is involved in the operation and rows of D may be deleted during the operation.
    If rows of D are deleted, the delete operation on P is said to be propagated to D. If D is also a parent table, the actions described in this list apply, in turn, to the dependents of D.
    Any table that may be involved in a delete operation on P is said to be delete-connected to P. Thus, a table is delete-connected to table P if it is a dependent of P or a dependent of a table to which delete operations from P cascade.

    Introduction to Constraints


    Data types are a way to limit the kind of data that can be stored in a table. For many applications, however, the constraint they provide is too coarse. For example, a column containing a product price should probably only accept positive values. But there is no data type that accepts only positive numbers. Another issue is that you might want to constrain column data with respect to other columns or rows. For example, in a table containing product information, there should only be one row for each product number.


    To that end, SQL allows you to define constraints on columns and tables. Constraints give you as much control over the data in your tables as you wish. If a user attempts to store data in a column that would violate a constraint, an error is raised. This applies even if the value came from the default value definition.


    A constraint is a property assigned to a column or the set of columns in a table that prevents certain types of inconsistent data values from being placed in the column(s). Constraints are used to enforce the data integrity. This ensures the accuracy and reliability of the data in the database.


    The primary job of a constraint is to enforce a rule in the database. Together, the constraints in a database maintain the integrity of the database.


    The following categories of the data integrity exist:
    • Entity Integrity
    • Domain Integrity
    • Referential integrity
    • User-Defined Integrity
    Entity Integrity ensures that there are no duplicate rows in a table. You can apply entity integrity to a table by specifying a PRIMARY KEY constraint. For example, the ProductID column of the Products table is a primary key for the table.
    Referential integrity ensures that rows cannot be deleted, which are used by other records (for example, corresponding data values between tables will be vital) i.e., Referential integrity ensures the relationships between tables remain preserved as data is inserted, deleted, and modified. You can apply referential integrity using a FOREIGN KEY constraint. The ProductID column of the Order Details table has a foreign key constraint applied referencing the Orders table. The constraint prevents an Order Detail record from using a ProductID that does not exist in the database. Also, you cannot remove a row from the Products table if an order detail references the ProductID of the row. Note: Entity and referential integrity together form key integrity.
    Domain Integrity enforces valid entries for a given column by restricting the type, the format, or the range of possible values. Domain integrity ensures the data values inside a database follow defined rules for values, range, and format. A database can enforce these rules using a variety of techniques, including CHECK constraints, UNIQUE constraints, and DEFAULT constraints. These are the constraints we will cover in this article, but be aware there are other options available to enforce domain integrity. Even the selection of the data type for a column enforces domain integrity to some extent. For instance, the selection of datetime for a column data type is more restrictive than a free format varchar field.
    The following list gives a sampling of domain integrity constraints.
    • A product name cannot be NULL.
    • A product name must be unique.
    • The date of an order must not be in the future.
    • The product quantity in an order must be greater than zero.
    User-Defined Integrity enforces some specific business rules that do not fall into entity, domain, or referential integrity categories.