Saturday, 5 November 2016

Database Keys

Keys are crucial to a table structure for many reasons, some of which are identified below:
§  They ensure that each record in a table is precisely identified.
§  They help establish and enforce various types of integrity.
§  They serve to establish table relationships.

Now let's see how you should choose your key(s). First, let's make up a little table to look at:

PersonID
LastName
FirstName
D.O.B
1
Smith
Robert
01/01/1970
2
Jones
Robert
01/01/1970
4
Smith
Henry
01/01/1970
5
Jones
Henry
01/01/1970

a) Super Key: A superkey is a column or set of columns that uniquely identify a record. This table has many superkeys:

§  PersonID
§  PersonID + LastName
§  PersonID + FirstName
§  PersonID + DOB
§  PersonID + LastName + FirstName
§  PersonID + LastName + DOB
§  PersonID + FirstName + DOB
§  PersonID + LastName + FirstName + DOB
§  LastName + FirstName + DOB

All of these will uniquely identify each record, so each one is a superkey. Of those keys, a key which is comprised of more than one column is a composite key; a key of only one column is a simple key.

b) Candidate Key: A candidate key is a superkey that has no unique subset; it contains no columns that are not necessary to make it unique. This table has 2 candidate keys:

§  PersonID
§  LastName + FirstName + DOB

Not all candidate keys make good primary keys: Note that these may work for our current data set, but would likely be bad choices for future data. It is quite possible for two people to share a full name and date of birth. We select a primary key from the candidate keys. This primary key will uniquely identify each record. It may or may not provide information about the record it identifies. It must not be Null-able, that is if it exists in a record it can not have the value Null. It must be unique. It can not be changed. Any candidate keys we do not select become alternate keys. We will select (PersonID) as the primary key. This makes (LastName + FirstName + DOB) an alternate key. Now, if this field PersonID is meaningful, that is it is used for any other purpose than making the record unique, it is a natural key or intelligent key. In this case PersonID is probably not an AutoNumber field, but is rather a "customer number" for use, much like the UPC or ISBN. 

However, if this field is not meaningful, that is it is strictly for the database to internally identify a unique record, it is a surrogate key or blind key. In this case Person ID probably is an AutoNumber field, and it should not be used except internally by the database. There is a long running debate over whether one should use natural or surrogate keys, and I'm not going to foolishly attempt to resolve it here. Whichever you use, stick with it. If you choose to generate an AutoNumber that is only used to identify a record, do not expose that number to the user. They will surely want to change it, and you can not change primary keys. I can now use my chosen primary key in another table, to relate the two tables. It may or may not have the same name in that second table. In either case, with respect to the second table it is a foreign key, and if in that second table the foreign key field is not indexed it is a fast foreign key.

c) Primary Keys: The first type of key we’ll discuss is the primary key. Every database table should have one or more columns designated as the primary key. The value this key holds should be unique for each record in the database. For example, assume we have a table called Employees that contains personnel information for every employee in our firm. We’d need to select an appropriate primary key that would uniquely identify each employee. Your first thought might be to use the employee’s name. This wouldn’t work out very well because it’s conceivable that you’d hire two employees with the same name. A better choice might be to use a unique employee ID number that you assign to each employee when they’re hired. Some organizations choose to use Social Security Numbers (or similar government identifiers) for this task because each employee already has one and they’re guaranteed to be unique. However, the use of Social Security Numbers for this purpose is highly controversial due to privacy concerns. (If you work for a government organization, the use of a Social Security Number may even be illegal under the Privacy Act of 1974.) For this reason, most organizations have shifted to the use of unique identifiers (employee ID, student ID, etc.) that don’t share these privacy concerns. Once you decide upon a primary key and set it up in the database, the database management system will enforce the uniqueness of the key. If you try to insert a record into a table with a primary key that duplicates an existing record, the insert will fail. Most databases are also capable of generating their own primary keys. Microsoft Access, for example, may be configured to use the AutoNumber data type to assign a unique ID to each record in the table. While effective, this is a bad design practice because it leaves you with a meaningless value in each record in the table.

d) Foreign Keys: The other type of key that we’ll discuss in this course is the foreign key. These keys are used to create relationships between tables. Natural relationships exist between tables in most database structures. Returning to our employees database, let’s imagine that we wanted to add a table containing departmental information to the database. This new table might be called Departments and would contain a large amount of information about the department as a whole. We’d also want to include information about the employees in the department, but it would be redundant to have the same information in two tables (Employees and Departments). Instead, we can create a relationship between the two tables. Let’s assume that the Departments table uses the Department Name column as the primary key. To create a relationship between the two tables, we add a new column to the Employees table called Department. We then fill in the name of the department to which each employee belongs. We also inform the database management system that the Department column in the Employees table is a foreign key that references the Departments table. The database will then enforce referential integrity by ensuring that all of the values in the Departments column of the Employees table have corresponding entries in the Departments table.

e) Unique Key: Unique key can uniquely identify each row in a table, and is closely related to the Superkey concept. A unique key comprises a single column or a set of columns. No two distinct rows in a table can have the same value (or combination of values) in those columns if NULL values are not used. Depending on its design, a table may have arbitrarily many unique keys but at most one primary key. Unique keys do not enforce the NOT NULL constraint in practice. Because NULL is not an actual value (it represents the lack of a value), when two rows are compared, and both rows have NULL in a column, the column values are not considered to be equal. Thus, in order for a unique key to uniquely identify each row in a table, NULL values must not be used.

No comments:

Post a Comment