Delphi Magazine-Database Design Primer.pdf
(
88 KB
)
Pobierz
Surviving Client/Server:
Database Design Primer
by Steve Troxell
D
esigning tables for a client/
server database can present
new challenges to traditional file
server oriented developers. There
are many more features and issues
in the database server world that
the developer should to be aware
of. Nearly any reference book will
tell you what the additional fea-
tures are, but you also need to now
how to employ them effectively.
I get some of my best pearls of
wisdom from pop culture. About
ten years ago I saw a comedy skit
that pretty much summed up my
opinion of a lot of people in soft-
ware development. The skit in-
volved a quasi-prehistoric tribe of
people out on a hunt, trying to for-
mulate a plan for felling some ani-
mal, a buffalo let’s say. The leader
suggested a head-on assault and
commented that 3 or 4 of their
party would probably be killed, but
they would have food for the tribe
for a whole month. A younger mem-
ber of the hunting party came up
with an idea. “What if we all spread
out and waved our arms and
shouted and drove the animal over
that cliff over there? Then we can
simply walk down to the base of the
cliff and drag the carcass back to
camp.” The leader of the hunting
party thought for a while and said,
“A very interesting plan. But your
idea is new and we fear new things.
Therefore it must be rejected.”
People entering client/server
development are faced with a vari-
ety of new capabilities as well. It
surprises me to find that some de-
velopers still shy away from such
things apparently simply because
they are unfamiliar with them. The
following is meant to shed some
light on some of the issues facing a
client/server developer.
Open Architecture
The most important issue you
must come to grips with as early as
possible is whether your system
will be an open or closed architec-
ture. Nearly all client/server RDBM
systems are designed to allow an
open architecture, but that doesn’t
mean you must have one. What I
mean by open architecture is the
database can be accessed not only
by the proprietary applications
you develop, but also by off the
shelf third party productivity tools
such as spreadsheets, data extrac-
tors, word processors and report
generators. This is generally
achieved by using a database
driver such as ODBC to connect the
application to the database server.
Third-party applications can be
written to conform to the standard
ODBC interface and the ODBC
driver in turn is written to access a
particular database server’s API.
In this way the same application
can access different vendor’s data-
bases by substituting an ODBC
driver appropriate for that
database.
Many file server databases, such
as Paradox or dBase, provide for
open architecture as well, so this
isn’t the exclusive domain of cli-
ent/server. However, most devel-
opers are accustomed to closed
architectures where they are not
concerned about anything but
strict proprietary application ac-
cess to the database. It’s important
that developers and customers
both understand the role of third
party software in the system as
early as possible, because many
aspects of the database design can
be an aid or hindrance to an open
architecture.
If the extent of your system’s
need to provide external access to
third-party programs is simply
allowing your users to run prefab-
ricated reports through a commer-
cial report writer, then you really
don’t have much of an open archi-
tecture to worry about. While
The Delphi Magazine
you’ll still need to consider some
issues of user permissions to ta-
bles and other objects, your users
have no need to know what the
tables are, their layout, what the
fields are, the relationships be-
tween tables, etc. But for a mere
technicality, this is still essentially
a closed system. But if your users
will be designing their own ad hoc
reports or queries, then you’ll need
to consider an open architecture.
The real value of an open archi-
tecture is allowing at least some
cross section of the user base di-
rect access to at least some of the
data beyond a prefabricated frame-
work. Generally, this access is pro-
vided in the form of ad hoc
querying for data analysis and re-
porting. This is a very valuable
quality for a database to have as it
opens up selected bits of data to
more users so they can get what
they need and how they need it in
a timely manner. This means you
may make different database de-
sign choices than you would if the
design was the sole domain of
software developers.
User Security
Open architectures are the princi-
pal reason why all client/server da-
tabases provide user login
accounts. A typical closed data-
base system might not be con-
cerned with exactly who was doing
what in the application. Or if it was,
it might employ a simple
Users
ta-
ble within itself. Some true cli-
ent/server applications that are
unconcerned with user account-
ability and external access to data
simply create a single fixed ac-
count in the database which is
used by all applications to gain ac-
cess to the data. The account user-
name and password are encoded
directly into the applications and
there is no login dialog for the user
to complete.
June 1997
41
However, user accounts in
client/server databases are not
primarily concerned with tracking
who’s connected and who’s not.
Their chief purpose is in control-
ling who has access to which parts
of the database. Remember, third
party tools connect to the data-
base independently, and the whole
database is laid bare for all to see.
Would you want marketing people,
whilst creating ad hoc reports on
product sales, to see the payroll
data for the whole company? On
the other hand, you do want the
accounting personnel to be able to
get to the payroll data? When all
access is running through proprie-
tary applications, you can easily
control who gets to what. But with
an open architecture, you have to
rely on the built-in security avail-
able via the user accounts in the
RDBMS you’re using.
This means granting or revoking
permissions to individual users to
read, insert, update or delete data
(or any combination) for a given
table, as shown in Listing 1. So, no
permissions would be granted on
payroll tables to any user that
wasn’t in the accounting depart-
ment (see Listing 2). Many RDBM
systems let you create groups of
users in just this manner to make it
easier to manage user permissions.
Actionable Information
Database design involves striking a
balance between performance,
functionality, and size. With an
open architecture there is also the
need to optimize the amount of “ac-
tionable information” within the
data. This means data that carries
meaning in and of itself, minimizing
the requirement to decode, trans-
late or interpret the data. Software
developers are accustomed to de-
coding, translating and interpret-
ing data and design their databases
accordingly, to achieve compact-
ness and because there’s been no
compelling reason not to. With an
open architecture you have to keep
in mind that users other than soft-
ware developers will be working
with the data.
Take a look at the data shown in
Figure 1. Which rows are for open
orders? Which rows are for orders
on hold? Many database develop-
ers don’t think twice about using a
numeric sequence to encode val-
ues for a column. But the values
carry no meaning in and of them-
selves. You are forced to make a
translation from a completely ab-
stract value by referring to a data
encoding sheet or joining to a
lookup table.
Now look at Figure 2. The same
data has been encoded differently.
Sure, you can say a mental transla-
tion of the code value
OP
to the term
“open order” is still required, but
the value is now a mnemonic rather
than an abstract value. It carries
meaning by itself.
To get the same level of meaning
in a result set from the abstract
data shown in Figure 1 would
require a join to a lookup table.
Several columns employing ab-
stract encoding could lead to
several joined tables and you can
see how quickly ad hoc querying
can become complicated and
inefficient.
®
Listing 1
Declarative Constraints
If your open architecture allows
data modifications from third-
party programs, then you’ll defi-
nitely want to consider using some
form of automatic processing to
perform data validation and ensure
data integrity. With client/server
databases, you have two choices:
declarative constraints or triggers.
Declarative constraints are defined
at the time you create the table.
They include nullability, defaults,
uniqueness and check constraints.
Listing 3 shows a few examples.
The exact syntax varies from ven-
dor to vendor so be sure to check
your manual before trying this with
your database server.
Any field may contain a null
value. A null indicates the actual
value is not available or not known.
For example, if an employee does
not wish to divulge their birthday,
we may store a null in the
DateOf-
Birth
field. Nulls are available to all
datatypes and save the developer
from having to contrive a special
/* Give John read-only access to the employees table */
GRANT SELECT ON Employees TO JohnA
®
Listing 2
/*Deny all access to the payroll table */
REVOKE ALL ON Payroll FROM Public
GRANT ALL ON Payroll TO Accounting
®
Figure 1
OrderNum CustNo
871182
871183
871184
871185
113
291
88
195
Date
6/2/97
6/2/97
6/2/97
6/2/97
Status
0
0
1
3
Clerk
78
78
54
99
Total
$507.92
$122.45
$1,209.00
$45.22
®
Figure 2
OrderNum CustNo
871182
871183
871184
871185
113
291
88
195
Date
6/2/97
6/2/97
6/2/97
6/2/97
Status
OP
OP
HD
VD
Clerk
78
78
54
99
Total
$507.92
$122.45
$1,209.00
$45.22
42
The Delphi Magazine
Issue 22
code value to serve the same pur-
pose. However, nulls introduce
some complexities to query logic.
You can disallow nulls in any col-
umn by using the
NOT NULL
con-
straint. We will explore nulls more
closely later in the article.
When a column value is omitted
from an SQL
INSERT
statement, then
the values for that column in the
rows being inserted will be null by
default unless you supply your own
using the
DEFAULT
constraint. If
you’ve declared a column
NOT NULL
,
then you should supply a default
value unless there is no logical de-
fault (like the
DeptNo
field in the
example). Defaults can be literal
values or functions built into the
database. The
HireDate
field in this
example uses the hypothetical
built in function
CurrentDate
to
post today’s date as a default value.
Some columns within a table re-
quire non duplicating values, like
employee badge numbers. We can
enforce this rule by using the
UNIQUE
constraint, which simply
places a unique index on the col-
umn. Databases vary in how they
handle null values on unique col-
umns. Some require that there be
no more than one row with a null
value in the column (strict inter-
pretation of uniqueness). Others
®
Listing 3
CREATE TABLE Employees(
EmpNo
integer
FirstName
varchar(15)
LastName
varchar(20)
DeptNo
char(3)
PhoneExt
char(3)
HireDate
date
DateOfBirth date
SSN
char(11)
Salary
double
allow any number of rows to have
a null value in the unique column,
following the rule that no null value
can be equal to any other null value
(strict interpretation of null).
In the case of our badge number
column, our business rule is that
not every employee is required to
have a badge number, but those
who do cannot have duplicate
badge numbers. The only way we
can enforce this with declarative
constraints is to define the column
unique and allow any number of
null values within the unique col-
umn. If we chose to use a special
non-null code value for employees
without badges, we could not use
the unique constraint.
Check constraints are the most
flexible declarative constraints.
Check constraints allow you to de-
fine an expression that must evalu-
ate to true. In Listing 3 we’ve used
check constraints to ensure that
the
Salary
field is a positive num-
ber and that the
BadgeNo
field is a
null or a string of one uppercase
letter followed by two digits.
Nearly all declarative con-
straints can be given a name, like
our check constraint for the
Salary
field. When a constraint is violated
by an insert or update operation,
the server raises an error and the
name of the constraint being vio-
lated should be present in the error
message text. In this way, client
apps can respond to particular er-
rors, parse the error text for con-
straint names, and provide a
cleaner, more meaningful error
message to the user.
Triggers
Declarative constraints only go so
far. For example, how do we en-
force the rule that employees in
departments 012, 014 and 155 re-
quire badges? For more powerful
automatic data processing, you
have to turn to triggers. Triggers
contain static SQL code that fires
whenever a data modification op-
eration is performed, regardless of
the application that initiated the
operation. Within a trigger you can
generally use more complex logic
than would be available in a decla-
rative constraint. Separate triggers
can be placed specifically for in-
sert, update or delete operations
on a given table. Some databases
allow many triggers to exist for the
same operation on the table, each
firing in sequence.
Listing 4 shows a trigger we
might use to enforce the
BadgeNo
rule. Again, exact syntax varies
from database to database. Trig-
gers provide a mechanism for ex-
amining the values of the row being
affected. In this case, the
New
vari-
able gives us access to the values
being written to the table.
A common use of triggers is auto-
matic stamping of username, date
or time information in sensitive ta-
bles. Also, cascading deletes can
be implemented with triggers,
where deleting a record in a master
table automatically deletes associ-
ated records in one or more detail
tables (see Listing 5).
You’ll want to be judicious in
your use of triggers and con-
straints since there is a bit of over-
head in their execution, and they
will always execute for every data
modification they are associated
with.
Domains
Domains are essentially user-
defined datatypes in SQL. Domains
are very helpful for ensuring the
BadgeNo
char(3)
NOT NULL,
NOT NULL DEFAULT ’’,
NOT NULL DEFAULT ’’,
NOT NULL,
NOT NULL DEFAULT ’’,
NOT NULL DEFAULT CurrentDate,
NULL,
NOT NULL DEFAULT ’<unknown>’,
NOT NULL DEFAULT 0
CONSTRAINT SalaryPositive
CHECK (Salary >= 0),
NULL UNIQUE
CHECK (BadgeNo IS NULL or
BadgeNo LIKE ’[A-Z][0-9][0-9]’)
);
®
Listing 4
CREATE TRIGGER BadgeRule ON Employees
FOR INSERT, UPDATE
AS
BEGIN
IF (New.DeptNo IN (’012’, ’014’, ’155’)) AND
(New.BadgeNo IS NULL) THEN
EXCEPTION BadgeNoReqd;
END
June 1997
The Delphi Magazine
43
consistency of data declarations
across tables in a non-trivial data-
base. For example, people’s names
might appear in several different
tables: customer contacts, sales-
people, employees, vendor con-
tacts, etc. If a design decision has
been made that all name fields al-
low up to 30 characters, then it is
much simpler to define a
TPerson-
Name
domain as
VARCHAR(30)
and
make all your table column defini-
tions with the domain name rather
than the direct data type as shown
in Listing 6.
Domains can be valuable for
keeping consistent definitions and
making it much easier to change
definitions across the entire data-
base. A frequently overlooked
value of domains is their use in
stored
procedure
parameter
datatypes. While there may be only
a few fields in table definitions
sharing the same domain, there
may be many different stored pro-
cedures with parameters of that
same data type. Once again, ensur-
ing the consistency of the data type
definitions within stored proce-
dure parameters and their associ-
ated table columns is much easier
with a domain name.
Normalization
Normalization generally refers to
eliminating redundancies in tables
by splitting them apart into smaller
tables with a relationship defined
between them. Typically several
narrow (fewer columns) tables are
better than one wide (lots of col-
umns) table. The narrow tables
usually result in more compact
data because omissions and vari-
able quantities can be handled
more efficiently. Also, a narrow ta-
ble means more rows are packed
into a data page and the server can
scan more rows per I/O operation,
improving performance. There are
five main rules of normalization. In
practice the first three are usually
sufficient for most systems.
First Normal Form:
Atomic Values
The point where a column and row
intersect in a table we’ll call a cell.
First normal form requires that
each cell in a table contain an
atomic value. That is, a single value
cannot be further broken down
into useful values. For example, for
a table of mailing addresses, it is
not uncommon for designers to
group the elements city, state and
postal code into a single field of the
form
<city>, <state> <postal
code>
since that is how they need
to displayed when printed on a
mailing label. However, this single
value can be broken down into
three useful elements and should
be stored in three separate col-
umns. The values in the three
columns can easily be combined to
produce the desired format for the
mailing label.
Having the values separated out
permits you to manipulate the data
at a finer level. For example, most
bulk mailings can gain a postage
discount if the addresses are
sorted by postal code. How are you
going to sort the records when all
three elements are pushed to-
gether in a single field. Also you
may want to stagger mailings to
different states. You will need to
select all addresses for state A for
the first mailing and all addresses
for state B for the second mailing.
It would be extremely difficult to
filter records on state if it’s not
broken out into its own field.
You shouldn’t get carried away
with atomizing values though. For
example, given a phone number
with area code, you could argue
that the area code must be sepa-
rated from the actual phone num-
ber to be in first normal form.
Generally it is not necessary to go
to this extreme if the phone num-
ber and area code will always be
®
Listing 5
used together, which is nearly al-
ways the case. You should only be
concerned about this if your sys-
tem has some functional need to
isolate the area code, for example
a requirement to produce a list of
contacts sorted by area code.
This reasoning also applies to
people’s names. Many designers
create a single field for a name,
such as
Steve Troxell
. If there is
absolutely no requirement to oper-
ate on the individual elements of
the name, this may be fine. For
example, a contact name for a
customer or vendor record.
However, there usually is much
value in splitting name fields into
first name and last name compo-
nents that may not be immediately
obvious. Is it important to you that
the appearance of the name be con-
sistent? If you have a single name
field how are you going to prevent
problems like user A always enters
Steve Troxell
but user B always
enters
Troxell, Steve
. If you sort
by the name are you making an
assumption that all data entry will
be
lastname, firstname
? If there is
the possibility that these issues
may arise in your system, it’s bet-
ter to split the name field apart; you
can always assemble the complete
name in any form you need.
First Normal Form:
No Repeating Groups
An extension of the requirement of
atomic values is that there be no
repeating groups of information.
That is, a single column cannot be
an array, list, or multi-field struc-
ture. Most RDBM systems enforce
this by simply not providing
CREATE TRIGGER MasterCascade ON MasterTable
FOR DELETE
AS BEGIN
DELETE FROM DetailTable WHERE KeyNo = Old.KeyNo;
END
®
Listing 6
CREATE DOMAIN TPersonName AS varchar(30);
CREATE TABLE Customers(
CustNo
integer;
Company
varchar(30);
Contact
TPersonName);
44
The Delphi Magazine
Issue 22
non-scalar datatypes. However, de-
velopers still try to buck the sys-
tem by designing tables like that
shown in Figure 3 .
This is in first normal form by
technicality only. Each field is an
atomic value. But the layout still
breaks the spirit of first normal
form and should be redesigned.
The
AuthorID
fields presumably
link to a separate
Authors
table con-
taining each author’s name and
other info. This approach intro-
duces several problems. First, it
imposes an artificial business rule
that there can be no more than
®
Figure 3: Books table
Title
Delphi
How-To
Delphi
Unleased
ISBN
1-57169-019-0
0-672-30499-6
PublisherID
SAMS
WAITE
three authors for any given book.
So do we simply refuse to accept a
book with four or more authors?
Most likely, the book will get en-
tered but only the first three
authors will be credited. Second, if
I want to see a list of books by any
given author, I must query all three
AuthorId
fields to guarantee cover-
age of all possible positions of the
value I’m looking for. This can lead
to some hideous code workar-
ounds to effectively implement a
filter on author.
A solution to these problems is
shown in Figure 4. The
BookAuthors
table contains as many rows for a
book as there are authors for that
book. You can see how it would be
much simpler to search for all oc-
currences of a single author in the
BookAuthors
table and then link in
the book title from
Books
. Tables
such as this go by one of several
names: relationship, association or
junction tables. They describe a
relationship between two other
tables and don’t necessarily con-
tain any useful independent data
themselves.
Second Normal Form
Second normal form means that if
a table’s primary key is composed
of more than one column then no
column in the table should be de-
pendent on just part of the primary
key. This is easier to see with an
example. Figure 5 shows a project
table assigning employees to pro-
jects. The primary key for the table
is both the project identifier and
the employee number. As you can
see the employee’s name is only
dependent on part of the key for
the row (employee number) and
should be eliminated from this ta-
ble. The
BillRate
column is the
hourly rate that employee charges
for this project. An employee work-
ing on multiple projects might
charge different rates for each pro-
ject. Therefore, the
BillRate
col-
umn is dependent on the entire
primary key and is appropriate for
this table.
The problem with a table that
fails to meet second normal form is
that if we change a single fact about
the database (in this case, an
employee’s name), then we must
change that fact in several different
places (in this case, every
occurrence in the
ProjEmp
table).
Third Normal Form
Third normal form extends this de-
pendency logic by stating that all
non-key columns must depend on
the primary key and not depend
solely on another non-key column.
In Figure 6 it is sufficient to stop
with the
DeptNo
column as
DeptName
and
DeptManager
are not directly
relevant to the employee, but in-
stead should be broken out into a
separate table keyed by
DeptNo
.
AuthorID1
32
16
AuthorID2
45
<null>
AuthorID3
18
<null>
®
Figure 4
Books
Title
Delphi How-To
Delphi Unleased
ISBN
1-57169-019-0
0-672-30499-6
PublisherID
SAMS
WAITE
BookAuthors
ISBN
0-672-30499-6
1-57169-019-0
1-57169-019-0
1-57169-019-0
AuthorID
16
18
32
45
®
Figure 5
ProjectID
X1104
X1104
X1256
X1256
J3880
EmpID
102
76
102
88
88
EmpName
Bob Smith
Stan Stevens
Bob Smith
Roy Rogers
Roy Rogers
BillRate
65.00
35.00
90.00
75.00
75.00
®
Figure 6
EmpNo
112
115
121
EmpName
Sue Bennet
Roy Robinson
John James
DeptNo
80
14
80
DeptName
Shipping
Payroll
Shipping
DeptManager
Bob Smith
Carly Johnson
Bob Smith
June 1997
The Delphi Magazine
45
Plik z chomika:
xlo-40
Inne pliki z tego folderu:
(Ebook - Pdf) Informix 4Gl And Informix Sql.pdf
(195 KB)
(eBook) - Borland Delphi SQL User Guide.PDF
(605 KB)
(Ebook) - Delphi - Delphi Learning Pack (8600 Pages).pdf
(55943 KB)
(Ebook) Borland - Delphi For Dotnet.pdf
(2193 KB)
(eBook) Delphi - Indy In Depth.pdf
(777 KB)
Inne foldery tego chomika:
Delphi 4 dla każdego
Delphi 6 Vademecum profesjonalisty - Tom I
Delphi 7 Enterprise
Delphi 7 Enterprise(1)
DELPHI 7 FOR WINDOWS
Zgłoś jeśli
naruszono regulamin