[Novalug] Database advice
Maxwell Spangler
maxlists at maxwellspangler.com
Mon Mar 15 17:05:23 EDT 2010
On Mon, 2010-03-15 at 15:54 -0400, Bonnie Dalzell wrote:
> The thing I am not sure about is that I think the table has to have a
> predetermined number of columns - it is rare for more than 4 people to
> own the dog or be listed as breeders. I do not think the table can be
> open ended.
>
> The alternative is to make the owner or breeder list one entry with the
> individual people_id numbers separated by something like the pipe symbol
> and then use a program module to process the contents of the cell to list
> the individual people by name. This is what i decided to do when I was
> using a flat file for the dog db. I had a separate people flat file with
> each person having a different id number.
A table definitely has to have a predetermined number of columns, but a
table can contain any number of rows. So don't try to combine multiple
pieces of data into a single table row because you've only defined one
column:
dog_id owners
500 max,mary,clark,hal
This breaks the first rule of normalization that every piece of data be
atomic -- here you are trying to combine four pieces of data into one
and it will result in problems. Instead, simply make a table with four
records. I don't know enough about your actual data but I might suggest
something like this:
DOG_OWNER_MASTER
dog_id master_id owned_starting owned_ending name
500 100 1/1/1995 12/31/1997 Bob Lumbergh
500 101 1/1/1998 7/1/2009 Greyhound LLC
DOG_OWNER_DETAIL
master_id name
100 Bob Lumbergh
101 Maxwell
101 Mary
101 Clark
101 Hal
DOG_OWNER_MOST_RECENT
dog_id master_id
500 101
This represents two owners, a single person "Bob Lumbergh" from 1/1/95
to 12/31/97 and a group organized as "Greyound LLC" with four individual
owners who owned the dog from 1/1/98 to 7/1/2009.
This would allow you to do queries to find who owned the dog at a given
time:
select dog.dog_name, dog_owner_master.name
from dog, dog_owner_master
where dog.dog_id = 500
and dog_owner_master.owned_starting >= 1/1/1996
and dog_owner_master.owned_ending <= 1/1/1996;
Freckles | Bob Lumbergh
select dog.dog_name, dog_owner_master.name, dog_owner_detail.name
from dog, dog_owner_master, dog_owner_detail
where dog.dog_id = 500
and dog_owner_master.owned_starting >= 1/1/1996
and dog_owner_master.owned_ending <= 1/1/1996
and dog_owner_detail.master_id = dog_owner_master.master_id;
Greyhound LLC | Maxwell
Greyhound LLC | Mary
Greyhound LLC | Clark
Greyhound LLC | Hal
You could programmatically combine this to present it as:
Greyhound LLC (Maxwell, Mary, Clark, Hal)
Knowing that a dog could have an unlimited amount of owners, you could
have a DOG_OWNER_MOST_RECENT table that has a single entry pointing to
the "current" or "most recently tracked" owner.
select dog.dog_name, dog_owner_master.name
from dog, dog_owner_master, dog_owner_most_recent
where dog.dog_id = 500
and dog_owner_most_recent.dog_id = dog.dog_id
and dog_owner_master.master_id = dog_owner_most_recent.master_id;
find dog #500,
find dog_owner_most_recent for dog 500 (result: master id->101)
find the dog_owner_master for master id 101:
Freckles | Greyhound LLC
The thing about SQL (and databases in general) is that there are many
ways to do the same thing. Some of us keep reminding you about
normalization because it places some simple and powerful restrictions on
what you could otherwise do so that you don't shoot yourself in the foot
later.
Using the dog_owner_master_id table above is a very purist approach for
database modeling but you could, if you wanted to, simply place a
'Current" yes/no field in the dog_owner table and turn specific records
on an off to indicate who is the master. If you were to do so, it is up
to you to make sure:
* you always have one record marked yes when programs need it
..so it might default to yes when you make the first record for a
particular dog.
* you never have more than one record (per dog id) marked yes
Some, perhaps myself, would argue that a separate table is more in line
with the relational model and others would argue that its overkill.
I hope this discussion is helping you and not scaring you away.
Sometimes when people have small projects that should be in databases
(as opposed to one big spreadsheet table they're maintaining) I get
intimidated regarding helping them because do to things right takes a
lot of work up front. Patience, training, etc. But once the system is
setup it can scale as big as you want and the rewards are significant.
--
Maxwell Spangler
More information about the Novalug
mailing list