[Novalug] Database advice

Maxwell Spangler maxlists at maxwellspangler.com
Mon Mar 15 17:05:23 EDT 2010


On Mon, 2010-03-15 at 15:54 -0400, Bonnie Dalzell wrote:

> The thing I am not sure about is that I think the table has to have a 
> predetermined number of columns - it is rare for more than 4 people to 
> own the dog or be listed as breeders. I do not think the table can be 
> open ended.
> 
> The alternative is to make the owner or breeder list one entry with the 
> individual people_id numbers separated by something like the pipe symbol 
> and then use a program module to process the contents of the cell to list 
> the individual people by name. This is what i decided to do when I was 
> using a flat file for the dog db. I had a separate people flat file with 
> each person having a different id number.

A table definitely has to have a predetermined number of columns, but a
table can contain any number of rows.  So don't try to combine multiple
pieces of data into a single table row because you've only defined one
column:

dog_id  owners
500     max,mary,clark,hal

This breaks the first rule of normalization that every piece of data be
atomic -- here you are trying to combine four pieces of data into one
and it will result in problems.  Instead, simply make a table with four
records.  I don't know enough about your actual data but I might suggest
something like this:

DOG_OWNER_MASTER
dog_id  master_id owned_starting owned_ending name
500     100       1/1/1995       12/31/1997   Bob Lumbergh
500     101       1/1/1998       7/1/2009     Greyhound LLC

DOG_OWNER_DETAIL
master_id  name
100        Bob Lumbergh
101        Maxwell
101        Mary
101        Clark
101        Hal

DOG_OWNER_MOST_RECENT
dog_id    master_id
500       101

This represents two owners, a single person "Bob Lumbergh" from 1/1/95
to 12/31/97 and a group organized as "Greyound LLC" with four individual
owners who owned the dog from 1/1/98 to 7/1/2009.

This would allow you to do queries to find who owned the dog at a given
time:

select dog.dog_name, dog_owner_master.name
from dog, dog_owner_master
where dog.dog_id = 500
  and dog_owner_master.owned_starting >= 1/1/1996
  and dog_owner_master.owned_ending   <= 1/1/1996;

Freckles | Bob Lumbergh

select dog.dog_name, dog_owner_master.name, dog_owner_detail.name
from dog, dog_owner_master, dog_owner_detail
where dog.dog_id = 500
  and dog_owner_master.owned_starting >= 1/1/1996
  and dog_owner_master.owned_ending   <= 1/1/1996
  and dog_owner_detail.master_id = dog_owner_master.master_id;

Greyhound LLC | Maxwell
Greyhound LLC | Mary
Greyhound LLC | Clark
Greyhound LLC | Hal

You could programmatically combine this to present it as:

Greyhound LLC (Maxwell, Mary, Clark, Hal)

Knowing that a dog could have an unlimited amount of owners, you could
have a DOG_OWNER_MOST_RECENT table that has a single entry pointing to
the "current" or "most recently tracked" owner.

select dog.dog_name, dog_owner_master.name
  from dog, dog_owner_master, dog_owner_most_recent
where dog.dog_id = 500
  and dog_owner_most_recent.dog_id = dog.dog_id
  and dog_owner_master.master_id = dog_owner_most_recent.master_id;

find dog #500,
	find dog_owner_most_recent for dog 500 (result: master id->101)
		find the dog_owner_master for master id 101:

Freckles | Greyhound LLC

The thing about SQL (and databases in general) is that there are many
ways to do the same thing.  Some of us keep reminding you about
normalization because it places some simple and powerful restrictions on
what you could otherwise do so that you don't shoot yourself in the foot
later.

Using the dog_owner_master_id table above is a very purist approach for
database modeling but you could, if you wanted to, simply place a
'Current" yes/no field in the dog_owner table and turn specific records
on an off to indicate who is the master.  If you were to do so, it is up
to you to make sure:

* you always have one record marked yes when programs need it
..so it might default to yes when you make the first record for a
particular dog.
* you never have more than one record (per dog id) marked yes

Some, perhaps myself, would argue that a separate table is more in line
with the relational model and others would argue that its overkill.

I hope this discussion is helping you and not scaring you away.
Sometimes when people have small projects that should be in databases
(as opposed to one big spreadsheet table they're maintaining) I get
intimidated regarding helping them because do to things right takes a
lot of work up front.  Patience, training, etc.  But once the system is
setup it can scale as big as you want and the rewards are significant.

-- 
Maxwell Spangler




More information about the Novalug mailing list