[Novalug] January Meeting Topic(Oracle Poll)
Peter Larsen
plarsen at famlarsen.homelinux.com
Wed Dec 6 20:22:58 EST 2006
Kevin Cole wrote:
> Doug Needham wrote:
>
>>Should this topic be renamed Database Flame :) ?
>>
>><IMO>
>>
>><RANT>
>>
>>All Databases are similar. There are various levels of expertise in each
>>of them.
>
>
> Since we're ranting... ;-)
>
> I don't know... Every time I look at another relational database system
> that uses SQL, I miss System 1022 and its successor System 1032, which
> were relational database systems with a much more useful and logical
> language (although it was unique to System 1022/1032 as far as I know).
> It was both a query language and a procedural language that novices and
> programmers alike could appreciate. A simple comparison of the inter-
> active use:
Interesting subjects we've found here. I've spend most of my carreer in
IT within databases and how to deal with data, from entry procdessing,
to storage to reporting and manipulation. So - this thread has by far
beaten anything I've seen on here ;)
Just basic input to your obseration - if you see SQL as procedural,
that's why you have problems with it. It's not, and not meant to be.
It's meant to take the developer away from the "sequentual" issues of
dealing with data. C.J. Date would have a field day based on your
observations here :)
> SQL
>
> >>> select name from addressbook where zip = 14621;
> ...output...
> >>> select phone from addressbook where zip = 14621;
> ...output...
> >>> select birthday from addressbook where zip = 14621;
> ...output...
Very very very BAD SQL indeed. With SQL it's "select name, phone,
birthday from addressbook where zip=14621". One time access, and all
data has been accumulated, setup and presented to your program. All it
has is to loop through the returned set and print/deal with it.
> System 1032:
>
> >>> set addressbook
> >>> find zip = 14621
> 30 records found
> >>> print name
> ...output...
> >>> print phone
> ...output...
> >>> print birthday
> ...output...
How would you deal with multiple source inputs? Simple joins, or even
uneven joins? Or queries that said: "All customers who has placed
orders in the last week"? Your procedural approach soons becomes very
complex?
What strikes me is, that SQL was actually invented to fix that issue.
When I (briefly) used network databases, it was used pretty much in the
same manner. You walked the tree of data, and it was pretty inflexible.
If you had not foreseen a way to use data, you had plenty to program
later trying to walk all the data over and over again.
> The system remembered the state of your query across subsequent
> commands. One could write rather complex, yet readable code
> directly in the native PL as well as via calls from other
> languages. No need for a degree in "relational calculus vs.
> relational algebra".
And today I still see developers without that knowledge. You're right in
the essense that it adds requirements to the developer, of knowing
basically a new langauge/paradime. But there's a reason behind it - the
developer can do much more complex things easier than he EVER could with
the procedural way.
I'm interested in seeing an example of how to get summary results group
by sales of more than 3000 by person in the sequential language - and
have the result returned descending?
In SQL it's easy: "select salesperson, sum(sales) from sales group by
salesperson having sum(sales)>3000 order by sum(sales) desc".
What I admore about SQL is that it makes complex operations like this,
hidden from me. It is the database problem to read through the data,
sort it, ground/break the data down and give me the result. I simply
tell it what to get. Not how to get it.
> Another useful feature was
> the "stacking" of the selection set:
>
> >>> find all
> 1367 records found
> >>> find zip = 02134
> 500 records found
> >>> consider on
> >>> find all
> 500 records found
> >>> find year = 1995
> 25 records found
> >>> consider on
> >>> find all
> 25 records found
> >>> consider off
> >>> find all
> 500 records found
> >>> consider off
> >>> find all
> 1367 records found
This sequence is very confusing to me. What's "consider"? Getting the
number of records on conditions can actually be very easy to in one
single SQL:
select count(case zip when 02134 then 1 otherwise null end case)
num_zip_02134
, count(case year when 1995 then 1 otherwise null end case) num_year
from my_table;
Or in today's SQL world:
select ( select count(*) from my_table where zip = 02134 ) num_zip
, ( select count(*) from my_table where year = 1995 ) num_year
from dual;
(ok, I know - it's been in Oracle for many years; I would like to know
how other vendors have implemented structures like that?).
The big advantage of the last approach is of course the use of indexes.
In today's market with statitical analysis, break down of all data with
simple 1-2-3 line SQLs it's hard for me to understand the longing from
sequential languages.
> While it may not be comparable to databases used by multinational
> corporations, we kept 30 years of demographic data as well as
> standardized test responses (~500 questions) and scores on roughly
> 40,000 students/year in our databases.
Reminds me of a contract I had I with the national statistical bearo of
Denmark. For a database guy I thought I had landed THE contract; just to
learn that they actually didn't use database management systems at all,
until the project I was on. Until then, the processed national data,
collections etc. with an inhouse developed language, that more or less
was sequential processing directives like what you specify. They would
load a tape of data, and the script would read record by record, and do
simple extractions and accumulations - and spit out the result.
It was impressive, but also very restrictive to them. Imagine the
complexity of trying to co-related 2, 3 or more datasets/tapes? They had
to do the "join" stuff, or have cobol programmers create new sets that
they could process. All in all, they had realized they needed databases
and hence - ME :) Too bad some scureze of an Oracle sales-person had
told them they could run Oracle on a IBM VM mainframe ... something
nobody else in Denmark - or the world was really doing ;) Ohhh boy,
this was in the days of OS/2 ... those were some server-farms they had.
My point is, that if your data can be organized basically as simple
files and you have little interaction between your datasets, you also
have little need for a DBMS?
> I can't tell you much about
> the resource usage, but in terms of human productivity, it beat the
> crap out of SQL. Unfortunately, when the DECsystem-10 and the VAX
> went, so did our licenses for System 1022/1032.
Based on the assumption that your programmers didn't know SQL, yes of
course it was faster. For projects where your team already have
experience in a given tool, using the "better" tool can be quite costly
and not make sense. In the long run, I doubt an approach like that saves
any time.
Another subject I guess we can talk about one day? I see that often on
this list; we speak from angles of experience. What's easy for one, is
hard for another. From that to declaring one system better than another,
simply because we know one better, well - that's not fair/right to anyone.
> There may have been an SQL engine buried in it somewhere, but it none
> of the manuals spoke of such a beastie, as far as I recall.
That will not happen. They're diametrically different. You have no
control on how data is retrieved from the relational database in the
sense of what's done first, second and third in the processing of the
data. And that's a feature - not a bug. ARGH - did I really use Oracle's
(tm) statement?? Time to look for another career I guess.
In short, it looks like we do have some interest in database subjects
here for next year?? I know we have a limited time (2 hours) at the
conference center. Is there a way of getting more time, so we could have
more than one subject??
Regards
Peter Larsen
More information about the Novalug
mailing list