[Novalug] Enterprise Linux: Object/Relational Mapping is the Vietnam of Computer Science
Demetrius Gallitzin
gallitzin at gmail.com
Wed Feb 28 21:59:26 EST 2007
Because this article is so timely and well-written (maybe a bit long
in areas), it deserves a mention of its own.
Object/Relational Mapping is the Vietnam of Computer Science
http://blogs.tedneward.com/2006/06/26/The+Vietnam+Of+Computer+Science.aspx
PDF format:
http://www.odbms.org/download/031.01%20Neward%20The%20Vietnam%20of%20Computer%20Science%20June%202006.PDF
Summary:
O/R mapping represents a quagmire which starts well, gets more
complicated as time passes, and before long entraps its users in a
commitment that has no clear demarcation point, no clear win
conditions, and no clear exit strategy.
Given, then, that objects-to-relational mapping is a necessity in a
modern enterprise system, how can anyone proclaim it a quagmire from
which there is no escape? Again, Vietnam serves as a useful analogy
here--while the situation in South Indochina required a response from
the Americans, there were a variety of responses available to the
Kennedy and Johnson Administrations, including the same kind of
response that the recent fall of Suharto in Indonesia generated from
the US, which is to say, none at all. (Remember, Eisenhower and Dulles
didn't consider South Indochina to be a part of the Domino Theory in
the first place; they were far more concerned about Japan and Europe.)
Several possible solutions present themselves to the O/R-M problem,
some requiring some kind of "global" action by the community as a
whole, some more approachable to development teams "in the trenches":
1. Abandonment. Developers simply give up on objects entirely, and
return to a programming model that doesn't create the
object/relational impedance mismatch. While distasteful, in certain
scenarios an object-oriented approach creates more overhead than it
saves, and the ROI simply isn't there to justify the cost of creating
a rich domain model. ([Fowler] talks about this in some depth.) This
eliminates the problem quite neatly, because if there are no objects,
there is no impedance mismatch.
2. Wholehearted acceptance. Developers simply give up on relational
storage entirely, and use a storage model that fits the way their
languages of choice look at the world. Object-storage systems, such as
the db4o project, solve the problem neatly by storing objects directly
to disk, eliminating many (but not all) of the aforementioned issues;
there is no "second schema", for example, because the only schema used
is that of the object definitions themselves. While many DBAs will
faint dead away at the thought, in an increasingly service-oriented
world, which eschews the idea of direct data access but instead
requires all access go through the service gateway thus encapsulating
the storage mechanism away from prying eyes, it becomes entirely
feasible to imagine developers storing data in a form that's much
easier for them, rather than for DBAs, to use.
3. Manual mapping. Developers simply accept that it's not such a
hard problem to solve manually after all, and write straight
relational-access code to return relations to the language, access the
tuples, and populate objects as necessary. In many cases, this code
might even be automatically generated by a tool examining database
metadata, eliminating some of the principal criticism of this approach
(that being, "It's too much code to write and maintain").
4. Acceptance of O/R-M limitations. Developers simply accept that
there is no way to efficiently and easily close the loop on the O/R
mismatch, and use an O/R-M to solve 80% (or 50% or 95%, or whatever
percentage seems appropriate) of the problem and make use of SQL and
relational-based access (such as "raw" JDBC or ADO.NET) to carry them
past those areas where an O/R-M would create problems. Doing so
carries its own fair share of risks, however, as developers using an
O/R-M must be aware of any caching the O/R-M solution does within it,
because the "raw" relational access will clearly not be able to take
advantage of that caching layer.
5. Integration of relational concepts into the languages.
Developers simply accept that this is a problem that should be solved
by the language, not by a library or framework. For the last decade or
more, the emphasis on solutions to the O/R problem has focused on
trying to bring objects closer to the database, so that developers can
focus exclusively on programming in a single paradigm (that paradigm
being, of course, objects). Over the last several years, however,
interest in "scripting" languages with far stronger set and list
support, like Ruby, has sparked the idea that perhaps another solution
is appropriate: bring relational concepts (which, at heart, are
set-based) into mainstream programming languages, making it easier to
bridge the gap between "sets" and "objects". Work in this space has
thus far been limited, constrained mostly to research projects and/or
"fringe" languages, but several interesting efforts are gaining
visibility within the community, such as functional/object hybrid
languages like Scala or F#, as well as direct integration into
traditional O-O languages, such as the LINQ project from Microsoft for
C# and Visual Basic. One such effort that failed, unfortunately, was
the SQL/J strategy; even there, the approach was limited, not seeking
to incorporate sets into Java, but simply allow for embedded SQL calls
to be preprocessed and translated into JDBC code by a translator.
6. Integration of relational concepts into frameworks. Developers
simply accept that this problem is solvable, but only with a change of
perspective. Instead of relying on language or library designers to
solve this problem, developers take a different view of "objects" that
is more relational in nature, building domain frameworks that are more
directly built around relational constructs. For example, instead of
creating a Person class that holds its instance data directly in
fields inside the object, developers create a Person class that holds
its instance data in a RowSet (Java) or DataSet (C#) instance, which
can be assembled with other RowSets/DataSets into an easy-to-ship
block of data for update against the database, or unpacked from the
database into the individual objects.
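
To make options 3 and 6 concrete, here is a minimal sketch in Python
using the standard-library sqlite3 module. It is not from the article:
the person table, its columns, and the names load_people,
RowBackedPerson, pending_update and demo_connection are all
hypothetical, and a plain dict copy of a sqlite3.Row stands in for a
RowSet/DataSet.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Person:
    """Option 3: a plain object populated by hand from a result tuple."""
    id: int
    first_name: str
    last_name: str


def load_people(conn, last_name):
    """Run ordinary SQL and copy each returned tuple into a Person."""
    rows = conn.execute(
        "SELECT id, first_name, last_name FROM person "
        "WHERE last_name = ? ORDER BY id",
        (last_name,),
    )
    return [Person(*row) for row in rows]


class RowBackedPerson:
    """Option 6: the object keeps its state in the row, not in fields."""

    def __init__(self, row):
        self._row = dict(row)   # mutable copy of the fetched row
        self._dirty = set()     # names of columns changed since loading

    @property
    def last_name(self):
        return self._row["last_name"]

    @last_name.setter
    def last_name(self, value):
        self._row["last_name"] = value
        self._dirty.add("last_name")

    def pending_update(self):
        """Return (sql, params) for the changed columns, or None if clean."""
        if not self._dirty:
            return None
        cols = sorted(self._dirty)
        assignments = ", ".join(f"{col} = ?" for col in cols)
        params = [self._row[col] for col in cols] + [self._row["id"]]
        return f"UPDATE person SET {assignments} WHERE id = ?", params


def demo_connection():
    """Build a throwaway in-memory database for the sketch (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row  # rows support access by column name
    conn.execute(
        "CREATE TABLE person ("
        "id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
    )
    conn.executemany(
        "INSERT INTO person VALUES (?, ?, ?)",
        [(1, "Ada", "Lovelace"), (2, "Ted", "Codd"), (3, "Edgar", "Codd")],
    )
    return conn
```

The manual mapper is only a few lines per query, which is the point
option 3 makes; the row-backed class trades ordinary fields for the
ability to hand its changed columns straight back to the database.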
Note that this list is not presented in any particular order; while
some are more attractive than others, which are "better" is a value
judgment that every developer and development team must make for
themselves.
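
One way the caching caveat in option 4 can bite is sketched below,
again in Python with sqlite3. CachingMapper is a toy, hypothetical
stand-in for an O/R-M session cache, not any real product's API; the
person table and its name column are likewise assumptions. The "raw"
UPDATE goes around the cache, so the mapper keeps serving the old
value until the entry is evicted.

```python
import sqlite3


class CachingMapper:
    """Toy stand-in for an O/R-M that caches loaded values by primary key."""

    def __init__(self, conn):
        self.conn = conn
        self._cache = {}

    def get_name(self, person_id):
        # Hit the database only on a cache miss.
        if person_id not in self._cache:
            row = self.conn.execute(
                "SELECT name FROM person WHERE id = ?", (person_id,)
            ).fetchone()
            self._cache[person_id] = row[0]
        return self._cache[person_id]

    def evict(self, person_id):
        """Drop a cached entry so the next read goes back to the database."""
        self._cache.pop(person_id, None)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO person VALUES (1, 'Old Name')")

mapper = CachingMapper(conn)
first = mapper.get_name(1)   # loads from the database and caches

# "Raw" relational access that the mapper knows nothing about.
conn.execute("UPDATE person SET name = 'New Name' WHERE id = 1")

stale = mapper.get_name(1)   # served from the cache, so still the old value
mapper.evict(1)
fresh = mapper.get_name(1)   # reloaded after the explicit eviction
```

Teams that mix the two styles need some convention like the explicit
evict call here, or must keep raw SQL away from tables the mapper
caches.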
Just as it's conceivable that the US could have achieved some measure
of "success" in Vietnam had it kept to a clear strategy and understood
a more clear relationship between commitment and results (ROI, if you
will), it's conceivable that the object/relational problem can be
"won" through careful and judicious application of a strategy that is
clearly aware of its own limitations. Developers must be willing to
take the "wins" where they can get them, and not fall into the trap of
the Slippery Slope by looking to create solutions that increasingly
cost more and yield less. Unfortunately, as the history of the Vietnam
War shows, even an awareness of the dangers of the Slippery Slope is
often not enough to avoid getting bogged down in a quagmire. Worse, it
is a quagmire that is simply too attractive to pass up, a Siren song
that continues to draw development teams from all sizes of
corporations (including those at Microsoft, IBM, Oracle, and Sun, to
name a few) against the rocks, with spectacular results. Lash yourself
to the mast if you wish to hear the song, but let the sailors row.