[Novalug] NoSQL databases (and Cassandra)
Peter Larsen
plarsen at famlarsen.homelinux.com
Wed Aug 4 11:00:39 EDT 2010
I've followed the "NoSQL" arguments with interest over the last year.
Since I took part in the infancy of SQL, I still remember the arguments
for/against SQL vs. the more traditional databases like Network and
Hierarchical, and even traditional files.
And I cannot help but see "NoSQL" as a simple "Polyfile" implementation:
a simple flat file with an index on top, and the programming effort
seems to be very file oriented. So far, the NoSQL alternatives I've read
about are all severely restricted: a single set, no joins and in most
cases a single key. In other words, the equivalent of a single table
in SQL; there's not really any design needed to create that
implementation with SQL. Even with SQLite.
To me, access to diverse implementations of data storage through a
common and standardized language is a big plus. I'm not really "married"
to relational databases, but to a common standard for accessing the data
layer I certainly am. Going backwards to API-based access isn't something
that looks promising to me.
Your observations about hashes are correct. That's really all the NoSQL
implementations I've seen have been. But it's still done through a single
key. Not a biggie with SQL either: select * into :var from table where
key = :id and I have a single structure in the variable var with all the
columns in that one table based on the key (not pretty SQL but it'll work).
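To make the single-key point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name kv, its columns and the sample row are made up for illustration; the only point is that one keyed SELECT hands back the whole record as a single structure.

```python
import sqlite3

# Hypothetical single-table "key-value" store; the table name, columns and
# sample data are illustrative assumptions, not anything from a real system.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets a row be converted to a dict by column name
conn.execute("CREATE TABLE kv (id TEXT PRIMARY KEY, name TEXT, city TEXT)")
conn.execute("INSERT INTO kv VALUES (?, ?, ?)", ("u1", "alice", "Vienna"))


def get_record(conn, key):
    """Return every column for one key as a dict, or None if the key is absent."""
    row = conn.execute("SELECT * FROM kv WHERE id = :id", {"id": key}).fetchone()
    return dict(row) if row is not None else None


record = get_record(conn, "u1")
```

The lookup is the whole "design": one table, one key, one structure back.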
It looks to me as though programmers are going to repeat their
data-layer mistakes in the DB, introducing a lot of redundancy. While
that makes queries easier, it certainly makes updates a mess.
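A small sketch of what I mean by updates becoming a mess, with plain Python dicts standing in for the store. All record layouts and names here are hypothetical. When each record carries its own copy of a value, changing that value means finding and rewriting every copy; when the value lives in one place, it is one write.

```python
# Denormalized: each order record carries its own copy of the customer's email.
# (Hypothetical data layout, for illustration only.)
orders = {
    "o1": {"customer": "c1", "email": "old@example.com", "item": "disk"},
    "o2": {"customer": "c1", "email": "old@example.com", "item": "cable"},
    "o3": {"customer": "c2", "email": "other@example.com", "item": "fan"},
}

# Normalized: the email lives in one place and orders would only reference it.
customers = {
    "c1": {"email": "old@example.com"},
    "c2": {"email": "other@example.com"},
}


def update_email_denormalized(orders, customer, new_email):
    """Rewrite every record holding a copy; return how many were touched."""
    touched = 0
    for rec in orders.values():
        if rec["customer"] == customer:
            rec["email"] = new_email
            touched += 1
    return touched


def update_email_normalized(customers, customer, new_email):
    """Rewrite the single authoritative copy; always one write."""
    customers[customer]["email"] = new_email
    return 1


n_denorm = update_email_denormalized(orders, "c1", "new@example.com")
n_norm = update_email_normalized(customers, "c1", "new@example.com")
```

Miss one copy in the denormalized case and the data quietly disagrees with itself.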
As some of you know, I've got quite a history with Oracle's DB. There
we've been able to break even 1st normal form since 10g, meaning we can
retrieve multi-dimensional datasets in "one row" based on one key, even
sets that are dynamic in nature. Of course the advantage is quick access
to a large set of data, and from reading about the NoSQL efforts, that is
exactly the effect they're going for. I just wonder what happens when
they need only a subset of their data in other parts of the application.
They'll end up running into the same problems network/hierarchical
databases have/had: you spend a lot of time and effort fighting the model
instead of getting help from it.
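As a rough illustration of the subset problem, again with Python dicts standing in for a store that keeps everything nested under one key (the layout and names are assumptions of mine): fetching one customer's whole history is trivial, but a question that cuts across keys has to read every document in full.

```python
# Hypothetical store: one key per customer, with the whole order history
# nested under that key.
store = {
    "c1": {"name": "A", "orders": [{"item": "disk", "qty": 2},
                                   {"item": "fan", "qty": 1}]},
    "c2": {"name": "B", "orders": [{"item": "disk", "qty": 5}]},
}


def orders_for(store, key):
    """The access path the model was built for: one key, everything under it."""
    return store[key]["orders"]


def total_qty(store, item):
    """A cross-cutting question: every document must be walked to answer it."""
    return sum(
        order["qty"]
        for doc in store.values()
        for order in doc["orders"]
        if order["item"] == item
    )


disk_total = total_qty(store, "disk")
```

The first function is what the model helps with; the second is where you start fighting it.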
--
Best Regards
Peter Larsen
Wise words of the day:
If loving linux is wrong, I dont wanna be right.
-- Topic for #LinuxGER
On Tue, 2010-08-03 at 16:40 -0400, Doug Toppin wrote:
> Just an fyi that I have been experimenting with the Cassandra
> database. It's a popular one in the burgeoning area of "NoSQL"
> databases.
> If you really do not need relational capabilities a NoSQL db will
> probably significantly reduce some of the issues that you may have in
> designing and implementing a large data store. Being able to quickly
> stand up new instances for additional capacity is very useful vs
> having to mess with sharding and failover/replication impacts.
>
> I will post more on this later if anyone is interested. If anyone is
> already knowledgeable on this subject please pass along any thoughts
> or experiences that you might have.
>
> So far, it has been interesting messing with it. I have been
> experimenting with records containing a large number of fields (1,200
> for example). If you viewed the db as a multidimensional hash table it
> might give you a better perspective on what it is.
>
> More at:
> http://en.wikipedia.org/wiki/Apache_Cassandra
> http://wiki.apache.org/cassandra/
>
> Doug
> _______________________________________________
> Novalug mailing list
> Novalug at calypso.tux.org
> http://calypso.tux.org/mailman/listinfo/novalug