Programming Trends to Follow?

OO and relational theory are both data-centric; that is why OO is called Object Oriented. Relational theory, though, is based on a mathematical foundation, and it is also concerned with process. It is really a matter of how the data is structured. The theoretical basis for an RDB is much sounder than that of OO.

A well-designed database schema allows a system to be reconstructed if the codebase is lost (I know...I've had to do it before in MS Access when all the VBA modules were lost). However, it is difficult to visualize what the process is unless you already know something about it.

On the other hand, a well-designed OO system tends to result in a persistence layer that reflects good database design. In fact, using Fluent NHibernate, I create the database schema entirely in code without having to look at the RDBMS at all.
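Roughly the same idea on the Java side, as a minimal sketch using JPA annotations (an analogue of the Fluent NHibernate approach above, not the same library; the class and column names are made up). With an ORM's schema-generation feature switched on, the table can be created from this mapping without touching the RDBMS directly:

```java
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import javax.persistence.Table;

// Hypothetical domain class; the mapping metadata below is enough for the
// ORM to generate the CUSTOMER table, so the schema lives in the code.
@Entity
@Table(name = "CUSTOMER")
public class Customer {

    @Id
    @GeneratedValue
    private Long id;              // primary key, generated by the database

    @Column(nullable = false, length = 100)
    private String name;          // NOT NULL VARCHAR(100) column

    @Column(unique = true)
    private String email;         // unique constraint on the EMAIL column

    protected Customer() { }      // no-arg constructor required by JPA

    public Customer(String name, String email) {
        this.name = name;
        this.email = email;
    }
}
```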

Domain Driven Design (IMO) attempts to smooth over the differences between data and process by, in part, using some of the same language to conceptualize units of data...for instance, the term "Entity" in DDD is practically interchangeable with the same term in database design.
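To make that overlap concrete, here is a small illustrative sketch (the class is hypothetical) of a DDD-style entity in Java: equality is based on identity, just as a row in a table is identified by its primary key rather than by its attribute values.

```java
// Illustrative DDD entity: two objects are "the same" customer when their
// identities match, regardless of current attribute values, mirroring how a
// primary key identifies a row in a relational table.
public class CustomerEntity {

    private final long id;      // identity, analogous to the primary key
    private String name;        // mutable attribute, analogous to a column

    public CustomerEntity(long id, String name) {
        this.id = id;
        this.name = name;
    }

    public void rename(String newName) {
        this.name = newName;    // attributes change, identity does not
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof CustomerEntity
                && ((CustomerEntity) other).id == this.id;
    }

    @Override
    public int hashCode() {
        return Long.hashCode(id);
    }
}
```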
 
If data integrity is more important than performance, then the more normalized, the better. If performance is more important than data integrity, then denormalized tables are often best.

Denormalized tables are useful in OLAP and other data warehousing technologies.

This has been my experience as well. Denormalized tables see frequent use in situations where they are not the primary, authoritative source of information, but rather are built from such a primary. In such cases, integrity is enforced elsewhere.
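A small Java sketch of that pattern (the types are hypothetical): the normalized records are the authoritative source, and the denormalized row is rebuilt from them for fast reads, so integrity only has to be enforced on the source.

```java
// Authoritative, normalized source: customer data lives in one place.
record Customer(long customerId, String name, String city) { }
record Order(long orderId, long customerId, double total) { }

// Denormalized row, derived from the source purely for fast reporting reads;
// it duplicates customer columns, so it is rebuilt rather than edited.
record OrderReportRow(long orderId, String customerName, String city, double total) { }

final class OrderReportBuilder {
    static OrderReportRow build(Order order, Customer customer) {
        // Integrity is enforced upstream (the order really references this
        // customer); here we just flatten the join for the reporting side.
        return new OrderReportRow(order.orderId(), customer.name(),
                                  customer.city(), order.total());
    }
}
```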
 
The only industrial strength databases are still the SQL databases

Simply not true. You don't think Hadoop or CouchDB are industrial strength? How about Google BigTable? Is that "industrial strength"?

In fact, there's been a resurgence in the development and use of non-relational databases in recent years, driven partly by the focus on cloud computing and partly by the massive data requirements of actors like Google and Facebook.
 
This has been my experience as well. Denormalized tables see frequent use in situations where they are not the primary, authoritative source of information, but rather are built from such a primary. In such cases, integrity is enforced elsewhere.

The classic case I have heard of for one of the limitations of RDB theory is the bank account balance: you aren't going to keep a record of every transaction and add it all up each time someone wants a balance.
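A sketch of that trade-off in Java (the class is made up for illustration): the account keeps a stored balance that is updated as transactions arrive, instead of replaying the whole transaction history every time someone asks for it.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;

// Illustrative only: the balance is a maintained aggregate (a deliberate
// denormalization) so a lookup does not have to sum every past transaction.
class Account {
    private final List<BigDecimal> transactions = new ArrayList<>();
    private BigDecimal balance = BigDecimal.ZERO;   // stored running total

    void post(BigDecimal amount) {
        transactions.add(amount);                   // keep the history
        balance = balance.add(amount);              // update the aggregate
    }

    BigDecimal balance() {
        return balance;                             // O(1), no replay needed
    }

    BigDecimal recomputedBalance() {
        // The "pure" alternative: derive the balance from the history.
        // Correct, but increasingly expensive as the history grows.
        return transactions.stream().reduce(BigDecimal.ZERO, BigDecimal::add);
    }
}
```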

That is a valid point, as is the use of reporting tools such as OLAP, although I do wonder at the massive amount of resources it takes to build up the result, and whether a regular query would do just as well.

My point was more about OO databases, based on hierarchical concepts, which were proven years ago to be inherently dysfunctional. An OO system based on relational concepts would be, IMHO, a much better direction. There is nothing inherently wrong with OO; it was just the wrong-headed use of hierarchy that has wrong-footed it for many years.
 
Simply not true. You don't think Hadoop or CouchDB are industrial strength? How about Google BigTable? Is that "industrial strength"?

In fact, there's been a resurgence in the development and use of non-relational databases in recent years, driven partly by the focus on cloud computing and partly by the massive data requirements of actors like Google and Facebook.

Parallelism is great, MS SQL Server uses it, and it speeds up large queries significantly.
 
My point was more about OO databases, based on hierarchical concepts, which were proven years ago to be inherently dysfunctional.

The hierarchical concepts were proven dysfunctional, or the OO databases?

How were they proven dysfunctional?

Not being adversarial, just interested.
 
That is a valid point, as is the use of reporting tools such as OLAP, although I do wonder at the massive amount of resources it takes to build up the result, and whether a regular query would do just as well.

OLAP works best when the system makes use of command/query separation and inserts and updates are event-driven. If this is the case, then you can have two different systems listening for the events; one the transaction database and one the reporting/OLAP database. The data warehouse would take a lot of resources, but it would be a separate system whose operation doesn't hurt performance of the production system.
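A minimal sketch of that arrangement (the interfaces and names are invented for illustration): both stores subscribe to the same events, so the transactional system and the reporting/OLAP system stay independent of each other. In a real system the bus would be a message broker and the stores would be separate databases.

```java
import java.util.ArrayList;
import java.util.List;

// The event that both sides listen for.
record OrderPlaced(long orderId, double total) { }

interface OrderEventListener {
    void on(OrderPlaced event);
}

// Listener 1: the transactional (write-side) store.
class TransactionStore implements OrderEventListener {
    public void on(OrderPlaced event) {
        // e.g. INSERT into the normalized production tables
        System.out.println("TX store: recording order " + event.orderId());
    }
}

// Listener 2: the reporting/OLAP side, updated from the same events but
// running on separate resources, so heavy aggregation work stays off the
// production database.
class ReportingStore implements OrderEventListener {
    public void on(OrderPlaced event) {
        System.out.println("OLAP store: updating cubes for " + event.orderId());
    }
}

class EventBus {
    private final List<OrderEventListener> listeners = new ArrayList<>();
    void subscribe(OrderEventListener l) { listeners.add(l); }
    void publish(OrderPlaced e) { listeners.forEach(l -> l.on(e)); }
}
```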
 
The hierarchical concepts were proven dysfunctional, or the OO databases?

How were they proven dysfunctional?

Not being adversarial, just interested.

The hierarchical database, such as the one sold by IBM for many years, was proven to be dysfunctional over time. You had to arrange the data in its hierarchy, and you naturally ran into problems with data that came along later: insertions into the hierarchy, and how to relate the data to data not in the hierarchy. It was a dead end: once set in place, the hierarchy was inflexible and allowed only one direct view of your data.


The RDB then came along, which solved that problem of a 'set in concrete' view of your data. It was based on mathematical concepts, not ad hoc daydreaming. In a very short period of time, the hierarchical database was consigned to the backwaters of legacy systems and a few dedicated applications that just wanted something simple and inflexible. The RDB rapidly became the most popular database model, and still is, because it works.

The OO concept, while good in many ways, used the hierarchical data model. OO is not intrinsically wrong, but it used a bad metadata organisation. It was supposed to provide multiple inheritance, but that never worked properly, so Java dropped that concept. But even single inheritance still causes problems, when something is not where it should be or when it would be better placed somewhere else. Java provides interfaces, but you have to code these yourself.
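For what it's worth, the usual Java workaround is exactly that "code it yourself" route: implement several interfaces and delegate to helper objects, which gives multiple-inheritance-like reuse without the class sitting in two places in a hierarchy. The types below are invented purely for illustration.

```java
interface Printable {
    String print();
}

interface Persistable {
    void save();
}

// Reusable behaviour lives in ordinary helper classes...
class PrintSupport {
    String printOf(Object o) { return "printed: " + o; }
}

class PersistSupport {
    void saveObject(Object o) { System.out.println("saved: " + o); }
}

// ...and a class "inherits" from both by implementing the interfaces and
// delegating by hand, since Java allows only single class inheritance.
class Invoice implements Printable, Persistable {
    private final PrintSupport printer = new PrintSupport();
    private final PersistSupport store = new PersistSupport();

    @Override
    public String print() { return printer.printOf(this); }

    @Override
    public void save() { store.saveObject(this); }
}
```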
 
The hierarchical database, such as the one sold by IBM for many years, was proven to be dysfunctional over time. You had to arrange the data in its hierarchy, and you naturally ran into problems with data that came along later: insertions into the hierarchy, and how to relate the data to data not in the hierarchy. It was a dead end: once set in place, the hierarchy was inflexible and allowed only one direct view of your data.


The RDB then came along, which solved that problem of a 'set in concrete' view of your data. It was based on mathematical concepts, not ad hoc daydreaming. In a very short period of time, the hierarchical database was consigned to the backwaters of legacy systems and a few dedicated applications that just wanted something simple and inflexible. The RDB rapidly became the most popular database model, and still is, because it works.

The OO concept, while good in many ways, used the hierarchical data model. OO is not intrinsically wrong, but it used a bad metadata organisation. It was supposed to provide multiple inheritance, but that never worked properly, so Java dropped that concept. But even single inheritance still causes problems, when something is not where it should be or when it would be better placed somewhere else. Java provides interfaces, but you have to code these yourself.
So... To sum up your statements about OO:

Hammers are "dysfunctional" because I cannot use them to saw a board in half.
 
The OO concept, while good in many ways, used the hierarchical data model.

No it doesn't. Object-oriented programming has a hierarchical type system and, in some sense, a hierarchical system for behaviour, but there is nothing intrinsically hierarchical about the way data is represented in OOP. If anything, OOP lends itself to structuring data as a directed graph.
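A small illustration of that point (the classes are hypothetical): the object references below form a directed graph with a cycle, not a tree, regardless of how the types themselves are arranged.

```java
import java.util.ArrayList;
import java.util.List;

// Each Employee points to a Department and each Department points back to
// its Employees: the *data* is a directed (cyclic) graph, whatever the
// shape of the type hierarchy.
class Department {
    final String name;
    final List<Employee> staff = new ArrayList<>();
    Department(String name) { this.name = name; }
}

class Employee {
    final String name;
    Department department;          // edge to the department
    Employee mentor;                // edge to another employee (may be null)
    Employee(String name) { this.name = name; }
}

class GraphDemo {
    public static void main(String[] args) {
        Department dev = new Department("Development");
        Employee alice = new Employee("Alice");
        Employee bob = new Employee("Bob");
        alice.department = dev;
        bob.department = dev;
        bob.mentor = alice;         // extra edge: no single hierarchy here
        dev.staff.add(alice);
        dev.staff.add(bob);         // back-edges complete the cycle
    }
}
```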
 
The hierarchical database, such as the one sold by IBM for many years, was proven to be dysfunctional over time. You had to arrange the data in its hierarchy, and you naturally ran into problems with data that came along later: insertions into the hierarchy, and how to relate the data to data not in the hierarchy. It was a dead end: once set in place, the hierarchy was inflexible and allowed only one direct view of your data.

This is only a problem if one is trying to apply the technology indiscriminately. All tools have limitations, even relational systems.

The RDB rapidly became the most popular database model, and still is, because it works.

It became popular because it provided a solution to a very common problem. It was not inherently better than other models; it was just better at doing what a lot of people needed.

The OO concept, while good in many ways, used the hierarchical data model. OO is not intrinsically wrong, but it used a bad metadata organisation. It was supposed to provide multiple inheritance, but that never worked properly, so Java dropped that concept. But even single inheritance still causes problems, when something is not where it should be or when it would be better placed somewhere else. Java provides interfaces, but you have to code these yourself.

Do not confuse type-taxonomy with data-model. They are not the same thing. One could have a complex class-hierarchy of file-types, for instance, and a data-model that's just a set of filenames.
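A tiny Java sketch of that distinction (the types are invented for illustration): the class hierarchy classifies file types, while the data model being managed is nothing more than a set of filenames.

```java
import java.util.Set;

// Type taxonomy: a hierarchy of file *types*, used to organise behaviour.
abstract class FileType {
    abstract String describe(String filename);
}

class ImageFile extends FileType {
    String describe(String filename) { return filename + " (image)"; }
}

class TextFile extends FileType {
    String describe(String filename) { return filename + " (text)"; }
}

class Catalogue {
    // Data model: just a flat set of filenames, no hierarchy in sight.
    private final Set<String> filenames = Set.of("a.png", "b.txt", "c.txt");

    Set<String> filenames() { return filenames; }
}
```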
 
And don't forget that RDBMSes are not so strong at dealing with hierarchies. See for instance Joe Celko's Trees and Hierarchies where he struggles to deal with some fairly simple* cases.


*for given values of "fairly simple"
 
And don't forget that RDBMSes are not so strong at dealing with hierarchies. See for instance Joe Celko's Trees and Hierarchies where he struggles to deal with some fairly simple* cases.


*for given values of "fairly simple"

Dealing with recursive relationships (trees of arbitrary depth) gets ugly really quickly in an RDBMS setting.

I found this out when designing an EDI system.
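A sketch of why it gets ugly (the table shape and class names are made up): the relational side only stores flat parent-child rows, so the application has to reassemble the tree of arbitrary depth itself, or lean on recursive SQL or Celko-style nested sets on the database side.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// One flat row per node, as it would come back from an adjacency-list table
// (id, parent_id); parentId == null marks the root.
record NodeRow(long id, Long parentId, String label) { }

class TreeNode {
    final NodeRow row;
    final List<TreeNode> children = new ArrayList<>();
    TreeNode(NodeRow row) { this.row = row; }
}

class TreeAssembler {
    // Rebuild the tree in application code; a plain join can only go one
    // level at a time, which is exactly the awkwardness described above.
    static TreeNode assemble(List<NodeRow> rows) {
        Map<Long, TreeNode> byId = new HashMap<>();
        rows.forEach(r -> byId.put(r.id(), new TreeNode(r)));
        TreeNode root = null;
        for (NodeRow r : rows) {
            if (r.parentId() == null) {
                root = byId.get(r.id());
            } else {
                byId.get(r.parentId()).children.add(byId.get(r.id()));
            }
        }
        return root;
    }
}
```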
 
This is only a problem if one is trying to apply the technology indiscriminately. All tools have limitations, even relational systems.



It became popular because it provided a solution to a very common problem. It was not inherently better than other models; it was just better at doing what a lot of people needed.



Do not confuse type-taxonomy with data-model. They are not the same thing. One could have a complex class-hierarchy of file-types, for instance, and a data-model that's just a set of filenames.

I said metadata because I believe that it is just as valid a use of relationships. Why not have classes defined using a relational model? Another hierarchical system that leads to pain is the file system that was developed for C and appears to be the universal model for a file system these days. It is a mess.

Hierarchies are good where needed, a relational database index is implemented as a hierarchy in every product I've used.
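As a small Java analogue of that point (illustrative only): java.util.TreeMap is itself a balanced tree, and using it as an index over flat records is exactly the "hierarchy where needed" case.

```java
import java.util.TreeMap;

class IndexDemo {
    record Row(long id, String name) { }

    public static void main(String[] args) {
        // The data itself is flat; the TreeMap (a red-black tree) is only an
        // access structure, much like a B-tree index over a relational table.
        TreeMap<Long, Row> byId = new TreeMap<>();
        byId.put(3L, new Row(3, "gamma"));
        byId.put(1L, new Row(1, "alpha"));
        byId.put(2L, new Row(2, "beta"));

        Row hit = byId.get(2L);                        // indexed point lookup
        var range = byId.subMap(1L, true, 2L, true);   // indexed range scan
        System.out.println(hit + " " + range.size());
    }
}
```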

What I am questioning is what appears to be the default for many developments: a data system that is hierarchical, even though it was discovered many years ago to be inherently problematic. The newest mess is XML, for example.
 
Why not have classes defined using a relational model?

I'm not sure what value, if any, that would have. Object orientation is about behavior, not data.

What I am questioning is what appears to be the default for many developments: a data system that is hierarchical, even though it was discovered many years ago to be inherently problematic.

It's not inherently problematic. It's just limited in a way that relational systems are not. Whether or not that limitation is a problem depends on the situation.

The newest mess is XML, for example.

XML is perfectly fine in some cases. Unfortunately, as with any technology, people sometimes employ it inappropriately.
 
Since all computing is about data in some way, the data is important, including the metadata.

There are two things wrong with this argument.

First, and least important, not all computing is about data in some way. A program that calculates the value of pi, or the embedded system that controls the anti-lock braking system on your car, does not deal with data in this sense at all. (There is a difference between inputs and data.)

Secondly, and more importantly, even though most computing systems do involve data at some level it doesn't follow that data is the only or the central or most interesting aspect of them. For many systems behaviour is the aspect that will need the most modeling. For those systems (and, as data-flow is behaviour, that is most non-trivial systems -- even the data-heavy ones) the implementation language should be tooled more towards matching the behaviour model than the data model.
 
