Table Oriented Programming (TOP)

Posted on 2nd July 2013 by Tony Marston

Amended on 5th September 2013

Introduction

Unless you are developing software which directly manipulates a real-world object, such as process control, robotics, avionics or missile guidance systems, then some of the properties and methods which apply to that real-world object may be completely irrelevant in your software representation. If, for example, you are developing an enterprise application such as Sales Order Processing which deals with entities such as Products, Customers and Orders, you are only manipulating the information about those entities and not the actual entities themselves. This information is held in a database in the form of tables, columns and relationships. Regardless of what properties a real-world object may have, in a database application it is only necessary to store those pieces of information that are actually required by that application. For example, each real-world product may have a large collection of attributes, but the application may not require anything more than an identity, a description and a price. Regardless of the operations that can be performed on a real-world object, with a database table the only operations that can be performed are Create, Read, Update and Delete (CRUD). For example, a real-world customer (person) may have such operations as eat, sleep, sit, stand, walk and run, but none of these operations will be needed in the application. Following the process called data normalisation the information for an entity may need to be split across several tables, each with its own columns and relationships, and in these circumstances I personally think that it would be a good idea to create a separate class for each table instead of having a single class for a collection of tables.

When it comes to writing Object Oriented (OO) applications which need to communicate with a relational database management system (RDBMS) there are basically two schools of thought:

That references to the database structure and SQL queries should be avoided at all costs, and that the structure of the software should be totally independent from the structure of the database.
That the application objects should have intimate knowledge of the database tables, their structure, and SQL queries, and that the software and database be kept fully synchronised at all times.

The former approach promotes discord between the software and the database, and requires special efforts to write application code which has no knowledge of the database structure. This results in a painful condition known as Object-Relational Impedance Mismatch. This problem requires a solution known as an Object Relational Mapper (ORM), but as this "solution" merely hides the problem beneath additional layers of complexity I do not think that it is much of a solution at all. It is over-engineered and extremely inefficient, and is a shining example of one of those eccentric machines designed by Heath Robinson or Rube Goldberg. This can lead to bad practices such as those described in 10 Common Mistakes Java Developers Make when Writing SQL and 10 More Common Mistakes Java Developers Make when Writing SQL.

The latter approach promotes harmony between the software and the database, and is used by programmers who know and understand the power of a relational database and the SQL language, and like to harness that power without having to deal with any artificial barriers. Consequently they design their classes around the structure of the database, thus avoiding the problem of Object-Relational Impedance Mismatch and its painful solution.

What is the source of this difference of opinion? I can think of two:

Before the rise of relational databases and the SQL language there were things such as hierarchical and network databases which required specialised function calls in the code instead of the construction and execution of SQL query strings. As relational databases began to become more popular the number of programmers who were familiar with the "new" SQL language were quite rare, so instead of training all existing staff in its use there was usually only one "expert" who was responsible for creating and maintaining a specialised component. All that the other programmers had to do was link to this "black box" component which did its magic without them getting their hands dirty.
With the rise of Object Oriented programming languages a lot of programmers hailed this new paradigm as the best thing since sliced bread, and regarded everything which was not OO as being beneath them and unworthy of further consideration. Relational databases fell into this category. "Relational Databases are not Object Oriented" they cried. "We cannot possibly work which such an outdated technology!" So rather than try to find a way to deal with this "old" technology as efficiently as possible they invented a balls-up of a solution which sacrificed efficiency and common-sense for "purity".

The former argument should have disappeared with the increased usage of relational databases as both they and the SQL language have become the de-facto standard. Any developer who works on database applications who does not know SQL should be in an extremely small minority, just as developers who work on web applications who do not know HTML should be. The original argument for hiding SQL was that few programmers could work with it, but some how this has been corrupted into "few programmers should work with it".

The latter argument I can put down to nothing more than pure bloody-mindedness. If an application has two components - database and software - which must communicate with each other then common sense should dictate that you make that communication as smooth and efficient as possible. If you deliberately choose to design each component using incompatible methodologies which you know are going to cause a problem, then you are turning yourself from a solutions provider into a problems provider and should be sent back to programming school where you can be taught to write more efficient and effective software.

In his article Domain Logic and SQL Martin Fowler, who is also the author of Patterns of Enterprise Application Architecture (PoEAA) has this to say:

Many application developers tend to treat relational databases as a storage mechanism that is best hidden away. Frameworks exist who tout the advantages of shielding application developers from the complexities of SQL.
Yet SQL is much more than a simple data update and retrieval mechanism. SQL's query processing can perform many tasks. By hiding SQL, application developers are excluding a powerful tool.

In his article OrmHate the also says the following:

I often hear people complain that they are forced to compromise their object model to make it more relational in order to please the ORM. Actually I think this is an inevitable consequence of using a relational database - you either have to make your in-memory model more relational, or you complicate your mapping code. I think it's perfectly reasonable to have a more relational domain model in order to simplify [or even eliminate] your object-relational mapping.

Note that the words or even eliminate are mine. Later in the same article he also says:

To use a relational model in memory basically means programming in terms of relations, right the way through your application. In many ways this is what the 90's CRUD tools gave you. They work very well for applications where you're just pushing data to the screen and back, or for applications where your logic is well expressed in terms of SQL queries. Some problems are well suited for this approach, so if you can do this, you should.

I have been building database applications for several decades in several languages, and I have personally designed and built many thousands of user transactions. No matter how complex an individual transaction may be, it always involves performing one or more CRUD operations on one or more database tables. All I have done is adapt my procedural programming method to encapsulate the knowledge of each database table in its own class, then use inheritance and polymorphism to increase code reuse and decrease code maintenance. This is supposed to be what OOP is all about, so how can I be wrong? In addition, because my class structures and database structures are always kept in sync I do not have any mapping code or any mapping problems.

In his article The power of table-oriented programming Fredrik Bertilsson has this to say:

The table-oriented programming model allows the application source code to be aware of the actual database structure instead of hiding it in a mapping layer deep in the application stack. Many enterprise applications have a lot of CRUD-related (create, read, update, delete) logic, and developing CRUD functionality is much simpler if the database structure is not hidden.
One argument for having an object model that doesn't correspond with the actual database structure is that the business logic should remain the same even if the database structure changes. But this argument neglects the fact that much business logic is implemented in a relational database schema. Changing the database schema will, by definition, also change the business logic.

Most OO programmers fail to realise that the effort required to hide the database structure is compounded by the effort required to deal with the fact that it is hidden. As Fredrik Bertilsson says:

One consequence of this refusal to model a relational database's real structure is an extra layer of abstraction. Table, columns, and foreign keys must be mapped to classes, properties, and associations. In most cases, there is one class for each table, one property for each column, and one association for each foreign key. Besides, SQL must be reinvented (Hibernate Query Language (HQL), JDO-QL, EJB-QL, etc). Why bother adding this extra layer? It adds complexity and provides few additional features.

This habit of (theoretically) solving a problem by adding on another layer of indirection or abstraction is often over-used. A quote usually attributed either to David Wheeler or Butler Lampson reads as follows:

There is no problem in computer science that cannot be solved by adding another layer of indirection, except having too many layers of indirection.

In a similar vein Michael Padlipsky wrote the following:

If you know what you're doing, three layers is enough; if you don't, even seventeen levels won't help.

My background with database applications

Most of today's programmers do not have any previous experience in the development of database applications using non-OO languages, so when they are taught how to do it the OO way they do not know if it is the best way, just the only way that they have been taught.

I, on the other hand, have decades of prior experience. Consequently I know what I have learned, which is much more than I have ever been taught. I was designing and building business applications before the rise of relational databases, OO languages and the internet. My primary language in the 1970s and 80s was COBOL, and while using it I learned two valuable lessons:

When building a database application the most important part is the database design - get this wrong and you are stuffed before you start.
The most efficient way to build software which accesses a database is to match your software structure with the database structure. This was one of the main points I learned on a Jackson Structured Programming course.

In the 1990s the software house where I worked changed from COBOL to UNIFACE, a component-based language that was platform independent and database independent. It originally started off by being 2 Tier as it came with a series of pre-built database drivers which could connect to any of the supported DBMS engines. This meant that the entire application could be switched from one DBMS to another simply by switching the database driver and without having to change a single line of application code. Each application component was built around one or more entities in the Application Model, which was an internal representation of the database. An entity in the application model matched a table in the database in a direct one-to-one relationship. The application model also contained details of any relationships between the tables by identifying which foreign key field(s) mapped with which primary key field(s) together with referential integrity.

A new version of the language made it possible to implement the 3 Tier Architecture with a form component responsible for the user interface and a separate middle-tier service component responsible for all the data validation, business rules and task-specific behaviour. It was possible for the same service component to be accessed by more than one form component. This was important as the new version also introduced the ability to create a type of form component which generated an HTML page for output to a web browser instead of a compiled screen which could only be displayed on a local monitor. It was therefore possible to take a desktop application and make it available on the web simply by creating a separate set of web-enabled form components which could share the existing business layer and data access layer components.

I liked the idea of the 3 Tier Architecture very much, but I did not like the way in which UNIFACE created HTML pages as it was too convoluted and required the use of proprietary HTML tags. This version of UNIFACE also introduced me to XML and XSL, but I was disappointed to see that it could only transform XML into XML and not into HTML which could be sent to the browser. Those deficiencies made me decide to switch to a different language which had a better method of generating HTML pages, which is what brought me to PHP.

As I had already built development frameworks in both COBOL and UNIFACE for the software house where I worked I saw no reason why I should not create a new one (or rebuild my previous ones) in PHP. This new version had to meet the following objectives:

It had to be based on the 3 Tier Architecture.
It had to generate all HTML output using XML and XSL transformations. I chose this method because it was written to W3C standards and not not specific to PHP, and it had the ability for the transformation to be run on the client instead of the server.
It had to take advantage of the OO capabilities in PHP.

As this was the year 2002 it was PHP4 that I used as PHP5 was not available until several years later. There are some people who claim that PHP4 was never a "proper" OO language, and some still make the same claim today regarding PHP5. I ignore them all because they are completely and utterly wrong. According to Alan Kay (who invented the term) and Bjarne Stroustrup (who designed and implemented the C++ programming language) the term "Object Oriented" can be covered by this broad definition:

A language or technique is object-oriented if and only if it directly supports:

Abstraction - providing some form of classes and objects.

Inheritance - providing the ability to build new abstractions out of existing ones.

Runtime polymorphism - providing some form of runtime binding.

As each of those concepts were supported in PHP4 I went ahead and used them in my new framework. Because I have been a long-time follower of the KISS principle I decided to stick with the simplest way of using encapsulation, inheritance and polymorphism to increase the amount of reusable code and thereby improve its maintainability. I have to say that when I saw some examples on the internet of how to write "proper" OO code, my previous years of experience told me that it was not the simplest way, so I chose to ignore it. In my COBOL days I saw plenty of examples of badly structured monolithic code which was littered with GOTOs, and which was given the name spaghetti code. Although this condition was supposed to be impossible with OO programming, all that happened was that the resulting mess was given a different name, such as ravioli code, lasagna code or Spaghetti with meatballs, where overzealous separation and encapsulation of code has resulted in bloated call stacks, which makes navigation through the code for maintenance purposes much more difficult. A typical example of this is the SwiftMailer library which uses 100 (one hundred!) classes to send a single friggin' email. WTF!!! I tried to debug the code once, but I got so pissed off with my IDE filling up with different class files, and seeing methods containing just a single line of code that I gave up in disgust.

The structure that I aimed to produce when I started to build my new framework is shown in Figure 1:

Figure 1 - the 3 Tier Architecture

A more detailed description of this architectural pattern can be found at What is the 3-Tier Architecture?.

This resulted in a different number of components in each layer, as shown in Figure 2:

Figure 2 - components in the 3 Tier Architecture

Note that this structure shows the following levels of sharability:

A single component in the Business layer can be shared by multiple components in the Presentation layer.
A single component in the Data Access layer can be shared by multiple components in the Business layer.

This is where I began to ignore the "advice" I saw in online tutorials and certain books as I could see that in their example code the coupling between each of those layers was too tight, causing a controller to be tied to a single model, or a model to be tied to a single controller, thus greatly reducing the possibility of reusing either of those components. For example, too many programmers had controllers which had the class names and the property names of the model hard-coded, which meant that the controller could only work with that particular class. Another "expert" had a separate Data Access Object (DAO) for each table in the database whereas in my framework I have a single DAO which can handle any number of different tables in any number of databases. I have a separate class for each different DBMS engine which allows me to switch the entire application between MySQL, PostgreSQL, Oracle and SQL Server without having to change any code in the Business layer.

Because I was used to writing code which worked with database tables, and because the service components in the middle tier of the UNIFACE 3 Tier Architecture were built around individual database tables, I saw absolutely no reason why I should not build my own middle tier components around individual database tables. This is why I created, and still do, a separate class for each table in my application database. My critics, of which there are many, are very fond of telling me that having a separate class for each database table is not good OO, but as their arguments are ~~so full of shit~~ so full of holes I choose to ignore them.

For my first table class I inserted all the code necessary to perform the CRUD (Create, Read, Update, Delete) operations. For my second table class I made a copy of the first one and changed all the references to table names and column names from "TableA" to "TableB". Then I used the supposedly "approved" technique of getting rid of duplicated code by comparing the two files and moving as much as possible into an abstract class which could then be inherited by each concrete class. This resulted in a large abstract class, but each concrete class ended up by being very small, consisting of only the following:

The database name (so that I can deal with multiple databases in the same application)
The table name
The table structure, obtained from the INFORMATION_SCHEMA.

This means that regardless of how many database table classes I have in my application they all inherit the majority of their code from a single abstract class while the data inside each class is limited to that which differentiates it from other tables. This way of inheriting code from abstract classes is supposed to be the way to reuse code in OOP, so how can it be wrong?

I also separated out the generation and execution of SQL statements into a separate DAO class so that I would be able to connect to different DBMS engines in the future. This came in useful when the method for accessing MySQL version 4.1 and above was upgraded from the original to the improved extension. This also enabled me to create classes to connect to PostgreSQL, Oracle and SQL Server. This is supposed to be an "approved" method of splitting up code so that different responsibilities are handled by different classes, so how can my implementation possibly be wrong?

Note here that my Model classes, one for each database table, are concerned only with data validation, business rules and task-specific behaviour, not the generation and execution of SQL statements.

When I built the components for the Presentation layer the fact that I produced all the HTML output by using an XSL stylesheet on an XML document meant that I had to split the component into two parts - the part that communicated with the Business layer and the part that handled the build-XML-and-transform-into-HTML. A colleague pointed out that this was an implementation of the Model-View-Controller design pattern, and after looking at the description I could see his point. My framework architecture can therefore be described as shown in Figure 3:

Figure 3 - MVC plus 3 Tier Architecture

A more detailed diagram can be found in RADICORE - A Development Infrastructure for PHP.

The MVC design pattern is supposed to be an "approved" method of splitting up code, so how can my implementation possibly be wrong?

Relational Databases are not Object Oriented

This is a common complaint, but does it actually have any merit? I suppose it depends largely on your definition of OO, but if you limit yourself to the concepts of encapsulation, inheritance and polymorphism, can you find anything in a relational database which matches these concepts? I think you can:

Encapsulation
The act of placing data and the operations that perform on that data in the same class. The class then becomes the 'capsule' or container for the data and operations.

Note that this requires ALL the properties and ALL the methods to be placed in the SAME class. Breaking a single class into smaller classes so that the count of methods in any one class does not exceed an arbitrary number is therefore a bad idea as it violates encapsulation and makes the system harder to read and understand.

In OO, you identify an entity, then put all the properties and methods for that entity in a single class. Thus the CUSTOMER, PRODUCT and ORDER entities each have their own class. The class becomes the blueprint for each instance (object) of that class.

In a database you put all the properties for each entity in their own tables. Thus the CUSTOMER, PRODUCT and ORDER entities each have their own table. The DDL script becomes the blueprint for each instance (row) of that class. As for operations, anyone who knows databases will tell you that there are only four operations that you can perform on a databases table - select, insert, update and delete - and their implementations are built into the DBMS engine, so you don't have to define them yourself.
Inheritance
The reuse of base classes (superclasses) to form derived classes (subclasses). Methods and properties defined in the superclass are automatically shared by any subclass.

In OO a class hierarchy is created when something is a type of something else. For example, "dog" is a class, but because "alsatian", "beagle" and "collie" are regarded as types of "dog" they are automatically represented as subclasses. This results in a structure similar to that shown in Figure 4:

Figure 4 - hierarchy of "dog" classes

With this approach you cannot introduce a new type of dog without creating a new subclass which uses the "extends" keyword to inherit from the superclass.

In a database you create a hierarchy by using a foreign key. So if each instance of "dog" has a "type", and each "type" has its own set of properties, you create a "type" table and then have a foreign key in the "dog" table which relates back to an entry in the "type" table. This results in the structure shown in Figure 5:

Figure 5 - hierarchy of "dog" and "dog-type" tables

With this design all the attributes of a particular type/breed of dog are stored on the DOG-TYPE table, so instead of a separate subclass for each DOG_TYPE I would have a separate row on the DOG-TYPE table. When reading from the DOG table you can include a JOIN in the SQL query so that the result combines the data from both tables. This is how you can "inherit" attributes in a database. The introduction of a new type of dog requires no more effort than adding a record to the DOG-TYPE table. There are no changes required to the software, no new classes, no new screens, no new database tables, no nothing. From a programmer's point of view this simple 2-table structure is far easier to deal with than an unknown number of subclasses.
Polymorphism
Same interface, different implementation. The ability to substitute one class for another. This means that different classes may contain the same method signature (which is not the same as an object interface), but the result which is returned by each method will be different as the code behind each method (the implementation) is different in each class.

Note that this does NOT require the use of the keywords "interface" and "implements" as these are totally optional in PHP. All that is required is that different classes implement the same method name with the same signature. This is usually achieved using inheritance using the "extends" keyword, but need not be.

In OO you achieve polymorphism by having the same method signature available in more than one class. This means that wherever you call that method you can use it on more than one class and it will still work. This increases reusability and provides loose coupling between the caller and the callee.

In databases the four operations - select, insert, update and delete - are automatically available on all tables, which means you have instant built-in polymorphism.

If you look at what is provided by encapsulation, inheritance and polymorphism instead of concentrating on how they are provided, you should see that a relational database does in fact achieve similar results. The fact that it does so by using a different mechanism should not be that much of an issue.

Neither are web pages - so what?

A large number of OO programmers claim to have difficulty with relational databases because they are not object oriented, but what exactly does this mean? When you consider that all you need to communicate with a database is an SQL query which is a simple string of plain text, and all you need to send an HTML document to a web browser is a simple string of plain text, then why don't the same programmers make the same complaint with the production of web pages? If programmers who deal with web pages don't have a problem with strings of HTML, then why do programmers who deal with database tables have such a problem with strings of SQL? They don't seem to have a problem with structuring their code in order to produce an HTML document as efficiently as possible, so what is the problem with SQL?

I think the following book title found on www.oreillymaker.com says it all:

Figure 6 - Book for ORM fanatics

Which is best - OO Design or Database Design?

Having become familiar with the processes of data analysis, Data Normalisation and database design over many years, and having successfully implemented many applications which used those designs, I wanted to find out what was so special about Object Oriented Design (OOD). I was bitterly disappointed. Having personally witnessed the results of badly structured code, followed by the benefits of code which was properly structured around the database design thanks to Jackson Structured Programming, I expected something which was at least workable. Instead I found a bunch of loose theories which were so vague, wishy-washy and open to much interpretation (and over-interpretation) that the results were totally incompatible with the database design. This was a problem that needed a solution, but the "solution" offered by these OO fanatics seemed to be nothing more than a patch-up job, papering over the cracks and hiding the problem under layers of obfuscation, indirection and redirection. As a pragmatist I decided that instead of trying to treat the symptoms of the problem the best solution would be to eliminate the problem all together. So if OOD, when compared with database design, is incapable of producing reliable and workable results my simple solution was to not waste any time on OOD at all.

There is an old proverb which sums it up quite succinctly: Prevention is better than Cure.

Further observations on this topic can be found at:

Table Oriented Programming in a nutshell

Traditional Object Oriented Programming is centered around objects whose structure is at odds with the database structure, whereas with Table Oriented Programming, which still uses objects, the structure of those objects is a mirror image of the database structure. This means that it is much easier to keep the two structures synchronised and in perfect harmony. It also makes it easier for those developers who already know and understand how SQL works to help build and maintain the software when each software component is directly related to a table in the database.

The idea of designing software which is centered around the database structure is not new. In the Wikipedia article called Database-centric architecture it says the following:

The use of table-driven logic, i.e. behavior that is heavily dictated by the contents of a database, allows programs to be simpler and more flexible.

Here are the simple steps you need to take to become a practitioner of this black art. Note that some of these steps may cause some OO purists to turn purple with rage and explode, so read on at your own peril!

The Object Model

Don't waste time with OOD to design your classes, then build the database later only to discover that the two structures are incompatible. Start with the database design and ignore OOD altogether. For each table in the database you create a separate class so that there is always a one-to-one relationship between the database and the software. As the operations which can be performed on a database table have already been fixed (create, read, update and delete), they can be specified in an abstract table class, so all that is necessary in each table class is to define the following:

The table name. I also include the database name as I quite often deal with large applications which have their data split across multiple databases, and I need the ability to switch from one database to another using a single database connection.
The table structure, which can be obtained from the INFORMATION_SCHEMA within the database. This information includes
- A list of column names, their types and their sizes. This makes it easy to automate the task of validating user input before it is sent to the database.
- The fields which comprise the primary key.
- The fields which comprise any candidate keys.
- All relationships, both parent and child.
- Additional information which is not available in the INFORMATION_SCHEMA.

Some programmers like to hold the column details and the relationship details in separate objects, but to my mind this breaks a fundamental rule of encapsulation which states that all the properties and all the methods for an entity must be defined in the same class. Data is held as either a primitive data type or an array. I don't bother with objects inside objects as they offer no advantages over arrays.

I spent 20 years as a database programmer before I became an OO programmer, so I know that the design of the database comes first, then the software must be matched to be synchronised with that structure. Consequently there are several OO principles which I refuse to follow as their results would be counter-productive:

Class Methods

I do not see any real reason to have method names which are specific to a particular table class as there are only four basic operations which can be performed on any database table - create, read, update and delete. I define these operations in my abstract table class so that they can be automatically inherited by every concrete table class. By having multiple classes which share the same methods I can also make greater use of polymorphism, such as having a small set of page controllers which can work with any table class within my application.

I should point out here that each of the create/read/update/delete methods in my abstract table class does not simply pass control to my DAO to communicate with the database. As shown in these UML diagrams each of these methods will execute a series of sub-methods in a predefined sequence. Some of these are empty methods which can be filled with code inside the concrete table class, thus enabling the developer to insert whatever code is necessary to execute business rules and task-specific behaviour.

But how is it possible to perform operations which are more complex than a simple create/read/update/delete? For example, in an accounting system I will want a function which transfers funds from one account to another. The steps are quite straightforward:

Create a user transaction for that operation which the user can select from a menu. There are patterns for creating/updating/deleting multiple records at a time, plus other variations, so there will always be one pattern available as a starting point.
Put the relevant code into the relevant customisable method in the table class. If different transactions require different processing it could be tricky to have all that code contained within a single class and then executed conditionally using a switch statement. It would be far easier to create a subclass which contains the different code, then tell the transaction to use this subclass instead of the main table class. The transaction will then automatically execute the correct code unconditionally without the need for any switch statements.

Note that in these circumstances I do not create a special method in the Model class to perform this operation, then create a special Controller to call this method as this would produce extremely tight coupling between that Controller and that Model, which is something I prefer to avoid if at all possible. I always start with a simple CRUD transaction from my catalogue of Transaction Patterns, then insert code into the relevant class methods to perform whatever actions are necessary.

As well as having one set of user transactions which perform the basic CRUD (Create/Read/Update/Delete) operations on a database table, I may want to create an additional transaction which performs more complex processing, such as transferring funds from one account to another, so how can I have this complex code in the same class as the basic code? The simple answer is that I don't. When I want a new transaction with some non-standard processing which requires program code that would conflict with existing code in that table's class I create a subclass of the original table class just for that transaction, and this subclass will contain only that code which is specific to that transaction. This still enables me to inherit any common code from both the parent table class and the abstract table class.

Note that what I do not do is create a new class method for this non-standard processing - I create a new (sub)class and reuse one of the existing generic methods. This means that instead of having to create a new controller to access a new method I can reuse one of my existing controllers. Using my framework I have created an Order Fulfilment/Supply Chain Management application which has over 450 database tables, over 1,200 relationships, and over 4,000 tasks (user transactions). While some of these tasks are relatively simple there are a significant number of tasks which are quite complex and some which are very complex. Yet my standard library contains only 40+ reusable controllers and 12+ reusable XSL stylesheets.

Class Properties

In every tutorial I saw each class had a separate property for each column which meant that each column had to be fed in (via a 'setter') or pulled out (via a 'getter') one at a time. This seemed painfully slow and laborious to me. Databases deal in record sets which can contain any number of rows and any number of columns, and as I could see how easy it was for PHP to pass such record sets around as arrays, especially the $_GET and $_POST arrays, I saw no reason why I could not pass the data from one object to another as a single array instead of feeding in one named column at a time.

This is supposed to be the *right* way using 'setters':

<?php 

$client = new Client(); 
$client->setUserID    ( $_POST['userID']   ); 
$client->setEmail     ( $_POST['email']    ); 
$client->setFirstname ( $_POST['firstname']); 
$client->setLastname  ( $_POST['lastname'] ); 
$client->setAddress1  ( $_POST['address1'] ); 
$client->setAddress2  ( $_POST['address2'] ); 
$client->setCity      ( $_POST['city']     ); 
$client->setProvince  ( $_POST['province'] ); 
$client->setCountry   ( $_POST['country']  ); 

if ($client->submit($db) !== true) 
{ 
    // do error handling 
} 
?>

This is my way:

<?php 

$dbobject = new Client;
$dbobject->insertRecord($_POST);
$errors = $dbobject->getErrors();

?>

Note here that $errors is an array of validation errors. If there are no validation errors then the record is added to the database, otherwise the screen is automatically populated with these error messages and the user invited to try again.

My approach is infinitely simpler, and can be made reusable by passing down the class name in a variable, as in:

<?php 

$dbobject = new $table_id;
$dbobject->insertRecord($_POST);
$errors = $dbobject->getErrors();

?>

That is why I define a single $fieldarray property in my abstract class and define nothing else in each concrete class. This allows me to get data in an out of an object as a single array regardless of how many rows and columns of data there are. If I need to access a particular column it is just as easy to use $this->fieldarray['column_name'] or even $this->fieldarray[$rownum]['column_name'] as it is $this->column_name, so there is no loss of functionality.

This means that I can easily change the number of columns in the array, either by removing columns that I don't want, or adding in columns from other tables via a JOIN, without having the calling code know which columns may or may not be present in the array.

Fetching Records

Having used SQL for quite some time I knew that all manner of different selection criteria can be placed into the WHERE clause of a SELECT statement, so when I see collections of specialised methods such as:

$array = $object->findByPK('1234');
$array = $object->findByName('Fred');
$array = $object->findByDate('2013-06-24');
$array = $object->findByStatus('PENDING');
$array = $object->findAll();

I have to ask the question: why are you restricting yourself to a subset of the possibilities when a single method could deal with any number of possibilities? Anybody who works with SQL knows that the WHERE clause is nothing more than a string of text which can contain any number of elements, so why not pass the entire string around as a single argument? I have found that the following single method can deal with any eventuality that I can possibly dream up:

$result = $object->getData($where);

The $where argument is a string which can contain anything which is valid in the WHERE clause of an SQL query. The result is an array of zero or more rows containing 1 or more columns which PHP can deal with very easily. I don't see why I should choose an option which requires a lot of effort when a simpler option is available.

Customising the Query

The ability to create complex and powerful SQL queries was something that UNIFACE did not have, so I made sure that my new framework could handle anything that the DBMS could accept. Anyone who has ever studied the structure of a query will soon be able to tell you that it follows a standard pattern:

SELECT $select_str
  FROM $from_str 
       $where_str 
       $group_str 
       $having_str 
       $sort_str 
       $limit_str

Each of these components is defined as a separate property within my abstract table class (not a separate class). By keeping each component in a separate string I can amend the contents of any component very easily. It is only after these component parts are handed over to my Data Access Object that they are assembled into a complete query and sent to the database.

By default the constructed query will be as follows:

SELECT * FROM <tablename> WHERE $where_str;

If the data dictionary identifies any columns which are to be retrieved from a parent/foreign table the framework will automatically insert a JOIN as follows:

SELECT <tablename>.*, <parent>.field1, <parent>.field2
FROM <tablename>
LEFT JOIN <parent> ON (<parent>.pkey = <tablename>.fkey)
WHERE $where_str

If you wish to customise the query even further then the sky's the limit. All you have to do is modify each of the components strings as required:

$this->sql_select = 'mnu_task.task_id, mnu_task.task_desc, mnu_menu.button_text, global_access, '
                  . 'CASE WHEN mnu_role_task.role_id IS NOT NULL THEN 1 ELSE 0 END AS access_allowed';
$this->sql_from   = 'mnu_menu '
                  . 'LEFT JOIN mnu_task ON (mnu_task.task_id=mnu_menu.task_id_jnr) '
                  . "LEFT JOIN mnu_role_task ON (mnu_role_task.task_id=mnu_menu.task_id_jnr AND mnu_role_task.role_id='$role_id') "
                  . "LEFT JOIN mnu_role ON (mnu_role.role_id='$role_id')";
$this->sql_select .= ", CASE WHEN (SELECT count(ip_address) 
                                     FROM mnu_task_ip_address 
                                    WHERE mnu_task_ip_address.task_id=mnu_menu.task_id_jnr) = 0 THEN true"
                          ." WHEN (SELECT count(ip_address) 
                                     FROM mnu_task_ip_address 
                                    WHERE mnu_task_ip_address.task_id=mnu_menu.task_id_jnr AND ip_address='$ip_address') = 1 THEN true"
                          ." ELSE false"
                     ." END AS ip_address_valid";
$this->sql_having = "ip_address_valid=true";

$where = "menu_id='$task_id'"
       ." AND (mnu_role_task.role_id IS NOT NULL OR mnu_role.global_access='Y')"
       ." AND (mnu_task.is_disabled='N')";

I have seen quite a few examples on the internet where each of these substrings can only be manipulated using specialised objects and methods, but in PHP you do not need objects to manipulate strings, so why bother? Why go to all that effort of inventing a complicated method when such a simple one is available?

Inserting and Updating Data

I have seen sample code which injects each piece of data separately before it is actioned, as in the following:

$dbobject->set('OrderDate', $_POST['order_date'];
$dbobject->set('ShippingDate'. $_POST['shipping_date'];
$dbobject->save();

I have even seen examples where each column had its own argument with the method, as in:

$result = $dbobject->insert($pk, $name, $start_date, $status, $category_id);

I do not like either of these methods as they require the calling code to have knowledge of the property names within the object, which I find wrong on so many levels:

It ties the calling code to a single object, which eliminates reusability.
The calling code requires knowledge of the property (column) names within the object, which means that changes to the properties within the object will require corresponding changes to the calling code.

Again I am so used to passing data around in complete sets instead of one piece at a time, so I prefer to use the following:

$result = $dbobject->updateRecord($_POST);
$errors = $dbobject->getErrors();

$result = $dbobject->insertRecord($_POST);
$errors = $dbobject->getErrors();

$result = $dbobject->deleteRecord($_POST);
$errors = $dbobject->getErrors();

With this approach there is no need to identify the contents of the $_POST array, which brings the following benefits:

The calling code is not tied to any particular object, so it can be reused on any object which has those method signatures.
The calling code does not require knowledge of the property names within the object, so those properties can be changed at will without any need to change the calling code.

Note that the purpose of these methods is not simply to generate an SQL query and send it to the database. As shown in Data Validation and Business Rules there are several steps to be performed both before and after the execution of the query, and each of these steps has its own method which is either a standard part of the framework or can be customised to suit the application.

If there are any validation errors then the $errors array will be populated with the relevant error message(s), and the database will not be altered. The View component will automatically redisplay the current screen with each error message under the appropriate input field.

Updating Multiple Tables

If reading from multiple tables can be achieved with a SELECT query containing JOINs, how is it possible to update several tables in a single transaction? If you consider the following operation that a controller can perform on a model:

$result = $dbobject->insertRecord($_POST);
$errors = $dbobject->getErrors();

It is possible for the $_POST array to contain data that needs to be spread over more than one table in the database. Each table class will only validate those columns that it knows about, and will ignore the rest. Only columns which exist in that table will be included in any SQL query. This means that any filtering does not have to be performed on the data before it is passed to the model as it is performed by the model itself using structure data held inside the model. As each table class can only function on a single table it will therefore be necessary to communicate with other table classes, but all this requires is the relevant code to be inserted in the _cm_post_insertRecord() method of the main object, as in the following example:

function _cm_post_insertRecord ($fieldarray)
// perform custom processing after database record has been inserted.
{
    $dbobject2 = singleton::getInstance('table2');
    $result2 = $dbobject2->insertRecord($fieldarray);
    if ($dbobject2->errors) {
        $this->errors = array_merge($this->errors, $dbobject2->errors);
        return $fieldarray;
    } // if
		
    $dbobject3 = singleton::getInstance('table3');
    $result3 = $dbobject3->insertRecord($fieldarray);
    if ($dbobject3->errors) {
        $this->errors = array_merge($this->errors, $dbobject3->errors);
        return $fieldarray;
    } // if

    return $fieldarray;
		
} // _cm_post_insertRecord

Some OO purists would object to this code as it requires knowledge of the existence of other tables within a table object, but when you consider that the knowledge has to exist somewhere in the software (otherwise it simply would not function) I believe that it is better to bring it into the daylight and make it get-attable instead of burying it deep in the bowels where it is a deep, dark secret.

Data Validation and Business Rules

You may wonder how data validation is handled, or business rules can be applied, but again my approach is very simple based on what has worked in previous languages. When the insertRecord() method is called on an object it actually executes a series of internal methods in a predetermined sequence:

insertRecord($array)

There is some similar for the updateRecord() method:

updateRecord($array)

The great advantage of each Model class being the representation of a single database table is that the class "knows" the structure of that table, and because it knows what columns the table contains, and the data type and size of each column, all primary validation can be performed automatically by the _validateInsert() and _validateUpdate() methods without the developer having to write a single line of code.

Any additional business rules or secondary validation must be defined in one of the _cm_commonValidation(), _cm_validateInsert() or _cm_validateUpdate() methods.

The _dml_insertRecord() and _dml_updateRecord() methods transfer control to a separate Data Access Object (DAO) which is responsible for constructing and executing the relevant query using Data Manipulation Language (DML). I have a separate class for each of the major DBMS engines (MySQL, PostgreSQL, Oracle, SQL Server) so I can switch my application from one DBMS to another without having to change a line of application code. Note that as each table's structure is known any item in the array which does not belong, such as the SUBMIT button for example, is automatically excluded otherwise it would cause an invalid query to be executed, thus causing a fatal error.

The various _cm_???() methods are defined in the abstract table class as empty methods, but because they are empty there is no code to be executed at runtime. However, any of these empty methods can be copied into a concrete (sub)class and filled with code, which means that they will override the empty method at runtime.

Screen Structures

All HTML screens are generated from a single standard component. Here is a snippet from one of the reusable page controllers which shows how the database object and the screen structure file are passed to the View object for processing:

$dbobject = new $table_id;
$result = $dbobject->insertRecord($_POST);
$errors = $dbobject->getErrors();
$messages = $dbobject->getMessages();

// build XML document and perform XSL transformation
$view = new radicore_view($screen_structure);
$html = $view->buildXML($dbobject, $errors, $messages);
echo $html;
exit;

Note that the identity of the table class and screen structure file are passed down from the component script. There is one of these for each task (user transaction) in the application. Both the component script and screen structure file are generated by the Transaction Generator.

The contents of the screen structure file are obtained at the start of the script so that they may be amended before they are passed to the View component and processed as follows:

Application data is pulled out of the $fieldarray property of the database object which, as stated previously, can hold any number of rows and any number of columns. This data may need to be formatted before it is added to the XML document.
Framework data (menu buttons, navigation buttons, action buttons, titles, messages, etc) are obtained from framework objects and added to the XML document.
The contents of the screen structure file are also added to the XML document. This information is then used by the XSL stylesheet to place each item of data in the application area of the HTML output.

The XML document is then transformed into HTML output using the XSL stylesheet. The advantage of using XSL transformations instead of other templating mechanisms is that the load on the server can be reduced by having the transformations performed by the client (provided that the client browser supports XSL) instead of the server.

A Table Oriented Framework

Having gone down the path of Table Oriented Programming, what can you build into a framework to make the development process that much easier? As I have found time and time again in my long career, if you can standardise and simplify the way that something is done it is not a big step to then find a way to automate that function. For example, in my own Table Oriented Framework (TOF) (an open source framework called RADICORE) I have the following:

Class Generator

As I have one class for each table in the database I could have a mechanism which reads the INFORMATION_SCHEMA and produces a class file, but this has two problems that I can see:

The code to extract data from the INFORMATION_SCHEMA is quite intense, and I don't want to have the overhead of executing that code each time an object is created.
The details held within the INFORMATION_SCHEMA are quite limited, and I would like the ability to expand this with extra data which could be of use to my framework.

As a consequence the creation of class files for the Model/Business layer is now a three-part process:

Import table details from the INFORMATION_SCHEMA into my custom-built Data Dictionary.
Edit the details within the Data Dictionary to provide additional information.
Export table details from the Data Dictionary to produce the class files and table structure files.

Reading the table's structure from a single pre-built file thus eliminates the two problems which I identified previously.

Each of these generated classes inherits a great deal of sharable code from an abstract table class, which makes each generated class initially quite small. This inherited code performs all standard processing, but custom logic can be incorporated into any transaction simply by adding the relevant code into the relevant empty customisable method which has been pre-defined in the abstract table class.

Transaction Generator

I am used to designing and building applications for end users which are centered around user transactions (tasks). These typically ended up with specifications which identified a screen layout plus an idea of how the user expected to interact with it. When you have built a lot of transactions - and I have built thousands - you begin to see lots of similarities. Lots of transactions had the same structure and behavior, with the only difference being the content. How many times have you been given a specification which said simply "Make a copy of that transaction which works on TableA and make it work with TableB"? With so many similarities it was a simple step to identify patterns of structure and behaviour, then to put these patterns into code so that they could be reused over and over again. I devised a series of Transaction Patterns which are comprised of the following:

A group of reusable XSL stylesheets which provide the screen structure.
A group of reusable page controllers which provide the behaviour.

I think built a transaction generator into Data Dictionary which obtains the following variables:

The name of a Transaction Pattern.
The name of a table class (or classes).

This information is then used to create a component script which, very briefly, states: "Use this Model, this View, and this Controller" such as in the following example:

<?php
$table_id = "person";                      // identify the Model
$screen   = 'person.detail.screen.inc';    // identify the View
require 'std.enquire1.inc';                // activate the Controller
?>

I can therefore create a working transaction for a new database table in minutes without having to write a line of code - no PHP, no HTML, no SQL. If you cannot achieve this with your framework I suggest that you go back to the drawing board and start again.

Note here that although each transaction/unit-of-work has its own unique identity (otherwise how could you select it from the menu system) I do not create a separate class or class method for each transaction. This would result in too many unique methods in my Model classes and a separate controller to call each of those methods. This would greatly reduce any opportunity for polymorphism and greatly increase the amount of tightly-coupled code as there would be too many Controllers which would be tied to a single Model and too many Models which would be tied to a single controller. This would negate any benefits from from using the OO paradigm. In my main application I have over 2,000 user transactions, so to have 2,000 different methods and 2,000 different controllers would be a complete nightmare. Instead I can make use of a small number (about 40) of reusable controllers which all use a different combination of the same generic methods. The method signatures may be generic, but it's the code behind the method which actually does the work.

Framework Structure

As has been shown in Figure 3 there are four main components to this framework, with the View, Controller and DAO being part of the framework while the Model is generated for each application.

Model - each class is generated using information obtained from your application's database schema. The generated class file can then be modified to include the processing for business rules or task-specific behaviour.
View - this is built into the framework and uses a series of predefined XSL stylesheets to transform the application data into HTML output.
Controller - each predefined page controller is paired with an XSL stylesheet to form a Transaction Pattern which can be joined with a Model to generate a user transaction.
Data Access Object - there is a separate class for each DBMS engine which can operate on any number of tables.

All knowledge of the application - data validation, business rules and task-specific behaviour - is encapsulated within the Business (Model) layer, which means that the remaining components, the View, Controller and DAO, are application agnostic. A single Model can be shared by multiple user transactions, while the View, Controller and DAO can operate with any Model.

Criticisms of my approach

Ever since I started publishing articles on my approach to building database applications using the OO capabilities of PHP I have been heavily criticised by people who call themselves OO "experts". A popular statement is the following:

This approach is only useful for simple CRUD applications

This comment can only come from someone who has little experience in writing database applications which contain hundreds, if not thousands, of user transactions. Using my RADICORE framework I have personally built over 2,000 user transactions, and this is in addition to the many thousands that I wrote in previous languages while working for a variety of software houses. This volume of experience leads me to two simple facts:

1. Every user transaction (task) in a database application touches the database in some way.
2. Every complex user transaction starts life as a simple transaction into which varying levels of complexity can be added.

The simplest form of CRUD transactions perform nothing more than the basic operations - Create, Read, Update and Delete - on a single table without the addition of any complex processing. In the RADICORE framework each of these operations is performed in a different transaction which are grouped together into a forms family as shown in Figure 7:

Figure 7 - A typical Family of Forms

This family consists of 6 simple forms - List1, Add1, Enquire1, Update1, Delete1 and Search1 - which can be built to operate on any table in any database. The the parent form can be activated from a menu button while each of the child forms can be activated from a navigation button in the parent form. The advantage of implementing each operation in a separate component is that it becomes easy to either allow or disallow a user to gain access to any individual component using the Role Based Access Control features of the framework. Contrast this with the approach used by my less learned brethren where instead of a separate Controller for each operation they use a single controller for all operations. This controller then has to contain the code to verify that the current user is allowed to access each of those operations.

That simple set of user transactions can be implemented using just six of the Transaction Patterns which are supported by my framework. More complex transactions can be implemented by using any of the forty additional patterns. The main advantage of my approach is that, using nothing more than the database schema, I can generate a working basic user transaction in a matter of minutes without ever writing a line of code - no PHP, no HTML, no SQL - which means I have much more time to spend on adding in the code for the business rules and other areas of complexity. While each table class will, in its original form, only support the basic operations, additional code to support more complex business rules can be inserted into the empty customisable methods. These are also known as hook methods which follow the Template Method Pattern.

Using this library of Transaction Patterns I have managed to build a large enterprise application which contains 2,800 user transactions of varying degrees of complexity, so anyone who says that my framework can only deal with simple transactions is talking out of the wrong end of their alimentary canal.

Among the other criticisms are the following:

You should see from these that my critics have a blinkered, short-sighted and narrow-minded attitude when it comes to implementing the principles of Object Oriented Programming. They seem to think that just because they have been taught one way that it is the only way, the one true way, and that no dissension or non-conformity should be allowed at any cost. They can do nothing except follow their rules, or their weird interpretations of other people's rules, in a purely dogmatic fashion without any regard for the consequences. They are like religious zealots who cannot tolerate their faith being questioned, and they are incapable of responding to either valid arguments or plain common sense.

If they have only been taught only one way then they have been badly taught, so shame on their teachers.

If, after being told that there is more than one way, they refuse to believe it, or even consider it, then they are incapable of learning, so shame on them.

I am a pragmatist, not a dogmatist, so I am only interested in the results, and not the strict adherence to a set of artificial rules created by people of questionable intelligence and dubious intellect.

Conclusion

Any engineer will tell you that when two different components have to work together it is much more efficient if they can do so in harmony, with as little friction as possible. The alternative is discord, where the components are constantly out of step, fighting each other, and generating a lot of noise. Table Oriented Programming is not an alternative to Object Oriented Programming, it is a technique where OOP is allowed to work with the RDBMS instead of against it, thereby removing unnecessary layers of complexity.

The fact that Fredrik Bertilsson's article was written with the Java language in mind while this one was written for PHP does not cause our conclusions to be different. OO is a paradigm that is supported in many programming languages, and many programs are written which need to access a relational database, so why not take the path of least resistance and make them work together? Why not have harmony instead of discord? To quote Fredrik Bertilsson:

This article has described a non-mainstream solution for database programming. The solution resembles database programming before the object-oriented era. But there is one big difference: it utilizes the full power of the object-oriented programming language. There does not need to be any impedance mismatch between the use of a relational database and an object-oriented programming language.

It would appear that this solution has already found its way into the PHP community. In his blog post Decoupling models from the database: Data Access Object pattern in PHP Jani Hartikainen makes this opening statement:

Nowadays it's a quite common approach to have models that essentially just represent database tables.

So this is not such an outlandish idea after all!

By choosing to work with the relational model instead of against it I have eliminated the need for an ORM, plus I have opened up possibilities that otherwise would not have existed:

By having one class per database table I can generate my table (Model) classes from the database schema, with additional meta-data being supplied through my Data Dictionary.
By generating and executing all the SQL queries in a separate Data Access Object I can switch from one DBMS to another without having to change any Model/Business layer classes.
By identifying transaction patterns which can be implemented using pre-written code in the form of reusable XSL stylesheets (Views) and reusable page controllers I was able to take this a step further and devise a mechanism to generate basic working transactions without having to write a single line of code.

I do not believe that neither the level of reusability which I have achieved, nor the speed at which I can generate user transactions, can be matched in any other framework, which leads me to believe that the refusal to see the benefits of Table Oriented Programming is based purely on narrow-minded, short-sighted dogmatism and has absolutely nothing to do with the results (superior results in my humble opinion) that can be achieved.

References

The power of table-oriented programming by Fredrik Bertilsson
Table Oriented Programming from the Programming Wisdom Center at geocities.com
Domain Logic and SQL by Martin Fowler
10 Common Mistakes Java Developers Make when Writing SQL by Lukas Eder at javacodegeeks.com
10 More Common Mistakes Java Developers Make when Writing SQL by Lukas Eder at javacodegeeks.com
Decoupling models from the database: Data Access Object pattern in PHP by Jani Hartikainen
Core J2EE Patterns - Data Access Object at oracle.com
Database-centric architecture at wikipedia.org

Object Relational Mappers are Evil at tonymarston.net
What is the 3-Tier Architecture? at tonymarston.net
The Model-View-Controller (MVC) Design Pattern for PHP at tonymarston.net
A Table Oriented Framework for PHP at tonymarston.net
A Data Dictionary for PHP Applications at tonymarston.net
Transaction Patterns for Web Applications at tonymarston.net
Why I don't do Domain Driven Design at tonymarston.net
The Template Method Pattern as a Framework at tonymarston.net
Are you achieving the aims of OOP?

Amendment History

05 Sep 2013	Updated Criticisms of my approach to include my answer to "This approach is only useful for simple CRUD applications".
23 Aug 2013	Added Criticisms of my approach.

counter

Tony Marston's Blog About software development, PHP and OOP

Table Oriented Programming (TOP)

This approach is only useful for simple CRUD applications