This article expands on what I wrote several years ago in Each database table requires its own class.
I have been told many times by numerous people that having a separate class for each database table is not proper OO, that real OO programmers don't do it that way, but none of these people have ever provided a satisfactory answer to these simple questions: "Who wrote this rule? When was it published? What is its justification? Why is it better than the alternatives? What are the alternatives?"
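As far as I can tell there are only three possibilities when it comes to the relationship between tables and classes:
1. A single class which handles multiple tables.
2. A separate class for each table.
3. Multiple classes for a single table.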
In my humble opinion option #1 violates the Single Responsibility Principle, as a software object ends up being responsible for more than one business object. See Arguments against having a single class for multiple tables for details.
Option #3 violates the principle of encapsulation, which states that a class should contain ALL the data and ALL the operations concerning the entity which it represents. See Arguments against having multiple classes for a single table for details.
Option #2, my chosen approach, not only does NOT violate any principles, it has actually enabled me to build software which provides a huge amount of reusability simply by utilising inheritance and polymorphism in a more practical way.
While there are many different definitions of OOP the only one that I use, as discussed in What is Object Oriented Programming (OOP)?, is as follows:
Object Oriented Programming is programming which is oriented around objects, thus taking advantage of Encapsulation, Inheritance and Polymorphism to increase code reuse and decrease code maintenance.
Note here that the objective is to create as much reusable code as possible, so your ability to implement the principles of OOP effectively should be judged by the amount of reusable code you manage to write and the amount of duplicate code which you manage to avoid. So how do you identify code that could be reused, and how do you write it so that it can be reused? As explained in Designing Reusable Classes you start by creating concrete classes for each of the entities that will appear in your business/domain layer, then you employ the process of abstraction to identify the similarities and differences between those classes. You can then move all the similar protocols (methods) to an abstract class while retaining the differences in each concrete class. The contents of the abstract class can then be shared by each concrete class using inheritance. Having the same set of methods in multiple classes then provides the capability of polymorphism which is a prerequisite of Dependency Injection which, when used properly, provides the mechanism of utilising objects which share the same methods.
So far the only reason provided by the critics of my approach goes along the lines of:
You are supposed to model the real world, and tables in a database are not real objects, they are just representations of real objects.
I find this statement to be seriously flawed. Just because you can model the real world does not mean that you should. You should only model those parts of the outside world with which your software will communicate directly. When you are developing software which communicates with "something" outside of itself then you have to write code which communicates directly with that "something". The fact that the "something" is a collection of database tables each of which represents "something else" does not mean that you have to create models for that "something else". There is no communication between objects in a database and the corresponding objects in the real world which they represent, so why waste time on modelling the objects with which there is no communication?
You need to come to terms with the fact that if your application is dealing with database tables called Customer and Product then you are not dealing with real Customers and real Products in the real world but with different tables in a database. The software communicates with entities in a database and not entities in the real world. There is no communication between the software and the real world, so why on earth would a sensible programmer create models for entities with which there is no communication either directly or indirectly? You should only use a model of an entity if you are communicating directly with that entity.
While the differences between a Customer and a Product in the real world may be great, you should be able to recognise that the differences between the two tables are far less. In fact there are more similarities than differences simply because every table in a database, regardless of its contents, is subject to the same rules, and a competent programmer should be able to apply the concepts of Encapsulation, Inheritance and Polymorphism to abstract out the similarities and create code which has more reusability and therefore requires less maintenance. This is supposed to be the aim of OOP, so why are so many programmers following a path which does not achieve this aim?
I have written an ERP application which currently consists of 20 different domains using a total of over 400 database tables, and while each of those tables is totally different they also have a number of similarities:
According to the advice given in Designing Reusable Classes the best way to create reusable software is to first build software that works, then examine it looking for similarities, looking for patterns. You then separate everything which is similar from that which is different, isolate the differences in concrete classes and share what is similar from reusable sources which could either mean inheriting from an abstract class or calling a method in a service. The ultimate form of reuse would be to use a framework which was specifically written for that type of application. I write nothing but database applications for the enterprise, which is why I created the RADICORE framework to assist me in this endeavour.
If creating a separate class for each database table is so wrong, then can you explain why Martin Fowler, the author of Patterns of Enterprise Application Architecture, defined patterns called Table Module, Class Table Inheritance and Concrete Table Inheritance which specify "one class per table"? Can you explain why Craig Larman, the author of Applying UML and Patterns, mentions the Representing Objects as Tables pattern? If these people say it's OK, then who are you to argue? If it's so wrong then why does the author of Decoupling models from the database: Data Access Object pattern in PHP state the following in his opening paragraph:
Nowadays it's a quite common approach to have models that essentially just represent database tables, and may support saving the model instance right to the database. While the ActiveRecord pattern and things like Doctrine are useful, you may sometimes want to decouple the actual storage mechanism from the rest of the code.
If it is a "common approach" then why is it wrong to follow it?
In this document I shall explain that not only are there absolutely no pitfalls to my approach, there are in fact nothing but advantages.
I started experimenting with PHP version 4, my first language with OO capabilities, way back in 2002. This was after programming for several decades with other languages, primarily COBOL and then UNIFACE, so I was already used to a range of good practices such as KISS, DRY, structured programming, coupling and cohesion. My research told me that OOP was built on these same foundations, but was supposed to be better because it included new and revolutionary concepts called Encapsulation, Inheritance and Polymorphism. I read the PHP manual which described the mechanics of these concepts and how to implement them in my code, then I began to build a new version of my existing development framework in this new language which took advantage of these new concepts. If OOP was supposed to be better, then my aim was to produce a framework which enabled me to build applications in a more cost-effective manner than I had achieved with the similar frameworks which I had written in my previous languages. As I write nothing but database applications, what are now known as enterprise applications, I am used to building a standard family of forms to maintain the contents of a database table, so I used as a benchmark the amount of time it would take to build a set of these forms for a new database table. These are the levels of productivity which I achieved:
Those levels of productivity are a combination of the capabilities of the language and the features provided by my development framework. Each language provided new features which made development easier and quicker, and my framework made use of these features to provide a common set of utilities for each of the applications that were developed. The fact that my PHP framework was far more productive than either of my previous versions led me to believe that the benefits of OO were "as advertised" and not just a bunch of hype and that my implementation of OO was on the right track. I began to publish articles on my personal website explaining my approach so that others could benefit from my experience, so imagine my surprise when people started telling me that I was doing it wrong and that "real OO programmers don't do it that way". Most of the arguments were along the lines of "Your way is different from mine, and as my way is the right way your way must be wrong". Instead of concentrating on writing cost-effective software these people appeared to be concentrating on following an arbitrary set of rules which I did not know existed. To them it appeared to be more important to follow a set of rules in a dogmatic fashion and produce software which is 100% "pure" (whatever that is) whereas my approach is totally pragmatic as I put results ahead of any arbitrary rules.
The PHP 4 manual simply described the mechanics of how to write code which implemented Encapsulation and Inheritance. It did not explain Polymorphism, so I had to work that out for myself. The manual explained what code to write, so I followed the explanation and wrote code which worked. Imagine my surprise when several years later I was told that writing code which worked was not good enough, that it needed to follow the rules described in such documents as Object Oriented Design (OOD), Domain Driven Design (DDD), the SOLID principles and Design Patterns. I read those documents and I laughed. It was clear to me that they were written by a bunch of comedians who did not have my experience in developing database applications, and it was clear to me that if I followed their "advice" I would be swapping my very productive development environment for one which had more complications, less reusable code, and which would be a maintenance nightmare. I regard the aforementioned documents as "advice" and not "rules" for the simple reason that if I broke a real rule then my code would not work. As my code obviously works, and has done for over a decade, and has been easily migrated through PHP versions 5, 7 and 8, there is absolutely no basis for calling it "wrong". If something works it cannot be "wrong" just as something which does not work cannot be "right". If I ignore a piece of advice and do not suffer any measurable consequences, then what value can be placed on that advice? If I disobey a rule and nothing breaks, then how can that rule be justified? If the only problem which arises when I ignore one of these artificial rules is that I upset the delicate sensibilities of its followers then all I can say is Aw Diddums!
When someone tells me "you should be doing it this way" my immediate response is to ask "Why?" They need to prove that their way is better than mine by identifying all those areas where my approach produces problems and theirs does not. No such proof has ever been provided. All they can do is claim that their method is better, but that is always subjective, it is nothing but an opinion. They cannot offer any objective proof, something which can be measured scientifically. When they say "your method is wrong" I counter with "how can it be wrong if it works?" When they say "your code is difficult to read and maintain" I counter with "how can it be if I have successfully been maintaining and enhancing it for over a decade?"
As I had spent the previous 20 years in writing database applications using several different languages, and having written libraries of reusable code, including development frameworks, in two of those languages, I had a lot of experience under my belt which I could use as the foundation to switching to a third language. As far as I was concerned the rules of a database application are the same regardless of the language in which that application is written. The way you design your database tables is the same, the way that you access those tables via SQL queries is the same, it is just how you generate those queries which is different. Each different language provides the developer with different syntax to achieve similar results, so it is up to the developer to use that syntax in the most efficient manner.
My previous experience taught me several principles which I regarded as sacrosanct:
Having built thousands of user transactions (use cases) dealing with hundreds of database tables I was also aware of the following areas of similarity which were prime candidates for being developed as reusable modules:
It is only by being able to identify areas of similarity that you will then be able to build reusable modules to deal with those similarities. The more reusable code you have at your disposal the less code you will have to write and maintain, and the more productive you will become.
A classic example of where my critics say that I am breaking a golden rule is in my approach of having a separate class for each database table. I have never seen a rule written anywhere which says I cannot do this, and I have certainly never seen a list of problems which arise from disobeying this so-called rule. Not only have I never encountered any problems with my approach, I have actually avoided the problems which are inherent in their approach and have even provided facilities and cost savings which their approach cannot.
As far as I am concerned my practice of having a separate class for each table is in line with the following definition:
A class is a blueprint, or prototype, that defines the variables and the methods common to all objects (entities) of a certain kind.
If you look at the CREATE TABLE script for a table, is this not a blueprint? Is not each row within the table a working instance of that blueprint? Is it not reasonable therefore to put the table's blueprint into a class so that you can create instances of that class to manipulate the instances (rows) within that table?
It is important to take into consideration one very important fact when you are developing a piece of software which communicates with objects outside of itself - you need to model those objects and not some mythical objects which exist in a different plane. For example, an enterprise application may deal with objects such as "customer" and "product" which exist in the real world, but the application will NEVER communicate with real customers or real products, instead it will only communicate with information about those objects which exists in a database. A customer, who is a person, may have operations such as "stand", "sit", "walk", "run", "eat", "sleep" and "defecate", but these are irrelevant in a database application. Different products may have different sets of operations, but again these are irrelevant in a database application. Every piece of information on any object you can think of is held in a database object known as a "table" with each piece of data held in its own "column", and regardless of what real-world object a table represents, and regardless of the amount and type of data which is held for that object, every table in a database is subject to the same set of operations, and these operations are Create, Read, Update and Delete (CRUD).
In OOD you are supposed to use the idea of is-a to identify groups of entities which share a common set of characteristics, which includes properties as well as operations. This group identifies a "type" and each member of that group is a different implementation of that "type". You should then be able to create an abstract class to cover what is sharable in the group and then a separate subclass for each entity in that group to specify what is unique and therefore not sharable about that specific entity. For example, if you have objects A, B, C, D and E which are all of type "foo" and which share the same set of protocols (methods) then you can create an abstract class called "foo" to contain those sharable methods. This abstract class can then be extended into separate concrete subclasses for each of those objects. The abstract class called "foo" contains what is sharable while each subclass, using the power of inheritance, combines what is sharable with what is unique within that subclass. If I write software which is communicating with objects in a database, and every one of those objects is a table which shares common characteristics with all other database tables, then surely it is obvious that I should create an abstract table class which contains the properties and methods which can be applied to any database table and then inherit this class into every concrete table class that may exist. In this way everything which can be applied to any unspecified database table can be defined once in the abstract class, and each concrete class need only define those things which are specific to the particular table which it represents. This use of an abstract class then opens the door to implementing the Template Method Pattern which is the best way of mixing inherited methods with customisable "hook" methods.
Too many programmers apply this idea of is-a at entirely the wrong level. They encounter a table called "customer" and think to themselves "each customer is a person, so I must create a Person class and then extend it to create a Customer subclass". I'm afraid that this viewpoint is far too narrow for the following reasons:
If I have hundreds of tables in my database then the idea of creating a separate abstract class for each table, which can then be inherited by only a small number of subclasses, produces a minimal amount of reusable code. Instead I have just one abstract class to cover the concept of a non-specific table and a separate subclass for each physical table that I have. This means that if I have hundreds of database tables then I have hundreds of subclasses, so I can share the contents of that single abstract class hundreds of times. The result is therefore a significant amount of reusable code instead of a minimal amount.
Enterprise applications are comprised of a number of user transactions (also known as units of work, use cases or tasks) where each transaction has a User Interface (UI) at the front end, one or more tables in a relational database at the back end, and some software in the middle which transports data between the two ends while applying any business rules. In a large application there can be thousands of transactions and hundreds of tables. With my previous languages I gradually progressed from using the 1-Tier Architecture through 2-Tier and eventually 3-Tier with its separate Presentation, Business and Data Access layers. With the UNIFACE language it was standard practice to create a separate component in the business layer for each database table, so I saw no reason why I should not continue with this practice and create a separate class for each database table so that each could then become an object in the business layer.
I have been told on more than one occasion that the "correct" and "approved" way of applying the principles of OOP is to start with Object Oriented Design (OOD) with its is-a and has-a relationships, compositions, aggregations and associations, then design all objects using these principles. The database is left till last and accessed through an Object-Relational Mapper (ORM) which is required in order to deal with the differences between the object structure and the database structure. As I did not know that these rules existed, let alone were obligatory, I did what had proved successful in the past and started with my database, which I designed using the rules of Data Normalisation, then structured my software around the database as was taught in Jackson Structured Programming. This approach produced very good results in my PHP code, so I could see nothing wrong with it.
Other programmers may attempt to justify their criticisms of my approach by saying that I was self-taught, and therefore taught badly, instead of being properly trained in the mystic arts of OOP by someone who knew the "proper" way, the "correct" way, the "pure" way. Instead, after having read up on the mechanics of encapsulation, inheritance and polymorphism in the PHP manual, I used nothing but a mixture of my experience and plain common sense to add those concepts into my software in order to produce as much reusable code as I could. It should be noted that my use of an abstract class and the Template Method Pattern are both recommended in the following books:
Anybody who thinks that my usage of these two ideas is wrong ought to take it up with those authors.
My logic for creating an abstract table class which could be inherited by every concrete table class went as follows:
An abstract class cannot be instantiated into an object as it is devoid of essential details - it knows what properties/characteristics a database table may have, but it has no values for these properties/characteristics. In order to instantiate a table's object it is first necessary to inherit the abstract superclass into a concrete subclass, and it is the constructor of this subclass which provides all the missing details when it loads in the contents of the <table>.dict.inc file. The abstract table class deals with an unknown database table while each subclass deals with a specific database table with a specific set of characteristics.
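The relationship between the abstract superclass and its concrete subclasses can be sketched as follows. This is a minimal illustration only: the class names `AbstractTable`, `Customer` and `Product`, the `$tablename` and `$primaryKey` properties and the two accessor methods are assumptions made for this sketch and are not taken from the RADICORE source, and the table details are written inline rather than loaded from a generated `<table>.dict.inc` file.

```php
<?php
// Hypothetical sketch: all names below are illustrative assumptions.

abstract class AbstractTable
{
    // The abstract class knows WHICH characteristics a table may have,
    // but holds no values for them.
    protected string $tablename = '';
    protected array $fieldspec = [];
    protected array $primaryKey = [];

    // Behaviour which applies to ANY table is written once, here.
    public function getTableName(): string
    {
        return $this->tablename;
    }

    public function getColumnNames(): array
    {
        return array_keys($this->fieldspec);
    }
}

class Customer extends AbstractTable
{
    public function __construct()
    {
        // The article loads these details from a generated structure file;
        // they are inlined here to keep the sketch self-contained.
        $this->tablename  = 'customer';
        $this->fieldspec  = [
            'customer_id' => ['type' => 'integer', 'required' => true],
            'name'        => ['type' => 'string', 'size' => 40, 'required' => true],
        ];
        $this->primaryKey = ['customer_id'];
    }
}

class Product extends AbstractTable
{
    public function __construct()
    {
        $this->tablename  = 'product';
        $this->fieldspec  = [
            'product_id'  => ['type' => 'string', 'size' => 20, 'required' => true],
            'description' => ['type' => 'string', 'size' => 80],
        ];
        $this->primaryKey = ['product_id'];
    }
}

// Each subclass contains nothing but a constructor, yet both objects
// respond to every method defined in the abstract class.
$customer = new Customer();
echo $customer->getTableName(), ': ', implode(', ', $customer->getColumnNames()), PHP_EOL;
```

Attempting `new AbstractTable()` fails with an error, which is the language enforcing the point made above: only a subclass that supplies the missing details can be instantiated.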
When I first developed my code I did not start with an abstract class, instead I created a complete concrete class, without any inheritance, for a single database table, along with the six Controllers shown in Figure 1. I then copied this code into a second class file for a second database table. As you can imagine this produced a lot of duplicated code, so when I identified a block of duplicated code I looked for a way to define this code in a single place so that I could call this code in that single place instead of having to duplicate it. This is precisely why inheritance was invented, so I created a superclass to contain this code and added an extends statement to each table class. After moving all the duplicated code out of each subclass and into the superclass I found that the subclass contained nothing but the constructor. When I found the need to augment this inherited code with some specific code for a particular class I added some hook methods to the abstract class which could then be overridden in each subclass.
The first set of table classes I created by hand, but as I found myself simply transposing information from the database schema to a PHP script in order to define the properties of each table I decided to automate this process. I designed and built a separate piece of software called a Data Dictionary which allowed me to import details from the database schema into an application database, then export that information to produce two PHP scripts - a table class file and a table structure file. The logic behind having two separate files is that the import/export process can be rerun whenever the structure of a table changes, but only the table structure file will be overwritten. The table class file is never overwritten as it may have been updated by a developer to include some of the variant/customisable methods which are part of the Template Method Pattern.
I performed a similar refactoring process on my Controller scripts. I noticed that the script for table #1 was exactly the same as the script for table #2 except for the identity of the table, so I changed the script to remove the hard-coded table identity and to accept it from a variable which is defined in a separate component script.
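A rough sketch of a Controller which receives the table identity from outside is shown below. It is an assumption-laden illustration: the reusable controller is written as a function called `listController()` purely to keep the example in one file, and the class names, the `getData()` method and the `$table_id` variable name are hypothetical.

```php
<?php
// Hypothetical sketch: names are illustrative assumptions.

abstract class AbstractTable
{
    protected string $tablename = '';

    // Every table class shares this method signature, which is what
    // allows one Controller to work with any of them.
    public function getData(string $where = ''): array
    {
        // A real implementation would build and run a SELECT query.
        return [['table' => $this->tablename, 'where' => $where]];
    }
}

class Customer extends AbstractTable
{
    public function __construct() { $this->tablename = 'customer'; }
}

class Product extends AbstractTable
{
    public function __construct() { $this->tablename = 'product'; }
}

// Reusable Controller: it contains no hard-coded table name.
function listController(string $tableClass, string $where = ''): array
{
    if (!is_subclass_of($tableClass, AbstractTable::class)) {
        throw new InvalidArgumentException("Unknown table class: $tableClass");
    }
    $model = new $tableClass();      // identity supplied at run time
    return $model->getData($where);  // same call works on every table class
}

// "Component script": the only part which differs from task to task.
$table_id = 'Product';
$rows = listController($table_id, "product_id='ABC'");
```

Changing `$table_id` to `'Customer'` reuses the same Controller logic against a different table without touching the Controller itself.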
OO programmers often say that OO Design is incompatible with Database Design, which is precisely why I don't waste my time with it. Once I have designed a properly normalised database I don't have to design any classes to communicate with that database as I can automatically generate a class file for each table. Some people are mystified when they see one of these newly generated class files as there is nothing there but the constructor. "Where is the code?" they ask in a bewildered voice. When they see the word extends in the class definition the truth begins to dawn on them. By default there are no methods other than the constructor in a class as all the standard boilerplate code is inherited from methods which are defined in the abstract class. Each method call made by a Controller on a Model refers to a method which has been pre-defined in the abstract table class, and each method is an instance of the Template Method Pattern.
Because I don't use OOD I never end up with that problem called Object-relational Impedance Mismatch which then requires that abomination of a solution called an Object-Relational Mapper (ORM).
Another point to notice is that I never knew of the rule which said that each column in a database table should have its own separate property in the class, along with a pair of getter/setter methods to read and write their values. My experience of writing database applications for the previous 20 years told me that databases deal in datasets where each set can contain any number of columns and any number of rows, so I did the obvious thing and built into the abstract table class a single property called $fieldarray which could hold whatever dataset the object needed to handle at that time. This then enabled me to insert the entire contents of the $_POST array into an object using a single argument on a method call instead of splitting that array into its component parts and injecting one named component at a time. It also allowed me to extract all the data from an object in a single array variable which then made it easy to write a single reusable View service to transform that data into an HTML document. The use of an array with variable contents also allows different subsets or supersets of data to be used on insert and update queries, and also allows SELECT queries to contain additional columns from other tables using JOINs.
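The difference between passing a whole dataset and setting one column at a time can be sketched as follows. Only the `$fieldarray` property name comes from the text above; the class names and the `insertRecord()` and `getFieldArray()` methods are assumptions for illustration, and a plain array stands in for `$_POST`.

```php
<?php
// Hypothetical sketch: method and class names are illustrative assumptions.

abstract class AbstractTable
{
    // One property holds whatever dataset the object is handling:
    // any number of columns, from this table or from JOINed tables.
    protected array $fieldarray = [];

    public function insertRecord(array $fieldarray): array
    {
        $this->fieldarray = $fieldarray;
        // ... validation and the INSERT query would go here ...
        return $this->fieldarray;
    }

    public function getFieldArray(): array
    {
        return $this->fieldarray;
    }
}

class Customer extends AbstractTable
{
}

// Stand-in for the $_POST array sent by an HTML form.
$post = ['customer_id' => '42', 'name' => 'Acme Ltd', 'city' => 'London'];

// The entire array goes in as a single argument ...
$customer = new Customer();
$customer->insertRecord($post);

// ... and comes out as a single array, ready to hand to a View.
// With one property per column this would instead need a call such as
// setCustomerId(), setName(), setCity() for every column on the way in,
// and a matching getter for every column on the way out.
$output = $customer->getFieldArray();
```

Because neither the Controller nor the View names any individual column, adding a column to the table does not require a change to either of them in this sketch.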
Part of the data which is loaded in by the constructor is the $fieldspec array which contains the specifications for every column within that table. This then makes it possible to write a standard routine which uses those two arrays, one of field values and another of field specifications, to validate that the data which is supplied for a field actually matches its specification before it is allowed to pass through on its way to the database. Because this validation is performed automatically by the framework it is another section of boilerplate code which I do not have to add to each table class.
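A standard validation routine driven by those two arrays might look something like the sketch below. The layout of each `$fieldspec` entry (the `type`, `size` and `required` keys), the function name `validateFields()` and the error messages are all assumptions made for this example; the text above does not describe the actual format.

```php
<?php
// Hypothetical sketch: the spec keys and messages are illustrative assumptions.

/**
 * Compare each supplied value with its specification.
 * Returns an array of error messages keyed by field name (empty if valid).
 */
function validateFields(array $fieldarray, array $fieldspec): array
{
    $errors = [];
    foreach ($fieldspec as $field => $spec) {
        $value = $fieldarray[$field] ?? null;

        // Missing or empty values are only an error for required fields.
        if ($value === null || $value === '') {
            if (!empty($spec['required'])) {
                $errors[$field] = 'This field is required';
            }
            continue;
        }

        // Type check (only integers are checked in this sketch).
        if (($spec['type'] ?? 'string') === 'integer'
            && filter_var($value, FILTER_VALIDATE_INT) === false) {
            $errors[$field] = 'Value must be an integer';
            continue;
        }

        // Size check.
        if (isset($spec['size']) && strlen((string) $value) > $spec['size']) {
            $errors[$field] = 'Value is too long';
        }
    }
    return $errors;
}

$fieldspec = [
    'customer_id' => ['type' => 'integer', 'required' => true],
    'name'        => ['type' => 'string', 'size' => 10, 'required' => true],
];

// Both fields fail: one is not an integer, the other is empty.
$errors = validateFields(['customer_id' => 'abc', 'name' => ''], $fieldspec);
```

The routine contains no table-specific logic, so the same function can serve every table class as long as each one supplies its own specifications.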
When I started programming we used the name transaction to identify a unit of work which performed a task which was useful to the user, something which allowed him/her to carry out their business duties. When databases later introduced the concept of a database transaction this was renamed to business transaction or sometimes user transaction. Some design methodologies use the name use case or event, but in the RADICORE framework I use the short name task. There is a table called TASK, with a primary key of task_id, which has a separate entry for each task which is available in the application. There is another table called MENU which is used to split this huge list of tasks into options on menu buttons and a similar table called NAV_BUTTON which shows which options are available on navigation buttons when that task is active.
After having created a database table it is usual to create one or more tasks to maintain the contents of that table. The most common series of tasks, what I refer to as a family of forms, is shown in figure 1:
Figure 1 - A typical Family of Forms
In my early COBOL days this family would be merged into a single large component (also known as a user transaction). There are still some programmers today who regard all six of the above transactions to be a single use case, so they build a single controller for that use case. The problem with this outdated approach is that each of those six parts has a different screen and different behaviour. Each part is called a "mode", so each time it is called you have to identify which mode - either LIST, SEARCH, INSERT, UPDATE, DELETE or ENQUIRE - is actually required so that the controller exhibits the correct behaviour for the current mode. This has several problems:
In order to avoid these problems I decided to break down the single large multi-mode component into a series of single-mode components, as discussed in Component Design - Large and Complex vs. Small and Simple. I therefore have a separate user transaction for each mode - LIST, SEARCH, INSERT, UPDATE, DELETE and ENQUIRE. Each of these references the same table class in the business layer. This arrangement has the following advantages:
When I originally took the decision to rebuild my previous COBOL and UNIFACE frameworks using PHP I had already noticed a pattern of similar screen structures when performing similar operations on different tables, so I made the decision to build my web pages using a system of templates instead of building each one individually by hand. I had already been exposed to the use of XML and XSL stylesheets in the UNIFACE language, and after confirming that these technologies could be employed quite easily in PHP I decided to build all my web pages in this way. This involved building a standard routine which extracted all the application data out of the Model(s) and copied it into an XML file, loading in an XSL stylesheet which was defined separately, then performing an XSL Transformation to convert the XML document into an HTML document. This was a decision that I never regretted as, with lots of refactoring over several years, it has enabled me to build large numbers of different web pages using just a small number of reusable XSL stylesheets. For example, in my ERP application I have over 4,000 web pages which are produced from just 12 XSL stylesheets. This is in contrast with those people who manually create a separate XSL stylesheet for each web page in order to identify which piece of data goes where and with what HTML control.
I found this method of building web pages to be far, far better than that which I had seen in all those code samples I had found on the internet which involved outputting fragments of HTML code during the execution of the script. This meant that if you suddenly decided to jump to another page instead of completing the current one you would invariably encounter the "headers already sent" error. Using the XML/XSL method you can wait until all the processing in the Model(s) is complete, and then call the standard View object to produce the HTML output. Using this method it does not matter in which order the data is added to the XML document, just that it is there. It can be pulled out of the XML document in any sequence which is specified within the XSL stylesheet.
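The general shape of the XML/XSL approach can be sketched with PHP's DOM and XSL extensions. This is an assumption-based illustration and not the framework's View code: the function names `buildXml()` and `renderHtml()`, the XML layout and the tiny stylesheet are invented for the example, and `XSLTProcessor` is only available when the XSL extension is installed.

```php
<?php
// Hypothetical sketch: function names and XML layout are illustrative assumptions.

// Copy rows of application data into an XML document.
function buildXml(array $rows, string $tableName): DOMDocument
{
    $doc  = new DOMDocument('1.0', 'UTF-8');
    $root = $doc->appendChild($doc->createElement('root'));
    foreach ($rows as $row) {
        $rowNode = $root->appendChild($doc->createElement($tableName));
        foreach ($row as $column => $value) {
            $cell = $rowNode->appendChild($doc->createElement($column));
            $cell->appendChild($doc->createTextNode((string) $value));
        }
    }
    return $doc;
}

// Transform the XML document into HTML using an XSL stylesheet.
function renderHtml(DOMDocument $xml, string $xslSource): string
{
    $xsl = new DOMDocument();
    $xsl->loadXML($xslSource);
    $processor = new XSLTProcessor();   // requires the XSL extension
    $processor->importStylesheet($xsl);
    return (string) $processor->transformToXml($xml);
}

// A generic stylesheet: it names no table and no column, so the same
// stylesheet can render rows from any table.
$xslSource = <<<'XSL'
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>
  <xsl:template match="/root">
    <table>
      <xsl:for-each select="*">
        <tr>
          <xsl:for-each select="*">
            <td><xsl:value-of select="."/></td>
          </xsl:for-each>
        </tr>
      </xsl:for-each>
    </table>
  </xsl:template>
</xsl:stylesheet>
XSL;

// No output is sent until all processing is complete.
$rows = [['customer_id' => 1, 'name' => 'Acme Ltd']];
if (extension_loaded('xsl')) {
    echo renderHtml(buildXml($rows, 'customer'), $xslSource);
}
```

Because nothing is written to the output stream until `renderHtml()` is called, a script built this way can still redirect to another page at any point before the View is invoked.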
Some people may be wondering how I can possibly create 4,000 different web pages from over 400 different database tables using just 12 XSL stylesheets. If you look at the screen layouts in my library of Transaction Patterns you should see that each page can be broken down into a series of areas or zones each of which is populated with data from different sources. The menu bar uses data from the MNU_MENU table, the title bar uses data from the MNU_TASK table and the MNU_TASK_QUICKSEARCH table, and the navigation bar uses data from the navigation_button table. While the pagination, scrolling and action bar areas are generated by the framework at runtime, the data area(s) is/are provided by whatever Model classes are used in that particular task. In order to tell the XSL stylesheet which element of application data goes where on the web page I use a small screen structure file whose information is copied into the XML document. The identity of that file is specified in the component script which is the starting point for each task in the application. This means that I can control what piece of application data goes where on the screen without having to modify any XSL stylesheets.
When you have written as many user transactions as I have, where each transaction performs one or more operations on one or more database tables and displays the results in an electronic form on a monitor, either as a compiled form or an HTML page, you should be able to see some patterns emerge. For example, once you have built a family of forms for one table, how much effort would be required to build the same family of forms for another table? After I had built a family of forms for Table#1 and another for Table#2 all I had to do was identify what was similar and what was different between the two families so that I could encapsulate the similarities in some sort of reusable module. As documented in What are Transaction Patterns I managed to split each form into the following categories:
The next step was to find a way to create reusable code for each of these categories. This I achieved as follows:
In the six forms shown in the forms family it should be noted that while each of them has a separate Controller the LIST1 form uses its own XSL stylesheet which displays application data using the List View while the other 5 share the same XSL stylesheet which displays data using the Detail View. All six of these tasks also use the same Model component.
I originally created the scripts for each application task by hand, but after a while I constructed a new piece of software which could do this automatically. I created a library of Transaction Patterns, which encapsulate a particular combination of structure and behaviour, then created a mechanism whereby I could link a Pattern with a database table to identify the missing element, the content, and at the press of a button this would generate the necessary scripts. I also made this software add the task to the MENU database so that it could be run immediately.
Every task in a database application, no matter how simple or how complex, has exactly the same characteristics:
An experienced database programmer should be able to see that the code for item #1 in the above list is virtually the same regardless of the contents of the database table. This is often referred to as boilerplate code. On the other hand the code for item #2 is unique for each table or each task. Instead of having to duplicate the boilerplate code in each class it would be very useful to find a mechanism which allowed you to define such code in a single reusable module, and then refer to that module as and when required. It would also be useful if such a mechanism allowed the insertion of any additional unique or custom code. This is where the Template Method Pattern comes into play. Each of the four methods in item #1 above has been implemented as a Template Method in the abstract table class which means that each will execute its own set of sub-methods, some of which will be invariant with fixed implementations while others will be variant/customisable methods which can be defined in each subclass as and when required. You can see a visual representation of this idea in my collection of UML diagrams.
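As a rough illustration of the Template Method idea described above, the following sketch shows an abstract class whose `insertRecord()` method fixes the sequence of steps while leaving empty "hook" methods for each subclass to fill in. The hook names (`preInsert`, `postInsert`), the in-memory `$rows` array which stands in for the database, and the simple "required" check are all assumptions made for this example and are not taken from the framework.

```php
<?php
// Hypothetical sketch of a Template Method in an abstract table class.
// Method names are illustrative and are not taken from the framework.

abstract class AbstractTable
{
    public array $errors = [];
    protected array $rows = [];   // stands in for the real database table

    // Template method: the sequence of steps is defined here, once.
    public function insertRecord(array $fieldarray): array
    {
        $this->errors = [];
        $fieldarray = $this->preInsert($fieldarray);       // variant "hook"
        if (!$this->errors) {
            $this->validate($fieldarray);                   // invariant step
        }
        if (!$this->errors) {
            $this->rows[] = $fieldarray;                    // invariant step (storage)
            $fieldarray = $this->postInsert($fieldarray);   // variant "hook"
        }
        return $fieldarray;
    }

    // Invariant step: identical for every table.
    private function validate(array $fieldarray): void
    {
        foreach ($fieldarray as $field => $value) {
            if ($value === '' || $value === null) {
                $this->errors[$field] = 'Value is required';
            }
        }
    }

    // Hooks: do nothing unless a subclass overrides them.
    protected function preInsert(array $fieldarray): array  { return $fieldarray; }
    protected function postInsert(array $fieldarray): array { return $fieldarray; }

    public function countRows(): int { return count($this->rows); }
}

class Person extends AbstractTable
{
    // Custom business rule, supplied only where it is needed.
    protected function preInsert(array $fieldarray): array
    {
        $fieldarray['last_name'] = strtoupper($fieldarray['last_name'] ?? '');
        return $fieldarray;
    }
}

class Country extends AbstractTable {}   // boilerplate only, no custom code

$person = new Person();
$saved  = $person->insertRecord(['first_name' => 'Jane', 'last_name' => 'Smith']);
```

Here `Country` gets the full insert behaviour without a single line of its own, while `Person` adds one rule by overriding one hook. The boilerplate is never duplicated.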
Although I started with just the six Controllers shown in Figure 1, this has been expanded to 40 to cater for the more complex situations which I encountered while developing a large enterprise application. Each of the concrete table (Model) classes shares the same set of public methods which are inherited from the abstract table class, so this automatically provides for a huge amount of polymorphism. When you consider that polymorphism is required before you can make use of Dependency Injection, which is another way of increasing the amount of reusable code, this should be considered as being a good idea. Each of my 40 Controllers uses these methods to communicate with a Model which means that ANY Controller can be used with ANY Model. So if I have 40 Controllers (one for each Transaction Pattern) and 450 Model classes this produces a grand total of 40 x 450 = 18,000 (yes, EIGHTEEN THOUSAND) opportunities for polymorphism.
I do not use a Front Controller in my framework, so the URL for each task points directly to a separate component script in the file system. This is a very simple script which does nothing but identify the different components that are required to implement that task, such as in the following:
    <?php
    $table_id = "person";                      // identify the Model
    $screen   = 'person.detail.screen.inc';    // identify the View
    require 'std.enquire1.inc';                // activate the Controller
    ?>
Each Controller script is then able to instantiate the relevant table class into an object so that it can make the method calls which are relevant to that Controller using code similar to the following:
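    // load and instantiate the class named in the component script
    require "classes/$table_id.class.inc";
    $dbobject = new $table_id;

    // each Controller then calls whichever of these methods is relevant to it
    $fieldarray = $dbobject->getData($where);
    $fieldarray = $dbobject->insertRecord($_POST);
    $fieldarray = $dbobject->updateRecord($_POST);
    $fieldarray = $dbobject->deleteRecord($_POST);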
As you can see this uses my own version of Dependency Injection.
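To show why this allows any Controller to work with any Model, here is a small self-contained sketch. The two table classes, their in-memory data and the function name `enquireController` are assumptions made for this example. In the real framework each class would live in its own file and `getData()` would run a SELECT query.

```php
<?php
// Hypothetical sketch: one Controller function working with any Model.
// The classes and the in-memory data stand in for real table classes.

abstract class AbstractTable
{
    protected array $data = [];

    // Shared public method: every concrete table class inherits it.
    public function getData(string $where = ''): array
    {
        return $this->data;   // a real class would build and run a SELECT using $where
    }
}

class person extends AbstractTable
{
    protected array $data = [['person_id' => 1, 'last_name' => 'Smith']];
}

class product extends AbstractTable
{
    protected array $data = [['product_id' => 'P1', 'name' => 'Widget']];
}

// The Controller knows the method names but not which class it is calling.
function enquireController(string $table_id, string $where = ''): array
{
    $dbobject = new $table_id;   // class name supplied by the component script
    return $dbobject->getData($where);
}

$people   = enquireController('person');
$products = enquireController('product');
```

The Controller code is identical in both calls. Only the value of `$table_id` changes, and that value is supplied from outside the Controller.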
My methodology may not be the same as that used by most developers, but surely it is the results which are more important? There was a study in 1996 in which the productivity of two teams was compared to find out why one team was twice as productive as the other. The study broke down the code which was written into various categories - business logic, glue code, user interface code, database code, etc. If one considers all these categories, only the business logic code had any real value to the organisation. It turned out that Team A was spending more time writing the code that added value, while Team B was spending more time gluing things together. With my approach I can create a table in the database, import it into my Data Dictionary, then generate both the class files and user transactions for the family of forms shown in figure 1 at the press of a button and be able to run those transactions within 5 minutes, all without having to write a single line of code - no PHP, no HTML, no SQL. This means that the developers have to spend far less time in writing boilerplate code, which leaves them with far more time to spend on the code which has actual value to the organisation - the business logic.
Virtually every article I read on the subject of OO includes the word "abstraction" without supplying a meaningful definition which identifies what this word actually means and how it is supposed to be applied when designing and creating classes. I have seen many attempts, usually feeble and sometimes conflicting, so I was rather pleased when I came across a paper called Designing Reusable Classes which was published in 1988 by Ralph Johnson and Brian Foote. This paper is discussed in depth in my own article The meaning of "abstraction" for which the summary is quite simple:
The idea is that you examine the collection of entities which will be of use in your application and separate the abstract from the concrete, the similar from the different. All the similar protocols (methods) can then be moved to an abstract class while the differences remain in each concrete class. You can then combine the similar with the different using class inheritance, which is described in Designing Reusable Classes as follows:
Data abstraction encourages modular systems that are easy to understand. Inheritance allows subclasses to share methods defined in superclasses, and permits programming-by-difference.
Class inheritance has a number of advantages. One is that it promotes code reuse, since code shared by several classes can be placed in their common superclass, and new classes can start off having code available by being given a superclass with that code. Class inheritance supports a style of programming called programming-by-difference, where the programmer defines a new class by picking a closely related class as its superclass and describing the differences between the old and new classes. Class inheritance also provides a way to organize and classify classes, since classes with the same superclass are usually closely related.
If you are developing a database application which has hundreds of entities you should notice the following list of similarities:
As stated by Johnson and Foote:
it is better to inherit from an abstract class than from a concrete class.
| Property | Description |
|---|---|
| $this->dbname | This value is defined in the class constructor. This allows the application to access tables in more than one database. It is standard practice in the RADICORE framework to have a separate database for each subsystem. |
| $this->tablename | This value is defined in the class constructor. |
| $this->fieldspec | This identifies the columns (fields) which exist in this table and their specifications (type, size, etc). |
| $this->primary_key | This identifies the column(s) which form the primary key. Note that this may be a compound key with more than one column. Although some modern databases allow it, it is standard practice within the RADICORE framework to disallow changes to the primary key. This is why surrogate or technical keys were invented. |
| $this->unique_keys | A table may have zero or more additional unique keys. These are also known as candidate keys as they could be considered as candidates for the role of primary key. Unlike the primary key these candidate keys may contain nullable columns and their values may be changed at runtime. |
| $this->parent_relations | This has a separate entry for each table which is the parent in a parent-child relationship with this table. This also maps foreign keys on this table to the primary key of the parent table. This array can have zero or more entries. |
| $this->child_relations | This has a separate entry for each table which is the child in a parent-child relationship with this table. This also maps the primary key on this table to the foreign key of the child table. This array can have zero or more entries. |
| $this->fieldarray | This holds all application data, usually the contents of the $_POST array. It can either be an associative array for a single row or an indexed array of associative arrays for multiple rows. This removes the restriction of only being able to deal with one row at a time, and only being able to deal with the columns for a single table. This also avoids the need to have separate getters and setters for each individual column as this would promote tight coupling which is supposed to be a Bad Thing ™. |
Each of these characteristics is defined as an empty property in the abstract class and is filled with information when a concrete subclass is instantiated, at which point the constructor loads the contents of its table structure file, which is exported from the Data Dictionary.
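A minimal sketch of this arrangement might look like the following. The method names `loadStructure()` and `structure()`, the database name and the layout of the metadata arrays are assumptions made for this example. In particular the static method merely stands in for the table structure file so that the sketch is self-contained.

```php
<?php
// Hypothetical sketch: properties declared empty in the abstract class are
// filled in when a concrete subclass is constructed. The array returned by
// structure() stands in for the table structure file exported from the Data
// Dictionary; its layout is an assumption made for this example.

abstract class AbstractTable
{
    public string $dbname = '';
    public string $tablename = '';
    public array $fieldspec = [];
    public array $primary_key = [];
    public array $unique_keys = [];
    public array $parent_relations = [];
    public array $child_relations = [];

    // Copy exported metadata into the (initially empty) properties.
    protected function loadStructure(array $structure): void
    {
        foreach ($structure as $property => $value) {
            if (property_exists($this, $property)) {
                $this->$property = $value;
            }
        }
    }
}

class Person extends AbstractTable
{
    public function __construct()
    {
        $this->dbname    = 'example_db';
        $this->tablename = 'person';
        $this->loadStructure(self::structure());
    }

    // Stand-in for the exported structure file.
    private static function structure(): array
    {
        return [
            'fieldspec' => [
                'person_id' => ['type' => 'integer', 'required' => true],
                'last_name' => ['type' => 'string', 'size' => 40, 'required' => true],
            ],
            'primary_key'     => ['person_id'],
            'child_relations' => [
                ['child' => 'person_address', 'fields' => ['person_id' => 'person_id']],
            ],
        ];
    }
}

$person = new Person();
```

Any property which is not mentioned in the metadata, such as `$unique_keys` here, simply remains as an empty array.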
Because these methods are shared by every concrete class they become "plug compatible" and can be swapped with other classes using the power of polymorphism. In this way when a Controller calls a method on a Model it does not need to know which Model it is addressing as they all share the same protocols. This also means that I can employ a design pattern known as the Template Method Pattern where you can define a series of invariant and variable methods in the abstract class which allows each subclass to define its unique business rules in its own variable "hook" methods.
While large numbers of different tasks may implement the same pattern a large and complex application may result in the creation of a large collection of different Transaction Patterns. RADICORE contains over 40 such patterns which have been used to create a large application which contains over 4,000 tasks.
Although the paper Designing Reusable Classes was published in 1988 I only discovered it quite recently as no other documents on OOP referred to it as a source of information. I find this very surprising as it provides a more substantial, meaningful and insightful description of the term "abstraction" than all the other airy-fairy, wishy-washy, limp-wristed attempts. I also find it surprising that, despite not being trained to follow the ideas in this paper, and using nothing more than my experience and intuition, I managed to design my own system of abstract and concrete classes which demonstrates the preferred method of class inheritance, which in turn provides vast amounts of polymorphism, all of which contributes to a collection of classes which provides more reusability than I have seen with any other framework. Perhaps I am not so dumb after all!
That is because they were deliberately taught NOT to do it that way by someone who did not understand how databases work and how to write code which interacts with a database. If these teachers consider that doing it "that way" causes problems, then where is this list of so-called problems published? If doing it "that way" does not cause any actual problems then how exactly can it be wrong?
If I have followed the advice given in Designing Reusable Classes and have managed to identify many portions of similar code and transferred that code into reusable modules which can be shared instead of duplicated then how can anyone possibly say that my methods are wrong? The efficacy of a methodology should be judged by the amount of reusable code which it produces, not how closely it follows a set of arbitrary and artificial rules. The proof is in the pudding - a meal should be judged by the way that it tastes, not by evaluating the recipe used to create it.
This is because they are not genuine rules, just personal preferences. If I break a genuine rule then my code will not work. If I break these artificial rules then not only does my code NOT break, it actually works better. OOD was not formulated with database applications in mind, therefore it does not provide an optimal solution. Using two different design methodologies - one for the software and another for the database - would be a recipe for disaster. I always design the database first using the rules of normalisation, then I skip OOD completely and force my software structure to follow my database structure. In that way I avoid the problem of Object-relational Impedance Mismatch which then requires that abomination of a solution called an Object-Relational Mapper (ORM).
Writing database applications requires a knowledge of how databases work, yet few OO developers understand this. They start with OO theory and then try to apply that theory in different scenarios, then complain when difficulties arise. I have seen such excuses as:
The advantage I have is that I worked with a variety of database systems - hierarchical, network and relational - for several decades using non-OO languages, so I knew how to design databases and write database applications. When I switched to an OO-capable language all I had to do was learn how to leverage the new concepts - encapsulation, inheritance and polymorphism - in order to write programs with more reusability.
Closely related to OOD is Domain Driven Design (DDD) with even more artificial rules which I choose to ignore. When writing a large enterprise application which consists of a number of distinct subsystems/domains I do not need to embark on a separate design process for each of those subsystems/domains for one simple reason - while each of those domains is different they all share one common attribute in that they are ALL database applications, and I have learned to build database applications from a common set of patterns. This is explained more in Why I don't do Domain Driven Design and The Template Method Pattern as a Framework.
Then you don't understand how to use inheritance and the Template Method Pattern. All code which is common to all database tables is defined once in the abstract table class and then inherited by every concrete table class. Every piece of code which can be shared is defined once within the abstract class. Every piece of code which is specific to a particular table is defined within that table's subclass. If code is defined once and shared multiple times then where exactly is the duplication?
If you don't know how to write reusable software I strongly suggest you read Designing Reusable Classes which was published in 1988 by Ralph E. Johnson & Brian Foote. I discuss this in more detail in The meaning of "abstraction".
This idea is propagated by those who don't understand how to use inheritance properly. Inheritance only causes problems when it is overused, such as to create deep inheritance hierarchies, or to inherit from one concrete class to create a different concrete class where some methods in the superclass are not appropriate in the subclass. This leads to such complaints as Inheritance breaks Encapsulation. Problems such as these can be avoided by only inheriting from an abstract class, which is precisely what I am doing.
This is discussed further in What is/is not considered to be good OO programming.
This complaint was explained in the following ways:
Abstract concepts are classes, their instances are objects. IMO The table 'cars' is not an abstract concept but an object in the world.
Classes are supposed to represent abstract concepts. The concept of a table is abstract. A given SQL table is not, it's an object in the world.
This complaint is completely bogus for the simple reason that what I have implemented actually follows what has been written:
While the structure of each database table is different the way that each table is handled is exactly the same. When a concrete table class is instantiated into an object there is code in the constructor which loads in metadata from the corresponding dictionary file into the instance. This metadata is used by the framework at runtime to guide the processing of that table. For example, the $fieldspec array is used by the standard validation class to check that each piece of data can be inserted into the database without causing an error. The parent relations array can be used when constructing SELECT queries while the child relations array can be used for referential integrity checks when deleting a record.
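The way in which metadata can drive standard validation may be sketched as follows. This is not the framework's validation class: the function name `validateRow` and the keys used inside each field specification (`type`, `size`, `required`) are assumptions made for this example.

```php
<?php
// Hypothetical sketch: validation driven entirely by metadata.
// The keys used in $fieldspec ('type', 'size', 'required') are assumptions.

function validateRow(array $fieldarray, array $fieldspec): array
{
    $errors = [];
    foreach ($fieldspec as $field => $spec) {
        $value = $fieldarray[$field] ?? null;

        // Missing or empty values are only an error for required fields.
        if ($value === null || $value === '') {
            if (!empty($spec['required'])) {
                $errors[$field] = 'This field is required';
            }
            continue;
        }

        // Check the value against the column's type and size.
        if ($spec['type'] === 'integer' && filter_var($value, FILTER_VALIDATE_INT) === false) {
            $errors[$field] = 'Value must be an integer';
        } elseif ($spec['type'] === 'string' && strlen((string) $value) > $spec['size']) {
            $errors[$field] = "Value exceeds {$spec['size']} characters";
        }
    }
    return $errors;
}

$fieldspec = [
    'person_id' => ['type' => 'integer', 'required' => true],
    'last_name' => ['type' => 'string', 'size' => 10, 'required' => true],
];

$good = validateRow(['person_id' => '42', 'last_name' => 'Smith'], $fieldspec);
$bad  = validateRow(['person_id' => 'abc', 'last_name' => ''], $fieldspec);
```

Because the function receives the specifications as data it contains no reference to any particular table, so the same code can serve every table class.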
As far as I am concerned Object Oriented Programming (OOP) requires nothing more than writing programs around objects, thus taking advantage of encapsulation, inheritance and polymorphism to increase code reuse and decrease code maintenance. The efficacy of your OOP implementation can be measured in the levels of reusability that you achieve.
This is discussed further in What OOP is NOT and What OOP is.
Nobody in their right minds would create a separate subclass for each row in a table. Each class provides a blueprint for each instance of that class just as each table definition provides a blueprint for each row in that table. In a database it is only the concept of an unspecified table which is abstract while a particular table with a particular name and particular structure and particular business rules is a concrete instance of that abstraction. Just as a Person table can hold many rows, one for each person, my Person object is capable of having many instances, one for each person. I do the same for every table in my databases - one table, one class, where each instance of that class can hold as many rows as is necessary. This follows the Table module pattern which Martin Fowler wrote about in his book Patterns of Enterprise Application Architecture. He also has Class Table Inheritance and Concrete Table Inheritance, but as these talk about hierarchies of tables, which I do not have, I do not use them. I do not have table hierarchies in my software as there are no such table hierarchies in the database. The use of relationships may indicate that one table is related to another, but this in no way forces me to go through one table in a relationship in order to get to the other. Each table can be treated as an independent object - in fact it has to be for insert, update and delete operations - but for read operations it is possible to combine data from several tables by using JOIN clauses in the SELECT query.
How can I create an instance of a class without having a class to start with? I cannot create an instance of an abstract class and then supply it with the information it needs as the rules of OO explicitly prohibit instantiating an abstract class into an object. And you have the nerve to tell me that I don't understand OO!
As for losing the potential benefits of low maintenance, you are talking out of the wrong end of your alimentary canal. All the common code is contained within a single abstract table class, thus following the DRY principle, and each of my 400 concrete table classes contains only that code which is specific to that table while sharing all that code which it inherits from the abstract class.
How so? While it is true that there is a dependency between the Business layer and the Data Access layer what exactly is the problem? Every database application ever written has this same dependency, so what exactly is the problem?
Why not? Each table is a separate entity in the database, with its own structure and its own business rules, so why shouldn't I create a separate class to encapsulate this information? It has to go somewhere, so where would you put it? Besides, creating a new table class is very easy with my framework:
Once I have created the class for that table I can then use the generate PHP script facility in my Data Dictionary to create as many user transactions as is necessary to deal with that table. Each transaction can be run immediately from the framework. These will perform the basic operations after which the developer can add in as many customisations as is necessary by inserting code into the relevant "hook" methods.
No I don't. If I need to change a table's structure I deal with it in three simple steps:
I do NOT have to do any of the following:
You may have difficulties with your implementation, but remember that my implementation is totally different, which it had to be in order to eliminate those difficulties.
Every newbie programmer is taught that "design patterns are good", so they do the stupid thing and try to implement as many design patterns as possible. They have their favourite patterns and cannot understand why everyone else does not use the same ones. There are even arguments as to how each pattern should be used. Take a look at the criticisms against my implementation of the MVC pattern as an example.
There is no such thing as a set of "right" design patterns which everyone should use. The correct approach is not to pick a collection of patterns and then try to force them into your code, it is to write code that works, then refactor it into those patterns which are appropriate for your particular circumstances.
I do not pick patterns in advance and then attempt to write code which implements them. Instead I write code that works, and after I have got it working I refactor it as necessary to ensure that the structure and logic are as sound as possible. If the code then appears to match a particular pattern then that is pure coincidence - it is more by accident than design (pun intended!) This follows the advice of Erich Gamma, one of the authors of the Gang of Four book who, in this interview, said the following:
Do not start immediately throwing patterns into a design, but use them as you go and understand more of the problem. Because of this I really like to use patterns after the fact, refactoring to patterns.
In case you think that I don't use design patterns at all you would be mistaken. If you bothered to look closely at my framework you would see the following:
This topic is discussed further in You are using the wrong design patterns and You don't understand Design Patterns.
Too many people seem to think that code which does not follow their definition of OOP is not "proper" OOP at all, therefore it must be procedural. In this context the term "procedural" is used as an insult. Some of these criticisms are explained in In the world of OOP am I Hero or Heretic?. A more detailed response is contained in What is the difference between Procedural and OO programming?
Some people seem to think that just because my abstract table class is bigger than what they are used to that it is automatically bad. They seem to think that there is a rule which says that a class cannot have more than N methods, and each method should not have more than N lines of code (where N is a different number depending on to whom you talk). As far as I am concerned this rule was only invented to cater for those people who are so intellectually challenged they cannot count to more than 10 without taking their shoes and socks off. I am obeying the rule of encapsulation which states quite clearly that when you have identified an entity you create a class for that entity which contains ALL the properties and ALL the methods that the entity requires. Note that as I use the 3-Tier Architecture each table class contains only business logic - all data access logic and presentation logic are in separate components. I am following the Single Responsibility Principle (SRP), so how can I be wrong?
Besides, this class cannot be used to create a God Object as it is an abstract class and cannot be instantiated into an object. This class is inherited by each concrete table class, of which I now have over 400, so each table class has its own object. There is no such thing as a single object which handles all database tables.
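The point that an abstract class cannot be instantiated is easy to demonstrate in PHP. In this small sketch the class names and the helper function `tryInstantiate` are purely illustrative.

```php
<?php
// Hypothetical sketch: an abstract class can be inherited but not instantiated.

abstract class AbstractTable
{
    public function getData(): array { return []; }
}

class Person extends AbstractTable {}
class Product extends AbstractTable {}

// Attempt to create an object from a class name and report what happened.
function tryInstantiate(string $class): string
{
    try {
        $object = new $class();
        return get_class($object);
    } catch (Error $e) {
        return 'Error: ' . $e->getMessage();
    }
}

$results = [
    tryInstantiate('AbstractTable'),   // PHP refuses and throws an Error
    tryInstantiate('Person'),
    tryInstantiate('Product'),
];
```

Each concrete class produces its own separate object, while the attempt to create an object from the abstract class fails.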
The definition of a God Object also states that it contains a majority of the program's overall functionality, and in my book "majority" means "greater than 50%", and after counting all the lines of code in my framework I can report that it only contains 17%. That is measurable proof, so your opinion is not worth the toilet paper on which it is written. This also invalidates the claim that I have multiple God classes.
This topic is discussed in more detail in You have created a monster "god" class and A class with multiple methods has multiple responsibilities.
People look at an example of one of my concrete table classes which starts off by containing nothing more than a constructor and because it is so small they surmise that it must be anemic. Perhaps they don't notice the use of the word "extends" which allows it to incorporate code from a huge abstract class. An anemic domain model is supposed to be one which contains data but no methods to process any business rules, but if you opened your eyes and looked close enough you would see that all the necessary methods, which includes data validation, are defined in and inherited from the abstract table class.
This topic is discussed in more detail in You have created an anemic domain model.
Can anyone explain to me how some people looking at my table classes consider that my abstract table class is a God Object which does too much while my concrete table classes are all anemic as they do too little? Surely these two accusations are mutually exclusive?
The full complaint, as explained in OOP for Heretics, was as follows:
If you have one class per database table you are relegating each class to being no more than a simple transport mechanism for moving data between the database and the user interface. It is supposed to be more complicated than that.
You are missing an important point - every user transaction starts life as being simple, with complications only added in afterwards as and when necessary. This is the basic pattern for every user transaction in every database application that has ever been built. Data moves between the User Interface (UI) and the database by passing through the business/domain layer where the business rules are processed. This is achieved with a mixture of boilerplate code which provides the transport mechanism and custom code which provides the business rules. All I have done is build on that pattern by placing the sharable boilerplate code in an abstract table class which is then inherited by every concrete table class. This has then allowed me to employ the Template Method Pattern so that all the non-standard customisable code can be placed in the relevant "hook" methods in each table's subclass. After using the framework to build a basic user transaction it can be run immediately to access the database, after which the developer can add business rules by modifying the relevant subclass.
Some developers still employ a technique which involves starting with the business rules and then plugging in the boilerplate code. My technique is the reverse - the framework provides the boilerplate code in an abstract table class after which the developer plugs in the business rules in the relevant "hook" methods within each concrete table class. Additional boilerplate code for each task (user transaction, or use case) is provided by the framework in the form of reusable page controllers.
Surely if I make something more complicated than it need be I would be violating the KISS principle? You must be a member of the Let's-make-it-more-complicated-than-it-really-is-just-to-prove-how-clever-we-are brigade. The alternative idea, that of having a class which is responsible for a group of tables, is something I could never dream up in a million years. I have worked with databases for several decades, and just as the DBMS itself treats each table as a separate entity with its own name, structure and set of business rules, then I follow the principles of OOP and encapsulate that structure and all those rules in a separate class. This simple approach has enabled me to identify and take advantage of a wide range of benefits which are explained below.
While it is true that each table class, when it is first created, can only handle the basic transportation of data between the UI and the database (which includes all primary validation) my implementation of the Template Method Pattern allows for any additional complexity to be covered by inserting code into the relevant hook methods within each subclass.
Incorrect. This approach is useful for any CRUD application whether simple or complex. Every program in a database application will perform one or more CRUD operations on one or more tables and execute any number of business rules, so the only difference between a "simple" and a "complex" program is the number of tables it accesses and the amount of extra code required to deal with each of those business rules.
If my framework could not offer anything other than the six patterns which are part of this family of forms then you might have a point. However, my long experience with developing enterprise applications which cover a variety of different domains and which contain user transactions of varying complexity has enabled me to create a library of 45 Transaction Patterns which contain steps comprised of a mixture of invariant and variant methods. It is very easy to create a starting transaction from any of these patterns simply by pressing buttons and without having to write a single line of HTML, SQL or PHP code. This will enable you to run the transaction with basic behaviour which you can then customise to your heart's content. Additional business rules can be processed simply by inserting the relevant code into any of the available "hook" methods.
It is a feature of my framework that every method called from a Controller on a Model is an instance of the Template Method Pattern. This is because every concrete table class inherits from the same abstract table class which consists of nothing but a huge collection of template methods. While the abstract class contains the implementation for each of the invariant methods, each variable/customisable method can have a totally separate implementation in each subclass.
Then your implementation was obviously faulty. Mine was not. Perhaps you need to rethink your ideas on how OOP should be implemented and concentrate, as I do, on producing results instead of following arbitrary rules.
Databases do not have associations, they have relationships. I do not have any methods in any table class which deal with any associations that the table may have, and I do not have any classes which deal with several tables. All I have in each table class is a list of $parent_relations and $child_relations which identify if any relationships involving that table exist. The code to deal with each type of relationship has been standardised and is built into the framework, so needs no further action by the developer. More details can be found in the following:
Every table has its own class simply because every table is a distinct entity in the database which has to be accessed using exactly the same set of CRUD operations as every other table. Every table has its own structure and its own set of business rules, and by creating a class which would be responsible for more than one database table I would, in my humble opinion, be violating the Single Responsibility Principle (SRP).
While it is recognised that it is necessary to have an object in your software which represents an entity in the outside world with which the software will communicate, what exactly is meant by "entity"? What happens if an entity contains many different parts? Those people who start their programming career learning OO theory are taught about object associations and object aggregations, but struggle to deal with them when designing and writing the software to deal with them in a database. While OO theory says that you can deal with all associations, aggregations and compositions in the same way, I built database applications for 20 years before using an OO language, and during that time I learned how to deal with relationships in a standard way. This is documented in the following:
The idea that a group of entities which form what can be described as an aggregate requires a single class to deal with that group does not follow the way in which databases work. A relationship between two tables is signified by nothing more than one table having a foreign key which links to the primary key of another table. The operations that can be performed on a table are the same regardless of how many relationships may exist or what type they may be. Relationships are dealt with outside of the table, so in my software they are also dealt with outside of the table class, by specialist components in the Presentation layer. There are no special methods inside any table class to deal with any relationships, there are simply two arrays, called $parent_relations and $child_relations, which identify if a relationship exists and the mapping between the foreign key of the child and the primary key of the parent. The code to deal with each relationship in a manner which is suitable to the user is supplied by the framework in one of the numerous reusable Transaction Patterns. While a parent-child relationship can be dealt with using one of a choice of patterns, the choice depends on whether it produces a fixed hierarchy, such as for a SALES ORDER, or a recursive hierarchy, such as for a BILL OF MATERIALS (BOM) which can be viewed in its entirety using a TREE pattern.
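As an illustration only, a relationship array and the standard code which uses it might look something like the sketch below. The array layout (`child`, `fields`), the table and column names, and the helper function `parentKeyToChildKey()` are assumptions made for this example, not the framework's documented format.

```php
<?php
// Assumed layout: each entry names the child table and maps the parent's
// primary key column(s) to the child's foreign key column(s).
$child_relations = [
    ['child' => 'sales_order', 'fields' => ['id' => 'customer_id']],
];

/**
 * Generic code, written once, which turns a selected parent row into
 * selection criteria for the child table described by $relation.
 */
function parentKeyToChildKey(array $parentRow, array $relation): array
{
    $criteria = [];
    foreach ($relation['fields'] as $parentColumn => $childColumn) {
        $criteria[$childColumn] = $parentRow[$parentColumn];
    }
    return $criteria;
}

// A row selected from the parent (CUSTOMER) table.
$customer = ['id' => 42, 'name' => 'Acme Ltd'];

// Criteria which identify that customer's rows on the child table.
$criteria = parentKeyToChildKey($customer, $child_relations[0]);
```

Because the mapping is held as data, the same helper works for every pair of related tables without any table class needing a relationship-specific method.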
The idea that I should have a single Model class to handle all those tables in an association or aggregation is ridiculous enough, but when you couple it with the idea followed by some programmers that the Model should be handled by a single Controller this, in my humble opinion, transforms it from the ridiculous to the idiotic. I find the whole idea so amusing that sometimes I laugh so hard I can feel the tears running down my trouser leg. The purpose of OOP is supposed to be the creation of more reusable code, as the more code you can reuse the less code you have to write and the quicker you can finish, which means higher levels of productivity. My methodology provides me with a vast amount of reusable code, as explained in more detail in Why I don't do Domain Driven Design, which makes me far more productive than most other programmers. If my levels of productivity are better then surely the methodology which produces that level of productivity must be better?
If you think that my claims of greater productivity are at best exaggerated or at worst a bare-faced lie then you should take this challenge. If you cannot achieve within five minutes with YOUR methods what I can achieve within five minutes with MY methods, all without writing a single line of code, then I shall conclude that any criticisms which you keep throwing in my direction are not worth the toilet paper on which they are written and that you are talking out of the wrong end of your alimentary canal. Instead of simply claiming that your methods are superior to mine I challenge you to prove it. Either put up or shut up.
The Command Query Responsibility Segregation (CQRS) pattern deals with the notion that you can use a different model to update information than the model you use to read information. For some situations, this separation can be valuable, but beware that for most systems CQRS adds risky complexity. The above article contains the following statements:
The mainstream approach people use for interacting with an information system is to treat it as a CRUD data store. By this I mean that we have mental model of some record structure where we can Create new records, Read records, Update existing records, and Delete records when we're done with them. In the simplest case, our interactions are all about storing and retrieving these records.
As our needs become more sophisticated we steadily move away from that model. We may want to look at the information in a different way to the record store, perhaps collapsing multiple records into one, or forming virtual records by combining information for different places. On the update side we may find validation rules that only allow certain combinations of data to be stored, or may even infer data to be stored that's different from that we provide.
As this occurs we begin to see multiple representations of information. When users interact with the information they use various presentations of this information, each of which is a different representation.
Note that this definition of CQRS is an extension of the Command-Query Separation (CQS) pattern which was devised by Bertrand Meyer as part of his pioneering work on the Eiffel language which states that:
Every method should either be a command that performs an action, or a query that returns data to the caller, but not both. In other words, asking a question should not change the answer. More formally, methods should return a value only if they are referentially transparent and hence possess no side effects.
It should be obvious from my design that, because I use methods which mirror on a one-for-one basis the CRUD operations which can be performed on any database table, a call to the getData() method does nothing but issue an SQL SELECT query which does not alter the database. This therefore automatically satisfies the above principle.
There are some people, however, who seem to take great delight in expanding or stretching the words in a principle and end up changing its meaning entirely. The CQS principle talks about having different methods for commands and queries, which to me implies different methods in the same object, yet some people like to go one step further and put those methods into different objects. This to me is a step too far.
The idea that you need different Models to provide different representations of the data tells me that he doesn't understand how the Model-View-Controller design pattern is supposed to work. In this pattern the Model does nothing but output raw data while it is the View which formats that data into the representation required by the user. The same Model can be used with any number of different views to provide an equal number of different representations. There are many ways in which a Model's data can be incorporated into a web page, but how that data is presented is of no concern to anything but the View. If I wish to create a task which outputs data to a CSV file then I construct it from either the OUTPUT1 or OUTPUT4 patterns. If I wish to create a task which outputs data to a PDF document then I create it from either the OUTPUT2 or OUTPUT3 patterns. This is without the need to create a new Model. I even have a specialist pattern to create address labels.
The idea that you would need separate classes to deal with separate operations is a non-starter for me as it violates the principle of encapsulation which states that a class should contain ALL the data for an entity and ALL the operations which can be performed on that data. The only time I create a subclass for one of my concrete table classes is when I want to have totally different code in one or more "hook" methods in order to provide different behaviour in a different task. For example in the DICT subsystem I have the following class files:
Many people seem to be annoyed at my heretical approach to OOP, but I don't care. It has enabled me to avoid a lot of the unnecessary coding that other approaches seem to require, and to provide a framework that takes care of a lot of the standard functionality that is needed in a database application. Thus I can achieve more with less, and isn't that supposed to be the benefit of using OOP in the first place? Below is a list of the areas in which I can save time:
Some people seem to think that the way that you design an application depends on the language you will use to implement it, that designing for an OO language is totally different from designing for a procedural language. I disagree. While working for various software houses in the past I would often visit a potential client to gather the requirements for a new system which they wanted, usually to replace an old system which was becoming more of a hindrance than a help. The requirements often started with "more management reports", so we would make a list of what reports they needed and what data needed to be included in each report. From this we would start designing the database which would provide the data for each report. In the 1980s a lot of these reports were printed on paper, but nowadays they are either provided as online screens, spreadsheets or PDF documents.
Having identified the data outputs and the data storage we then had to identify the data inputs. The end result was what is known as a logical design as it existed only on paper. This contained a preliminary database design plus a list of user transactions (use cases) which would allow the users to insert, update and display that data. Each transaction was rated on its complexity which included the number of tables it needed to access, how they would be accessed, and what business rules needed to be implemented. Part of this process was to trace each piece of data through its input, its storage and its output to ensure that we knew where it came from and where it was going. The data structures were also put through a process known as Data Normalisation to ensure that they could be accessed as efficiently as possible.
This logical design, still in paper form, would then be discussed with the client to ensure that it met all of their requirements. The next stage would be to produce the physical design which would identify the hardware requirements, which DBMS would be used, and the choice of development language and possibly development tools such as frameworks. The volume of data which would be input each day would be used to judge the size of the database, and the number of users who would access the system at the same time would be used to judge the size of the CPU. Database backups and archiving strategies would also add to the hardware costs. The number of transactions and their complexity could be used as a guide to the development costs. Note that the cost of building a piece of software remains the same regardless of how many times it is run, whether it be a thousand times a day or just once a month.
This design process remained the same regardless of the development language for the simple fact that in a database application the most important part is the database design, closely followed by the requirements of all the user transactions that will be necessary to move the data into and out of the database. It is the software itself which is the implementation detail. This means that I do not need to design the software separately using either Object-Oriented Design (OOD) or Domain-Driven Design (DDD) as everything can be built using standard patterns. By not using two incompatible design methodologies my software structure is always in sync with my database structure, so I avoid the problem known as Object-relational Impedance Mismatch, which then means that I do not have to work around that problem by using that abomination of a solution called an Object-Relational Mapper (ORM). Prevention is always better than cure.
Most programmers overuse inheritance by creating deep class hierarchies and inheriting from one concrete class to create another concrete class. The practice which I followed instinctively, which was later backed up by the experts, was to only inherit from an abstract class. I knew from my previous experience that every table in the database should be treated as a separate entity, and that because every table is subject to the same CRUD operations the code for these operations could be placed in an abstract table class so that it could then be inherited and shared by every concrete table class. The use of an abstract class then enabled the use of the Template Method Pattern so that I could place custom code inside "hook" methods within each concrete table class to override the standard processing.
The abstract table class is supplied as part of the framework, and every concrete table class which is generated from the Data Dictionary will automatically inherit from this abstract class.
The only time I ever create a subclass of a concrete table class is when I need to provide a totally different implementation in any of the "hook" methods. For example, in the DICT subsystem I have the following class files:
As far as I am concerned all the necessary design patterns have been built into my framework. I started off by using the 3-Tier Architecture, but because I ended up by splitting the presentation layer into two separate components a colleague pointed out that this was also an implementation of the MVC design pattern. This resulted in a four-part structure which is shown in Aren't the MVC and 3-Tier architectures the same thing? The four components are as follows:
All the public methods in the abstract table class implement the Template Method Pattern which include "hook" methods so that custom logic can easily be added to each concrete table class.
Every concrete table class follows exactly the same pattern, so it can be constructed by the framework and not by the developer. As each of these classes represents a different database table it can use that table's details which already exist in the database schema. Each class file can be generated by the framework's Data Dictionary in two simple steps:
If a table's structure ever changes all that needs to be done is to repeat the import and export process which will cause the structure file to be recreated. The class file will not be overwritten as it may have been modified to include code in customisable "hook" methods. The customisable methods will need to be changed manually, but only if these mention any of the changed columns.
Each class represents a different database table, and as each table is subject to exactly the same operations as every other table all the common methods and properties have been predefined in the abstract table class. These are the methods used by each Controller to communicate with each Model.
Because all the data, both incoming and outgoing, is held in an array of variables called $fieldarray, which is defined in the abstract table class, I don't have to spend time in defining a separate variable for each column, nor do I have to build a separate getter and setter for each column.
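The effect of holding a whole row in a single array can be sketched as follows. The property name `$fieldarray` comes from the text above; everything else (the class names, the `setFieldArray()` and `getFieldArray()` methods and the `handleInput()` function) is hypothetical and exists only for this illustration.

```php
<?php
// Hypothetical sketch: one array property replaces a variable, a getter
// and a setter for every column.
abstract class TableBase
{
    protected array $fieldarray = [];

    // The whole row goes in, whatever columns it happens to contain.
    public function setFieldArray(array $data): void
    {
        $this->fieldarray = $data;
    }

    // The whole row comes out in the same shape.
    public function getFieldArray(): array
    {
        return $this->fieldarray;
    }
}

class Person extends TableBase
{
}

class Invoice extends TableBase
{
}

// Calling code never mentions a column name, so it works with any table
// and does not change when a column is added or removed.
function handleInput(TableBase $model, array $input): array
{
    $model->setFieldArray($input);
    return $model->getFieldArray();
}

$person  = handleInput(new Person(), ['person_id' => 'FB', 'first_name' => 'Fred']);
$invoice = handleInput(new Invoice(), ['invoice_id' => 1001, 'total' => '99.50']);
```

Compare this with a class that has `setFirstName()`, `getFirstName()` and so on: there the calling code has to know every column of every table.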
I do not need to define a separate method for each user transaction (also known as "task", "use case" or "unit of work") as every transaction follows the same pattern in that it performs one or more CRUD operations on one or more tables, so it is the Controller's job to call the relevant method on the relevant Model. Each user transaction has its own component script in the file system, and it is this tiny script which identifies which Model(s) are to be used with which Controller for that transaction.
Each table class contains standard code which is inherited from the abstract class, and while this is sufficient to handle the transfer of data from the User Interface (UI) to the database and back again, and the primary validation to ensure that for inserts and updates each value is compatible with the column definition in the database, it may be necessary to add custom code at different points in the processing cycle. This can be done by inserting the relevant code into the "hook" methods which have been built into the abstract class but which can be copied into each table class.
The primary validation requirements for each column in a table are defined in the $fieldspec array which is made available in the <table>.dict.inc file which is exported from the Data Dictionary. All user input comes in as an associative array, such as $_POST, where the column values are keyed by the column name. The abstract table class then uses a standard validation class to verify that each of the values in the data array matches that column's specifications in the specifications array.
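A heavily simplified sketch of this kind of specification-driven validation is shown below. The keys used inside `$fieldspec` (`type`, `size`, `required`), the function name `validateFields()` and the error messages are assumptions for the purpose of the example; the real specifications file may use a different layout and cover many more data types.

```php
<?php
// Assumed specification layout: one entry per column, keyed by column name.
$fieldspec = [
    'person_id'  => ['type' => 'string',  'size' => 8,  'required' => true],
    'first_name' => ['type' => 'string',  'size' => 20, 'required' => true],
    'age'        => ['type' => 'integer', 'required' => false],
];

/**
 * Check every column in the specifications against the input array and
 * return an array of error messages keyed by column name.
 */
function validateFields(array $fieldarray, array $fieldspec): array
{
    $errors = [];
    foreach ($fieldspec as $column => $spec) {
        $value = trim((string) ($fieldarray[$column] ?? ''));

        if ($value === '') {
            if (!empty($spec['required'])) {
                $errors[$column] = 'This field is required';
            }
            continue; // nothing further to check on an empty value
        }

        if ($spec['type'] === 'integer') {
            if (filter_var($value, FILTER_VALIDATE_INT) === false) {
                $errors[$column] = 'Value must be an integer';
            }
        } elseif (isset($spec['size']) && strlen($value) > $spec['size']) {
            $errors[$column] = 'Value is longer than ' . $spec['size'] . ' characters';
        }
    }
    return $errors;
}

// Input arrives as an associative array, in the same shape as $_POST.
$input  = ['person_id' => 'FB', 'first_name' => '', 'age' => 'abc'];
$errors = validateFields($input, $fieldspec);
```

Nothing in `validateFields()` refers to a particular table, which is why a single validation routine can serve every table class.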
Secondary validation can be carried out by adding custom code into the relevant "hook" methods.
This topic is discussed further in How NOT to validate data.
Object associations are nothing more than relationships where each relationship involves a foreign key on a child table which refers to the primary key on a parent table. Dealing with each relationship does not require extra code in any Model, it requires standard code in a Controller which deals with the two entities and handles the movement of the parent's primary key to the child's foreign key. This is why I created the LIST2 pattern.
Object aggregations are nothing more than a hierarchy of parent-child relationships, so it is easier to deal with each pair of tables in a separate user transaction instead of having custom code to deal with the entire collection of relationships.
A large number of programmers seem to think that each Model class needs its own Controller simply because each Model is given its own unique set of method names, which include the setters and getters for all the individual table columns. This means that the Model is tightly coupled to the Controller and the Controller is tightly coupled to the Model. This means that neither can be reused with other objects which indicates a deficiency in the design. I have cured this deficiency by making the communication between Controllers and Models to be as loosely coupled as is physically possible by having each Model use the same set of methods and by eliminating the use of getters and setters. This means that by using the power of polymorphism I can use any Controller with any Model.
Each Controller performs a fixed set of operations on a fixed number of Models and produces a different View, as described in Transaction Patterns, and by using the power of Dependency Injection the same Controller can perform the same set of operations on whatever Model it is told to use.
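The following sketch shows the principle in miniature. The method name `getData()` is taken from the text; the class names, the canned rows and the `listController()` function are hypothetical, and a real Controller would do far more than this.

```php
<?php
// Every Model exposes the same method, so they are interchangeable.
abstract class TableModel
{
    abstract public function getData(string $where): array;
}

// Stub Models returning canned rows instead of reading a database.
class Person extends TableModel
{
    public function getData(string $where): array
    {
        return [['person_id' => 'FB', 'name' => 'Fred Bloggs']];
    }
}

class Product extends TableModel
{
    public function getData(string $where): array
    {
        return [['product_id' => 1, 'name' => 'Widget']];
    }
}

/**
 * A reusable "list" Controller. It is told which Model to use (the
 * dependency is injected as a class name) and never refers to any
 * table or column by name.
 */
function listController(string $tableId, string $where = ''): array
{
    if (!is_subclass_of($tableId, TableModel::class)) {
        throw new InvalidArgumentException("$tableId is not a table class");
    }
    $model = new $tableId();
    return $model->getData($where);
}

// Each line stands in for a tiny component script which pairs the same
// Controller with a different Model.
$people   = listController('Person');
$products = listController('Product');
```

One Controller multiplied by any number of Models gives that many working transactions with no extra Controller code.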
I decided from the outset that instead of building each HTML document from scratch for each user transaction it would be better to use a template engine as I had already noticed a repeating pattern of structures with the only difference being the content. I had already become familiar with the use of XML and XSL, and having proved to myself that both could be used easily with PHP I stuck with that as my templating engine. I started with a separate XSL stylesheet for each screen, but after several cycles of refactoring I managed to produce a small library of reusable XSL stylesheets which could be used for any screen in the application. While the same template can be used to display the data from different Models, the different data names are supplied at runtime using a separate screen structure script. The contents of this small script, which can be modified by the developer, are copied into the XML document so that they can be processed by the XSL stylesheet during the transformation process.
The construction of the XML document is common to all web pages so can be supplied in a single reusable object. The only variables required at runtime are supplied by the screen structure script. This is built by the framework when the user transaction is generated from the Data Dictionary, but it can be amended by the developer to customise the screen when required.
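A rough sketch of such a reusable XML builder, using PHP's DOM extension, is given below. The element names (`root`, `structure`, `field`), the attribute names and the shape of the `$screenStructure` array are assumptions made for illustration and are not the framework's actual document format. The subsequent XSL transformation step is not shown.

```php
<?php
/**
 * Build an XML document from any table's rows plus a screen structure.
 * Nothing in here is specific to one table or one screen.
 */
function buildXml(string $table, array $rows, array $screenStructure): string
{
    $doc  = new DOMDocument('1.0', 'UTF-8');
    $root = $doc->createElement('root');
    $doc->appendChild($root);

    // One element per row, one child element per column.
    foreach ($rows as $row) {
        $rowNode = $doc->createElement($table);
        foreach ($row as $column => $value) {
            $colNode = $doc->createElement($column);
            // createTextNode() escapes special characters such as "&".
            $colNode->appendChild($doc->createTextNode((string) $value));
            $rowNode->appendChild($colNode);
        }
        $root->appendChild($rowNode);
    }

    // Copy the screen structure into the document so that a generic
    // stylesheet can discover which columns to display and how to label them.
    $structure = $doc->createElement('structure');
    foreach ($screenStructure as $column => $label) {
        $field = $doc->createElement('field');
        $field->setAttribute('id', $column);
        $field->setAttribute('label', $label);
        $structure->appendChild($field);
    }
    $root->appendChild($structure);

    return $doc->saveXML();
}

$xml = buildXml(
    'person',
    [['person_id' => 'FB', 'first_name' => 'Fred & Co']],
    ['person_id' => 'ID', 'first_name' => 'First Name']
);
```

Because the column names travel inside the document, the stylesheet which consumes it does not need to be written for any particular table.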
All the following areas in a web page are automatically supplied by and handled by the framework:
If you have to write such code yourself then you know what a burden it can be. Now imagine not having to write such code to achieve all this functionality.
Anyone who has written SQL queries for any length of time will tell you that they all follow a standard pattern with the only differences being the table and column names. While default SQL queries for INSERTs, UPDATEs and DELETEs are built automatically by the framework it is possible to customise the SELECT query by inserting code into the _cm_pre_getData() method which is one of the "hook" methods. The different parts of the query are then sent to the Data Access Object (DAO) where they will be assembled and sent to the selected DBMS using the relevant API.
Note also that there is a simple process to retrieve columns from a parent table by automatically adding JOINs to SELECT queries.
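The way in which a "hook" can adjust the parts of a SELECT query before they are assembled can be sketched like this. The hook name `_cm_pre_getData()` appears in the text above, but its signature here is simplified, and the property names (`$sql_select`, `$sql_from`, `$sql_orderby`), the `buildSelect()` method and the table and column names are all assumptions for this example. In this sketch the assembly happens inside the table class for brevity, whereas the text describes it as being done in the DAO.

```php
<?php
abstract class TableBase
{
    protected string $tablename;
    protected string $sql_select  = '*';
    protected string $sql_from    = '';
    protected string $sql_orderby = '';

    // Stands in for the part of getData() which prepares the query.
    public function buildSelect(string $where = ''): string
    {
        $where = $this->_cm_pre_getData($where);   // customisable step

        // Default to a plain single-table query unless the hook said otherwise.
        $from = $this->sql_from !== '' ? $this->sql_from : $this->tablename;

        $sql = "SELECT {$this->sql_select} FROM {$from}";
        if ($where !== '') {
            $sql .= " WHERE {$where}";
        }
        if ($this->sql_orderby !== '') {
            $sql .= " ORDER BY {$this->sql_orderby}";
        }
        return $sql;
    }

    // Default hook: change nothing.
    protected function _cm_pre_getData(string $where): string
    {
        return $where;
    }
}

// No customisation: the default query is used.
class Country extends TableBase
{
    protected string $tablename = 'country';
}

// Customised: the hook adds a column from a parent table via a JOIN.
class Person extends TableBase
{
    protected string $tablename = 'person';

    protected function _cm_pre_getData(string $where): string
    {
        $this->sql_select = 'person.*, pers_type.pers_type_desc';
        $this->sql_from   = 'person LEFT JOIN pers_type '
                          . 'ON (pers_type.pers_type_id = person.pers_type_id)';
        return $where;
    }
}

$default = (new Country())->buildSelect();
$custom  = (new Person())->buildSelect("person.person_id='FB'");
```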
I have seen such a thing proposed more than once, such as in Decoupling models from the database: Data Access Object pattern in PHP, and I am always surprised, even shocked, that so-called "professional" programmers can come up with such convoluted and complicated solutions. In my mind that is the total opposite of what should actually happen. In my methodology I *DO NOT* have a separate DAO for each table, I only have a separate DAO for each DBMS (MySQL, Postgresql, Oracle and SQL Server) where each can handle any table that exists. If you understand SQL you should realise that there are only four operations that can be performed on a database table - create, read, update and delete - so why would I duplicate those operations for each table when I can have a single object to handle any table?
Some people question the necessity of having a swappable DAO as once chosen the application's DBMS is rarely changed. The words "once chosen" should provide a clue - the framework supports a number of DBMS engines, so its users are able to make their choice before they start development.
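The idea of one DAO per DBMS, each able to handle any table, can be illustrated with the sketch below. All class and method names are hypothetical. The only DBMS-specific detail shown is identifier quoting (backticks for MySQL, double quotes for PostgreSQL); a real DAO would also execute the query through the relevant API.

```php
<?php
// Shared logic lives here; it works for any table because the table
// name and the column names arrive as data.
abstract class Dao
{
    abstract protected function quoteIdentifier(string $name): string;

    // Build a parameterised INSERT statement for any table.
    public function buildInsert(string $table, array $fieldarray): string
    {
        $columns = array_map(
            fn(string $column): string => $this->quoteIdentifier($column),
            array_keys($fieldarray)
        );
        $placeholders = array_fill(0, count($fieldarray), '?');

        return 'INSERT INTO ' . $this->quoteIdentifier($table)
             . ' (' . implode(', ', $columns) . ')'
             . ' VALUES (' . implode(', ', $placeholders) . ')';
    }
}

// One subclass per DBMS, containing only what differs between them.
class MysqlDao extends Dao
{
    protected function quoteIdentifier(string $name): string
    {
        return '`' . str_replace('`', '``', $name) . '`';
    }
}

class PgsqlDao extends Dao
{
    protected function quoteIdentifier(string $name): string
    {
        return '"' . str_replace('"', '""', $name) . '"';
    }
}

$row   = ['person_id' => 'FB', 'first_name' => 'Fred'];
$mysql = (new MysqlDao())->buildInsert('person', $row);
$pgsql = (new PgsqlDao())->buildInsert('person', $row);
```

Supporting another DBMS means adding one more subclass, not one more class per table.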
I have seen the instructions provided in other PHP frameworks for building new transactions, and I am amazed at how much effort is required. Too much manual effort, not enough automation.
In the RADICORE framework each user transaction requires the services of a number of components - a Controller, one or more Models, and a View. Each Controller performs a particular set of operations on its Model(s) and is tied to a particular screen structure which is produced by a particular XSL stylesheet, with all the possible combinations described in my library of Transaction Patterns. Building a new transaction requires the following simple steps:
I started off by performing these tasks by hand, but this grew rather tedious over time so I decided to automate it by adding some new functions to the Data Dictionary:
This function will then generate the relevant scripts and update the relevant tables in the MENU database. The new tasks are then available to be run. You can alter the screen layout by amending the screen structure file, and if necessary you can add "hook" methods to the table class file in order to apply additional business rules.
The only "difficulty" with this approach is deciding which Transaction Pattern to use in the first place, but as the framework download contains lots of samples this should become easier with experience.
In my early programming days there were no frameworks we could use, so everything had to be hard-coded and built from scratch. Once I had built my first framework with its own database this enabled these options to become more dynamic as they could be driven from the contents of various database tables. For example:
This is discussed further in A Role-Based Access Control (RBAC) system.
Other security features which are built into the framework are documented in The RADICORE Security Model.
It was common practice in my early programming days for all the menu screens to be hard-coded, which meant that they had to be designed and built up front, and any changes required that code to be amended. When I created my first framework in the 1980s I made the switch to a system of dynamic menus.
Each user transaction has its own record on the TASK table which then allows it to be added to either the MENU table or NAVIGATION-BUTTON table. The MENU table is used to create whatever menu structure is appropriate for your organisation.
When the contents of these two tables are displayed on the screen any tasks which are not accessible to the current user will be filtered out.
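The filtering step amounts to something like the following sketch. The array layouts, the `task_id` key, the sample task identifiers and the `filterMenu()` function are assumptions for illustration; in practice the menu items and the user's permissions would be read from the database tables.

```php
<?php
/**
 * Return only those menu items whose task the current user may access.
 */
function filterMenu(array $menuItems, array $accessibleTasks): array
{
    return array_values(array_filter(
        $menuItems,
        fn(array $item): bool => in_array($item['task_id'], $accessibleTasks, true)
    ));
}

// Menu items as they might be read from the MENU table.
$menuItems = [
    ['task_id' => 'person(list1)',  'label' => 'List People'],
    ['task_id' => 'person(add1)',   'label' => 'Add Person'],
    ['task_id' => 'invoice(list1)', 'label' => 'List Invoices'],
];

// Tasks which the current user's role is permitted to run.
$accessibleTasks = ['person(list1)', 'invoice(list1)'];

$visible = filterMenu($menuItems, $accessibleTasks);
```

Because the filter is driven by data, changing what a user can see requires a change to table contents, not to code.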
Using the RADICORE framework I am able to build new user transactions in minutes rather than hours because of my library of Transaction Patterns which provide all the boilerplate code which is necessary to put data into and get data out of the database. This leaves me with nothing to do but insert business logic into the pre-defined "hook" methods. It should therefore follow that when an analyst comes to write a detailed program specification for a programmer to follow that it should not be necessary to describe all that sharable boilerplate code as this never changes. The description of each Transaction Pattern covers such things as the look and feel of any screens or reports and how the program should behave. All that should be necessary should be as follows:
Years ago I read a complaint from some novice programmer who said that OOP is not suitable for database applications and that changing a table's structure was a complicated and long-winded process as it involved changing method signatures as well as all the places which called those signatures. In the 20 years that I have been building database applications using the OO capabilities provided by PHP I have never had such a problem, so I can only conclude that the problem does not lie with PHP or the principles of OOP but instead lies with the complainant's inability to make effective use of those capabilities.
I have been told time and time again by my critics that my methods are rubbish because I am not following "best practices", but I contend that the truth is the complete opposite, that my methods are superior simply because I do NOT follow those practices because I have found practices which are demonstrably better. I develop database applications where the software structure is always synchronised with the database structure, so I don't need to waste time with any Object-Relational Mappers. Instead I use my Data Dictionary to construct both the table class file and the table structure file. If I ever change a table's structure all I need to do is to re-import that table's structure into my Data Dictionary and then re-export that structure to replace the table structure file. I only ever have to amend code within a table class if an affected column is mentioned in any "hook" method. If I need to amend an HTML screen all I do is amend a screen structure file.
Because I can save time by NOT doing a lot of useless things I can then spend that time in doing useful things which add value to the application.
My critics love to tell me that because my approach to building software is different from theirs, that it doesn't follow the same rules as they have been taught, then it is not "right" so must be "wrong". If it is wrong then it must exhibit the attributes of wrong software such as being difficult to maintain or extend. The truth is completely the opposite. The fact that I have a separate concrete class for each database table which inherits a huge amount of sharable code from an abstract class has opened up a whole world of opportunities which would not have been available otherwise. All database operations are routed through the same code in the abstract table class and thereby through the relevant Data Access Object (DAO) for the current DBMS. This has enabled me to extend my framework as follows:
There are rules, there are guidelines, and there are personal preferences. Some of these personal preferences are the result of experience while others are the result of ignorance - they simply don't know any better. The "rule" that it is wrong to create a separate class for each database table was obviously devised by a person who learned OO theory before he learned how databases work, and has yet to come to grips with the fact that you have to tailor your solution to fit the problem domain. This is what Domain Driven Design is supposed to be about. I worked with databases for over 20 years before I learned OO, and I used that experience to help me write software that takes advantage of what encapsulation, inheritance and polymorphism have to offer. I never saw any descriptions of encapsulation which said that I could not create a separate class for each database table, so that's precisely what I did. This has never caused me any problems EXCEPT to attract the ire of certain individuals who claim that I am wrong yet fail to offer any substantial proof. As far as I am concerned my method works, so it cannot be wrong.
If a piece of software is communicating with tables in a database it makes perfect sense to me to have a separate software model for each of those tables which will aid in the communication, both to and from, with those tables. Each table has its own name, its own structure, its own relationships and business rules, so a separate class for each table would be the perfect way to encapsulate all of that information in a single place.
Producing Model objects which are all of the same "type" (i.e. database tables) means that I can put all the code which is common to all database tables in a single abstract table class which can then be inherited by every concrete table class. This is a lot of code which is inherited by every one of the 400+ tables in my application, so that is a lot of reusability. The use of an abstract table class then enables me to implement the Template Method Pattern which has long been regarded as a fundamental technique for code reuse.
No matter how large or small an enterprise application may be, each user transaction within it ends up by performing one or more operations on one or more database tables, and those operations are limited to Create, Read, Update and Delete. These operations are common to every database table and can therefore be defined within the abstract table class. Because these methods are called by Controllers, and because they exist within every concrete table class, I have created a library of Controllers (called Transaction Patterns) which can operate on any table class within the application through that OO mechanism called polymorphism. If I have 40 Controllers and 450 Models this means that I have 40 x 450 = 18,000 (yes, EIGHTEEN THOUSAND) opportunities for polymorphism.
When my critics tell me that I am breaking one of their precious rules they don't understand that breaking that rule has no adverse effect on my code. They cannot say "because you are doing that you cannot do this", so following that rule would not solve any problem for me. All it would do is force me to write additional code to achieve the same result, only differently, and if that effort is not rewarded with measurable benefits then in my universe that effort would be a total waste of time. I run a business where producing cost-effective software is the name of the game, so I prefer to spend my time in writing software that pleases my paying customers. I do not waste time in trying to write code that pleases the paradigm police with its purity as their definition of purity does not result in cost-effective software.
Not only does my approach NOT cause any problems, it actually opens up a lot of advantages which are completely closed to other methods.
My approach may ignore certain design patterns favoured by my critics, or their preferred implementations of those patterns, but there is no such thing as a definitive list of patterns, or fixed implementations of those patterns, which everyone must follow. Each programmer should be free to choose whatever patterns seem to be appropriate for their circumstances, and be free to implement them in whatever way they see fit. It is the end result that counts, not the effort that you expend to get those results. Being able to achieve better results than your rivals, and in shorter timescales, is what separates the competent from the cowboys.
Not only is it NOT wrong to have one class per database table, I firmly believe that it is the alternative approaches which are wrong as they create more problems than they solve.
Here endeth the lesson. Don't applaud, just throw money.
|18 Oct 2023||Added "I don't need to waste time writing detailed program specifications". Added "I don't need to waste time changing method signatures after changing a table's structure".|
|21 Aug 2023||Added "This approach cannot deal with complex associations".|
|01 Nov 2022||Added "Following the concept of "programming-by-difference"".|
|01 May 2021||Modified "Arguments against having a single class for multiple tables" to refer to "Why I don't do Domain Driven Design" for additional details.|
|08 Feb 2021||Put "Reusable XSL stylesheets" into a section on its own. Added "Arguments against having a single class for multiple tables". Added "Arguments against having a multiple classes for a single table".|
|02 Apr 2020||Modified several sections to include references to the Template Method Pattern.|
|01 Sep 2018||Added "Additional Advantages of my approach".|