This article is about writing database applications using Object Oriented Programming (OOP). I have been designing and building database applications since the late 1970s/early 1980s. My first programming language was COBOL followed by UNIFACE, but in 2002 I decided to switch from desktop applications to web applications, for which I chose PHP. This was also the first language I had used with Object Oriented (OO) capabilities. When I rebuilt my development framework in PHP using its OO features I found the transition very easy, and the results were better than I expected.
I have noticed that there is a huge difference between my approach and the approach that is being taught in numerous online articles, tutorials and books. My many critics keep telling me that my approach is so wrong that it borders on the heretical. This criticism can be boiled down into the following statements:
Bad software is difficult to read, difficult to maintain, difficult to enhance, full of bugs, does not satisfy user requirements, is not cost-effective, et cetera, ad infinitum, ad nauseam. People who have actually used my RADICORE framework to build database applications will tell a different story. I myself have used it to build several applications, and I can tell you that it works, it works very well, and has done so for over a decade.
This leads me to believe that what is being taught today as OO theory is nothing more than a pile of pooh which is preventing whole generations of newbies from becoming competent programmers. If there are problems with putting a theory into practice then you should be prepared to examine the theory itself and not just the way in which it is being implemented. I judge the efficacy of my work on its ability to produce results which satisfy my paying customers, not on how well it follows a bunch of artificial rules and satisfies a bunch of dogmatic box tickers. If I can produce a better result by breaking a rule then I will do so. After all, if a rule is supposed to promote "good" software and I can still produce good software after ignoring that rule, then what value does that rule really have?
Too many of today's OO advocates are operating under the illusion that their precious OO theory is at the center of the universe and that everything else revolves around it, or is a mere "implementation detail". This is pure bunkum of the highest order. I designed and built many database applications in the 20+ years before I switched to an OO-capable language, and I can tell you quite categorically that in such applications it is actually the database design which is the center of the universe and that everything else revolves around it. In such applications the database is king and it is the software which is an implementation detail. Get the database design wrong and you have an uphill battle trying to produce the right results. Get the database design right, and structure your software around that design, and writing the software will be a walk in the park. Been there, done that, bought the t-shirt. This sentiment is echoed in the following quote:
"Smart data structures and dumb code works a lot better than the other way around."
Eric S. Raymond, "The Cathedral and the Bazaar"
When writing a software application, regardless of its type, the code will be interacting with something outside of itself. This "something" may be a physical device, either mechanical or electrical, it may be an image file or a sound file, or it may be a database. Whatever this "something" is, it is important to note that the software should be modelled on it so that it can interact with it as logically and simply as possible. For example, you would not take the design for a building's elevator control system and use it as the design for an avionics system inside an aircraft. The design of the solution should be aligned around the problem, otherwise you will always have an uphill struggle.
The problem in the current world of OO is that everyone is taught that, regardless of the type of application being built, the software must ALWAYS be designed using the principles of Object-Oriented Design (OOD). This is supposed to be because it enables you to model the real world. This is an interesting idea, but like most of the concepts in OOP it can be mis-interpreted and mis-applied in ridiculous ways. The article Don't try to model the real world, it doesn't exist contains the following observation:
It is not possible for anyone to model the actual real world, only their perception of the real world.
You should also remember that you never model the whole of the real world, only those bits which are necessary to complete the task in hand.
Anyone who has ever worked with databases will tell you that the database itself must be designed using the principles of database normalisation, which then forces the unwitting application designer to use two different design methodologies for the same application - one for the database, and another for the software which accesses that database. This always leads to a condition known as Object-Relational Impedance Mismatch for which the universal cure is an abomination called an Object-Relational Mapper (ORM). I do not use OOD, I do not have the mismatch problem, so I don't need an ORM.
I was not aware that OOD existed when I started to write programs using the principles of OOP, so I never wrote software which hid the fact that it was communicating with a relational database. I write enterprise applications which do nothing but communicate with databases, and these applications contain thousands of tasks (user transactions), each of which does something to one or more database tables in order to achieve its result. Each task is therefore designed and developed around the fact that it is required to perform actions on one or more database tables, and the best result is achieved by completely ignoring the concepts of OOD and such things as the SOLID principles, and by writing code which knows that it is talking to a database. I wrote my RADICORE framework specifically to help me build and run database applications, and it does not matter how many tables are in the database or what they represent - they are all database tables which can be accessed in exactly the same way. This is why each of my 450+ concrete table classes inherits so much reusable code from my abstract table class. This is why the Transaction Patterns I use on one table can be reused on other tables - the operations are identical, it is just the tables and their contents which are different.
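The idea of an abstract table class whose generic operations are inherited unchanged by every concrete table class can be sketched as follows. This is a hypothetical illustration, not the actual RADICORE code; the class names and the faked SQL generation are invented for the example:

```php
<?php
// Hypothetical sketch: an abstract table class supplies the generic
// operations, and each concrete class does little more than identify
// its own table. Real code would execute the SQL; here we just build it.
abstract class GenericTable
{
    protected string $tableName;

    // The same select/insert logic serves every table.
    public function getData(string $where): string
    {
        return "SELECT * FROM {$this->tableName} WHERE {$where}";
    }

    public function insertRecord(array $data): string
    {
        $cols = implode(', ', array_keys($data));
        return "INSERT INTO {$this->tableName} ({$cols}) VALUES (...)";
    }
}

// Each concrete table class inherits all of the above unchanged.
class Customer extends GenericTable
{
    protected string $tableName = 'customer';
}

class Product extends GenericTable
{
    protected string $tableName = 'product';
}

$customer = new Customer();
echo $customer->getData('customer_id = 123'), "\n";
// SELECT * FROM customer WHERE customer_id = 123

$product = new Product();
echo $product->insertRecord(['product_id' => 'P1', 'name' => 'Widget']), "\n";
// INSERT INTO product (product_id, name) VALUES (...)
```

Because the operations live in the superclass, adding a table to the application means adding a near-empty subclass, which is how hundreds of concrete classes can share one body of code.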
In my early days as a programmer I was exposed to various sets of programming standards which varied in quality from "workable" to "sub-standard". Each organisation had its own standards, and sometimes different teams within the same organisation had their own standards. This prompted me to write Development Standards - Limitation or Inspiration?. I started to document my own personal standards which, in the light of actual experience, avoided the mistakes made by others. When I later became a team leader in another company my personal standards were adopted as the official company standards. The whole team was asked to vote on the standards, and mine were chosen by common consent.
When I became team leader I also had the opportunity to begin creating libraries of reusable code. Until that point my only option had been to copy and paste whole sections of code from one program to another as nobody in the team either recognised the need for such libraries or knew how to build and maintain them. These libraries were documented in Library of Standard Utilities and Library of Standard COBOL Macros.
This set of libraries was upgraded to a framework in 1985. This happened because a client asked for a system with dynamic menus instead of static ones, as well as a method of enforcing role based access control. In order to implement this I had to design a MENU database as well as a set of maintenance programs. I completed my design in a few hours one Sunday afternoon, and by Friday I had a working implementation. I later rewrote this framework in UNIFACE and again in PHP, for which I also produced a structure document. If you don't know the difference between a "library" and a "framework" then please read What is a Framework?
While working with UNIFACE I was introduced to XML documents and XSL Transformations, but although it was possible to transform XML documents into HTML pages the authors of UNIFACE chose to use their own proprietary mechanism, which I felt was clunkier and less flexible.
With COBOL all development was done using the 1-Tier Architecture in which the user interface, business logic and data access logic were all contained within a single component. We did not have the entire application in a single program, we had a separate component for different areas of the application. When I switched to UNIFACE v5 I was introduced to the 2-Tier Architecture in which all data access logic was split off into a separate component which was part of the language. This enabled the DBMS to be easily switched from one to another without having to change any business logic. When UNIFACE version 7.2.06 was released this was upgraded to the 3-Tier Architecture as it supported separate components for the Presentation (UI) layer and Business layer. A single component in the Business layer could be shared by any number of components in the Presentation layer. The built-in Data Access component was a blessing because developers did not need to write any code to access the database, but it was also a curse as it could only generate simple queries without JOINs. If anything complicated was needed it meant either using a database view or a stored procedure.
By "database application" I mean an application whose sole purpose is to allow one or more users to interact with the contents of a database. Such applications may also be known as enterprise applications as they are business-facing web applications rather than public-facing web sites, and are usually concerned with the manipulation of business data for organisations, both large and small, for such topics as sales order processing, purchase order processing, accounting, invoicing, payroll, inventory, shipments and such. This excludes software such as that which is embedded in or interfaces with hardware objects, or which deals with games, video/audio manipulation, compilers, operating systems and device drivers.
Having written database applications for 20+ years before using an OO-capable language I was familiar with database design and data normalisation. I also learned about Structured Programming and, courtesy of Michael A Jackson, that the software structure should follow the database structure as closely as possible. Imagine my surprise when I was told that "proper" OO requires the use of OO design and that the database design can be dismissed as merely an "implementation detail".
In the early days the business of software was referred to as Data Processing (DP) instead of the more modern Information Technology (IT), and what we wrote were called Data Processing Systems. Bearing in mind that in this context a system is something which takes input and processes it in some way to produce output, a data processing system can be represented by the diagram in Figure 1:
Figure 1 - A Data Processing System
In this example a user (person) enters data into the system using electronic forms on an input device, and that data is processed in some way before it gets stored in a database. Data can also be extracted from the database, then processed in another way before it gets displayed on the user's device. In the early days the software applications were compiled and run on a server, and communicated with the user via "green screen" or dumb terminals which operated in either character mode or block mode. Modern systems use intelligent PCs in place of those dumb terminals, which allows for the following possibilities:
Figure 2 - A Web Application
It is also possible for the human being to be replaced by another computer system which can access the application via web services. This involves the transmission and receipt of documents in either XML or JSON format which have superseded the earlier EDI formats such as EDIFACT.
In my long career I have dealt with such storage systems as flat files, indexed files, Hierarchical databases (such as Data General's INFOS), Network databases (such as Hewlett Packard's IMAGE/TurboIMAGE), and Relational databases such as MySQL, PostgreSQL, Oracle and SQL Server. I am therefore no stranger to the concept of Data Normalisation and Entity-Relationship (ER) models.
If you look at Figure 1 you will see that at one end of the "system" is a user with a device which displays data using some sort of forms-based mechanism (such as compiled GUIs or HTML pages), while at the other end the data is stored (or persisted) in a mechanism known as a database. In between the two sits the software application which deals with the data as it flows between the two ends, and deals with any business or formatting rules on the way. The language and/or paradigm which is used to create this software should be completely irrelevant as the results - what is shown to the user and what appears in the database - should always be the same.
In my early days it was quite common to produce software as a series of components or modules which could be worked on separately, and which could be linked together to form a whole application. Each of these modules contained the code for dealing with the user interface, business rules, and database access logic in a single unit, but while working with UNIFACE I was exposed to the 2-Tier Architecture and then the 3-Tier Architecture, which splits the code into separate parts, also known as layers or tiers, where each part deals with a separate area of logic. This structure is shown in Figure 3:
Figure 3 - the 3 Tier Architecture
After using this structure for a while its benefits became obvious to me, which is why I chose to continue using it when I switched to a different language.
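The separation described above can be sketched in a few lines of PHP. This is a minimal illustration with invented class names, where each layer talks only to the layer directly below it:

```php
<?php
// A minimal sketch of the 3-Tier split (class names invented):
// Presentation -> Business -> Data Access.

class DataAccessObject            // Data Access layer
{
    public function select(string $table): array
    {
        // A real implementation would build and execute SQL against a
        // DBMS; here we return a stub row to keep the sketch runnable.
        return [['id' => 1, 'name' => 'example']];
    }
}

class BusinessObject              // Business layer
{
    public function __construct(private DataAccessObject $dao) {}

    public function getData(string $table): array
    {
        $rows = $this->dao->select($table);
        // Business rules would be applied to $rows here.
        return $rows;
    }
}

class Presentation                // Presentation layer
{
    public function __construct(private BusinessObject $bo) {}

    public function render(string $table): string
    {
        // A real implementation would build an HTML page;
        // JSON stands in for the output here.
        return json_encode($this->bo->getData($table));
    }
}

$page = new Presentation(new BusinessObject(new DataAccessObject()));
echo $page->render('customer'), "\n";
```

Because the Business layer knows nothing about HTML and the Data Access layer knows nothing about business rules, either can be swapped out (a different DBMS, a different output format) without touching the other layers.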
When I came to learn OOP in late 2001 and early 2002 the resources which were available on the internet were small in number and far less complicated than they are today. All I had to go on was a description of what made a language object oriented in this Wikipedia article from October 2001, which stated the following:
Object Oriented Programming (OOP) is a software design methodology, in which related functions and data are lumped together into "objects", convenient metaphors, often mirroring real world things or concepts.
To do OO programming you need an OO-capable language, and a language can only be said to be object oriented if it supports encapsulation (classes and objects), inheritance and polymorphism. It may support other features, but encapsulation, inheritance and polymorphism are the bare minimum. That is not just my personal opinion; it is also the opinion of the man who invented the term. In addition, Bjarne Stroustrup (who designed and implemented the C++ programming language) provides this broad definition of the term "Object Oriented" in section 3 of his paper called Why C++ is not just an Object Oriented Programming Language:
A language or technique is object-oriented if and only if it directly supports:
- Abstraction - providing some form of classes and objects.
- Inheritance - providing the ability to build new abstractions (classes) out of existing ones.
- Runtime polymorphism - providing some form of runtime binding.
So, according to those experts, a computer language can only be said to be Object Oriented if it provides support for the following:
Class
A class is a blueprint, or prototype, that defines the variables (data) and the methods (operations) common to all objects of a certain kind. It can be instantiated into an object and extended to form a new class.

Object
An instance of a class. A class must be instantiated into an object before it can be used in the software. More than one instance of the same class can be in existence at any one time.

Encapsulation
The act of placing data and the operations that are performed on that data in the same class. The class then becomes the 'capsule' or container for the data and operations. This binds together the data and the functions that manipulate that data. More details can be found in Object-Oriented Programming for Heretics.

Inheritance
The reuse of base classes (superclasses) to form derived classes (subclasses). Methods and properties defined in the superclass are automatically shared by any subclass. A subclass may override any of the methods in the superclass, or may introduce new methods of its own. More details can be found in Object-Oriented Programming for Heretics.

Polymorphism
Same interface, different implementation. The ability to substitute one class for another. By the word "interface" I do not mean object interface but method signature. This means that different classes may contain the same method signature, but the result returned by calling that method on a different object will be different, as the code behind that method (the implementation) is different in each object. More details can be found in Object-Oriented Programming for Heretics.

Abstraction
The process of separating the abstract from the concrete, the general from the specific, by examining a group of objects looking for both similarities and differences. The similarities can be shared by all members of that group while the differences are unique to individual members. The result of this process should then be an abstract superclass containing the shared characteristics and a separate concrete subclass to contain the differences for each unique instance. More details can be found in Object-Oriented Programming for Heretics.
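The polymorphism entry above can be made concrete with a short PHP sketch (the class names here are invented for illustration):

```php
<?php
// Polymorphism as defined above: two classes share the same method
// signature, so the calling code can be handed either object without
// knowing or caring which one it has.

class Invoice
{
    public function describe(): string
    {
        return 'I am an invoice';
    }
}

class Payment
{
    public function describe(): string
    {
        return 'I am a payment';
    }
}

// The same calling code works on any object with that signature; the
// result differs because the implementation behind the method differs.
function report(object $item): string
{
    return $item->describe();
}

echo report(new Invoice()), "\n"; // I am an invoice
echo report(new Payment()), "\n"; // I am a payment
```

Note that no `interface` keyword is needed here: in a dynamically typed language the shared method signature alone is enough for one object to be substituted for another.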
But why make the switch from procedural to OO programming? What are the benefits? After reading a few more articles on the internet I came across various descriptions which I summarised into the following:
Object Oriented Programming is programming which is oriented around objects, thus taking advantage of Encapsulation, Inheritance and Polymorphism to increase code reuse and decrease code maintenance.
Since those early days a lot of people have added to the definition of OOP, but as far as I am concerned all these additions are nothing but optional extras, and as I can find no practical use for any of them I choose to ignore them. This upsets a lot of purists who seem to believe that by not incorporating all these optional extras into my applications I am not following "best practices", not obeying the "rules", that I do not understand OOP, that I am not a "proper" OO programmer, and that my work must therefore automatically be inferior. What a load of rubbish! I write code which produces results, not code which follows an arbitrary set of artificial rules. I write code to please my paying customers, not to impress my fellow developers. I do not write code using any of these "optional extras" simply because I can write effective software without them. I have looked into using some of them, but as they would not add measurable value to my code I have to ask myself "why bother?"
The simple answer is that Too Many Cooks Spoil the Broth. Once you get an ever-expanding group of people deciding on how something should be done, such as deciding on a particular recipe, you will always get multiple and different opinions, with the person with the loudest voice/biggest mouth coming out on top. This is different from Many Hands Make Light Work which refers to implementing a design decision. Getting a hundred men to dig a trench is easy, but getting a hundred men to decide where to dig the trench, and how deep and how wide it should be, is a different matter altogether.
While the original concept of OO was perfectly sound, over the years many different "cooks" have added their personal opinions to the recipe and turned it into a humongous mess. It is interesting to note that the languages which were originally used to demonstrate OO concepts, such as SIMULA and SMALLTALK, never made it as mainstream languages. This would indicate to me that their implementations may have been theoretically pure, but were not accepted for practical reasons. The first object-oriented language to be widely used commercially was C++ which was created by starting with the already widely-used C language, which was procedural, and adding in OO capabilities. The original version was called "C with classes".
While this procedural language was enhanced with the addition of encapsulation, inheritance and polymorphism, other features that already existed in the language, or were added later, were taken by some programmers to be part of the OO paradigm and therefore considered to be essential to OO programming. These include the following:
Other OO languages, such as Java, were developed from scratch, and the ability to write procedural code was removed in favour of the idea that everything is an object. Other features were added on top of encapsulation, inheritance and polymorphism, and it has been assumed by too many developers that these additional features are essential in order to do OOP properly. In my opinion they are not essential, they are nothing more than optional extras.
It is interesting to note that Alan Kay, who invented the term "Object Oriented" had this to say about these implementations of his idea:
Having spent several decades writing database applications for the enterprise with non-OO languages, when I wanted to switch to providing such applications for the web I looked for an easy-to-use yet popular web-capable language. I did not like the look of Java, so I chose PHP. I have never regretted this choice. I started with PHP 4 which was based on a successful procedural language, but which included full support for the essential characteristics of OO which were (and still are) nothing more than Encapsulation, Inheritance and Polymorphism.
There are many different languages now available which provide support for OOP. They are different simply because they were designed by different people with different objectives in mind. There is no such thing as a "one size fits all" language, which also means that there is no such thing as a "one size fits all" implementation of OOP. Some languages are compiled while others are interpreted. Some languages are multi-purpose while others have only a single purpose. Some languages are statically typed while others are dynamically typed. Java, for example, is multi-purpose, compiled and statically typed while PHP is limited to web applications, interpreted and dynamically typed. Because these languages are different they tend to do similar things in different ways with different syntax. Some things which are necessary in some languages are either optional or non-existent in others.
One of the most disturbing things I have noticed with the PHP language from version 5 upwards is that new OO features have been added for no reason other than "other languages have X, so PHP should have X too". These people fail to realise that their previous language may have included feature X either because some bright spark thought that it would be clever, because it would save a few keystrokes, or because it solved a problem that existed at that particular time with that particular language. A typical example of this is the interface, which was devised to get around a problem with compiled languages and slow processors in the 1980s, a problem which no longer exists in the 21st century. PHP 4 did not support interfaces, so they were not important for OOP, yet they were added in PHP 5 to placate the OO purists even though they are totally unnecessary. This is what Rasmus Lerdorf, who invented PHP, had to say about this on the internals list:
Rather than piling on language features with the main justification being that other languages have them, I would love to see more focus on practical solutions to real problems.
Here is a quotation taken from the Revised Report on the Algorithmic Language Scheme.
Programming languages should be designed not by piling feature on top of feature, but by removing the weaknesses and restrictions that make additional features appear necessary.
I firmly believe that programming is an art, not a science, which means that a person must have the right artistic talent or aptitude to begin with otherwise they are wasting their time, they are flogging a dead horse, they are barking up the wrong tree. This means that it is simply not possible for an artist to document what he does in such a way that he can pass on that talent to a non-talented individual. Yet there are too many people out there who seem to think that they can do just that, which is why they try to break down the art of programming into a series of "rules", "principles" and "best practices". They seem to think that they can reduce artistic skill to a series of items on a sheet of paper which can be ticked off one by one, and if they come across something which does not tick all the right boxes they automatically assume that it is faulty or incorrect. The opposite is also true - if they come across something which does tick all the right boxes, which does follow their "rules" or "principles", then they automatically assume that it must be correct. This is pure fallacy. This sentiment was echoed in the blog post When are design patterns the problem instead of the solution? in which T. E. D. wrote:
My problem with patterns is that there seems to be a central lie at the core of the concept: The idea that if you can somehow categorize the code experts write, then anyone can write expert code by just recognizing and mechanically applying the categories. That sounds great to managers, as expert software designers are relatively rare.
The problem is that it isn't true. You can't write expert-quality code with only "design patterns" any more than you can design your own professional fashion designer-quality clothing using only sewing patterns.
As far as I can see the programming world is populated by two kinds of artist - true artists and con artists. A true artist is able to describe what he does in simple terms, whereas a con artist will try to dress it up in flowery language in order to disguise his lack of understanding and/or ability. These are members of what I call the Let's Make It More Complicated Than It Really Is Just To Prove How Clever We Are brigade. The trouble is that a statement or concept which starts off in a simple and unambiguous form can be reprocessed by a series of these con artists and end up looking nothing like the original. This is similar to the game of Chinese Whispers. If you look at Abstraction, Encapsulation, and Information Hiding by Edward V Berard of the Object Agency you will see where different authors have published different descriptions for certain basic OO concepts, but by using different words they have produced different interpretations which, after numerous iterations and re-interpretations, eventually end up as mis-interpretations and mis-representations. Instead of offering clarification they have muddied the waters. Instead of keeping things simple they have added unnecessary layers of complexity.
This leads me to believe that my implementation of OOP has been more successful simply because I started off with the original, basic and valid descriptions of OOP and was not distracted or led astray by the numerous mis-interpretations which followed, mainly because I did not know that so many mis-interpretations actually existed and had never read them. Now that I do know, I can dismiss them as the work of charlatans and con artists because my decades of experience have shown me that I have been able to use the bare-bones principles of OOP - that of encapsulation, inheritance and polymorphism - to meet the objectives of OOP - to increase code reuse and decrease code maintenance.
Now look at the following quotes from various wise men:
1. Some people know only what they have been taught, while others know what they have learned.
2. Experience is not what you've done, but what you've learned from what you've done.
3. To effectively apply practices, you need to understand the principle, but to understand the principles, you need to practice!
4. Experience helps to prevent you from making mistakes. You gain experience by making mistakes.
My path to OOP has comprised the following:
Now compare that with the path of today's novice programmer:
Jacob Gabrielson had this to say in his article How poopy is YOUR code?:
The problem is that programmers are taught all about how to write OO code, and how doing so will improve the maintainability of their code. And by "taught", I don't just mean "taken a class or two". I mean: have it pounded into their heads in school, spend years as a professional being mentored by senior OO "architects" and only then finally kind of understand how to use it properly, some of the time. Most engineers wouldn't consider using a non-OO language, even if it had amazing features ... the hype is that major.
If you think of the contents of a programming language - its syntax, expressions, operators, control structures and functions - as being similar to the contents of a children's construction set, you should see that they both consist of tiny building blocks or component parts which can be assembled in a multitude of different ways in order to produce a variety of different results. While each component has a particular function and may have certain rules when it comes to interacting with other components, when it comes to assembling them in order to produce something there are absolutely NO rules when it comes to how that "something" may be assembled. This is entirely up to the imagination of the budding constructor and is NOT limited by the imagination of the person who devised the construction set. By telling their students that there is only one way to assemble the language components into a finished piece of software the teachers are overstepping the mark and actually failing in their duty:
The result of this substandard teaching is a bunch of Monkey See, Monkey Do programmers who know nothing except Cargo Cult Programming. These people will become expert in writing applications that fail, that go over budget, that are monsters to develop and monsters to maintain.
When I write software I only allow myself to be constrained by three things:
I refuse to be constrained by the limitations of someone else's intellect as that would be like going back to a bygone age. Progress only comes from innovation, not imitation, which means that trying different methodologies or techniques should be encouraged and not discouraged. Those who criticise me for being "different" are therefore like Luddites who resist the march of progress.
If you have ever heard the expression learn to walk before you run you should realise that it would be best for students to learn how to write software using the bare essentials before trying and evaluating each of those optional extras. I emphasise "and evaluating" because each of those optional extras should be individually tested in order to see what effort is required to implement them and what benefits, if any, are actually obtained. It is not sufficient to use a feature "because it is there"; it should only be used if it is actually proven to add value to the result.
In his article Talkers and Doers Chris Baus identifies two types of people in the world of software:
In this article he points out the following:
Software isn't about methodologies, languages, or even operating systems. It is about working applications.
...
As an application developer, when you evaluate a new tool or technique, you should always ask yourself, "How can this make my application better, or help me develop it more quickly?" If you can't quantify the advantages of a tool, your time is probably better spent actually building software.
I have followed that advice by creating working software using the simplest of techniques, yet I am constantly being berated for being "too simple" and for not following the advice given to me by all these architecture astronauts. They tell me that my software must be wrong because it's breaking all their rules, but they fail to realise that it works (and sells!), therefore it cannot be wrong.
An open-minded person would assess the validity of a different interpretation by comparing its results with what they have produced themselves. But my critics are not open minded. They are dogmatists who believe that what they have been taught is the "only way, the one true way" and that anybody who strays from the path of righteousness is a deviant and a heretic. Such people then struggle to combine the concepts of OOP and database theory because they do not realise, or do not accept, that database theory is superior to OO theory every day of the week. When they see what I have done they don't say "It's different, but it works, therefore it cannot be wrong". Instead they say "It's different, but as it is not allowed to be different it must be wrong".
When I came to build my new framework in PHP I had three particular objectives in mind:
After downloading and installing PHP onto my home computer I quickly ascertained that PHP was perfect for the job. It was purpose-built for web applications, was already widely used, was easy to learn due to the large number of online resources available, and it was open source and therefore free to use. I soon discovered how easy it was to write the code to deal with creating XML documents and performing XSL transformations, so I felt that I was heading in the right direction.
When it came to the 3-Tier Architecture I knew that the PHP language did not contain any functions which mentioned that architecture, so it was all down to how I split up my code.
After reading the PHP manual and how it supported OOP I could see how to deal with encapsulation, inheritance and polymorphism, so I started to write the code to deal with my first database table. The first thing that I noticed was that OOP automatically forces you to use at least two components - a class which contains methods and properties, and another component which instantiates that class into an object, then calls various methods on that object in order to obtain a result. I could immediately see that one of these two components would sit comfortably in the Presentation layer while the other would sit in the Business layer. I was not bothered about creating a separate Data Access Object at this point as I first wanted to create code that worked before I split it off into its own component.
Why did I start by creating a class to deal with a single database table instead of a group of tables? Why not? My decades of experience had taught me that an enterprise application consists of large numbers of database tables, each with its own set of columns, plus large numbers of user transactions or tasks, each of which does something with one or more of those tables. The idea of creating a class which was not directly associated with a single database table never entered my mind as I could not see any logical reason for it. In a database each table is a separate entity in its own right, and there is no such concept as a group of tables (except for grouping tables in different databases). There is no concept such as being forced to go through TableA to get at TableB. As far as I am concerned, when I am writing a database application I am writing software which interacts with objects in a database, not physical objects in the real world, and database objects are called "tables". If a table holds data about a real-world object called "Person" I do not concern myself with the properties and methods of a real-world person as I am only interested in the properties and methods that apply to the "Person" table. Every object in the database is a table regardless of what real-world entity it represents. Every table has properties which are called "columns", and every table shares a standard set of operations named INSERT, SELECT, UPDATE and DELETE (often referred to as CRUD), so in my application I have a software object (class) for each table, and each class shares the same method names. What could be more logical than that?
You should also be aware that in the real world there are two types of object:
Regardless of whether a real-world object is active or inactive, alive or inert, in the database it is just a table, and can be operated on just like every other table. It cannot do anything by itself, it cannot generate requests, it can only respond to requests from external sources.
I also avoided the mistake of giving the Presentation layer component nothing but a call to a single $object->execute() method and having the Business layer component do everything itself. Why? Because that would violate the reasoning behind the 3-Tier Architecture.
When I created the methods for this table class it was also a no-brainer for me. Anyone who knows how databases work knows that there are only four operations that can be performed on a table - Create, Read (Select), Update and Delete - which are commonly referred to as CRUD. I did not make the mistake that I have seen so many others make by having separate methods to load, validate and store the data - I have a single insertRecord(), getData(), updateRecord() and deleteRecord() method which calls a series of sub-methods each of which performs a separate processing step. You might understand this better by looking at some UML diagrams.
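The idea of one public method per CRUD operation, each calling a series of sub-methods, can be sketched as follows. This is a simplified illustration only; the sub-method names and the validation rule are invented here, not the framework's actual code:

```php
<?php
// Sketch: a single insertRecord() method which calls a series of
// sub-methods, each performing one step of the process.
class Table
{
    public $errors = [];

    public function insertRecord(array $fieldarray): array
    {
        // step 1: validate the incoming data
        $fieldarray = $this->validateData($fieldarray);
        // step 2: only write to the database if validation passed
        if (empty($this->errors)) {
            $fieldarray = $this->dmlInsert($fieldarray);
        }
        return $fieldarray;
    }

    protected function validateData(array $fieldarray): array
    {
        // an invented rule: reject any record without a primary key value
        if (empty($fieldarray['id'])) {
            $this->errors['id'] = 'A value for ID is required';
        }
        return $fieldarray;
    }

    protected function dmlInsert(array $fieldarray): array
    {
        // in real life this would build and execute an SQL INSERT query;
        // here we simply mark the record as having been inserted
        $fieldarray['inserted'] = true;
        return $fieldarray;
    }
}

$table  = new Table();
$result = $table->insertRecord(['id' => 1, 'name' => 'Fred']);
```

Because each step is a separate sub-method, any one of them can be refined later without disturbing the single public entry point.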
When I created the properties for the table class I avoided the practice of having a separate property for each column. I noticed in my early testing that when data is sent in from the client device that it is presented in the form of the $_POST variable which is an associative array. I also noticed that when reading data from the database that each row is presented as an associative array. I liked the array processing in PHP, so as it was just as easy to access column data in an array I saw no reason to split either of those arrays into separate variables. That is why each table class has a single variable called $fieldarray instead of a separate variable for each column. This means that I don't have any setters (mutators) or getters (accessors) for individual columns, and this helps me achieve loose coupling which is supposed to be an important objective in OOP. Having a single array variable to hold row data also made it possible to extend it to an indexed array of associative arrays in order to handle data for more than one row at a time. Some people think that this is not allowed, but as that rule does not exist in my universe I ignore them.
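The effect of holding all column data in a single $fieldarray, rather than one property per column, can be illustrated like this (a minimal sketch with an invented "Person" class, not the framework's actual code):

```php
<?php
// Sketch: one array property instead of a setter/getter per column.
// The whole $_POST array (or a database row) can be passed in as-is.
class Person
{
    protected $fieldarray = [];

    public function setFieldArray(array $fieldarray): void
    {
        $this->fieldarray = $fieldarray;   // no per-column setters needed
    }

    public function getFieldArray(): array
    {
        return $this->fieldarray;
    }
}

$person = new Person();
// in a live application this argument would be $_POST
$person->setFieldArray(['first_name' => 'Fred', 'last_name' => 'Bloggs']);

// adding a column to the table requires no change to this class at all
$person->setFieldArray(['first_name' => 'Fred',
                        'last_name'  => 'Bloggs',
                        'dob'        => '1990-01-01']);
```

Note that adding or removing a column never forces a change to the class's method signatures, which is precisely the loose coupling being described.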
It was also obvious to me that in order to implement the 3-Tier Architecture all code to format the data into HTML belonged in the Presentation layer, not the Business layer. This then meant that the table class did nothing but supply the calling class with raw data, and it was up to the calling class to transform that data into HTML. This I achieved by creating a single reusable component which turned the raw table data into XML, then transformed it into HTML using an XSL stylesheet. Because I then had three components, each of which performed a separate part of the processing, this prompted a colleague to remark that what I had done was provide an implementation of the Model-View-Controller design pattern. After reading a description of this pattern I could see that this was indeed true. This combination of the 3-Tier Architecture and the MVC design pattern resulted in the structure shown in Figure 4:
Figure 4 - The MVC and 3-Tier architectures combined
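As a rough illustration of the View's first job, here is how a row of raw table data might be turned into XML before the XSL transformation. This is a hand-rolled sketch; the framework's actual component is far more extensive, and in practice PHP's DOM extension and XSLTProcessor would be used for the transformation itself:

```php
<?php
// Sketch: turn an associative array of raw table data into XML,
// ready to be fed into an XSL transformation.
function rowToXML(string $table, array $row): string
{
    $xml = "<{$table}>";
    foreach ($row as $column => $value) {
        // escape special characters so the output is well-formed XML
        $value = htmlspecialchars((string)$value, ENT_XML1);
        $xml  .= "<{$column}>{$value}</{$column}>";
    }
    $xml .= "</{$table}>";
    return $xml;
}

$xml = rowToXML('person', ['id' => 1, 'name' => 'Fred & Co']);
// $xml is now '<person><id>1</id><name>Fred &amp; Co</name></person>'
```

The point of the sketch is that the Model supplies nothing but raw data; everything to do with markup happens in this reusable component.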
As soon as I discussed this implementation in a newsgroup I was subjected to the usual criticism and verbal abuse, which, as usual, failed to persuade me to change my heretical ways.
I also created a separate script for each user transaction (list, search, create, read, update and delete) rather than having a single script which could perform all of those transactions. This was because of previous experience which showed the benefits of single-purpose modules instead of multi-purpose modules.
I also avoided using a Front Controller as I had no idea that such a stupid notion existed outside of compiled languages.
After writing the code which dealt with the first database table I then set about creating the scripts which did exactly the same thing for the next database table. The phrase did exactly the same thing should instantly signal to a wide-awake programmer that perhaps a reusable pattern could be emerging. To start with I took the first table class, copied it, then manually went through it and changed all the hard-coded references from table1 to table2. Although the code worked I could see that there was a huge amount of duplication, so I set about creating code that could be reused instead of duplicated. The recognised mechanism for reusing code between different classes is inheritance, so I created an abstract table class which was then inherited by each of my concrete table classes. I then moved all duplicated code from the concrete classes into the abstract class, then tested each script to make sure that it still performed its function. When I finished this exercise I discovered that each of my concrete table classes consisted of nothing but a constructor which could be built from the following template:
Code sample #1 - Turning the abstract class into a concrete class
<?php
require_once 'std.table.class.inc';
class #tablename# extends Default_Table
{
    // ****************************************************************************
    // class constructor
    // ****************************************************************************
    function __construct ()
    {
        // save directory name of current script
        $this->dirname   = dirname(__FILE__);
        $this->dbname    = '#dbname#';
        $this->tablename = '#tablename#';
        // call this method to get original field specifications
        // (note that they may be modified at runtime)
        $this->fieldspec = $this->loadFieldSpec();
    } // __construct
    // ****************************************************************************
} // end class
// ****************************************************************************
?>
Some people look at this code and exclaim that, because it is so small, I must have done nothing but create an anemic data model. They fail to spot the amount of code which is inherited from the abstract table class. Those who look at this abstract table class then exclaim that, because it is so large, it surely must be a God object. By reaching such conclusions simply by counting the number of methods or lines of code they are proving that they can count, but not that they can read or understand what they read. The abstract table class is large because it contains all the methods that could possibly be used in a table class. These methods are a mixture of invariant and variable methods as defined in the Template Method Pattern. The concrete class is small because all it need do is identify a particular database table with a particular structure. If you think about it carefully you should see that by combining something which is "too big" with something that is "too small" what I end up with is something which is "just right".
In my original implementation the constructor contained hard-coded definitions of all the fields/columns which existed in that table plus all the specifications (type, size, et cetera) for each of those fields. I later replaced this with a method call which loaded those definitions from an external file. This meant that I could regenerate the contents of that external file without having to modify the contents of any existing class file.
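The mechanics of loading the field specifications from a regenerable file can be sketched like this. The file name, layout and field specification keys below are invented purely for illustration:

```php
<?php
// Sketch: the Data Dictionary export writes the table structure to its
// own file, which is loaded at runtime. Regenerating this file never
// touches the (possibly customised) class file.
$structureFile = sys_get_temp_dir() . '/person.dict.inc';

// what the export step might generate
$contents = <<<'PHP'
<?php
$fieldspec = [
    'id'   => ['type' => 'integer', 'required' => true],
    'name' => ['type' => 'string',  'size' => 40],
];
PHP;
file_put_contents($structureFile, $contents);

// what loadFieldSpec() might do instead of hard-coding the specs
function loadFieldSpec(string $file): array
{
    require $file;        // the generated file defines $fieldspec
    return $fieldspec;
}

$fieldspec = loadFieldSpec($structureFile);
```

The key design point is the separation: the generated file can be overwritten at any time, while hand-written customisations live elsewhere.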
Every time I post a description of my approach to OOP on the web I am immediately subjected to a barrage of criticism and abuse. A selection of such criticisms, with my responses, can be found at:
Some of these criticisms are discussed below.
This statement, which is discussed (and dismissed) in greater detail in OO Design is incompatible with Database Design, shows a failure in the abilities of the author, not OOP. As far as I am concerned any competent programmer who knows how to design and build database applications should have absolutely no difficulty in switching to the OO paradigm, and any programmer who cannot is, in my book, simply not competent. I personally have written database applications using various different languages and paradigms - COBOL (procedural), UNIFACE (Model Driven and Event Driven) and PHP (OO) - and with each new language and paradigm I have been able to develop similar applications at a faster rate and therefore a lower cost. If I can do it then it cannot be that difficult. Those programmers who struggle are simply bad workmen who are blaming their tools.
As documented in What is/is not considered to be good OO programming, after publishing my views in an internet forum I received a torrent of damning criticism amongst which were these remarks:
1. Having a separate class for each database table is not good OO.
2. Abstract concepts are classes, their instances are objects. IMO The table 'cars' is not an abstract concept but an object in the world.
3. Classes are supposed to represent abstract concepts. The concept of a table is abstract. A given SQL table is not, it's an object in the world.
This person obviously did not understand what he wrote, otherwise he would have seen a very close correlation with my code.
Criticism | Response |
---|---|
The concept of a table is abstract. | That is why I have an abstract table class which contains methods and properties which can be shared by any unspecified database table. |
A given SQL table is not, it's an object in the world. | That is why I create a concrete class for each physical table by combining the abstract table class with the specific details of that particular table. |
My use of an abstract table class which is inherited by every concrete table class has allowed me to implement the Template Method Pattern on a large scale. Every method called by a Controller on a Model is a template method where the superclass contains the unvarying parts of an algorithm, and each subclass can override the varying methods in order to add their own unique behavior at points of variability.
My abstract table class, which is so huge that it has been called a "God" class, is then inherited by every one of the 450+ table classes which exist in my main enterprise application. This means that each of my table classes is then quite small, which means that they have been called anemic classes. These people are applying an artificial rule which states that no class should have more than "N" methods, and no method should have more than "N" lines of code (where "N" is any number that you care to pull out of your hat on that particular day). Such a rule does not exist in my universe, so I ignore it. What matters to me is that I am sharing a huge amount of code through inheritance, and that, to me, is far more important than being able to count to 10. Unlike some of my critics I can actually count higher than 10 without having to take my shoes and socks off.
Jeff Atwood's article Why Objects Suck contains the following statement which indicates that providing polymorphism in a database application is very difficult:
A typical business problem is the converse of a typical object-oriented problem. Business problems are generally interested in a very limited set of operations (CRUD being the most popular). These operations are only as polymorphic as the data on which they operate. The Customer.Create() operation is really no different behaviorally than Product.Create() (if Product and Customer had the same name, you could reuse the same code modulo stored procedure or table name), however the respective data sets on which they both operate are likely to be vastly different. As collective industry experience has shown, handing polymorphic data with language techniques optimized for polymorphic behavior is tricky at best. Yes, it can be done, but it requires fits of extreme cleverness on the part of the developer. Often those fits of cleverness turn into fugues of frustration because the programming techniques designed to reduce complexity have actually compounded it.
If polymorphism relies on different objects sharing the same method signature, why on earth is he linking it to data instead of methods? What on earth is "polymorphic data"? If it requires "fits of extreme cleverness" to achieve polymorphism in a database application then does that make me extremely clever? I suggest that he follow my example and, instead of using methods such as Customer.Create() and Product.Create(), he use something like the following:
$object = new $table;
$result = $object->insertRecord($_POST);
The value for $table is supplied at runtime, and can be the name of any of the 450+ tables in my application. This works because every one of those classes inherits from my abstract table class which contains the generic ->insertRecord() method. This method will take the data it is given (the entire contents of the $_POST array) and validate it using the column specifications for that table, and if that validation passes it will call the ->insertRecord() method on the Data Access Object for that particular DBMS in order to generate an SQL "insert" query to add that data to the database. This simple technique did not require "fits of cleverness", nor did it turn into "fugues of frustration". I have been using this technique for over 10 years, and it has definitely helped, not hindered, my ability to maintain and extend my ever-growing enterprise application which I sell as a package to corporations all over the world.
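A minimal runnable sketch of this technique follows. The class names and the string returned by insertRecord() are simplified stand-ins for the framework's own code, which performs real validation and delegates to a Data Access Object:

```php
<?php
// Sketch: every table class inherits a generic insertRecord(), so the
// same two lines of calling code work for any table named at runtime.
abstract class Default_Table
{
    protected $tablename;

    public function insertRecord(array $fieldarray): string
    {
        // a real version would validate $fieldarray against the table's
        // field specifications, then hand it to a Data Access Object;
        // here we just show that one method serves every table
        $columns = implode(', ', array_keys($fieldarray));
        return "INSERT INTO {$this->tablename} ({$columns}) ...";
    }
}

class Customer extends Default_Table
{
    protected $tablename = 'customer';
}

class Product extends Default_Table
{
    protected $tablename = 'product';
}

// $table could be the name of any table class known only at runtime
$table  = 'Customer';
$object = new $table;
$result = $object->insertRecord(['id' => 1, 'name' => 'Fred']);
```

This is polymorphism in its plainest form: one method signature, many classes, and the caller neither knows nor cares which concrete class it is talking to.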
The most common reason I hear as to why OOP and relational databases don't mix is because SQL is not object oriented, but this is a pure red herring. HTML is not object oriented either, but that does not stop legions of programmers from writing web applications using OOP. All the essential features of OOP do actually exist in relational databases, but they have different names and are implemented differently. Below is a summary of the comparisons which I documented in Object Relational Mappers are Evil:
Complaint | Response |
---|---|
Databases do not have classes. | A class is the blueprint that defines objects of a certain kind. In a database the schema for each table (the CREATE TABLE script) defines the blueprint for each record in that table. |
Databases do not have objects. | An object is an instance of a class. In a database each table row is an instance of that table's schema. |
Databases do not have encapsulation. | Encapsulation is the act of placing data and the operations that are performed on that data in the same class. Data is defined using columns in a table's schema. Operations do not have to be defined for each table as every table can only be accessed using one of the four generic CRUD operations. |
Databases do not have inheritance. | Inheritance is the act of combining one class definition with another. In a database this is achieved with a foreign key which allows you to combine the data from one table with another by using a JOIN clause in the SQL query. |
Databases do not have polymorphism. | Polymorphism is enabled by having the same operations available in different objects. In a database the same CRUD operations are available for every table, so it is possible to construct an SQL query and do nothing but change the table name in order for it to work on a different table. |
Databases do not have <enter feature here> | Who cares? In OOP this is just an optional feature anyway, and if it is optional then there is no obligation to either include it in your code or emulate it in the database. OO does not include support for such things as transactions, commit and rollback, but why should it? These are implemented within the database, and all the code has to do is execute the relevant command in the database. |
This topic is discussed further in Object-Relational Mappers are Evil.
As you should be able to see, the names are different and the implementation is different, but the effects are more or less the same.
One of the problems that is often put forward by traditionalists is that changes to the database structure are always cumbersome to implement as they require so many changes to so many different components. This to me is an obvious sign that their levels of coupling and cohesion are completely wrong, which means that they are not applying the principles of OOP correctly. Either that or they are applying what are actually the wrong principles.
In my methodology I ignore object aggregation and object composition, so I never have a class which is responsible for more than one database table. I also never have a separate property for each table column, so I can pass all the data around in a single array instead of having code which deals with each column individually. Because of these heretical steps any changes to the database structure can be dealt with quite simply by importing the changed structure into my Data Dictionary, then exporting the details to the application. This regenerates the table structure file but does not regenerate the table class file as this may have been customised since it was first created. This simple process keeps the two structures, database and software, always in sync, which means that I don't have this problem.
I have clearly demonstrated that the result of my approach is the production of software which exhibits loose coupling, high cohesion and high levels of reusability, which is supposed to be a sign of a proper implementation of OOP. I am clearly achieving the objectives of OOP while my critics are not, so why is my work described as "bad" while theirs is "good"?
My approach to OOP received the following complaint in a newsgroup:
If you have one class per database table you are relegating each class to being no more than a simple transport mechanism for moving data between the database and the user interface. It is supposed to be more complicated than that.
You are missing an important point - every user transaction starts life as being simple, with complications only added in afterwards as and when necessary. This is the basic pattern for every user transaction in every database application that has ever been built. Data moves between the User Interface (UI) and the database by passing through the business/domain layer where the business rules are processed. This is achieved with a mixture of boilerplate code which provides the transport mechanism and custom code which provides the business rules. All I have done is build on that pattern by placing the sharable boilerplate code in an abstract table class which is then inherited by every concrete table class. This has then allowed me to employ the Template Method Pattern so that all the non-standard customisable code can be placed in the relevant "hook" methods in each table's subclass. After using the framework to build a basic user transaction it can be run immediately to access the database, after which the developer can add business rules by modifying the relevant subclass.
Some developers still employ a technique which involves starting with the business rules and then plugging in the boilerplate code. My technique is the reverse - the framework provides the boilerplate code in an abstract table class after which the developer plugs in the business rules in the relevant "hook" methods within each concrete table class. Additional boilerplate code for each task (user transaction, or use case) is provided by the framework in the form of reusable page controllers.
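The "hook" mechanism described above can be sketched as follows. The hook name _cm_validateInsert and the business rule are invented here for illustration; the real framework defines many such customisable methods:

```php
<?php
// Sketch: the abstract class supplies the invariant steps plus an empty
// "hook"; a concrete subclass adds business rules by overriding the hook.
abstract class Default_Table
{
    public function insertRecord(array $fieldarray): array
    {
        // invariant steps (primary validation against the field specs,
        // database access, etc.) are omitted here for brevity;
        // the customisable hook is called at a fixed point:
        $fieldarray = $this->_cm_validateInsert($fieldarray);
        return $fieldarray;
    }

    // hook method: does nothing unless overridden in a subclass
    protected function _cm_validateInsert(array $fieldarray): array
    {
        return $fieldarray;
    }
}

class Order extends Default_Table
{
    // a business rule added after the basic transaction was running
    protected function _cm_validateInsert(array $fieldarray): array
    {
        if (empty($fieldarray['order_date'])) {
            $fieldarray['order_date'] = date('Y-m-d');
        }
        return $fieldarray;
    }
}

$order  = new Order();
$result = $order->insertRecord(['customer_id' => 123]);
```

Until the hook is overridden the transaction simply runs with the default behaviour, which is why a newly generated task works immediately and only gains complexity later.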
If you look at Figure 1 you should see that the purpose of the software application is to sit between the user and the database. The user can send data to be stored in the database, or retrieve data from the database and have it displayed to the user in one format or another. This data may be massaged or manipulated in some way in either of the inbound or outbound journeys. The data never remains in the software, it simply passes through.
I have been building database applications for several decades in several different languages, and in that time I have built thousands of programs. Every one of these, regardless of which business domain they are in, follows the same pattern in that they perform one or more CRUD operations on one or more database tables aided by a screen (which nowadays is HTML) on the client device. This part of the program's functionality, the moving of data between the client device and the database, is so similar that it can be provided using boilerplate code which can, in turn, be provided by the framework. Every complicated program starts off by being a simple program which can be expanded by adding business rules which cannot be covered by the framework. The standard code is provided by a series of Template Methods which are defined within an abstract table class. This then allows any business rules to be included in any table subclass simply by adding the necessary code into any of the predefined hook methods. The standard, basic functionality is provided by the framework while the complicated business rules are added by the programmer.
The idea that this approach is too simple immediately tells me that too many of today's programmers are deliberately trying to make OOP more complicated than it need be, thus violating the KISS principle and ignoring the advice of Albert Einstein who said:
Everything should be made as simple as possible, but not simpler.
I'm afraid that I'm with Einstein on this one. Anyone who disagrees must be a member of the let's-make-it-more-complicated-than-it-really-is-just-to-prove-how-clever-we-are brigade and can be dismissed as a know-nothing charlatan. I can write effective software using nothing more than the bare minimum of OO features in the language, so I wonder why legions of others find it so difficult. I don't use any of the optional extras simply because I cannot find a use for them. They would force me to use more code, not less, and they would not add anything of value.
Other quotations in support of simplicity can be found in A minimalist approach to Object Oriented Programming with PHP.
This topic is also discussed in In the world of OOP am I Hero or Heretic?.
What exactly is the difference between Object Oriented code and procedural code? In his article All evidence points to OOP being bullshit John Barker says the following:
Procedural programming languages are designed around the idea of enumerating the steps required to complete a task. OOP languages are the same in that they are imperative - they are still essentially about giving the computer a sequence of commands to execute. What OOP introduces are abstractions that attempt to improve code sharing and security. In many ways it is still essentially procedural code.
I read that as saying the following:
OO languages are identical to procedural languages, but with the addition of encapsulation, inheritance and polymorphism.
Human beings think in a linear fashion, performing steps in a logical sequence one after the other. Computers execute code in a linear fashion, executing one instruction before following it with the next. Procedural code is executed in a linear fashion. OO code is executed in a linear fashion. OO code uses classes and objects while procedural code does not. The same computer processor can execute both procedural code and OO code because it does not know there is a difference, therefore there is no practical difference.
Some people seem to think "proper" OO is not as simple as taking procedural code and putting it in classes so that you can take advantage of inheritance and polymorphism. I could not disagree more. I have taken my years of experience of writing database applications in procedural languages and have successfully applied the principles of OOP to write identical applications with more reusability, less code, and which are quicker to write and easier to maintain and extend. By doing so I have achieved the objectives of OO, therefore how can anyone say that my implementation of OO is wrong? Different, maybe. Wrong, no.
According to Yegor Bugayenko in his article Are You Still Debugging? the differences between procedural and OO programming can be expressed as follows:
Pardon my French, but that is a complete load of balderdash. It does not matter two hoots what name you give to a method/function provided that it is meaningful and not too long. The name should identify what the method does so that you have a very good idea of what is supposed to happen when you call it. Changing the name so that it is centered around a noun instead of a verb has no magical effect on how the underlying code is executed. The name could even be a string of random characters that are meaningless to a human being, but the computer would not care. It would still allow the method/function to be called, and it would still execute the code within that method/function in the same linear fashion.
Opinions such as those simply prove that the author has lost the plot. OOP involves the creation of objects which have properties (data) and methods (operations). Objects are instantiated from classes which represent entities. Methods identify operations which can be performed on an object. Am I the only one who sees that entities are nouns while operations are verbs? This should be obvious if you look at the following two lines of PHP code:
$result = $object->method();
$result = $noun->verb();
Saying that OO code is different from procedural code is plain wrong. You can take code from a procedural function and wrap it in a class and it will still work in exactly the same way. Nothing magical happens to the way in which the code is executed just because it is in a class method. If I write code which is oriented around objects, then by definition it is Object Oriented. The only "trick" in using OO is to identify how many classes you need to complete a task, what each class does, and how you link those classes together. This topic is discussed in What type of objects should I create?
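This claim is easy to verify for yourself: the identical logic produces the identical result whether it lives in a plain function or in a class method, as this trivial sketch shows:

```php
<?php
// The same logic as a procedural function and as a class method:
// the computer executes both in exactly the same linear fashion.
function calculateTotal(array $prices): float
{
    $total = 0.0;
    foreach ($prices as $price) {
        $total += $price;
    }
    return $total;
}

class Basket
{
    public function calculateTotal(array $prices): float
    {
        $total = 0.0;
        foreach ($prices as $price) {
            $total += $price;
        }
        return $total;
    }
}

$prices = [1.50, 2.25, 0.75];
$a = calculateTotal($prices);                    // procedural
$b = (new Basket())->calculateTotal($prices);    // "object oriented"
// $a and $b are both 4.5
```

The difference lies not in how the code executes but in how it is organised: the class version can now participate in encapsulation, inheritance and polymorphism.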
In my methodology I create a separate class for each database table. This class contains the business rules and the data structure of that table, with all its basic operations inherited from my abstract table class. Each resulting object then holds the data for a particular database table and provides all the operations that can be performed on that data. According to Yegor Bugayenko in his article Getters/Setters. Evil. Period. this idea is totally wrong:
Objects are not "simple data holders". Objects are not data structures with attached methods. This "data holder" concept came to object-oriented programming from procedural languages, especially C and COBOL. I'll say it again: an object is not a set of data elements and functions that manipulate them. An object is not a data entity.
Pardon my French, but that is a complete load of balderdash. Objects most definitely are data structures with attached methods, as that matches the definition of encapsulation. It also matches the definition of the Table Module from Martin Fowler's Patterns of Enterprise Application Architecture (PoEAA) in which he says:
One of the key messages of object orientation is bundling the data with the behavior that uses it.
Later in the same article Yegor says:
In true object-oriented programming, objects are living creatures, like you and me. They are living organisms, with their own behavior, properties and a life cycle.
WTF!!! How can objects in the software be living organisms? Surely objects are just representations or models of those things with which the software must interact, and not the physical objects themselves. He offers the following explanation:
We are differentiating the procedural programming mindset from an object-oriented one. In procedural programming, we're working with data, manipulating them, getting, setting, and deleting when necessary. We're in charge, and the data is just a passive component. The dog is nothing to us - it's just a "data holder". It doesn't have its own life. We are free to get whatever is necessary from it and set any data into it. This is how C, COBOL, Pascal and many other procedural languages work(ed).
On the contrary, in a true object-oriented world, we treat objects like living organisms, with their own date of birth and a moment of death - with their own identity and habits, if you wish. We can ask a dog to give us some piece of data (for example, her weight), and she may return us that information. But we always remember that the dog is an active component. She decides what will happen after our request.
This is yet another load of balderdash. He tries to justify it with the following example:
Can a living organism have a setter? Can you "set" a ball to a dog? Not really. But that is exactly what the following piece of software is doing:

Dog dog = new Dog();
dog.setBall(new Ball());

How does that sound? Can you get a ball from a dog? Well, you probably can, if she ate it and you're doing surgery. In that case, yes, we can "get" a ball from a dog. This is what I'm talking about:

Dog dog = new Dog();
Ball ball = dog.getBall();

Or an even more ridiculous example:

Dog dog = new Dog();
dog.setWeight("23kg");

Can you imagine this transaction in the real world? :) Start thinking like an object and you will immediately rename those methods. This is what you will probably get:

Dog dog = new Dog();
dog.take(new Ball());
Ball ball = dog.give();

Now, we're treating the dog as a real animal, who can take a ball from us and can give it back, when we ask. Besides that, object thinking will lead to object immutability, like in the "weight of the dog" example. You would re-write that like this instead:

Dog dog = new Dog("23kg");
int weight = dog.weight();

The dog is an immutable living organism, which doesn't allow anyone from the outside to change her weight, or size, or name, etc. She can tell, on request, her weight or name. There is nothing wrong with public methods that demonstrate requests for certain "insides" of an object. But these methods are not "getters" and they should never have the "get" prefix. We're not "getting" anything from the dog. We're not getting her name. We're asking her to tell us her name. See the difference?
In arguing about what method names to use, this person is, in my view, wasting too much time on petty, nit-picking, inconsequential trivialities. It does not matter what the method name is provided that it describes what happens when that method is called as succinctly as possible. What it does must be obvious; how it does it does not matter.
This person is so far wide of the mark he is not even on the same planet. A database application is software which communicates with passive objects in a database, not physical objects in the real world. These database objects are not living organisms, they are simply collections of data which are organised into tables and columns. A table is a passive object in that it can only receive requests and return a response - it can never generate a request or do anything on its own.
If I own a business which deals with dogs, such as buying, selling or breeding dogs, then if I have a computer system at all it will do nothing but record information about the dogs in my possession in order to help me run my business. Such a system would not interact with any living dogs, it would merely sit in the middle between a human being at a monitor at one end and a database at the other. All the software would do is pass information between the user and the database. This also means that although the physical object may be able to do certain things, or have certain things done to it, those "things" will not be made available as operations or methods inside the application as they would be totally irrelevant. For example, a real dog may be able to walk, run, roll over, eat, sleep and defecate, but would a sensible programmer build these operations into a business application?
Actions like giving or taking a ball from a dog would not exist in the computer application as they would not be relevant to the business. There would be no user transaction called "Give ball to dog" or "Take ball from dog". As for asking a dog for its name, it cannot tell you as it cannot speak. If you want to know the name of a dog then you would look at the dog's name tag, or use an RFID scanner to obtain its identity from a microchip which has been embedded under its skin. As for obtaining a dog's weight, you never ask the dog itself - you put the dog on a set of scales, you read off the number, and you update the dog's record in the database. In this process the dog itself is not an active component - it does not tell you its weight, it allows itself to be weighed. A dog does not know that it has weight, it cannot measure its weight, and it cannot tell you its weight. In this context "weight" exists as nothing more than a value on a set of scales and a column in the DOG table. In order to set a dog's weight you perform an UPDATE operation on that dog's record in the database. In order to see a dog's recorded weight you perform a SELECT operation which retrieves that dog's data from the database. Remember that these operations are performed on the DOG table and not a living dog.
Each task in a database application does something with the data in a database table, and this "something" is limited to just four operations - Create, Read, Update and Delete. These operations do not act on one column at a time, they act on sets of columns in one or more records. A single READ operation is capable of retrieving any number of columns from any number of tables, an UPDATE operation can change the values in any number of columns in any number of rows in a single table, but the INSERT and DELETE operations affect whole rows of data, not just individual columns. When one of my Controllers receives a request it passes it on to a Model object using a corresponding method. The Model may perform some of its own processing before it passes that request onto the DAO which constructs and executes the relevant SQL query on the database. The query produces a result which is returned to the Model, which may perform some of its own processing before the result is returned to the user in the form of a View. Below is a brief overview of the method names and the corresponding database operation:
| Controller action | Database action |
|---|---|
| $model->insertRecord($_POST) | INSERT INTO <tablename> ..... |
| $model->getData($where) | SELECT ... FROM <tablename> ..... WHERE $where |
| $model->updateRecord($_POST) | UPDATE <tablename> SET ..... WHERE $where |
| $model->deleteRecord($_POST) | DELETE FROM <tablename> WHERE $where |
You might understand this better by looking at some UML diagrams which go into greater detail.
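The mapping in the table above can be sketched in code. The following is a simplified illustration of my own devising, not the actual RADICORE DAO, showing how a Data Access Object might construct an INSERT query from the associative array passed down from the Model:

```php
<?php
// A minimal sketch of a DAO building an INSERT statement from the
// $fieldarray passed down from the Model. This is a simplified
// illustration only - production code would use the escaping or
// prepared-statement mechanism of the actual database driver.
function buildInsert($tablename, $fieldarray) {
    $cols = implode(', ', array_keys($fieldarray));
    $vals = implode(', ', array_map(
        function ($v) { return "'" . addslashes($v) . "'"; },
        array_values($fieldarray)
    ));
    return "INSERT INTO $tablename ($cols) VALUES ($vals)";
}

echo buildInsert('dog', array('name' => 'Rex', 'weight' => '23')), "\n";
// INSERT INTO dog (name, weight) VALUES ('Rex', '23')
```

Note how the query operates on a set of columns supplied as a single array, which is the point made earlier: the database works on sets of columns, not one column at a time.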
There would be no need for the application to have separate user transactions to update or view individual columns from the DOG table as the user interface would show all the dog's data in a single screen. There would therefore be no need to create methods to work on individual columns as each user transaction would use SQL queries that operated on multiple columns at the same time. Each screen would show all the columns from a record in the DOG table, all the columns would be sent to the software in the POST variable, all the columns would be validated in a single operation, and all these columns would then be passed to the DAO so that it could construct and execute a single query to either INSERT or UPDATE a record in the DOG table. A wise person would therefore see that the database operates on sets of columns, not individual columns, so having methods in the software which operate on individual columns would immediately put the software at odds with the database, which would not be a good idea. I was used to the concept of working with datasets before I switched to an OO-capable language, and I have continued using that concept since switching with great success.
If you think that his idea of treating objects as living creatures is absurd, things get worse in his article when he repeats a quote from David West:
Step one in the transformation of a successful procedural developer into a successful object developer is a lobotomy.
I assume by this statement he means that an OO developer is just like a procedural developer, but without a working brain. Now that I can believe.
Although other developers are constantly telling me that I am writing what they call "legacy code", they are missing the point concerning enterprise applications. Large corporations tend to shy away from brand new leading edge or bleeding edge software, something which has been recently developed using the latest buzzword or shiny gizmo. They like their software to be mature, to be tried and tested, to be low risk, to be proven, to have a pedigree. In other words, large corporations prefer legacy software, and as I develop software specifically for use by large corporations my first priority is to please them and not a bunch of know-nothing developers.
Many developers seem to think that legacy code (i.e. code that has not been freshly written to today's fashionable standards) is automatically bad because it is old. I have got some ground-breaking news for you guys - good code does not deteriorate with age. It does not go rusty, it does not slow down, it does not fall apart. The only way for good code to become bad is when it is attacked by a bad programmer. The code I wrote over a decade ago in PHP4 still runs today in PHP7, apart from one minor change. The computer processor does not execute the code differently because of its age because it does not know its age. When it is given a piece of code to execute it does not know whether that code was written 10 years ago or 10 seconds ago.
I do not change my code to use every new gizmo, gadget or gimmick that appears in the language simply because it has become available. I see no reason to change code which works perfectly well just to achieve the same result in a different manner. The only thing that has changed over the years with my framework and my main enterprise application is that I have modified or added some areas of functionality, and being a halfway-decent programmer I have managed to do this without screwing up.
When I first started using PHP in 2001 it was with version 4, and I quickly learned to write effective OO software using nothing more than encapsulation, inheritance and polymorphism, all of which PHP 4 fully supported. I used this to create my development framework, which was based on similar frameworks that I had created in previous languages, and I developed several applications using that framework. When PHP5 was released in 2004 I still had to support my PHP4 customers, so I did the bare minimum to ensure that my codebase could run in both PHP4 and PHP5. This meant that the additional OO features that were added in PHP5 did not appear in my codebase as they could not be used by my PHP4 customers. I started developing my major enterprise application in January 2007, and this is still being used, maintained and actively supported to this day. Even though I no longer have any PHP4 customers I have never upgraded my codebase to include the additional features of PHP5 as I cannot find any practical use for them. Even though my codebase now runs in PHP7 it still looks like my original PHP4 code. It would take an enormous amount of effort to rewrite my code to incorporate all the new OO features, but as it would not make my software run any better I do not see the point. My attitude is that if it ain't broke, don't fix it.
Those programmers who complain that code which I wrote over ten years ago and which is still in use today must be bad because of its age are inadvertently admitting to the fact that they cannot write code which lasts that long. They write such crap that it has to be thrown away and rewritten after just a few short years. How many of them have ten year old applications which are still running?
An enterprise application is primarily concerned with putting data into and then getting data out of a database. Such an application will have hundreds, if not thousands, of tasks (user transactions) which perform a unit of work for the user, and each of these will (except in rare circumstances) involve one or more records in one or more database tables. As has been explained earlier, the only operations which can be performed on a database table are INSERT, SELECT, UPDATE and DELETE, commonly known as CRUD, but it would be extremely naive to say that every one of those tasks is simple. In my decades of experience with writing database applications the term "simple" can only be used when describing a common family of forms which can be seen, in one form or another, in every single database application that has ever been written, such as that shown in Figure 5:
Figure 5 - A typical Family of Forms
This family consists of 6 simple forms - List1, Add1, Enquire1, Update1, Delete1 and Search1 - which can be built to operate on any table in any database. Using my Data Dictionary it is possible to create and run this forms family for a database table in just 5 minutes, all without writing a single line of code - no PHP, no HTML and no SQL. Contrast this with the amount of code that has to be written using other "proper" frameworks or ORMs, as shown in A minimalist approach to Object Oriented Programming with PHP. While this exercise would indeed produce 6 simple screens, they represent only 6 of the 40+ Transaction Patterns which are available in my catalog. These additional patterns offer different ways of accessing the database for different purposes, such as maintaining many-to-many relationships, which are far beyond anything I have seen in other CRUD frameworks.
Note that some people regard this family as a single use case and think that each use case should have a single controller. I disagree. Each member of the family is an individual user transaction which can be selected for activation from an application menu. This then makes it easier to introduce a Role Based Access Control system where access can be granted or disallowed to individual members of the family instead of the family as a whole.
Each one of my Transaction Patterns is a genuine pattern as it provides pre-written and reusable code which can be used to generate a working transaction. This is totally unlike Design Patterns which provide nothing but a description of the code which you then have to write manually for each implementation.
Although each generated transaction is fairly basic, it does work by providing default behaviour, but this default behaviour can easily be modified to provide whatever additional complexity is required. Adding code to a transaction to either override the default behaviour or to perform additional processing is incredibly simple (at least in my framework it is!). If you look at the documentation for each transaction pattern, such as Add1, you will see the methods which the Controller calls on the Model. Each of these methods is defined within the abstract table class which is inherited by every concrete table class. If you click on one of these method names, such as insertRecord(), you will see that it actually calls a series of sub-functions for the various steps that are required to complete that operation. Some of these sub-functions are prefixed with the characters "_cm_" to signify that they are customisable. This means that they have been defined in the abstract table class but without any code, so that when they are executed they do not actually do anything. When a developer needs to execute some additional code in one of these operations all he/she has to do is copy the empty method from the abstract class to the concrete class, then insert whatever code is relevant. This modified method will then override the abstract method the next time that it is called.
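The customisable-method mechanism described above can be reduced to a short sketch. This is my own stripped-down simplification, not the actual RADICORE abstract class, but it shows the shape of the technique: a hook defined empty in the superclass, overridden in the subclass only when a business rule is needed.

```php
<?php
// A stripped-down sketch of the "_cm_" hook mechanism. The class and
// method names mimic the pattern described in the text but the bodies
// are simplified illustrations only.
class Default_Table {
    public $errors = array();

    public function insertRecord($fieldarray) {
        // Step 1: hook for custom validation, empty by default.
        $fieldarray = $this->_cm_validateInsert($fieldarray);
        if (!empty($this->errors)) return $fieldarray;
        // Step 2: (in the real framework) pass $fieldarray to the DAO here.
        return $fieldarray;
    }

    // Defined in the abstract class but with no code, so by default
    // it does nothing when executed.
    public function _cm_validateInsert($fieldarray) {
        return $fieldarray;
    }
}

class Dog extends Default_Table {
    // The hook copied into the concrete class, now containing a rule.
    public function _cm_validateInsert($fieldarray) {
        if (empty($fieldarray['name'])) {
            $this->errors['name'] = 'A dog must have a name';
        }
        return $fieldarray;
    }
}

$dog = new Dog();
$dog->insertRecord(array('name' => ''));
print_r($dog->errors);
```

A concrete class which needs no custom validation simply does not override the hook, and the empty default runs without effect.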
Using this method I have created an enterprise application which currently has 450+ tables, 1,200+ relationships and 4,000+ tasks. While some of these tasks can be regarded as "simple", the remainder have degrees of complexity which can vary between "little" and "lots". Any idea that my framework can only be used for simple applications is therefore so wide of the mark it is not even in the same timezone, let alone the same planet.
There are an enormous number of "rules", "principles" or "practices" which are touted as being essential for the "proper" implementation of various OO theories. I say "various theories" as there is no single definition of what OO is on which everybody agrees. Virtually every day I seem to come across a new article which completely contradicts what somebody else published earlier. Is it any wonder that there is so much confusion? As for the use of the term "proper" OOP, this is also misleading as the term "proper" is completely subjective. There is no universally accepted definition of what "proper" OOP is, so who is qualified to say what is or is not "proper"?
There is no such confusion in my mind as I have the experience to know what is a good idea or not, what is bullshit or not. In my humble opinion 97% of what is written about OOP today is complete and utter bullshit. If you take 100 statements about what makes a program OO or not then as far as I am concerned only three of them are valid - encapsulation, inheritance and polymorphism. Everything else is an optional extra, and as I have found no practical use for them in my code I choose not to use them, and I believe that my code is all the better for it. If, by using the bare minimum of OOP techniques, I can achieve the objective of OOP - which is to increase code reuse and decrease code maintenance - then surely I have used those techniques in a "proper" manner.
I do not consider those things which are being touted as the "rules" of OOP to be rules at all. I did not even know that they existed when I started programming with objects, so they could not be that important. I already knew the principles of good programming - such as modular programming, structured programming, high cohesion, low coupling, KISS, DRY and YAGNI - from my days with non-OO languages, so all I did was throw encapsulation, inheritance and polymorphism into the mix in order to achieve the maximum of benefits with the minimum of effort. In my humble opinion these additional rules are nothing more than personal preferences which were dreamt up after the fact, and the days have long gone when I am swayed by the personal preferences of any Tom, Dick or Harry who manages to post an article on the internet. Some of these rules were nothing more than solutions in particular languages for particular problems, yet even though the problems no longer exist, or can now be solved with simpler or more elegant and reliable methods, the original solutions still persist. None of the modern programmers know why the rule was created and therefore cannot see that it is no longer relevant, all they see is a rule that must be obeyed without question. They behave just like the apes in Company Policy.
I have decades of experience in the design and implementation of numerous database applications, and that experience has enabled me to build and refine my own set of personal preferences. I also have the experience, and the nerve, to question every rule, and if it does not stand up to close scrutiny then I will ignore it. I have used my experience to design and build a framework specifically for creating database applications, and I have used that framework to create a large enterprise application which is now being sold as a package to large corporations all over the world, so my implementation cannot be as bad as some people would like to think. The job of a software developer is to develop working software as efficiently as possible, not to follow some outlandish philosophy which is full of bizarre theories that produce bad results when put into practice. If nothing bad happens in my code when I ignore one of these artificial rules, then what is the point of that rule? If I can write good code by disobeying or ignoring one of these artificial rules, then as far as I am concerned that rule cannot be justified and has no right to exist, so in my universe it does not.
In his article Level 5 means never having to say you're sorry the author Jeff Atwood points out that rules are created by the talented for the untalented in the hope that more people will become talented. Unfortunately there are two weaknesses in this philosophy:
Unless a person has a glimmering of talent to begin with they will never become great. An untalented sculptor will be able to do nothing but turn a single piece of stone into smaller pieces of stone. An untalented artist will be able to daub coloured oil onto canvas and produce nothing but a canvas covered with daubs of coloured oil. An untalented pianist will be able to sit at a piano and make sounds by pressing the keys, but will never be able to produce what audiences would describe as "beautiful music".
Computer programming is an art, not a science, so a description of its techniques will not necessarily guarantee that the results will be duplicated with 100% accuracy. Programmers are artists, not mere typists sitting in front of a keyboard.
As far back as 2003 I started to publish articles on the internet showing how I used the principles of encapsulation, inheritance and polymorphism to add to the pool of knowledge that was gradually accumulating. I was immediately assaulted by a bunch of OO gurus who told me that "real OO programmers don't do it that way". You can read the full story in What is/is not considered to be good OO programming. I laughed at their arguments then, and I'm still laughing now. Every time I put forward an opinion which goes against the established view I am attacked with the same old arguments. When I refuse to kowtow to my so-called "superiors" I am subject to abuse and insults, as can be seen in the following:
My answer to all these criticisms is quite simple:
Progress comes from innovation, not imitation. Innovation is not possible unless you do things differently, unless you rewrite the rules.
If I were to do everything the same as you I would be no better than you, and I'm afraid that your best is simply not good enough.
In order to be better I have to start by being different, but all you can do is attack me for being different without noticing that my results are superior to yours.
Your argument is that because I am breaking your precious rules then my code must be crap. What you fail to understand is that if I can produce superior results by ignoring your precious rules then it is your rules which are crap.
If the only bad thing that happens if I ignore one of your precious rules is that I offend your delicate sensibilities, then all I can say is Aw Diddums!
There now follows a set of topics which you should read as being prefixed by the phrase "There is no rule which says that ....".
This idea could not possibly be more wrong. When building a database application it is the database design which is king, and the software which is the implementation detail. I have designed and built more than a few database applications from scratch, and that process always has the same steps:
Once this process has been completed the database design can be finalised. The number and complexity of the user transactions will provide an idea of how much development effort will be required. The program specification for each user transaction will identify what operations will need to be performed on what database tables in order to complete that user transaction.
The last part of this exercise is to write the code. It is not possible to write any code to implement a particular use case until AFTER the user interface and the database structure have been designed. The user interface exists at the front end, the database exists at the back end, and the software sits in the middle and passes data between the two ends. The programming language and methodology used is largely irrelevant. Different development teams will have different preferences, and these will lead to different timescales and costs. It is a function of management to choose the most cost-effective solution. It is therefore a function of the development team to come up with the best solution as far as the paying customer is concerned, not the most theoretically "pure" solution as far as the development team is concerned.
Further observations on this topic can be found at:
In the original definition of what makes a language Object Oriented or not there is no such claim. It states that an object can have properties, but not that those properties also have to be objects. An entity which has behaviour and state can be an object. A service which has behaviour but no state can be an object. There is no good reason why properties of an object should themselves be objects. This is why PHP treats values as simple variables and not value objects.
Every language has functions which operate on variables (such as $result = function($variable);), but there is no rule that the function should be replaced by a method on that variable's object (such as $result = $variable->function()). They both achieve the same result, but the "pure" OO version has the overhead of casting each primitive data type into an object.
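The overhead mentioned above can be shown in a short sketch. PHP strings are not objects, so the procedural form is the only native one; the StringObject wrapper below is a hypothetical class of my own, invented purely to show what the "everything is an object" style would require.

```php
<?php
// PHP's native, procedural form: a function operating on a plain string.
$name = 'radicore';
$length = strlen($name);   // 8

// The "everything is an object" style would require wrapping the
// primitive in an object first. This wrapper class is hypothetical -
// PHP provides no such class for its primitive types.
class StringObject {
    private $value;
    public function __construct($value) { $this->value = $value; }
    public function length() { return strlen($this->value); }
}

$nameObj = new StringObject('radicore');
$length2 = $nameObj->length();   // 8 - same result, extra object overhead
```

Both forms return the same answer; the only difference is the cost of constructing an object around a value which was perfectly usable as a plain variable.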
Having worked with PHP code for over a decade now I can safely make the following counter-claim:
Everything in PHP is either a string, or an array of strings.
- When a web page (HTML document) is constructed it is nothing more than a large string of text which contains HTML tags.
- When an SQL query is constructed and executed, it is nothing more than a string containing a DML statement.
- When the GET/POST request is received by a PHP script it is presented as an array of strings. This is because in an HTML document there is no concept of data type, so it is not possible for an element to be presented as anything else but a string.
The fact that every variable starts off as a string is not a problem in PHP. When a function is called to operate on a variable, and that variable needs to be of a certain type (such as an integer or a decimal number) then the language uses Type Juggling to cast that variable into the expected type.
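Type juggling in action can be demonstrated in a few lines. The $_POST assignment below simulates what a form submission delivers (a real request would populate it automatically):

```php
<?php
// Everything arriving from an HTML form is a string ...
$_POST = array('weight' => '23');     // simulated POST data
echo gettype($_POST['weight']), "\n"; // string

// ... but PHP's type juggling casts it when an arithmetic
// context demands a number:
$total = $_POST['weight'] + 1;
echo $total, "\n";                    // 24

// An explicit cast is also available when needed:
$weight = (int)$_POST['weight'];
echo gettype($weight), "\n";          // integer
```

This is why the all-strings nature of HTTP input causes no practical difficulty in PHP: the conversion happens automatically at the point where a typed value is actually required.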
This observation is usually accompanied by the claim that "PHP is not a pure OO language" simply because it does not implement the "everything is an object" concept. I dismiss this complaint as absolute bollocks as there is no such thing as "pure OO" or "impure OO". The original definition clearly identifies what a language has to support in order to be called Object Oriented. PHP supports the concepts of encapsulation, inheritance and polymorphism, and has done since version 4, so by satisfying the definition of OO it most certainly IS OO. The fact that it also supports procedural programming is an added bonus, not a disqualifying feature. Just like C++, PHP is a multi-paradigm language which is OO-capable. Anyone who uses the OO aspects of the language to write code which is oriented around objects is therefore an OO programmer.
The term "multi-paradigm" means that PHP supports both the procedural style and OO style of programming. All the core functions in PHP are purely procedural, but some of the extensions, such as MySQLi, provide both a procedural and Object Oriented interface.
Even though there are many languages which support OOP, because they implement the principles of OOP in different ways there are many arguments among their respective fans as to which is the "purest" implementation. This leads to articles such as Execution in the Kingdom of Nouns which comments on Java's preference of nouns over verbs.
While it is true that everything must be designed before it is built, the design process for a software application does not, or should not, concern itself with how the software should be built, it simply identifies what needs to be done. As shown in Figure 1 a database application has a user interface (UI) at one end, a database at the other, with the software in the middle. An application is made up of many user transactions, and each transaction either puts data into, or gets data out of, the database. While some transactions do not have a user interface, such as batch jobs which run as background tasks, all the others will have some form of screen or report layout. The design process then compares the layout of each UI with the structure of the database to ensure that all the data requirements are covered. It is usual to start off with a draft database design, but this is often reworked while discussing the needs of each individual user transaction.
The most important part of a database application is the database itself, and in order to provide optimum access to the data it must be properly designed, using the technique known as database normalisation, otherwise the application will run like a pig with a wooden leg and may be rejected by the client. The end result of this process is a list of tables, their columns, primary keys, optional candidate keys and indexes, and relationships with other tables, either parent-to-child or child-to-parent, through the use of foreign keys.
Each user transaction performs a unit of work for the user, and can be described using the following:
Note again that this design simply identifies what needs to be done and not how it should be done, and certainly does not dictate how the code should be written. In fact this design process need not even identify the programming language which is to be used as a proper design should be able to be implemented in any language using any paradigm.
That is where my design process stops as I have all I need to develop the software. I import all the database schemas into my Data Dictionary, then export each table's definition to produce a separate class file and a structure file. These are separate files as the developer may wish to update the class file with custom code, while the structure file is completely rewritten each time the table's structure is changed.
I can then use my data dictionary to generate user transactions by combining one or more classes with one of my Transaction Patterns. Each of these may start off in basic form, but they can be enhanced with the insertion of extra code into the customisable methods of the relevant class file.
I do not waste my time with an additional design process such as Object-Oriented Design for the software as it would be totally redundant. I have designed and built a framework which helps me both build and run each database application, and this works on the understanding that each table in the database has its own class file, and that every user transaction will perform one or more of the CRUD operations on one or more database tables.
Further observations on this topic can be found at:
Refer to you must recognise "HAS-A" relationships.
In OO theory class hierarchies are the result of identifying "IS-A" relationships between different objects, such as "a CAR is-a VEHICLE", "a BEAGLE is-a DOG" and "a CUSTOMER is-a PERSON". This causes some developers to create separate classes for each of those types where the type to the left of "is-a" inherits from the type on the right. This is not how such relationships are expressed in a database, so it is not how I deal with it in my software. Each of these relationships has to be analysed more closely to identify the exact details. Please refer to Using "IS-A" to identify class hierarchies for more details on this topic.
Objects in the real world, as well as in a database, may either be stand-alone, or they have associations with other objects which then form part of larger compound/composite objects. In OO theory this is known as a "HAS A" relationship where you identify that the compound object contains (or is comprised of) a number of associated objects. There are several flavours of association:
Please refer to Using "HAS-A" to identify composite objects for more details.
I have read about two different situations where mock objects are used:
The common method of building database applications by "proper" OO programmers is to start with the software design and leave the actual database design till last. After all, the physical database is just an implementation detail, isn't it? This requires the use of mock database objects which are easier to change than a physical database. OO programmers often complain that changing the database structure after the software has been built is a real pain because it requires lots of changes to lots of components. In my humble opinion this pain is a symptom of doing something which is fundamentally wrong.
This I call the Dyson approach because it sucks so much. My method is entirely the opposite:
Note that I don't have to write the class file by hand, it is generated for me by my Data Dictionary. My implementation also makes it very easy to deal with database changes. Simply change the database structure, perform the import/export in my Data Dictionary and the job is done. I do not have to change any code in any Controllers, Views or the Data Access Object. I do not even have to change any code in the Model unless the affected column is subject to any business rules or secondary validation. I may also have to change a screen structure script if the column appears in or disappears from any HTML screens.
This seems stupid to me as all you are doing is testing the mock object instead of the real object, so what happens if your real object suddenly behaves differently and unexpectedly encounters a real error that you did not cater for in your mock object? One problem I have encountered on several projects for different clients is where some numpty changes the structure of a database table by adding in a new column with NOT NULL set but without a default value. This screws up any existing INSERT statements which have not been made aware of this database change as they will immediately fail because they do not provide a value for a NOT NULL column. No amount of testing with mock objects will deal with this, so if you want to test that your software can deal with real world situations you should test your real objects and not phony ones.
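The failure described above is easy to reproduce. The following contrived sketch uses an in-memory SQLite table via PDO (assuming the pdo_sqlite extension is available; the table and column names are invented) to show that an INSERT written before a NOT NULL column existed will fail because it supplies no value for that column:

```php
<?php
// Hypothetical demonstration of an old INSERT breaking after a NOT NULL
// column with no default value has been added to the table.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// the table now contains a NOT NULL column ('status') with no default
$pdo->exec("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL, status TEXT NOT NULL)");

$failed = false;
try {
    // an existing INSERT statement, unaware of the new column
    $pdo->exec("INSERT INTO person (name) VALUES ('Smith')");
} catch (PDOException $e) {
    $failed = true;   // NOT NULL constraint failed: person.status
}
```

A mock object configured before the change would carry on returning success, which is precisely why only a test against the real database catches this.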
I don't waste any time with mock objects, which means that I have more time to spend on real objects.
I came across this idea when reading Objects Should Be Immutable by Yegor Bugayenko. The basic idea is that, once an object has been created it should not be possible to change any of its internal state.
What planet does this person live on?
In all my decades of programming with various different languages I have never come across such a ridiculous idea. When writing a database application it is quite common to have user transactions which obtain data from the database, display it to the user in a form or a screen, allow the user to make changes, then send those changes to the database. The idea of forcing each software component to be non-amendable never entered anybody's mind, and even if it had the effort of forcing the software to behave that way would have been too expensive to contemplate.
This concept has never been supported by any well-used programming language that I have ever heard of. It is certainly not supported in standard OOP which has allowed such things as setter methods since their inception. All the while the language cannot force all objects to be immutable then I regard this as a purely artificial rule which has absolutely no merit whatsoever. The effort involved in changing the software to abide by this rule would be very expensive and time-consuming, and as it would provide zero benefit for the end-user it cannot be justified.
The main reason given is that it forces the object to be thread safe. This is irrelevant in PHP as it is not possible to share objects across multiple threads as PHP operates in a shared-nothing architecture. Each script starts with zero memory, and when it terminates any memory used is unallocated.
In his article PHP and immutability the author Simon Holywell talks about how immutable objects could be created in PHP. This is not an easy process as the language was not designed to support them, so it requires a lot of effort. This leads me to one simple question: Why? Why bother bending the language to do what it was not designed to do? Why bother wasting your time doing something which is not necessary or even useful? Unless you can identify a problem which is caused by NOT having immutable objects then I can only conclude that there is no problem, and if there is no problem then why should I implement this solution?
"Proper" OO programmers are taught the favour composition over inheritance principle which says to never use inheritance and always use composition instead, but it never identifies why inheritance is supposed to be bad. I have never had a problem with inheritance, and I could never see any code samples which proved that composition was better, so I chose to ignore this principle. It turns out that this idea was put forward by programmers who did not know how to use inheritance properly, as discussed in Use inheritance instead of object composition.
A class diagram is a type of static structure diagram that describes the structure of a system by showing the system's classes, their attributes, operations (or methods), and the relationships among objects. I never produce one of these. When I produce a logical database design in the form of an Entity-Relationship Diagram (ERD) I don't bother to produce a second and incompatible design for my software components. If you think carefully you should realise that a database application does not talk to objects in the real world, it only talks to objects in a database. Every object in the database is called a "table", and all its properties are called "columns". My ERD can therefore be used as the class diagram as each table has its own class, and each class need only support four methods - Create, Read, Update and Delete.
This takes care of any IS-A relationships as every object is a database table. As there is a significant amount of processing which is common to every database table I have put all the common code in an abstract table class which is then inherited by every concrete table class.
This also takes care of any HAS-A relationships as a table's properties are limited to its columns. In a relational database a table can never be "contained" within another table, it is always a free-standing object.
By producing a separate class for each database table I end up with a one-to-one relationship between tables and classes. This means that the two structures - database and software - are always synchronised with each other. There is no mismatch, therefore no need for an ORM.
Every software class therefore follows the same pattern, and every competent OO programmer should know how to deal with repeatable patterns. I myself have created a Data Dictionary into which I can import all the table structures after which I can export each table's data to produce a table class file and a separate table structure file. While the abstract table class provides the default processing for any database table, it is the table structure file which provides all the necessary information for a particular physical table. I can therefore create fully-functioning class files without having to write a single line of code. All I need do is go to my Data Dictionary and press the "import" and "export" buttons.
I don't need to waste any time on identifying what methods each class should have simply because each class represents a database table, and the only operations that can be performed on a database table are Create, Read, Update and Delete. Due to my long experience with building database applications I have also been able to identify certain repeatable patterns within user transactions, which has enabled me to define a library of Transaction Patterns. Unlike Design Patterns these are real patterns as they provide standard code which can be turned into working user transactions. The capability to generate transaction scripts has been built into my Data Dictionary. All I need to do is select a table, select a Transaction Pattern, then press a button to generate the relevant script(s).
By adopting this non-standard (and therefore heretical) approach I can create all my class files without writing a single line of code, and I can create working user transactions also without writing a single line of code. I can augment the standard behaviour for any class by adding code to any of the customisable methods. Dealing with changes to the database structure is also a walk in the park - I simply re-import the changed structure into my Data Dictionary, then re-export it to regenerate the table structure file.
If my heretical approach avoids those problems encountered with the approved approach, and enables me to create class files and user transactions at a much faster rate, can you please explain to me how it can possibly be wrong?
I came across this idea when reading Constructors Must Be Code-Free by Yegor Bugayenko. Among the reasons for this incredible rule are:
This doesn't affect me as I never use object composition.
This doesn't affect me as I never inherit from one concrete class to create a different concrete class.
This is another example of a useless rule which has no benefits should I choose to follow it. More importantly, absolutely nothing bad happens if I choose to ignore it.
This definition of a class constructor states the following:
In class-based object-oriented programming, a constructor (abbreviation: ctor) in a class is a special type of subroutine called to create an object. It prepares the new object for use, often accepting arguments that the constructor uses to set required member variables.
.....
A properly written constructor leaves the resulting object in a valid state. Immutable objects must be initialized in a constructor.
Notice that it says "often" when describing that arguments may be used to set member variables (properties), which implies that this is entirely optional. It also does not specify how many member variables may be set this way. Note that an immutable object must have all its member variables set within the constructor as none of these variables can be changed during the lifetime of the object. I don't use such objects, so that restriction is irrelevant to me.
Unfortunately the phrase "leaves the resulting object in a valid state" can be mis-interpreted. It should have used the word condition instead of state as "state" can be taken to mean all the data that is held within the object. In the article How to Validate Data the author starts with the following comment:
The constructor's responsibility is to initialize an object into a sane state.
After a few code samples he states the following:
The main problem with this approach is that we allow an object to enter an invalid state in the first place. This is a deadly sin, because it forces us back into procedural programming as we cannot pass around object references safely.
The reference to procedural programming is wrong, as I have explained earlier. As for passing around object references, I don't, so this is not an issue for me.
The problem lies with the terms "valid state", "invalid state" and "sane state". By using the word "state" instead of "condition" the impression is given that "state" refers to the data within an object (i.e. its member variables) when in fact it does not. After an instance has been created it can be in one of two conditions - valid or invalid. The definition of a valid object is quite simple:
A valid object is one that can accept calls on any of its public methods.
Note that this does not make any statements regarding the condition of any of the object's member variables - that is, regarding its state in the sense of the data which it contains.
Attempting to perform validation with a class constructor has a serious problem in that a constructor has no return type, and if you throw an exception then this will be treated as an invalid object without any state rather than a valid object with invalid state. If you do not catch the exception then the entire script will terminate.
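This point can be demonstrated in a few lines (the class and its rule are invented for illustration): if a constructor throws, you do not get "a valid object with invalid state", you get no object at all.

```php
<?php
// Hypothetical class whose constructor attempts to validate its input
class Widget
{
    public function __construct(array $data)
    {
        if (empty($data['name'])) {
            throw new InvalidArgumentException('name is required');
        }
    }
}

$obj = null;
try {
    $obj = new Widget(array());   // invalid input data
} catch (InvalidArgumentException $e) {
    // no object exists at this point - there is no "state" to examine,
    // correct or report on; without the catch the script would terminate
}
```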
This rule also implies that the data must be validated outside of the object before it can be inserted, but I am afraid that this would violate the principle of information hiding which encapsulation is supposed to enforce. All business rules concerning an object, and this includes data validation rules, are supposed to be buried within the object and hidden from the outside world.
If I don't use an object's constructor to put values into its member variables, then what exactly do I do? In the first place when I create an instance of a class the resulting object is valid but devoid of any data. The object can then be populated with data in one of two ways:
Every database application that I have ever worked on has had to deal with a number of use cases (user transactions). In my ERP application I have 4,000+ tasks (use cases) and 450+ model classes. If I had a separate method in a Model for each of these tasks it would mean the following:
As the primary objective of using OOP in the first place is supposed to be to increase the amount of reusable code, the lack of reusability with Models and Controllers that following this principle would produce is obviously, at least to me, a step in the wrong direction.
In my methodology I create an entry on the TASK table of my MENU database for each use case. This entry points to a component script on the file system which in turn points to a Controller and one or more Models where all communication between them is governed by the methods that were defined in the abstract table class. This means that the use case name is defined in the MENU database and not as a method name within a class. The user selects which task he wants to run by its name in the MENU database, and the Controller which is activated for that task uses generic methods to perform whatever action is required.
I do not have to create any special methods in a Model as all the public methods I need are inherited from a single abstract class. I do not have to put any special method calls into any Controller as they only use the same public methods which are defined in the abstract class. Each of my Controllers has been designed to be reusable with ANY Model, so is available as a pre-written component in my framework. So if I have 40 Controllers and 450 Models this equates to 40 x 450 = 18,000 (EIGHTEEN THOUSAND!) opportunities for polymorphism. If I followed your rule I would not have this level of reusability, so I don't follow your rule.
In the RADICORE framework each task is defined on the MENU database, and this data is used to compile a hierarchy of menu options, either as menu buttons or navigation buttons, so that the user can select which task he wants to run. When a task is selected this causes the associated component script to be activated. The contents of this script is similar to the following:
Code Sample #1 - Component Script

<?php
$table_id = "person";                       // identify the Model
$screen   = 'person.detail.screen.inc';     // identify the View
require 'std.enquire1.inc';                 // activate the Controller
?>
For a full description please read Component Script.
Each of these scripts is very small as it does nothing but identify a combination of Model, View and Controller which, when combined, will perform the necessary actions on the database to complete the designated transaction. Each Controller uses the generic methods which were defined within the abstract table class, and as this is inherited by every Model class it is therefore possible for every Controller to be used with every Model, thus enabling large amounts of polymorphism.
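The claim that any Controller can be used with any Model can be sketched in a few lines. These classes are hypothetical stand-ins, not actual RADICORE code, but they show the mechanism: the controller only ever calls methods defined in the abstract class, so the concrete class it is given is irrelevant.

```php
<?php
// Common methods live in the abstract class, which every Model inherits
abstract class Default_Table
{
    public $errors = array();
    abstract public function getTableName();
    public function insertRecord(array $row)
    {
        // common validation and INSERT logic would go here
        return $row;
    }
}

class Person  extends Default_Table { public function getTableName() { return 'person'; } }
class Invoice extends Default_Table { public function getTableName() { return 'invoice'; } }

// a generic "controller": it neither knows nor cares which table it is given
function add1_controller(Default_Table $dbobject, array $post)
{
    return $dbobject->insertRecord($post);
}

$result1 = add1_controller(new Person,  array('name'  => 'Smith'));
$result2 = add1_controller(new Invoice, array('total' => 100));
```

Because `add1_controller()` is written once against the abstract class, adding a new table class costs nothing on the controller side.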
When designing an application which has entities such as PRODUCT, CUSTOMER and INVOICE, the "approved" approach is to create methods which combine the object name with the operation which is to be performed, such as createProduct(), createCustomer() and createInvoice(). This in fact is not good OO as it provides little opportunity for polymorphism and code reuse. Now think of the effect that those operations will have on the database, then change the operation names to reflect the equivalent SQL operation:
traditional | effect on the database | the Tony Marston way
---|---|---
createProduct() | insert a record into the PRODUCT table | as below, with $table_id = 'product'
createCustomer() | insert a record into the CUSTOMER table | as below, with $table_id = 'customer'
createInvoice() | insert a record into the INVOICE table | as below, with $table_id = 'invoice'
payInvoice() | insert a record into the PAYMENT table | as below, with $table_id = 'payment'

In each case the code is identical apart from the value given to $table_id:

$table_id = 'product';    // or 'customer', 'invoice', 'payment'
....
require "classes/$table_id.class.inc";
$dbobject = new $table_id;
$result = $dbobject->insertRecord($_POST);
if (!empty($dbobject->errors)) {
    ... handle error here ...
}
For example, the "traditional" approach to implementing the use case "Pay an Invoice" involves the following steps:

2. Create a payInvoice() method which would have the effect of adding a record to the PAYMENT table and updating the balance on the INVOICE record.

Each of these steps can only be implemented by writing code. The essence of this use case is contained in step #2, and I completely avoid the need to write code to implement the other steps by using reusable modules:

- There is no need for a payInvoice() method inside an INVOICE object as the insertRecord() method within the PAYMENT object is already there - it is inherited from the abstract table class. I modify the _cm_post_insertRecord() method inside the PAYMENT table to call the updateRecord() method on the INVOICE object, and I modify the _cm_pre_updateRecord() method within this object to recalculate the current balance from the contents of the PAYMENT table.

Note that in the above table there is a single insertRecord() method being called on a variety of different objects. That method call is built into a reusable Controller which can be used with any of the database objects in my application. I have encapsulated the details of each database table in its own class, and I have inherited all the standard CRUD operations from an abstract class, thus providing me with more opportunities for polymorphism than I have ever seen in any other application, so why do my critics keep telling me that my implementation is wrong?
My critics keep telling me that my version of OOP is totally wrong, but is it? One of the first principles of OO theory is a process called abstraction which unfortunately has a different meaning depending on whose paper you read. The best description I found was published in 1988 by Ralph Johnson and Brian Foote in their paper called Designing Reusable Classes in which they describe it as follows:
The process of abstraction involves the separation of the abstract from the concrete, the similar from the different.
I discuss this paper in great detail in The meaning of "abstraction".
This is why I created an abstract table class to contain all the common properties and common methods which can be associated with a database table, and a concrete table class for each physical implementation of that abstraction.
If you look at a database application which contains thousands of use cases (tasks), what is the highest level of abstraction? What description can summarise the essential characteristics of all those use cases? To me it is that each use case consists of a user interface (UI) at the front end, a database at the back end, and some software in the middle which transports data between the two ends. That software may also process business rules, which may mean reading from or writing to multiple tables in the database, or even activating other tasks. Each use case therefore performs one or more actions on one or more tables in the database, and anyone with more than two brain cells to rub together will be able to tell you that the only actions that can be performed on a database table are Create, Read, Update and Delete.

By implementing the MVC design pattern I have created a Model class for each database table and a Controller for each Transaction Pattern which performs one or more of those operations on one or more tables. All each use case needs is a small component script which identifies which Controller to use with which Model. This means that I can reuse each of these Controllers with any Model, and reuse each Model with any Controller. If the aim of OOP is to maximise the amount of reusable code, then why do my critics keep telling me that my implementation of OO is wrong? I have achieved more than they have with their "proper" techniques, so why aren't they being criticised for not achieving enough?
My critics tell me that it is simply not "good OO" to create a separate class for each table as that must surely mean that I have a lot of duplicated code in each class. That just proves to me that they are failing at this process called abstraction yet again. If the concept of a database table is abstract while a particular database table is a concrete implementation of that abstraction, then why is it wrong to do what I have done, which is to create an abstract table class to contain all the properties and methods which characterise an unknown database table, and then inherit from this abstract class to create a concrete class for each actual database table?
I do not have a method such as getInvoiceBalance() as a balance column is automatically included in the resultset when the getData() method is called on the INVOICE object. I do not have a method such as calculateInvoicedBalance() as the code to do this is within the _cm_pre_updateRecord() method which is called automatically when the updateRecord() method is called on the INVOICE object.
Instead of having a separate method for each use case I have a separate user transaction which appears on the application menu with the use case name. The user then selects the necessary use case from the menu, the URL points to the relevant component script in the file system, and this script performs all the necessary database operations to complete that use case. Although there can be any number of user transactions, and each one can access any number of tables, the only method names that are used in the Controllers are those which were defined in the abstract table class. This reuse of the same method name on different objects demonstrates polymorphism. Note that in my framework the actual code is split as follows:
The value for $table_id is provided in a separate component script which then passes control to one of my controller scripts using a form of dependency injection.

Using this simple yet effective technique I can create as many tables in the database as I want, with a separate class for each table, yet I do not have to spend any time in defining or creating any methods to access any of those classes for any use case. This is because I have reduced each possible use case into its essential details, namely "it does something with one or more tables in the database". The "something" comes in two parts - structure and behaviour - for which I have created a reusable set of Transaction Patterns. All I have to do is combine a pattern with one or more database tables. Currently my main enterprise application has over 450 database tables and over 2,800 user transactions being serviced by only 40 Transaction Patterns. That demonstrates a huge amount of code reuse, and 18,000 (450 x 40) opportunities for polymorphism.
In some circumstances it may be necessary to provide alternative implementations for certain generic methods in certain user transactions. I do this by creating a subclass of the concrete class but with a "_Snn" suffix. This inherits all the code from the superclass, so need only contain those methods which it needs to override. Note that this technique does not provide a new concrete class as the underlying table name does not change. Here is a sample subclass:
<?php
require_once 'mnu_subsystem.class.inc';

class mnu_subsystem_s01 extends mnu_subsystem
{
    // ****************************************************************************
    function _cm_pre_updateRecord ($rowdata)
    {
        .... alternative code goes here
        return $rowdata;
    } // _cm_pre_updateRecord
} // end class
?>
You can see examples of this in my framework with the following classes in the MENU subsystem:
- mnu_subsystem.class.inc - provides processing for the standard CRUD operations.
- mnu_subsystem_s01.class.inc - will export all subsystem details to a text file.
- mnu_subsystem_s02.class.inc - will create the directory structure for a new subsystem.

I once read a post that talked about how to deal with invoice payments, and the author was tying himself in knots trying to decide whether to use invoice.pay(Amount) or pay.invoice(Amount). If you look at What type of objects should I create? you will see that I recognise and work with only two types of object:
The use of invoice.pay(Amount) implies that invoice is an entity with an operation called pay(). Everybody should know that in the real world an invoice is an inert/inactive object. In other words, it cannot do anything of its own accord, it can only have things done to it. An invoice can be created, read, updated and deleted. It can be printed, converted into a PDF or CSV file. Each of these operations is done to an invoice and not by it. An invoice object should therefore not have a pay() method as this is not an operation that an invoice can perform. An invoice cannot pay itself, but it can be paid, or even unpaid, by an external object.
The use of pay.invoice(Amount) implies that pay is a service which can operate on an entity. This then opens up the possibility of using that service on other objects, such as pay.windowcleaner(Amount) and pay.employee(Amount).
As far as I am concerned both of these approaches are totally wrong as they require the use of special method names which then requires special controllers to call those methods. Having special controllers instead of reusable generic controllers then increases the amount of code which has to be written, debugged and then maintained, so in my book this is a bad thing and should be avoided. Remember that this is a database application which accesses objects in the database called tables, and the only operations that can be performed on a table are Create, Read, Update and Delete. I have designed and built several applications which deal with invoice payments, and they all functioned in exactly the same way:
Using my framework I would therefore create a transaction based on the ADD2 pattern. The generic ADD2 controller would then call the generic insertRecord() method on an object created from the PAYMENT class. In the _cm_post_insertRecord() method of this class I would insert code to call the updateRecord() method on an INVOICE object which would cause that object to recalculate its balance from the current contents of the PAYMENT table. What could be simpler than that?
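The hook sequence described above can be sketched in isolation. This is a hypothetical stand-in for the framework classes, with the balance recalculation faked using in-memory values rather than a real query on the PAYMENT table, so that only the control flow is shown:

```php
<?php
// Minimal sketch of the _cm_post_insertRecord / _cm_pre_updateRecord hooks
abstract class Default_Table
{
    public function insertRecord(array $row)
    {
        // ... validation and the INSERT query would happen here ...
        return $this->_cm_post_insertRecord($row);
    }
    public function updateRecord(array $row)
    {
        $row = $this->_cm_pre_updateRecord($row);
        // ... the UPDATE query would happen here ...
        return $row;
    }
    protected function _cm_post_insertRecord(array $row) { return $row; }
    protected function _cm_pre_updateRecord(array $row)  { return $row; }
}

class Invoice extends Default_Table
{
    public $balance = null;
    protected function _cm_pre_updateRecord(array $row)
    {
        // recalculate the balance (faked: total minus payments received)
        $this->balance = $row['invoice_total'] - $row['amount_paid'];
        return $row;
    }
}

class Payment extends Default_Table
{
    public $invoice;
    protected function _cm_post_insertRecord(array $row)
    {
        // after the PAYMENT record is inserted, tell the INVOICE to update itself
        $this->invoice = new Invoice;
        $this->invoice->updateRecord(array(
            'invoice_total' => $row['invoice_total'],
            'amount_paid'   => $row['amount'],
        ));
        return $row;
    }
}

$payment = new Payment;
$payment->insertRecord(array('invoice_total' => 100, 'amount' => 40));
// the INVOICE balance has been recalculated as a side effect of the insert
```

Note that neither class gains a special method name: the controller only ever calls insertRecord(), and everything specific to paying an invoice lives inside the customisable hooks.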
It is also possible to create new public methods in a class so that it can be called from another table class to perform some complex processing. Methods such as these cannot be called from any standard Controller. An example of this is where a PARTY can have one or more postal addresses, and where there may be different addresses for different purposes such as primary, billing and delivery. An invoice may need to show each of these three addresses. The business rules can be stated as follows:
The code required to lookup a Party's address for a particular purpose has to cater for all these possibilities, so this has been placed in special methods which can be called as follows:
$dbobject = RDCsingleton::getInstance('party');
$address1 = $dbobject->getPrimaryAddress($fieldarray['party_id'], $fieldarray['order_date']);
$address2 = $dbobject->getBillingAddress($fieldarray['party_id'], $fieldarray['order_date']);
$address3 = $dbobject->getDeliveryAddress($fieldarray['party_id'], $fieldarray['order_date']);
Notice here that as the dependent object can only come from a single class there is no advantage in using dependency injection, so I instantiate it just before I consume the service that it provides. By using a singleton I also avoid the overhead of creating a new instance should I need it more than once.
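A registry along these lines is easy to sketch. The class below is a hypothetical stand-in for the framework's RDCsingleton, showing only the behaviour relied upon above: repeated calls with the same class name return the same instance.

```php
<?php
// Hypothetical singleton registry: one shared instance per class name
class Singleton_demo
{
    private static $instances = array();

    public static function getInstance($class)
    {
        // reuse an existing instance to avoid the cost of re-creating it
        if (!isset(self::$instances[$class])) {
            self::$instances[$class] = new $class;
        }
        return self::$instances[$class];
    }
}

class Party { }

$a = Singleton_demo::getInstance('Party');
$b = Singleton_demo::getInstance('Party');
// $a and $b refer to the same object
```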
When I was researching how to implement the principles of OOP in 2002 I noticed in all the tutorials and code snippets that every attribute of an object was defined as a separate member variable (aka "property") which was then accessed by its own getter and setter methods. Although this was regarded as common practice I did not treat it as a hard and fast rule. The first thing that I noticed with PHP is that when a script receives a GET or POST request from the client it is presented as an associative array of name=value
pairs. If multiple rows are involved then this array is multi-level; the first level is indexed by row number, and each row has its own associative array with a separate entry for each column. I also noticed that when reading data from the database the result set can easily be passed back in the same format - a multi-level array, first indexed by row, then associative for every column in each row. I had already learned that arrays in PHP were very powerful and easy to use, so I asked myself a simple question:
If I am writing a database application, and a database deals with datasets containing any number of columns from any number of rows, and the PHP language can deal with any size of dataset in a single array, then why do I need to unpack an array into its constituent parts?
This means that when my Controller receives the $_POST array and wants to pass that data to the Model I have two choices:
<?php
$dbobject = new Person();
$dbobject->setUserID   ($_POST['userID']);
$dbobject->setEmail    ($_POST['email']);
$dbobject->setFirstname($_POST['firstname']);
$dbobject->setLastname ($_POST['lastname']);
$dbobject->setAddress1 ($_POST['address1']);
$dbobject->setAddress2 ($_POST['address2']);
$dbobject->setCity     ($_POST['city']);
$dbobject->setProvince ($_POST['province']);
$dbobject->setCountry  ($_POST['country']);
if ($dbobject->updatePerson($db) !== true) {
    // do error handling
}
?>
<?php
require_once "classes/$table_id.class.inc";  // $table_id is provided by the previous script
$dbobject = new $table_id;
$result = $dbobject->updateRecord($_POST);
if ($dbobject->errors) {
    // do error handling
}
?>
Those of you who know about coupling should see that the first of these two code snippets demonstrates tight coupling while the second demonstrates loose coupling. Tight coupling is considered to be bad while loose coupling is good.
Of the two choices I went for the second for the simple reason that it got the job done with the least amount of code. I then noticed that as well as not having to customise any Controller to hard-code the names of any table columns, I did not even have to hard-code the table/model name as it could be passed down from the calling script. Because every Controller communicates with the Model using methods which are defined within the abstract table class, and this abstract class is inherited by every concrete table class, this meant that any Controller could be used with any Model via the principle of polymorphism. The net result is that not only am I using less code, I actually created more reusable code. That must be good, right?
If you have a separate class property for each column in the table then it follows that the only way to put values into or get values out of the object is to use separate getters and setters for each column. I do not like this idea as it uses too much code and automatically creates tight coupling, which is supposed to be a bad thing. As explained previously an SQL database deals with data in sets which can contain any number of rows and any number of columns, which is why I have a single $fieldarray
member variable to hold the current dataset as an array. The entire array can be injected into the object as a single argument in a method call, as in $result = $object->insertRecord($_POST)
. Accessing a column value when it is contained within an array is not a problem, so instead of using $this->fieldname
you would use $this->fieldarray['fieldname']
instead. When an object's data is passed to the View object for processing it can easily be unpacked internally using code similar to the following:
foreach ($this->fieldarray as $fieldname => $fieldvalue) { .... } // foreach
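As a standalone sketch of the idea (the sample data and the use of SimpleXMLElement here are illustrative, not RADICORE's actual implementation), a View might walk the single data array and write it out as XML like this:

```php
<?php
// Hypothetical sketch of a View unpacking an object's data array.
// The data is a single array indexed by row number, then by column name.
$fieldarray = [
    ['firstname' => 'John', 'lastname' => 'Doe'],
    ['firstname' => 'Jane', 'lastname' => 'Smith'],
];

$xml = new SimpleXMLElement('<root/>');
foreach ($fieldarray as $rownum => $rowdata) {
    $row = $xml->addChild('row');
    foreach ($rowdata as $fieldname => $fieldvalue) {
        // each column becomes a child element named after the column
        $row->addChild($fieldname, htmlspecialchars((string)$fieldvalue));
    } // foreach
} // foreach

echo $xml->asXML();
```

Note that the View needs no knowledge of which columns exist; whatever is in the array gets written out.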
This just shows that there are several different methods by which data can be injected into or extracted from an object, each with its own set of pros and cons, so creating an artificial rule which says that only one of these methods is acceptable is, in my humble opinion, totally unacceptable. If there are several options available then I must be allowed to choose which one to use in my code.
This topic is also discussed in Don't use getters and setters for user data and Getters and Setters are EVIL.
There is another variation on the rule which states that the Model should never contain invalid data. The only golden rule in database applications is that the data must be validated BEFORE it is written to the database. This means that unvalidated data may be injected into the Model, but that the data MUST be validated before the INSERT or UPDATE query is generated.
All OO programmers should be aware that all business rules for an entity should be encapsulated in the class which models that entity, and data validation is part of those rules. Putting these rules somewhere else would therefore violate encapsulation.
Note that using constraints or triggers within the database to validate the data is a possibility, but one that I would never use because I have always operated on the premise that if an SQL query fails then it is a catastrophic failure which causes the program to terminate immediately. It has always been standard practice to perform data validation within the program code, and I am not going to change that practice for anyone.
This rule is a continuation of you must have a separate class property for each table column and you must validate all data before it is put into the Model
Validating each value within its setter makes it difficult to perform cross-validation with other values. Some people prefer to use a separate validate() method, or even have the validation performed within the Controller instead of the Model, but since day one I have performed all validation using a method which does not have to be called from the Controller.
Other programmers choose to have separate public methods called load(), validate() and store(). This is not a good idea as it allows for more data to be inserted after the validate() has been performed, which could lead to errors during the store(). In my framework I do not treat these as separate operations as they must always be executed together and in a particular sequence. In other words they form a group operation in which they are separate steps within that operation. If you look at either insertRecord() or updateRecord() the load() is performed by passing all the data in as an input argument while the validate() and store() are performed internally. Note that the store() method is only called if the validate() method does not detect any errors. For fans of design patterns this is an example of the Template Method Pattern where the abstract class contains all the invariant methods and allows variable/customisable methods to be defined within individual subclasses.
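The Template Method arrangement described above can be sketched as follows. This is a minimal illustration, not the framework's actual code: the class bodies are hypothetical, although the method names follow the article's naming.

```php
<?php
// Hypothetical sketch of the Template Method approach: load, validate
// and store always execute together, in sequence, inside one method.
abstract class AbstractTable
{
    public $errors = [];
    public $fieldarray = [];

    // The invariant "template method" defined once in the abstract class.
    public function insertRecord(array $fieldarray)
    {
        $this->fieldarray = $fieldarray;                         // load()
        $this->errors = $this->_cm_validateInsert($fieldarray);  // validate()
        if (empty($this->errors)) {
            $this->store();             // store() only runs if validation passed
        }
        return $this->fieldarray;
    }

    // "Hook" method which a concrete subclass may override with custom rules.
    protected function _cm_validateInsert(array $fieldarray): array
    {
        return [];  // no extra validation by default
    }

    protected function store(): void
    {
        // in a real framework this would generate and execute the INSERT query
    }
}

class Person extends AbstractTable
{
    protected function _cm_validateInsert(array $fieldarray): array
    {
        $errors = [];
        if (empty($fieldarray['lastname'])) {
            $errors['lastname'] = 'Last name is required';
        }
        return $errors;
    }
}

$dbobject = new Person();
$dbobject->insertRecord(['firstname' => 'John']);  // fails validation
if ($dbobject->errors) {
    // do error handling
}
```

Because there is no separate public load() or validate() to call, no data can sneak in between validation and storage.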
My controllers which handle INSERT operations use the following code:
require_once "classes/$table_id.class.inc";  // $table_id is provided by the previous script
$dbobject = new $table_id;
$result = $dbobject->insertRecord($_POST);
if ($dbobject->errors) {
    // do error handling
} // if
My controllers which handle UPDATE operations use the following code:
require_once "classes/$table_id.class.inc";  // $table_id is provided by the previous script
$dbobject = new $table_id;
$result = $dbobject->updateRecord($_POST);
if ($dbobject->errors) {
    // do error handling
} // if
If you examine what happens with the insertRecord() and updateRecord() methods you should see that these methods have their processing broken down into several steps, one of which is either validateInsertPrimary() or validateUpdatePrimary(). This will use the standard validation object to verify that the field values in $this->fieldarray
conform to the specifications found in $this->fieldspec. Additional validation can be performed in one of the _cm_commonValidation(), _cm_validateInsert() and _cm_validateUpdate() methods. Note that if any of these methods puts anything into the $this->errors array then the database update will not be performed. Notice also that I do not use exceptions, which then allows more than one error message to be returned.
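The benefit of an errors array over exceptions can be shown with a hypothetical standalone sketch. The function below is illustrative (it is not the framework's actual validation object), but it demonstrates how every failure is collected and returned in one pass, whereas a thrown exception would stop at the first one:

```php
<?php
// Hypothetical sketch: accumulate every validation failure in an array
// instead of throwing an exception on the first one.
function validateRecord(array $fieldarray, array $fieldspec): array
{
    $errors = [];
    foreach ($fieldspec as $fieldname => $spec) {
        $value = $fieldarray[$fieldname] ?? '';
        if (!empty($spec['required']) && $value === '') {
            $errors[$fieldname] = "$fieldname is required";
        }
        if (isset($spec['maxsize']) && strlen($value) > $spec['maxsize']) {
            $errors[$fieldname] = "$fieldname exceeds {$spec['maxsize']} characters";
        }
    } // foreach
    return $errors;  // may contain any number of messages
}

$fieldspec = [
    'firstname' => ['required' => true, 'maxsize' => 30],
    'lastname'  => ['required' => true, 'maxsize' => 30],
];
$errors = validateRecord(['firstname' => ''], $fieldspec);
// both fields fail, and both messages come back together,
// so the user can be shown every problem on the screen at once
```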
This is a restriction that results from having a separate class variable for each column in the database. Each variable can hold only a single value, therefore the object can only deal with a single database row.
Programmers who are familiar with SQL know that queries deal with data sets and not individual columns. Thus a SELECT query can return any number of columns from any number of rows from any number of tables. By not taking advantage of this feature the programmer will not only be making extra work for himself, he will actually be making a rod for his own back.
That is why I use a single array variable to hold an object's data. The array can start by being indexed by row number, then associative to identify a value for each column name. By having all the data presented in a single array variable it therefore means that whenever I need to pass data from one object to another I can do it with a single array variable instead of one column at a time. This then means that I have achieved loose coupling instead of tight coupling, which is supposed to be a Good Thing (™).
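The shape of such an array can be illustrated as follows (the table and column names here are made up for the example):

```php
<?php
// Illustrative shape of a single data array holding an SQL result set:
// indexed numerically by row, then associatively by column name.
$fieldarray = [
    0 => ['user_id' => 1, 'city' => 'London',   'country' => 'UK'],
    1 => ['user_id' => 2, 'city' => 'Paris',    'country' => 'FR'],
    2 => ['user_id' => 3, 'city' => 'New York', 'country' => 'US'],
];

// The whole result set moves between objects as one argument ...
$rowcount = count($fieldarray);   // 3 rows
// ... yet any single value is still directly addressable.
$city = $fieldarray[1]['city'];   // 'Paris'
```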
For those of you who think that this idea is absolute heresy I suggest you take a look at the Table Module which is part of Martin Fowler's Patterns of Enterprise Application Architecture (PoEAA).
The Controller and Model are components in the Model-View-Controller design pattern.
I don't know who came up with this idea, but he was clearly a few sandwiches short of a picnic. Perhaps in all the descriptions or samples he read he could only see a reference to a single model and automatically assumed that it meant that there could only ever be a single model. How naive. How restrictive. This is another stupid rule that I have been ignoring for over a decade simply because I did not know it existed. Even if I had known I would have ignored it simply because there is no logical reason why such a restriction should exist. When I was developing my enterprise application I wanted a controller which accessed more than one model, so I wrote one. It worked without any issues, so what could possibly be the problem? It also has an advantage in that it allows each Model to have its own set of scrolling or pagination controls.
If you look at OOP for Heretics - Figure 9 you will see that a Sales Order has data which is spread across several different tables which are linked in a number of one-to-many or parent-to-child relationships. Some OO programmers will attempt to create a single Sales Order class which contains references to all these tables as properties within the object, but this approach is on my "to be avoided at all costs" list. Instead I have a separate class for each table so that I can easily build a family of forms to perform basic maintenance on that table from my collection of standard Transaction Patterns. When I want to create a user transaction which needs to show data from more than one table I am not constrained by the artificial rule that a Controller can only speak to one Model, so I create a user transaction from my LIST2 pattern which can deal with any pair of tables which are linked in a parent-child relationship. Note that this controller uses the standard generic methods which are inherited from the abstract table class to access each table, so there is no need to create custom methods for each relationship.
Those of you who have been taught the, ahem, proper approach will not choose that simple path. Oh no. You will prefer something much more complex just to prove how clever you are. You will be forced to use a controller which can only access a single model, so you will be forced to use composite objects where it will not be possible to get at the child table without going through the parent. This then means that the model will have to contain special methods in order to identify which table needs to be accessed, and with what arguments. This in turn means that the controller will have to be modified to call those methods, which then ties that controller to that model. Why is this a bad idea? Because it makes that modified controller unsharable. It introduces tight coupling between that Controller and that Model. This adds to the amount of code which has to be debugged and maintained, and reduces the amount of code which can be shared. This completely wipes out the benefits of using OOP, so why do you morons keep doing it? Answers on a postcard please to .....
In my framework there is loose coupling as any of my standard Controllers can be used with any of my Model objects. This is because each of my Controllers, which belongs to one of my Transaction Patterns, communicates with the Model through methods which are inherited from my abstract table class.
This topic is also discussed in Criticism #4 of my implementation of MVC.
This continues on from the previous rule which I ignore. It means that when you have separate user transactions to browse, create, enquire, update and delete rows in a table all these requests are routed through the single controller that handles that table. An example can be found in GRASP - The Controller which states:
A use case controller should be used to deal with all system events of a use case, and may be used for more than one use case (for instance, for use cases Create User and Delete User, one can have a single UserController, instead of two separate use case controllers).
That controller must then contain code which forces it to behave differently depending on which of those actions was requested. I use the term "modes" as the program would be running in either browse mode, enquire mode, create mode, or whatever mode. It would then need to remember which mode it was currently running in so that it only executed the code which was relevant to that mode. Executing the wrong code or not executing the right code has been known to produce unexpected results.
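The kind of mode switch being described might look like the following. This is an illustrative sketch, not code from any real program; the mode names follow the article but the structure and return values are hypothetical:

```php
<?php
// Illustrative sketch of the single-controller "mode" switch described
// above, where one component must remember which mode it is running in.
function handleRequest(string $mode, array $post): string
{
    switch ($mode) {
        case 'browse':  return 'list of rows';
        case 'create':  return 'row inserted';
        case 'enquire': return 'single row';
        case 'update':  return 'row updated';
        case 'delete':  return 'row deleted';
        default:
            // executing the wrong code (or none at all) is where the
            // unexpected results come from
            return 'unknown mode';
    }
}

$result = handleRequest('enquire', []);
```

Splitting this into one small component per mode removes the switch, and the risk, entirely.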
Just because someone created an example which performed all those actions/modes in a single program, everyone who saw it erroneously assumed that it showed the only way to do it, that it was a rule that it must be done that way. How naive. How restrictive. Being an experienced developer I have learned otherwise. In my early COBOL days it was standard practice to have one program which could do everything to a database table, which required a mode
switch in order to identify which of the browse, create, enquire, update or delete modes was currently being processed. This complicated the source code and caused problems, so when I had the opportunity to implement a different approach I decided to break this large component into smaller units so that each one handled only one of those modes. This approach is documented in Component Design - Large and Complex vs. Small and Simple. This approach provides the following characteristics:
Another advantage to this approach was that it made it far easier to implement Role Based Access Control as it was possible to grant access to a transaction which performed a single mode rather than grant access to one of several modes within a transaction. The access checking could then be performed by the framework before the program was called instead of within the program after it was called.
Having a separate transaction for each mode made it easier for me to utilise the feature known as component templates in UNIFACE. I improved on this idea when I switched to PHP by creating a catalog of Transaction Patterns. I created a controller script for each pattern, and due to my heretical implementation of Dependency Injection and my use of polymorphism it turned out that each of my controllers could be reused with any of my 450 models. To put this into perspective, my main enterprise application currently has over 4,000 user transactions, and each one of those shares one of my 40+ controller scripts. Do you have that level of reusable code in your application?
If you are one of those blind-as-a-bat and dumb-as-an-ox rule-followers who knows nothing other than whatever drivel you have been taught then you will have to create a hand-written controller for each user transaction. This adds to the amount of code which has to be written, debugged and maintained, and reduces the amount of code which can be shared. This completely negates the benefits of using OOP, so why do you morons keep doing it? Answers on a postcard please to .....
I have seen this idea documented in several well-known frameworks where it describes how to create simple CRUD transactions, and I am amazed to see how much code has to be written in order to achieve the desired result. Take a look at the following examples:
The idea of creating each Model class by hand, with its unique set of properties and methods, is something that I abandoned when I built a Data Dictionary into my framework. If I need to create a Model class for a new database table I use an automatic process, not a manual one:
Note that the generated class file initially contains nothing but a constructor as all default methods and properties are inherited from the abstract table class.
If each Model has customised instead of generic methods, it will then need a customised Controller in order to call those methods on that Model. A customised controller is not reusable, so this goes against one of the aims of OOP which is to increase reusability.
If each Model uses the same set of generic methods, such as those which are inherited from my abstract table class, then it will be possible to use generic controllers which can call those generic methods on any Model using the OO feature known as polymorphism. If I have 40 Controllers which can be used with any of my 450 Model classes then that gives 18,000 (40 x 450) opportunities for polymorphism. Can your framework match that?
In order for a generic Controller to do its stuff with an unknown Model there needs to be some mechanism which allows the identity of the Model class to be passed to the Controller. In my framework this is done using a form of dependency injection where the identity of the Model is loaded into a global variable in a component script such as the following:
<?php
$table_id = "person";                    // identify the Model
$screen   = 'person.detail.screen.inc';  // identify the View
require 'std.enquire1.inc';              // activate the Controller
?>
The last line in this script activates the controller which contains code such as the following to load in the relevant class file and instantiate an object from that class:
require "classes/$table_id.class.inc";
$dbobject = new $table_id;
The generic Controller is then able to call any of the generic methods on that unknown Model by virtue of the fact that all those generic methods are inherited by each Model class from the abstract table class, as shown in the following:
$fieldarray = $dbobject->getData($where);
$fieldarray = $dbobject->insertRecord($_POST);
$fieldarray = $dbobject->updateRecord($_POST);
$fieldarray = $dbobject->deleteRecord($_POST);
Each output format - HTML, CSV and PDF - has its own pre-written class which is built into the framework. Each individual task, which is built on a Transaction Pattern, will reference only one of these View objects.
The HTML View object will require a small screen structure script which is initially generated by the framework, but which can be modified at will. Each of these files identifies which XSL stylesheet is to be used, and which columns from that Model need to be displayed on the screen, and where. The object will then extract all the data from the single data array in the Model and write it out to an XML document. It will also copy the column data from the screen structure file into the XML document so that a small number of reusable stylesheets can be used instead of having to create a customised stylesheet for each screen. The View object finally performs an XSL transformation to generate the final HTML document which is returned to the client's browser.
The PDF View object will require a small report structure script which is initially generated by the framework, but which can be modified at will.
I do not have a separate DAO class for each table, only for each supported DBMS - currently MySQL, PostgreSQL, Oracle and SQL Server. This handles all communication with the physical database. The abstract table class, which is inherited by every Model class, contains all the relevant methods to instantiate and pass control to the relevant DAO as and when necessary.
The need for an ORM only occurs when you encounter the problem known as Object-Relational Impedance Mismatch. This problem only arises when you use two different methodologies to design different parts of the same software application, and these produce incompatible results. An intelligent person should recognise that in this situation it would be better to drop the least reliable of the two methodologies and therefore produce a single design which does not have any incompatibilities. Instead of creating a problem which requires a "solution" in the form of additional complex code, wouldn't it be easier to eliminate the problem in the first place and then NOT have to implement that abomination called an ORM? Every experienced database programmer already knows that the software should follow the database design, so I'm afraid that it's OOD that should get the elbow. Good riddance, I say! An intelligent programmer should see that being able to achieve a result with less code should be regarded as a Good Thing (™) and therefore should be encouraged. Can you explain to me why it isn't?
Some people use an ORM simply because it offers an OO method of creating SQL queries, such as those described in php.activerecord - finders. I do not waste my time writing objects to manipulate strings when the PHP language has built-in functions to do just that. After using SQL for five minutes it became immediately obvious to me that although an SQL query is a single string, it is actually made up of a series of distinct substrings. In each of my table classes I therefore hold each of these substrings as a plain string, as shown below:
$this->sql_select  = '...';
$this->sql_from    = '...';
$this->sql_groupby = '...';
$this->sql_having  = '...';
$this->sql_orderby = '...';
$this->setRowsPerPage(10);  // used to calculate LIMIT
$this->setPageNo(1);        // used to calculate OFFSET
This then enables me to examine and modify any of these substrings at will, and it is only after they have been passed to my Data Access Object (DAO) that they are assembled into a single query which can be executed. Once I know what query I wish to execute, after testing it in my database client program, it is therefore very easy to get the software to generate the same query. I do not have to learn the ORM's query language in order to generate genuine SQL.
Note also that I do not waste my time with collections of finder methods. These do not exist in SQL as it only ever needs a general-purpose WHERE clause in a SELECT statement, so why should they exist in the software? I have a single generic getData($where) method where the single argument is a single string which can contain anything which is valid in an SQL query. Because it is generic I can use it on any table object to retrieve whatever data I like using whatever filters (selection criteria) I like.
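The assembly step can be sketched as a plain function. This is a hypothetical illustration of the idea, not the actual DAO code: buildQuery() and its argument shape are invented for the example, although the substrings match those listed earlier.

```php
<?php
// Hypothetical sketch of a DAO assembling the separate substrings,
// held as plain strings on the table object, into one SELECT statement.
function buildQuery(array $parts, string $where = ''): string
{
    $sql = 'SELECT ' . $parts['select'] . ' FROM ' . $parts['from'];
    if ($where !== '') {
        $sql .= ' WHERE ' . $where;   // the generic, anything-goes WHERE clause
    }
    if (!empty($parts['groupby'])) {
        $sql .= ' GROUP BY ' . $parts['groupby'];
    }
    if (!empty($parts['having'])) {
        $sql .= ' HAVING ' . $parts['having'];
    }
    if (!empty($parts['orderby'])) {
        $sql .= ' ORDER BY ' . $parts['orderby'];
    }
    return $sql;
}

$parts = [
    'select'  => 'person_id, last_name',
    'from'    => 'person',
    'groupby' => '',
    'having'  => '',
    'orderby' => 'last_name',
];
$sql = buildQuery($parts, "country = 'UK'");
```

Because each substring can be inspected and modified right up until assembly, the query you tested in your database client is the query the software generates.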
There is a big difference between "using" design patterns and over-using them to such an extent that they become an obsession and get in the way of writing effective code. Too many programmers are taught to study a particular catalog of design patterns (yes, there is more than one, which adds to the problem), choose as many patterns as they think will be necessary, then design their code around those patterns. This is a BIG mistake. That's not just MY opinion, it is also the opinion of Erich Gamma, one of the authors of the GOF book, who in the article How to use Design Patterns said the following:
Do not start immediately throwing patterns into a design, but use them as you go and understand more of the problem. Because of this I really like to use patterns after the fact, refactoring to patterns.
One comment I saw in a news group just after patterns started to become more popular was someone claiming that in a particular program they tried to use all 23 GoF patterns. They said they had failed, because they were only able to use 20. They hoped the client would call them again to come back again so maybe they could squeeze in the other 3.
Trying to use all the patterns is a bad thing, because you will end up with synthetic designs - speculative designs that have flexibility that no one needs. These days software is too complex. We can't afford to speculate what else it should do. We need to really focus on what it needs. That's why I like refactoring to patterns. People should learn that when they have a particular kind of problem or code smell, as people call it these days, they can go to their patterns toolbox to find a solution.
The GOF book actually contains the following caveat:
Design patterns should not be applied indiscriminately. Often they achieve flexibility and variability by introducing additional levels of indirection, and that can complicate a design and/or cost you some performance. A design pattern should only be applied when the flexibility it affords is actually needed.
This sentiment is echoed in the article Design Patterns: Mogwai or Gremlins? by Dustin Marx:
The best use of design patterns occurs when a developer applies them naturally based on experience when need is observed rather than forcing their use.
This habit of (theoretically) solving a problem by adding on another layer of indirection or abstraction is often over-used. A quote usually attributed either to David Wheeler or Butler Lampson reads as follows:
There is no problem in computer science that cannot be solved by adding another layer of indirection, except having too many layers of indirection.
In his article Protected Variation: The Importance of Being Closed (PDF) the author Craig Larman makes the following observation regarding over-engineering:
We can prioritize our goals and strategies as follows:
- We wish to save time and money, reduce the introduction of new defects, and reduce the pain and suffering inflicted on overworked developers.
- To achieve this, we design to minimize the impact of change.
- To minimize change impact, we design with the goal of low coupling.
- To design for low coupling, we design for PVs.

Low coupling and PV are just one set of mechanisms to achieve the goals of saving time, money, and so forth. Sometimes, the cost of speculative future proofing to achieve these goals outweighs the cost incurred by a simple, highly coupled "brittle" design that is reworked as necessary in response to true change pressures. That is, the cost of engineering protection at evolution points can be higher than reworking a simple design.
If the need for flexibility and PV is immediately applicable, then applying PV is justified. However, if you're using PV for speculative future proofing or reuse, then deciding which strategy to use is not as clear-cut. Novice developers tend toward brittle designs, and intermediates tend toward overly fancy and flexible generalized ones (in ways that never get used). Experts choose with insight - perhaps choosing a simple and brittle design whose cost of change is balanced against its likelihood. The journey is analogous to the well-known stanza from the Diamond Sutra:
Before practicing Zen, mountains were mountains and rivers were rivers.
While practicing Zen, mountains are no longer mountains and rivers are no longer rivers.
After realization, mountains are mountains and rivers are rivers again.
The following comment from "quotemstr" in Why bad scientific code beats code following "best practices" puts it much more succinctly:
It's much more expensive to deal with unnecessary abstractions than to add abstractions as necessary.
By following the KISS principle and YAGNI I only write code that is necessary and in as few lines as possible. Code that belongs together I keep together, thus producing high cohesion. I keep my APIs as simple as possible to avoid the ripple effect when changes become necessary, thus producing loose coupling. I don't use unnecessary abstractions, and my entire PHP framework can be described in a simple one-page diagram. Can yours? I designed this framework around the 3-Tier Architecture, not because I had read about it in a pattern book, but because I had encountered it in a previous language and I instantly saw its benefits. When I showed my code to a colleague he remarked that it also contained an implementation of the Model-View-Controller design pattern, but that was purely by accident and not by design (not by design. Geddit? It's a play on words! Oh, never mind.) The only pattern I have ever read about and then implemented is the Singleton.
Having built my framework around two simple yet effective architectural patterns I set about building an enterprise application. In such an application the database design is absolutely critical, so rather than reinventing the wheel I used some of the designs in Len Silverston's Data Model Resource Book. As soon as I saw these designs I recognised immediately the power and flexibility that they provided, so I built the databases, imported the schemas into my Data Dictionary, exported each table to produce a separate class file, then generated basic user transactions from my catalog of Transaction Patterns. All I had to do in order to convert a basic transaction into something more complex because of business rules was to add the relevant code into the relevant customisable method of the relevant class file. This application currently has 450 tables, 1,300 relationships and 4,400 user transactions, and is being sold to large multi-national corporations all over the world. This was achieved without the use of OOD and without the use of design patterns. I do not bother using design patterns to build individual application components as I consider them to be useless on two counts:
With design patterns it is not possible to say "combine this pattern with that class and create a working transaction". It IS possible with Transaction Patterns, which is why I think that they are superior. Each of my 40+ patterns uses a pre-written and therefore reusable Page Controller. The use of pre-written and reusable code means that I have less code to write, debug and maintain. My main enterprise application has over 2,800 user transactions, and every one of those was built from one of those patterns, so that is a huge amount of reusability. Can you imagine how much extra effort it would have taken to manually construct a separate Controller for each of those 2,800 transactions? By not having to expend that effort I saved time and money, which benefits my customers as I can provide them with working applications much sooner and much cheaper than my competitors. This may be news to some people, but the objective of a software developer should be to develop cost-effective solutions for their paying customers, and not to follow crazy philosophies dreamt up by other developers.
These principles were badly written as they do not provide clear, concise, accurate and unambiguous definitions. This has led to them being redefined and reinterpreted so many times that the original idea has been totally lost. It is therefore impossible to follow one of these interpretations without upsetting the supporters of the others. This is a lose-lose situation as whatever you do will be perceived as wrong by someone somewhere.
I do actually follow those which I deem relevant, but only up to a point.
This is a totally stupid idea. If I am writing a database application then why should I hide from the application the fact that it is communicating with a database? If I am writing a missile guidance system then why should I hide from the application the fact that it is guiding missiles? If I am writing an elevator control system then why should I hide from the application the fact that it is controlling the movement of elevators? The software should be designed to fit the business problem it is supposed to solve, and there is no single design philosophy which can be used for all types of problem.
Some naive people argue that the implementation of the persistence layer may change, but is this true? A database application will ALWAYS communicate with a database, and the only possible change will be the DBMS. I currently have a DAO class for MySQL, PostgreSQL, Oracle and SQL Server, and the relevant class is instantiated into an object at runtime. If communication with another DBMS is required then I will write a new class for that DBMS. Note that this would NOT require any changes to any existing code.
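A minimal sketch of that runtime selection might look like this. The class names, the interface, and the getDao() helper are all invented for illustration; they are not the framework's actual API, but they show how adding a new DBMS means adding a class without touching any caller:

```php
<?php
// Hypothetical sketch of selecting the DAO class at runtime from a
// configured DBMS name; all names here are illustrative.
interface DataAccessObject
{
    public function query(string $sql): array;
}

class DaoMysql implements DataAccessObject
{
    public function query(string $sql): array { /* talk to MySQL */ return []; }
}

class DaoPostgresql implements DataAccessObject
{
    public function query(string $sql): array { /* talk to PostgreSQL */ return []; }
}

function getDao(string $dbms): DataAccessObject
{
    // supporting a new DBMS means writing a new DaoXxx class;
    // this caller, and every Model above it, stays unchanged
    $class = 'Dao' . ucfirst(strtolower($dbms));
    return new $class();
}

$dao = getDao('mysql');
```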
I think this idea comes from those who are fanboys of OOD but who are totally ignorant about relational databases and how they work. It may also be an unnecessary continuation of the situation which arose when relational databases first appeared on the scene and needed to be accessed by the existing teams of programmers. Relational databases and the SQL language were brand new and understood by relatively few, so until the masses could be trained in the "black art" of SQL it was common practice that the untrained programmer would write SQL-agnostic code and call a separate module, written by an SQL guru, which did the business. That situation continued until every programmer had been trained in SQL, but some people don't seem to have got the memo that the situation has changed and still think that it exists in the 21st century. Get real, guys. Any programmer writing a database application without knowing SQL is as useless as a web programmer who doesn't know HTML. I build database applications, and the code that I write is ALWAYS aware that it is talking to a database. It has been that way for 30+ years, and I'm not changing that on the say-so of a few jumped-up, wet-behind-the-ears, tenderfoot, greenhorn, know-nothing newbies.
Here are some links to some of the criticisms that I have received, along with my responses:
I continue to ignore these people and their stupid ideas, and I believe that my software is all the better because of it.
For a list of what I regard as "optional extras" please take a look at A minimalist approach to Object Oriented Programming with PHP.
I am constantly being told by my critics that I am lagging behind all the other developers by not refactoring my code to incorporate each new feature, function or capability as it becomes available in the language. I am not being "with it", I am not being "cool", I am not writing code which is "awesome".
I am a pragmatic programmer. I see my job as developing effective software to impress my paying customers by its ability to satisfy their requirements in a cost-effective manner, and not to impress my fellow developers by its use of complex code which follows the latest fashion. I have used quite a few different languages in my long programming career, and with every language it was never an objective to find a way to utilise every feature of the language, only those features which were necessary to complete the task at hand. Whenever a new version of the language was released I would read the documentation to see what new features it contained and to ascertain whether or not it was worthwhile to add any of those features to my code. If I considered that a new feature would not add value, or its only purpose was to solve a problem which I did not have, then I would ignore it.
I developed my framework using PHP 4, and it worked in a very satisfactory fashion. When PHP 5 was released, with its additional OO features, I could not use them as I still had PHP 4 customers to support. I managed to change my code so that it ran in both PHP 4 and PHP 5, which meant that my customers could upgrade at will without having to change a thing. Due to the slow adoption rate of PHP 5 I never bothered to upgrade my code to use any of the new features as I did not want to lock out any of my customers who were still using PHP 4. I used my framework to start building an enterprise application, first as a bespoke application, then later as a package which is still going strong and is being sold to large multi-national corporations all over the world via my partnership with Geoprise Technologies.
I did actually read about these new additions to the language just to see what the fuss was about, but rather than jumping in and changing my code I performed what is known as a cost-benefit analysis. I looked at each feature to determine the following:
Without exception the costs always outweighed the benefits, so I decided to take the pragmatic approach and completely ignore the new features. My application still did what it was supposed to do and was easy to upgrade and maintain, so I did not see the point of spending my valuable time in an exercise which had zero benefit to my paying customers. I chose instead to spend that time in doing things that my customers were actually willing to pay for.
If you look carefully at some of those new features you will see that they were added as a solution to a particular problem, but if my codebase does not have that particular problem then my view is that I do not need to implement that solution. My original codebase now runs under PHP 7, so if it still works then why change it? If it ain't broke then why fix it?
When developing a database application using OOP it is worth bearing in mind the following:
I developed that approach to software development using the OO capabilities of PHP in 2002/2003, and I still use it today. Why? Because it works, it is simple, it is quick to build, it is easy to maintain, and it is easy to extend. I did not incorporate all those other "rules" or "principles" as they did not exist way back then. Now that I do know that they exist I refuse to use them as they would not add value to my software, only complexity.
My critics keep telling me that all these optional extras are fundamental to OOP, but I disagree. It is they who have lost the plot, not me. The following statement shows both the what and the why of OOP - what it is and why it should be used in preference to previous paradigms:
Object Oriented Programming is programming which is oriented around objects, thus taking advantage of Encapsulation, Inheritance and Polymorphism to increase code reuse and decrease code maintenance.
As far as I am concerned this means that Object Oriented Programming is no different from procedural programming, except for the addition of Encapsulation, Inheritance and Polymorphism.
As a developer who had used non-OO paradigms for the previous two decades I expected to be able to develop similar applications with less effort because of the promise of more ways to access reusable code. I build nothing but database applications, and I was able to implement a typical family of forms in COBOL within one week, which with UNIFACE became one day, and with PHP was reduced to five minutes. Note that I also developed my own frameworks in each of these languages. I was able to achieve this leap in productivity simply by using each of the primary features of OOP in a pragmatic fashion. This level of reusability has also contributed to a high level of maintainability. For over a decade I have been enhancing and adding to the functionality within my abstract table class, which has enabled me to enhance the processing within every user transaction. I have also been enhancing the functionality within my Page Controllers, which then affects every user transaction which uses that controller. So, far from devolving into a Big Ball of Mud, my code is still lean, mean and as fresh as a daisy.
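The combination of an abstract table class and reusable page controllers described above can be sketched like this. This is a minimal illustration of the pattern, assuming hypothetical class and method names; it is not the actual RADICORE implementation, merely a demonstration of how encapsulation, inheritance and polymorphism combine to produce that level of reuse.

```php
<?php
// Sketch: shared behaviour encapsulated once in an abstract table
// class; each database table gets a thin subclass (inheritance);
// a page controller works on ANY subclass (polymorphism).

abstract class Default_Table {
    protected $tableName;

    // reusable logic written once, inherited by every table class
    public function getData($where) {
        return $this->loadRows("SELECT * FROM {$this->tableName} WHERE $where");
    }

    protected function loadRows($sql) {
        // the real version would pass the query to the DAO;
        // stubbed here so the sketch is self-contained
        return array(array('sql' => $sql));
    }
}

// each concrete table class need only identify its table
class Customer extends Default_Table {
    protected $tableName = 'customer';
}

class Product extends Default_Table {
    protected $tableName = 'product';
}

// a page controller knows only the abstract type, so one controller
// can service every table class in the application
function listController(Default_Table $table) {
    return $table->getData('1=1');
}
```

Enhancing `Default_Table` or `listController()` then ripples out to every table class and every transaction that uses them, which is the maintainability gain claimed above.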
I have clearly demonstrated that my software has loose coupling and high cohesion, which is supposed to be a sign of a proper implementation of OOP. I am clearly achieving the objectives of OOP while they are not, so why is my work described as "bad" while theirs is "good"? If the answer is "Because you are not following the rules!" then I can only respond with the following question:
If I can write better software by not following your rules, then what does it say about the efficacy of those rules?
There appear to be too many programmers out there who are incapable of achieving the objectives of OOP using their own intellect, so they rely on guidance from the more experienced of their programming brethren. Unfortunately this "guidance" has emerged as a series of rules or principles which have been so badly written that they have been open to huge amounts of over-interpretation and mis-interpretation. The novice programmer then follows all these rules with the expectation that the resulting software will be acceptable. However, they do not yet understand the distinction between "following the rules" and "achieving the objectives". As a pragmatic programmer I am results-driven, and I will ignore any artificial rule which gets in the way. The dogmatic programmer, on the other hand, is rules-driven and is totally incapable of recognising "achieving the objectives" even if it crawled up his leg and bit him in the a*se. All he can see is that I have broken his precious rules, and because of that he automatically assumes that my results are wrong. Until he is prepared to put his rules aside and compare his results with mine he will never be able to achieve results as good as mine.
As far as I am concerned any OO programmer who is incapable of writing effective software using nothing more than encapsulation, inheritance and polymorphism, without the use of any of those frivolous and optional extras, is just that - incapable, bordering on incompetent, and - judging by the amount of crap they can't help producing - helplessly incontinent.
Here endeth the lesson. Don't applaud, just throw money.
The following articles describe aspects of my framework:
The following articles express my heretical views on the topic of OOP:
These are reasons why I consider some ideas on how to do OOP "properly" to be complete rubbish:
Here are my views on changes to the PHP language and Backwards Compatibility:
The following are responses to criticisms of my methods:
Here are some miscellaneous articles:
05 Mar 2017 | Added you must use mock objects. Added you must create a class diagram. Added you must use immutable objects. Added your class constructors must be empty.
01 Feb 2017 | Added more entries to There is no rule which says that ....