I recently came across a blog post by Kevin Smith entitled What's so great about OOP? in which he, as a senior software engineer, attempted to answer the following questions posed by some new interns:
He wanted to find a fundamental explanation, but he said that nearly everything he came across immediately jumped into SOLID, design patterns or advanced concepts of some kind. Rather than starting with such advanced topics he wanted to go back to basics, and he based his answers on the conversion of some code from its original procedural style into the so-called "better" OO style. In my humble opinion his answers fell short of providing simple answers to those simple questions, and so were a failure. By simply taking a small program and converting it from procedural to object oriented, while removing the inefficient code along the way, he left his students with the task of deducing the answers by examining what he had done. In my book this is not good enough. He should have provided the answers so that the students could have rewritten the code themselves. Seeing a particular solution from which you have to guess the principles on which it was built is not the same as being given the principles on which you can build your own solutions.
I have been reading about OO theory and creating implementations of that theory since 2002, and while browsing the internet I have come across a huge number of articles in which different people have given different interpretations of what OO means and how it can be implemented. These articles usually express themselves in the format "This is a rule, and this is how it should be implemented" or "if you don't do it this way then you are not a proper OO programmer". I have seen too many misinterpretations of perfectly valid rules, and I have also seen a proliferation of artificial and imagined rules. Fortunately I have enough experience so that I am able to separate the wheat from the chaff, and I have summed up my opinions of all this chaff in Look at how many rules I break, Your code is crap! and Your rules are RUBBISH!
While the author did attempt to answer the three questions, it is my opinion that the answers glossed over the basics with airy-fairy language which had the effect of hiding the fundamental principles under a layer of indirection. Below I have extracted various statements from his article and then commented on those statements.
He attempts to define OOP in the following manner:
Object-oriented programming is a way of building software that encapsulates the responsibilities of the system in a community of collaborating objects. An object consists of information (properties of the object) and behavior (methods of the object that operate on the object's information) that work together to fulfill the object's role in the system.
In my view this answer is incomplete. It uses the word "encapsulation" in a totally misleading manner. It is not just about "providing a community of collaborating objects" as any modular system written in a non-OO language can be about "providing a community of collaborating modules". What is the difference between a newfangled "object" and an old fashioned "module"? What can you do with one that you cannot do with the other? The most basic definition of OOP that I prefer to use is as follows:
Object Oriented Programming is programming which is oriented around objects, thus taking advantage of Encapsulation, Inheritance and Polymorphism to increase code reuse and decrease code maintenance.
Note that, as explained in What OOP is, it is not possible for a language to call itself "object oriented" unless it directly supports encapsulation, inheritance and polymorphism.
Encapsulation | The act of placing data and the operations that perform on that data in the same class. The class then becomes the 'capsule' or container for the data and operations. This binds together the data and the functions that manipulate the data. More details can be found in Object-Oriented Programming for Heretics.
Inheritance | The reuse of base classes (superclasses) to form derived classes (subclasses). Methods and properties defined in the superclass are automatically shared by any subclass. A subclass may override any of the methods in the superclass, or may introduce new methods of its own. More details can be found in Object-Oriented Programming for Heretics.
Polymorphism | Same interface, different implementation. The ability to substitute one class for another. This means that different classes may contain the same method signature (which is not the same as an object interface), but the result which is returned by calling that method on a different object will be different as the code behind that method (the implementation) is different in each object. More details can be found in Object-Oriented Programming for Heretics.
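The three terms defined above can be illustrated with a minimal sketch. It is written in Python purely for illustration, and the `Party`, `Customer` and `Supplier` classes and their methods are hypothetical names which do not come from either article:

```python
class Party:
    """Encapsulation: the data (name) and the operations on it live in one class."""

    def __init__(self, name):
        self.name = name

    def kind(self):
        return "Party"

    def display_name(self):
        return self.name

    def describe(self):
        # Defined once here, shared by every subclass through inheritance.
        return f"{self.kind()}: {self.display_name()}"


class Customer(Party):
    # Inheritance: only the method that differs is overridden.
    def kind(self):
        return "Customer"


class Supplier(Party):
    def kind(self):
        return "Supplier"

    def display_name(self):
        return self.name.upper()


def describe_all(parties):
    # Polymorphism: the same call runs different code depending on the object.
    return [party.describe() for party in parties]
```

The calling code in `describe_all` does not need to know which class it has been given. It calls the same method signature on each object and the implementation behind that signature decides the result.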
The article also contains the following statement:
OOP is more than just different syntax. It's a paradigm shift away from thinking of programming as the manipulation of data. This is an entirely different way of modeling solutions to real problems.
I'm afraid I have to disagree. While an OO language does indeed have different syntax - if only to support the additional features of encapsulation, inheritance and polymorphism - it also has a large number of similarities. I have spent over 30 years building database applications in three different languages (two of which were non-OO), and they have all been about the manipulation of data. They all have a user interface (UI), business logic and data access logic. Data always flows from the UI through the business layer to the data access layer and back again. The underlying data model (the database schema) remains the same, the user transactions (use cases) remain the same, and even the UI looks the same. The biggest difference is how the components in the business layer are constructed. They still perform the same functions, but they are (or should be) constructed to take advantage of what OO has to offer.
The next statement is one with which I actually agree:
You may have heard that OOP is all about modeling the real world. This is true in a general sense, though this axiom is often misunderstood to mean that objects themselves should literally model real world objects. That goes too far with the analogy.
You should never model the whole of the real world, only those parts with which your application is supposed to interact. Once you have identified an entity which your application is supposed to reference don't model all its properties and methods, only those which are actually needed. For example, in a typical enterprise application (which is nothing more than a glorified term for a database application which is used by businesses) the software never interacts with physical objects in the real world, it only ever interacts with their representations in a relational database. This means that all objects in your software should be designed to interact with objects in your database, and all objects in a database are known as "tables". This also means that you shouldn't waste time trying to model the operations that can be performed on a real world object as anyone who has ever written a database application will tell you that there are only four operations which can be performed on a database table - Create, Read, Update and Delete (CRUD).
For example, in a sales order processing system you will have an object called CUSTOMER which is represented by a table called PERSON with a role of "customer". This PERSON table does not contain all the properties that may exist for a person - such as height, weight, hair colour, eye colour, favourite food, favourite drink, et cetera, et cetera - it only contains those properties required by the application such as name, postal address and email address. It does not matter what operations a real person may perform - such as stand, sit, walk, run, eat, sleep and defecate - as the application never interacts directly with a physical person, it only interacts with that person's data within the database.
His next statement then goes off track again:
Instead, think of it from a higher-level view. A community of objects should model the interactions and responsibilities we see in agents of purpose in the real world. Put another way, objects should be designed to solve problems like we solve them in every day life.
This entire statement is far too vague for my taste, and therefore liable to mislead an inexperienced reader.
How is the community of objects in an OO system different from a community of modules in a non-OO modular system?
Each object in the business layer should represent an object with which the application is supposed to interact, which for a database application will always be a database table. As for interactions, in a database application there are only four operations which can be performed on any database table, so designing for more than these would be wasted effort.
This does not appear in any OO tutorials that I have read, so it is a meaningless statement.
This is too vague to be of any use. In reality every solution has a start point and an end point with a number of steps in between. A complicated solution may require a large number of simple steps. These steps may need to be performed in a serial manner, but it is possible that some could be performed in parallel. Regardless of whether you are using OO or not, each of these steps will use identical code, with the only difference being how the code in each step is actually packaged - in one it is a collection of procedural functions/modules while in the other the functions can be grouped together in classes from which objects can be instantiated.
There is another statement with which I strongly disagree:
Note that the constructor requires everything that constitutes a valid schedule according to the business.
The purpose of a constructor is *NOT* to construct an object with valid data, it is to construct an object which can respond to subsequent calls on any of its public methods. In other words its purpose is to leave an object in a valid state, not with valid state: the object must be ready to accept method calls, but it does not need to contain validated data yet. It is perfectly valid to create an object which is initially empty, then use a method either to insert data from the UI which can then be added to the database, or to request data from the database which can then be displayed in the UI. This particular topic is discussed further in Re: Objects should be constructed in one go.
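A minimal sketch of this distinction follows. It is written in Python purely for illustration, and the `Schedule` class, its `load` and `validate` methods and the field names `start` and `end` are all hypothetical; they are not taken from the article under discussion:

```python
class Schedule:
    """An object that is usable straight after construction, even when empty."""

    REQUIRED_FIELDS = ("start", "end")  # hypothetical field names

    def __init__(self):
        # No business data is demanded here. The object is in a valid state
        # because every public method can be called safely from now on.
        self.data = {}
        self.errors = []

    def load(self, values):
        # Data may arrive later, from the UI or from the database.
        self.data = dict(values)
        self.errors = []

    def validate(self):
        # Validation is a separate step, performed when it is needed.
        self.errors = [
            f"{field} is required"
            for field in self.REQUIRED_FIELDS
            if not self.data.get(field)
        ]
        return not self.errors
```

An empty `Schedule` can be constructed, asked to validate itself and then populated, all without the constructor having to reject anything.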
Here is another:
That objects are responsible for the important work of the system is not only the distinguishing characteristic of OOP, it's fundamental to the very practice of writing object-oriented software
I'm sorry, but this is *NOT* a distinguishing characteristic of OOP. By "important work of the system" I assume you mean data validation and the execution of business rules. This also exists in non-OO systems, but whether the code exists in a non-OO module or an OO object it is still the same code.
Next is a statement which contains an element of truth:
This is a point worth repeating. As programmers learn OOP, it's easy to get lost in theory, principles, and design patterns and lose sight of the objects themselves. If you're new to OOP, first commit yourself to thinking in terms of objects, each with its own role. It's crucial. Everything else in OOP is built on it.
But what exactly do you put in each object? Some people think that the concept of a sales order represents a single object in the real world, so should have a single object in the application, but when it comes to building the database they discover that, by following the rules of Data Normalisation, they actually require a group of separate but related tables such as those shown in Object Oriented Database Programming. Their solution? Stick with a single SALES_ORDER object, but make it responsible for interacting with every database table in that group. This to me is a fundamental mistake. Each table in a database is a separate object, so it deserves to have its own object in the software so that it can be accessed independently from other objects. An object which deals with multiple tables has multiple responsibilities, but in a well-designed OO system each object should have no more than one responsibility.
He starts this section with a statement which is accurate up to a point:
I've alluded to it already, but let's be clear about something. There's a common misconception that using classes means the code is object-oriented. That's not even a little bit true.
If your application is oriented around objects then you *ARE* doing object oriented programming. However, if your classes do not make use of inheritance and polymorphism then they would be no better than procedural functions as they would not be taking advantage of what OOP has to offer. Classes which have no more than one method would fall into this category.
Using classes with nothing but static methods is *NOT* object oriented programming for the simple reason that no class is ever instantiated into an object before one of its methods is called. If you are not using objects then your code cannot be object oriented. Where is the inheritance with static methods? Where is the polymorphism with static methods? If you have little or no inheritance and polymorphism then you have wasted the opportunity to provide more reusable code, and everyone knows that the more reusable code that you have then the less code you have to write and maintain. If you are not getting any benefits out of OOP then you are wasting your time in using it.
His next statement is so wide of the mark it is not even in the same timezone:
Think of a class as a blueprint; a code organization tool, if you will. That blueprint can be for a set of related functions and scoped variables. That's still procedural code. Or it can be the blueprint for objects that are created when that class is instantiated, each of which with knowledge and behavior that helps it accomplish its role. That's object-oriented.
While it is true to say that a class should contain related functions and scoped variables (which is known as cohesion by the way) it is totally wrong to say that this is still procedural. If the code is in a class which is instantiated into an object, then by accessing code which is organised into objects that code is most definitely oriented around objects and therefore fits the description of OOP. The contents (or lack thereof) of a single method do not suddenly turn OO code into non-OO code. The only definition of a well-designed class is that it contains ALL the operations and properties of the entity which it represents. It is not good OO to make a class responsible for more than one entity, nor is it good OO to split an entity's operations and properties across multiple classes.
His statement that a class should be a blueprint for a set of related functions and scoped variables just reinforces my view that each table in the database should have its own class. If you look at the DDL statement for a table you will see that it defines all the variables which are required for each record in that table. As for the functions, any SQL developer will tell you that you NEVER have to define the operations which are required for each table as every table inherits the same set of standard operations - Create, Read, Update and Delete. If the DDL script is the blueprint for each record in a particular database table it should follow that the software should have a separate class which acts as the blueprint for every instance of that class. This simple logic is why I always use a separate class for each database table.
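The idea of one class per table, with the four standard operations inherited rather than rewritten, can be sketched roughly as follows. This is an illustration in Python and not the code of any real framework: the `Table` base class, the `Person` and `SalesOrder` subclasses and the in-memory dictionary which stands in for the database are all assumptions made for the example:

```python
class Table:
    """Hypothetical base class: the CRUD operations are written once."""

    table_name = ""
    fields = ()

    def __init__(self, storage):
        # storage is a plain dict standing in for a real database:
        # {table name: {row id: row}}
        self.storage = storage

    def _rows(self):
        return self.storage.setdefault(self.table_name, {})

    def create(self, row):
        rows = self._rows()
        # The new id is one more than the highest id currently stored.
        new_id = max(rows, default=0) + 1
        # Only the columns declared for this table are kept.
        rows[new_id] = {field: row.get(field) for field in self.fields}
        return new_id

    def read(self, row_id):
        row = self._rows().get(row_id)
        return dict(row) if row is not None else None

    def update(self, row_id, changes):
        row = self._rows()[row_id]
        row.update({f: v for f, v in changes.items() if f in self.fields})

    def delete(self, row_id):
        del self._rows()[row_id]


class Person(Table):
    # Each table class only declares what is specific to that table.
    table_name = "person"
    fields = ("name", "email")


class SalesOrder(Table):
    table_name = "sales_order"
    fields = ("customer_id", "order_date")
```

Each subclass is little more than a description of its table's structure, in the same way that a DDL statement describes the columns while the operations come as standard.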
He starts with a simple question, then provides the answer:
What's the real benefit? In a word: sanity.
That answer is useless as it does not identify anything which can actually be measured. The true purpose of OOP is to provide more reusable code by taking advantage of encapsulation, inheritance and polymorphism. The more reusable code you provide then the less code you have to write and maintain. If the components that you create in your OO implementation require more code than the equivalent components in a non-OO implementation then as far as I am concerned you are doing something wrong. There are certain things that you should be able to measure:
Just as a point of comparison here are some figures from my main enterprise application:
The end result of my conversion to OOP is that I can create components in a database application by writing less code than I had to in either of my previous non-OO languages. Not only do I write less code, each component has more functionality. As far as I am concerned if you are having to write more code in your OO implementation than you did in your previous procedural language then you are doing something wrong.
Here is another of his statements with which I disagree:
Object-oriented programming is orders of magnitude better at handling complexity than procedural programming.
This is a wild statement that has absolutely no proof at all. In order to deal with complexity, such as business rules, you have to write code, and the code that you write in an OO program is almost exactly the same as the code you write in a procedural program. The only difference is that in one system the code is grouped into procedural functions while in the other it is grouped into class methods. Simply having access to the features in an OO language will not guarantee that the code that you write will be orders of magnitude better, it is how you make use of those features that matters. For example, a common feature of all database applications is that before you construct an INSERT or UPDATE query you must check that the data for each column is valid for that column's data type otherwise the query will fail. It is therefore necessary to perform this validation within the program code before the SQL query is executed so that you can report the error to the user and give him/her the chance to correct it. While other programmers are still doing this validation manually I can do it automatically.
Here is another dubious statement:
With procedural code, business rules are scattered all throughout the system and locked up in the head of the most experienced programmer on the team.
This is not necessarily true. It is possible to write a well-structured modular application in a procedural language where the business rules for an entity are contained within the component that deals with that entity. It is also possible to write a badly structured OO application in which the business rules are scattered all throughout the system. This is typically the case when programmers follow the totally artificial rule that a method in a class should not have more than X lines of code, and a class should never have more than Y methods. Why? In order to follow their perverse interpretation of the Single Responsibility Principle, that's why. These morons are forgetting the fundamental rule regarding encapsulation which states that once you have identified an entity which is of interest to your application you construct a class which encapsulates ALL the properties and ALL the methods which relate to that entity. Splitting the methods for an entity across more than one class therefore violates encapsulation. Splitting the properties for an entity across more than one class therefore violates encapsulation.
His next statement has an element of truth:
With OOP, those business rules are written into the responsible objects, hiding the complexity of their implementation.
It is also true that in a well-written procedural application those business rules can exist within responsible components. The big trick is working out how to identify a "responsibility" so that you can put it into its own component or object. This is where my technique of having a separate class for each and every database table pays dividends. Each table has its own structure, and each structure has its own set of business rules, and because there is a finite set of data types which can be used in a table's structure it is possible to build a single object which can take the user's input and the structure of that table in order to verify that the data for each column is valid for that column's specifications.
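A rough sketch of such a structure-driven check is shown below. It is written in Python for illustration only and is not the code from my framework: the `validate` function, the layout of the `structure` dictionary and the three data types covered are assumptions chosen to keep the example short:

```python
from datetime import date


def validate(structure, user_input):
    """Check each column's value against that column's specification.

    structure:  {column: {"type": ..., "size": ..., "required": ...}}
    user_input: {column: value as supplied by the user}
    Returns a dict of {column: error message}; empty means no errors.
    """
    errors = {}
    for column, spec in structure.items():
        value = user_input.get(column)
        if value is None or value == "":
            if spec.get("required"):
                errors[column] = "is required"
            continue  # nothing more to check for an empty value

        kind = spec["type"]
        if kind == "string":
            size = spec.get("size")
            if size is not None and len(str(value)) > size:
                errors[column] = f"must be at most {size} characters"
        elif kind == "integer":
            try:
                int(str(value))
            except ValueError:
                errors[column] = "must be an integer"
        elif kind == "date":
            try:
                date.fromisoformat(str(value))
            except ValueError:
                errors[column] = "must be a valid date (YYYY-MM-DD)"
    return errors
```

Because the rules are read from the table's structure, the same routine can serve every table in the application. Adding a new table means supplying a new structure, not writing new validation code.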
Note here that when he mentions hiding the complexity of their implementation he is talking about implementation hiding which is *NOT* the same as information hiding.
Here are my answers to those questions:
In a language such as PHP which supports both the procedural and object oriented paradigms the code that you write to "do stuff" is basically the same, with the only difference being how that code is packaged. One has stateless functions while the other has stateful objects. Functions do not have to be instantiated while objects are instantiated from classes. Having a collection of classes on its own will not provide any substantial benefits, it is how you construct these classes that really matters. Good classes give you access to code sharing through inheritance and polymorphism, so the more code you are able to share the better is your OO implementation.
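The difference in packaging can be seen in a small sketch. Python is used here instead of PHP purely for illustration, and the `line_total` function and `OrderTotal` class are hypothetical names:

```python
def line_total(quantity, unit_price):
    # Procedural packaging: a stateless function. Nothing is remembered
    # between calls, so the caller must carry any running total itself.
    return quantity * unit_price


class OrderTotal:
    # OO packaging: the same arithmetic, but the object keeps its own state
    # between calls and must be instantiated before it can be used.
    def __init__(self):
        self.total = 0

    def add_line(self, quantity, unit_price):
        self.total += quantity * unit_price
        return self.total
```

The calculation is identical in both versions. What changes is where the running total lives: in the caller's variables for the function, or inside the object for the class.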