Tony Marston's Blog About software development, PHP and OOP

PHP's Type System

Posted on 29th March 2025 by Tony Marston

Introduction
What is Type Safety?
The origin of static typing
Data Types
User Defined Types
Data Structures
PHP's type system
Strict typing is supposed to be optional

Introduction

In the 40+ years that I have been programming I have used various languages, which include COBOL, Assembler, UNIFACE and PHP. The first three were compiled and statically typed while the last is interpreted and dynamically typed. It should be obvious that each of these languages is different, which means that moving from one to another involves dealing with the differences in order to become proficient in that new language. No competent programmer should ever complain about the differences, nor should they attempt to continue using their old programming style when it conflicts with the style dictated by the new language. Anybody who does deserves a smack on the back of the head and to be sent to bed without any supper.

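First, a brief description of the two terms:

Static typing - the type of every variable is declared in the source code and is checked at compile time.
Dynamic typing - the type belongs to the value which a variable currently holds and is checked at run time.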

PHP is not a language which is compiled, it is interpreted. This means that it was designed from the ground up to be dynamically typed, so any type checking is performed at run time. Functions are provided to both test and change a variable's type. See PHP's Type System for details.
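As a brief illustration of the functions mentioned above, here is a minimal sketch using gettype(), is_string(), is_int(), is_numeric(), intval() and settype(), all of which are standard PHP functions. The variable names are invented for the example.

```php
<?php
$value = '27';                       // a numeric string, such as a form would supply

$before    = gettype($value);        // 'string'
$isString  = is_string($value);      // true
$isInt     = is_int($value);         // false, it is still a string
$isNumeric = is_numeric($value);     // true, it could be used as a number

$number = intval($value);            // returns int 27 and leaves $value untouched
settype($value, 'integer');          // changes the type of $value in place
$after  = gettype($value);           // 'integer'
```

Testing a type never changes the variable. Only settype(), a cast or an assignment does that.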

In spite of this I am continually amazed at the number of programmers who, after switching to PHP from a strictly typed language, constantly complain that this causes no end of type errors and that all dynamically typed languages are inherently bad. The fault does not lie with these languages, it lies with these junior-grade, second-rate, badly educated programmers who fail to understand how the different type systems emerged and under what circumstances one system has advantages over the other.

I regard strict typing as a crutch for the mentally crippled. It is like the training wheels on a bicycle - they are OK when you are a child as they stop you from falling over and hurting yourself, but when you are an adult they simply get in the way. To say that statically-typed languages are better and more popular than dynamically-typed languages is not supported by the fact that 60% of the world's programming languages support dynamic typing and only 40% support static typing. This would indicate to me that those who support static typing are in the minority, and that the newer languages are more likely to support dynamic typing.


What is Type Safety?

According to all the "experts" no dynamically typed language can ever be type safe, but what does "type safe" actually mean? I found the following definition in this Wikipedia article:

In computer science, type safety and type soundness are the extent to which a programming language discourages or prevents type errors. Type safety is sometimes alternatively considered to be a property of facilities of a computer language; that is, some facilities are type-safe and their usage will not result in type errors, while other facilities in the same language may be type-unsafe and a program using them may encounter type errors. The behaviors classified as type errors by a given programming language are usually those that result from attempts to perform operations on values that are not of the appropriate data type, e.g., adding a string to an integer when there's no definition on how to handle this case. This classification is partly based on opinion.

Type enforcement can be static, catching potential errors at compile time, or dynamic, associating type information with values at run-time and consulting them as needed to detect imminent errors, or a combination of both. Dynamic type enforcement often allows programs to run that would be invalid under static enforcement.

It goes on to say the following:

If a type system is sound, then expressions accepted by that type system must evaluate to a value of the appropriate type (rather than produce a value of some other, unrelated type or crash with a type error).
...
The semantics of a language must have the following two properties to be considered type-sound:

Progress

A well-typed program never gets "stuck": every expression is either already a value or can be reduced towards a value in some well-defined way. In other words, the program never gets into an undefined state where no further transitions are possible.

Preservation (or subject reduction)

After each evaluation step, the type of each expression remains the same (that is, its type is preserved).

Note where under the heading Progress it says "every expression is either already a value or can be reduced towards a value in some well-defined way". Under Preservation it also says "After each evaluation step, the type of each expression remains the same". By reading PHP's type system you will see that both of these statements are a match for PHP's behaviour, which means that, according to this Wikipedia definition, PHP can be classed as Type Sound.


The origin of static typing

Some of the following statements are taken from Type Wars which was published by Robert C. Martin (Uncle Bob) in May 2016.

I first ran into the concept of types in 1966 while learning Fortran as a teenager. In Fortran there were essentially two types. Fixed Point (integer), and Floating point. Expressions in Fortran could not "mix modes". You could not have integers alongside floating point numbers. Only certain library functions could translate between the modes.

Early languages were not designed from the ground up to be type safe. This was added on as an afterthought to crash the program rather than produce unpredictable results. Type checking was added in at compile time to prevent crashes at run time.

This was because the binary representation of integers and floating point numbers was completely different, so if you tried to use a series of bits as one type of number when in fact it was something else then the results were unpredictable and sometimes catastrophic. The only way to avoid such errors was to give each variable a type and not allow a value of the wrong type to be loaded into it. This also meant that if a function expected an argument of a particular type it would perform the relevant type checking at compile time and generate an error if there was a mismatch.

C, of course, had types; but they were not enforced in any way. You could declare a function to take an int, but then pass it a float or a char or a double. The language didn't care. It would happily push the passed argument on the stack, and then call the function. The function would happily pop its arguments off the stack, under the assumption that they were the declared type. If you got the types wrong, you got a crash. Simple. Any assembly language programmer would instantly understand and avoid this.

At least it had the sense to crash instead of returning an unreliable result. But if it had the ability to detect that a variable was of the wrong type, how much effort would it take to automatically convert it to the correct type? Surely that would be better than crashing?

The next problem came when reading and writing data from files. I personally have worked with flat files, indexed files, hierarchical database and network databases, and they all had the same characteristics:

Data Types

Here are the most common data types which are supported by programming languages and database systems:

Type      Description
byte      Stores whole numbers from -128 to 127
short     Stores whole numbers from -32,768 to 32,767
int       Stores whole numbers from -2,147,483,648 to 2,147,483,647
long      Stores whole numbers from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807
float     Stores fractional numbers. Sufficient for storing 6 to 7 decimal digits
double    Stores fractional numbers. Sufficient for storing 15 to 16 decimal digits
boolean   Stores true or false values
char      Stores a single character/letter or ASCII values

User Defined Types

As far as I am concerned the term "type" includes only those data types which are directly supported by the database and/or the programming language. While some languages support the concept of classes as user defined types this is not the case with PHP. It is not possible to assign a type to a class, and when a class is instantiated into an object the only result from using the gettype() function is "object".

In PHP classes cannot be categorised by type, so the concepts of supertype and subtype have no meaning. This means that polymorphism is not restricted to objects with the same supertype/subtype.

Everybody knows that you can inherit from one class, which then becomes a superclass, to create a new class which is known as a subclass. In those strictly typed languages where a class can be given a type the superclass becomes a supertype and a subclass becomes its subtype. Those languages then require that you cannot reference an instance without specifying its type otherwise polymorphism will not work. This is explained in Polymorphism and Inheritance are Independent of Each Other. This completely artificial restriction does not exist in PHP as classes do not have types, so all that is required for polymorphism to work is that the same method signature exists in multiple objects. This is mostly achieved in the RADICORE framework by having every one of my concrete table classes inherit from the same abstract class, but also in the separate Data Access Objects (DAO) where the various method names are manually duplicated instead of being inherited.
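To make the point about method signatures concrete, here is a minimal sketch. The classes Customer and Invoice and the function describe() are hypothetical and are not taken from the RADICORE framework. They share neither a parent class nor an interface, yet the same call works on both.

```php
<?php
// Hypothetical classes: no common parent class and no common interface.
class Customer
{
    public function getData(): array
    {
        return ['table' => 'customer'];
    }
}

class Invoice
{
    public function getData(): array
    {
        return ['table' => 'invoice'];
    }
}

// $object has no type declaration, so any object with a getData() method will do.
function describe($object): string
{
    return gettype($object) . ':' . $object->getData()['table'];
}

$results = [];
foreach ([new Customer(), new Invoice()] as $object) {
    $results[] = describe($object);
}
// $results now holds ['object:customer', 'object:invoice']
```

Note that gettype() reports "object" for both instances, as described earlier.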

Arrays and classes are known as compound or composite data types as they contain groups of scalar values which are related in some way.

Data Structures

When accessing data from external sources, such as from a compiled form or a disk/database file, this data was presented as a fixed length record (also known as a struct or composite data type) which contained a number of fields each with its own data type and length.

Before PHP came along, earlier languages such as COBOL were strictly typed for the simple reason that all I/O operations were performed into or from predefined records. Every input/output operation required a pre-defined and pre-compiled record structure which identified precisely the type and size of every piece of data that was passed in that operation. The entire structure was passed as a single argument. It was imperative that the receiving structure matched the sending structure in both size and composition, otherwise the receiving subprogram or subroutine would not be able to see the correct values. Here is an example:

01  customer-record.
    05  cust-key            PIC X(10).
    05  cust-name.
        10  cust-first-name PIC X(30).
        10  cust-last-name  PIC X(30).
    05  cust-dob            PIC 9(8).
    05  cust-balance        PIC 9(7)V99.

The PIC(TURE) clause identifies the data type:

Numeric values were uncompressed by default, meaning that PIC 9(9) (the same as PIC 9(9) USAGE DISPLAY) would require 9 bytes. Several compressed formats were allowed, such as:

In the early days of computing when hardware was expensive and programmers were cheap it was necessary to store numbers in as small a space as possible.
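To show how rigid such a layout is, here is a sketch of reading the customer-record structure shown above from PHP, assuming every field is stored in uncompressed DISPLAY format. That gives 10 + 30 + 30 + 8 + 9 = 87 bytes. The sample values and the element names are invented. unpack() is a standard PHP function, and its 'A' code reads a space-padded string of a fixed length.

```php
<?php
// Build one 87-byte record in the customer-record layout (invented sample data).
$record = str_pad('C000000001', 10)   // cust-key         PIC X(10)
        . str_pad('Ada', 30)          // cust-first-name  PIC X(30)
        . str_pad('Lovelace', 30)     // cust-last-name   PIC X(30)
        . '18151210'                  // cust-dob         PIC 9(8)
        . '000012345';                // cust-balance     PIC 9(7)V99

// Each field is found purely by its position and length within the record.
$customer = unpack('A10cust_key/A30first_name/A30last_name/A8dob/A9balance', $record);

// V99 is an implied decimal point, so the nine digits are scaled by 100.
$customer['balance'] = (int) $customer['balance'] / 100;
```

If the sender and receiver disagree about a single field length then every field which follows will be read from the wrong position.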


PHP's type system

PHP is a language which is not compiled, and it does not use pre-defined and fixed data structures. It therefore does not meet the requirements of statically typed languages as explained in the introduction.

PHP is interpreted. It cannot perform any type checking when the source code is parsed into bytecode as a variable's type can be changed at any time. For this reason all type checking on a variable can only be performed at run time when that variable is used in an expression. It is not necessary to declare a variable and its type before you assign a value to that variable. Once a value has been assigned it is also possible to overwrite that value with a different value of a different type. This means that the following statements will not cause an error:

<?php
$foo = 27;              // value is an integer
$foo = 'twenty seven';  // value is a string
?>

While PHP now supports type declarations for function arguments and return values (since version 7.0) and for class properties (since version 7.4), all this does is prevent a variable from being assigned a value of the wrong type, as shown in the following code snippet:

<?php
class Foobar
{
    public int $foo;

    function __construct ()
    {
        $this->foo = 27;              // value is an integer
        $this->foo = 'twenty seven';  // value is NOT an integer, will cause a TypeError
    }
}
?>

It is important to note that in PHP type declarations cannot be used for values which are supplied from external sources, such as the following:

Instead of pre-defined and fixed data structures PHP uses dynamic arrays for all HTML and SQL values. They are dynamic for the simple reason that you cannot declare what elements an array should contain before you receive it from an external source. You can also add or remove elements from an array at any time as there is no way you can prevent these operations. This also means that when reading an array you may need to verify its contents before you start processing what may not be there.
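Here is a minimal sketch of what that verification can look like. The array $input stands in for $_POST, and its element names are hypothetical.

```php
<?php
// Stand-in for $_POST: you cannot know in advance which elements were sent.
$input = ['name' => 'Fred', 'age' => '42'];

// An element may be missing, so test for it or supply a default before using it.
$name  = $input['name'] ?? '';
$email = $input['email'] ?? null;     // not supplied in this request

$errors = [];
if (!array_key_exists('email', $input)) {
    $errors[] = 'email is missing';
}

// Elements can be added or removed at any time.
$input['country'] = 'GB';
unset($input['age']);
```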

You might also think that this would force the programmer to take an array containing string values and insert lots of code to manually cast each string value into the correct type, but this is not the case. If you read the section on Type Juggling in the PHP manual you will see where it says the following:

PHP does not require explicit type definition in variable declaration. In this case, the type of a variable is determined by the value it stores.
...
PHP may attempt to convert the type of a value to another automatically in certain contexts. The different contexts which exist are:
Note: When a value needs to be interpreted as a different type, the value itself does not change types.
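As a small illustration of that extract, here is a sketch in which string values, such as those which arrive from HTML or SQL, are used in different contexts without any manual casting. The variable names are invented.

```php
<?php
$quantity = '5';                     // strings, as supplied by a form or a query
$price    = '1.50';

$total = $quantity * $price;         // numeric context: float 7.5
$count = $quantity + 1;              // numeric context: int 6
$label = 'Qty: ' . $count;           // string context: 'Qty: 6'
$equal = ($quantity == 5);           // comparison context: true

// The value was interpreted as a number, but the variable itself is unchanged.
$type = gettype($quantity);          // 'string'
```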

This was reinforced in RFC: Strict and weak parameter type checking where it said the following:

PHP's type system was designed from the ground up so that scalars auto-convert depending on the context. That feature became an inherent property of the language.

It goes on to say:

Strict type checking is an alien concept to PHP. It goes against PHP's type system by making the implementation detail become much more of a front-stage actor.

In addition, strict type checking puts the burden of validating input on the callers of an API, instead of the API itself. Since typically functions are designed so that they're called numerous times - requiring the user to do necessary conversions on the input before calling the function is counterintuitive and inefficient. It makes much more sense, and it's also much more efficient - to move the conversions to be the responsibility of the called function instead. It's also more likely that the author of the function, the one choosing to use scalar type hints in the first place - would be more knowledgeable about PHP's types than those using his API.

Finally, strict type checking is inconsistent with the way internal (C-based) functions typically behave. For example, strlen(123) returns 3, exactly like strlen('123'). sqrt('9') also return 3, exactly like sqrt(9). Why would userland functions (PHP-based) behave any different?

Proponents of strict type hinting often argue that input coming from end users (forms) should be filtered and sanitized anyway, and that this makes for a great opportunity to do necessary type conversions.

While both HTML input values and SQL query results are presented as strings there are the following differences:

It is not necessary to manually cast each string value to the correct type before it is used as an argument in a function. All that is necessary is that you check that the value can be automatically coerced into the correct type without producing a TypeError.
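A minimal sketch of this approach follows. The function setAge() and the field names are hypothetical, and filter_var() with FILTER_VALIDATE_INT is just one of several ways in which the check could be made. The code runs in the default coercive mode, without declare(strict_types=1).

```php
<?php
// Hypothetical function with a scalar type declaration.
function setAge(int $age): int
{
    return $age;
}

// Stand-in for form input: every value arrives as a string.
$post = ['age' => '42', 'shoe_size' => 'large'];

$errors = [];
foreach (['age', 'shoe_size'] as $field) {
    // Check only that the string CAN be coerced into an integer.
    if (filter_var($post[$field], FILTER_VALIDATE_INT) === false) {
        $errors[$field] = 'must be an integer';
    }
}

// No manual cast: PHP converts '42' into int 42 when the function is called.
$age = setAge($post['age']);
```

The value in $post['age'] is still a string after the call. Only the copy received by the function has been converted.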

While I agree with the statement "input coming from end users (forms) should be filtered" I disagree with the statement "this makes for a great opportunity to do necessary type conversions" for the simple reason that it is not "necessary" to manually convert each input string into the correct type. All that is "necessary", as explained above, is that each input string be filtered so that it can be automatically cast into the correct type without causing a TypeError. Adding code to do manually what the language was designed to do automatically does not strike me as the act of a competent programmer, someone who understands what the word "efficient" actually means. In the above-mentioned RFC it says the following:

For example, consider a function getUserById() that expects an integer value. With strict type hinting, if you feed it with $id, which happens to hold a piece of data from the database with the string value "42", it will be rejected. With auto-converting type hinting, PHP will determine that $id is a string that has an integer format - and it is therefore suitable to be fed into getUserById(). It will then convert the value it to an integer, and pass it on to getUserById(). That means that getUserById() can rely that it will always get its input as an integer - but the caller will still have the luxury of sending non-integer but integer-formatted input to it.

The key advantages of the proposed solutions are that there's less burden on those calling APIs (fail only when really necessary). It should be noted that most of the time coding is spend consuming existing API's and not creating new ones. Furthermore it's consistent with the rest of PHP in the sense that most of PHP does not care about exact matching zval types, and perhaps most importantly - it does not require everyone to become intimately familiar with PHP's type system.

This makes sense to me - if a function is called in a thousand places then it is more efficient, as well as less error prone, to perform any type checking just once within the function itself instead of duplicating it in those thousand places.
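Here is a sketch of the getUserById() example from the RFC as it behaves in PHP's default coercive mode. The body of the function is invented purely so that the result can be inspected.

```php
<?php
// The int declaration is the single place where the checking is done.
function getUserById(int $id): string
{
    return 'user #' . $id . ' (' . gettype($id) . ')';
}

// An integer-formatted string, as supplied by the database, is coerced.
$first = getUserById('42');          // 'user #42 (integer)'

// A string which cannot be coerced is rejected by that same declaration.
try {
    getUserById('forty two');
    $second = 'accepted';
} catch (TypeError $e) {
    $second = 'rejected';
}
```

None of the callers has to contain any conversion code, however many of them there are.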


Strict typing is supposed to be optional

Having said that PHP was designed from the outset to be dynamically typed there are a large number of programmers of limited ability out there who do not appear to have the mental capacity to deal with dynamic typing. They switch to a language which is different, then have the audacity to complain that it is different! They learned to code using a statically typed language, have always been taught that static typing is "best", and, like a pack of trained monkeys, they refuse to consider any alternatives. They cannot deviate from what they have been taught. I spent the first 20 years of my programming career using strictly typed languages, but as soon as I started playing with PHP I took to dynamic typing as a duck takes to water. This is unlike my intellectually inhibited colleagues who turn up their noses as if it were a turd in a toilet.

To these intellectual lightweights I can only say one thing: If you don't like dynamic typing then for f***s sake stop whining and stop using a dynamically typed language. To do otherwise would be as stupid as a non-swimmer jumping into the deep end of a swimming pool while wearing concrete boots and then complaining that he's drowning. Get real, you numpties. It's time to grow up, put on your big-boy pants and learn to act as adults.

What is even worse is the fact that some of these people have infiltrated the team of core developers and are gradually converting the language (I call this sabotaging) to be strictly typed. First there was the introduction of Type Hinting for classes and arrays in PHP 5. Then, as a result of PHP RFC: Scalar Type Declarations, PHP 7.0 extended this to scalar types and added Type Enforcement with the declare(strict_types=1); directive. The use of strict type checking was supposed to be entirely optional, but the implementation of PHP RFC: Deprecate passing null to non-nullable arguments of internal functions in version 8.1 introduced a serious backwards compatibility (BC) break and a stupid inconsistency. This RFC started with the following statement:

Internal functions (defined by PHP or PHP extensions) currently silently accept null values for non-nullable arguments in coercive typing mode. This is contrary to the behavior of user-defined functions, which only accept null for nullable arguments. This RFC aims to resolve this inconsistency.

The statement "Internal functions currently silently accept null values for non-nullable arguments in coercive typing mode" is incorrect as up to that point all arguments were nullable even though they were not individually marked as such. This was covered in the manual where it said:

Converting to string
String conversion is automatically done in the scope of an expression where a string is needed.
null is always converted to an empty string.

Converting to integer
To explicitly convert a value to int, use either the (int) or (integer) casts. However, in most cases the cast is not needed, since a value will be automatically converted if an operator, function or control structure requires an int argument.
null is always converted to zero (0).

The statement "This is contrary to the behavior of user-defined functions, which only accept null for nullable arguments" assumes that no user-defined function can accept null for any argument, but this was changed several years before in version 7.1 with the introduction of nullable types. Nullable type syntactic sugar was added with the following description:

A single base type declaration can be marked nullable by prefixing the type with a question mark (?). Thus ?T and T|null are identical.
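Here is a sketch of how this works for user-defined functions in the default coercive mode. The function names plain() and nullable() are hypothetical.

```php
<?php
// The argument is declared as string, without the question mark.
function plain(string $text): int
{
    return strlen($text);
}

// The argument is declared as ?string, so null is explicitly allowed.
function nullable(?string $text): int
{
    return strlen($text ?? '');
}

$accepted = nullable(null);          // 0: null is allowed by ?string

try {
    plain(null);                     // null is not allowed by string
    $outcome = 'accepted';
} catch (TypeError $e) {
    $outcome = 'rejected';
}
```

This is the behaviour of user-defined functions to which the RFC refers.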

This meant that when you looked at the function signatures in the manual and did not see that an argument was explicitly marked as nullable with this Nullable type syntactic sugar then a novice programmer could assume that none of these arguments would accept null. This assumption would be incorrect for the reasons stated above - (a) accepting nulls was standard behaviour for the language since its inception, and (b) accepting nulls was documented in the manual. This glaring mistake told me straight away that the numpties who both proposed and voted for this RFC had failed to Read The Freakin' Manual (RTFM). Had they done so they would have realised that they had two choices:

  1. Change the documentation to mirror the behaviour.
  2. Change the behaviour to mirror the documentation.

One of these would produce a massive BC break while the other would not.

They made the WRONG choice, which is why I wrote The PHP core developers are lazy, incompetent idiots. Not only was it the wrong choice, it contradicted the following statements made in PHP RFC: Scalar Type Declarations where it said:

Behaviour of weak type checks
A weakly type-checked call to an extension or built-in PHP function has exactly the same behaviour as it did in previous PHP versions.

Backward Incompatible Changes
Since the strict type-checking mode is off by default and must be explicitly used, it does not break backwards-compatibility.

Unaffected PHP Functionality
When the strict type-checking mode isn't in use (which is the default), function calls to built-in and extension PHP functions behave identically to previous PHP versions.

Strict Typing
By default, PHP will coerce values of the wrong type into the expected scalar type declaration if possible.
Warning: Function calls from within internal functions will not be affected by the strict_types declaration.

All of the above statements are now wrong for the simple reason that, as of PHP 8.1, every internal function raises a deprecation notice if you pass null to an argument which is not marked as nullable, and the RFC states that this will become a TypeError in the next major version. This means that you now have to insert code to do manually what the language previously did automatically. This requires using the relevant type casting on an argument, such as:

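$result = strlen((string)$string);   // for a function which expects a string
$result = abs((int)$integer);        // for a function which expects an integer
$result = sqrt((float)$float);       // for a function which expects a float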

Without this manual intervention you could see the following notices in your log file after upgrading to PHP 8.1:

Passing null to parameter #1 ($string) of type string is deprecated
Passing null to parameter #1 ($num) of type int|float is deprecated

As well as the implementation of this 2nd RFC being inconsistent with what was promised in the 1st RFC, it also introduced a new inconsistency. Their pathetic excuse for doing it this way was given as:

For the new scalar type declarations introduced in PHP 7.0 an explicit choice was made to not accept null values to non-nullable arguments, but changing the existing behavior of internal functions would have been too disruptive at the time.

The words "changing the existing behavior of internal functions would have been too disruptive at the time" show that they clearly did not understand that the existing behaviour WAS to accept null values for all arguments (even the manual said so) and all that was necessary was to update the documentation for all function signatures. The comment about this being "too disruptive" shows that they were just too lazy to make the right decision and too stupid to realise that it was the wrong decision. Instead of taking on the work themselves and maintaining backwards compatibility they chose to force every userland developer to change their code to solve a problem which did not previously exist. The fact that they also missed what RFC: Strict and weak parameter type checking said about efficient programming shows that they are also incompetent.


