Procedural VS Object Oriented

What is the correct way to write code?

If you have worked with PHP long enough you may well have come across the website PHP The Right Way. This website is dedicated to helping developers learn industry best practices. Interestingly, another site appeared afterwards called PHP The Wrong Way. In essence this website is a rebuttal to many of the recommendations made in PHP The Right Way.

With any sphere of learning it is important to keep an open mind, to listen, to critically evaluate the received wisdom and to determine whether it is indeed wisdom or opinion built on shaky ground. There is one part of the argument I would like to focus on: the case for and against "Always use object-oriented programming".

While PHP The Right Way does not appear to explicitly state that you should always use an object-oriented approach, it is implicit in the examples and in the recommendations to use design patterns. There are programming approaches other than procedural and object-oriented, but these two are the most common in PHP development.

I agree in principle that you should not religiously follow a single approach, and that there are instances where procedural programming will be the simplest and most cost-effective approach. These instances are very rare, however, and your default rule should be to adopt an object-oriented approach.

Procedural programming is only appropriate for throwaway code: code that will run once for a single purpose, does not need to be tested to production level and generally exists to give you a quick answer to a simple question. A possible example is parsing a massive log file and counting the number of instances of a text string. This can probably be written in 3-6 lines of code and does not need the separation of concerns normally associated with object-oriented programming. The code is run once and thrown away; it is too simplistic to be worth retaining.
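As a rough sketch of the kind of throwaway script described above, the following counts occurrences of a string in a log file. The search string and the sample log contents are assumptions made up for illustration; the temporary file only exists so the sketch runs on its own, and in practice $logPath would point at the real log.

```php
<?php
// Stand-in data so the sketch is runnable; replace with the real log path.
$logPath = tempnam(sys_get_temp_dir(), 'log');
file_put_contents($logPath, "ok\nERROR disk full\nok\nERROR timeout\n");

// The throwaway part: stream the file line by line (so a massive file
// never has to fit in memory) and count the matches.
$needle = 'ERROR'; // hypothetical search string
$count = 0;
$handle = fopen($logPath, 'r');
while (($line = fgets($handle)) !== false) {
    $count += substr_count($line, $needle);
}
fclose($handle);
unlink($logPath); // remove the stand-in file

echo "'$needle' found $count times\n";
```

Nothing here is worth wrapping in a class: there is one input, one output and no second use.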

Scope Creep

Anything else, even something relatively simple, that is run multiple times in a production environment should be written in an object-oriented fashion. The main reason for this is that you cannot possibly predict how the requirements for the code will change.

Early in my career I was asked to write a single page report on some server status parameters, so that the IT department could get a quick breakdown of server load and various hardware indicators. It was just your basic CPU utilisation, disk usage etc. The CEO just wanted something done quickly. As all of the data required could be gathered from one place, there was very little data transformation and this was an internal-only page, I wrote a simple procedural script. All of the data gathering, display logic, everything was in a single block of procedural code.

All of the code required to drive this page was visible in the IDE without scrolling. It really was not many lines and was ready for use the same day.

A few weeks later I was asked for a small tweak: some extra columns taken from another data source. Again there was urgency placed on getting the code live, and it was not much more code, so I just added it to the single script. The code had probably doubled in size by that point.

Every couple of weeks some new data source was added and the page became more and more complex. It had gone from a simple reporting page to an interactive filtering page monitoring everything from the servers' throughput to the temperature at various points in the office. The script grew to thousands of lines, each change took longer than the last, and it became very easy to break parts of the functionality with any change.

Over the course of my career I have seen countless quick and dirty procedural projects that were meant to be small blocks of code and have grown into unwieldy leviathans ready to crush you at every turn.

Does Object-Oriented Code solve this problem?

There is a great quote on PHP The Wrong Way:

"The object-oriented model makes it easy to build up programs by accretion. What this often means, in practice, is that it provides a structured way to write spaghetti code."
Paul Graham, ANSI Common Lisp

There is good object-oriented code and there is bad object-oriented code. The main problem is that if you do not stick to the important principles of object-oriented design you will write nonsensical spaghetti code that is even harder to read, and potentially harder to maintain, than procedural code.

However, object-oriented code does help solve some of the major problems that affect any procedural code base once it has grown beyond a few scripts.

Procedural Code Problems

There are many issues with procedural code, but the three easiest to encounter and hardest to defend against are global variables, uncontrolled error states and loop variable scope. You might consider other factors such as testability to be even more important than these, but the benefits of addressing these three are easier to demonstrate to someone who has not experienced Test Driven Development (TDD).

Global Variables

Any procedural code base that has grown is likely to start "requiring" lots of other global scripts which contain global variables, perhaps the database settings or email server settings.

These variables are often declared in one script and then used in another. IDEs are unlikely to know that the variable you are using is global and may flag it as undeclared. You are then likely to start ignoring undeclared variable warnings, and sooner or later you will misspell a global variable in one block of code, or overwrite one in an intermediate script, causing another script later in the execution to fail because that variable is not in its expected state.
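One way to avoid this, sketched below, is to hand settings to the code that needs them through a constructor instead of relying on variables from a required script. The class names, host name and DSN format are hypothetical and only there to illustrate the idea; this is not a recommendation of a particular configuration library.

```php
<?php
// Hypothetical settings object: values are fixed at construction and
// cannot be overwritten by a script that happens to run in between.
final class DatabaseSettings
{
    private $host;
    private $port;

    public function __construct(string $host, int $port)
    {
        $this->host = $host;
        $this->port = $port;
    }

    public function host(): string { return $this->host; }
    public function port(): int { return $this->port; }
}

// Hypothetical consumer: it can only see what it was explicitly given,
// so the IDE knows exactly where every value comes from.
final class ReportRepository
{
    private $settings;

    public function __construct(DatabaseSettings $settings)
    {
        $this->settings = $settings;
    }

    public function dsn(): string
    {
        return sprintf(
            'mysql:host=%s;port=%d',
            $this->settings->host(),
            $this->settings->port()
        );
    }
}

$repository = new ReportRepository(new DatabaseSettings('db.internal.example', 3306));
echo $repository->dsn(), "\n";
```

A misspelled method name such as hots() fails immediately and visibly, whereas a misspelled global variable quietly evaluates to null.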

Uncontrolled Error States

When writing object-oriented code you can adopt the early exit approach. Doing so in a procedural environment is frequently harder, and more likely to lead to unexpected issues where "required" code that was expected to run is never called. Procedural code tends towards lots of IF blocks which grow in complexity, making it harder to maintain logical error states and to understand the current state of the system.

Loop Variable Scope

This is a similar issue to global variables, but as it tends to manifest in different ways and can easily produce unexpected data states I felt it was important enough to treat separately.

Procedural code will frequently have loops, and it is very common for PHP developers to forget to reset variables before the next iteration of the loop. This can lead to simple problems such as empty fields retaining the value from the previous iteration and populating data incorrectly, or to more serious ones such as ever larger arrays or database updates that destroy your server.

It is true that this problem is not exclusive to procedural code and can occur in an object-oriented environment, but it is far less likely there because variables inside one method are out of scope in every other method.
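The difference can be sketched with a small example. The data and the ContactFormatter class are made up for illustration; the point is only that a variable assigned inside a method does not survive into the next call, whereas a variable assigned inside a loop body does survive into the next iteration.

```php
<?php
$rows = [
    ['name' => 'Alice', 'email' => 'alice@example.com'],
    ['name' => 'Bob'], // no email supplied
];

// Procedural loop: $email is only assigned when a row has one, so Bob
// silently inherits Alice's address from the previous iteration.
$leaky = [];
foreach ($rows as $row) {
    if (isset($row['email'])) {
        $email = $row['email'];
    }
    $leaky[] = $row['name'] . ' <' . $email . '>';
}

// Hypothetical class: each call to format() starts with a fresh scope,
// so nothing can leak from one row to the next.
final class ContactFormatter
{
    public function format(array $row): string
    {
        $email = $row['email'] ?? '';
        return $row['name'] . ' <' . $email . '>';
    }
}

$formatter = new ContactFormatter();
$clean = [];
foreach ($rows as $row) {
    $clean[] = $formatter->format($row);
}

echo $leaky[1], "\n"; // Bob <alice@example.com>
echo $clean[1], "\n"; // Bob <>
```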

Object Oriented Code Problems

As I mentioned earlier, bad object-oriented code can become a nightmare of spaghetti redirection that leaves your head spinning and you with no idea where you came from. However, this is relatively easy to avoid by following two basic principles.

Single Responsibility Principle

All objects should have a single responsibility and all methods should be responsible for a single action. By attempting to achieve this you should start to see that all of your methods are very small: each does a single simple action and you build up your code as a series of simple actions. As the methods are all small and have a single responsibility they become easier to read, easier to debug and easier to name.

For example, you could have a piece of code call a method stringStartsWithString($string, $startsWith). In PHP 7 you can provide parameter type hints and a return type to make things clear straight away in the IDE, but even with earlier versions you would have reasonable confidence that you will get a boolean response telling you whether the first string starts with the second.

The single responsibility principle also leads to better code reuse. A method called parseIncomingEDIFACTD96AAndSendAperakResponse is descriptive and gives you a good idea of its intention, but when you come to parse a different kind of incoming data and send an Aperak response there is a chance you will end up duplicating code. If sending the Aperak response is a separate method, you just need to call it after you have written your new parser.

Code also becomes more robust. Take our example method stringStartsWithString. You might want to test array keys to see if any of them start with a string, so you could write a method anyKeyFromArrayStartsWithString which calls stringStartsWithString as it tests each key. If there was a bug in stringStartsWithString, or you found that you could make it more efficient, then because the method has a clearly defined responsibility you can have a good level of confidence that fixing the bug or optimising the code will not adversely affect anyKeyFromArrayStartsWithString.
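A minimal sketch of the two methods, written as plain functions with PHP 7 type declarations, might look like the following. The body of stringStartsWithString here uses strncmp, which is just one possible implementation, and treating an empty prefix as "no match" is an assumption carried over from the example later in this post.

```php
<?php
// Single responsibility: does $string begin with $startsWith?
function stringStartsWithString(string $string, string $startsWith): bool
{
    if ($startsWith === '') return false; // assumption: empty prefix never matches
    return strncmp($string, $startsWith, strlen($startsWith)) === 0;
}

// Single responsibility: does any key of $array begin with $startsWith?
// It delegates the string comparison rather than repeating it.
function anyKeyFromArrayStartsWithString(array $array, string $startsWith): bool
{
    foreach (array_keys($array) as $key) {
        // Keys may be integers, so cast before comparing.
        if (stringStartsWithString((string) $key, $startsWith)) return true;
    }
    return false;
}

var_dump(anyKeyFromArrayStartsWithString(['user_id' => 1, 'name' => 'x'], 'user_')); // bool(true)
var_dump(anyKeyFromArrayStartsWithString(['user_id' => 1, 'name' => 'x'], 'order_')); // bool(false)
```

Because anyKeyFromArrayStartsWithString only depends on the boolean contract of stringStartsWithString, the inner function can be rewritten freely as long as it keeps returning the same answers.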

Moreover, by writing these small simple methods you can write simple tests to make sure they work, rather than requiring the complex end-to-end tests a procedural code base needs. Your tests will be easier to understand and maintain, and will make the code more robust, whereas an end-to-end test needs to change regularly.

Early Exit

There are some books on the SOLID principles of software development that go against this principle, suggesting that methods should have a single point of exit to make the code easier to understand. However, I feel that the early exit approach always leads to clearer and simpler code. This style of code is much easier to debug, especially if you are investigating an error which you do not have a way of easily reproducing and need to work out how it occurred by reading the code.

When you write methods you should return values or throw exceptions at the earliest possible point. This makes reading the code and understanding what has happened much easier.

It is common to see blocks of code like the following:

if ($success === true)
{
  //large block of code
}
else
{
  throw new Exception("error occurred...");
}

Often I will be looking at the if else and need to collapse the code just to check where the else state is likely to take me. If the else state is an exception it is much easier to read through:

if ($success !== true) throw new Exception("error occurred...");

//large block of code

It also helps prevent high levels of IF ELSE indentation which is normally difficult to read and maintain. You may well have seen something like the following:

if ($state == 'good')
{
  if ($value >= 0)
  {
    if ($quantity >= 0)
    {
      //do something
    }
    else
    {
      throw new Exception('3');
    }
  }
  else
  {
    throw new Exception('2');
  }
}
else
{
  throw new Exception('1');
}

This becomes much clearer if you write:

if ($state != 'good') throw new Exception('1');
if ($value < 0) throw new Exception('2');
if ($quantity < 0) throw new Exception('3');
//do something

While my examples throw exceptions, the same simplicity can be achieved with early return values.

function stringStartsWithString($string, $startsWith)
{
  if ($startsWith == '') return false;
  if (strlen($startsWith) > strlen($string)) return false;
  if (substr($string, 0, strlen($startsWith)) != $startsWith) return false;
  return true;
}

As you can see, the above code has 4 return points, but I feel it is easier to understand than the single exit point equivalent below, and the difference in clarity only grows in more complex code.

function stringStartsWithString($string, $startsWith)
{
  $return = true;
  if ($startsWith == '')
  {
    $return = false;
  }
  else if (strlen($startsWith) > strlen($string))
  {
    $return = false;
  }
  else if (substr($string, 0, strlen($startsWith)) != $startsWith)
  {
    $return = false;
  }
  return $return;
}

There is a lot more to good code than these two principles, but if you try to follow them, other good practices such as well-named and testable methods are likely to follow. If a method has a single responsibility it is likely to be easy to name and it will be clear what the code does. You are unlikely to end up with generic-sounding methods like generate or process, which force you to jump in and read the body rather than quickly scan the calling code to understand its intention.
