Does It Mean You Are Doing Object-Oriented Programming If You Write Classes?
I remember a job interview from my early days as a software engineer. I was asked what Object-Oriented Programming (OOP) is. I tried to recall what I had learned at university and mumbled something about encapsulation, inheritance, and polymorphism.
Is that still the best answer? Let’s take a step back and look at the fundamental ideas behind OO programming to find out.
What We Are Used To Thinking
An ‘object-oriented’ language is a language where you split the code into classes that are instantiated into objects at runtime. The language must also support encapsulation, polymorphism, and inheritance. That’s what we are used to, right?
But you might be surprised to discover that none of those words were mentioned when the term “OOP” was first used.
The term “Object-Oriented Programming” was originally coined by Dr. Alan Kay in 1967, while he was researching programming architectures at the University of Utah. His idea was to build software out of a set of mini-computers. Those mini-computers are independent, isolated from one another, and communicate by passing messages rather than by sharing data directly.
You might notice that this is similar to what computers on the Internet do. At about the same time, a US government agency was developing ARPANET, a military-funded network that later grew into the Internet. OOP and the Internet are both products of the same era and of ideas that were being discussed in academic circles at that time. (The idea can also be seen from a different angle. Dr. Kay had a background in math and biology, and this structure resembles how complex biological structures are built out of simple independent cells.)
But how does the idea of independent mini-computers translate into practice? In Dr. Kay’s exact words,
“OOP to me means only messaging, local retention and protection and hiding of state-process, and extreme late-binding of all things.”
Put differently, the original idea has 3 components:
- Message passing
- Isolation (local retention, protection, and hiding of state)
- Extreme late binding
There is nothing about classes or objects, and very little about inheritance, polymorphism, or encapsulation. When talking about OOP, we tend to focus on “objects”, the main word in the term. But that word turns out to be misleading. Maybe it should have been called something else? Some people go as far as suggesting the name “MOP” instead: Message-Oriented Programming.
“I’m sorry that I long ago coined the term “objects” for this topic because it gets many people to focus on the lesser idea. The big idea is messaging.” - Alan Kay
The Essence Of OOP
Let’s look closer at those 3 founding ideas.
Message Passing
The first programming language to implement the object-oriented model was Smalltalk. But message passing in Smalltalk was implemented as synchronous function calls. This became the de facto standard implementation of message passing, and it was repeated in C++, Java, and most other object-oriented languages.
OO programming is all about objects. Objects are things that respond to messages (or should be) - to get an object to do something you send it a message - how it does it is totally irrelevant - think of objects as black boxes, to get them to do something you send them a message, they reply by sending a message back. How they work is irrelevant - whether the code in the black box is functional or imperative is irrelevant - all that is important is that they do what they are supposed to do. - Joe Armstrong
I thought of objects being like biological cells and/or individual computers on a network, only able to communicate with messages - Alan Kay
There is one subtle detail. Message passing differs from the conventional technique of invoking a function directly by name. It relies on the receiving object to select and execute the appropriate code itself.
Each object can receive messages and figure out whether it knows how to deal with them. In other words, you don’t execute code by calling it by name: you send some data (a message) to an object, and the object figures out which code, if any, to execute in response.
How is that better? The benefit is stronger decoupling of objects from each other: the message sender is only loosely coupled to the message receiver, through the messaging API. Objects can abstract away and hide their data structures, so the internal implementation of an object can change without breaking other parts of the software system.
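The dispatch described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the `Counter` class, its `receive` method, and the `_on_` naming convention are hypothetical names invented for this example, not part of any library or of Smalltalk.

```python
class Counter:
    """Hypothetical object that reacts to messages instead of exposing methods."""

    def __init__(self):
        self._count = 0  # internal state, never handed out directly

    def receive(self, message, *args):
        # The receiver, not the sender, decides which code runs.
        handler = getattr(self, "_on_" + message, None)
        if handler is None:
            # The object is free to not understand a message.
            return ("not_understood", message)
        return handler(*args)

    def _on_increment(self, amount=1):
        self._count += amount
        return ("ok", self._count)

    def _on_read(self):
        return ("ok", self._count)


counter = Counter()
print(counter.receive("increment", 5))   # ('ok', 5)
print(counter.receive("read"))           # ('ok', 5)
print(counter.receive("launch_rocket"))  # ('not_understood', 'launch_rocket')
```

The sender only knows the message names and the shape of the reply. How `Counter` stores its count could change completely without touching any calling code.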
Isolation
Isolation is easy to confuse with encapsulation. We tend to think about the different levels of “privacy” for class internals that programming languages offer us (public/private/protected).
But isolation is a stronger idea. Isolation means that software components know nothing about each other’s internals. No internal properties and no mutable state are shared between components. Objects treat each other as black boxes.
[…] the whole point of OOP is not to have to worry about what is inside an object. Objects made on different machines and with different languages should be able to talk to each other […] - Alan Kay
Web servers and web browsers are great examples of isolated components that communicate via message passing. A web browser knows nothing about the internals of a web server: everything behind a URL is a black box to the browser. It knows nothing about the server’s internal state, exceptions, or behaviour. The same holds in the other direction: the server neither knows nor cares about browser internals. The result of this isolation is the Internet that we know today. Imagine what would happen if the server crashed every time a browser had a problem, or vice versa. Isolation in OO programming aims to bring a similar level of independence to the components of a single application.
Avoiding shared mutable state is the reason behind isolation. The only way to affect another object’s state is to ask (not command) that object to change it by sending a message. State changes are controlled at a local, cellular level rather than exposed to shared access.
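One way to picture state that is controlled at a local, cellular level is a small worker that owns its data and talks to the outside only through a mailbox. The sketch below is an assumption-laden illustration, not a recommended architecture: `account_cell`, `ask`, and the message tuple layout are hypothetical names chosen for this example, built on Python’s standard `queue` and `threading` modules.

```python
import queue
import threading


def account_cell(mailbox, initial_balance):
    """Hypothetical 'cell': owns its balance and changes it only in response to messages."""
    balance = initial_balance  # local state, visible to nobody else
    while True:
        kind, amount, reply_to = mailbox.get()
        if kind == "stop":
            reply_to.put(("stopped", balance))
            return
        if kind == "withdraw":
            if amount > balance:
                # The cell may refuse: the sender asks, it does not command.
                reply_to.put(("refused", balance))
            else:
                balance -= amount
                reply_to.put(("ok", balance))
        elif kind == "deposit":
            balance += amount
            reply_to.put(("ok", balance))
        else:
            reply_to.put(("not_understood", kind))


def ask(mailbox, kind, amount=0):
    """Send one message and wait for the reply message."""
    reply_to = queue.Queue()
    mailbox.put((kind, amount, reply_to))
    return reply_to.get(timeout=1)


mailbox = queue.Queue()
threading.Thread(target=account_cell, args=(mailbox, 100), daemon=True).start()

print(ask(mailbox, "withdraw", 30))   # ('ok', 70)
print(ask(mailbox, "withdraw", 500))  # ('refused', 70)
print(ask(mailbox, "stop"))           # ('stopped', 70)
```

No other code holds a reference to `balance`, so the only way to influence it is to send a request that the cell is free to reject.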
Extreme Late Binding
Is (extreme) late binding the same as polymorphism?
If multiple classes have the same method (in many languages, classes organized in some sort of hierarchy), you don’t know at compile time which exact method of which object will be called. You only know that the method exists; which one runs is determined later, at runtime. This is what we usually call polymorphism.
Extreme late binding takes this idea one step further. No checks are performed until runtime. You don’t even need to know that the method exists. This is similar to what we have in dynamically typed languages.
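A dynamically typed language makes this easy to demonstrate. In the minimal Python sketch below, the `Recorder` class and the `notify` function are hypothetical names made up for illustration; the only real mechanism used is Python’s `__getattr__` hook, which is called when normal attribute lookup fails.

```python
class Recorder:
    """Hypothetical object that decides at call time how to answer any message."""

    def __init__(self):
        self.log = []

    def __getattr__(self, name):
        # Invoked only when normal attribute lookup fails, i.e. for
        # "methods" that were never defined on the class.
        def handler(*args):
            self.log.append((name, args))
            return f"handled {name}"
        return handler


def notify(target):
    # Nothing checks that `target` has a `ping` method before this line runs.
    return target.ping("hello")


recorder = Recorder()
print(notify(recorder))  # handled ping
print(recorder.log)      # [('ping', ('hello',))]
```

`Recorder` never declares `ping`, yet the call succeeds because the decision about what `ping` means is postponed until the message arrives.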
What late binding gives us is the ability to defer decisions until later. You don’t need to commit to one particular way of doing things; you can decide and change it later. It also allows you to build systems that can be changed at runtime (for example, in Erlang you can swap parts of the system while it is running, with no need for downtime).
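As a rough analogy only (this is not how Erlang’s hot code loading works), Python also resolves method names at call time, so behaviour can be replaced while the program is running. `Greeter` and `polite_greet` are hypothetical names used just for this sketch.

```python
class Greeter:
    def greet(self, name):
        return f"Hello, {name}"


def polite_greet(self, name):
    return f"Good day, {name}"


greeter = Greeter()
before = greeter.greet("Ada")

# Rebind the method on the class while the program is running.
# Existing instances pick up the new behaviour on their next call.
Greeter.greet = polite_greet
after = greeter.greet("Ada")

print(before)  # Hello, Ada
print(after)   # Good day, Ada
```

The caller’s code did not change between the two calls; only the binding of the name `greet` did.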
What We Learned
Somehow, over the years, the understanding of OO programming morphed into the odd idea that it is about organizing code into classes and modelling the real world via “is-a” and “has-a” relations (‘a dog is-a animal’ or ‘a car has-a engine’).
“I made up the term ‘object-oriented’, and I can tell you I didn’t have C++ in mind.” - Alan Kay, OOPSLA ‘97
OO programming is not about that. The fundamental idea of OOP is not polymorphism or encapsulation. It is not even inheritance. All of those appeared later. The fundamental idea behind OOP is message passing, not the organization of code into objects. It does not matter what is inside a class, imperative code or functional. It does not matter whether it inherits from something or not. OOP is a design idea, not a way of organizing code.
So what do we do now? Should we start replacing OOP with ‘MOP’? I think that would create even more confusion. But it is nice to know that there is more to it than what we used to think.
As a takeaway, I focus on the following ideas:
- Message passing is the key idea behind OO Programming
- Inheritance does not matter: it is code organization, not architecture
- Encapsulation is weak, isolation is stronger
- Polymorphism is a weaker idea than late binding
- OOP can be either imperative or functional