Building abstraction like the OSI model
Layered abstraction models are a useful tool for reasoning about the many functions that even simple engineering systems must perform, especially when computers are involved. They take highly specialized engineering components, identify the commonality among them and the interfaces they provide, and chain several of these abstractions together, giving an engineering practitioner a relatively simple model to describe complex behaviour. Given how useful they are, how would one make a layered abstraction model? The Open Systems Interconnection, or OSI, model for network communication has an answer in ISO/IEC 7498-1.
Before diving into that, the OSI model is also useful for illustrating how layered abstraction works. The OSI model treats network communication, such as that used for the internet or industrial fieldbuses, as a layer cake of abstractions over the physical communication media, as shown in the following figure. An individual tool may implement the behaviors and interfaces of one or multiple layers. The layering enables design freedom in how a network is implemented. Communication over HTTP usually goes over a TCP transport through an IP network and on to other destinations over WiFi or Ethernet, but Lindgren et al. show that it’s possible to send IP traffic over CAN, and Ethernet over EtherCAT is possible as well. High-level protocols can also be replaced to obtain a lower level of abstraction for performance or design flexibility, as was done for the video game DOOM in 1993, as well as for contemporary game engines like Godot or Unity.
Figure 1: The low layers like Physical or Data Link are concerned with minute details of communication, like voltage signaling or byte endianness. The high layers like Application provide network facilities without needing to understand the low-level phenomena.
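The design freedom that layering provides can be sketched in a few lines of C. This is a minimal illustration under stated assumptions, not how any real network stack is built: the struct layer type, the layer_send function, and the bracketed header tags are all hypothetical names invented for this example. Each layer knows only its own header and the layer directly below it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FRAME_MAX 128

/* Hypothetical layer: knows only its own header tag and the adjacent lower layer. */
struct layer {
    const char *header;         /* illustrative tag, not a real protocol header */
    const struct layer *lower;  /* adjacent lower layer, or NULL at the bottom */
};

/* Prepends this layer's header to the payload, then hands the result to the
 * layer below. Returns the final frame length, or 0 if it would not fit. */
size_t layer_send(const struct layer *l, const char *payload, char out[FRAME_MAX])
{
    assert(l != NULL && payload != NULL && out != NULL);

    char frame[FRAME_MAX];
    size_t hlen = strlen(l->header);
    size_t plen = strlen(payload);

    if (hlen + plen + 1 > FRAME_MAX)
        return 0;                             /* frame too large for this sketch */

    memcpy(frame, l->header, hlen);
    memcpy(frame + hlen, payload, plen + 1);  /* copy the terminator as well */

    if (l->lower == NULL) {                   /* bottom layer: emit the frame */
        memcpy(out, frame, hlen + plen + 1);
        return hlen + plen;
    }
    return layer_send(l->lower, frame, out);  /* otherwise pass it one layer down */
}
```

With a stack built as tcp over ip over eth, using the tags "[TCP]", "[IP]", and "[ETH]", sending "GET /" from the top yields the frame "[ETH][IP][TCP]GET /". Pointing ip.lower at a different bottom layer changes only the outermost header, and the upper layers need no changes, which is the kind of substitution described above.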
The OSI standard also motivates how to make a model like this (in §6.2.1). In summary, each layer should be a discrete, separable item with a coherent function. Layer interfaces should be succinct and should only connect adjacent layers. Fewer layers ease explanation and implementation, and the choice of layers should be motivated by experience. This being a novel theory, experience will be the principal gain of future work, although case studies discussed later will also help. Helpfully, the standard offers a test to aid in determining layer boundaries: layers should differ in their level of abstraction in terms of morphology, syntax, and semantics. This last part comes from Bloomfieldian linguistics, described in Leonard Bloomfield’s Language.
But what does that actually mean? Consider an example.
printf("I have %u cows\n", num_cows);
Morphology is concerned with the atomic elements of meaning. In C code, this means short tokens of text, here printf, (, "I have %u cows\n", ,, num_cows, ), and ;. None of these can be reduced and still carry meaning: split into pri and ntf, the pieces carry none of the meaning of printf. But these atoms of meaning needn’t be text or language. As will be discussed, they can be cables, connectors, computer hardware, or anything that carries semantic meaning.
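To make the idea of tokens concrete, the following is a minimal sketch of a tokenizer in C. It is not how a real C compiler lexes source: it assumes only identifiers, string literals, and single-character punctuators appear, and the tokenize function and MAX_TOKEN_LEN constant are hypothetical names chosen for this illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_TOKEN_LEN 64

/* Hypothetical helper: splits a small subset of C into tokens.
 * Returns the number of tokens written into the tokens array. */
size_t tokenize(const char *src, char tokens[][MAX_TOKEN_LEN], size_t max_tokens)
{
    assert(src != NULL);
    size_t count = 0;
    const char *p = src;

    while (*p != '\0' && count < max_tokens) {
        if (isspace((unsigned char)*p)) {   /* whitespace only separates tokens */
            p++;
            continue;
        }

        const char *start = p;
        if (isalpha((unsigned char)*p) || *p == '_') {
            /* identifier: letters, digits, and underscores */
            while (isalnum((unsigned char)*p) || *p == '_')
                p++;
        } else if (*p == '"') {
            /* string literal: runs to the closing quote, stepping over escapes */
            p++;
            while (*p != '\0' && *p != '"') {
                if (*p == '\\' && p[1] != '\0')
                    p++;
                p++;
            }
            if (*p == '"')
                p++;
        } else {
            /* anything else is treated as a one-character punctuator */
            p++;
        }

        size_t len = (size_t)(p - start);
        if (len >= MAX_TOKEN_LEN)
            len = MAX_TOKEN_LEN - 1;        /* truncate oversized tokens */
        memcpy(tokens[count], start, len);
        tokens[count][len] = '\0';
        count++;
    }
    return count;
}
```

Given the example line as input, this sketch produces the seven tokens listed above, with the string literal kept whole, quotes included, as a single token.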
When these tokens are combined into phrases, that’s syntax. Syntax is the structure into which meaning is applied. The printf example is syntactically parsed by C compilers as in Figure 2. The syntax tells that the postfix expression represented by the identifier printf has the argument list of "I have %u cows\n" (a string literal) and num_cows (an identifier). What any of this means is the domain of semantics.
Figure 2: The above printf example with syntax labeled according to Kernighan & Ritchie C. The whole line is a postfix expression, and the printf identifier is also a postfix expression and a primary expression on its own. The tokens between the parentheses form an argument expression list. Within it, each item separated by the comma is at once several kinds of expression, a nesting that enables writing non-trivial code, but each ultimately resolves to a primary expression: here, a string literal and an identifier.
The semantics of printf are the expected behavior of the function, which can be gleaned from the documentation. According to Kernighan and Ritchie, printf converts and writes output to the standard output under the control of a format string. The documentation also says that the format string is the first expression in the argument expression list and that it is of type char*. So "I have %u cows\n" is that format string, and the second expression in the argument expression list, here num_cows, will be formatted as text and written to the output at the position of the conversion character, here %u. The logic of how the conversion characters and string formatting work is based on the format string’s own morphology, syntax, and semantics, the study of which is left as an exercise to the reader.
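The semantics described above can be observed directly with a small sketch. It uses the standard snprintf function, which applies the same format-string rules as printf but writes into a buffer instead of the standard output, so the result can be inspected. The format_cows wrapper is a hypothetical name introduced only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical wrapper: formats the example sentence into buf.
 * Returns the number of characters the full result needs, excluding the terminator. */
int format_cows(char *buf, size_t size, unsigned int num_cows)
{
    assert(buf != NULL && size > 0);
    /* %u is replaced by the decimal text of the unsigned int argument */
    return snprintf(buf, size, "I have %u cows\n", num_cows);
}
```

Calling format_cows with a 32-byte buffer and num_cows set to 12 leaves the text "I have 12 cows" followed by a newline in the buffer and returns 15, the number of characters written.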
So to make a layered abstraction model like the OSI model, first look for contrasts in morphology: are there fundamentally different atomic elements to the components? A mechanical system’s atomic elements will be geometric bodies made from materials, whereas a power electronic system will be made from components that have a certain response to a voltage or current. Second, how do the components syntactically relate to each other? Mechanical components exert reaction forces when in contact with other components, whereas electrical components are connected into a network. Lastly, what is the semantic meaning of their structure? Mechanical components provide flexural rigidity or strength, or convert one type of energy to another. Electrical systems process, filter, or amplify electrical signals. If functionality or tools vary greatly in these three aspects, it’s quite likely that they could be separated into abstraction layers, at least according to the OSI standard.