Introduction

The Unified Modeling Language (UML) is a graphically based specification language for modeling software-intensive systems that has, since the late 1990s, become a de facto standard in the object-oriented software engineering industry. The UML employs a variety of diagrams to model different aspects of a system during the course of its development. The UML began life at the Rational Software Corporation, which itself started out as Rational Machines in Massachusetts in 1981 and was eventually acquired by IBM in 2003. In 1996 Rational tasked three of its software engineers with developing a non-proprietary unified modeling language.

Grady Booch, who had been with the company from the outset, had already developed a widely used object-oriented modeling method (known as the Booch method) that was well suited to object-oriented design (OOD). James Rumbaugh, who joined the company in 1994, had developed his own very popular Object Modeling Technique (OMT), which was better suited to object-oriented analysis (OOA). The third member of the team, Ivar Jacobson, was recruited in 1995 when Rational acquired Objectory AB, the Swedish software company for which he worked. Jacobson had invented the Object-Oriented Software Engineering (OOSE) method. The three software engineers, nicknamed the Three Amigos, worked in consultation with a number of major players in the software industry. Their efforts culminated in the formation, in 1996, of an international consortium called the UML Partners.

In addition to Rational Software, the consortium's members included a number of other high-profile companies, among them DEC, Hewlett-Packard, Intellicorp, Microsoft, Oracle, and Texas Instruments. The consortium's aim was to produce a complete specification for the UML and submit it in response to a Request for Proposals (RFP) issued by the Object Management Group (OMG) for a standard modeling language. OMG was itself a computer industry consortium, formed in 1989 and dedicated to the creation and maintenance of computer industry standards. A draft specification (UML 1.0) was submitted to OMG in January 1997, but it was a revised specification (UML 1.1) that OMG adopted in November of the same year. The standard has undergone a number of amendments and revisions since then, but remains under the auspices of the Object Management Group. The latest major revision was UML 2.0, which was adopted by the OMG in 2005, and the most recent specification at the time of writing (released in December 2017) is UML 2.5.1.

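The UML 2.x specification was originally broken down into four parts: the Superstructure, which defines the notation and semantics for diagrams and their model elements; the Infrastructure, which defines the core metamodel on which the Superstructure is based; the Object Constraint Language (OCL), which is used to define rules for model elements; and UML Diagram Interchange, which defines how diagram layouts are exchanged between tools.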

The UML is not a development methodology in its own right, although it has given rise to several development methodologies based on it. The best known of these is perhaps IBM's Rational Unified Process (RUP). The UML was designed to be compatible with a significant number of contemporary object-oriented development methodologies. Concepts from many of these methodologies were considered when the UML was being formulated, with the result that it can be applied to a wide range of object-oriented software development projects from small, single-user applications up to enterprise-wide, distributed information systems.

The UML specification is also extensible, and provides two mechanisms (stereotypes and profiles) to enable users to create new diagram types and model templates to suit a specific application area. One possibly negative aspect of such a broad-spectrum and flexible approach is that the UML specification is large and complex. Critics have claimed that, as a consequence, UML is difficult to learn, includes many diagrams that are rarely if ever used, and ignores some widely used diagrammatic modeling techniques such as data flow diagrams and structure diagrams.

More detailed information on the UML standard and related standards can be found at the Object Management Group's website:

http://www.omg.org

The UML is also covered by an international standard, although the current standard relates to version 1.4.2 (the current status of the standard is described as "under review"):

ISO/IEC 19501:2005

The model

An information system consists of some combination of hardware and software. Its purpose is to provide a solution for some kind of business problem. For most information systems, by far the greatest investment in terms of both time and money goes into the creation of the software element. Due to the highly specialised nature of software engineering, the analysis and design functions necessary for the development of a new information system are carried out by people who are specialists in these areas. These specialists do not usually have prior knowledge of the business area (i.e. the real-world domain) in which the system will be deployed. The client (the person or organisation commissioning the new system), on the other hand, knows their business inside out but does not usually have more than a superficial knowledge of software engineering principles, or the capabilities and limitations of hardware. There is a further dichotomy between the systems analysts, who investigate the problem domain and formulate a solution, and the software developers, who must design and implement that solution.

Given the possibility of misunderstandings occurring, there is an obvious danger that software developers will ultimately produce something that does not live up to user expectations. As software projects become larger and more complex, misunderstandings become more likely, because more people are involved in the project, more information needs to be collected and processed, and many more channels of communication are required. Each additional line of communication increases the potential for error. Anything that can improve communication, and thus reduce the possibility of misunderstandings, should therefore increase the chances of a successful outcome.

Several decades of experience have shown that simply producing greater volumes of paperwork, or documenting everything in excruciating detail, does not usually solve the problem of poor communication. If anything, it can compound the problem, because the sheer volume and complexity of the documentation often means that some problems go undetected. The documents used by analysts, whilst meaningful to the analysts themselves, are often unintelligible to the client. In fact, they are not always easy to work with even for software designers and programmers. More importantly, if the client does not understand the proposed software solution, they have no way of knowing whether it is going to do what they actually need it to do.

The UML can be used to produce a model of the proposed system in a graphical form that is easily understood by all parties concerned (the project stakeholders). Stakeholders include analysts, software designers, programmers, and the client (this last group includes the individuals responsible for making high-level decisions, the end users who will be operating the system, and the technical staff who will be maintaining and supporting it). The UML can also be used to produce documentation that is compatible both with object-oriented analysis and design methods and with the object-oriented programming languages that may be used to implement the software.

The model itself consists largely of a collection of diagrams that represent various aspects of the system. There are a number of different types of diagram, each of which has its own role to play in the development process. The model is a blueprint for a software system in the same way that a set of architectural plans constitutes the blueprint for a building, or a set of engineering drawings becomes the blueprint for a piece of machinery. There are many analogies to be found in traditional branches of engineering. The main difference is that, whereas other engineering disciplines have traditions that go back hundreds or even thousands of years, software engineering is still in its infancy. Ironically, the sheer speed of developments in computer technology has meant that our ability to develop highly advanced information systems has not always been matched by our ability to monitor, document and control the development process. The UML is an attempt to rectify that situation by creating a set of standard diagrams that can be used to describe every aspect of a proposed system.

One of the criticisms of the UML is that there are too many different types of diagram (fourteen in UML 2.5.1; UML 2.0 defined thirteen). It is true that some diagram types are rarely if ever used. On the other hand, the number of different diagram types is intended to ensure that all stakeholders can view those aspects of the system in which they have a particular interest, expressed in terms that they can understand. Each type of diagram presents a different view of the system, or part thereof. Different types of diagram may represent different stages of development, and each type of diagram will be of particular interest to a different group of stakeholders. Another way of describing this is to say that different types of diagram represent different levels of abstraction (they show those system features that are of interest to the stakeholder, and hide everything else). One type of diagram might provide a high-level view of the entire system, for example, while another type of diagram might provide a detailed view of a specific system component.

Broadly speaking, the UML model consists of two main categories of diagram. One category includes those diagrams that present a structural view of the system. When we talk about structure (or architecture), we mean things like data objects, the operations that can be performed on them, and the relationships between them. The other category includes diagrams that represent the behaviour of the system. This type of diagram can represent the way in which different parts of the system interact, how the state of various system parameters changes over time, or the sequence in which various types of event may occur.
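
As a rough illustration of the two categories, the following Python sketch shows the kind of information each view captures. The class names (Customer, Order), their attributes, and the order states are hypothetical examples and are not taken from the UML specification. The class definitions correspond to what a structural diagram such as a class diagram might record (data objects, their operations, and the relationships between them), while the table of allowed state changes corresponds to what a behavioural diagram such as a state machine diagram might record.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class OrderState(Enum):
    """Hypothetical states an order can be in (behavioural view)."""
    NEW = "new"
    PAID = "paid"
    SHIPPED = "shipped"


# Allowed transitions: the kind of rule a state machine diagram would show.
ALLOWED_TRANSITIONS = {
    OrderState.NEW: {OrderState.PAID},
    OrderState.PAID: {OrderState.SHIPPED},
    OrderState.SHIPPED: set(),
}


@dataclass
class Order:
    """A data object with attributes and one operation (structural view)."""
    order_id: int
    total: float
    state: OrderState = OrderState.NEW

    def advance(self, new_state: OrderState) -> None:
        """Move to new_state, rejecting transitions that are not allowed."""
        if new_state not in ALLOWED_TRANSITIONS[self.state]:
            raise ValueError(
                f"cannot go from {self.state.name} to {new_state.name}"
            )
        self.state = new_state


@dataclass
class Customer:
    """A customer holds zero or more orders (a one-to-many relationship)."""
    name: str
    orders: List[Order] = field(default_factory=list)

    def place_order(self, order: Order) -> None:
        """Attach an order to this customer."""
        self.orders.append(order)


def demo() -> List[str]:
    """Run one order through its life cycle and record each state name."""
    customer = Customer("Alice")
    order = Order(order_id=1, total=25.0)
    customer.place_order(order)
    history = [order.state.name]
    for next_state in (OrderState.PAID, OrderState.SHIPPED):
        order.advance(next_state)
        history.append(order.state.name)
    return history
```

Calling demo() walks a single order through the states NEW, PAID and SHIPPED in turn. An attempt to skip a step (for example, advancing a new order straight to SHIPPED) raises a ValueError, which is the code-level counterpart of a transition that does not appear on the state machine diagram.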

A model, then, is a collection of interrelated diagrams (possibly accompanied by some written documentation) that describes every aspect of the system, at different levels of abstraction. The model can be modified as development progresses in order to reflect changes required due to the discovery of some previously unforeseen problem, or because the client has amended or added to their list of requirements. It models the structure of the application and the dynamic processes that will occur within it. It models the roles of the various users of the system (the actors), the business processes that the system is designed to support, the activities that must be undertaken in support of those business processes, the data structure, and the system logic. It does all this in a relatively easily understood format that is highly compatible with an object-oriented programming language.

UML Diagrams

At the time of writing, the current version of the UML specification is version 2.5.1, which defines fourteen types of diagram (UML 2.0 defined thirteen; the profile diagram was added in UML 2.2). The different types of diagram can be split into two main categories. Seven of them represent structural information about the system. Put another way, they represent different aspects of the system's architecture and deal with the things that must be included in the system. The remaining seven each describe some aspect of the system's behaviour. Of these, four relate specifically to the interactions that occur between the various system components. Because all of these diagrams essentially show different views of the same system, it is not uncommon to find the same elements appearing on different types of UML diagram, but in a different context. Bear in mind also that because the UML is designed to be extensible, you may also come across user-defined elements and diagram types.
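
The categorisation can also be written down as a small lookup, which may be handy when tagging diagrams in a project. The sketch below assumes the fourteen diagram types named in UML 2.5.1 (the thirteen from UML 2.0 plus the profile diagram); the constant and function names are hypothetical and are not part of any UML tool or API.

```python
# Diagram types as named in UML 2.5.1, grouped by category.
STRUCTURE_DIAGRAMS = (
    "class",
    "component",
    "composite structure",
    "deployment",
    "object",
    "package",
    "profile",
)

# Interaction diagrams are a sub-category of behaviour diagrams.
INTERACTION_DIAGRAMS = (
    "sequence",
    "communication",
    "interaction overview",
    "timing",
)

BEHAVIOUR_DIAGRAMS = (
    "activity",
    "state machine",
    "use case",
) + INTERACTION_DIAGRAMS


def category_of(diagram: str) -> str:
    """Return the category of a diagram type, ignoring case and padding."""
    name = diagram.strip().lower()
    if name in STRUCTURE_DIAGRAMS:
        return "structure"
    if name in INTERACTION_DIAGRAMS:
        return "behaviour (interaction)"
    if name in BEHAVIOUR_DIAGRAMS:
        return "behaviour"
    raise KeyError(f"not a standard UML 2.5.1 diagram type: {diagram!r}")
```

For instance, category_of("Sequence") falls into the interaction sub-category, while a user-defined diagram type that is not in either tuple raises a KeyError rather than being silently mis-filed.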

The following list shows the standard diagram types and how they are categorised: