
Bad Integration by Design or How to Make a Horrible Web Service

To understand what makes for easy integration, or a “good web service”, it’s worth taking a glance at the historical methods of I.T. systems integration.  After all, business systems have been passing data around and/or activating each other, aka integrating, for almost as long as there have been commercial I.T. business systems (approximately since 1960).

The first major “interface” method between systems was throwing sequential fixed-length record files at each other.  This was pretty much the only method for 20 years, and it still remains in widespread use, mostly around mainframe and legacy systems.  The system providing the interface, whether outputting the data or defining a format in which to receive it, defines a field-by-field interface record, along with header and footer records.  Because these are fixed-length records, the descriptive definition (the human-readable documentation) must include the format and length of each field, along with any specialized interpretation logic or encoding.  For example, if a record represents a person and includes their gender, it might specify a one-byte single-digit field, with a 0 representing male and a 1 representing female.  (Given that this approach started in the early days of computing, there is also a strong tendency to minimize data size – save the bytes! – leading to additional encoding logic within the definition.)  And because the definition is fixed-length records, no data typing can be enforced within the data format itself, only at the time of programmatic interpretation.
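To make this concrete, here is a minimal sketch in Python of consuming such a record.  The layout (field offsets, widths, and the 0/1 gender encoding) is invented for illustration; in practice every detail below lived only in the interface document.

    # Decoding a hypothetical fixed-length "person" record:
    # last name (10) + first name (10) + gender (1) + birth date (8) = 29 bytes.
    RAW = "SMITH     JOHN      019620415"

    def parse_person(record: str) -> dict:
        return {
            "last_name": record[0:10].rstrip(),
            "first_name": record[10:20].rstrip(),
            # "0" = male, "1" = female -- knowable only from the documentation
            "gender": {"0": "male", "1": "female"}[record[20]],
            "birth_date": record[21:29],   # YYYYMMDD, packed to save bytes
        }

    print(parse_person(RAW))

Note that nothing in the file itself tells you any of this; get one offset wrong and you silently read garbage.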

So how did this approach work?  It worked great.  This is the base approach of generations of systems, especially financial and business systems. 

If it worked great, why don’t we do this anymore?

Answer: Because of the data typing (no enforcement in the format), the encoding (no enforcement in the format, and not understandable without documentation), and other dependent logic (such as cross-field validation instructions, for example “if field 2 is female, then you may fill out field 9 for number of pregnancies”), getting an interface built and correct would take 2-6 weeks per connection.  So while this method worked, it was time consuming to successfully implement.
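That dependent logic is worth a sketch of its own, because it is exactly what consumed those weeks: every such rule existed only in prose, and every consumer re-implemented it by hand.  The field numbers below are hypothetical, matching the example above.

    # Cross-field rules that lived only in the documentation.  Rule: field 9
    # ("number of pregnancies") may be filled only when field 2 ("gender")
    # is female.
    def validate(record: dict) -> list:
        errors = []
        if record.get("pregnancies") is not None and record["gender"] != "female":
            errors.append("field 9 (pregnancies) set but field 2 (gender) is not female")
        return errors

    print(validate({"gender": "male", "pregnancies": 2}))    # one error
    print(validate({"gender": "female", "pregnancies": 2}))  # no errors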

API’s came along to allow direct activation, and defined a fixed set of data types required to activate.  This solved the problem of the first model’s data typing without enforcement, and part of the documentation problem (the data types became self explanatory).  Further, the API’s could define descriptive names for the data fields, thereby providing some self-documenting ability within the API.  A major improvement.

API’s, however, added a new problem: they were technology, and often version, dependent.  Meaning an API exposed on one system in one language in one release was compatible only with another system in a matching system / language / and version. 

Regardless, integration via APIs was easier and faster.  And it became the base technology that allowed Windows, Unix and other modern operating systems to move from being simply an execution starter and hardware interface to being a facilitator of interaction between applications.  It further allowed a real-time interaction that was not possible previously.  That said, figuring out and correctly using an API could still take days to weeks, and embedded cross-field validation and logic would often slow down the process.

API’s evolved in the next generation with REMOTE APIs.  Remote APIs moved the cross-application interaction to cross-system cross-environment interaction.  The original remote API technology with commercial success included DCOM, CORBA, and RMI.  All of these commercial implementations worked, but were very complicated and highly sensitive to perfect conditions.  And, for the most part, they were TECHNOLOGY specific (as well as being version specific).  So while they began to offer the new ability of remote invocation and/or coordinated system interaction, the environment had to be perfectly configured and matching technology and version.

Each of these generations of integration technology worked within its context and solved problems not previously solvable – offering new abilities and new opportunities.  Yet their limitations meant they remained niche solutions for specific narrow problems.

With the arrival of web services, a new integration level was reached.  Web services offered all the previous abilities while adding key points:

- The data format is XML, and therefore self-descriptive.

- The service and data format are defined with an XSD, and therefore are self-validating.

- The communication protocol is firewall-friendly and technology neutral.

- The data format is technology neutral and supported by virtually all development tools.

With these abilities added to the historical ones, integration moved from a major project effort to…simple, trivial, fast.  And with that change, web services integration became more than just commonplace; it became the way to do things.  (This brings some new problems, such as integration spaghetti and interconnection dependencies, but that’s a different discussion.)  The sketch below shows what those abilities buy in practice.
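A minimal consumption sketch, with an invented payload: because the XML names itself and any XML-capable tool can parse it, there is nothing to decode and no codebook to fetch.

    import xml.etree.ElementTree as ET

    # A self-descriptive payload (element names invented for illustration).
    PAYLOAD = """
    <Person>
      <LastName>Smith</LastName>
      <FirstName>John</FirstName>
      <Gender>male</Gender>
      <BirthDate>1962-04-15</BirthDate>
    </Person>
    """

    person = ET.fromstring(PAYLOAD)
    print(person.findtext("Gender"))     # "male" -- readable without a lookup table
    print(person.findtext("BirthDate"))  # ISO date, no packed-format documentation

An accompanying XSD can then enforce the structure and types mechanically, which the fixed-length record world could never do.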

So how do you make a horrible web service?  Simply strip away one or more of the primary advantages it offers.  Examples:

-- Serialize language-specific objects into your web service as one or more data items.  For example, serialize a .NET object into your web service.  The result: a web service that can only work with .NET (and only the appropriate version of it).  Yes, I’ve seen this done.  (A sketch of the same mistake follows below.)
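The original sin here was a .NET object; the Python equivalent below makes the problem visible.  A pickled object stuffed into an XML field is opaque to every consumer except one running compatible Python code with the same class definition.

    import base64, pickle

    class Person:                          # every consumer must have this exact class
        def __init__(self, name):
            self.name = name

    blob = base64.b64encode(pickle.dumps(Person("Smith"))).decode("ascii")
    response = "<GetPersonResult>" + blob + "</GetPersonResult>"
    print(response)                        # unreadable from Java, .NET, JavaScript, ...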

-- Place “codes” in data fields in the web service.  For example, make a field <Gender> where “3” = Male and “1” = Female.  Then explain to the user of the web service that they must download your table of codes and values in order to insert or interpret the data correctly.  This, sadly, is a not-uncommon error, as the snippet below shows.
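A small sketch of the difference (element names and codes invented for illustration):

    # Bad: magic codes that force every consumer to fetch a lookup table.
    GENDER_CODES = {"3": "Male", "1": "Female"}   # the table everyone must download
    bad = "<Gender>3</Gender>"

    # Good: the value explains itself, and an XSD enumeration can still
    # restrict it to the legal set of values.
    good = "<Gender>Male</Gender>"

The self-descriptive version costs a few extra bytes and buys back the entire documentation burden.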

-- Structure the XML as just a flat list of fields even though it could be placed in a hierarchy, or already is in a hierarchy in the objects or database tables.  The corollary of this error is to expose multiple services, one per level of the hierarchy, rather than one service with a hierarchy.  This is the error of sharing raw data rather than the business function / transaction.  All too common.  (See the comparison below.)
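A comparison sketch, with invented element names.  The flat version loses the relationships between fields, which must then be re-documented and re-learned by every consumer; the hierarchical version carries its structure with it.

    # Flat: which Sku goes with which Qty?  Only the documentation knows.
    flat = """
    <Order>
      <OrderId>42</OrderId>
      <Item1Sku>A100</Item1Sku>
      <Item1Qty>2</Item1Qty>
      <Item2Sku>B200</Item2Sku>
      <Item2Qty>1</Item2Qty>
    </Order>
    """

    # Hierarchical: the structure itself says "an order contains line items".
    nested = """
    <Order>
      <OrderId>42</OrderId>
      <Items>
        <Item><Sku>A100</Sku><Qty>2</Qty></Item>
        <Item><Sku>B200</Sku><Qty>1</Qty></Item>
      </Items>
    </Order>
    """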

In general, stripping an ability from a web service drops it back to an earlier generation, and the result is a service of limited use, difficult reuse, and challenging comprehension.  Each of these problems turns into extra time and complexity, the exact opposite of what services came to solve.

I recommend avoiding these errors.
