A Tale of Two Systems
In order to keep the column to a manageable length, the examples will be kept simple and the discussion brief. Those with experience in this kind of systems integration will appreciate that things are usually much more complex than the discussion would suggest.
Emergent Risk (Re)defined

We will begin our discussion with a reminder of a couple of the properties of Web services systems that challenge security architects:
- Web services provide for a loose coupling of systems in which the endpoint systems have no direct knowledge of each other nor any relationship with each other.
- The endpoint systems were not originally designed as pieces of an integrated system nor were they originally designed to participate in the provision of a service to "strangers."
When a system is designed and implemented, it is done implicitly or explicitly against a set of requirements, and some of those requirements include controls that protect the system against risks that were known or anticipated at the time. In order to determine the optimum set of controls for the system, it is necessary to understand, and model, the environment in which the system will be operating. This exercise may be (or have been) undertaken formally, or it may have "fallen out of" the usual requirements-gathering process. Whatever the case, the exercise results in some understanding of the threats the system is expected to face, the risks associated with that exposure, and some notion of what controls to put into place that will allow the organization to address those risks. Depending upon the severity of the exposure and the cost of managing it, controls can be selected to:
- Prevent a negative event from occurring,
- React to a negative event, or
- Detect and report on the occurrence of a negative event.
Which controls are selected, and in what combination, depends upon factors such as:
- The value of the asset being protected.
- The nature of the threat.
- The nature of the vulnerability to the threat.
- The risk to the business if an attack is successful and is publicly known.
- Whether to prevent, react to or simply monitor the attack.
- The cost of the control.
These considerations give us a working vocabulary for the rest of the discussion:
- Design threat/risk model - The characterization of the threats/risks that existed or were anticipated at the time the system was designed and implemented.
- Design control assumptions - The assumptions made about the efficacy of the selection of controls to be implemented by the system. The assumptions are based on the design threat/risk model and the selection of threats to be prevented, managed and monitored. (Controls are not limited to physical and technical access control mechanisms. They also include business and technical processes as well as policy).
- Design control set - The set of (risk management) controls selected for the system. This selection is based on the design threat/risk model and the design control assumptions.
- Design trust model - The characterization of the entities that are expected to be a part of The System; what their (trust) responsibilities are, and the degree to which they can be expected to fulfill them. In this case, The System includes hardware, software, business and technical processes and human beings.
- Design trust boundaries - The "line" that separates aspects of the system that are subject to the system controls from those that are not.
- Design trust domain - The elements of the system that are subject to the design system controls.
Before we go any further, I need to talk about how I am going to use a couple of words. The word "trust" will appear frequently, and it will be overloaded. It will mean different things depending upon the context. A search of the InfoSec literature will show the same thing. Unless there could be some confusion about the meaning in a particular situation, I am going to let the context provide the meaning.
The other word I will use loosely is "endpoint." An endpoint could be a legacy system or an individual user. In most cases, it will not matter, or it will be obvious from the context which it is. If the context is ambiguous and it is important that the distinction be made, I will make it. It is possible for endpoints to be built from scratch specifically to participate in a Web services-based system. One would expect that those endpoints would not be subject to the problem being addressed in this article.
The trouble for an endpoint begins when the legacy system is recruited into participating in a system in which there are elements that exist outside the legacy system's trust domain. This changes the operating environment of the existing system. With very few exceptions, this has the effect of introducing risks that are not present in the system's design risk model and for which there are no controls in place that would allow the risks to be managed. The greater the difference between the "old" environment and the "new" environment, the greater the probability that there will be some risks in the "new" environment that are not included in the legacy system's design risk and threat models. With respect to Web services and legacy systems, the emergent risks are those risks that arise for the legacy system when it is confronted with becoming part of a larger system.
In the next few paragraphs we will look at two systems whose environments were changed enough that the change induced emergent risks. These systems were chosen because they are of varying vintage and architecture. The idea is to show that the problem is not specific to a particular class of application. It really does not matter exactly which system it is. In fact, the message I would like to convey is that these are patterns in the Gang of Four sense of the term. The examples are taken from real applications. However, I have tried to omit enough detail that the identity of the specific system is not apparent while preserving enough to make the examples meaningful.
A Mainframe Batch Application

This first example will illustrate how "putting a Web front end" onto a legacy banking system can create emergent risks for the legacy system. First we will introduce the salient aspects of the mechanics of a typical checking account system and a typical savings account system. Then we will plug a naive Web application into the system and examine what can happen.
Setting the Stage

Consider a classical Demand Deposit Account (DDA) banking application, one that was designed to support the typical checking account and the typical banking day. The salient aspects of an archetypal DDA system are these:
- When a customer opens a DDA account, one of the things they do is sign a signature card. This card is kept on file and used (when necessary) to:
- Verify the identity of the account-holder and
- Verify that the person who is requesting activity on the account (making a deposit or cashing a check) really is who they represent themselves to be.
A typical bank has some means of defining "relationships," in which a bank customer identifier is associated with all of the accounts that customer has with the bank. This makes life much easier for Customer Service Representatives and ATMs. If an ATM is going to allow a person to transfer money between accounts, there must be a mechanism available for it to determine the accounts to which the user has legitimate access. With the relationship information, the ATM system can construct the inquiry and transaction messages with the appropriate account numbers to the legacy system.
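To make the relationship mechanism concrete, here is a minimal Python sketch. The customer IDs, account numbers, and message fields are all invented for illustration; a real DDA message set would look quite different. The point is that the front end (a teller station or ATM) only ever puts relationship-verified account numbers into the messages it sends to the mainframe.

```python
# Hypothetical relationship table: customer ID -> accounts in the relationship.
RELATIONSHIPS = {
    "cust-1001": ["chk-111", "sav-222"],
    "cust-1002": ["chk-333"],
}

def accounts_for(customer_id):
    """Return the accounts tied to a customer's relationship record."""
    return RELATIONSHIPS.get(customer_id, [])

def build_transfer_message(customer_id, from_acct, to_acct, amount_cents):
    """Build a transfer request the way an ATM would: only account numbers
    drawn from the relationship ever appear in the message."""
    accounts = accounts_for(customer_id)
    if from_acct not in accounts or to_acct not in accounts:
        raise ValueError("account not in customer relationship")
    return {"type": "XFER", "from": from_acct, "to": to_acct,
            "amount": amount_cents}
```

Because the front end does this lookup itself, the mainframe never needs to; that implicit division of labor is exactly the design assumption discussed next.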
Some Design Assumptions

From the last section, we can infer some of the assumptions that the designers of the system could make about risks and controls.
- The batch program trusts that the sources of the transactions (tellers or ATMs) have:
- identified and authenticated the requester of the transaction,
- verified the account number(s) in the transaction and
- authorized the transaction.
The Paradigm Shift

Now the bank decides that it wants to allow online access to its services. It wants to let its retail customers inquire on account balances, transfer funds between accounts, pay bills, etc. The bank decides to use Web services to implement the system. It will give its retail customers access to bank transactions via the Web and will provide a B2B "back-end" with payees in order to facilitate the bill-pay process. In this scenario, we will only focus on the customer-facing part of the system.
The Web services development team talks with the mainframe support group, learns of the message set that the branch and ATM systems use, and adopts it. They talk with the ATM support team and learn how that system uses the relationship mechanism to select the correct accounts to present to the customer. They then build an online banking application that, among other things, allows a customer to transfer money between accounts. The (abbreviated) scenario looks something like this:
- The customer logs into the online application using a combination of ID and password.
- The customer selects the "Transfer Funds" option.
- The Web application sends the account numbers in the relationship out to the browser, where they are displayed in drop-down lists that allow the customer to select the "from" account, the "to" account and the amount.
- The selection is then sent back to the Web services application which then builds the messages for the mainframe application and sends them out.
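The scenario above can be sketched as a naive handler in Python. All names and the message format are hypothetical; the essential property is faithful to the story: the handler copies the account numbers POSTed back from the browser straight into the mainframe message, trusting that the browser only returns what it was sent.

```python
def handle_transfer_post(session, form):
    """Naive Web services handler. 'session' is the authenticated customer
    session; 'form' holds the fields POSTed back from the browser's
    drop-down lists. Note that 'session' is never consulted when the
    message is built."""
    return {
        "type": "XFER",
        # Client-supplied values are trusted as-is, just as an ATM's
        # messages were trusted.
        "from": form["from_account"],
        "to": form["to_account"],
        "amount": int(form["amount_cents"]),
    }

# A tampered POST (e.g. via a client-side proxy) sails straight through:
msg = handle_transfer_post(
    {"customer": "cust-1001"},
    {"from_account": "victim-999", "to_account": "chk-111",
     "amount_cents": "5000"},
)
```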
The Risks Emerge

One of the vulnerabilities/risks falls out of the identification and authentication process. Tellers and ATMs use multi-factor authentication. The application uses single-factor authentication: a user ID and a passcode. This mechanism is subject to shoulder surfing, but that is reasonably easily managed, although harder than at an ATM. Worse by far is that it is also subject to keyboard monitoring attacks. See the articles on The Register and the Washington Post about a recent exploit in which the attacker used a spyware program to glean login IDs and passwords to his victim's brokerage account. This was an attack of limited scope only because the purpose of the attack was limited in scope. It is only a matter of time before someone tweaks one of the mass-mailing worms to deliver a keystroke logger. Then it is only a matter of harvesting the logs and grepping through them for activity on online banking/trading/commercial sites . . . See these articles at News.com and The New York Times (requires free registration) for glimpses of the tip of the iceberg.
Another risk that arises for the first time is that the customer has control over (part of) the data stream. In this naive application, the Web services layer sent the customer's account numbers and balances out to the browser. The customer then POSTed transactions back to the Web services layer. The Web services layer then constructed a message using the account and amount information from the POST and sent it to the mainframe . . . just like in the ATM world. There was never any need for the system to question anything it got from the ATM. Theoretically, the ATM was a secure machine, it implemented dual-factor authentication, and communications were encrypted. No one could get access to the data stream. This is not the case in the wonderful world of the World Wide Web. There are client-side proxies everywhere. In the new universe, it is not possible to trust the data stream. End-users can change the data stream at will, even that "protected" by SSL . . . see, for instance, @stake's WebProxy or Packet Storm's Achilles. Given the way this application was written, it was possible to transfer funds between any two accounts in the system . . . because the services layer assumed that it was not going to get anything back that it didn't send out. It did not know that it could not trust the data coming back to it.
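The missing control is straightforward to sketch: re-derive the customer's relationship on the server side and reject any account number that is not in it, rather than trusting the POSTed data. This is illustrative Python with invented names and structures, not the bank's actual code.

```python
# Hypothetical server-side copy of the relationship table.
RELATIONSHIPS = {"cust-1001": ["chk-111", "sav-222"]}

def handle_transfer_post_checked(session, form):
    """Hardened handler: the authenticated session, not the POSTed data,
    determines which accounts may appear in the mainframe message."""
    allowed = RELATIONSHIPS.get(session["customer"], [])
    from_acct = form["from_account"]
    to_acct = form["to_account"]
    # Never trust the data stream: a client-side proxy can rewrite it,
    # even under SSL.
    if from_acct not in allowed or to_acct not in allowed:
        raise PermissionError("account outside customer relationship")
    return {"type": "XFER", "from": from_acct, "to": to_acct,
            "amount": int(form["amount_cents"])}
```

The design point is that the check lives in the services layer, restoring the control that the branch process and the ATM hardware provided implicitly.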
A third risk is the absence of the paper trail that is available from branch- and ATM-based applications. If the Web service does not log activity, there is no trail. Non-repudiation is also much harder to establish in this environment. Banks and ATMs have surveillance cameras; the Web services environment does not. Even if the Web application does log activity, the single-factor authentication is so weak that the audit trail is worth little.
The only thing that saves this from being really bad is that, these days, most Web banking applications only allow funds transfer among accounts in the customer's relationship. So the monetary risk to the bank is minimal. The two major risks are:
- Reputational risk for the bank.
- Loss of privacy and identity theft for the customer.
Implications for the Mainframe and the Web Services Layer

These are serious threats to the legacy system. The Web services application is far outside of its trust domain. As described above, the mainframe cannot trust anything it gets from the Web services layer. The designers of the Web services layer did not understand what the mainframe's risk and trust models were. They didn't know that it was not OK to just "write to the interface." They did not understand that there were no controls in the mainframe system's design control set that could address the risks introduced by this new layer.
This was a deliberately simple example of a naive implementation. In this example, the Web services layer violated the mainframe system's design control assumptions because it did not verify that the account numbers being tendered by the endpoint were valid for that customer. It was easy to miss this because it was part of the business process in the branch and, in the case of the ATMs, was done implicitly. It only shows up in the Web services environment because of specific properties of that environment. The risks that the system creates by using single-factor authentication don't have a parallel in the branch- and ATM-based world. This is what makes identifying emergent risks so challenging.
A Two-tier Client-server Application

This tale is about what happens when a manufacturer decides to share the information kept in a typical early-'90s client-server line-of-business application with its customers.
Setting the Stage

This was an Oracle-based system that was a combination of off-the-shelf and internally developed code. The user interface was built using SQL*Forms, and application logic was implemented in Forms triggers as well as database triggers and stored procedures. (I am only naming these products because they are a relevant part of the design of the system). The system provided the basic functions the manufacturer needed: basic accounting, inventory management, sales order processing, standard routing, QA, shipping, etc. This was a specialist operation that provided custom processing of customers' goods, so the inventory management, shop floor control, QA and shipping subsystems tracked customers' goods from the time they were received, through the manufacturing process, all the way through shipping, in more or less real time. Order status was updated as goods flowed through the process, and Accounts Receivable was updated whenever an order was shipped and billed. Following are some of the aspects of the system that are pertinent to this discussion.
This was an integrated system which at any time could provide current:
- Customer order status
- Customer inventory by location
- Customer inventory status by state (raw, in-process, QA, finished)
- Work in process by manufacturing order
- Work in process by machine and by process
- Work in process by manufacturing department
- Work in process by customer
- Aged Accounts Receivable by customer
- Ad-hoc inquiries on various combinations of the above information
Some Design Assumptions

The application design was more or less standard for the time. It was modular; each module could stand alone, but the modules all worked against a system-level data model and could work with one another if configured to do so. Also typical of systems of this genre were the assumptions about how and by whom the modules would be used. These assumptions are the ones that will be violated by the paradigm shift. Let's take a look at some of them.
There were several modules that were affected by this decision:
- Accounts Receivable
- Order Entry
- Inventory Management
- Shop Floor Control
- Quality Assurance
- Shipping and Receiving
Since there was no formal relationship between modules, any individual could be authorized to use more than one. Access to system function was managed in a table that associated system user IDs with menu items. Through this mechanism it was possible to construct "personalized" menus for each user. When users initiated sessions, they were presented with a system menu that gave them access to whatever functions were assigned to them, across modules. "Normal" users saw only one module and, sometimes, only a subset of the functionality available in that module. The important thing to keep in mind, though, is that no matter how few functions an individual might be able to perform, the user could perform those functions on any customer, account, inventory item or whatever in the system. Herein lies the problem.
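The menu-table mechanism described above can be sketched in a few lines of Python (all identifiers are hypothetical). Note what the check does and does not cover: it gates which functions a user may run, but says nothing about which customers' rows those functions may touch.

```python
# Hypothetical menu table: user ID -> menu items (functions) assigned.
MENU_TABLE = {
    "clerk-ar": {"aged_receivables_report"},
    "floor-sup": {"wip_by_machine", "wip_by_customer"},
}

def can_run(user_id, function_name):
    """Function-level authorization only. Once authorized for a function,
    the user can run it against ANY customer, account, or inventory item
    in the system -- there is no row-level scoping."""
    return function_name in MENU_TABLE.get(user_id, set())
```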
The Paradigm Shift

Out of the box, a user who could run an aged receivables report could run that report on any, all, or a subset of all customers. A user who could query the status of work in process on the manufacturing floor could do so for any, all or a subset of machines, customers, etc. A user who could schedule shipments could do so for any or all customers, and so on. Now we want to make some of the system functionality and data available to customers. The paradigm shift is that now the customer will be allowed to:
- Get a quotation online
- Enter a new order
- Check on the status of existing orders
- Locate inventory
- Issue shipping instructions
- Check pricing
- See current aged receivables
The Risks Emerge

All of that functionality is already available. The problem is that the system design assumes that the only entities that are going to be using the system are employees who need to be able to work with all customers' accounts. The system design also assumes that the company employees will not discuss one customer's information with another customer. As was mentioned previously, the industry is intensely competitive, and processing secrets and cost and pricing schedules are jealously guarded. All of this information is readily available on the system, and there are controls in place to protect it, given the way the system was originally designed. However, giving customers direct access to the system invalidates the design threat, risk and trust models. Customers are outside the design trust domain. There are no controls in place to manage the risks that are incurred by giving customers access to the system. Some of the risks are that customers could:
- See each other's manufacturing secrets.
- See each other's cost structures.
- See the company's cost and pricing structures.
- See the company's manufacturing secrets.
- See each other's work in process.
- Reprioritize each other's orders.
- Issue bogus shipping instructions for other customers' goods.
Implications for the Legacy System and the Design of the Customer-facing Subsystem

Giving customers access to the system had the effect of defining a class of user and a type of system access that were orthogonal to those for which the system was originally designed. The move violated one of the cornerstone assumptions of the original design and meant adding a whole new dimension to the control space: it created the necessity of introducing role-based access control into the system. There is not the space here to go into the details of the surgery that had to be done to implement the new system. Suffice it to say that the job was made much easier by Oracle's understanding of roles.
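The shape of the new control dimension can be sketched briefly: a role check plus a customer-scope filter applied to every query. The real system used Oracle's roles; the Python below is only an illustration under invented identifiers, not the actual implementation.

```python
# Hypothetical role assignments: employees keep the old "see everything"
# behavior; customers get a new, scoped role.
ROLES = {"cust-1001": "customer", "clerk-ar": "employee"}

def visible_rows(user_id, rows):
    """rows: list of dicts, each tagged with the owning customer_id.
    Employees see all rows; a customer sees only their own. This is the
    row-level scoping the original design never needed."""
    if ROLES.get(user_id) == "employee":
        return rows
    return [r for r in rows if r["customer_id"] == user_id]
```

The design choice worth noting is that the filter keys off the authenticated identity, not off anything the client submits, which keeps customers inside their own slice of every module.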
Lessons Learned

In this article, I have tried to give a couple of real-world examples of emergent risks. If I succeeded in presenting them in the right way, the response of many readers will be something like: "Well, duh! Of course!" I expect that by the time readers finished the sections on design assumptions, they would have an idea of what was coming next and the glimmerings of a solution. The tale that wasn't told is the one about the "journey to enlightenment . . ."
As noted early in the article, the probability of the existence of a design document that details the threat and risk models, the control set, and the trust model against which the system was designed is slim to negligible. The older the legacy system, the less likely it is. (At least in the commercial universe. There is more of that sort of information available in some military and intelligence systems. But for Joe Sixpack Webspert who is trying to build an HR portal, that information is not there). Every organization is different, but if one reads the mailing lists and newsgroups, one comes away with the overwhelming impression that there is no one who, at the beginning of a Web services project, says: "Guys, before we start designing this thing, we need to understand the security assumptions that are being made by the systems to which we are going to connect so that we can be sure that we don't violate them with our system." Not once have I experienced that. The thing is, if it is not done, the Web services application is very likely to become the legacy system's greatest threat. I do not believe that there is an intentional decision not to go through the process. In almost all situations, it has just been a case of no one thinking about it. Simply a lack of awareness.
Compounding the lack of awareness on the part of the Web services development team's management is the chasm between them and the team that supports the legacy systems. Though there might be no written documentation of the assumptions made by the design of the legacy system, that information usually is available, but it's in the heads of the people who are supporting the legacy system. In large organizations, there is usually considerable distance, both organizational and geographic, between the two teams. Sometimes the Web services development team is part of a business unit in one city and the mainframe support team is in the Corporate IT group at corporate headquarters. Even worse, the mainframe application support team may be outsourced, and the people who really know the system are half a world and twelve time zones away. In situations like this, communication between the teams is likely to be cumbersome and painful, making it very hard to establish the kind of interaction that facilitates the discovery process. In the end, this situation becomes yet another of the emergent risks that has to be managed.
Getting Through the First Pass of the Discovery Process

It would be nice to be able to produce a list of emergent risks that could be handed to the Web services development team. Given that every situation is going to be different, that is impossible. There are, however, some general classes of controls that can form the basis for research. If the team is lucky enough to establish a good dialogue with the legacy system support team, detailed information will fall out of the discussions. A way to get started would be to ask the following questions about the controls listed below: Are there any? What are they? What was the threat and risk profile against which they were chosen? What risks are they intended to manage? What are the design trust assumptions? What are the design trust boundaries?
- Identification controls
- Authentication controls
- Authorization/access controls
- Non-repudiation controls
- Confidentiality controls
- Integrity controls
- Availability controls
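One way to make the first pass systematic is to cross the control classes above with the discovery questions, producing one worksheet row per pair to fill in during interviews with the legacy support team. A trivial, purely illustrative sketch of the structure:

```python
# The seven control classes listed above.
CONTROL_CLASSES = [
    "identification", "authentication", "authorization/access",
    "non-repudiation", "confidentiality", "integrity", "availability",
]

# The discovery questions from the preceding paragraph.
QUESTIONS = [
    "Are there any?",
    "What are they?",
    "What threat and risk profile were they chosen against?",
    "What risks are they intended to manage?",
    "What are the design trust assumptions and boundaries?",
]

def discovery_checklist():
    """One (control class, question) row per pair -- a worksheet skeleton
    for the first pass of the discovery process."""
    return [(c, q) for c in CONTROL_CLASSES for q in QUESTIONS]
```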
Organizations that have mature (IT) risk management programs already do this or something like it. The ones that have a robust Information Assurance program probably have much of the necessary information in formal documentation and have representatives from IA/InfoSec on all development and support teams. For these organizations, this is probably not news. However, there are many more organizations that do not have mature risk management programs and robust Information Assurance programs than there are those that do. In the past, it has just not been necessary for these organizations to develop them. Introducing Web services into a system can introduce significant new risks to legacy systems. It is for concerned individuals in those organizations that this article was written.