A Conversation with Steve Ross-Talbot


March 29, 2006
Volume 4, issue 2



The IT world has long been plagued by a disconnect between theory and practice—academics theorizing in their ivory towers; programmers at “Initech” toiling away in their corporate cubicles. While this might be a somewhat naïve characterization, the fact remains that both academics and practitioners could do a better job of sharing their ideas and innovations with each other. As a result, cutting-edge research often fails to find practical application in the marketplace.

This is why the world needs more people like Steve Ross-Talbot. He has more than 20 years of experience leveraging cutting-edge research and applying it to real business problems. Recently he founded Pi4 Technologies where he and his team draw on the field of the pi-calculus to improve the ability to design, automate, and analyze business processes.

In addition to his entrepreneurial experience, Ross-Talbot holds positions on several standards bodies, including the World Wide Web Consortium, where he is chair of the Web Services Coordination Group and co-chair of the Web Services Choreography Working Group.

Interviewing Ross-Talbot is Stephen Sparkes, CIO of Morgan Stanley’s investment banking division. Sparkes is no stranger to the field of business process management, having spent more than 20 years working in technology for leading financial institutions in a range of development and infrastructure roles.

STEPHEN SPARKES In addition to your work with the W3C, you’ve been through several start-ups. Can you tell us a bit about the role of each company and the evolution of your research?

STEVE ROSS-TALBOT In 1997, I started SpiritSoft. It was then called Push Technologies, but we rebadged it because the term push technology was getting a lot of bad press, as people were publishing content over the Internet and consuming vast amounts of bandwidth. SpiritSoft’s mission was to build a generic event-condition-action, or CEP (complex event processing) facility.

I had worked on a very large project called Hoodini (for highly object-oriented development) at Nomura International, where I was asked to deliver an active query facility. It turned out to be a special case of event-condition-action where the event is the change in the database, the condition is the predicate or query that you’re wanting to subscribe to, and the action is to refresh your query results set, to remove things from it or add things to it, and then inform an application.
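As an illustration only (not code from Hoodini or SpiritSoft), the active-query pattern described above can be sketched in Python. The class name `ActiveQuery`, its methods, and the sample rows are all hypothetical; the sketch assumes the database reports each change as a key plus the new row, or `None` for a deletion.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional

Row = Dict[str, object]


@dataclass
class ActiveQuery:
    """A subscription: a condition (predicate) plus a live result set."""
    predicate: Callable[[Row], bool]          # the condition
    notify: Callable[[str, Row], None]        # how the application is informed
    results: Dict[int, Row] = field(default_factory=dict)

    def on_change(self, key: int, row: Optional[Row]) -> None:
        """Event: the row stored under `key` changed (None means deleted)."""
        matches = row is not None and self.predicate(row)
        if matches:
            # Action: add the row to the result set, or refresh it in place.
            kind = "update" if key in self.results else "add"
            self.results[key] = row
            self.notify(kind, row)
        elif key in self.results:
            # Action: the row no longer satisfies the query, so drop it.
            old = self.results.pop(key)
            self.notify("remove", old)


# Hypothetical usage: subscribe to "trades with quantity over 100".
log = []
query = ActiveQuery(
    predicate=lambda r: r["qty"] > 100,
    notify=lambda kind, r: log.append((kind, r["id"])),
)
query.on_change(1, {"id": 1, "qty": 500})   # enters the result set
query.on_change(2, {"id": 2, "qty": 50})    # never matches, so no notification
query.on_change(1, {"id": 1, "qty": 10})    # leaves the result set
```

The application never re-runs the query; it only reacts to the notifications, which is where the bandwidth saving comes from.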

The only reason for doing that is to reduce bandwidth on the server so that you can start distributing the processing, and therefore you don’t have to go back to the server to read your queries. You obviously change the programming model on most of the applications because they have to be event-driven. But if you’re doing GUI interfaces, generally you have an event loop, so being event-driven is not so strange.

The key was to build something that was flexible around this notion of active queries. I left Nomura in ’97 to start SpiritSoft in order to deliver a generic capability for event-condition-action computing because, at that time, I certainly felt that it was a very interesting and perhaps a fundamental way of building systems. These days we talk about event-driven architectures and service-oriented architectures as if they have been around forever, but there was a lot of stuff that went on before we got to this point, and event-condition-action and active systems were part of that.

I believed for a long time that event-condition-action was fundamental to the notion of autonomic computing and went on record as using event-condition-action to build workflows.

I left SpiritSoft in about 2001, because, like all start-ups, you have to focus on the things that you can deliver immediately, and SpiritSoft needed to focus on messaging, not on the high-level stuff that I very much wanted to do.

I got a dispensation to leave and do more work on event-condition-action, whereupon I met Duncan Johnston-Watt, who persuaded me to join a new company called Enigmatec.

At Enigmatec we went back to fundamentals. This really is the thread upon which the pi-calculus rests for me. When you do lots of event-condition-actions, if the action itself is to publish, you get a causal chain. So one event-condition-action rule ends up firing another, but you do not know that you have a causal chain—at least the system does not tell you.

It troubled me, for a considerable time, that this was somewhat uncontrollable, and certainly if I were a CIO and somebody said they were doing stuff and it’s terribly flexible, I’d be seriously worried about the fragility of my infrastructure with people subscribing to events and then onward publishing on the fly.

So causality started to trouble me, and I was looking for ways of understanding the fundamentals of interaction, because these subscriptions to events and the onward publishing of an event really have to do with an interaction between different services or different components in a distributed framework.

Many years before I did any of this, I studied under Robin Milner, the inventor of the pi-calculus, at Edinburgh University. I came back to the pi-calculus at Enigmatec and started to reread all of my original lecture notes, and then the books, and finally started to communicate with Robin himself. It then became quite obvious that there was a way of understanding causality in a more fundamental way.

One of the interesting things in the pi-calculus is that if you have the notion of identity so that you can point to a specific interaction between any two participants, and then point to the identity of an onward interaction that may follow, you’ve now got a causal chain with the identity token that is needed to establish linkage. This answered the problem that I was wrestling with, which was all about causality and how to manage it.
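A rough sketch of the idea, in Python and under loud assumptions: this is not the pi-calculus and not Enigmatec's fabric, just a toy publish/subscribe bus (the `Bus` and `Event` names are invented here) in which every event carries its own identity and the identity of the event that caused it. With those two tokens the causal chain can be read back explicitly.

```python
import itertools
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Event:
    topic: str
    id: int
    cause: Optional[int]   # identity of the interaction that triggered this one


class Bus:
    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self.rules: Dict[str, List[str]] = {}   # topic -> topics published onward
        self.seen: Dict[int, Event] = {}        # every event, by identity

    def rule(self, on_topic: str, publish_topic: str) -> None:
        """Register a rule whose action is to publish another event."""
        self.rules.setdefault(on_topic, []).append(publish_topic)

    def publish(self, topic: str, cause: Optional[int] = None) -> Event:
        event = Event(topic, next(self._ids), cause)
        self.seen[event.id] = event
        # Onward publishing: each follow-on event records what caused it.
        # (A cyclic rule set would recurse forever; this toy does not guard it.)
        for onward in self.rules.get(topic, []):
            self.publish(onward, cause=event.id)
        return event

    def chain(self, event_id: int) -> List[str]:
        """Follow the cause links back to the root and return the topics in order."""
        topics = []
        current: Optional[int] = event_id
        while current is not None:
            event = self.seen[current]
            topics.append(event.topic)
            current = event.cause
        return list(reversed(topics))


# Hypothetical rules: booking triggers validation, which triggers confirmation.
bus = Bus()
bus.rule("trade.booked", "trade.validated")
bus.rule("trade.validated", "trade.confirmed")
bus.publish("trade.booked")
last_id = max(bus.seen)
causal_chain = bus.chain(last_id)
# causal_chain == ["trade.booked", "trade.validated", "trade.confirmed"]
```

Without the `cause` field the three events would look unrelated, which is the situation described earlier where the system does not tell you that a chain exists.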

At Enigmatec, we told the venture capitalists we were doing one thing, but what we actually were doing was building a distributed virtual pi-calculus fabric in which you create highly distributed systems and run them in the fabric. The long-term aim was to be able to ask questions about systems, and the sorts of questions that we wanted to know were derived from causality. For example: Is our system free from livelocks? Is our system free from deadlocks? Does it have any race conditions?

These are the sorts of things that consume about half of your development and test time. Certainly in my experience the worst debugging efforts that I’ve ever had to undergo had to do with timing and resource sharing, which showed up as livelocks, deadlocks, and race conditions. Generally, what Java programmers were doing at the time to get rid of them, when they were under pressure, was to change the synchronization block and make it wider, which reduced the opportunity for livelocks and deadlocks. It didn’t fix the problem, really; what it did was alleviate the symptom.

SS Just made it slightly less likely to occur?

SR-T That’s it. And I became obsessed with perfection, because I really felt you must be able to do this.

SS It should be an absolute proof.

SR-T Completely. We can prove something is an integer. Surely, we should be able to prove something about interaction. I started looking at other papers that leveraged the work of Robin Milner and the pi-calculus and found some fundamental work by Vasco Vasconcelos and Kohei Honda. Their work looked at something called session types, which, from a layman’s perspective, is this notion of identity tokens prepended to the interactions.

If you have that, then you can establish a graph—a wait-for graph or a causality graph—and statically analyze the program to determine whether it will exhibit livelocks, deadlocks, or race conditions.
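To make the wait-for graph concrete, here is a minimal Python sketch. It assumes the graph has already been extracted and is given as a dictionary mapping each participant to the participants it is waiting on; the function name `find_cycle` and the sample graphs are hypothetical. It only performs a plain depth-first cycle search, which is far simpler than session-type analysis, but a cycle in such a graph is the classic signature of a deadlock.

```python
def find_cycle(wait_for):
    """Return a list of nodes forming a cycle in the wait-for graph, or None."""
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on current path, finished
    color = {node: WHITE for node in wait_for}
    path = []

    def visit(node):
        color[node] = GREY
        path.append(node)
        for nxt in wait_for.get(node, ()):
            state = color.get(nxt, WHITE)
            if state == GREY:
                # nxt is already on the current path: we have closed a loop.
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for node in list(wait_for):
        if color[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None


# Hypothetical graphs: each service waits on the next one.
deadlocked = {"A": ["B"], "B": ["C"], "C": ["A"]}
healthy = {"A": ["B"], "B": ["C"], "C": []}

cycle = find_cycle(deadlocked)    # ["A", "B", "C", "A"]
no_cycle = find_cycle(healthy)    # None
```

The point of the static approach described above is that a graph like this would be derived from the program's types before it runs, rather than observed after the system has already hung.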

I started to work with Kohei Honda on putting this into systems. That’s when the Choreography Working Group was established, and that’s when I became involved in the W3C.

It was quite natural to get the academics involved, which is why Robin Milner became an invited expert on that working group, along with Kohei Honda and Nobuko Yoshida. The three of them really formed the fulcrum for delivering these thoughts of very advanced behavioral typing systems, which the layman—the programmer—would never see. You would never need to see the algorithm. You would never need to read the literature—unless you were having trouble sleeping at night.

That’s the only reason I can think of for people reading it, unless you want to implement it. It should be something that’s not there, in the same way that we don’t have to know about Turing completeness and Turing machines and lambda-calculus to write software. That’s all dealt with by the type system in general.

I managed to establish early on a bake-off between Petri net theory and the pi-calculus within the Choreography Working Group, and clearly the pi-calculus won the day. It won because when you deal with large, complex, distributed systems, one of the most common patterns that you come across is what we call a callback, where I might pass my details to you, and you might pass it to somebody else, in order for them to talk back to me. In Petri net theory, you can’t change the graph: it’s static. In the pi-calculus, you can. That notion of composability is essential to the success of any distributed formalism in this Internet age, where we commonly pass what are called channels in the pi-calculus, but we might call them URLs, between people. We do it in e-mails all the time, and we’re not conscious of what that may mean if there is automated computing at the end of the e-mail doing something with our URL that was just passed. There’s a real need to understand the formal side of things in order to have composability across the Internet, let alone within the domains of a particular organization.
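The callback pattern can be sketched in Python, with `queue.Queue` objects standing in for pi-calculus channels. This is an analogy only, and every name here (`client`, `broker`, `pricer`, the message contents) is invented for illustration. What matters is that the reply channel is itself sent as data, so a participant that was never wired to the client can still talk back to it.

```python
import queue


def client(to_broker):
    """Create a fresh private channel and send it along with the request."""
    reply_channel = queue.Queue()
    to_broker.put(("quote?", reply_channel))
    return reply_channel


def broker(inbox, to_pricer):
    """Pass the request, including the client's channel, to somebody else."""
    request, reply_to = inbox.get()
    to_pricer.put((request, reply_to))


def pricer(inbox):
    """Answer on the channel that arrived in the message."""
    request, reply_to = inbox.get()
    reply_to.put("quote: 101.5")


# The steps run one after another here to keep the sketch deterministic;
# in a real system each participant would be a separate process or service.
broker_inbox = queue.Queue()
pricer_inbox = queue.Queue()

reply = client(broker_inbox)
broker(broker_inbox, pricer_inbox)
pricer(pricer_inbox)
answer = reply.get_nowait()   # "quote: 101.5"
```

The link from pricer to client did not exist when the program started; it came into being when the channel was passed, which is the kind of change to the communication graph that the comparison above turns on.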

I carried on doing this stuff at Enigmatec, but again, as with all start-ups, it turned out that Enigmatec really needed to focus on a point-solution to a problem and to use the distributed computing framework that we had put together. And quite rightly so, it focused on disaster recovery orchestration, which has to be highly distributed in and of itself. That left me with a little bit of a problem, funding-wise, because then how could I really take advantage of choreography? At that stage I had been working on this stuff for two years.

It made sense again to move on, and Enigmatec kindly donated all of the source code that we had established around choreography to an open source project, allowing me and my business partner, Gary Brown, to do what we now do in our new company called Pi4 Technologies. As the name suggests, it’s based around the notion of using the pi-calculus.

SS How are we going to take advantage of the phenomenal power of the pi-calculus? What shape will that manifestation take? Will it be so deeply embedded we just need to be aware of it, or is it something we’re actually going to experience firsthand?

SR-T I certainly don’t think you will experience it firsthand. I think it will exhibit itself as just a typing mechanism. If, for example, you’re an architect in a large enterprise, then commonly you have three roles: the business architect, who’s responsible for gathering requirements (functional and nonfunctional) as well as business processes; the systems architect, who’s responsible for trying to fit those requirements and business processes into a system; and a technical architect, who’s responsible for realizing that. Usually the systems architect and technical architect are the same person. When we build these systems as architects, what we often use are things like UML (Unified Modeling Language) for describing our overall systems, and we do that primarily because it gives us a common language. But as we all know, most projects fail because of poor communications and perhaps poor requirements as well.

I think the first manifestation will be in tools for the systems architects to describe unambiguously the interactions that occur between the fundamental roles or components or services in a service-oriented architecture.

For example, in a back-office processing environment, you might have validation and enrichments; you might have cash flow and settlements; you might have confirmations; you might have reporting and risk. All of these things have to happen, and they don’t happen serially; they happen in parallel.

What will happen is we’ll deliver solutions based on the CDL (choreography description language), which allows systems architects to describe their systems fully, from an observable perspective, with guarantees about the introduction or nonintroduction of livelocks, deadlocks, and race conditions. You can think of it as a way of adding rigor to the description, much in the same way that systems designers in avionics and electronics do. We’re trying to move up the stack, using technology, and to get the same benefits, generally, that people understand in normal engineering contexts outside of software.

If you have a livelock, a race condition, or a deadlock, it will show up as either a warning or an error in your system description, and it will point you to the specific lines in your choreography where that’s occurring. You won’t have to be aware how it does that. What you’ll get is, “Hey, you might have a problem here.”

SS So in much the same way that the compiler would have flagged a low-level error in days gone by, this will flag an error in the interactions modeling.

SR-T Correct.

SS In the past large enterprises spent gazillions of dollars on business process engineering consulting initiatives, but then ended up with a telephone directory of a modeled process or a consultant’s view of the modeled process. That was a dead end because there was no way to transition into an implementation of a system. What you have described seems at least to build half of the bridge: to be able to give you confidence in the final solution.

SR-T You raise a very interesting point. I was recently at an architects’ summit with some of the great and the good from the industry getting together to talk about the fundamental architectural principles and what architects want from vendors. Somebody actually said what they would like to do in the future—setting a timeline of 2008—is to Google for the services, both in terms of the attributes, as well as the behavior. With this idea that you mentioned about having these big books of business processes, maybe, just maybe, it would be possible to do that. I certainly know that there is some work with other pi-calculus aficionados where they are creating a large XML database that encodes the behavior of components that would deliver exactly this sort of capability.

What this would do is set up competition in a more fundamental way among component providers. You get the guarantee of interoperability because you’ve got the right functions and the right behavioral footprint. Then you could actually decide which components can play in your choreography—internally or externally—and you could make those decisions based purely on the nonfunctional requirements: “How much does it cost?” “What’s its reliability?” That would be a brave new world.

SS It would be a brave new world indeed. To have that degree of abstraction applied to system selection or component selection, with the confidence that you would get interoperability, is a huge step toward improving the efficiency and effectiveness of technology investments.

SR-T To be clear, we’re not there yet, but that horizon of 2008 that the architects’ summit set is not an aggressive timeline. My guess is that if we delivered this next year—and there is a possibility we could deliver it next year at Pi4 Tech—it won’t be ready for real commercial use probably for a couple of years after that because it would have to go through some tremendous trials.

SS It would be interesting to hear from you about the Pi4 Tech business model and, given that you’re an early-stage company in a leading-edge field, how you’re ensuring that you share the advances that you’re making and that you’re harnessing the broadest number of minds to attack the problem, while at the same time maintaining some commercial prospects for your company.

SR-T It’s quite important that I keep my daughter at dancing school, so the money is an issue. There are a couple of things that I’m doing in this respect. Rather than start up one company, I started up two, which may sound a bit bizarre, but it’s a hedging mechanism in a sense.

Pi4 Technologies exists primarily, and forevermore really, as the open source custodian of the pi-calculus-based tools, the first manifestation of which is the CDL tool suite from Pi4 Tech, and that is open sourced under Apache licenses—Apache 2.0 currently—forevermore.

The monetization of that comes primarily from the second company, Hattrick Software, which is co-founded by Michael Paull (CEO), Dr. Gary Brown, and me. Hattrick is really a verticalization of business needs that can be provided using choreography. It provides the professional open source support and consulting for the Pi4 SOA (service-oriented architecture), the Pi4 Tech CDL tool suite, and it builds on top of that to deliver a technology called the behavioral backbone.

To understand what the behavioral backbone is, imagine you’ve got a distributed infrastructure where you could describe what you have currently in CDL and then generate monitors for each of the roles, which act as sniffers on the wire—just a little bump in the wire that’s sniffing on the surface of a component for what goes in and what goes out, and the order. The behavioral backbone provides you with the monitoring agents and the consolidation of that monitoring information against a choreography. So what you get to look at is variance from your plan, and it does that without doing anything to your services. It’s a neat way of figuring out whether your understanding of your distributed system is accurate, because you could do this and find that actually it’s completely different, and you may iterate several times until you get some kind of stable state that really does reflect what you have.
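A heavily simplified Python sketch of "variance from your plan" follows. It is not WS-CDL syntax and not how the Pi4 Tech or Hattrick tools work; it assumes the choreography is a strictly linear list of (sender, receiver, message) interactions, whereas real choreographies include parallelism and choice. The function name `variance` and the role names are hypothetical.

```python
def variance(choreography, observed):
    """Compare an observed trace with the planned sequence of interactions.

    Returns a list of (position, expected, actual) tuples; an empty list
    means the observed behavior matched the plan.
    """
    problems = []
    # Positions present in both: report any mismatch.
    for i, (expected, actual) in enumerate(zip(choreography, observed)):
        if expected != actual:
            problems.append((i, expected, actual))
    # Planned interactions that were never observed.
    for i in range(len(observed), len(choreography)):
        problems.append((i, choreography[i], None))
    # Observed interactions that the plan does not mention.
    for i in range(len(choreography), len(observed)):
        problems.append((i, None, observed[i]))
    return problems


# Hypothetical plan for a back-office flow.
plan = [
    ("Trading", "Validation", "trade"),
    ("Validation", "Settlement", "validated"),
    ("Settlement", "Confirmation", "settled"),
]

# What the monitors on the wire reported: the last step never happened.
trace = plan[:2]

report = variance(plan, trace)
# report == [(2, ("Settlement", "Confirmation", "settled"), None)]
```

Because the comparison only consumes what was seen on the wire, the services themselves are untouched, which matches the non-intrusive monitoring described above.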

The advantage is that once you’ve got the advanced type system, first you can be sure it’s a reflection of what you have, and second you can run the advanced type system against it and find all the problems that you didn’t realize you had.

SS In many cases, that may reveal a problem in the actual business process that has been modeled, so there’s some interesting potential here, as well, as a way of exposing flaws in business processes.

SR-T The driving force behind this is doing composite or structured products in an investment bank. This is a drive toward derivatives trading, and that’s simply because today in most institutions it appears that there are huge backlogs in things like confirmations and probably even in cash flow and settlement, though I suspect fewer backlogs there—banks are very good at getting their money.

So, the behavioral backbone really was developed to enable people to manage their settlement processes better and to manage the life-cycle description of complex trades by describing the behavior of the trade across its lifetime.

You can imagine a trade documented in FpML (Financial Products Markup Language), which is a static description of the trade, but with an extension that has a CDL description that describes the whole life cycle of that trade and where all the touch points might be. So you’ve got one orthogonal description of the life cycle of the trade and another orthogonal description that is really the settlement process. That’s really what Hattrick is doing, using the Pi4 Tech tool suite.

That’s how we’re monetizing our work. The money from Hattrick gets fed back to Pi4 Tech as a percentage that Pi4 Tech takes for all the open source support and consulting. That money sits in a big pot and is used to fund research projects.

That’s how we can manage to engage with the widest community in terms of furthering the groundbreaking work. Just to give you an example, we’ve got Kohei Honda, Nobuko Yoshida, and Robin Milner all working on the open source stuff for the advanced type system. We’ve also had two other research groups approach Pi4 Tech to contribute. One is the University of Bologna, which is well known in the pi-calculus community, and the other is Imperial College, which is Howard Foster’s group that has been using something called the labeled transition system analyzer.

By open sourcing in this way, we’ve managed to remove all of the IP (intellectual property) issues that worry academics, and as a result, we have academia working on business-focused problems, which is very rare. I know that Professor Robin Milner is very excited by the way we’ve managed to pull people together and to give relevance to all the body of work that he has founded, and it’s that relevance that’s key.

I’ve always been pretty good at taking very advanced research and figuring out how to give it business relevance. I’m not a good researcher. I can just join the dots quite well.

SS With Hattrick Software leveraging the platform that Pi4 Tech has created, you obviously have aspirations for others to leverage that platform either in other verticals or perhaps in competition with Hattrick. The big guys will all present their existing BPM (business process management) solutions, but without this capability, they are lacking a massive element that would make them successful. Are you seeing interest from the larger players, or is it still not on their radar?

SR-T It’s really not on their radar yet. Some players have expressed an interest. There’s one smaller data-integration specialist company that is starting to use our CDL tools, and of course, oddly enough, Hattrick and anyone else has the ability to write their own. What Pi4 Tech has done is to open source this technology, so that doesn’t have to be how you get into the business.

I would fully expect many companies to leverage what Pi4 Tech has and then write their own over time because they will want to own the means of production. Any VC-funded or publicly listed company would want that, and that’s good. That’s to be encouraged.

There’s also one large, quite well-known software and hardware vendor that is interested, and we’re having quite a lot of dialogue, but nothing solid yet. I think the way to fuel that is through success stories. The first success story was the adoption of CDL by FpML.org.

The second success story that I’m hoping for is that SWIFT (Society for Worldwide Interbank Financial Telecommunication) will adopt this. If so—and I’m tempting fate somewhat—then I think the large vendors—the Oracles, IBMs, Microsofts—cannot possibly ignore it. It becomes too important, and they’re absolutely welcome to use the Pi4 Tech open source stuff.

They are completely at liberty to rewrite everything they want, as long as they don’t break the open source license agreement. They can rewrite it from scratch, but what we want to do is give them a kick start. We want this stuff to be used and made as relevant as possible.

Making it open source from day one was an interesting way to go. I’ve never done that with a company before, but I think it’s absolutely necessary in order to get the adoption that’s required to make it useful—and to make it be seen to be useful.

Without large vendors all over this, it would have been a much harder road if we had not made it open source. Open sourcing does a couple of things. On the business side, it takes away the risks, and on the personal side, it means that both Dr. Gary Brown and I will always have the ability to work on this, which is something not to be underestimated, because there is a need for continuity.

SS You correctly pointed out that eventually any R&D stage in a corporation becomes targeted on an implementation and a solution, and that it’s in an organization’s interest to narrow its focus and actually survive as an independent entity. I think that creating a vehicle for you to continue to develop that is a very smart move.

SR-T The IP in all this should not be in the production of choreography tools or the use of the pi-calculus. The real IP, the real value for a business, is the choreography. So what I would hope is that over time Hattrick and others will enter the market and create choreographies that will form templates for businesses. Then businesses can take those and adjust them. That’s the real IP.

Above all, in trading, people use spreadsheets all over the place, and you kind of wish they didn’t. The real IP that the bank has in the front office and in risk is often embedded in a spreadsheet. It’s not in the spreadsheet program.

SS I would say that you’re selling yourself short in that prior to VisiCalc or subsequent clones, the entity that possessed VisiCalc had a significant commercial advantage over those that did not. I think that you’ve been working in this field for a long time, and have absorbed and internalized the benefits of the platform, whereas to somebody outside of the field, the platform itself has value. Agreed, the ultimate value is going to be in the vertical implementation and the choreography that fits a specific business process.

SR-T You are probably correct. Value extraction is really the Hattrick model of things. So what we’re really doing is mitigating license costs, which tend to be dominated by the introduction of a sales force, a marketing team, and everything else that companies end up paying for. We’re sort of making it more transparent and saying, “Look, use the software and buy insurance.”

So we all extract value. To me, the value isn’t to get rich. It would be nice, but that isn’t the focus. I fell into that trap in the dot-com boom, and never will I fall into it again. To me, the value is in just doing what I do, which is try to deliver good technology to solve interesting problems and to get fair returns for it over time. I’m just skewing the model, really, so that we get wider adoption by having an open source model.

Eventually, we’ll get fair value and everything in the garden will be roses, and I can buy my daughter a car when she’s 17.

SS Based on my appreciation of what you have achieved so far, I’m very confident that you’ll be able to get her a very nice car when she’s 17, if there’s any justice in the world.

Originally published in Queue vol. 4, no. 2