BM-Hitachi 747 page 1

ASSIGNMENT

Hitachi Instrument Division and Boehringer Mannheim (BM) had a long-standing relationship building and marketing clinical chemistry analyzers. BM defined market needs and requirements while Hitachi designed and produced the instruments. Hitachi's instruments were very highly regarded for their mechanical perfection, but their software lagged, not so much in reliability as in ease of use. BM defined legitimate requirements based on customer feedback and general industry criticism, but Hitachi negotiated away many of the requests, claiming that they were simply impossible to meet. Hitachi also suggested that many of the problems were caused by users outside of Japan, who refused to follow rules. For the new 747 instrument, BM asked that Hitachi hire someone else to develop software for the rest of the world outside of Japan. Hitachi suggested that software developed for American users would be accepted around the world because Americans follow rules less than anyone else. As the Instrument Division had no design capability in the US, it contracted with Hitachi America's subsidiary, Hitachi Instruments, to develop the software. Hitachi Instruments had capable programmers, but they were occupied with other work, and this was going to be a big and complicated job. A new team was to be created, led by American managers and an American software lead. They were to be free to use employees of both Hitachi Instruments (all based in California) and the Instrument Division (who would come from Japan), as well as to recruit new employees and contractors. I was hired as the software lead.

REQUIREMENTS

For requirements, I was given an extremely detailed existential description of the software for Hitachi's 911 instrument, which exhibited all of the problems that BM wanted to avoid. Hitachi did not give me BM's requests, as these had already been negotiated out of the requirements. The documentation clearly described a system with the problems that I had been told needed to be corrected, so I developed my own requirements based on the overall directions that I had been given, customer complaints, and journal articles discussing the difficulties of using Hitachi instruments. These initial requirements were general in nature, for example that the instrument would multitask. As the project progressed and I began to communicate directly with BM marketing, engineering, and field applications personnel, I added requirements more specifically targeted to the 747, such as reducing the coupling between sub-systems. As I developed solutions to each problem and presented them to Hitachi, it became apparent that they too recognized many of the problems but found correcting them difficult due to the compartmentalization and rigidity of the Hitachi organization. In the more fluid development environment that we established for the 747, the Hitachi engineers' excellent engineering skills were put to much better use than they had been in previous instruments. They were willing and able to accept even my most radical ideas because they understood the technical justification.

MULTITASKING

From the beginning and throughout the life of this project, this one requirement was held above all others. With instrument control, result analysis, printing, network communication, QC, user interface, and other tasks all continuously running, it is clear that clinical instruments need some form of multitasking. What was not clear at the time was how to achieve this. The two dominant PC operating systems were MSDOS, which had no ability to multitask, and Unix, which afforded only a very expensive (in terms of CPU bandwidth) full task switch capability. Many developers, including Hitachi (and Sequoia-Turner, the originators of Abbott's Cell Dyn series), opted to use MSDOS with an explicit state machine to govern each task. To reduce the high overhead cost, Hitachi allowed each task to consume several seconds or more before relinquishing its time slot, causing an infamous unresponsiveness in all domains.

I also chose MSDOS, but I developed a lightweight threading mechanism that was extremely efficient and allowed multitasking of normal code, as opposed to the highly contorted explicit state machines required by other approaches. I described my invention in “Software Partitioning for Multitasking Communication”, published in Dr. Dobb's Journal. Combined with sub-system coupling improvements that I made, the resulting instrument was very responsive in all domains. BM saw this as such a significant improvement that they made multitasking the theme of their 747 introductory advertising campaign.
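To illustrate the difference between an explicit state machine and "normal code" that multitasks, here is a minimal sketch of cooperative threading. It is not the 747 implementation (which ran under MSDOS and is described in the article); Python generators are used only as a stand-in, and the task names and the `run` scheduler are hypothetical.

```python
from collections import deque

def printer(log):
    # Ordinary sequential code. Each `yield` is a point where the task
    # voluntarily gives up the CPU, so no explicit state variable is needed.
    for page in range(1, 3):
        log.append(f"print page {page}")
        yield

def instrument(log):
    for step in ("aspirate", "dispense", "read"):
        log.append(f"instrument {step}")
        yield

def run(tasks):
    """Round-robin scheduler: resume each task until its next yield."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)
            ready.append(task)   # still has work, so it rejoins the queue
        except StopIteration:
            pass                 # task finished, so it is dropped

log = []
run([printer(log), instrument(log)])
```

The two tasks interleave one step at a time, and each task body reads as a plain loop rather than a switch over saved states.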

COMMUNICATION AND COUPLING

Hitachi's communication design was exacerbating the software multitasking problem. In many transaction types the analyzer expected a nearly instantaneous informative response from the controller. In most cases the analyzer had the information needed to submit a query but would do nothing with it until just before it needed the response. Working with the analyzer code developers, I divided transactions into separate query and response phases, affording the controller much more time to develop informative responses. To handle the cases where such separation was not feasible, I developed a new type of communication design pattern, in which low-level fast responders (i.e. ISRs) can serve information known only to slower high-level tasks in a manner similar to scatter-gather DMA. The Dr. Dobb's Journal article describes this. This approach affords a thousand-fold improvement in application response time at no cost in complexity. In fact, complexity is reduced in that much of the communication is governed by tables that convert a portion of the code from procedural to declarative. The benefit of this was made very clear when a programmer investigating a communication problem noticed a discrepancy in one of the table entries that did turn out to be an error. What is notable is that this programmer, who did not know any assembly language, was able to diagnose and correct an error in the assembly language ISR. She never could have done this had the ISR been entirely procedural.
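The table-driven idea can be sketched roughly as follows. This is an illustration under assumptions, not the design from the article: the table layout, the query codes, and the function names are all hypothetical. A slow high-level task publishes the pieces of each answer ahead of time, and the fast responder (standing in for an ISR) does nothing but look up and gather them.

```python
# Hypothetical response table shared by the slow task and the fast responder.
# Each entry lists the fragments to send for one query code.
response_table = {}

def high_level_update(query_code, fragments):
    """Slow task: publish the pieces of the answer before they are needed."""
    response_table[query_code] = tuple(fragments)

def fast_responder(query_code):
    """Stand-in for an ISR: no computation, only a table lookup and a gather."""
    fragments = response_table.get(query_code, (b"NAK",))
    return b"".join(fragments)

high_level_update(0x21, [b"ACK", b"\x05", b"GLU"])
known = fast_responder(0x21)     # assembled from the published fragments
unknown = fast_responder(0x99)   # no entry, so the default reply is used
```

Because the behavior lives in the table rather than in the responder's code, a wrong entry can be spotted and corrected without reading the low-level routine.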

LAB COMMUNICATION

Hitachi had expected me to continue using their old lab communication protocol for the new 747 but I found too many problems in it. Its error detection and correction means were weak and inadequately defined and it communicated poorly with the lab management computer.

Hitachi had committed the common blunder of using framing characters that could also occur in the payload, rendering them useless as delimiters. I corrected the transport layer by packetizing all tokens and defining a rigorous error detection and recovery protocol. Contrary to some Hitachi concerns that my exacting definition and rigorous documentation would confuse LIS (Lab Information System) programmers, we received only compliments for producing such prescriptive documentation.
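The framing problem and one generic way around it can be sketched as below. The actual 747 packet format is not given here, so this layout (a 2-byte length, the payload, then a CRC-32) is purely an assumption chosen for illustration. Because the receiver counts bytes instead of scanning for delimiters, the payload may contain any byte values.

```python
import struct
import zlib

def encode_packet(payload: bytes) -> bytes:
    # Assumed layout: 2-byte big-endian length, payload, 4-byte CRC-32.
    header = struct.pack(">H", len(payload))
    trailer = struct.pack(">I", zlib.crc32(payload))
    return header + payload + trailer

def decode_packet(packet: bytes) -> bytes:
    if len(packet) < 6:
        raise ValueError("packet too short")
    (length,) = struct.unpack(">H", packet[:2])
    if len(packet) != 2 + length + 4:
        raise ValueError("length field does not match packet size")
    payload = packet[2:2 + length]
    (crc,) = struct.unpack(">I", packet[2 + length:])
    if zlib.crc32(payload) != crc:
        raise ValueError("checksum mismatch; request retransmission")
    return payload

# STX (0x02) and ETX (0x03) inside the payload cause no confusion.
message = b"GLU\x02\x03 5.4"
round_trip = decode_packet(encode_packet(message))
```

A receiver that gets a `ValueError` would ask for the packet again, which is the kind of recovery rule a transport definition needs to spell out.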

A significant problem with the Hitachi system design was that the analyzer would read the bar code on a sample tube and submit a test query to the controller only 15 seconds before beginning to process the sample. This was certainly sufficient time for the controller alone to respond, but the controller was expected to relay the query to a lab management computer, which often could not respond in such a short time. Consequently, many samples received default testing even when a different set of tests had been requested. This was such a significant problem that BM's application engineers had developed a stand-alone computer interposed between each instrument and the lab management computer just to store and forward test requests. This problem was partially corrected by the two-phase transaction decoupling but certain situations remained problematic. To remove all problems, I added a new kind of lab communication transaction that I called the “ANY”, in which the controller asks the lab management computer to send any test information that it has. The controller stores this, essentially caching the response to the analyzer's anticipated query. As my entire communication protocol became an industry standard, I saw competitors' advertisements touting support for the ANY transaction.
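The caching role of the ANY transaction can be sketched as follows. The class and method names are hypothetical, and `FakeLIS` merely stands in for a lab management computer; the real message formats are not shown in this document.

```python
class FakeLIS:
    """Hypothetical stand-in for the lab management computer."""
    def pending_requests(self):
        return {"S001": ["GLU", "ALT"]}

class Controller:
    def __init__(self, lis):
        self.lis = lis
        self.cache = {}   # sample ID -> requested tests

    def any_transaction(self):
        # Ask the LIS for whatever test requests it has and store them.
        self.cache.update(self.lis.pending_requests())

    def analyzer_query(self, sample_id, default_tests):
        # Answered from local storage, so the reply no longer depends on
        # how quickly the LIS can respond.
        return self.cache.pop(sample_id, default_tests)

controller = Controller(FakeLIS())
controller.any_transaction()                          # done ahead of time
requested = controller.analyzer_query("S001", ["DEFAULT"])
fallback = controller.analyzer_query("S002", ["DEFAULT"])
```

Only a sample the LIS has never mentioned falls back to default testing.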

I included many unusual but potentially valuable features in the protocol, including:

A control packet to indicate acknowledgement but to ask that transmission be paused to avoid a rejection due to inadequate buffer space.
A multi-sample query, improving communication bandwidth utilization and affording a DBMS the option of performing a parallel query (I was a little ahead of my time on this one, as this capability has only recently become available).
Session-based performance control (what would now be called Quality Of Service control).
Private control extensions within the basic transport framework.

Recognizing that such advanced features could overwhelm LIS programmers, I designed the communication to always start in the simplest configuration and then negotiate advanced features, adding only those supported by the lab management computer.
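The negotiation rule amounts to a set intersection on top of a fixed baseline. A minimal sketch, in which the feature names and the `negotiate` function are hypothetical labels for the features listed above:

```python
BASELINE = frozenset({"basic_transfer"})

def negotiate(controller_features, lis_features):
    """Enable only features both ends support; the baseline is always on."""
    shared = frozenset(controller_features) & frozenset(lis_features)
    return BASELINE | shared

controller_side = {"pause_ack", "multi_sample_query", "session_performance"}
simple_lis = set()                    # supports nothing beyond the baseline
capable_lis = {"pause_ack", "multi_sample_query"}

simple_session = negotiate(controller_side, simple_lis)
capable_session = negotiate(controller_side, capable_lis)
```

An LIS that implements none of the extensions still communicates, and never has to parse a feature it did not advertise.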

USER INTERFACE AND FOREIGN LANGUAGES

At the time that the 747 software was developed, most programmers were producing screen output using procedural code. I developed a screen rendering engine based on a low-level open source windowing (but still character-based) program. Each screen was fully declaratively defined by a deep structure hierarchy. Each artifact's declaration included one or more pointers to functions for processing input, responding to events, etc. We now know this as functional polymorphism.
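A declaratively defined screen of this kind might look like the sketch below. The field names, layout, and handler functions are hypothetical; the point is only that the screen is data, and that each artifact carries references to the functions that process its input.

```python
def accept_sample_id(text):
    return text.strip().upper()

def accept_count(text):
    return int(text)

# Hypothetical screen definition: nested data, with a handler per field.
SAMPLE_SCREEN = {
    "title": "Sample Entry",
    "fields": [
        {"label": "Sample ID", "row": 2, "on_input": accept_sample_id},
        {"label": "Replicates", "row": 3, "on_input": accept_count},
    ],
}

def render(screen):
    """Generic engine: knows how to draw any screen, not this one."""
    lines = [screen["title"]]
    for field in screen["fields"]:
        lines.append(f'{field["row"]:>2}: {field["label"]}')
    return lines

def dispatch(screen, label, text):
    """Route input to whichever handler the field's declaration names."""
    for field in screen["fields"]:
        if field["label"] == label:
            return field["on_input"](text)
    raise KeyError(label)

drawn = render(SAMPLE_SCREEN)
sample_id = dispatch(SAMPLE_SCREEN, "Sample ID", " s001 ")
replicates = dispatch(SAMPLE_SCREEN, "Replicates", "3")
```

Adding a screen means adding a table, while `render` and `dispatch` stay unchanged.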

I had not been told at the start of the 747 project that the program would be expected to support multiple languages. However, the table-driven user interface design made this easy. One of the elements of every on-screen artifact was a display string (null for some things). Without changing any procedural code, the display language could be changed simply by changing the strings in the table. When I presented this idea to BM, they said that even if the programming task were trivial, having to rebuild and validate the program would cost too much. I changed the static strings to dynamically loaded pointers into language-specific libraries. The one program could thus be fully tested and validated independently of its languages. BM representatives later told me that this one feature was saving millions of dollars in program maintenance costs.
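The separation of program and language can be sketched like this. JSON text stands in for the language-specific libraries, and the keys and translations are hypothetical; the validated program refers only to keys, and the strings are loaded at run time.

```python
import json

# Hypothetical language files, each shipped separately from the program.
LANGUAGE_FILES = {
    "en": '{"SAMPLE_ID": "Sample ID", "START": "Start"}',
    "de": '{"SAMPLE_ID": "Proben-ID", "START": "Start"}',
}

# The program itself never contains display text, only keys.
FIELD_KEYS = ["SAMPLE_ID", "START"]

def load_language(code):
    return json.loads(LANGUAGE_FILES[code])

def labels(code):
    strings = load_language(code)
    return [strings[key] for key in FIELD_KEYS]

english = labels("en")
german = labels("de")
```

Adding a language touches only a data file, so the program does not need to be rebuilt or revalidated.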

ERROR REPORTING: BUZZ OFF

Early in the project, I was told that the customers wanted a “buzz off” capability. This was a reference to a simple means of turning the error buzzer off. In previous Hitachi instruments, the buzzer was annoying and non-specific, often screeching incessantly about unimportant problems. The only way to turn it off was through a complex menu path, which had to be traversed again to re-enable the buzzer after correcting the minor problems. BM requested that one key be dedicated to “buzz off/on” for quick access. This apparently simple request was difficult to meet because all of the standard keyboard keys were used in shared ways that precluded dedicating any one of them. Consequently, Hitachi had for years resisted BM's entreaties to fix the problem.

The “buzz off” problem was symptomatic of a larger problem in Hitachi's error reporting scheme. It was rigid and did not adapt to either changing circumstances or user action. Each domain, for example printer or LIS communication, was allotted a dedicated block of space in a rigid error screen even if it had no errors to report. Meanwhile, a domain with a lot of errors to report could be starved for screen space. The problem with the buzzer was that it had only one obnoxious level and was associated with either all types or no types of errors.

I addressed the entire problem of error reporting with a scheme similar to the UI. I defined an error object, whose characteristics included error reporting text, error correction text, severity level, timeliness (i.e. whether the user has already seen this particular error instance), function pointers for unique processing, etc. Unlike Hitachi's old approach, I didn't allow any domain to write directly to the display. Instead, to report an error, a domain would create an error object and pass it to a shared error reporting engine. The domain would have no further involvement with that error beyond an indicator telling it not to report such an error again until the existing report had been processed. The reporting engine was free to apportion the error screen in whatever way showed the user the most valuable information. Artificial intelligence rules defined value primarily based on severity and timeliness. Older and less severe errors showed in the main error window only as space allowed but could always be seen by drilling down into the screen hierarchy. These same characteristics determined what the buzzer would do. I developed a buzzer control engine to play various “tunes” and a developer's aid for programming new tunes. These ranged from the old obnoxious sound to a soft occasional beep. I also defined several on-screen warning indicators that similarly ranged in intrusiveness. Thus, a wide range of attention getters could be associated with each error type independently of all others. The reporting engine reduced the attention level as an error aged, for example once the user had seen it.
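The error object and shared reporting engine can be sketched as below. This is a simplified illustration, not the 747 rule set: the numeric severity scale, the ranking rule, and all names are assumptions, and the real engine used richer rules than a single sort.

```python
from dataclasses import dataclass

@dataclass
class ErrorReport:
    domain: str
    text: str
    severity: int        # assumed scale: higher is more severe
    seen: bool = False   # timeliness: has the user already viewed it?

class ReportingEngine:
    def __init__(self, rows):
        self.rows = rows      # lines available in the main error window
        self.reports = []

    def report(self, err):
        # Domains hand over an object; they never write to the display.
        self.reports.append(err)

    def main_window(self):
        # Unseen errors first, then most severe first; the rest are
        # reachable only by drilling down.
        ranked = sorted(self.reports, key=lambda e: (e.seen, -e.severity))
        return ranked[:self.rows]

    def buzzer_level(self):
        # Only unseen errors drive the buzzer, so it quiets as errors age.
        unseen = (e.severity for e in self.reports if not e.seen)
        return max(unseen, default=0)

engine = ReportingEngine(rows=2)
paper = ErrorReport("printer", "Paper low", severity=1)
link = ErrorReport("LIS", "Link down", severity=3)
reagent = ErrorReport("reagent", "Reagent short", severity=2)
for err in (paper, link, reagent):
    engine.report(err)
```

Screen space goes to whichever errors currently matter most, regardless of which domain raised them, and marking an error as seen lowers both its rank and the buzzer level.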


updated:2016.07.13
