David McCracken
Abbott Instrument Development System (updated 2016.07.13)
The general manager of the hematology business unit succinctly stated his goal as “Figure out how to rationally design and build clinical instruments so that minor changes don't always have unintended side effects.” Marketing had a different goal, the next specific instrument. The project director and I concluded that the two goals were compatible as long as we insisted that all features of the one instrument would be realized as instances of an abstract instrument. Each element would be first designed for the abstraction and then utilized for the specific instrument. This approach would not only satisfy the marketing department but would provide valuable feedback regarding the validity of the abstraction. To reinforce this feedback we decided to create several different instruments, each with unique requirements that would place additional demands on the abstraction.
I proposed the abstract instrument approach for two reasons. One is that a major cause of the problems was that every aspect of each instrument was being derived directly and only from a specific marketing requirement with no attempt made to fit into a consistent framework. Consequently, each new requirement could be realized only by an entirely new design effort, which not only had to produce a means of realization but also to discover, usually by trial-and-error, the side effects. If nothing else, the abstraction would at least define a framework before the design could be overrun by ad hoc single-point solutions. The second reason for the abstraction is that it enables the rational design of reusable sub-system components (hardware and software). Without the abstraction, components are designed for a specific instrument. Inevitably these will not be sufficiently flexible to be used in other instruments and will be inseparable from the original instrument due to inadequate encapsulation. This is why “be all and end all” instruments never succeed as the basis for a product line. Especially with software, but also with hardware, effectively reusable components are designed for a larger range than any one system requires and are fully characterized and tested to boundary conditions before they are designed into any specific system. Auditors routinely criticized Abbott for doing practically no unit testing, but this was the natural outgrowth of not designing independent units in the first place.
While the main goal was to create a system to rationalize development of clinical instruments, it was evident that additional benefits could be achieved at very little cost by taking them into account during planning and by simply advertising their availability. One of these is that a flexible instrument development system could be used for more than building clinical hematology instruments. It could be used to build similar instruments, lab automation facilities, unit test systems, etc. To support these alternate uses, where feasible, general capabilities of the system would not be infused with hematology-specific features.
Although the target instruments are clinical and, therefore, fixed function, the production instruments themselves are used in-house to develop the test methods that will eventually be deployed. To facilitate this essentially analytical use, all of the Abbott instruments use some form of scripting system, enabling the scientists to design and test their own experiments. The scripting systems had bifurcated into a flexible but promiscuous form and a rigidly controlled form. Both forms embedded instrument-specific definitions in the scripting language and in the interpreter, so that even minor hardware changes forced changes to the language itself, the language compiler, and the instrument code, defeating the purpose of scripting.
I designed a new object-oriented scripting system in which every aspect of the target instrument (or other system) in any way controllable from a script is an instance of a generalized class. This is applied without exception not only to physical devices but also to abstractions like instrument states and modes of operation and to behaviors, such as data collection types. As in my BM-Hitachi 747 UI design, this design regularization alone, even if only realized statically, affords significant benefit by keeping a consistent framework. However, just as this approach enables late-binding of the language into the 747, it provides a means to prevent the sort of domain coupling that cripples Abbott’s older scripting systems. Taken together, the definitions of all of these classes comprise a language that describes the physical and behavioral architecture of any target system. This language contains simple declarations, no procedures, and no side effects and yet can completely describe the system.
Native to the script language and its compiler is an understanding of the classes. They contain no target-specific information. However, scripts contain many target-specific references. To compile a script, the compiler first reads an instrument definition file, which defines the entire instrument using the declarative class language. The compiler uses this information to build a database, which it then uses to resolve target-specific references in the script. The compiler generates optimized pseudo-code for the interpreter, which, like the compiler itself, embeds only an understanding of the classes. Consequently, the interpreter program also is not instrument-specific and may be reused in any system.
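A minimal Python sketch may make the resolution step concrete. All names here (`InstrumentDb`, `sampleValve`, the state codes) are invented for illustration; the original compiler was not written in Python and its internal structures are not described in detail above. The idea is only that the compiler builds a database from the declarative instrument definition and uses it to turn target-specific names in a script into concrete codes:

```python
# Illustrative sketch (all names hypothetical): resolving target-specific
# script references against a database built from the instrument
# definition file.

class InstrumentDb:
    """Symbol table built from the declarative instrument definition."""
    def __init__(self):
        self.devices = {}          # name -> {"class": ..., "states": {...}}

    def define(self, name, dev_class, states):
        self.devices[name] = {"class": dev_class, "states": dict(states)}

    def resolve(self, name, state):
        """Map a (device, state) reference in a script to a concrete code."""
        dev = self.devices.get(name)
        if dev is None:
            raise NameError(f"unknown device '{name}'")
        if state not in dev["states"]:
            raise NameError(f"device '{name}' has no state '{state}'")
        return dev["states"][state]

db = InstrumentDb()
# A declaration as it might arrive, already parsed, from the definition file.
db.define("sampleValve", "valve", {"open": 1, "closed": 0})

code = db.resolve("sampleValve", "open")   # -> 1
```

Because the database, not the compiler, carries every device name and state code, the same compiler binary serves any instrument whose definition file it is given.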
The target system may contain multiple CPUs capable of interpreting the script language. These are all described in the instrument definition file. Scripts are targeted to specific CPUs but may access facilities under another’s control. From the instrument definition, the compiler recognizes such remote access and generates the proper pseudo-code to coordinate the two CPUs. Since both speak the same language, the requesting CPU only needs to send the command message to the other. The core interpreter program is small enough to fit into a single-chip microcontroller’s memory. Modules for specialized facilities, such as data collection and motor control, considerably increase the program size but are specifically designed to be included or not by a link option. The script language includes commands to access all potential facilities but this doesn’t increase the program size of interpreters that don’t have some of these. They simply won’t execute the related instructions, but will instead issue a run-time warning. Such a problem is rare in any case because the compiler won’t generate offending code unless the instrument definition file is incorrect.
The script language as seen by its programmers comprises both the fixed language and the names of facilities in the instrument definition. This composite language is not context-free. For example, one interrupter flag may indicate its open state by a logic 1 while for another a logic 0 indicates open. Also, to avoid complicating the language for its users, a relatively few source commands generate a wide variety of optimized command messages. Consequently, the composite language cannot be recognized by an LALR(1) parser even with the disambiguating assistance of precedence and associativity. It would be possible to avoid shift-reduce and reduce-reduce conflicts by forgoing syntax directed translation and using the parser just to reduce sentential forms to an ordered collection for subsequent translation. However, this approach would essentially parse each statement twice, losing much of the benefit of BNF (Backus-Naur Form) modeling of the grammar. To take advantage of modeling and the resultant availability of automatic parser generators like Bison, I avoided ambiguous non-terminals (all resulting from one or more ambiguous terminals) through a partially context-sensitive scanner. The fixed portion of the language contains mostly context-free words, for which I used a scanner generated by Flex from my RE (Regular Expression) model. For a few overloaded semi-fixed words, such as the command and device-specific state “open”, and for all of the instrument facility classes, I punted scanning to a function that determined the class of a word by matching its context against instances of the word in the instrument database. Determining the class at the scanner level avoided jamming the parser PDA (pushdown automaton) into an unrecoverable state on the lookahead.
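The classification step can be sketched as follows. This is a hypothetical miniature, not the actual Flex-based scanner: the token classes, the `classify` helper, and the one-token-of-context rule are invented to show how a scanner callback can disambiguate an overloaded word like “open” by consulting the instrument database before the parser ever sees it:

```python
# Illustrative sketch (names hypothetical): a scanner helper that
# classifies an overloaded word by its context so the parser never
# receives an ambiguous token.

COMMANDS = {"open", "close", "move"}

def classify(word, prev_token, instrument_db):
    """Return a token class for `word` given the previous token."""
    # Immediately after a device name, "open" is that device's state,
    # not the command "open".
    if prev_token in instrument_db and word in instrument_db[prev_token]:
        return "STATE"
    if word in COMMANDS:
        return "COMMAND"
    if word in instrument_db:
        return "DEVICE"
    return "IDENT"

db = {"sampleValve": {"open", "closed"}}
a = classify("open", None, db)            # -> "COMMAND"
b = classify("open", "sampleValve", db)   # -> "STATE"
```

Resolving the ambiguity here, rather than in the grammar, is what keeps the BNF model free of conflicting productions.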
As an unusual and valuable consequence of its context-sensitivity, the compiler can report errors in application-specific detail. Instead of reporting a generic syntax error, it may report, for example, that a particular device doesn’t have a state called “left” but does have “up” and “down”.
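A sketch of such a message, with an invented helper name and invented device names, shows how little machinery this takes once the instrument database is available to the compiler:

```python
# Illustrative sketch (names hypothetical): an application-specific
# error message built from the instrument database.

def state_error(device, state, states):
    known = ", ".join(sorted(states))
    return (f"'{device}' has no state '{state}'; "
            f"its states are: {known}")

msg = state_error("gripper", "left", {"up", "down"})
# -> "'gripper' has no state 'left'; its states are: down, up"
```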
The scripting system supports multiple simultaneous threads. Any script can spawn a new thread simply by invoking another script. The compiler maintains a script database that records the execution unit of every script. If a script spawns a script on another execution unit, the compiler transparently generates the proper code to coordinate this. The parent script does not have to know which unit runs the child, but it can name the unit explicitly for the rare cases of overloaded script names (where multiple units use the same script name, but not necessarily the same script). The parent script can request a private communication channel with the child, allowing the two to coordinate with each other. Again, whether the child is local or remote is transparent to both the parent and the child.
Although it would be possible to use the scripting system without a debugger, many of the intended users have little programming experience and need not only assistance running their scripts but creating and managing them as well. I wrote a Windows program to provide this assistance. It provides a complete IDE (Integrated Development Environment). The source level debugger provides the expected run, stop, breakpoint, and stepping facilities, using information generated by the script compiler. Breakpoints can be set to stop just the script in which they appear, all scripts running on the same unit or all script activity in the entire system. The debugger also provides unique capabilities such as a facility to measure the elapsed time between two or more events.
I needed to display the script source code in the debugger and, rather than using a single-purpose display means, I included a general-purpose editor. In debug mode, the editor marks timepoints and breakpoints and shows where scripts are paused at breakpoints. In edit mode, the editor allows multiple script files to be simultaneously loaded and edited. In a third mode, the editor displays on-line information about the target system. In all three modes, the user can request help on a selected word. For target-specific words, the debugger displays information that the compiler gleans from the instrument definition file. For standard script language words, the debugger spawns MSWord, passing to it via DDE a Word Basic script that causes it to open a document file, such as the script language reference, and go to a point appropriate to the query.
Domain separation and its corollary, late binding, are critical goals well served by my scripting system as regards the language, compiler, and interpreters. However, the instrument also comprises a controller, a separate program that executes on a PC and is written for a standard operating system. Traditionally, this program contains a great deal of embedded instrument-specific knowledge, preventing instrument development from proceeding independently of system-level programmers. I addressed this problem by defining and implementing a generic analyzer control capability and by coordinating the controller with the script compiler so that the controller would also see instrument-specific features as instances of generic classes.
My instrument controller coordinates with the analyzer at run-time to avoid embedding application-specific information in the program. For example, it does not know how many execution units are in the analyzer, what application programs they want downloaded, or what scripts they want. The units themselves log on to the controller and explain what they want, a simple solution that significantly decreases coordination costs. The controller provides generalized data acquisition, supporting simultaneous processes limited only by available memory. Data is stored in FCS standard files with defined extensions for non-list data types. Raw data is shown in scatter plots and histograms defined by the user. Plot parameters, absolute ranges, display sub-ranges, size and arrangement of plot windows, whether to display in log or linear, and many other characteristics are dynamically configurable. Multiple configurations can be saved and recalled via an automatically generated menu.
Every instrument has many application-specific control parameters that the user adjusts as part of routine maintenance. The parameters are presented to the user in a dialog that usually resembles a spreadsheet. Behind each parameter adjustment is a complex sequence of events to effect the immediate change as well as to record the new value. To avoid embedding application-specific information in the controller, the script compiler creates from the instrument definition a database containing all data acquisition and parameter definitions in a format suitable to the controller. This information not only includes basic presentation information, such as parameter names and allowed ranges, but also the actions to take in response to changes. My controller automatically generates dynamic dialogs based on this information. It allows multiple users to save and restore their own unique settings by name. The controller automatically updates the database of user settings as the instrument definition changes, pruning parameters that are removed and assigning default values (included in their definition) to new ones.
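The settings-update rule described above reduces to a small reconciliation step. The sketch below is hypothetical (the parameter names and the dictionary representation are invented), but it captures the stated behavior: keep the user's values for parameters that survive, drop values for parameters removed from the instrument definition, and assign defined defaults to new ones:

```python
# Illustrative sketch (names hypothetical): reconciling saved user
# settings with a changed instrument definition.

def reconcile(user_settings, param_defs):
    """param_defs: name -> {"default": ...}; returns the merged settings."""
    return {name: user_settings.get(name, d["default"])
            for name, d in param_defs.items()}

defs = {"sheathPressure": {"default": 5.0},
        "laserPower":     {"default": 20.0}}     # newly added parameter
saved = {"sheathPressure": 5.4, "oldParam": 1}   # oldParam was removed

merged = reconcile(saved, defs)
# -> {"sheathPressure": 5.4, "laserPower": 20.0}
```

Because the rule is driven entirely by the definitions, no controller code changes when parameters are added or removed.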
All of Abbott’s instruments have separate controller and analyzer units. My analyzer design contains multiple CPUs but only one directly connects to the controller. In most systems the controller and analyzer are located close to each other because the user has work to do at both. However, a number of specific physical configurations are feasible, with the controller ranging from a condensed PC daughter card on the main analyzer board to a stand-alone PC located several feet from the analyzer. In non-instrument applications the controller and whatever it controls might be far apart and communicate through a LAN or even WAN connection. It would be a mistake to coerce all possible configurations to use the same link type, yet some uniformity is required.
As with all other aspects of the development system, I defined a generic communication facility and implemented specific instances of it for the instruments under development. An important aspect of this facility is that the controller application is largely unaware of the physical link. When a controller application program starts up it logs onto a communication server that resides in a DLL. The application tells the server which of the supported links it wants to use and passes configuration parameters, but these are all abstract data types applicable to any link. If the server is already providing analyzer communication to another application using a different link, the existing link will be used instead of the requested one. The application can send messages to the analyzer at any time simply by calling a transmit function in the DLL. It can register threads to receive all analyzer messages or just selected types. The communication server routes analyzer messages, based on their type, to registered threads. A registered receiver gets each message by calling a function in the DLL, which optionally puts the thread to sleep when no message is waiting and wakes it when one arrives.
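The registration-and-routing pattern can be sketched in a few lines. This is not the DLL's actual API (which the text does not spell out); the class and method names are invented, and Python's `queue.Queue` stands in for the real sleep/wake mechanism:

```python
# Illustrative sketch (API hypothetical): receivers register for message
# types; the server routes each incoming message to matching queues; a
# receiver's get() blocks until a message arrives.

import queue

class CommServer:
    def __init__(self):
        self.receivers = []                     # (type_filter, queue) pairs

    def register(self, types=None):
        """Register a receiver; None means all message types."""
        q = queue.Queue()
        self.receivers.append((types, q))
        return q

    def route(self, msg_type, payload):
        for types, q in self.receivers:
            if types is None or msg_type in types:
                q.put((msg_type, payload))

server = CommServer()
data_q = server.register(types={"DATA"})

server.route("STATUS", "ignored")               # not delivered to data_q
server.route("DATA", b"\x01\x02")
msg = data_q.get()                              # would sleep until a message arrives
```

Type-based routing is what lets several applications (a debugger and a controller, say) share one link without seeing each other's traffic.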
Typically, most other applications choose a link type and build the error control and coordination required of any robust system into the application itself. Different applications must duplicate all of this. For example, in Abbott’s older analyzers, the rudimentary script debugger and the instrument controller, which are separate programs, must each contain the entire communication control system. With all of this moved into the DLL, all applications have access to the best communication system that can be devised at very little cost. An added benefit is that the communication server supports multiple simultaneous applications, for example the script debugger and a standard instrument controller, reducing the impetus to put all capabilities into one program.
Although the communication server is an integral part of the driver, it executes at the application level and can interface to application-level links, such as sockets; to hybrid-level links, such as WDM drivers; and to pure system-level device drivers. It operates with both Win9x VxDs and NT (2K/XP) SYS drivers, automatically detecting the OS and configuring itself accordingly. The application-level server establishes a tight bond with the system-level driver, coordinating messages through shared memory. Both input and output messages are passed through circular buffers directly accessible to both the link driver and the application. Input messages are presented to the application with no copying while transmit messages are copied once. To further improve efficiency, applications retrieve input messages without the performance-robbing ring transition of typical device drivers because the server function that gets a message for an application runs in the application data space. This approach has the flexibility of a message passing system but, unlike most such systems, affords efficiency approaching that of actual DMA. Consequently, it can be used in very high data rate systems, for example where a PC daughter card and the main analyzer board literally share memory. No standard Windows device driver could take advantage of this configuration because of the high message passing overhead. At the other extreme, the communication server can just as easily coordinate with a WAN link. Applications don’t change in any way due to the link variations.
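The shared circular buffer is the heart of that efficiency claim. The sketch below is a deliberately simplified stand-in (it ignores wrap-around of a read and all locking, and Python's `memoryview` only approximates the real shared-memory view), but it shows the structural idea: the driver writes at the head, the application reads at the tail, and a read hands back a view of the buffer rather than a copy:

```python
# Illustrative sketch (greatly simplified): a shared ring buffer where
# reads return a zero-copy view instead of copying the message out.

class Ring:
    def __init__(self, size):
        self.buf = bytearray(size)
        self.head = 0                    # driver write index
        self.tail = 0                    # application read index
        self.size = size

    def write(self, data):
        for b in data:                   # the single copy, at the link
            self.buf[self.head] = b
            self.head = (self.head + 1) % self.size

    def read(self, n):
        """Zero-copy view of the next n bytes (no wrap handling here)."""
        view = memoryview(self.buf)[self.tail:self.tail + n]
        self.tail = (self.tail + n) % self.size
        return view

r = Ring(16)
r.write(b"abc")
first = bytes(r.read(3))   # -> b"abc"
```

Because the reading function runs in the application's own address space, there is no ring transition per message, which is the point made above.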
The communication server affords another benefit in that it allows any application to inject messages into the input buffer, nearly duplicating the effect of actual input from the analyzer. This simplifies application program development and enables time-coordinated attacks for boundary condition testing of the communication system and application data processors.
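A toy sketch of the injection idea (the class and field names are invented): synthetic messages enter by the same path real link input takes, so everything downstream of the buffer is exercised identically:

```python
# Illustrative sketch (names hypothetical): injecting a synthetic
# analyzer message into the same input path real messages take.

class InputBuffer:
    def __init__(self):
        self.messages = []

    def push(self, msg, source="link"):
        self.messages.append((source, msg))   # one path for both sources

    def pop(self):
        return self.messages.pop(0)

buf = InputBuffer()
buf.push(b"REAL", source="link")
buf.push(b"FAKE", source="injected")          # test message from an app

first = buf.pop()    # -> ("link", b"REAL")
second = buf.pop()   # -> ("injected", b"FAKE")
```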
See Windows Device Driver
I defined a uniform extensible I/O control system based on “scan chains” of simple shift registers. I was inspired to do this by my experience on the TVRO consumer product, but these days even high-performance computing is going serial. The SPI-like clocked serial protocol and the use of inexpensive RS-422 components limit the data rate to less than 10 Mbps but also make a fairly forgiving environment for connectors and cabling. CPUs simply read or write FPGA registers whose contents are continuously rotated through the scan chains, which can be up to 20 feet long and contain as many as 15 separate I/O boards.
The scan chain is specifically designed to allow I/O units to be inserted and removed anywhere throughout its length without violating electrical requirements. This capability is very handy during development but makes device addresses unstable. The configurability of the script compiler takes care of this. The scripts themselves, accessing the devices only by name, would not be affected. The configuration language supports relative address definition so that adding the definitions of a newly inserted unit automatically corrects the addresses of units already defined.
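The relative-address mechanism can be illustrated with a small sketch. The board names, widths, and the `assign_addresses` helper are invented; the point is only that when each unit's address is defined relative to its predecessor in the chain, inserting a board automatically shifts every unit after it:

```python
# Illustrative sketch (names hypothetical): absolute scan-chain bit
# offsets computed from an ordered list of relative definitions.

def assign_addresses(boards):
    """boards: ordered list of (name, bit_width); returns name -> offset."""
    addr, offset = {}, 0
    for name, width in boards:
        addr[name] = offset
        offset += width
    return addr

chain = [("valves", 16), ("sensors", 8), ("heaters", 8)]
before = assign_addresses(chain)
# -> {"valves": 0, "sensors": 16, "heaters": 24}

# Insert a new board mid-chain; downstream addresses update themselves.
chain.insert(1, ("newIO", 8))
moved = assign_addresses(chain)["sensors"]   # -> 24
```

Since scripts refer to devices only by name, this renumbering never touches script source, only the compiler's database.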
All of Abbott’s instruments use many stepper motors. The older instruments use a stepper control unit in which one CPU controls as many as 12 motors. The CPU calculates and stores step patterns for all motors in dual-port memory. Random logic reads these patterns at a fixed rate to drive the motors. This approach is not only very inexpensive but also affords an opportunity to precisely coordinate simultaneous motor movements. However, the implementation has many problems, among them that only one motor at a time is correctly controlled, that the motor controller unit can do nothing else so that all script motor commands have to be interpreted by a different unit and explicitly passed to the controller, that ramps have to be generated by hand, that the motor controller’s position reference frame forces the script execution unit to embed knowledge of each motor’s range and direction, and that every motor is connected individually to the controller through a wide and electrically noisy cable. Despite these problems, I liked the concept and decided to design an improved version.
I defined my motor controller not as a unique unit but as a native capability of script execution units. The related program code comprises two parts, a quite small command portion that all units get and a large control portion (including a large ISR) that is linked only for units that also have motor control hardware. Any unit that has the control software and hardware can directly control motors. However, they only manipulate motors through the script interpreter, exactly the same as units that don’t directly control motors. The script compiler knows which unit owns a motor and automatically generates the proper code for the messages to be sent by any unit to the motor’s owner.
The script compiler generates a variety of ramps, including linear and piece-wise logarithmic, from parametric statements. When such a statement executes, the ramp, already defined by the compiler, is loaded into the motor controller’s memory. Thus, the run-time cost of ramp execution is nil while the cost of changing a motor’s ramp is negligible. To facilitate ramp development by experiment, ramps are stored in a dedicated heap with an LRU (Least Recently Used) removal scheme providing virtually infinite memory.
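The two halves of that scheme, parametric ramp generation and LRU storage, can be sketched as follows. The ramp formula, the units, and both class and function names are invented for illustration (the actual compiler's ramp tables are not specified above); Python's `OrderedDict` stands in for the dedicated heap:

```python
# Illustrative sketch (details hypothetical): a linear ramp computed
# once from parameters, cached with LRU eviction.

from collections import OrderedDict

def linear_ramp(start_rate, end_rate, steps):
    """Per-step intervals (arbitrary ticks) for a linear speed ramp."""
    span = end_rate - start_rate
    return [round(1e6 / (start_rate + span * i / (steps - 1)))
            for i in range(steps)]

class RampCache:
    def __init__(self, capacity):
        self.capacity, self.cache = capacity, OrderedDict()

    def get(self, key, make):
        if key in self.cache:
            self.cache.move_to_end(key)         # mark most recently used
        else:
            if len(self.cache) >= self.capacity:
                self.cache.popitem(last=False)  # evict least recently used
            self.cache[key] = make()
        return self.cache[key]

cache = RampCache(capacity=2)
ramp = cache.get(("lin", 200, 1000, 50),
                 lambda: linear_ramp(200, 1000, 50))
```

The cache is what makes experimental ramp tweaking cheap: a revisited ramp costs a lookup, and rarely used variants quietly age out.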
Motor control hardware mainly comprises an FPGA, which reads the step patterns (written by the CPU’s motor ISR) from shared memory and transmits them on a modified version of the I/O scan chain. As many as 20 motors can be located anywhere on the one low-noise scan chain. The motor drivers, located in the scan chain units, are controlled by shift registers. I had optimized my design for Abbott’s existing machines, which used many steppers but all low power, adequately served by full step control, and with loose timing requirements. Late in development, marketing decided they wanted a random sample handler, which required very precise coordination of multiple steppers, higher power drivers, and micro-stepping to avoid resonance. Without changing the overall architecture, I added micro-stepping by designing a state machine that responds to a step plus direction pattern as opposed to the two-phase Gray code of the full-stepping interface. I implemented this initially in a GAL with PALASM code and later in a Xilinx PLD with Verilog. For the higher power motors, I designed a new driver circuit. Although not originally intended for precise coordination, the scan chain’s broadside loading automatically synchronized all motors.
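The behavioral difference between the two interfaces can be modeled in software, though the real implementation was PALASM and later Verilog. In this invented sketch, each step pulse moves a position counter up or down according to the direction input, and the microstep phase falls out of the counter:

```python
# Illustrative behavioral model (hypothetical, software stand-in for the
# PLD state machine): a step-plus-direction interface advancing a
# microstep counter.

class StepDir:
    def __init__(self, microsteps_per_step=8):
        self.position = 0
        self.micro = microsteps_per_step

    def edge(self, direction):
        """One step pulse; direction is +1 or -1."""
        self.position += direction

    def phase(self):
        """Microstep phase within the current full step."""
        return self.position % self.micro

m = StepDir()
for _ in range(10):
    m.edge(+1)
m.edge(-1)
# m.position -> 9, m.phase() -> 1
```

The appeal of step-plus-direction is exactly this simplicity: one counter per motor, versus decoding the two-phase Gray sequence of the full-step interface.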
The Load-Unload Cycle Operation and X-Theta Coordinated Ramp Lecture videos demonstrate and explain how X and theta motors are synchronized to simulate Y-axis movement. This is needed because the machine has only X translation and theta rotation but racks can be put onto and taken off the belt only by strict Y-axis movement. Any X movement or rotation at that time causes mechanical collisions. Synchronization is especially tricky in the acceleration phase. Complex mechanical systems like this have many resonant points. If a motor’s acceleration ramp is too slow, resonance can cause step loss. But step loss will also occur if the ramp is too fast. Even in normal circumstances, the acceptable ramp range may be limited. In this situation though, the ramps of both theta and X have the additional requirement that they must combine to produce only Y-axis movement. Further complicating ramp design, because theta is located on the body moved by X, theta movement applies a force against X and X does the same against theta. Consequently, the loads the two are trying to accelerate are not constant but vary in accordance with each other’s acceleration. The ideal ramps cannot be calculated and finding them by experiment could take weeks in other systems. I was able to find the ideal ramps in about two hours by taking advantage of my system’s ability to produce coordinated motor ramps and to vary them even as the machine operates.
Abbott’s motor drivers (3717/3718) had a fairly poor reliability record. I used these same components and found that they were being destroyed by certain dead short arrangements caused by operator error. I corrected this vulnerability by designing an inexpensive high-speed, self-resetting circuit breaker. I describe a particularly unusual aspect of my design in Inverted Regulator Increases Choice and Reduces Complexity published in “EDN” Dec 15, 2009.
Complex multi-domain instruments always present a formidable version control situation. Turning a screw a half turn can transform a great performer into a piece of junk. Adding the ability to redefine instrument hardware even while using the instrument exacerbates the version control problem. Several of the lab technicians, familiar with good procedures, started a notebook for each experimental instrument and dutifully recorded their work. The chaos of the development lab (and resistance from some developers) allowed many changes to go unrecorded. In any case, the paper-based notebook was difficult to search and required tedious repetition of details like date. I replaced the paper-based notebook with a semi-automated version recording mechanism in the instrument control program. The user had to at least take responsibility for invoking this (from a menu) but even if they did nothing else, it would at least record the date, time, and user, as well as any changes that it found in scripts (only those known to be in use). Its automatic script scan was intelligent, reporting whether script changes were functional or only cosmetic. The program encouraged the user to include additional categorical information. The information was written to a file using XML syntax with my own schema.
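A sketch of one such record, using Python's standard `xml.etree.ElementTree`, suggests the shape of the output. The element names and attributes are invented; the original schema is not reproduced above, only that it was the author's own XML schema recording date, time, user, script changes, and optional notes:

```python
# Illustrative sketch (schema hypothetical): one semi-automated
# version-history record written as XML.

import xml.etree.ElementTree as ET
from datetime import datetime

def make_record(user, changed_scripts, notes=""):
    rec = ET.Element("change")
    ET.SubElement(rec, "timestamp").text = datetime.now().isoformat()
    ET.SubElement(rec, "user").text = user
    scripts = ET.SubElement(rec, "scripts")
    for name, kind in changed_scripts:        # kind: functional or cosmetic
        ET.SubElement(scripts, "script", name=name, change=kind)
    ET.SubElement(rec, "notes").text = notes
    return rec

rec = make_record("jdoe", [("dilute.scr", "functional")], "new ratio")
```

Unlike the paper notebook, records in this form are trivially searchable and the repetitive fields cost the user nothing.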
We started the design of the development system with a long series of discussions. I recorded and reported minutes for review and design history. When we began implementation, I simply reported a list of what I did and asked my colleagues to review it. One of them complained that she couldn’t review something from such skimpy documentation. No matter how well my code might be commented, if she doesn’t know where to find it, she can’t review it. This is so obvious, I can’t understand why I needed to have it pointed out to me. I responded by issuing, the day before each weekly review meeting, a report of everything that I had done in the week, with each item broken down into requirement, design, implementation, and testing (from my BM-Hitachi 747 experience I had learned to do unit testing up front). I fully described each category, including rationales and alternatives, so that any phase of every item could easily be reviewed. Some of my reports were quite long, yet my colleagues would come to the review with comments about something they had read deep in the report. I also found that writing the report forced me to review my own work much more thoroughly.
I began concatenating the reports into one very large design history file (example) and soon realized that text search was not an efficient means of accessing so much information. I began using hyperlinking within the document to create topic chains free from the document’s time-based order, creating, in effect, a design history relational database. See Development Models.
It is easy to measure the success of a single product; you either finish and sell it or you fail. The true success of a development system can only be known with time. However, we can compare individual tasks done with the system vs. without it. The first experimental hematology instrument made using the new system collected its first sample in the week that it was put together. This compares to the several months that this task had required in the past. More telling is that in the first three weeks, this instrument collected thousands of samples. Previously, this level of functionality would have taken a year to achieve. The effect of the new system varies widely by task. An extreme example was when the flow engineer had designed a new method and reported that he was ready for software to control it. He was expecting the usual six-month wait for this support. Instead, it was ready in 10 minutes. Further, he could have done this himself although I did it for him.
One of my goals for the development system was that it could serve beyond providing a platform for instruments. This was not an Abbott goal but, interestingly, we had an opportunity to test whether I had achieved this. A manufacturing engineer was developing a subsystem life-cycle test system for an older instrument. The instrument could not possibly support this itself so he was trying to use LabVIEW. This is the ideal application for LabVIEW, yet after three months of struggle he found (as many people have) that LabVIEW affords encouraging immediate but incomplete success followed by a never-ending attempt to get everything right. I gave him some of my obsolete motor controllers, which are full-fledged script execution units with some I/O, and showed him my documentation. With these he built and programmed his test system in one day.