Scientist/programmer dichotomy

Bioinformatics and data analysis are mission-critical activities at both large pharmaceutical and small biotech companies. Access to bioinformatic resources can vary, depending on project priority and financial resources. Even when available, bioinformatic resources can be difficult to take advantage of for many reasons. The impedance mismatch between programmer and scientist vocabulary can result in costly misunderstandings during software design. The lengthy cycle of requirements gathering, document writing, prototype rollout, evaluation, and rewriting adds cost and extends project timelines. The perceived need by programmers to work with the latest and greatest languages and programming technologies can add unnecessary complexity, expense, and time to a project.

Way back in the 1980s there was a competitive market for desktop database systems such as dBASE, FoxPro, Quattro Pro, FileMaker, and my favorite, Superbase. Self-taught programmers using such packages could make short work of many data storage needs in a laboratory environment. Though much maligned by professional programmers for their sometimes sloppy, undocumented spaghetti code, scientists were focused on getting the job done in the fastest, most efficient manner possible. Desktop systems were efficient, incorporating table and form design as well as scripting in a single package. A scientist could focus on mastering the details of his discipline rather than memorizing commands and protocols for different languages and systems.

Microsoft effectively ended innovation in the desktop database market with its Access database monopoly. Though capable, Access is expensive and forces a commitment to a closed, Windows-based system. Although there are numerous open source desktop databases available, few integrate a form designer, report generator, and scripting language. These elements do exist separately, and a close approximation of a semi-integrated system is the Emacs/Elisp/SQLite combination I review in this tutorial. Familiarity with such a system allows the user to perform rapid application development. Efficiency and speed are important, as many programs will have a short half-life in a rapidly changing research and development environment.

Donald Eastlake describes my software development attitude succinctly as he discusses a piece of software called the ITS system:

The ITS system is not the result of a human wave or crash effort. The system has been incrementally developed almost continuously since its inception. It is indeed true that large systems are never “finished”….

In general, the ITS system can be said to have been designer implemented and user designed. The problem of unrealistic software design is greatly diminished when the designer is the implementor. The implementors’ ease in programming and pride in the result is increased when he, in an essential sense, is the designer. Features are less likely to turn out to be of low utility if users are their designers and they are less likely to be difficult to use if their designers are their users.

Donald Eastlake, Hackers, 1972, page 127

User-designed software eliminates the communication barrier between scientist and programmer, and puts control of timelines in the hands of the scientist.
