SGML in Healthcare Information Systems. Liora Alschuler, Robert Dolin, MD, John Spinosa MD PhD. In: Proceedings GCA SGML '97.


Introduction

SGML in Healthcare


Design Principles

Wop Bop aloo Bop, A Wop Bam Boop: Betty Boop Rocks and Rolls with Itty-Bitty DTDs

Mapping HL7 into SGML

Oswestry Pilot

Other Formats for Medical Records

Extending the CEN Report on Data Formats


SGML is Good Medicine

Toward an SGML Architecture for Healthcare

The Hand-off

Modeling

Areas of Application

The Playing Field

Specific Areas


Introduction

There is a growing consensus that healthcare records, including the individual patient record, will be gathered, managed, and distributed electronically, but there is little consensus on how this will be done. One study estimates that fewer than five percent of providers have determined how they will computerize patient records. [1] In this climate, where our second largest industry has yet to establish an informational infrastructure, what does SGML have to offer? What are the prospects of large-scale use of SGML-based technology? How does use of SGML relate to other standards efforts?
This paper examines the place of SGML within healthcare informatics, reports on some recent work demonstrating the application of SGML to healthcare records, and discusses the relationship between SGML-based standards for healthcare and other standards initiatives. It concludes with a brief discussion of one type of SGML architecture and applications envisioned for healthcare.

SGML in Healthcare

SGML reduces a document to a regular expression in a known grammar that can be parsed. As such, it makes possible computer processing of information recorded in all the various forms that a narrative can take. In addition, of course, SGML is an international, ISO standard that is platform, application, and vendor independent.
Initially, our goal was to provide a standard approach to use of SGML for the electronic health record where the irreducible need for narrative had prevented an easy implementation of traditional information management techniques. As a minimum, we thought that SGML would be a good data format to use within the unstructured text segments of a record. We still think this is a good idea, but we are beginning to see the possibility that SGML can be applied across an even wider area of medical informatics and contribute more than the encoding of isolated blocks of text.
SGML and applications such as HyTime, XML, and DSSSL can address the medical record as a linked, distributed hypertext used and reused in multiple contexts. Furthermore, the question has been raised whether or not SGML is an appropriate metalanguage for use across the spectrum of healthcare messaging. Dr. Robert Dolin of Kaiser Permanente has propelled this issue to the forefront of debate and discussion by creating a transformation methodology between arbitrary HL7 messages, the predominant interchange syntax in healthcare, and SGML documents. We include a preliminary report on his work below.
The prospect of widespread use of SGML within healthcare is often referred to as “the document paradigm.” This refers to the modeling of information within documents and the use of documents as the basic units of information processing -- hardly a new thought for traditional publishing, but an approach that takes on new meaning within healthcare information systems.
Using SGML for an electronic health record (EHR) is similar to publishing, but there is no end to the process as long as the patient is alive. A patient’s EHR is a constantly growing collection of small documents. Although an EHR is a dynamic entity, there are times when a “snapshot” of a particular portion is required. Attestation and versioning must relate to this snapshot.
As with a traditional literary form, it is not possible to know today what the reader of tomorrow will find significant within a medical record or a set of medical records. Observations are more constant over time than are conclusions. For better understanding of disease processes, we must be able to leverage these distinctions.
The basic dilemma of healthcare information modeling is that one master model must serve widely disparate functions, software applications, and stakeholders: authoring, storage, management, dissemination, and research. Some of the tools that will be used to analyze and manipulate these electronic medical records do not even exist today.
We believe that SGML is the most practical modeling syntax for medical narrative -- and much that is not narrative -- because it has the flexibility to conform to the language of the individual instance. Modeling to the information, if carried out in the proper manner, is neutral with respect to application, so that SGML-encoded information is easily handed off to other modeling schemes and other technology. Information models created to work within specific application boundaries will, we believe, have a more difficult job when called on in the future to support processing that lies outside the original functional specification.


The HL7 SGML Special Interest Group [2] is a year old, having come together informally in April of 1996 and joined HL7 formally at the end of the year. HL7 is an ANSI standards organization with over 1700 individual members, 450 organizational members including 185 vendors and 171 healthcare providers, and chapters in half a dozen countries outside the US. It is responsible for the HL7 abstract messaging syntax, which is in widespread use between hospital systems across the country. HL7 is one of the organizations working with the federal government on implementation of the electronic record provisions of the Kennedy-Kassebaum Act.
Since formal affiliation, the HL7 SGML SIG has held two meetings in conjunction with HL7. The Tampa meeting included over six hours of open sessions on SGML, in addition to the working sessions of the SIG. The Chicago meeting, held after these Proceedings go to press, will include a two-hour tutorial, as well as the working sessions, and will feature the first presentation of an HL7/SGML mapping.
The SIG has almost reached closure on a set of Design Principles that will guide the creation of the SGML architecture and document types for healthcare.

Design Principles

The Design Principles currently under construction are divided into three sections covering general principles, relationship with other standards, and the role of SGML-encoded information within a healthcare implementation. The last section is meant to emphasize that SGML-encoded data can interoperate with existing technology while broadening the options for reuse and information management. The final point in the working draft of this section states:
The SGML document types and architecture created according to these Design Principles will facilitate (or at least not inhibit) these types of implementation processing:

  • direct collection of information;

  • efficient and precise searching;

  • efficient and flexible retrieval;

  • querying of the data;

  • linkage with and interchange of data among all kinds of healthcare information databases (e.g., patient medical record, lab reports, outcomes databases, drug information, practice guidelines, decision support modules) whether or not these databases use SGML;

  • multiple storage models (with heterogeneous data types);

  • transformation of data structure (e.g., into relational database, object-oriented database, object-relational database structure);

  • portability of data (i.e., the display and distribution of information across platforms and applications and in multiple media);

  • assemblage of data;

  • persistency of data and concepts (i.e., tracking document changes and the reason for the change, a.k.a. version control);

  • collection and addition of metadata expressing context, meaning, and intent;

  • documentation of the reasoning in the medical/diagnostic process.

The Design Principles are not a design strategy. Questions of design strategy, which are the current business of the SIG, are discussed in the Architectural Issues section of this paper.

Wop Bop aloo Bop, A Wop Bam Boop: Betty Boop Rocks and Rolls with Itty-Bitty DTDs

The Tampa meeting of HL7 included a demonstration of how SGML might be applied to patient records. The demo was created with the generous and extensive help of Eliot Kimber, co-editor of the HyTime standard. It has been inserted into an HTML tutorial and will soon be on the HL7/SGML Web site. The demo showed how indirect HyTime linking can be used in off-the-shelf software to create and manage a distributed, write-once-read-many patient record that accumulates over time.

The basic scenario of the demo is that patient Betty Boop [sic] comes in for an examination. The first document created is a physician’s progress note called boopsoap.sgm. (SOAP is the acronym for Subjective, Objective, Assessment, and Plan, the sections that form the basic structure of the progress note.) The Plan calls for a biopsy to be done. The second document is the surgical pathology report of that biopsy. Each document is created independently and stored as documents in a write-once repository. The scenario posits that when the pathology report is completed and filed, the file name of the document is added to a lookup table which correlates notes and reports. In this case, Kimber used location files (boop.loc for Panorama and booploc.sgm for HyBrowse) and a set of entities (boop.ext) for the lookup, but any indirect set of pointers could be used. Use of standard HyTime pointing mechanisms, however, allowed us to show the link created in the lookup table using off-the-shelf software that recognizes HyTime syntax.
The files are not linked until the location file is edited. Once the external lookup table has been created, the link between the SOAP note and the pathology report becomes “live” hypertext. This was demonstrated using both Panorama from SoftQuad and HyBrowse from TechnoTeacher. As Kimber states in his comments, “This intermediate layer of indirection makes it possible to manage the addresses without modifying the documents that make the initial references.”
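The indirection described above can be sketched in a few lines of Python. This is an illustration of the idea only, not the HyTime mechanism or the demo's actual files: the `Document` and `LocationTable` classes, the logical name `biopsy-report`, and the file name `booppath.sgm` are all hypothetical (only `boopsoap.sgm` comes from the demo).

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Document:
    """A write-once document that refers to others by logical name only."""
    file_name: str
    references: tuple[str, ...] = ()


@dataclass
class LocationTable:
    """Hypothetical lookup table: logical reference name -> stored file name."""
    entries: dict[str, str] = field(default_factory=dict)

    def register(self, logical_name: str, file_name: str) -> None:
        # Editing the table is the only step needed to make a link "live".
        self.entries[logical_name] = file_name

    def resolve(self, logical_name: str) -> str | None:
        # An unregistered name is a reference that cannot be followed yet.
        return self.entries.get(logical_name)


def resolve_links(doc: Document, table: LocationTable) -> dict[str, str | None]:
    """Resolve every reference in a document through the lookup table."""
    return {name: table.resolve(name) for name in doc.references}


# The SOAP note is written once and never modified afterwards.
soap_note = Document("boopsoap.sgm", references=("biopsy-report",))
table = LocationTable()

before = resolve_links(soap_note, table)          # report not yet filed
table.register("biopsy-report", "booppath.sgm")   # hypothetical file name
after = resolve_links(soap_note, table)           # link now resolves
```

The note itself is frozen; only the table changes, which is the property Kimber's comment emphasizes.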
To demonstrate the affinity of SGML-encoded data for reuse in new applications, we sent the SOAP document and the pathology report to Tim Bray of Textuality who scripted some simple queries based on terms found in context in the documents, displaying the results in an HTML Web browser.
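A query "based on terms found in context" can be approximated as a search for a term inside a named element. The following Python sketch is not the scripting used for the demonstration; it assumes explicit end tags and no nesting of the queried element, uses a regular expression rather than an SGML parser, and runs against an invented sample note with hypothetical element names.

```python
import re


def find_in_context(sgml_text: str, element: str, term: str) -> list[str]:
    """Return the content of each <element>...</element> that mentions term."""
    element_pattern = re.compile(
        rf"<{re.escape(element)}\b[^>]*>(.*?)</{re.escape(element)}>",
        re.IGNORECASE | re.DOTALL,
    )
    term_pattern = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
    hits = []
    for match in element_pattern.finditer(sgml_text):
        # Collapse line breaks and runs of spaces inside the element content.
        content = " ".join(match.group(1).split())
        if term_pattern.search(content):
            hits.append(content)
    return hits


# Invented sample; the element names and clinical text are illustrative only.
SAMPLE_NOTE = """
<soap>
<subjective>Patient reports a lesion on the left forearm.</subjective>
<objective>Lesion is 4 mm with an irregular border.</objective>
<assessment>Possible nevus; rule out malignancy.</assessment>
<plan>Schedule a biopsy of the lesion.</plan>
</soap>
"""

plans_with_biopsy = find_in_context(SAMPLE_NOTE, "plan", "biopsy")
objective_with_biopsy = find_in_context(SAMPLE_NOTE, "objective", "biopsy")
```

The same term gives different answers depending on the element searched, which is what distinguishes a query in context from a plain full-text search.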
At the same tutorial, Bray introduced the concept of Rock and Roll SGML, or how to do a lot with a little, and Eric Skinner, Product Manager of Omnimark, introduced the concept of microdocuments and how they might apply to a medical context.

Mapping HL7 into SGML

The current draft of HL7 is version 2.3 which is a messaging syntax. A typical scenario for use of HL7 might be as follows: A patient registers for admittance to a hospital. An admitting system (called ADT for admission, discharge, transfer) looks up the patient’s demographics in a central hospital information system, “admits” the patient, and generates a message that the patient was admitted. This message is translated into HL7 by an interface engine. The message is sent to several other systems within the facility such as the pharmacy (drugs can now be requested for this patient), dietary (a menu must be built), lab (so that lab tests can be done), and so on. Typically, each system within the hospital uses its own information system with its own data type and conventions. At the receiving end, the HL7 message is decoded into the internal format of the legacy system.
Dolin has created a semi-automated process for mapping HL7 version 2.3 message specifications into SGML DTDs. This lays the basis for the automated mapping of arbitrary HL7 messages into SGML document instances (DIs) in real time. He will present his preliminary findings to the HL7 meeting in Chicago in April.
HL7 version 2.3 messages are composed of segments which contain fields, attributes, and data types. Data types are defined in text and lookup tables specify field value restrictions. Dolin’s method defines HL7 tables as parameter entities, then builds messages as document types containing SGML elements equivalent to HL7 segments and fields. HL7 fields can be SGML elements or attributes based on a simple set of criteria.
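As a rough sketch of the first step, a lookup table of permitted values can be rendered as an SGML parameter entity and then referenced from an attribute declaration. The Python below is not Dolin's process; the function names, the entity name `PROCID`, the element name `MSH.11`, and the code list are assumptions chosen for illustration.

```python
def table_to_parameter_entity(entity_name: str, codes: list[str]) -> str:
    """Render a table of permitted codes as a parameter entity declaration."""
    choices = " | ".join(codes)
    return f'<!ENTITY % {entity_name} "{choices}">'


def field_to_attlist(element: str, attribute: str, entity_name: str) -> str:
    """Declare an attribute whose allowed values come from the entity."""
    return f"<!ATTLIST {element} {attribute} (%{entity_name};) #REQUIRED>"


# Hypothetical table: a short list of processing codes.
entity_decl = table_to_parameter_entity("PROCID", ["P", "T", "D"])
attlist_decl = field_to_attlist("MSH.11", "value", "PROCID")

dtd_fragment = "\n".join([entity_decl, attlist_decl])
```

Because the value list lives in one entity, a change to the table is made in one place and every declaration that references the entity picks it up.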
Following is an HL7 message segment and its SGML equivalent:

This message header segment (MSH) is an acknowledgment (ACK) with a unique control ID. It is a production level message (not a test) in HL7 version 2.1. The Lab identified as 767543 acknowledges receipt of a message sent by the ADT system. The message is time and date stamped.
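To make the segment-to-element idea concrete, the Python sketch below turns a pipe-delimited MSH segment into SGML-style markup, one element per populated field. It is not the mapping from Dolin's work: the element naming (`MSH.3`, `MSH.4`, and so on, by field position), the sample control ID, and the timestamp are invented, and the sketch neither escapes markup characters in values nor splits components within a field.

```python
def msh_to_markup(segment: str) -> str:
    """Render the populated fields of an MSH segment as SGML-style elements."""
    if len(segment) < 4 or not segment.startswith("MSH"):
        raise ValueError("expected an MSH segment")
    separator = segment[3]  # MSH-1: the field separator itself
    parts = segment.split(separator)
    lines = ["<MSH>"]
    # parts[1] holds the encoding characters (MSH-2). MSH-1 and MSH-2 only
    # describe the delimited syntax, so this sketch drops them and starts
    # at MSH-3.
    for position, value in enumerate(parts[2:], start=3):
        if value:  # empty fields are simply omitted
            lines.append(f"<MSH.{position}>{value}</MSH.{position}>")
    lines.append("</MSH>")
    return "\n".join(lines)


# Invented sample consistent with the description above: the lab (767543)
# sends a production-level ACK to the ADT system.
SAMPLE_MSH = r"MSH|^~\&|LAB|767543|ADT||199703151030||ACK|XX3657|P|2.1"

markup = msh_to_markup(SAMPLE_MSH)
```

Explicit start and end tags replace positional delimiters, so an empty field no longer needs a placeholder and the end of the segment is marked by `</MSH>`.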

Dolin has found that the mapping preserves the HL7 specifications within a modular DTD. The DTD can be extended through automatic lookup of data types in the HL7 tables. His preliminary conclusions indicate that use of SGML may offer the following advantages over HL7 syntax:

  • validation of the message with available off-the-shelf software (HL7 does not specify validation)

  • creation and processing of messages based on current document technology

  • removal of ambiguity found in nested HL7 content models (required fields may have optional and required segments, but HL7 syntax has no method of identifying and verifying this)

  • provision of end-of-message delineator (HL7 has none)

SGML can specify maximum field length although validation must be done by an external application.

Clearly, other HL7/SGML mappings are possible and we are just beginning to explore the implications of this equivalence within the SIG and within HL7.

Oswestry Pilot

The Oswestry Pilot, while described to us as “a small pilot study in a single specialty Orthopedic Hospital” in Oswestry, England, probably represents the largest SGML database of patient records to date. The first official results of this work will be presented in England the week after the Barcelona conference, but we will be joined in Barcelona by a representative of the project who will present preliminary findings and describe the methods and technology applied.
The five-month pilot has compiled thousands of records covering 300 patients using DTDs for physician’s progress notes, pre-operation and post-operation surgical notes, and many other forms of report. The records will be managed and presented in a SoftQuad Explorer database. Dr. Sean Brennan, who oversaw the project at the National Health Service, describes the basic workflow:
“The doctors were required to dictate their note into a dictaphone in a structured manner -- presenting symptom, diagnosis, treatment etc. In order to ensure that this structure was compliant with other work in this area we adopted the clinical headings which are currently being developed by a team in England as a way of providing a common core record which will be easily communicable to other health professionals and health sectors. The dictated notes were then input by their secretaries through the SGML input screens and then tagged and stored as SGML text.” [Brennan]
The study will link these records to existing relational databases used in radiology and pathology and will add multimedia components such as digital X-rays, gait assessment video clips, and histopathology photos to a subset of records. While Brennan adds that this does not represent a policy shift by the National Health Service, NHS sees it as an important assessment of how the document paradigm can be integrated with other technology such as EDI and relational databases.

Other Formats for Medical Records

There are several initiatives within medical informatics that relate to the use of SGML within the electronic health record.

Extending the CEN Report on Data Formats

In 1993, the European Committee for Standardization (CEN) looked at 23 data formats to determine their suitability for exchange of healthcare-related messages. They chose five formats for detailed evaluation but SGML was not among them. The five that were studied and reported on were: ASN.1, ASTM E1238, EDIFACT, EUCLIDES, and ODA.

To evaluate these five, CEN described a scenario where data is exchanged between three parties -- a physician requesting a report, a laboratory, and a report recipient. They created a Domain Information Model and a set of General Message Descriptions (GMDs) based on this scenario. The GMDs are described in a formal object model that reflects the exchange requirements but is independent of the data formats evaluated.
Dolin and colleagues extended the CEN evaluation to SGML in a paper submitted to the American Medical Informatics Association. [Dolin, 1997-1] He created an SGML DTD that reflects the CEN model. This DTD is posted on the HL7 SGML Web site and is excerpted here:

Interpretation (YES | NO) #REQUIRED>


Nationality CDATA #IMPLIED>

CEN specified six axes of evaluation broken down further into 37 criteria against which the encoding formats would be tested. The Dolin paper adds SGML to the table of comparison which includes criteria such as Supported Information Structures (e.g., Optional, Choice, Repetition); Supported Data Types (e.g., ASCII text (ISO 646), Layout Support, Sound, Coded data type); Encoding (e.g., Bit-oriented transfer syntax); Evolution and Backward Compatibility (e.g., Message Version Handling, Adding New Attributes); Conformance (e.g., General syntax conformance); and Support and Availability (e.g., International standard, Used in healthcare, Used in other sectors).
The paper concludes that “SGML compares favorably with the other syntaxes investigated by CEN in their 1993 report.” The criteria not supported by SGML were the bit-oriented transfer syntax, registration mechanism for message types, and non-character data types, although it was noted that use of the NOTATION declaration allows an external application to perform this validation.
In the conclusion of the paper, Dolin also noted that SGML is not object oriented, so that it was not possible to fully automate the translation between an object-oriented model and SGML. This might introduce constraints in the mapping of HL7 version 3 and SGML. This issue deserves further study.


The Good European Health Record (GEHR) is the name of a wide-ranging specification created within the European Health Telematics research program (Advanced Informatics in Medicine).

“The GEHR project consortium involved 21 participating organizations in seven European countries, and included clinicians from different professions and disciplines, computer scientists in commercial and academic institutions, and major multi-national companies.” [GEHR, p.2]

The GEHR project is the result of extensive requirements gathering on EHR recording, portability, communication, exchange, “ethical, medico-legal and security issues,” and educational requirements. The GEHR architecture is expressed formally as an object model and a complementary exchange format. Components of the architecture have been incorporated into the CEN electronic healthcare record architecture.
The initiative is instructive in several ways. The authors place an explicit priority on service to the patient and support for the clinician. GEHR has created and placed in the public domain a set of anatomical drawings to be used as a common reference within diverse healthcare systems, thus extending the domain of the information system to actual content. The comprehensiveness and detail of the GEHR model clearly deserve study by any group modeling information in healthcare. A full study of the suitability of the GEHR architecture for use with SGML would be extremely useful.
Where the GEHR architects accept limitations on their model is in the precise realm where SGML is strong -- the imposition of explicit yet flexible structure between gross generality and granular data elements. The architecture summary states that in areas of practice like psychiatry where expressive narrative is the norm,

“...for the foreseeable future, meaning will probably be restricted to reading of a whole text entry by another clinician, and computability will be very limited. In other specialties a highly structured record is of greater benefit. The context may then be expressed explicitly, allowing grouping and relations to be recorded.” [GEHR, p.3]

The question that must be asked by those interested in the application of SGML and in the modeling process itself is whether the line drawn by GEHR designating where expressiveness is an imperative and where it is dispensable is acceptable for clinicians, patients, and researchers.

SGML is Good Medicine

Toward an SGML Architecture for Healthcare

The figure below shows the basic components of a medical records system. It indicates the area currently addressed by HL7 (exchange between legacy systems) and the area in which SGML-based technology could have application. As we look forward to creation of an SGML architecture for medicine, the major issues can be divided into two areas: the hand-off between SGML and other technology and the information modeling effort itself.

One of the most important considerations, and one that is highly political and is being answered differently in Europe and in the US, is the question of who owns and controls the medical record? When it was filed in a doctor’s cabinet, this issue had no practical consequence. Today, we are determining whether the record will be flung across widely distributed computer systems, or, alternately, concentrated within a massive database, and we are deciding the terms and conditions under which it can be assembled, transferred, updated, and analyzed.
Today, your medical history is a record of a set of transactions within a system that is struggling mightily to find a way to balance costs and benefits. As such, it is grist for the statistical mill that is managing healthcare and it is part of an aggregate of like records. This aggregation is the key to the information infrastructure required to run the healthcare industry.
Your medical record is the richest source for our epidemiological knowledgebase, which can help us perceive the course of disease within our society and the impact of pollutants in our changing environment. Today, it cannot be used. Current epidemiological databases are derived from codes abstracted from the medical record and “reportable conditions” that require separate documentation. Often the codes are ones developed for billing and charge capture and do a poor job of representing the record itself.
Building better databases from computerized patient records can help us determine treatment guidelines, liabilities, and safe exposure levels. If these records are based on narrative, they will be a richer source of information than if they originate as database schema. Regardless of how well conceived a relational or object-oriented schema, no one can foresee the questions and relationships that will take on significance over time. Narrative, which contains the full set of observations, encoded in SGML with a minimal applied structure and context, is the richest source of knowledge over time about the relationship between exposure levels, occupational conditions, and other factors that we cannot predict. It is difficult, if not impossible, to build data templates to capture the same breadth of information.
Our philosophy of medical record ownership should inform these basic questions, which bind social policy to technical solutions:

  • What is the medical record and where does it reside?

  • Is it a series of documents? objects? schemas? links?

  • Is it collected in one place or fetched from distributed repositories?

  • If so, how are the pieces identified, tracked, and assembled?

Finally, the design of a medical SGML architecture must consider that the electronic health record is, itself, a therapeutic modality not separate from the treatment it records and monitors. The course of treatment is determined in part by the information that is present and absent, found and not found, within the patient history. This dynamic interaction between the action taken and the recording of that action will increase as the electronic health record becomes as much a part of healthcare as the physical exam and diagnostic imaging.

This interaction between information and the course of treatment is illustrated by a study of the causes of underimmunization among children. The study found that the most significant reasons that insured children are not vaccinated were a failure of providers to follow guidelines on simultaneous administration and a failure to give immunizations during unscheduled visits. Addressing this second point, the study said that using every clinical encounter as an opportunity to administer immunization is “logistically challenging.”
“Currently, physicians in most settings are not actively made aware of each patient’s immunization status at urgent visits ... Computerized tracking systems are now available at many large HMOs, but we are unaware of any that have active feedback loops to routinely print out immunization status for patients making urgent visits. Although establishing such loops requires integrating large computer systems, our findings suggest that such an approach could be worth the effort.” [Lieu]
It is not enough that the electronic medical record works. It must support the medical, social, and political solution that is best overall for each society.
With these rather weighty thoughts in mind, we examine the issues that confront the architects of an electronic health record from two perspectives: the hand-off between SGML and other technology and the basic model of the record itself.

The Hand-off

SGML and SGML-based tools need not replace existing technology and need not be seen in opposition to other standards initiatives within medical informatics. The mapping of HL7 version 2.3 to an SGML document type is an extension of the modeling work done by the HL7 community into a new syntactic representation. Building equivalencies extends the usefulness of information encoded in HL7 beyond today’s HIS legacy systems.
But HL7 itself, and much of medical informatics outside of HL7, is not resting on version 2.3. Today, three of the primary efforts to build interchangeable models of information within healthcare are using object oriented technology for object management, exchange, and data modeling. The GEHR initiative, introduced above, uses object modeling for the content and management of the EHR; HL7 version 3.0 uses object modeling as an exchange format replacing the syntax of version 2.3; and CORBAmed, a domain technical committee of the Object Management Group (OMG) [Sokolowski, 1996], uses object technology to distribute and insert objects that are themselves of various formats. Sokolowski explains,
“CORBAmed actually provides the interfaces or IDLs (interface definition language) that allows two applications, systems, clients servers, whatever, to exchange information. A snippet of clinical data could be exchanged between the systems using an IDL that specified what type of data (an observation for instance) and in what format (ASCII stream, SGML, or HL7 as examples). It is entirely up to each application to process and handle the data correctly. So the interface engines are part of the application, and CORBA provides the interface.” [Sokolowski, 1997]
Where object technology is used for exchange and management, SGML can be one of many data formats that model the objects defined by HL7 version 3, GEHR, and CORBAmed so that loss-less translation occurs between disparate applications. CORBAmed not only manages different data types, but allows the flexibility of local storage and management in diverse technology from proprietary legacy systems to object/relational databases. A standard SGML data model for medicine will complement these object oriented schemas. Use of SGML opens the document paradigm to database management and extends the flexibility, longevity, and applicability of the object technologies.
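The division of labor Sokolowski describes, where the interface states what kind of data is being exchanged and in what format while each application decides how to process it, can be sketched in plain Python. This is an analogy only: it is not CORBA IDL or any CORBAmed interface, and the `ClinicalSnippet` and `Receiver` names, the format labels, and the handlers are hypothetical.

```python
from __future__ import annotations

import re
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ClinicalSnippet:
    """What crosses the interface: the kind of data, its format, the payload."""
    kind: str          # e.g. "observation"
    data_format: str   # e.g. "ascii" or "sgml"
    payload: str


class Receiver:
    """A receiving application that supplies its own per-format processing."""

    def __init__(self) -> None:
        self._handlers: dict[str, Callable[[str], str]] = {}

    def register(self, data_format: str, handler: Callable[[str], str]) -> None:
        self._handlers[data_format] = handler

    def accept(self, snippet: ClinicalSnippet) -> str:
        handler = self._handlers.get(snippet.data_format)
        if handler is None:
            raise ValueError(f"no handler for format {snippet.data_format!r}")
        return f"{snippet.kind}: {handler(snippet.payload)}"


def strip_tags(payload: str) -> str:
    """Toy SGML handler: drop the tags and keep the character data."""
    return re.sub(r"<[^>]+>", "", payload).strip()


receiver = Receiver()
receiver.register("ascii", str.strip)
receiver.register("sgml", strip_tags)

from_sgml = receiver.accept(
    ClinicalSnippet("observation", "sgml", "<obs>BP 120/80</obs>"))
from_ascii = receiver.accept(
    ClinicalSnippet("observation", "ascii", "  BP 120/80  "))
```

The interface carries the same observation in either format; only the receiver's registered handler differs, which is where SGML can sit as one data format among several.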
At the same time, we would like to extend use of SGML as a distributed hypertext, to explore document and language-based methods of information management. In some areas, this may set up an equivalency between SGML link and storage management and object oriented methods. This will expand the use of both methodologies in much the same way as the equivalency between HL7 2.3 and an SGML DTD allows use of open document technology alongside legacy systems. An equivalency between SGML and object models would allow translation between systems and permit the choice of local management based on local social, political, and technical preferences. A greater use of the document paradigm in the management of medical information may fit the decentralized US system more closely than the purely object model developed in Europe.
Regardless of the management strategy, we believe that the ability to model narrative with structure but without distortion should ensure that SGML comes into play at the level of the individual report or note. In addition to the increased flexibility of the model itself, SGML offers these advantages over standards that define the data to fit a particular type of data management application:

  • Transparency of data for search and query: While there is no inherent search and query language for SGML-encoded data, there is no built-in prejudice that rules out any type of search or query.

  • Longevity: SGML-encoded data does not yellow with age or successive generations of applications.

  • Granularity: The ability to scale the structure to fit an application or context without compromise of exchange.

  • Extensibility: The ability to modify that granularity at a later time without loss of compatibility.

  • Translatability: SGML is a good conversion hub, “you can always get there from here.”

  • Accessibility: The document paradigm and document technology provide a degree of accessibility to all levels of users, including the patient, that may be an advantage in some systems where the need for more sophisticated and specialized applications would be a barrier to entry and participation.

The section below begins to explore how we can realize some of these ambitious goals for SGML in medicine.


Modeling

The goals of SGML in medicine are ambitious, and the potential stakeholders diverse in both needs and outlook. Within the HL7/SGML SIG, much of the initial effort has gone into education of the participants about SGML and discussion of other industry-wide SGML initiatives. The themes that we feel will be central in the creation of the standard are:

  • Preference for a library of small DTDs, interrelated through a formal SGML architecture, over a large monolithic DTD.

  • Overlapping, scaleable domains of information, some of which are specified by the standard and some of which are the province of local or enterprise-wide implementations. The most generic model should apply to the widest domain and as the domain becomes more specific, the model becomes overlaid with more granular information.

  • Small interrelated DTDs and scaleable domains strongly suggest use of architectural forms.

  • Supply context and granularity: DTDs should give context to progress notes, laboratory reports, history and physicals, etc., but have sufficient granularity to allow representation of all coding schemes including, but not limited to, ICD, SNOMED, DRG, CPT.

  • Map HL7 v 2.3 segments into the architecture (and therefore into the DTDs) so that HL7 messages can be generated from an SGML document instance.

  • Map other models of medical information such as the GEHR object model and the GEHR Health Record Item into the HL7/SGML architectural form.

  • Extend the document paradigm through a common architecture: An individual’s healthcare record can be seen as the links that bind together and aggregate the separate document instances. A healthcare or hospital system could manage the patient record by managing the links, while responsibility for a particular document instance remains the concern of the individual’s care providers.
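
The idea of mapping HL7 segments into the architecture, so that a message can be generated from a document instance, can be pictured with a minimal sketch in Python. It is an illustration under stated assumptions, not the SIG's design: the element names, the `hl7` attribute (standing in for an architectural-form attribute), and the choice of field positions are all hypothetical, and the instance is written as well-formed markup so that the standard-library XML parser can stand in for an SGML parser. Only bare pipe-delimited segments are produced; message headers, escaping, and repetition are left out.

```python
import xml.etree.ElementTree as ET

# Hypothetical document instance. The "hl7" attribute names the segment and
# field position (for example "PID.5") that an element's content maps to.
# Element names, the attribute name, and the positions are assumptions
# made for illustration only.
DOC = """
<labreport>
  <patient>
    <id hl7="PID.3">123456</id>
    <name hl7="PID.5">Doe^Jane</name>
  </patient>
  <result>
    <test hl7="OBX.3">Glucose</test>
    <value hl7="OBX.5">95</value>
    <units hl7="OBX.6">mg/dL</units>
  </result>
</labreport>
"""


def to_hl7_segments(doc_text):
    """Collect mapped element content into pipe-delimited segment strings."""
    root = ET.fromstring(doc_text)
    segments = {}  # segment name -> {field number: value}, in document order
    for elem in root.iter():
        target = elem.get("hl7")
        if target is None:
            continue  # element carries local detail with no HL7 mapping
        seg, field = target.split(".")
        segments.setdefault(seg, {})[int(field)] = (elem.text or "").strip()

    lines = []
    for seg, fields in segments.items():
        last = max(fields)
        # Unmapped positions become empty fields so later fields keep their place.
        values = [fields.get(i, "") for i in range(1, last + 1)]
        lines.append("|".join([seg] + values))
    return lines


if __name__ == "__main__":
    for line in to_hl7_segments(DOC):
        print(line)
```

The point of the sketch is that the mapping lives in the markup rather than in the conversion program: a local DTD can add elements that carry no mapping attribute, and the generator simply ignores them.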

Areas of Application

This is an excellent time to enter the market for the electronic health record and there are several areas where SGML-based technology can gain a significant market within healthcare. We offer some thoughts on the general climate, then describe some specific opportunities.

The Playing Field

One view of healthcare informatics today is that of increasing consolidation at the hands of a small number of extremely large players. Fortune magazine speculates that only four or five players will remain at the end of this consolidation. Those who currently have a large stake in the business have a hold on customers that is familiar to us in SGML: They own the legacy data. It is common in hospital legacy systems to offer no access to records other than individually printed reports.
SGML-based systems not only promise to end this kind of asset blackmail, they can lower the barriers to entry by allowing small players to participate in the market. Currently, integration of the electronic health record is being attempted by companies that are taking control of the whole record (e.g., billing, pharmacy, and everything else) because that is the only way they can provide a seamless integration. The result is a lock-out of competing technology and a reduction in the number of players able to participate.
SGML-based systems and SGML-encoded information could lower the barrier to participation and bring back the best-of-breed option to healthcare information purchases.

Specific Areas

We see the following areas of application for SGML-based technology:
1. Flexible methods of gathering clinical data -- It is relatively easy to demonstrate the utility of the EHR within a clinical care setting -- there are many beneficiaries. Very quickly, however, one is presented with the quandary of how to get the information into the system within a culture that is time sensitive to the extreme. Our sense is that successful implementations will show a wide variety and flexibility of input mechanisms from voice recognition to gesture-based data entry to specialized editing tools.
2. The application of document management technology to the clinical setting -- Much of what transpires in a clinical care setting can be viewed as classical workflow and document management, yet the existing hospital information systems have not taken advantage of this technology. We believe that the introduction of the document paradigm as a standard way of viewing patient records and clinical reports can present significant new opportunities in this market for document management and workflow technology.
3. Translation and conversion -- The demonstration of equivalency between the current messaging standard (HL7 v2.3) and SGML invites competition in this arena from those with SGML conversion technology and expertise. Specifically, we believe that it will prove worthwhile to look at the use of SGML conversion technology in place of the current HL7 interface engines. The pull toward SGML as the conversion hub, rather than HL7, will become stronger as document technologies come into use for collection and dissemination of information within medicine.
4. Composition and publication -- One area where current HIS technology has much to gain from integration of SGML-based document technology is flexibility in presentation of information. A typical reporting system today gives few options in application of styles and output media. The independent application of styles and the reuse of documents in different media will be a strong pull toward document technology.
There is, of course, much crossover between these areas of application. For example, SGML makes possible a channel between proprietary charting and records management software and an enterprise-wide data warehouse. Kaiser Permanente Southern California is planning to use an SGML output from their patient record system to populate their relational data warehouse. The parsed and validated SGML files will be accessible within the warehouse for queries, outcomes studies, and other research.
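
A channel of this kind, from marked-up output to a relational warehouse, might look like the following minimal Python sketch. It does not describe the Kaiser Permanente system: the element and attribute names, the table layout, and the sample records are hypothetical, the diagnosis codes are illustrative only, and well-formed markup with the standard-library XML parser and an in-memory SQLite database stand in for an SGML parser and a production warehouse.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical output from a patient record system, assumed to be already
# parsed and validated. All names and values are invented for illustration.
RECORDS = """
<records>
  <encounter patient="P001" date="1997-03-14">
    <diagnosis code="250.00" scheme="ICD">Diabetes mellitus</diagnosis>
  </encounter>
  <encounter patient="P002" date="1997-03-15">
    <diagnosis code="401.9" scheme="ICD">Hypertension</diagnosis>
    <diagnosis code="250.00" scheme="ICD">Diabetes mellitus</diagnosis>
  </encounter>
</records>
"""


def load_warehouse(doc_text, conn):
    """Flatten each coded diagnosis into one relational row; return row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS diagnosis "
        "(patient TEXT, date TEXT, scheme TEXT, code TEXT, label TEXT)"
    )
    root = ET.fromstring(doc_text)
    rows = [
        (
            enc.get("patient"),
            enc.get("date"),
            dx.get("scheme"),
            dx.get("code"),
            (dx.text or "").strip(),
        )
        for enc in root.iter("encounter")
        for dx in enc.iter("diagnosis")
    ]
    conn.executemany("INSERT INTO diagnosis VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


def patients_with_code(conn, code):
    """Example research query: which patients carry a given diagnosis code?"""
    cur = conn.execute(
        "SELECT DISTINCT patient FROM diagnosis WHERE code = ? ORDER BY patient",
        (code,),
    )
    return [row[0] for row in cur]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    load_warehouse(RECORDS, conn)
    print(patients_with_code(conn, "250.00"))
```

Because the coding scheme travels with each diagnosis as markup, the loader needs no knowledge of the charting software that produced the document.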
Other SGML-based implementations are being discussed across healthcare. We will update and extend this report at the Barcelona conference.



While responsibility for the ideas expressed in this paper rests with the authors alone, we developed them through vigorous and stimulating exchanges held within the SGML HL7 SIG and outside of it with Steve Newcomb, Eliot Kimber, Tim Bray, and with SIG members Tom Lincoln, Dan Essin, John Mattison, and Rachael Sokolowski. Many thanks to Bethany Schroeder and Randy Marbach, who read a draft.

[1] A report by Anderson and Bunschoten states that “the three most common reasons given for not implementing electronic records are cost, named by 61%; no credible systems currently available, 34%; and security and confidentiality concerns, 27%.” The report estimates that “...fewer than 5% of these groups have made meaningful progress toward implementing computer-based records.”
[2] HL7 stands for Health Level 7, so named after the seventh level of the OSI model of interoperability. See the HL7 Web site at
Alschuler L, Lincoln TL, Spinosa J. Medicine for SGML. In: Proceedings GCA SGML'96, 1996:181-9.
Anderson, Howard J. and Bruce Bunschoten, “A Progress Report,” Health Data Management, Faulkner & Gray, 1996.
Brennan: Dr. Sean Brennan, NHS, private correspondence.
Dolin 1997-1: Robert H. Dolin, MD; Liora Alschuler; Tim Bray; John E. Mattison, MD, “SGML as a Message Interchange Format in Healthcare”, AMIA submission.
European Committee for Standardization / Technical Committee 251 - Medical Informatics. Investigation of Syntaxes for Existing Interchange Formats to be used in Healthcare. CEN/TC251. January 1993. (
GEHR Dipak Kalra, David Lloyd, “The Good European Health Record--Architecture Summary” in press. See
HL7 SGML Special Interest Group Web site:
Lieu 1996, Tracy A. Lieu, Steven B. Black, Michael E. Sorel, Paula Ray, Henry R. Shinefield, “Would better adherence to guidelines improve childhood immunization?”, Pediatrics, 12/1/96, p. 1062.

Lincoln and Essin 1994: Lincoln, T.L., Essin, D.J., “The Introduction of a New Document Processing Paradigm Into Health Care Computing - A CAIT White Paper,” distributed over the net in 1994. Available at

Sokolowski, 1996, Rachael Sokolowski, “Open Voice-Enabled, Structured Medical Information,” Towards an Electronic Patient Record, pre-publication copy. See also
Sokolowski, 1997, private correspondence
