Health Level 7 (HL7) with Perl
Written by Nikos Vaggalis   
Monday, 25 July 2016

In this article we take a deep look into HL7, the de facto standard in the health sector for exchanging clinical and patient information across heterogeneous systems, with the aid of Perl and the Net::HL7 CPAN module.

HealthIT is a rapidly evolving sector that sees a massive amount of data accumulating every day. The pressing need for storing, retrieving and manipulating that data led to a computerization race, which itself gave rise to a highly disparate and non-standardized HealthIT landscape with no common ground for communication.

But as times changed and requirements became more complex, it soon became evident that a common language for the exchange of information was necessary; a language that would enable this exchange not only among a healthcare institution's internal IT systems, for example between labs and administration, but also between distinct health institutions, even across cultural and national barriers. (Consider epSOS as an example of a common infrastructure that facilitates the services of e-dispensation and patient summary on a pan-European level.)

Therefore, healthcare data standards emerged that, in relation to Electronic Health Records, bundled the best industry practices whereby clinical and patient information could be shared and exchanged across heterogeneous systems in a standardized way. Specifically, they addressed the fact that, for the massive amount of data accumulating in electronic format, there was no satisfactory or standardized way of organizing, representing, and encoding it so that it could be handled and understood by the recipient systems. This had stark repercussions on resource spending, decision making, common reporting and analytics, but most of all it affected patient safety and the quality of the service offered.

Once available, these standards would encode clinical data using a common terminology and ease exchange between systems. Some of them, taken from those listed in "What Are the Different Standards in Healthcare?", are:

  • DICOM - Digital Imaging and Communications in Medicine
    Provides for handling, storing, printing, and transmitting information in medical imaging.

  • SNOMED CT - Systematized Nomenclature of Medicine Clinical Terms
    Developed mainly to encode the clinical data in a patient record. It is licensed by the National Library of Medicine (NLM) for country-wide use in the U.S. It is also used by many other countries and is managed by the International Health Terminology Standards Development Organisation (IHTSDO).

  • LOINC - Logical Observation Identifiers Names and Codes
    A standard developed and maintained by the Regenstrief Institute. It was originally developed to encode lab observations but has since expanded to also represent clinical observations. It is an open source standard that is widely considered to be the definitive lab standard.

  • CCR - Continuity of Care Record
    Responds to the need to organize and make transportable a set of basic information about a patient’s health care that is accessible to clinicians and patients.

  • HL7 CDA (part of HL7 version 3.0) - Clinical Document Architecture
    Provides an exchange model (XML-based) for clinical documents (such as discharge summaries and progress notes); formerly known as the Patient Record Architecture (PRA).

  • HL7 v2.x
    The primary and most widespread data interchange standard for clinical messaging, adopted by 95% of all US Health Institutions, in effect since 2003.



The Middleware

Software usually works with these standards through middleware that takes care of the transformation, manipulation, and exchange of messages. Transformation covers mapping data from incoming messages to variables, executing custom scripts when a message is received, constructing HL7 messages from a data source, and running XSL transformations on incoming HL7 v3 or XML-encoded messages. Exchange can take place over TCP/MLLP, database, file, FTP/SFTP, HTTP, SMTP, or SOAP over HTTP.
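Of the transports mentioned, MLLP (Minimal Lower Layer Protocol) is the easiest to picture: the HL7 message is preceded by a start-block byte (0x0B) and followed by an end-block byte (0x1C) and a carriage return (0x0D). The following is only a minimal sketch in plain Perl; mllp_wrap and mllp_unwrap are hypothetical helper names, not part of any library, and the message is a shortened, cleaned-up header for illustration.

```perl
use strict;
use warnings;

# MLLP framing bytes: <VT> message <FS><CR>
my $START_BLOCK = "\x0B";
my $END_BLOCK   = "\x1C";
my $CR          = "\x0D";

# Hypothetical helper: wrap an HL7 message in an MLLP frame.
sub mllp_wrap {
    my ($message) = @_;
    return $START_BLOCK . $message . $END_BLOCK . $CR;
}

# Hypothetical helper: strip the MLLP frame, or return undef if malformed.
sub mllp_unwrap {
    my ($frame) = @_;
    return $frame =~ /\A\x0B(.*)\x1C\x0D\z/s ? $1 : undef;
}

# Shortened header segment, used here purely as a payload.
my $hl7   = "MSH|^~\\&|||||20160526110214||ADT^A01^ADT_A01|id201|P|2.6\r";
my $frame = mllp_wrap($hl7);
print "round trip ok\n" if mllp_unwrap($frame) eq $hl7;
```

In a real exchange the framed string would be written to a TCP socket; that part is left out to keep the sketch self-contained.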

The two most popular and widely used middleware packages are:

  • Mirth Connect, an open source package with commercial support that acts as an HL7 interface gateway. With a rich Java client interface and a channel creation wizard that associates applications with the Mirth engine components, it "makes it easy to transform non-standard data into standard formats, and build and monitor multiple interfaces while you efficiently integrate and move data across your systems, locations, and community".
  • Chameleon/Iguana, "an easy-to-use, proven and scalable messaging toolkit that enables vendors and developers to add HL7 messaging capabilities to their healthcare applications. The HL7 Messaging Toolkit is easily customized to support any type of HL7 message, regardless of version or format". It enables applications written in Java to tell Chameleon to generate HL7 or X12 based on a set of Java objects. Conversely, an application can instruct Chameleon to take an incoming X12 or HL7 message and convert it into a set of Java data objects. This approach abstracts away the details of HL7 and X12 so that applications can focus solely on business logic.


Both come at a cost and are mostly suited to large, integrated healthcare IT solutions. What we are interested in is approaching the matter programmatically and experiencing the construction and delivery of HL7 v2.6 messages through code.

A popular library with which you can implement those messages is HAPI, an open-source fully featured Java application programming interface (API) with an object-oriented HL7 2.x parser, that developers can use to add HL7 capabilities to their applications.

There's also a Python port, HL7apy, and a .NET port, nHapi, while Perl's counterpart is the Net::HL7 CPAN module, part of the Perl HL7 toolkit.

The Message

First of all, let's take a look at the HL7 v2.6 message we want to create, together with its Encoding Rules:

MSH|^~\&|||||20160526110214||ADT^A01^ADT_A01|id201|P|2.6||||||||||DD015|
PID|||100660325^^^NationalPN&2.16.840.1.113883.19.3&ISO^0~80253^^^^1||GREENING^WAYNE^^^^^L||19610130|M|||||||||||303603715||||LONDON|

According to its specs:

Message formats prescribed in the HL7 Version 2.6 encoding rules consist of data fields that are of variable length and are separated by a field separator character. Rules describe how the various data types are encoded within a field and when an individual field may be repeated. The data fields are combined into logical groupings called segments.

Segments are separated by segment separator characters. Each segment begins with a three-character literal value that identifies it within a message. Segments may be defined as required or optional and may be permitted to repeat. Individual data fields are found in the message by their position within their associated segments.
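The positional layout described in these rules can be illustrated without any HL7 library. The sketch below is plain Perl and makes a few assumptions: the sample message is used with its layout whitespace removed, segments are terminated by carriage returns, and parse_segments is a hypothetical helper rather than part of Net::HL7.

```perl
use strict;
use warnings;

# Sample message from above, with layout whitespace removed (an assumption).
my $message = join "\r",
    'MSH|^~\&|||||20160526110214||ADT^A01^ADT_A01|id201|P|2.6||||||||||DD015|',
    'PID|||100660325^^^NationalPN&2.16.840.1.113883.19.3&ISO^0~80253^^^^1||GREENING^WAYNE^^^^^L||19610130|M|||||||||||303603715||||LONDON|';

# Hypothetical helper: split a message into segments, and each
# segment into an array of fields addressed by position.
sub parse_segments {
    my ($msg) = @_;
    my $sep = substr $msg, 3, 1;    # MSH-1: the field separator itself
    my @segments;
    for my $line (grep { length } split /\r/, $msg) {
        my @fields = split /\Q$sep\E/, $line, -1;
        # MSH-1 is the separator, so re-insert it to keep
        # $fields[N] aligned with field N in every segment.
        splice @fields, 1, 0, $sep if $fields[0] eq 'MSH';
        push @segments, \@fields;
    }
    return @segments;
}

my ($msh, $pid) = parse_segments($message);
print "Message type:  $msh->[9]\n";    # MSH-9
print "Patient name:  $pid->[5]\n";    # PID-5
print "Date of birth: $pid->[7]\n";    # PID-7
```

Index 0 of each array holds the three-character segment identifier, so every later index matches the field number used in the HL7 documentation.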

As such, the basic building blocks of an HL7 message are its 'segments', which group related data by meaning. Our message consists of the following segments:

MSH or Message Header
Defines the intent, source, destination, and some specifics of the syntax of a message.

PID or Patient Identification
The PID segment is used by all applications as the primary means of communicating patient identification information. This segment contains permanent patient identifying and demographic information that, for the most part, is not likely to change frequently.

PV1 or Patient Visit
The PV1 segment is used by Registration/Patient Administration applications to communicate information about the patient visit.

DG1 or Diagnosis
The DG1 segment contains patient diagnosis information of various types, for example, admitting, primary, etc. The DG1 segment is used to send multiple diagnoses.
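Inside a segment, a single field can carry further structure. The encoding characters declared in MSH-2 (^~\& in our message) are, in order, the component separator, the repetition separator, the escape character and the subcomponent separator. As a rough sketch in plain Perl, with a hypothetical split_field helper, this is how the PID-3 identifier list from the sample message breaks down:

```perl
use strict;
use warnings;

# Encoding characters as declared in MSH-2 of the sample message.
my ($component_sep, $repetition_sep, $escape_char, $subcomponent_sep)
    = split //, '^~\&';

# Hypothetical helper: field -> repetitions -> components -> subcomponents.
sub split_field {
    my ($field) = @_;
    return map {
        [ map { [ split /\Q$subcomponent_sep\E/, $_, -1 ] }
              split /\Q$component_sep\E/, $_, -1 ]
    } split /\Q$repetition_sep\E/, $field, -1;
}

# PID-3 from the sample message (layout whitespace removed).
my $pid3 = '100660325^^^NationalPN&2.16.840.1.113883.19.3&ISO^0~80253^^^^1';

for my $rep (split_field($pid3)) {
    my $id        = $rep->[0][0];               # first component: the identifier
    my $authority = join ' / ', @{ $rep->[3] }; # fourth component: assigning authority
    print "id=$id authority=[$authority]\n";
}
```

The sketch ignores escape sequences, so a field that contains an escaped separator would need extra handling.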

Which segments a message includes or omits is determined by its message type and the associated trigger event, that is, the real-world event that prompted the creation of a record and the exchange of information about it. In this case, ADT^A01^ADT_A01, the trigger event is the admission of a patient. Its message is made up of the following segments:

[Image: ADT A01 (Admit/Visit Notification) segment table, HL7 v2.6]

Table 1: ADT^A01^ADT_A01 (Patient Admission)
Caristix HL7 Definition   

A patient transfer, ADT^A02^ADT_A02, has the following segments:

[Image: ADT A02 (Transfer a Patient) segment table, HL7 v2.6]

Table 2: ADT^A02^ADT_A02 (Patient Transfer)
Caristix HL7 Definition  
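The message type field (MSH-9) can itself be taken apart into a message code, a trigger event and a message structure. A minimal sketch in plain Perl follows; parse_message_type is a hypothetical helper, and the lookup table is an assumption that covers only the two events discussed here.

```perl
use strict;
use warnings;

# Hypothetical lookup covering only the two events discussed in this article.
my %trigger_event = (
    A01 => 'Patient Admission',
    A02 => 'Patient Transfer',
);

# Hypothetical helper: split MSH-9 into its three components.
sub parse_message_type {
    my ($msh9) = @_;
    my ($code, $event, $structure) = split /\^/, $msh9;
    return { code => $code, event => $event, structure => $structure };
}

for my $msh9 ('ADT^A01^ADT_A01', 'ADT^A02^ADT_A02') {
    my $type  = parse_message_type($msh9);
    my $label = $trigger_event{ $type->{event} } // 'unknown event';
    print "$type->{code} $type->{event} ($type->{structure}): $label\n";
}
```

A receiving application could use the trigger event in this way to decide which segments to expect before reading the rest of the message.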

Last Updated ( Wednesday, 27 July 2016 )