Monday, August 20, 2018

HAPI FHIR Example Server Using Docker on AWS


Confession: I am not an abstract thinker. In order to learn something new, I have to have it in my hands. So when I was asked to evaluate and understand the FHIR standard, my first course of action was to lift the hood on one of the more mature reference FHIR servers available, the HAPI FHIR JPA Server.
You must realize that following this tutorial will not get you any closer to understanding the FHIR standard … but it WILL give you your own environment where you can inspect every aspect of FHIR, from issuing REST API calls to examining the data model HAPI chose to use for storage. This setup gives me total transparency from input to process to output, all with technology I am comfortable with. Your goals may be different; I just hope the guide provides value, whatever they may be.
By default, the HAPI JPA example server starts up seamlessly using an embedded Apache Derby database and Jetty server. I’m more comfortable with Tomcat and PostgreSQL, and from the documentation, HAPI seemed … well … happy to let me switch.

While it was a pleasantly straightforward exercise to set this server up the way I wanted it, there were a few gotchas. This is the guide I wish I had found when I googled “HAPI FHIR JPA server Tomcat PostgreSQL”. 🙂

I am a big Docker and Docker Compose fan (and with good reason). So I chose to create a docker-compose file to start a Tomcat server and a PostgreSQL database, and to deploy the HAPI FHIR JPA Server example application to the application server.

The last bit (which became oh-so-simple thanks to my upfront efforts with Docker) is launching my server on an AWS EC2 instance. This is normally just a nice environment to work in, but for the HAPI FHIR JPA Server to be of any use to me, it was also a necessity. You see, the sample data that I would have liked to load from the HL7 website does not load properly into the HAPI server for DSTU3. Fortunately the folks at Crucible have some synthetic patient records they will load for you; all you need is a public server URL (hence the AWS move).

Read on below for the step-by-step details. You can download the Docker artifacts from this article. And all you FHIR gurus out there,  please feel free to let me know where I could have made my life easier!

Create Dockerfiles for Tomcat and PostgreSQL.

Lukas Pradel has already done a pretty fantastic job of writing a nice tutorial for dockerizing a Tomcat & PostgreSQL setup.

There is very little we need to change in the Dockerfiles from that posting.  One important change will be to reference the HAPI FHIR JPA Server example .war. We haven't built this yet, but we will soon. I also changed the username, password and database name to match what I wanted for my application. You will see in the next steps where you will tell the HAPI application what these values are, so keep them handy. 

The Application Server (Tomcat) Dockerfile

FROM tomcat:8-jre8
MAINTAINER gmoran
RUN echo "export JAVA_OPTS=\"-Dapp.env=staging\"" > /usr/local/tomcat/bin/setenv.sh

COPY ./hapi-fhir-jpaserver-example.war /usr/local/tomcat/webapps/fhir.war
CMD ["catalina.sh", "run"]

The Database (PostgreSQL) Dockerfile

FROM postgres:9.4  
MAINTAINER  gmoran

ENV POSTGRES_USER gmoran  
ENV POSTGRES_PASSWORD XXXXXXXXX  
ENV POSTGRES_DB fhirdata  

Create a Docker-Compose file to orchestrate and launch the system.

The docker-compose.yml file provided in the post above also works quite well for this application. I chose to stick with port 8080 for simplicity. I also map port 5432:5432 so that I can use the psql PostgreSQL utility from any machine to interrogate the database tables.

The docker-compose.yml File

app-web:
  build: ./web
  ports:
    - "8080:8080"
  links:
    - app-db

app-db:
  build: ./db
  expose:
    - "5432"
  ports:
    - "5432:5432"
  volumes_from:
    - app-db-data

app-db-data:
  image: cogniteev/echo
  command: echo 'Data Container for PostgreSQL'
  volumes:
    - /var/lib/postgresql/data
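
With that 5432 mapping in place, you should be able to reach the database with psql from your own machine once the stack is up. A minimal sketch, assuming the hostname below is replaced with your server's address and the credentials match the PostgreSQL Dockerfile:

$ psql -h my_aws_ec2.compute-1.amazonaws.com -p 5432 -U gmoran -d fhirdata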

Our next chore is to build the HAPI FHIR JPA example server. If you are familiar with Git and Maven, it is easy.

Download the HAPI source from Git.

You can download the HAPI source following these instructions, or use the git clone command-line as follows:

$ git clone https://github.com/jamesagnew/hapi-fhir.git

Modify the FhirServerConfig.java to wire the database configuration. 

Since we want to use PostgreSQL, we will need to add the PostgreSQL JDBC driver jar to our Maven pom.xml file. The attribute values for the 9.4 version of the jar are as follows: 
<dependency>
  <groupId>org.postgresql</groupId>
  <artifactId>postgresql</artifactId>
  <version>9.4-1201-SNAPSHOT</version>
</dependency>

In the source code, navigate first to the hapi-fhir-jpaserver-example folder; the FhirServerConfig.java class is nested down here:

src/main/java/ca/uhn/fhir/jpa/demo/FhirServerConfig.java

Using your favorite code editor, make the following changes to the data source, entity manager factory, and JPA properties:

public DataSource dataSource() {
  BasicDataSource retVal = new BasicDataSource();
  retVal.setDriver(new org.postgresql.Driver());
  retVal.setUrl("jdbc:postgresql://app-db:5432/fhirdata");
  retVal.setUsername("gmoran");
  retVal.setPassword("XXXXXXXX");
  return retVal;
}

@Override
@Bean()
public LocalContainerEntityManagerFactoryBean entityManagerFactory() {
  LocalContainerEntityManagerFactoryBean retVal = super.entityManagerFactory();
  retVal.setPersistenceUnitName("HAPI_PU");
  retVal.setDataSource(dataSource());
  retVal.setJpaProperties(jpaProperties());
  return retVal;
}

private Properties jpaProperties() {
  Properties extraProperties = new Properties();
  extraProperties.put("hibernate.dialect", org.hibernate.dialect.PostgreSQL94Dialect.class.getName());
  // ... the remaining JPA properties from the generated class stay as-is

Note what we changed.

The driver class must be the PostgreSQL driver class (org.postgresql.Driver()).

The URL is the standard JDBC URL for connecting to a database, comprising the protocol, the host, the port number and the name of the database. The host is an interesting value: app-db. If you look back at our docker-compose.yml file, you will see that we named our container app-db, and therefore, we can reference that name as the hostname for the database. The rest of the values in the URL MUST match the values we set in the docker-compose and Dockerfile configurations.

host: app-db
port: 5432
database: fhirdata
username: gmoran
password: XXXXXXXX

Also note that we changed the hibernate dialect (org.hibernate.dialect.PostgreSQL94Dialect.class.getName()). If for some reason you change the version of PostgreSQL used in the Dockerfile, you will want to make sure this dialect class matches the version you chose.

Once you are satisfied that your changes match the configuration, save this file. We will now return to the command line to build the server using Maven.

Build the HAPI FHIR JPA Server example .war file.

One item to note: with the changes that we made to use PostgreSQL instead of Apache Derby, you will see errors at the end of the build. These are test failures and did not affect the stability of the server, so I ignored them (don't judge me).

Return to a command line, and navigate in the HAPI source code to the hapi-fhir-jpaserver-example folder. Run the following command from that location:

$ mvn install
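
If you would rather not wade through the failing tests mentioned above, Maven's standard skip flag works here too; just remember you are trading away the test signal:

$ mvn install -DskipTests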

Locate the target folder in the hapi-fhir-jpaserver-example folder. There should be a hapi-fhir-jpaserver-example.war file created.

Copy the hapi-fhir-jpaserver-example.war file into the /web subfolder you created or downloaded with the Dockerfiles. 
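Assuming your Docker project lives in a folder like ~/hapi-docker (the path here is just an example), the copy from the example folder looks like this:

$ cp target/hapi-fhir-jpaserver-example.war ~/hapi-docker/web/
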

Spin up an AWS EC2 instance (I used Ubuntu free tier). 

I'm not going to go into great detail on HOW to work with EC2. Amazon does a pretty good job with documentation, and you should be able to find what you need to start a free AWS account and launch an EC2 instance.

Be certain to allow access to SSH and port 8080 (or whatever port you may have used for the web application server) in the AWS Security Group for your server. Allow access to port 5432 as well, if you want to use psql or another database management utility with your PostgreSQL server. You will be given the opportunity to set this access in the EC2 Launch Wizard.

Install Docker & Docker-Compose on your EC2 instance. 

You'll need to SSH into your EC2 instance to do these next installs.

The Docker guides have great instructions on how to install Docker and Docker-Compose on Ubuntu.

Note that you will want to add the Ubuntu user to the Docker group on your EC2 server so you aren't having to sudo all the time.
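
On Ubuntu that boils down to one command, followed by logging out and back in so the new group membership takes effect:

$ sudo usermod -aG docker ubuntu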

Move your Dockerfiles, Docker-Compose script and .war file to your EC2 instance.  

We've come a long way, and we are almost done. The last steps are to move your files to the EC2 instance, and run docker-compose to spin up the Docker containers on the EC2 server.

Once you have the tools installed, exit out of your EC2 instance terminal.

Use the tar utility to compress your files. From the root folder (the folder that holds your docker-compose.yml file), run the following command: 

$ tar cvzf web.tar.gz *

You should now have a compressed file in the root folder named web.tar.gz that contains all the necessary files for the server. 

Next, use the SCP utility to upload your files to the EC2 instance. 

$ scp -i xxx.pem web.tar.gz ubuntu@my_aws_ec2.compute-1.amazonaws.com:/home/ubuntu

The xxx.pem file in the command above should be the .pem file that you saved from AWS when you created your EC2 instance. The hostname should match the hostname of your EC2 server instance.

Once the files are uploaded to your EC2 instance, SSH into the EC2 server once again. Find the folder that holds your web.tar.gz file.  If you followed along exactly as I did it, the file should be in the /home/ubuntu folder.

Use the tar utility once again to extract your files:

$ tar xvf web.tar.gz

Run Docker-Compose to launch your containers.

Navigate to the folder that contains your docker-compose.yml file. Run the following command:

$ docker-compose up
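
If you want the containers to keep running after you close your SSH session, the detached flag is handy, and you can still tail the logs afterwards:

$ docker-compose up -d
$ docker-compose logs -f app-web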

Now, with luck, you should have the HAPI FHIR example server up and running. You can test talking to your server by navigating to it in your browser:

http://my_aws_ec2.compute-1.amazonaws.com:8080/fhir/
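
You can also poke the REST endpoint from the command line. The exact base path depends on the HAPI version you built, so treat the /baseDstu3 segment below as an assumption; the request should return the server's CapabilityStatement:

$ curl http://my_aws_ec2.compute-1.amazonaws.com:8080/fhir/baseDstu3/metadata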

Add Some Example Patient Resources to HAPI.

If you are looking to learn a bit about FHIR, it would be useful to have some FHIR resources to play around with. I found that HAPI also comes with a nifty CLI that will allow you to load data from the HL7 site just for this purpose. Sadly, that example data doesn't work, and according to the GitHub issue, no one intends to fix it.

All is not lost. Crucible is a clever project that has a number of useful tools for testing your FHIR implementations, and they include a site for generating "realistic but not real" patient resource data. This is what I used to load some data into my new server. 

In a browser, navigate to the Load Test Data tool. Enter your server's base URL, select your format and the number of patients you would like generated, and let the tool work its magic.
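
Once the load finishes, a quick FHIR search is an easy way to confirm the resources actually landed (again, adjust the base path for your deployment):

$ curl "http://my_aws_ec2.compute-1.amazonaws.com:8080/fhir/baseDstu3/Patient?_count=5"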


I loaded 100 patients, and now have a large enough stash to start really learning something about FHIR. But I think that will have to wait until tomorrow; this was enough for one day!

Cheers, 
G

Tuesday, February 10, 2015

Today's the day:) Hello Hitachi Data Systems

Just over 10 years, and it has finally happened.

Hitachi Data Systems intends to acquire our baby, Pentaho.

I couldn't be more excited.  Pentaho is the fast, maneuverable Destroyer coming alongside the Hitachi Battleship, eagerly pursuing dominance of the IoT and Big Data space.  I can't think of a better fit for us as a company ready to do big things in a big market, or as a culture of innovators, entrepreneurs and talented, hard-working engineers. So many people have committed themselves to the Pentaho vision for the past decade, people I know like family. Congratulations to my Pentaho family, and Hello Hitachi.

Looking forward to all we will achieve together :)

From Pentaho: http://www.pentaho.com/hitachi-data-systems-announces-intent-to-acquire-pentaho

From the CEO: http://blog.pentaho.com/2015/02/10/a-bolder-brighter-future-for-big-data-analytics-and-the-internet-of-things-that-matter-pentaho-hds/

Pedro Alves: http://pedroalves-bi.blogspot.pt/2015/02/big-news-today-hitachi-data-systems-hds.html

From Hitachi: https://community.hds.com/community/innovation-center

Bloomberg: http://www.bloomberg.com/news/articles/2015-02-10/hitachi-to-buy-pentaho-to-bolster-data-analysis-software-tools

Wednesday, October 15, 2014

MDX: Converting Second of Day to Standard Time Notation

Had a bit of fun with Pentaho Analyzer recently. In the release of Pentaho 5.2, we have introduced the ability to define filters across a range of time, which is really handy when your dataset holds millions of records captured at per-second granularity.

My use case included keying our time dimension on the second of the day, which results in 86,400 (60 seconds * 60 minutes * 24 hours) unique records, one to represent each unique second in a day. While this is great for simplifying query predicates, it does not help the usability or intuitiveness of the analysis report you are presenting to the user. For instance, who would intuitively understand that 56725 represents 15:45:25 in time?

So I came up with this user-defined calculation that will convert seconds in a day to standard time notation. Would love to hear from anyone who can optimize this:)  This is a valid MDX calculation that Mondrian will process. Since I needed to know the minimum and maximum second per hour in the display, I used the second of day number as a measure.

Format(Int([event scnd of day min]/3600), "00:") || 
  
  Format(Int(([event scnd of day min] - 
        (Int([event scnd of day min]/3600))*3600)/60), "00:") ||
    
    Format([event scnd of day min] - 
          ((Int([event scnd of day min]/3600)*3600) + 
          (Int(([event scnd of day min] - 
          (Int([event scnd of day min]/3600)*3600))/60)*60)), "00")


Here's what it looks like in Analyzer.  The columns Minimum & Maximum Second of Hour have the calculation applied to them. Note the time filter range in the filter panel. Super sweet.



Sunday, September 28, 2014

Hello Docker.

Docker. Hmmmm. I really want to love it. Everybody else loves it, so I should, right? I think maybe some of the "shiny" isn't so bright using DOCKER ON MY MAC.  Although, Chris J. over at Viget wrote this blog post that singlehandedly walked me through each Mac-Docker gotcha with zero pain. Total stand-up guy, IMHO.

If you are not familiar, Docker plays favorites with Linux-based operating systems, and requires a virtual machine wrapper called boot2docker in order to run on a Mac or Windows OS. Not a huge hurdle, but it definitely feels heavier and a bit more maintenance-intensive ... two of the core pain points in traditional virtual environment deployments that Docker proposes to alleviate.

Beyond that silliness, there is a whole lot more *nix-based scripting than I expected. Somehow I thought the Dockerfile language would be richer, accommodating more decision-based caching. You know, something like cache this command but not this one. As I looked around and read a few comments from Docker enthusiasts and the Docker folks proper, it seems there is a great desire to keep the Dockerfile and its DSL ... well ... simple. Limited? Is that a matter of perspective? I can appreciate simple, I guess, but I still want to do hard stuff ... and thus I am pushed to the *nix script environment. This may just be a matter of stuffing myself into these new Docker jeans and waiting for them to stretch for comfort:)

One blessed moment of triumph I would like to share: I was able to write a Dockerfile that would accommodate pulling source from a private Github repository using SSH. This is NOT a difficult Docker exercise. This is a persnickety SSH exercise:) The Docker container needs to register the private SSH key that will pair with the public key that you have registered at Github. At least that is the approach I took. Please do let me know if there are easier / better / more secure alternatives.

So, the solution. The first few steps, I'm going to assume you know how to do, or can find guidance. They are not related to the container setup.

I'm going to tell you right up front that my solution does have a weakness (requirement?) that may not be altogether comfortable, and Github downright poo-poos it. In order to get the container to load without human intervention, you need to leave off the passphrase when you generate your SSH keys (Gretchen ducks.).  I planned to revisit this thorn, but just simply ran out of time. Would love to hear alternatives to this small snafu. Anyway, if you're still in the game,  then read on...

Here are the steps you should follow to get this container up and running.

  1. Generate a pair of SSH keys for Github, and register your public key at github.com (see the example command just after this list).
  2. Create a folder for your Docker project.
  3. Place your private SSH key file (id_rsa) in your Docker project folder.
  4. Create your Dockerfile, following the example below.
  5. Build your image, and run your container.
  6. Profit:)
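
For step 1, generating the passphrase-less key pair (the wart discussed above) looks something like this; the comment and output path are just examples:

$ ssh-keygen -t rsa -b 4096 -C "you@example.com" -N "" -f ./id_rsa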

The Dockerfile

FROM gmoran/my-env
MAINTAINER Gretchen Moran gmoran@pentaho.com

RUN mkdir -p /root/.ssh

# Add this file ... this should be your private GitHub key ...
ADD id_rsa /root/.ssh/id_rsa

RUN touch /root/.ssh/known_hosts
RUN sudo ssh-keyscan -t rsa -p 22  github.com >> /root/.ssh/known_hosts
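
For steps 4 and 5, building and running look roughly like this; the image tag and repository URL are placeholders, and this assumes git is available in the base image:

$ docker build -t gmoran/github-ssh .
$ docker run --rm -it gmoran/github-ssh git clone git@github.com:your-org/your-private-repo.git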

Running as root User

I am referencing the root user for this example, since that is the default user that Docker will use when you run the container. If you would like a bit more protection, you can create a user and run the container as that user by adding the following instruction to your Dockerfile ...

USER pentaho

I created the 'pentaho' user as part of a Dockerfile used in the base image gmoran/my-env. IMPORTANT: Note that gmoran/my-env also downloads the OpenSSH daemon and starts it as part of the CMD Dockerfile command.

Adding the id_rsa File

The id_rsa file is the private SSH key generated as part of the first step in this process. You can find it in the directory you specified on creation, or in your ~/.ssh directory.

There are a number of ways to add this key to the container. I chose the simplest ... copy it to the container user's ~/.ssh directory. OpenSSH will look for this key first when attempting to authenticate our Github request.

Adding github.com to the known_hosts File

We add the github.com SSH key to the known_hosts file to avoid the nasty warning and prompt for this addition at runtime.

In my thrashing on this, I did find several posts in the ether that recommended disabling StrictHostKeyChecking, which hypothetically produces the same end result as manufacturing/mod'ing the known_hosts file. This could however leave this poor container vulnerable, so I chose the known_hosts route.

At the End of the Day ...

So at the end of the day, when I thought I would be honing my Docker skills, I actually came away with a stronger set of Unix scripting skills. Good for me, all in all. I am excited about what Docker will become, and I do find the cache to be enough sugar to keep me drinking the Docker kool-aid.

I should say I appreciate not actually having to struggle with Docker. It is a nice, easy, straightforward tool with very few surprises (we won't talk about CMD versus ENTRYPOINT). Any time-consuming tasks in this adventure were directly related to my very intentional avoidance of shell scripting, which I now probably have a tiny bit more appreciation for as well.

In the words of the guy I like the most today, Chris Jones ... Good Guy Docker :) 





Tuesday, April 15, 2014

Pentaho Analytics with MongoDB

I love technology partnerships. They make our lives as technologists easier by introducing the cross sections of functionality that lie just under the surface of the products, easily missed by the casual observer. When companies partner to bring whole solutions to the market, ideally consumers get more power, less maintenance, better support and lower TCO.

Pentaho recognizes these benefits, and works hard to partner with technology companies that understand the value proposition of business analytics and big data. The folks over at MongoDB are rock stars with great vision in these spaces, so it was natural for Pentaho and MongoDB to partner up.

My colleague Bo Borland has written Pentaho Analytics with MongoDB,  a book that fast tracks the reader to all the goodness at your fingertips when partnering Pentaho Analytics and MongoDB for your analytics solutions.  He gets right to the point,  so be ready to roll up your sleeves and dig into the products right from page 1 (or nearly so).  This book is designed for technology ninjas that may have a bit of MongoDB and/or Pentaho background. In a nutshell, reading the book is a straight shot to trying out all of the integration points between the MongoDB database and the Pentaho suite of products.

You can get a copy of Pentaho Analytics with MongoDB here.  Also continue to visit the Pentaho wiki, as these products move fast.

Friday, March 07, 2014

Pentaho's Women in Tech: In Good Company

I was honored this week to be included in a blog series that showcases just a few of the great women I work with, in celebration of International Women's Day on March 8.

Check out the series, I think you'll find the common theme in the interviews interesting and inspiring. Pass on the links if you have girls in your life that could be interested in pursuing technology as a career. 

Friday, December 14, 2012

Pentaho's 12 Days of Visualizations

If you are interested in the ultimate extendability of Pentaho's visualization layer, you'll love this fun holiday gift from Pentaho: 12 Days of Visualizations.  Check back each date marked for a new plugin that demonstrates Pentaho leveraging cool viz packages like Protovis, D3 and more.

http://wiki.pentaho.com/display/COM/Visualization+Plugins

Today's visualization: the Sunburst!




Merry Christmas and Happy New Year!