Integrative Biology VRE Project
Bespoke Development: In silico Experiment Repository
The aim of the in silico experiment repository was to provide a graphical, online environment where biological simulation experiments could be constructed and managed without any knowledge of Unix, cluster computing, or advanced shell scripting. The key benefits were:
The initial generic IBVRE portal did also attempt to provide an interface where simulation experiments could be executed without using the command line. However, the in silico experiment repository aimed to be much more tailored to the in silico experimental process, and less generic. This, it was hoped, would help tackle some of the uptake problems experienced with the initial IBVRE portal.
To keep the scope of the work manageable, the IBVRE project focused on meeting the needs of those using Memfem, a nonlinear finite element simulation tool originally written by Dr James C Eason. The pilot user community selected was made up of three of the largest heart modelling labs around the world that use this software on a day-to-day basis to conduct heart simulation experiments:
Natalia Trayanova's computational cardiac electrophysiology lab, the largest of the three labs, was originally based at Tulane University, New Orleans. Events in August 2005 led them to relocate temporarily to Washington University in St Louis before returning to New Orleans in January 2006 and finally relocating permanently to Johns Hopkins University in August 2006. The Virtual Heart Lab, which was originally based at Washington and Lee University (WLU), relocated to Tulane University - with some work continuing at WLU - in July 2006, when Dr James Eason took up a visiting teaching position there.
Two separate design elicitation workshops were arranged, the first at the Virtual Heart Lab in January 2006, the second at Natalia Trayanova's lab in May 2006. The strategy for both workshops was to follow techniques from Cooperative Design, a design methodology characterised by a high level of end-user participation. This strategy was followed in an attempt to secure a high level of user buy-in to the project - it was felt that researchers would be more likely to use the VRE if they were heavily involved in designing it.
Dr Eason divided up the experimental process into a series of components, from defining the particular heart geometry and stimulation protocol, to determining where output data resulting from the simulation should be stored. Viewing the process in this way helped the team to see the things that would change between simulations.
The main components identified were as follows:
An experiment will normally start the virtual heart from a known state, apply a stimulus, and examine its effect. This initial state is created by running a simulation in which the virtual heart is paced a number of times through the application of a series of low-amplitude shocks. The state of this pacing simulation is saved at an appropriate point as a restart file, which is then used as the basis of subsequent simulations that involve the application of higher-amplitude shocks.
Vulnerability Grid experiments test the effect of the timing and strength of a shock when applied to a simulated heart. In these experiments, large numbers of simulations are run in parallel, each simulation varying only in the time and strength of the applied shock. One aim of these experiments is to help design better defibrillators by identifying the lowest dosage shock that can terminate ventricular fibrillation (VF). Vulnerability Grids were by far the most common type of experiment performed at Dr Eason's Virtual Heart Lab, making up around 80% of experiments performed.
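The structure of a Vulnerability Grid, where each simulation differs only in shock time and shock strength, can be sketched as a simple parameter sweep. This is an illustrative sketch only: the function name, the parameter names, and the sample values are assumptions and are not taken from Memfem or the IBVRE code.

```python
from itertools import product


def vulnerability_grid(shock_times_ms, shock_strengths):
    """Return one parameter set per simulation in the grid.

    Every combination of shock time and shock strength becomes a
    separate simulation; all other settings are assumed to be shared.
    """
    return [
        {"shock_time_ms": time_ms, "shock_strength": strength}
        for time_ms, strength in product(shock_times_ms, shock_strengths)
    ]


# Hypothetical values, chosen only to show the shape of the grid.
grid = vulnerability_grid([100, 120, 140], [2.0, 4.0])
for params in grid:
    print(params)
```

Because the simulations are independent of one another, each parameter set could be handed to the cluster as its own job.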
For this observation session, an experienced in silico experimentalist, Ashley Brown, was videoed while she carried out a simulation experiment. Through the use of screen capture software VNCRec and a standard camcorder, it was possible to capture the user interactions both within the physical workspace, and within the Linux desktop environment.
The session was useful in that it revealed the inherently intricate nature of the process, and the functions that would need to be automated in any VRE. Examples of tasks that were performed included:
The user interface storyboarding was carried out collaboratively in front of a whiteboard with the lab's director Dr Eason. It was decided to initially focus only on Vulnerability Grid studies as these represented the majority of simulation studies carried out. Basic screens were laid out, giving the minimal functionality necessary for the user to
In addition to the above, there was also the idea that experiments would be organised under over-arching studies. The properties identified for experiments, and batches were as follows:
Two iterations of the prototype were developed. The first, a simple HTML mock-up, was evaluated with Ashley Brown and Dr Eason. Following their comments, a more sophisticated prototype utilising JavaServer Pages (JSP), the JSP Standard Tag Library (JSTL), and a PostgreSQL database was developed. This version allowed experiments to be constructed and executed on the local computational cluster (The Inferno), and was evaluated with a new student who had no previous experience of conducting simulation experiments at the lab.
At the Trayanova lab, a very similar workshop was held, starting with a whiteboarding session to establish the overall research lifecycle. This very quickly brought home the much more diverse nature of experiments carried out at the Trayanova lab, in contrast to the Virtual Heart Lab, where experiments tended to be much more targeted. Another important difference was that each individual experimentalist at the Trayanova lab designed and ran their own experiments with guidance and direction from the lab's director; at the Virtual Heart Lab, the students who performed the simulations normally worked with an experiment that had been pre-designed by Dr Eason.
The experimental protocol followed was reasonably consistent with the Virtual Heart Lab, starting with the choice of model (canine, human, rabbit etc), and then through to the definition of the stimulation protocol. Experiments were classified according to the number of dimensions involved from simple 0 dimensional experiments i.e. single cell, through to 3 dimensional slab and whole heart models. 2 and 3-dimensional experiments were then further subdivided according to the type of phenomenon under study.
Also discussed was how the file system was used to organise experiments and their corresponding output data. Each experimentalist had a different way of managing this; however, there would normally be a directory for each project/study and, under this, a model directory and a parameter sets directory. The parameter sets directory contained a sub-directory for each individual simulation with a specific set of parameters; the model directory contained the geometry model in use for all simulations within the study. As the model can be quite large, especially for whole heart models, it was normally symlinked into each simulation sub-directory to save space.
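The directory convention described above can be sketched as follows. This is a minimal sketch assuming a POSIX file system with symlink support; the directory names `model` and `parameter_sets`, the file name `geometry.model`, and the function name are hypothetical and only mirror the layout described in the text.

```python
import tempfile
from pathlib import Path


def create_study_layout(root, study, parameter_sets,
                        model_filename="geometry.model"):
    """Create a study directory with one shared model and one
    sub-directory per parameter set, each linking to the model."""
    study_dir = Path(root) / study
    model_dir = study_dir / "model"
    sets_dir = study_dir / "parameter_sets"
    model_dir.mkdir(parents=True, exist_ok=True)
    sets_dir.mkdir(exist_ok=True)

    # Empty placeholder standing in for the (large) geometry model.
    model_file = (model_dir / model_filename).resolve()
    model_file.touch()

    for name in parameter_sets:
        sim_dir = sets_dir / name
        sim_dir.mkdir(exist_ok=True)
        link = sim_dir / model_filename
        # Link to the shared model instead of copying it, to save space.
        if not link.is_symlink() and not link.exists():
            link.symlink_to(model_file)
    return study_dir


with tempfile.TemporaryDirectory() as tmp:
    study = create_study_layout(tmp, "pacing_study", ["run_a", "run_b"])
    for path in sorted(study.rglob("*")):
        print(path.relative_to(study))
```

Each simulation sub-directory then sees the model as a local file while only one copy exists on disk.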
A walkthrough was carried out with Hermenegild Arevalo, a graduate student working at the lab. The experiment shown involved pacing a simulated rabbit heart from different positions - the output data resulting from this was to be given to another group for analysis. After showing the script that executed the experiment, Hermenegild went on to demonstrate the visualisation side of the experiment, forming the greater part of the experimental work.
In contrast to the Virtual Heart lab, where post-processing centred around the use of statistical analysis packages, here all data analysis was carried out through the use of the Meshalyzer visualisation package. There was a discussion about how this visualisation package compared with the other main package used, CoolGraphics (CG). A powerful feature that was available in Meshalyzer but not to the same extent in CG was clipping planes - this gives the experimentalist the opportunity to cut away sections of the virtual heart in order to see the electrical activity inside.
As at the Virtual Heart lab, lab members were brought together in front of a whiteboard, to sketch out a user interface. The solution to emerge was much more generic than the VRE developed for the Virtual Heart Lab, and not tied to any particular experiment type. Users were able to construct experiments, and monitor their status as they were running. In this case, status did not refer to whether the job had failed or completed on the cluster, but a visual representation of the electrical activity on the surface of the simulated heart at the current timestep. It was felt that this ability to view a snapshot showing the heart from 6 perspectives would be the killer feature, and could convince lab members to use the VRE exclusively, in preference to the command line.
As at the Virtual Heart Lab, a prototype utilising JSTL and PostgreSQL was developed and evaluated with lab members. This did not allow experiments to be constructed from the VRE, but allowed already running experiments to be visualised from all 6 directions, at the current timestep.
Both the WLU and the Tulane prototypes were put together very rapidly and never designed to be robust - their main purpose was as a design elicitation tool. Following the two workshops at the Virtual Heart Lab and Tulane, work focussed on developing a more generic user interface design and data model that would bridge the requirements of the two groups. This design work was carried out in conjunction with Rob Blake, the research analyst at the Trayanova lab. The final user interface design was then validated and refined with a number of Trayanova lab members at a third design workshop held on 19 July 2006 as part of the IB project world tour.
The data model developed was based on the CCLRC Scientific Metadata Model. It has a hierarchical structure in which information is passed down from level to level. The highest level is the Study. Under Study is Experiment, and below that is Sub Experiment. Jobs (or simulations) can only be run from the lowest level. Variables defined at the Experiment level are inherited by the Sub Experiment and cannot be changed there.
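The inheritance rule in this hierarchy can be illustrated with a small sketch. It is not the IBVRE data model itself: the class name, method names, and sample variables are assumptions, and the sketch generalises the rule slightly by treating any inherited variable, including one set on a Study, as read-only further down.

```python
class Node:
    """One level of the Study > Experiment > Sub Experiment hierarchy."""

    LEVELS = ("Study", "Experiment", "Sub Experiment")

    def __init__(self, name, parent=None, variables=None):
        self.name = name
        self.parent = parent
        self.level = 0 if parent is None else parent.level + 1
        if self.level >= len(self.LEVELS):
            raise ValueError("hierarchy is limited to three levels")
        self._own = {}
        for key, value in (variables or {}).items():
            self.set_variable(key, value)

    def inherited(self):
        """Variables passed down from all higher levels."""
        return self.parent.effective() if self.parent else {}

    def effective(self):
        """Inherited variables plus those defined at this level."""
        merged = self.inherited()
        merged.update(self._own)
        return merged

    def set_variable(self, key, value):
        # Inherited values are read-only at lower levels.
        if key in self.inherited():
            raise ValueError(f"{key!r} is inherited and cannot be changed")
        self._own[key] = value

    def can_run_jobs(self):
        """Jobs may only be run from the lowest level."""
        return self.level == len(self.LEVELS) - 1


# Hypothetical example data.
study = Node("Defibrillation study")
experiment = Node("Rabbit ventricles", parent=study,
                  variables={"model": "rabbit"})
sub = Node("Shock at 120 ms", parent=experiment,
           variables={"shock_time_ms": 120})
print(sub.effective(), sub.can_run_jobs())
```

Resolving variables on demand, rather than copying them into each child, keeps a single definition of each value at the level where it was set.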
Following the workshop on 19 July 2006, the project team were confident that they had a good understanding of the user interface requirements of the two groups, and felt justified in committing to a longer period of development to allow time for a more robust infrastructure to be constructed. At this point Rob Blake left Tulane to take up a PhD course at the University of Illinois. From August 2006 the project team worked with Umar Farooq, who took over from Rob at Tulane.
Development work from August 2006 consisted of building a fully functional web application based on Apache Struts, and a cut-down dashboard-style portal version. The Technical Consultant (based in London) and the User Interface developer worked on the Struts version, while the Systems Developer worked on the portal version in parallel. The technical design for the Struts application is based on the Model-View-Controller (MVC) design pattern, with the data access layer following the Core J2EE Data Access Object pattern and implemented using plain JDBC. The portal version, by contrast, uses the Spring Framework to implement MVC and Hibernate to provide the object-relational mapping.
In December 2006, a first version of the VRE, which allowed submission of Memfem simulations to a single-node test cluster, was installed on a server at the Johns Hopkins lab. At this point a few members of the lab started to use the IBVRE to submit simulation experiments on a trial basis. The final formal evaluation was arranged as a videotaped shared desktop session utilising VNC (a desktop sharing tool) and Skype on 16 February 2007. The session was organised around a guided walkthrough of the VRE and generated a coherent set of requirements for further development. It was also very reassuring, in that it validated the original VRE concept and its potential to transform the way the two labs work. As of March 2007, several lab members at Tulane and Johns Hopkins have started to use the VRE to create and manage their simulation studies in preference to the command line.
IBVRE Final Evaluation, February 2007 Full Text (PDF)
In order to address the immediate requirements of the end user community, the first version of the VRE was developed to work only with local computational clusters supporting the PBS queuing system. However, it was always planned that this prototype would be integrated with the middleware services developed by the IB Technology Group bringing the capability to submit simulation jobs to national Grid resources such as HPCx and the UK National Grid Service (NGS).
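To illustrate what submitting a simulation to a PBS-managed cluster involves, the sketch below builds the text of a PBS job script. It is an assumption-laden example rather than the IBVRE implementation: the function name, the job name, and the `./memfem params.txt` command line are hypothetical, and only standard PBS directives (`-N`, `-l nodes=...:ppn=...`, `-l walltime=...`) and the `PBS_O_WORKDIR` variable are used.

```python
def build_pbs_script(job_name, command, nodes=1, ppn=1,
                     walltime="01:00:00"):
    """Return the text of a PBS job script that runs one command."""
    lines = [
        "#!/bin/sh",
        f"#PBS -N {job_name}",                # job name shown in the queue
        f"#PBS -l nodes={nodes}:ppn={ppn}",   # nodes and processors per node
        f"#PBS -l walltime={walltime}",       # maximum run time
        "cd $PBS_O_WORKDIR",                  # directory the job was submitted from
        command,
    ]
    return "\n".join(lines) + "\n"


# Hypothetical command line; the real Memfem invocation is not
# described in this document.
script = build_pbs_script("vgrid_001", "./memfem params.txt",
                          nodes=1, ppn=4, walltime="02:00:00")
print(script)
```

On a cluster, a script like this would typically be written to a file and handed to the scheduler with `qsub`.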
Integration with the IB middleware services, to provide this Grid connectivity, was carried out by Lakshmi Sastry's group at CCLRC between December 2006 and March 2007. A version providing job submission capability to NGS and HPCx, and data hosting within the Storage Resource Broker (SRB) vault at CCLRC, was installed on a server at Tulane University in March 2007, and was in alpha-testing at project end.
The home page of the system, as shown below, presents the user with a list of studies, each corresponding with a particular set of users. Each study has a view link that takes the user to the Study Details page, and a Jobs link that displays all jobs associated with the study. The All Jobs link at the top of the page displays all jobs associated with all the studies known to the system. The New Study link takes the user to a page allowing them to create a new study.
The Study Details page, shown in the next screenshot, displays descriptive information relating to the study. An edit link takes the user to a corresponding page where this descriptive information can be edited. At the bottom of the page is a list of experiments associated with the study, each with a corresponding view link.
The Experiment Details page, shown below, lists the simulation parameters specific to the experiment, as well as a list of any immediate sub-experiments. A link at the bottom of this page takes the user to another page listing all jobs associated with this experiment. An edit link takes the user to a corresponding page (not shown) where the parameters associated with the experiment can be edited.
The following screenshot shows the image returned when the user clicks on the snapshot link, one of the more innovative features of this VRE. This coloured image indicates the surface electro-potential at the current time-point of the running simulation, and helps the user to decide whether to continue with the current simulation or to abandon it.
Source code for the system was released on the SourceForge open source software hosting web site in March 2007, under the Unix name ibvre. The software is being made available to the community under a Modified BSD licence. Work will continue after project end to migrate much of the user documentation, currently hosted on the JHU wiki, to the SourceForge website.