EHR Usability Test Report of Clinical Document Exchange version 1.5
Report based on ISO/IEC 25062:2006 Common Industry Format for Usability Test Reports

Product Under Test: Clinical Document Exchange version 1.5
Date of Usability Test: August 28, 2018
Date of Report: September 7, 2018

Report Prepared By: Iatric Systems, Inc.
Jeremy Blanchard, Director of Analytics at Iatric Systems, Inc.
(978) 805-3157
Jeremy.Blanc[email protected]
Quannapowitt Pkwy, Unit 405, Wakefield, MA 01880

Contents

1. Executive Summary
2. Introduction
3. Method
   3.1 Participants
   3.2 Study Design
   3.3 Tasks
   3.4 Procedures
   3.5 Test Location
   3.6 Test Environment
   3.7 Test Forms and Tools
   3.8 Participant Instructions
   3.9 Usability Metrics
4. Results
   4.1 Data Analysis and Reporting
   4.2 Discussion of the Findings
       Effectiveness
       Efficiency
       Satisfaction
       Major Findings
       Areas for Improvement
5. Appendices
   Appendix 1: Recruiting Screener
       Participant Selection
   Appendix 2: Participant Demographics
   Appendix 3: Non-Disclosure (NDA) and Informed Consent Form
   Appendix 4: Example Moderator's Guide
   Appendix 5: Final Survey Questions
   Appendix 6: System Usability Scale Questionnaire

1. Executive Summary

A usability test of Clinical Document Exchange version 1.5 (CDE), an electronic application for CDA exchange and reconciliation, was conducted on July 24, 2018 at Melrose Wakefield Healthcare in Stoneham, Massachusetts by Iatric Systems, Inc. The purpose of this test was to test and validate the usability of the current user interface and provide evidence of usability in the EHR Under Test (EHRUT). During the usability test, 10 healthcare providers matching the target demographic criteria served as participants and used the EHRUT in simulated, but representative, tasks.

This study collected performance data on 4 tasks typically conducted on CDE:
- Patient Selection
- Reconciling Problems
- Reconciling Allergies
- Reconciling Medications

During the 60-minute one-on-one usability test, each participant was greeted by the administrator and asked to review and sign an informed consent/release form (included in Appendix 3); they were instructed that they could withdraw at any time. Participants were trained on the new functionality of the EHR at the start of the session. The majority of the participants do not currently use the product. A training agenda/outline, user logins, and training patients were provided, but no training materials were given to the participants. A general overview of the testing process and expectations was presented, the application functionality was reviewed with the participants, and time was allotted for questions and additional practice if needed.

The administrator introduced the test and instructed participants to complete a series of tasks (given one at a time) using CDE. During the testing, the administrator timed the test and, along with the data loggers, recorded user performance data on paper and electronically. The administrator did not give the participants assistance in how to complete the tasks. Participant screens, head shots, and audio were recorded for subsequent analysis.

The following types of data were collected for each participant:
- Number of tasks successfully completed within the allotted time without assistance
- Time to complete the tasks
- Number and types of errors
- Path deviations
- Participant's verbalizations (comments)
- Participant's satisfaction ratings of the system

All participant data was de-identified; no correspondence could be made between the identity of the participant and the data collected. Following the conclusion of the testing, participants were asked to complete a post-test questionnaire and fill out a System Usability Scale (SUS). Various recommended metrics, in accordance with the examples set forth in the NIST Guide to the Processes Approach for Improving the Usability of Electronic Health Records, were used to evaluate the usability of CDE. Following is a summary of the performance and rating data collected on CDE.

Figure 1 – Data Summary

Task # | Task                                  | N  | Task Success (%), Mean (SD) | Path Deviation (Observed/Optimal) | Task Time (seconds), Mean (SD) | Errors, Mean (SD) | Task Rating (1 = difficult, 5 = easy), Mean (SD)
1      | Patient Identification and Selection | 10 | 55 (30)                     | 10/0                              | 31.8 (18.1)                    | 0 (0)             | 4.5 (0.67)
2      | Problem Reconciliation                | 10 | 47 (66)                     | 17/0                              | 80.2 (22.1)                    | 1 (0.6)           | 3.9 (0.7)
3      | Allergy Reconciliation                | 10 | 100 (0)                     | 7/0                               | 59.6 (14.4)                    | 0.1 (0.3)         | 4.3 (0.46)
4      | Medication Reconciliation             | 10 | 60 (40)                     | 6/0                               | 49.0 (23.7)                    | 0.1 (0.3)         | 4.3 (0.64)

The results from the System Usability Scale (SUS) scored the subjective satisfaction with the system, based on performance with these tasks, at 68.75.

In addition to the performance data, the following qualitative observations were made:
- Major findings (found in Discussion of the Findings, Section 4.2)
- Areas for improvement (found in Discussion of the Findings, Section 4.2)

2. Introduction

The EHRUT tested for this study was Clinical Document Exchange version 1.5 (CDE). Clinical Document Exchange is a CDA exchange and reconciliation solution. The system can electronically transmit and receive a patient's EHR records via HL7 3.0 standards. This exchange of information provides up-to-date patient data to the physician in the care setting, allowing for better patient care decisions and reducing the number of typographical errors that occur with transcribed or verbal exchange. The reconciliation component provides healthcare providers the ability to manually process the CDA that was received and review patient problems, allergies, and medications in a user-friendly interface prior to incorporating them into the patient's medical record. The usability testing attempted to represent realistic exercises and conditions.

The purpose of this study was to test and validate the usability of the current user interface and provide evidence of usability in the EHR Under Test (EHRUT). To this end, measures of effectiveness, efficiency, and user satisfaction, such as time on task and task ratings, were captured during the usability testing.

3. Method

3.1 Participants

A total of 10 participants were recruited and selected to be tested on CDE. Participants in the test were hospital IT staff, physicians, and nurses. Participants were recruited by Mark-Harrison Nelson, who has been a Health IT professional for over 20 years. Mark-Harrison has supported and implemented several health IT initiatives during his time with Melrose-Wakefield (formerly Hallmark Health) and has prior experience as a chem-lab manager.

Participants had no direct connection to the development of CDE or to the organization producing it. Participants were not from the testing or supplier organization, and they were not current active users of the software, but were designated as intended end users of CDE. All participants completed a 30-minute training session prior to the usability test.

For the test purposes, end-user characteristics were identified and translated into a recruitment screener used to solicit potential participants; an example screener is provided in Appendix 1. Recruited participants had a mix of professional experience and EHR experience that we felt was representative of our target users. Most hospital employees that have historically used CDE are IT staff members, but they could also include physicians and nurses, so we recruited a representative group. The following is a table of participants by characteristics, including demographics, professional experience, computing experience, and user needs for assistive technology. Participant names were replaced with Participant IDs so that an individual's data cannot be tied back to individual identities.

Figure 2 – Participant Demographics Table
[The full demographics table is not legible in this transcription. Its columns include Participant ID, Gender, Age Range, Education, Occupation/Role, and Professional Experience (at current position). The recoverable fragments show participants in roles such as clinical systems analyst, clinical analyst, information systems analyst, physician (internal medicine), project specialist, and executive director.]

10 participants (matching the demographics in the section on Participants) were recruited and all 10 participated in the usability test. Participants were scheduled for 60-minute sessions at the user's convenience. A spreadsheet was used to keep track of the participant schedule and included each participant's demographic characteristics, survey results, and user comments.

3.2 Study Design

Overall, the objective of this test was to uncover areas where the application performed well – that is, effectively, efficiently, and with satisfaction – and areas where the application failed to meet the needs of the participants. The data from this test may serve as a baseline for future tests with an updated version of the same EHR and/or comparison with other EHRs, provided the same tasks are used. In short, this testing serves both as a means to record or benchmark current usability and as a means to identify areas where improvements must be made.

During the usability test, participants interacted with 1 EHR. Each participant used the system in a familiar and comfortable location and was provided with the same instructions. The system was evaluated for effectiveness, efficiency, and satisfaction as defined by measures collected and analyzed for each participant:
- Number of tasks successfully completed within the allotted time without assistance
- Time to complete the tasks
- Number and types of errors
- Path deviations
- Participant's verbalizations (comments)
- Participant's satisfaction ratings of the system

Additional information about the various measures can be found in Section 3.9, Usability Metrics.

3.3 Tasks

Several tasks were constructed that would be realistic and representative of the kinds of activities a user might do with this EHR, including:
1. Patient identification and selection.
2. Reconciliation of patient problems.
3. Reconciliation of patient allergies.
4. Reconciliation of patient medications.

Tasks were selected based on their frequency of use, criticality of function, and which tasks might be most troublesome for users.

3.4 Procedures

Upon arrival, participants were greeted; their identity was verified and matched with a name on the participant schedule. Participants were then assigned a participant ID.

To ensure that the test ran smoothly, two staff members participated in this test: the usability administrator and the data logger. The usability testing staff conducting the test consisted of experienced usability practitioners.

The testing administrator was John Fellian, Product Owner, Team Leader, and developer of CDE at Iatric Systems. As the Product Owner and lead developer of the Clinical Document Exchange (CDE) product, John has the most intimate knowledge of the functionality of the UI and how it was designed to be used. John has been the product owner for CDE at Iatric Systems since March of 2017. He has 12 years of experience in his position with Iatric Systems and has a Bachelor of Science degree from Northeastern University.

The data logger and Meaningful Use expert was Jeremy Blanchard. Jeremy is the Director of Analytics at Iatric Systems and holds a Master of Science degree in Health Informatics from Northeastern University. He has held positions of developer and product owner for various products during his 11 years at Iatric Systems.

Each participant reviewed and signed an informed consent and release form (see Appendix 3). The administrator moderated the session, including administering instructions and tasks. The administrator also obtained post-task rating data and took notes on participant comments. A second person served as the data logger and monitored task times, took notes on task success, path deviations, number and type of errors, and comments.

Participants were instructed to perform the tasks (see specific instructions below):
- As quickly as possible, making as few errors and deviations as possible.
- Without assistance; administrators could give immaterial guidance and clarification on tasks, but not instructions on use.
- Without using a think-aloud technique.

For each task, the participants were given a written copy of the task. Task timing began once the administrator finished reading the question and verbalized, "Start." The task time was stopped once the participant indicated they had successfully completed the task and verbalized, "Done," or once the data filing process began. Scoring is discussed below in Section 3.9.

Following the session, the administrator gave the participant 2 post-test questionnaires (the Final Questions survey and the System Usability Scale; see Appendix 5 and Appendix 6, respectively) and thanked each participant individually for their participation. Participants' demographic information, task success rate, time on task, errors, deviations, verbal responses, and post-test questionnaire responses were recorded into a spreadsheet.
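The report lists the fields captured in the tracking spreadsheet but not its layout. The following is a minimal sketch, in Python, of how a single per-task observation could be represented; the record and field names are illustrative assumptions, not the report's actual column names.

```python
# Hypothetical record for one participant's attempt at one task; field names
# are assumptions for this sketch, not taken from the report.
from dataclasses import dataclass, field
from typing import List

@dataclass
class TaskObservation:
    participant_id: str        # de-identified ID, e.g. "Tester3" (not a name)
    task_number: int           # 1-4: patient selection, problems, allergies, meds
    success: bool              # correct outcome, unassisted, within allotted time
    task_time_seconds: float   # timed from "Start" to "Done" (or data filing)
    observed_steps: int        # number of steps the participant actually took
    optimal_steps: int         # benchmarked optimal number of steps for the task
    errors: int                # errors noted by the data logger
    ease_rating: int           # post-task rating, 1 (very difficult) to 5 (very easy)
    comments: List[str] = field(default_factory=list)  # verbalizations

# Example of one logged row (values are invented for illustration):
obs = TaskObservation("Tester3", 2, True, 74.0, 19, 17, 1, 4,
                      ["Wasn't sure which problems counted as relevant"])
```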

3.5 Test Location

Testing was conducted remotely over an internet-based video conference. Participants used their personal workspaces and assumed keyboard and mouse control of an Iatric PC for the purposes of performing the testing steps. The administrator and data logger were able to see the desktop of the PC and the participant's face via webcam, and to hear audio via a conference line. Each environment was of a minimal noise volume, comfortable to the user, and free of distractions.

3.6 Test Environment

The EHRUT would typically be used in a healthcare office or facility. In this instance, the testing was conducted over a GoToMeeting video conference. For testing, the participants assumed keyboard and mouse control of an Iatric-owned Dell OptiPlex 7740 desktop computer running Windows 10. The participants used a wireless mouse and a keyboard when interacting with the EHRUT.

The EHRUT was used on a 24-inch screen with 1920 x 1080 resolution, in conjunction with a built-in webcam. The application was set up according to the product owner's documentation describing the system set-up and preparation. The application itself was running on a platform using a training/test database over a LAN/WAN connection. Technically, the system performance (i.e., response time) was representative of what actual users would experience in a field implementation. Additionally, participants were instructed not to change any of the default system settings (such as control of font size).

3.7 Test Forms and Tools

During the usability test, various documents and instruments were used, including:
1. Informed Consent & Nondisclosure Form
2. Moderator's Guide
3. Post-test Questionnaire
4. System Usability Scale (SUS) Survey
5. Demographics Survey

Examples of these documents are found in the appendices. The Moderator's Guide was devised to capture the required data. The participant's interaction with the EHRUT was captured and recorded digitally with screen capture software running on the test machine.

A web camera recorded each participant's facial expressions synced with the screen capture, and verbal comments were recorded with a microphone. The test moderator later analyzed the video recordings.

3.8 Participant Instructions

The administrator read the following instructions aloud to each participant (also see the full moderator's guide in Appendix 4):

    "I want to thank you for participating in this study. Your input is very important. Our session today will last no more than 60 minutes.

    In this study, we are evaluating the usability of our clinical reconciliation system. I will provide a brief tutorial on how to use the system, then ask you to perform 4 tasks and complete some surveys afterwards. We would like to record this session using a webcam and screen recording device. All information you provide will be kept confidential and your name will not be associated with your comments at any time."

Following the procedural instructions, participants were shown the EHR. The administrator gave them task packets and the following instructions:

    "Now we'll begin the study. This part will be recorded. You'll be asked to complete four tasks. I will read the scenario, then say 'please start'. Please begin working the task; feel free to perform whatever steps you feel necessary to complete the task as quickly and easily as possible. Due to the nature of the test, we are not allowed to guide you or give feedback to you during the test. Remember, our goal is to observe how you would complete the tasks on your own. You may ask specific questions regarding information that may be missing, but I cannot guide or direct you on how to complete a task.

    When you have completed a task, say 'done' out loud. I will ask you to rate the ease of completion for the task on a scale of 1 to 5. We will then move on to the next task and repeat until we've completed all 4 tasks.

    Do you have any questions before we start?"

Participants were then shown the 4 tasks to complete. Tasks were displayed on the screen one at a time, next to the EHR window, so the user could easily review the task instructions. After a task was completed, the next task's instructions were displayed. Tasks are listed in the moderator's guide in Appendix 4.

3.9 Usability Metrics

According to the NIST Guide to the Processes Approach for Improving the Usability of Electronic Health Records, EHRs should support a process that provides a high level of usability for all users. The goal is for users to interact with the system effectively, efficiently, and with an acceptable level of satisfaction. To this end, metrics for effectiveness, efficiency, and user satisfaction were captured during the usability testing. The goals of the test were to assess:
1. Effectiveness of CDE by measuring participant success rates and errors
2. Efficiency of CDE by measuring the average task time and path deviations
3. Satisfaction with CDE by measuring ease of use ratings

Data Scoring

The following table (Figure 3) details how tasks were scored, errors evaluated, and the time data analyzed.

Figure 3 – Observed Data Scoring Details

Effectiveness: Task Success
A task was counted as a "Success" if the participant was able to achieve the correct outcome, without assistance, within the time allotted on a per-task basis. The total number of successes was calculated for each task and then divided by the total number of times that task was attempted. The results are provided as a percentage.

Task times were recorded for successes. Observed task times divided by the optimal time for each task is a measure of optimal efficiency. Optimal task performance time, as benchmarked by expert performance under realistic conditions, is recorded when constructing tasks. Target task times used in the Moderator's Guide must be operationally defined by taking multiple measures of optimal performance and multiplying by some factor (e.g., 3) that allows some time buffer, because the participants are presumably not trained to expert performance. Thus, if expert, optimal performance on a task was [30] seconds, then the allotted task time was [30 * 3] seconds. This ratio should be aggregated across tasks and reported with mean and variance scores.

Effectiveness: Task Failures
If the participant abandoned the task, did not reach the correct answer or performed it incorrectly, or reached the end of the allotted time before successful completion, the task was counted as a "Failure." No task times were taken for errors. The total number of errors was calculated for each task and then divided by the total number of times that task was attempted. Not all deviations would be counted as errors. This should also be expressed as the mean number of failed tasks per participant. On a qualitative level, an enumeration of errors and error types should be collected.

Efficiency: Task Deviations
The participant's path (i.e., steps) through the application was recorded. Deviations occur if the participant, for example, went to a wrong screen, clicked on an incorrect menu item, followed an incorrect link, or interacted incorrectly with an on-screen control. This path was compared to the optimal path. The number of steps in the observed path is divided by the number of optimal steps to provide a ratio of path deviation. It is strongly recommended that task deviations be reported. Optimal paths (i.e., procedural steps) should be recorded when constructing tasks.

Efficiency: Task Time
Each task was timed from when the administrator said "Begin" until the participant said "Done." If he or she failed to say "Done," the time was stopped when the participant stopped performing the task. Average time per task was calculated for each task. Variance measures (standard deviation) were also calculated.

Satisfaction: Task Rating
Participants' subjective impression of the ease of use of the application was measured by administering both a simple post-task question and a post-session questionnaire. After each task, the participant was asked to rate "Overall, this task was:" on a scale of 1 (Very Difficult) to 5 (Very Easy). These data are averaged across participants. Common convention is that average ratings for systems judged easy to use should be 3.3 or above.

To measure participants' confidence in and likeability of the CDE product overall, the testing team administered the System Usability Scale (SUS) post-test questionnaire. Questions included "I think I would like to use this system frequently," "I thought the system was easy to use," and "I would imagine that most people would learn to use this system very quickly." See the full System Usability Scale questionnaire in Appendix 6.
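To make the scoring rules in Figure 3 concrete, here is a short sketch of how each measure could be computed from logged observations. This is an illustrative reconstruction, not the team's actual analysis script; it reuses the hypothetical TaskObservation record sketched in Section 3.4 and uses the sample standard deviation.

```python
# Sketch of the Figure 3 measures; function names are assumptions for this example.
from statistics import mean, stdev
from typing import List, Tuple

def task_success_rate(observations: List[TaskObservation]) -> float:
    """Successes divided by the number of attempts, expressed as a percentage."""
    return 100.0 * sum(o.success for o in observations) / len(observations)

def path_deviation_ratio(observations: List[TaskObservation]) -> float:
    """Observed steps divided by optimal steps, averaged across attempts."""
    return mean(o.observed_steps / o.optimal_steps for o in observations)

def task_time_stats(observations: List[TaskObservation]) -> Tuple[float, float]:
    """Mean and standard deviation of task time over successful attempts only."""
    times = [o.task_time_seconds for o in observations if o.success]
    return mean(times), (stdev(times) if len(times) > 1 else 0.0)

def mean_task_rating(observations: List[TaskObservation]) -> float:
    """Average post-task ease rating (1 = very difficult, 5 = very easy)."""
    return mean(o.ease_rating for o in observations)

def allotted_time(expert_time_seconds: float, buffer_factor: float = 3.0) -> float:
    """Allotted task time: benchmarked expert time multiplied by a buffer factor."""
    return expert_time_seconds * buffer_factor
```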

4. Results

4.1 Data Analysis and Reporting

The results of the usability test were calculated according to the methods specified in the Usability Metrics section above. If additional resources can be obtained (such as more testing time or more usability team resources), it is preferable to test with participants on an individual basis. This decreases the stress of perceived competition and eliminates interaction between participants. The usability testing results for the EHRUT are detailed in the Executive Summary (see Figure 1 – Data Summary).

The results should be viewed in light of the objectives and goals outlined in Section 3.2, Study Design. The data should yield actionable results that, if addressed, yield a material, positive impact on user performance. The results from the SUS (System Usability Scale) scored the subjective satisfaction with the system, based on performance with these tasks, at 68.75. Broadly interpreted, scores under 60 represent systems with poor usability; scores over 80 would be considered above average.

4.2 Discussion of the Findings

To determine the success of CDE, the usability team analyzed the data while keeping several areas in mind: effectiveness, efficiency, satisfaction, and areas for improvement. To draw conclusions about the effectiveness of CDE, the team analyzed the success and failure rates of tasks. To look at efficiency, the task time and deviation data were noted and interpreted. For measures of satisfaction, the SUS scores and reported task ratings were analyzed. The team determined areas for improvement by taking note of verbalizations from participants, analyzing the Final Questions survey data for written comments, and relying on observations from the administrator and data loggers.

Effectiveness

Based on the success, failure, and path deviation data, CDE can be described as effective. CDE tested well in terms of effectiveness; one out of four tasks had a 100% success rate.

Task 1 had a user request a restart due to not being able to see the full GoToMeeting screen, which put them at a disadvantage.

We counted the first attempt at Task 1 for this user as a failure because the screen issue was not brought to the administrator's attention until after the timer had already started.

Task 2 had a high error rate due to the users' interpretation of the scenario, which the testing team agreed was slightly ambiguous and could cause an error depending on how the user determined the data to be "medically relevant." The testing team did not update the script, so that all users were given the exact same opportunity for success.

Task 4 had 2 users who needed to restart the task due to accidentally exiting from the screen and needing to reset the scenario. Their first attempts were counted as failures.

Because the reconciliation functionality was new to all users who participated in the study, some users took slightly longer than the allotted task time. We retained these times because, as the users progressed through the tasks and their comfort level rose, we saw progressively faster times. All users stated that the initial attempt is a little intimidating, but as you progress through the tasks the system quickly becomes familiar and easier to use.

Efficiency

Based on the observations of task time and deviation data, this usability test helped the team determine that CDE can be considered efficient, but there is room for improvement. For instance, most of our optimal task times were 45-90 seconds on task, and most tasks were completed in less time than anticipated. Tasks 2, 3, and 4 are repetitive tasks requiring the user to perform similar actions to the preceding task. By the time the users got to Task 4, 6 of 10 users were able to complete the task in 45 seconds or less, which greatly exceeded expectations.

In terms of average task success, even with our small data set of 10 participants, most performed well. Our participants had a 100% success rate with Task 3. Task 1 and Task 4 would have seen a 100% success rate as well, but 1 user had a non-product-related issue on Task 1 and 2 users on Task 4 exited the screen, which required a restart of the task.

We plan to account for some of the issues experienced by users that led to confusion or prevented a user from being able to successfully complete a task on their first attempt. No users of the software had previous experience with the reconciliation functionality prior to the usability test, and this led to some confusion and higher task times for some users.
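The satisfaction discussion below leans on the overall SUS score of 68.75. The report does not show the arithmetic behind that number, but it is consistent with the standard SUS scoring convention, sketched here with invented responses.

```python
# Standard SUS scoring (Brooke): odd-numbered items contribute (response - 1),
# even-numbered items contribute (5 - response), and the summed contributions
# are multiplied by 2.5 to give a 0-100 score. The example responses are
# invented for illustration, not an actual participant's answers.
def sus_score(responses):
    """responses: the ten 1-5 answers to SUS items 1-10, in order."""
    assert len(responses) == 10
    total = sum((r - 1) if i % 2 == 1 else (5 - r)
                for i, r in enumerate(responses, start=1))
    return total * 2.5

print(sus_score([4, 2, 4, 2, 4, 2, 4, 2, 4, 3]))  # 72.5 for this invented set
```

Per-participant scores computed this way would presumably then be averaged across the ten participants to arrive at a study-level score such as the 68.75 reported here.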

Satisfaction

Based on the task ratings and SUS results data, the participants were generally satisfied with the usability of CDE. The SUS score of 68.75 indicates that the participants perceived CDE as usable, but with room for improvement to become more intuitive and more user-friendly. Most users scored the individual tasks as either Easy or Very Easy.

While task success rates may differ from this perception data, the usability team finds that overall, the participants were satisfied with most tasks and did not per
