Abstract
Purpose
Simulation-based education plays a critical role in nursing by allowing students to acquire clinical competencies in a safe and controlled environment. However, current evaluation tools for simulation scenarios often lack standardization, resulting in inconsistencies when assessing the effectiveness of such programs.
Methods
This study aimed to develop a comprehensive Nursing Simulation Scenario Evaluation Tool using the Real-Time Delphi method. A panel of 10 experts in nursing and simulation education participated in two rounds of surveys. The evaluation criteria were derived from the International Nursing Association for Clinical Simulation and Learning Standards of Best Practice and relevant literature. Survey items were refined through expert consensus using content validity ratios and coefficient of variation values. The finalized tool was further enhanced with artificial intelligence (AI)–based evaluation capabilities to support objective and systematic assessment. The tool was registered and patented in the Republic of Korea (Korean Intellectual Property Office Registration No. 10-2024-0051234) to acknowledge its innovation and technical merit.
Results
The process resulted in an evaluation tool comprising eight key domains and 36 items, covering scenario structure, learning objectives, preparation, script development, debriefing, facilitation, expected outcomes, and scenario validity. A Kendall’s coefficient of concordance of 0.739 indicated strong agreement among the experts.
Conclusion
This study successfully developed a standardized and validated tool to improve the reliability and effectiveness of simulation-based education in nursing. The tool addresses a key gap in current educational practices and enhances consistency in evaluating nursing simulation scenarios. Future studies should focus on validating its application across diverse educational environments.
Key words: Clinical competence; Delphi technique; Educational measurement; Nursing; Simulation training
INTRODUCTION
Despite the continuous development and increasing complexity of the medical field in the 21st century, the basic structure and format of medical education have remained largely unchanged since the late 19th century [1]. This is particularly significant in nursing, where nurses play a crucial role in ensuring patient safety and health. Nurses must acquire the necessary competencies across various situations before engaging in clinical practice with patients [2]. Traditional knowledge tests play an important role in the prelearning phase of simulation-based education but are insufficient on their own. In simulation-based education, learners apply previously acquired knowledge and skills to realistic scenarios that better prepare them for the complexities of real-world clinical practice [1].
Moreover, clinical practice involving patients fails to measure the diverse competencies required for effective patient care, imposes significant stress on learners, and increases concerns about patient safety [2]. Accordingly, the importance of simulation education (SE) in nursing has been emphasized. It enables learners to experience various clinical scenarios in a safe environment, thereby enhancing their clinical performance. The introduction and utilization of SE in nursing help students acquire the clinical reasoning skills required to navigate complex, unpredictable clinical situations, meet the varied requirements of modern medical services, and ensure patient safety through educational innovation [3].
However, several important factors must be considered to ensure the effectiveness of SE. Currently, the validity and reliability of simulation scenarios are primarily determined through evaluations by a small group of experts. Moreover, adherence to the International Nursing Association for Clinical Simulation and Learning (INACSL) Standards of Best Practice—which provide internationally recognized guidelines for simulation-based education, including Simulation Design, Outcomes and Objectives, Facilitation, Debriefing, and Evaluation—has often been found to be insufficient, particularly in the areas of Simulation Design and Participant Evaluation [4]. A previous study developed a pediatric nursing simulation scenario template to standardize practices in pediatric nursing education [5]. Although it provides a useful framework for scenario development, its focus on pediatric contexts limits its applicability across diverse nursing disciplines [5]. In addition, the absence of a robust evaluation system highlights the need for a more comprehensive and reliable approach to effectively assess simulation scenarios [5]. This undermines the consistency and reproducibility of SE, making it difficult to compare and evaluate educational outcomes. Therefore, setting goals that match learners’ levels, ensuring scenario authenticity and expert validity, and specifying debriefing procedures are necessary [4]. This requires systematic collection and analysis of expert opinions to formulate effective scenario development and evaluation protocols.
The development of a Nursing Simulation Scenario Evaluation Tool (NS-SET) is crucial for ensuring the quality and effectiveness of simulation-based education. NS-SETs are essential for accurately measuring learners’ clinical capabilities and providing a structured framework for evaluating the acquisition of the necessary knowledge, skills, and attitudes [6]. By using these tools, educators can objectively assess learners’ performance, offer personalized feedback, and enhance educational outcomes, thereby supporting educational objectives [2].
Current evaluation tools (ETs) in simulation-based education often rely on limited practical exercises and focus predominantly on knowledge-based assessments. However, this approach does not comprehensively evaluate learners’ clinical skills, such as responsiveness to real patient scenarios [7]. Moreover, the inconsistency and limited reliability of these ETs can undermine the credibility of the evaluation results [6]. Although these tools primarily assess simulation-based education outcomes, there remains a pressing need for standardized tools and methodologies that specifically evaluate the validity and quality of simulation scenarios.
To address these issues, we adopted the Real-Time Delphi (RTD) method, which integrates the opinions of diverse experts to determine the components and structure of the scenario ET [8]. This approach helps balance educational value and practical feasibility by building a broad expert consensus. The RTD method facilitates the collection and analysis of panel members’ opinions through iterative rounds of real-time feedback based on initial responses [8]. Additionally, the Delphi method can enhance data reliability by applying predefined criteria [9]. This process ultimately leads to more precise decision-making and plays a pivotal role in NS-SET development.
Consequently, this study aimed to use the RTD method to assemble a panel of experts, including nursing and simulation specialists, to develop an NS-SET. This endeavor sought to produce a self-checklist for the direct application of the developed scenarios in education, enabling self-evaluation. Through this process, we expected to systematically construct the foundational components of nursing simulations and establish a basis for scenario development.
METHODS
Ethical statements: This study was approved by the Institutional Review Board (IRB) of Gangneung-Wonju National University (IRB no. GWNUIRB-2023-18). Informed consent was obtained from all participants.
1. Study Design
This study employed the RTD survey method to gather expert panel opinions and derive consensus on the development of an ET for nursing simulation scenarios. The study followed the Guidance on Conducting and REporting DElphi Studies (CREDES) recommendations [10].
2. Survey Development
The research team developed a nursing simulation scenario template based on the INACSL Standards of Best Practice, specifically the Simulation Design and Evaluation standards, along with findings from previous studies [4]. This template specifies the criteria that must be included in a nursing education simulation scenario. The study also aimed to develop a self-assessment checklist for the direct application of the developed scenarios to education, so the review examined previous studies on ETs for self-assessment after the development of simulation scenarios. The questionnaire was developed from INACSL content and prior studies and integrated with the guidelines on evidence-based clinical simulation scenarios published by Waxman [11] in 2010, the five themes that must be included in nursing simulation programs described by Page-Cutrara [12], and the Colorado Hospital Association’s guidance on how to write a simulation scenario.
A total of 35 items were derived to create an ET for simulation scenarios, focusing on eight key areas: overview of the scenario, learning objectives, preparation and pre-briefing plan, creating the script and case information, debriefing, facilitation, expected outcomes and evaluation, and development of the scenario. The overview included four items assessing adherence to the scenario template standards, completeness of the scenario, plausibility of the case, and whether the materials were evidence-based. Learning objectives were assessed to determine whether they were specific, measurable, attainable, relevant, and timely. The preparation and pre-briefing plan consisted of eight items that assessed participants’ physical, mental, and prior-knowledge preparedness, as well as the simulation lab environment, materials, and simulators, including two items related to pre-briefing preparation. Creating the script and case information included 10 items that required checking the scenario algorithm and flowsheet. The debriefing section assessed whether the questions were specific and verifiable by the educator. One item in the facilitation section asked whether the questions were clearly presented. The expected outcomes and evaluation section asked whether the ETs were clear and whether their reliability was reported. Finally, the development of the scenario section assessed whether developers presented theoretical bases and frameworks and verified scenario validity. This tool includes elements of patient-centered care assessment, evidence-based nursing interventions, clinical judgment, communication, teamwork, and safety, as suggested by Page-Cutrara [12], and it incorporates the proposal of Kim et al. [4] that the validity and reliability of the scenario must be verified.
3. Study Participants
An expert panel was carefully selected to include individuals with extensive expertise and experience in this topic. The criteria for selection included more than 10 years of nursing education and practice experience or more than 5 years of experience in simulation teaching or research. Ten experts participated in this study.
4. The Real-Time Delphi Survey Process
1) Development of Real-Time Delphi tool
We developed a dedicated website implementing the RTD technique (https://k-realtimedelphi.net/app/home). Using the open-source low-code development platform Budibase (https://github.com/Budibase/budibase), we built the site and hosted it on Amazon Web Services. Tasks that had previously been performed manually, such as collecting responses from the expert panel, survey coding, and statistical analysis, were automated on the site, allowing researchers and the expert panel to participate more efficiently. Furthermore, by integrating features that ensure the anonymity of the expert panel and enable the real-time exchange of opinions, we implemented the core elements of the RTD technique.
2) Survey through Real-Time Delphi
This study was conducted in five phases. In Phase 1, the researchers explained the research objectives to the recruited experts, obtained their consent, and conducted the survey simultaneously. Phase 2 involved providing a survey link to the expert panel for the preliminary investigation of the NS-SET and conducting the survey. Phase 3 involved closing the initial results and modifying the ET based on the expert panel’s responses and opinions. In Phase 4, the survey link for the ET, modified and enhanced based on the expert panel’s opinions, was provided again to conduct the survey. Phase 5 involved completing the survey, summarizing the results, and finalizing the ET. The RTD survey was conducted in two rounds: the first from December 26, 2023, to January 10, 2024, and the second from February 14, 2024, to February 27, 2024. Ten experts participated in this survey.
5. Data Collection
To derive the objectives of the NS-SET, a preliminary investigation was conducted based on a draft ET developed from a literature review [4]. A Phase 1 Delphi survey was conducted as an initial step in this preliminary investigation. The NS-SET survey was divided into the following parts: 1. Overview of the scenario; 2. Learning objectives; 3. Preparation and pre-briefing plan; 4. Script creation and case information; 5. Debriefing plan; 6. Facilitation; 7. Expected outcomes and evaluation; and 8. Scenario development. Subsequently, based on the Phase 1 RTD survey findings, the opinions of the expert panel were collected, and the template was modified and enhanced before conducting Phase 2.
1) Round 1 Real-Time Delphi survey
The Phase 1 RTD survey consisted of an initial template divided into eight parts, and experts were asked to review the retention of each part and the items and content within each part. Examples of detailed survey questions were structured as follows: Item 1, “Are the standards of the scenario template being followed?” and Item 2, “Has all content of the template been completed?” The survey responses were categorized as “keep,” “delete,” “modify,” or “additional comments.” The experts could indicate their opinion by selecting “keep” if they felt the contents were suitable for the part, “delete” if they were inappropriate, or “modify” if adjustments were necessary, or they could provide “additional comments” for further input (Figure 1).
2) Round 2 Real-Time Delphi survey
Following the first RTD survey on the development of the nursing simulation scenario template and considering expert opinions, the content validity ratio (CVR) and the coefficient of variation (CV) were evaluated. Based on these findings, the template was revised and enhanced. While maintaining the structure of the template, some content within each part was added or modified based on the panelists’ comments or the CVR values. Unlike the initial survey, the second round had two questions focused on whether each part and its content within the template could be used appropriately. Responses were recorded on a 4-point scale ranging from “perfect agreement” to “perfect disagreement,” and an optional “other” section was provided for additional comments.
6. Data Analysis
Data collected through the two rounds of RTD surveys were analyzed by calculating the CVR and CV for each item using Excel (Microsoft 365; Microsoft Corp.). Interpretation of the CVR values followed predefined criteria [13].
Figure 2 shows a flowchart of the decision-making process based on the analysis results.
The CVR depends on the number of panel members. For the first Delphi survey with 10 panel members, a CVR value of 0.60 or higher was required to consider the content valid. Therefore, items with a CVR value less than 0.60 in the first survey were considered to have low content validity and were subject to deletion or modification. The CVR formula is as follows: CVR = (N_e − N/2) / (N/2), where N is the total number of expert panel members and N_e is the number of panelists who rated the item as “appropriate.” Stability was calculated using the CV, defined as the standard deviation (SD) of each item divided by its mean. A CV less than 0.5, indicating high consistency in expert responses, was considered stable and reliable [14].
Kendall’s W (KW) was used to assess the consistency of responses among the experts [15]. The KW values were computed using Python ver. 3.9.0 (https://www.python.org/downloads/release/python-390/). The KW value ranges from 0 to 1, where a value closer to 1 indicates a high degree of agreement among the evaluators, whereas a value closer to 0 indicates little or no agreement. According to conventional criteria, values below 0.3 indicate low agreement, values between 0.3 and 0.7 indicate moderate agreement, and values above 0.7 indicate strong agreement. Low values therefore reflect diverse opinions among the experts and suggest a lack of consensus on the topic.
7. AI-Based Evaluation Process
Artificial intelligence (AI) technology was used to automate scenario evaluation. The AI system analyzes scenario texts, applies the predefined evaluation criteria, and generates scores with justifications for each item. This process enhances the consistency and objectivity of the evaluation results by reducing human subjectivity and fatigue. It also provides instant feedback, allowing evaluation results to be obtained within seconds to minutes. Google Gemini (Google LLC) was employed as the underlying AI platform to perform natural language analysis and generate automated evaluation outputs.
RESULTS
1. Round 1 Real-Time Delphi Survey Results
In Round 1 of the RTD survey, expert panel opinions were collected for 37 items evaluated using the CVR and CV. CVR values ranged from 0.20 to 1.00, and CV values ranged from 0.00 to 0.38. Several items did not meet the validity and consistency thresholds, indicating the need for further revision and review.
For Item 4 (Part 1. Overview of scenario: use of evidence-based data), the question was, “Did you use evidence-based data?” Experts emphasized the importance of referencing “evidence-based data” and recommended broadening the question. This item recorded a CVR of 0.60 and a CV of 0.24, indicating the need to clarify the definition of evidence-based guidelines.
Item 10 (Part 2. Learning objectives: timeliness of objectives presentation) asked, “Were the objectives presented in a timely manner?” Experts noted that the definition of “timeliness” was ambiguous, leading to varying interpretations; they questioned whether timeliness refers to synchronization with simulation events or to the overall timing of training. This item recorded a CVR of 0.40 and a CV of 0.38, highlighting the need to refine this definition.
Item 11 (Part 3. Preparation & pre-briefing plan: consideration of participants’ physical and psychological aspects) assessed, “Were participants' physical and psychological aspects considered in the design?” Experts raised concerns about when these aspects should be addressed. This item showed a CVR of 0.60 and a CV of 0.31, emphasizing the need for more concrete criteria.
Item 19 (Part 4. Make script and case information: prediction of learner scenarios before simulation starts) asked, “Could the learner predict the scenarios before the simulation started?” Some experts suggested modifying the question to assess whether the learner had sufficient information about the scenarios rather than focusing on prediction. This item recorded a CVR of 0.60 and a CV of 0.31.
For Item 24 (Part 4. Make script and case information: planning for simulator [patient] reactions based on learner assessments and critical decision-making), the question was, “Is the simulator (patient) programmed to respond based on the learner’s assessment and critical decision-making?” This item showed a CVR of 0.60 and a CV of 0.31.
Item 32 (Part 6. Facilitation: clarity of facilitator’s role) asked, “Was the facilitator’s role clearly defined?” Experts indicated that the description of the facilitator’s role was vague and recommended clarifying the terminology. This item recorded a CVR of 0.40 and a CV of 0.38.
Item 34 (Part 7. Expected outcomes and evaluation: indication of evaluation tool reliability) asked, “Was the reliability of the evaluation tool indicated?” Experts stated that reliability is important; however, the item did not sufficiently define the methods to ensure and demonstrate reliability. They recommended clearer explanations and criteria to validate the reliability of the evaluation tools. In addition, some experts expressed concerns about the difficulty of presenting objective reliability within the scenario and questioned whether it was necessary to indicate the reliability of the scenario. This item recorded a CVR of 0.20 and a CV of 0.33.
For Item 35 (Part 8. Development of scenario: identification of scenario developers), which asked, “Were the scenario developers identified?” no additional feedback was given, but the item recorded a CVR of 0.40 and a CV of 0.38, indicating the need for further action. Based on expert opinions and a follow-up meeting among the researchers, we decided to retain Item 35 as an optional section rather than delete it.
Finally, Item 36 (Part 8. Development of scenario: presentation of the scenario’s foundation and theoretical framework), which asked, “Is the foundation and theoretical framework of the scenario presented?” showed a moderate level of agreement among experts. However, an additional opinion was raised suggesting that “it would be sufficient to present the scenario’s foundation based on clinically frequent occurrences or evidence-based knowledge of the task.” After a meeting among the research team members, it was concluded that “it is not necessary to present the theoretical framework,” leading to the deletion of this item.
This analysis revealed that certain items exhibited deficiencies in validity and consistency according to expert opinions, highlighting the need for further review and adjustment. In Round 1, the KW among expert responses was very low at 0.019 (chi-square [χ2]=94.26, p=.040) (Table 1).
2. Round 2 Real-Time Delphi Survey Results
In the Round 2 RTD survey, expert panel opinions were collected for 36 items, which were evaluated using the CVR and CV. In this survey, CVR values ranged from 0.60 to 1.00, and CV values ranged from 0.00 to 0.13. Most items showed improvements compared with Round 1, reflecting revisions made to the nursing simulation scenario template based on expert feedback. However, one item did not fully meet the validity and consistency criteria, indicating the need for further refinement.
For instance, Item 35 (Part 8. Development of scenario: identification of scenario developers), which stated, “The scenario developers were identified,” had a CVR of 0.60 and a CV of 0.13. Although this item displayed an improved CVR compared with Round 1, the relatively low agreement indicates that further action is needed to reach a higher level of expert consensus. Based on expert feedback, we decided, through internal meetings among the researchers, to keep Item 35 as an optional item rather than remove it.
The improvements in Round 2, as demonstrated by the increased agreement and alignment for most items, indicate that the revisions made to the template after Round 1 were effective in addressing many initial concerns. In Round 2, the KW among expert responses was significantly higher at 0.739 (χ2=377.022, p<.001), indicating a substantial increase in agreement compared to Round 1 (Table 1).
The final version of the completed Nursing Simulation Scenario Evaluation Tool is provided in Appendix 1.
The boxplots in Figure 3 and Figure 4 illustrate the distribution of responses for each item across the two rounds of the Delphi survey. The first set of boxplots visualizes the experts’ responses to 37 items on a 3-point scale, while the second set represents 36 items on a 4-point scale in the second round of the survey. Both boxplots highlight key statistical features, including the mean (black dots), SD (red lines), and interquartile range (IQR, orange boxes), which capture the central tendency and variability of the responses.
In the first round (Figure 3), items with broader IQRs, larger SDs, and lower mean values indicated areas with potential discrepancies in expert consensus or item clarity, particularly Items 10, 32, 34, and 35. For some items, the absence of orange IQR boxes indicates a very high level of agreement among experts, where the first and third quartile values are identical. This pattern suggests strong consensus among panel members for those items (Figure 3).
In comparison, the second-round boxplots (Figure 4) reveal a reduction in the SD and IQR for most items. This indicates improved consensus and greater alignment among the experts’ opinions between the two rounds. Thus, it can be inferred that the modifications made after the first round contributed to narrowing the variability and achieving stronger agreement in the second round (Figure 4).
Although the number of visible IQR boxplots increased compared with that in the first round, this does not indicate lower agreement. Rather, it reflects the use of a more detailed 4-point scale in the second round, which allowed experts to express their opinions more precisely.
DISCUSSION
This study aimed to develop a tool to evaluate simulation scenarios in nursing programs. The initial ET was developed based on the INACSL Standards of Best Practice, specifically focusing on Simulation Design and Participant Evaluation standards, as well as a comprehensive review of the existing literature. Using the RTD technique, eight key areas and 36 items were identified.
This study is significant because a consensus was reached on the content and tools that could be used to evaluate the validity of scenarios used in nursing simulation programs.
To date, the evaluation of simulation practice programs has primarily been used to measure learning effects regarding students’ knowledge, skills, and attitudes, as well as to analyze qualitative data such as student satisfaction and practice experience revealed during the debriefing process [2,6,16,17].
It is challenging to establish a scientific basis for evaluating the validity of an entire simulation program based solely on its effects on learning and experience. Simulation-based learning involves the development of applicable knowledge, skills, and attitudes through experiences in environments similar to real situations. The most critical influence on this process is the scenario. However, no commonly used ET exists for simulation scenarios.
The tool was designed as a self-assessment instrument to allow simulation educators and scenario developers to independently evaluate the completeness, clarity, and validity of their simulation scenarios. A self-assessment approach offers several advantages: it enables continuous quality improvement without the need for external reviewers, fosters reflective practices among educators, and enhances practicality and accessibility in diverse educational environments. During development, the clarity, objectivity, and ease of interpretation of each item were carefully considered to minimize bias and ensure consistent application across various simulation contexts.
Therefore, reaching a consensus among experts on the tool used to assess simulation scenarios is significant. SE utilizes various methods, including low-fidelity and high-fidelity modalities, standardized patients (SPs), and virtual reality (VR). Universities often use scenarios developed by faculty within each discipline, sometimes bundled with commercial simulation programs. Although scenarios packaged with simulators apply the teaching method presented by the INACSL, they are limited when applied across different learning environments. According to Waxman [11], educators must develop scenarios that imitate real-life situations in controlled environments to cultivate optimal competencies. Developed scenarios require standardization using tools that evaluate usefulness and feasibility. These standards are essential for applying scenarios to various situations and cases, particularly in the context of interprofessional education (IPE). IPE involves collaborative learning among students from different healthcare professions, which is crucial for preparing them to work effectively in interdisciplinary teams. Thus, the scenario ET derived in this study serves as a foundational step toward establishing standardized criteria for simulation scenarios, ensuring that they are beneficial across multiple disciplines.
The scenario ET constructed in this study integrates INACSL standards and existing research [11,12] and includes sections such as “overview of scenario,” “learning objectives,” “preparation and pre-briefing,” and “script creation and case information.” A total of eight areas and 36 items were developed, including “debriefing plan,” “facilitation,” “expected outcomes and evaluation,” and “development of scenario.” Page-Cutrara [12] reviewed existing simulation ETs and categorized “patient-centered care and assessment,” “evidence-based nursing intervention and clinical judgment,” “communication and teamwork,” and “safety” as elements to be included in the simulation scenario. The guidelines for scenario development by Waxman [11] include “learning objectives,” “assessment plan and instrument,” “evidence base for objectivity and assessment,” “prescenario learner activities,” “general debriefing plan,” and “validation”; the areas of “testing,” “facilitation,” and “debriefing” were also presented. In this study, overlapping areas were integrated based on the literature and INACSL standards.
The evaluation area presented by Page-Cutrara [12] was biased toward scenario-centered evaluations. The guidelines for developing simulation scenarios by Waxman [11] are valuable because they demonstrate not only the scenarios in the study but also the process guidelines that can be used when developing simulation programs. Additionally, a checklist consisting of four areas was presented to evaluate the feasibility of the scenario: curricular integration, scenario script, simulation team information, and debriefing. However, these standards are not sufficiently specific for use as ETs, and they are difficult to apply universally to various methods and situations, such as low-fidelity, high-fidelity, SPs, and IPE.
In this study, the two Delphi surveys led to the deletion or revision of one item with overlapping content and two items judged inappropriate, resulting in a final set of 36 items.
Of the original 37 items, 33 achieved a CVR of 0.80 or higher, indicating strong expert consensus on the tool’s component areas and items. However, in the first survey, content validity was low in some subcategories of the facilitation, expected outcomes and evaluation, and development of scenario areas. Specifically, clarification of the facilitator’s role within the facilitation area was weak. SE emphasizes self-directed learning and the use of learners’ experiences, including shared needs assessment, goal setting, the development and implementation of learning plans, and the evaluation of learning outcomes. In this process, the instructor guides adult learners, helping them apply their prior experience to real-world problems and understand the rationale behind learning [11]. Accordingly, evaluation of the facilitation section is essential. Through discussion, the research team revised and clarified this section; in the second survey, it achieved a CVR of 1.0. Retaining this item implies that the instructor’s role must be specified in detail when developing future simulation programs and scenarios.
Among items with a CVR of 0.60 or lower, “consideration of patient safety in scenarios” in the “overview of scenario” area and “consideration of participants’ physical and psychological aspects” in the “preparation and pre-briefing plan” area conveyed duplicative content. These items were revised and integrated into the question “Was the simulation designed considering the physical and mental stability of the participants?” to emphasize the importance of safety and participant well-being in simulation design. The reliability of simulation-based assessments can also be supported by frameworks such as the objective structured clinical examination (OSCE), which allows consistent and objective evaluation of participants’ performance, even by assessors without prior program design or field-specific expertise [18]. Therefore, adopting structured evaluation principles similar to those of the OSCE can enhance the objectivity and reproducibility of simulation scenario assessments.
Moreover, Page-Cutrara [12] identified safety as a scenario evaluation area; therefore, in this study, the two aforementioned items were combined into one, which achieved a CVR of 0.80 and was ultimately adopted in the second survey. In this study, the items were modified and supplemented so that the scenarios of the simulation program could be universally evaluated.
As simulation learning involves generating practical knowledge through experience, the validity of the scenario content must have a scientific basis. Therefore, the tool developed in this study can be used effectively and universally to evaluate simulation scenarios.
This study aimed to develop a tool for evaluating simulation scenarios using a Delphi survey involving experts with extensive SE experience. Through this process, the quality of SE can be improved by establishing the validity of simulation practice education and specifying standards through evaluation. The tool can be used as a standard to determine the level and scope of scenario content when developing simulation programs for each major field.
Although the RTD method has advantages such as anonymity, iterative feedback, and efficiency, it also has limitations. Reliance on a relatively small number of experts may restrict the generalizability of the findings, and repeated surveys can contribute to participant fatigue, potentially influencing the quality of responses. Furthermore, this study was conducted with a limited panel size within a specific national context, which may limit the broader applicability of the results. Future research should therefore expand the diversity and number of expert participants to strengthen the robustness and external validity of the tool.
CONCLUSION
This study developed an ET for simulation program scenarios based on consensus from a group of experts. First, an integrated tool was developed by reviewing the existing literature, and eight areas and 36 items were confirmed by gathering experts’ opinions. This tool is significant because it improves the quality of SE by ensuring validity in simulation-based practice education and by specifying standards through evaluation. Furthermore, the NS-SET developed in this study integrates the Google Gemini AI to enable intelligent, automated scenario evaluations. This system automates the entire process (from PDF file upload to the AI’s in-depth analysis and presentation of evaluation results), substantially increasing evaluation efficiency and objectivity. The AI provides not only an evaluation of each item but also the rationale for each assessment, ensuring transparency and offering concrete insights for scenario improvement. However, because this study confirmed only the content validity of the tool using the Delphi technique, future work should assess the validity and reliability of the scenario ET. Based on this tool, we hope that applied studies of simulation programs and scenarios will continue.
ARTICLE INFORMATION
Figure 1. Example of the round 1 Real-Time Delphi survey.
Figure 2. Flowchart of the Real-Time Delphi survey. CVR, content validity ratio; INACSL, International Nursing Association for Clinical Simulation and Learning.
Figure 3. Boxplot of experts’ responses (round 1).
Figure 4. Boxplot of experts’ responses (round 2).
Table 1. Results of the round 1 and 2 Real-Time Delphi for the development of the nursing simulation scenario evaluation tool

| Item | Round 1 CVR | Round 1 CV | Decision | Round 2 CVR | Round 2 CV |
|---|---|---|---|---|---|
| Part 1. Overview of scenario | | | | | |
| 1. Format conformity with scenario template | 1.00 | 0.00 | Retained | 1.00 | 0.13 |
| 2. Completeness of content in template | 0.80 | 0.21 | Retained | 1.00 | 0.13 |
| 3. Case plausibility | 1.00 | 0.00 | Retained | 1.00 | 0.13 |
| 4. Use of evidence-based data | 0.60 | 0.24 | Revised | 1.00 | 0.13 |
| 5. Consideration of patient safety in scenarios | 1.00 | 0.00 | Retained | 1.00 | 0.00 |
| Part 2. Learning objectives | | | | | |
| 6. Specificity of learning objectives | 1.00 | 0.00 | Retained | 1.00 | 0.13 |
| 7. Measurability of specific skills and behaviors | 1.00 | 0.00 | Retained | 1.00 | 0.13 |
| 8. Attainability considering participants’ knowledge and experience | 0.80 | 0.10 | Retained | 1.00 | 0.13 |
| 9. Relevance to participants’ knowledge and experience | 0.60 | 0.24 | Retained | 0.80 | 0.13 |
| 10. Timeliness of objectives presentation | 0.40 | 0.38 | Revised | 0.80 | 0.13 |
| Part 3. Preparation & pre-briefing plan | | | | | |
| 11. Consideration of participants’ physical and psychological aspects | 0.60 | 0.31 | Revised | 0.80 | 0.13 |
| 12. Inclusion of pre-existing knowledge and attitudes of participants | 1.00 | 0.00 | Retained | 1.00 | 0.13 |
| 13. Appropriate type of simulators | 1.00 | 0.00 | Retained | 1.00 | 0.13 |
| 14. Suitable environment and settings | 1.00 | 0.00 | Retained | 1.00 | 0.13 |
| 15. Adequate equipment and tools for simulation education | 0.80 | 0.21 | Retained | 1.00 | 0.13 |
| 16. Clear delineation of roles for both simulator and participants | 1.00 | 0.00 | Retained | 1.00 | 0.13 |
| 17. Provision of pre-briefing plan checklist | 0.80 | 0.21 | Retained | 1.00 | 0.13 |
| 18. Techniques for establishing psychological environment in pre-briefing plan | 0.60 | 0.24 | Retained | 0.80 | 0.13 |
| Part 4. Make script & the case information | | | | | |
| 19. Prediction of learner scenarios before simulation starts | 0.60 | 0.31 | Revised | 1.00 | 0.13 |
| 20. Providing a realistic starting point | 1.00 | 0.00 | Retained | 1.00 | 0.13 |
| 21. Providing learner nursing interventions based on patient condition | 0.80 | 0.10 | Retained | 1.00 | 0.13 |
| 22. Planning for positive/negative changes in simulator situation by learner interventions | 0.80 | 0.10 | Retained | 1.00 | 0.13 |
| 23. Facilitating when learner interventions do not occur | 1.00 | 0.00 | Retained | 1.00 | 0.13 |
| 24. Planning for simulator (patient) reactions based on learner assessments and critical decision-making | 0.60 | 0.31 | Revised | 1.00 | 0.13 |
| 25. Confirmation of patient stability | 1.00 | 0.00 | Retained | 1.00 | 0.13 |
| 26. Explanation of team roles for learners | 0.80 | 0.21 | Retained | 1.00 | 0.13 |
| 27. Realism and adequacy of patient situation, data, and records | 0.80 | 0.21 | Retained | 1.00 | 0.13 |
| 28. Is the patient’s situation, data, and records sufficient to perform interventions? | 1.00 | 0.00 | Retained | 1.00 | 0.13 |
| Part 5. Debriefing plan | | | | | |
| 29. Debriefing questions that confirm objectives and expected outcomes | 1.00 | 0.00 | Retained | 1.00 | 0.13 |
| 30. Planned debriefing methods | 1.00 | 0.00 | Retained | 1.00 | 0.13 |
| 31. Debriefing content easily accessible to educators | 0.80 | 0.21 | Retained | 1.00 | 0.13 |
| Part 6. Facilitation | | | | | |
| 32. Clarity of facilitator’s role | 0.40 | 0.38 | Revised | 1.00 | 0.13 |
| Part 7. Expected outcomes & evaluation | | | | | |
| 33. Clarity of evaluation tools for assessing goal achievement | 1.00 | 0.00 | Retained | 1.00 | 0.13 |
| 34. Indication of evaluation tool reliability | 0.20 | 0.33 | Revised | 1.00 | 0.13 |
| Part 8. Development of scenario | | | | | |
| 35. Identification of scenario developers | 0.40 | 0.38 | Revised | 0.60 | 0.13 |
| 36. Presentation of scenario’s foundation and theoretical framework | 0.80 | 0.21 | Deleted | - | - |
| 37. Scenario validity | 1.00 | 0.00 | Retained | 1.00 | 0.13 |
| W value | 0.019 | | | 0.73926 | |
| χ2 value | 94.26 | | | 377.022 | |
| p | .040 | | | <.001 | |
REFERENCES
- 1. McGaghie WC, Issenberg SB, Petrusa ER, Scalese RJ. Revisiting ‘A critical review of simulation-based medical education research: 2003-2009’. Med Educ. 2016;50(10):986-991. https://doi.org/10.1111/medu.12795
- 2. Park S, Hur HK, Chung C. Learning effects of virtual versus high-fidelity simulations in nursing students: a crossover comparison. BMC Nurs. 2022;21(1):100. https://doi.org/10.1186/s12912-022-00878-2
- 3. McCaughey CS, Traynor MK. The role of simulation in nurse education. Nurse Educ Today. 2010;30(8):827-832. https://doi.org/10.1016/j.nedt.2010.03.005
- 4. Kim EJ, Cho KM, Song SS. Child nursing simulation scenario content analysis: a directed qualitative content analysis. Clin Simul Nurs. 2024;87:101488. https://doi.org/10.1016/j.ecns.2023.101488
- 5. Kim EJ, Lee MH, Park B. Developing a pediatric nursing simulation scenario template in South Korea: applying Real-Time Delphi methods. Child Health Nurs Res. 2024;30(2):142-153. https://doi.org/10.4094/chnr.2024.012
- 6. Hanshaw SL, Dickerson SS. High fidelity simulation evaluation studies in nursing education: a review of the literature. Nurse Educ Pract. 2020;46:102818. https://doi.org/10.1016/j.nepr.2020.102818
- 7. Bambini D. Writing a simulation scenario: a step-by-step guide. AACN Adv Crit Care. 2016;27(1):62-70. https://doi.org/10.4037/aacnacc2016986
- 8. Aengenheyster S, Cuhls K, Gerhold L, Heiskanen-Schüttler M, Huck J, Muszynska M. Real-Time Delphi in practice: a comparative analysis of existing software-based tools. Tech Forecast Soc Chang. 2017;118:15-27. https://doi.org/10.1016/j.techfore.2017.01.023
- 9. Varndell W, Fry M, Elliott D. Applying Real-Time Delphi methods: development of a pain management survey in emergency nursing. BMC Nurs. 2021;20(1):149. https://doi.org/10.1186/s12912-021-00661-9
- 10. Jünger S, Payne SA, Brine J, Radbruch L, Brearley SG. Guidance on Conducting and REporting DElphi Studies (CREDES) in palliative care: recommendations based on a methodological systematic review. Palliat Med. 2017;31(8):684-706. https://doi.org/10.1177/0269216317690685
- 11. Waxman KT. The development of evidence-based clinical simulation scenarios: guidelines for nurse educators. J Nurs Educ. 2010;49(1):29-35. https://doi.org/10.3928/01484834-20090916-07
- 12. Page-Cutrara K. Prebriefing in nursing simulation: a concept analysis. Clin Simul Nurs. 2015;11(7):335-340. https://doi.org/10.1016/j.ecns.2015.05.001
- 13. Lawshe CH. A quantitative approach to content validity. Pers Psychol. 1975;28(4):563-575. https://doi.org/10.1111/j.1744-6570.1975.tb01393.x
- 14. Lee J. Delphi method. Kyoyukkwahaksa; 2001.
- 15. Kendall MG. A new measure of rank correlation. Biometrika. 1938;30(1/2):81-93. https://doi.org/10.2307/2332226
- 16. Fawaz MA, Hamdan-Mansour AM. Impact of high-fidelity simulation on the development of clinical judgment and motivation among Lebanese nursing students. Nurse Educ Today. 2016;46:36-42. https://doi.org/10.1016/j.nedt.2016.08.026
- 17. Yuan HB, Williams BA, Fang JB. The contribution of high-fidelity simulation to nursing students’ confidence and competence: a systematic review. Int Nurs Rev. 2012;59(1):26-33. https://doi.org/10.1111/j.1466-7657.2011.00964.x
- 18. García-Mayor S, Quemada-González C, León-Campos Á, Kaknani-Uttumchandani S, Gutiérrez-Rodríguez L, Del Mar Carmona-Segovia A, et al. Nursing students’ perceptions on the use of clinical simulation in psychiatric and mental health nursing by means of objective structured clinical examination (OSCE). Nurse Educ Today. 2021;100:104866. https://doi.org/10.1016/j.nedt.2021.104866
Appendix
Appendix 1.
- Nursing Simulation Scenario Evaluation Tool, for development of an artificial intelligence-based nursing simulation scenario evaluation tool using the Real-Time Delphi method
Nursing Simulation Scenario Evaluation Tool (V2)
This tool is designed to evaluate the validity and reliability of a nursing simulation scenario. Please rate each item accordingly.
I. Overview of scenario
1. The scenario was constructed according to the standard scenario template.
□ Strongly agree □ Somewhat agree □ Disagree
2. All contents of the scenario template have been completed.
□ Strongly agree □ Somewhat agree □ Disagree
3. The case in the scenario is appropriate for the learning objectives.
□ Strongly agree □ Somewhat agree □ Disagree
4. The scenario contents were developed based on practical evidence.
□ Strongly agree □ Somewhat agree □ Disagree
5. Was patient safety considered in all aspects of the scenario?
□ Strongly agree □ Somewhat agree □ Disagree
II. Learning objectives
6. The learning objectives were presented specifically.
□ Strongly agree □ Somewhat agree □ Disagree
7. Specific skills and behaviors were presented to be measurable.
□ Strongly agree □ Somewhat agree □ Disagree
8. The objectives were made achievable considering the participants' knowledge and experience.
□ Strongly agree □ Somewhat agree □ Disagree
9. The learning objectives were presented considering the participants’ level and experience.
□ Strongly agree □ Somewhat agree □ Disagree
10. The objectives were planned to be achieved quickly and effectively within the time constraints.
□ Strongly agree □ Somewhat agree □ Disagree
III. Preparation & pre-briefing plan
11. The simulation was designed considering the physical and psychological stability of the participants.
□ Strongly agree □ Somewhat agree □ Disagree
12. Information about the participants’ prior knowledge and attitudes was included.
□ Strongly agree □ Somewhat agree □ Disagree
13. The appropriate type of simulators was selected.
□ Strongly agree □ Somewhat agree □ Disagree
14. A suitable environment and settings were selected.
□ Strongly agree □ Somewhat agree □ Disagree
15. All appropriate equipment and tools for simulation education were selected.
□ Strongly agree □ Somewhat agree □ Disagree
16. The roles of both the simulators and participants were clearly defined.
□ Strongly agree □ Somewhat agree □ Disagree
17. A checklist for the pre-briefing plan was prepared.
□ Strongly agree □ Somewhat agree □ Disagree
18. Techniques for establishing a psychological environment in the pre-briefing plan were described.
□ Strongly agree □ Somewhat agree □ Disagree
IV. Make script & the case information
19. Participants can predict the scenario before the simulation begins.
□ Strongly agree □ Somewhat agree □ Disagree
20. A realistic starting point for the nursing intervention was presented.
□ Strongly agree □ Somewhat agree □ Disagree
21. Nursing interventions based on the patient’s condition were provided.
□ Strongly agree □ Somewhat agree □ Disagree
22. The scenario was designed to reflect positive or negative changes in the simulator based on participants’ interventions.
□ Strongly agree □ Somewhat agree □ Disagree
23. Plans were made to facilitate interventions when participants did not intervene.
□ Strongly agree □ Somewhat agree □ Disagree
24. Plans were made to alter the simulator’s (patient’s) response based on participants’ assessment and critical decision-making.
□ Strongly agree □ Somewhat agree □ Disagree
25. Participants can confirm the simulator’s stable condition.
□ Strongly agree □ Somewhat agree □ Disagree
26. The roles of the participants in teams were clearly explained.
□ Strongly agree □ Somewhat agree □ Disagree
27. The patient’s situation data and records appeared realistic.
□ Strongly agree □ Somewhat agree □ Disagree
28. The patient’s situation data and records were sufficiently presented to perform interventions.
□ Strongly agree □ Somewhat agree □ Disagree
V. Debriefing plan
29. The debriefing questions were designed to confirm objectives and expected outcomes.
□ Strongly agree □ Somewhat agree □ Disagree
30. Specific debriefing methods were planned.
□ Strongly agree □ Somewhat agree □ Disagree
31. Debriefing content was made easily accessible to the participants.
□ Strongly agree □ Somewhat agree □ Disagree
VI. Facilitation
32. The role of the facilitator (instructor) was clearly defined.
□ Strongly agree □ Somewhat agree □ Disagree
VII. Expected outcomes & evaluation
33. The evaluation tool was clearly presented to assess the achievement of the objectives.
□ Strongly agree □ Somewhat agree □ Disagree
34. The evaluation tool accurately assesses the achievement of the scenario’s objectives.
□ Strongly agree □ Somewhat agree □ Disagree
VIII. Development of scenario
35. The scenario developers were identified (option).
□ Yes □ No
36. The validity of the scenario was confirmed through a process (e.g., content validity index).
□ Yes □ No