
Original Article

Development of an artificial intelligence-based nursing simulation scenario evaluation tool: a methodological study using the Real-Time Delphi method in South Korea

Child Health Nursing Research 2025;31(4):257-271.
Published online: September 23, 2025
 

1Associate Professor, Department of Nursing, Gangneung-Wonju National University, Wonju, Korea

2Professor, Department of Nursing, Gangdong University, Eumseong, Korea

3Professor, Department of Nursing, Gangneung-Wonju National University, Wonju, Korea

4Professor, Department of Nursing, Inha University, Incheon, Korea

5PhD Student, Department of Nursing, Gangneung-Wonju National University, Wonju, Korea

Corresponding author: Bitna Park, Department of Nursing, Gangdong University, 278 Daehak-gil, Gamgok-myeon, Eumseong 27600, Korea. Tel: +82-43-879-3043, Fax: +82-43-879-3021, E-mail: bnabark@hanmail.net
• Received: July 15, 2025   • Accepted: September 8, 2025

© 2025 Korean Academy of Child Health Nursing.

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial and No Derivatives License (https://creativecommons.org/licenses/by-nc-nd/4.0/), which permits unrestricted non-commercial use, distribution of the material without any modifications, and reproduction in any medium, provided the original work is properly cited.

  • Purpose
    Simulation-based education plays a critical role in nursing by allowing students to acquire clinical competencies in a safe and controlled environment. However, current evaluation tools for simulation scenarios often lack standardization, resulting in inconsistencies when assessing the effectiveness of such programs.
  • Methods
    This study aimed to develop a comprehensive Nursing Simulation Scenario Evaluation Tool using the Real-Time Delphi method. A panel of 10 experts in nursing and simulation education participated in two rounds of surveys. The evaluation criteria were derived from the International Nursing Association for Clinical Simulation and Learning Standards of Best Practice and relevant literature. Survey items were refined through expert consensus using content validity ratios and coefficient of variation values. The finalized tool was further enhanced with artificial intelligence (AI)–based evaluation capabilities to support objective and systematic assessment. The tool was registered and patented in the Republic of Korea (Korean Intellectual Property Office Registration No. 10-2024-0051234) to acknowledge its innovation and technical merit.
  • Results
    The process resulted in an evaluation tool comprising eight key domains and 36 items, covering scenario structure, learning objectives, preparation, script development, debriefing, facilitation, expected outcomes, and scenario validity. A Kendall’s coefficient of concordance of 0.739 indicated strong agreement among the experts.
  • Conclusion
    This study successfully developed a standardized and validated tool to improve the reliability and effectiveness of simulation-based education in nursing. The tool addresses a key gap in current educational practices and enhances consistency in evaluating nursing simulation scenarios. Future studies should focus on validating its application across diverse educational environments.
Despite the continuous development and increasing complexity of the medical field in the 21st century, the basic structure and format of medical education have remained largely unchanged since the late 19th century [1]. This is particularly significant in nursing, where nurses play a crucial role in ensuring patient safety and health. Nurses must acquire the necessary competencies across various situations before engaging in clinical practice with patients [2]. Traditional knowledge tests play an important role in the prelearning phase of simulation-based education, but are insufficient on their own. In simulation-based education, learners apply previously acquired knowledge and skills to realistic scenarios that better prepare them for the complexities of real-world clinical practice [1].
Moreover, clinical practice involving patients fails to measure the diverse competencies required for effective patient care, imposes significant stress on learners, and increases concerns about patient safety [2]. Accordingly, the importance of simulation education (SE) in nursing has been emphasized. It enables learners to experience various clinical scenarios in a safe environment, thereby enhancing their clinical performance. The introduction and utilization of SE in nursing help students acquire the clinical reasoning skills required to navigate complex, unpredictable clinical situations, meet the varied requirements of modern medical services, and ensure patient safety through educational innovation [3].
However, several important factors must be considered to ensure the effectiveness of SE. Currently, the validity and reliability of simulation scenarios are primarily determined through evaluations by a small group of experts. Moreover, adherence to the International Nursing Association for Clinical Simulation and Learning (INACSL) Standards of Best Practice—which provide internationally recognized guidelines for simulation-based education, including Simulation Design, Outcomes and Objectives, Facilitation, Debriefing, and Evaluation—has often been found to be insufficient, particularly in the areas of Simulation Design and Participant Evaluation [4]. A previous study developed a pediatric nursing simulation scenario template to standardize practices in pediatric nursing education [5]. Although it provides a useful framework for scenario development, its focus on pediatric contexts limits its applicability across diverse nursing disciplines [5]. In addition, the absence of a robust evaluation system highlights the need for a more comprehensive and reliable approach to effectively assess simulation scenarios [5]. This undermines the consistency and reproducibility of SE, making it difficult to compare and evaluate educational outcomes. Therefore, setting goals that match learners’ levels, ensuring scenario authenticity and expert validity, and specifying debriefing procedures are necessary [4]. This requires systematic collection and analysis of expert opinions to formulate effective scenario development and evaluation protocols.
The development of a Nursing Simulation Scenario Evaluation Tool (NS-SET) is crucial for ensuring the quality and effectiveness of simulation-based education. NS-SETs are essential for accurately measuring learners’ clinical capabilities and providing a structured framework for evaluating the acquisition of the necessary knowledge, skills, and attitudes [6]. By using these tools, educators can objectively assess learners’ performance, offer personalized feedback, and enhance educational outcomes, thereby supporting educational objectives [2].
Current evaluation tools (ETs) in simulation-based education often rely on limited practical exercises and focus predominantly on knowledge-based assessments. However, this approach does not comprehensively evaluate learners’ clinical skills, such as responsiveness to real patient scenarios [7]. Moreover, the inconsistency and limited reliability of these ETs can undermine the credibility of the evaluation results [6]. Although these tools primarily assess simulation-based education outcomes, there remains a pressing need for standardized tools and methodologies that specifically evaluate the validity and quality of simulation scenarios.
To address these issues, the Real-Time Delphi (RTD) method integrates the opinions of diverse experts to determine the components and structure of the ET scenario [8]. This approach helps balance educational value and practical feasibility by building a broad expert consensus. The RTD method facilitates the collection and analysis of panel members’ opinions through iterative rounds of real-time feedback based on initial responses [8]. Additionally, the Delphi method can enhance data reliability by applying predefined criteria [9]. This process ultimately leads to more precise decision-making and plays a pivotal role in NS-SET development.
Consequently, this study aimed to use the RTD method to assemble a panel of experts, including nursing and simulation specialists, to develop an NS-SET. This endeavor sought to produce a self-checklist for the direct application of the developed scenarios in education, enabling self-evaluation. Through this process, we expect to construct the foundational components for nursing simulations systematically and establish a basis for scenario development.
Ethical statements: This study was approved by the Institutional Review Board (IRB) of Gangneung-Wonju National University (IRB no., GWNUIRB-2023-18). Informed consent was obtained from all participants.
1. Study Design
This study employed the RTD survey method to gather expert panel opinions and derive consensus on the development of an ET for nursing simulation scenarios. This study followed the Guidelines for Conducting and Reporting Delphi Studies (CREDES) [10].
2. Survey Development
The research team developed a nursing simulation scenario template based on the INACSL Standards of Best Practice, specifically the Simulation Design and Evaluation standards, along with findings from previous studies [4]. This template specifies the criteria that must be included in a nursing education simulation scenario. The study also aimed to develop a self-assessment checklist so that developed scenarios could be applied directly in education; accordingly, the literature review examined previous studies on ETs for self-assessment following simulation scenario development. The questionnaire was developed from INACSL content and prior studies and integrated with the guidelines on evidence-based clinical simulation scenarios by Waxman [11] in 2010, the five themes that must be included in nursing simulation programs described by Page-Cutrara [12], and the Colorado Hospital Association’s guidance on how to write a simulation scenario. A total of 35 items were derived to create an ET for simulation scenarios, focusing on eight key areas: overview of the scenario, learning objectives, preparation and pre-briefing plan, creating the script and case information, debriefing, facilitation, expected outcomes and evaluation, and development of the scenario. The overview included four items assessing adherence to the scenario template standards, completeness of the scenario, plausibility of the case, and whether the materials were evidence-based. Learning objectives were assessed to determine whether they were specific, measurable, attainable, relevant, and timely. The preparation and pre-briefing plan consisted of eight items that assessed participants’ physical, mental, and prior-knowledge preparedness, as well as the simulation lab environment, materials, and simulators, including two items related to pre-briefing preparation. Creating the script and case information included 10 items that required checking the scenario algorithm and flowsheet.
The debriefing section assessed whether the questions were specific and verifiable by the educator. One item in the facilitation section asked whether the questions were clearly presented. The expected outcomes and evaluation section asked whether the ETs were clear and whether their reliability was reported. Finally, the development of the scenario assessed whether developers presented theoretical bases and frameworks and verified scenario validity. This tool includes elements of patient-centered care assessment, evidence-based nursing interventions, clinical judgment, communication, teamwork, and safety, as suggested by Page-Cutrara [12], and it incorporates proposals from Kim et al. [4] that the validity and reliability of the scenario must be verified.
3. Study Participants
An expert panel was carefully selected to include individuals with extensive expertise and experience in this topic. The criteria for selection included more than 10 years of nursing education and practice experience or more than 5 years of experience in simulation teaching or research. Ten experts participated in this study.
4. The Real-Time Delphi Survey Process

1) Development of Real-Time Delphi tool

We developed a dedicated website for the RTD technique (https://k-realtimedelphi.net/app/home). Using the open-source low-code development platform Budibase (https://github.com/Budibase/budibase), we built a website hosted on Amazon Web Services. Tasks that had previously been performed manually (such as collecting responses from the expert panel, survey coding, and statistical analysis) were automated on the site, allowing researchers and the expert panel to participate more efficiently. Furthermore, by integrating features that ensure the anonymity of the expert panel and enable the real-time exchange of opinions, we implemented the core elements of the RTD technique.

2) Survey through Real-Time Delphi

This study was conducted in five phases. In Phase 1, the researchers explained the research objectives to the recruited experts, obtained their consent, and conducted the survey simultaneously. Phase 2 involved providing a survey link to the expert panel for the preliminary investigation of the NS-SET and conducting the survey. Phase 3 involved closing the initial results and modifying the ET based on the expert panel’s responses and opinions. In Phase 4, the survey link for the modified and enhanced ETs, based on the expert panel’s opinions, was provided again to conduct the survey. Phase 5 involved completing the survey, summarizing the results, and finalizing the ET. The RTD survey was conducted in two rounds: the first survey took place from December 26, 2023, to January 10, 2024, and the second survey was conducted from February 14, 2024, to February 27, 2024. Ten experts participated in this survey.
5. Data Collection
To derive the objectives of the NS-SET, a preliminary investigation was conducted based on a draft ET developed from a literature review [4]. A Phase 1 Delphi survey was conducted as an initial step in this preliminary investigation. The NS-SET was surveyed and divided into the following parts: (1) overview of the scenario; (2) learning objectives; (3) preparation and pre-briefing plan; (4) script creation and case information; (5) debriefing plan; (6) facilitation; (7) expected outcomes and evaluation; and (8) scenario development. Subsequently, based on the Phase 1 RTD survey findings, the opinions of the expert panel were collected, and the template was modified and enhanced before conducting Phase 2.

1) Round 1 Real-Time Delphi survey

The Phase 1 RTD survey consisted of an initial template divided into eight parts, and experts were asked to review the retention of each part and the items and content within each part. Examples of detailed survey questions were structured as follows: Item 1, “Are the standards of the scenario template being followed?” and Item 2, “Has all content of the template been completed?” The survey responses were categorized as “keep,” “delete,” “modify,” or “additional comments.” The experts could indicate their opinion by selecting “keep” if they felt the contents were suitable for the part, “delete” if they were inappropriate, “modify” if adjustments were necessary, or provide “additional comments” for further input (Figure 1).

2) Round 2 Real-Time Delphi survey

Following the first RTD survey on the development of the nursing simulation scenario template and considering expert opinions, the content validity ratio (CVR) and the coefficient of variation (CV) were evaluated. Based on these findings, the template was revised and enhanced. While maintaining the structure of the template, some content within each part was added or modified based on the panelists’ comments or the CVR values. Unlike the initial survey, the second round had two questions focused on whether each part and its content within the template could be used appropriately. Responses were recorded on a 4-point scale ranging from “perfect agreement” to “perfect disagreement,” and an optional “other” section was provided for additional comments.
6. Data Analysis
Data collected through the two rounds of RTD surveys were analyzed by calculating the CVR and CV for each item using Excel (Microsoft 365; Microsoft Corp.). Interpretation of the CVR values followed predefined criteria [13]. Figure 2 shows a flowchart of the decision-making process based on the analysis results.
The CVR depends on the number of panel members. For the first Delphi survey with 10 panel members, a CVR value of 0.60 or higher was required to consider the content valid. Therefore, items with a CVR value less than 0.60 in the first survey were considered to have low content validity and were subject to deletion or modification. The CVR formula is as follows: CVR=(N_e–N/2)/(N/2), where N is the total number of expert panel members, and N_e is the number of panelists who rated the item as “appropriate”. Stability was calculated using the CV, defined as the standard deviation (SD) of each item divided by its mean. A CV less than 0.5, indicating high consistency in expert responses, was considered stable and reliable [14].
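The two statistics above can be sketched in a few lines of Python (the study used Excel for this step; the panel ratings below are hypothetical examples, not the study's data):

```python
def cvr(n_appropriate: int, n_total: int) -> float:
    """Lawshe's content validity ratio: (N_e - N/2) / (N/2)."""
    return (n_appropriate - n_total / 2) / (n_total / 2)

def cv(ratings: list[float]) -> float:
    """Coefficient of variation: population SD divided by the mean."""
    mean = sum(ratings) / len(ratings)
    sd = (sum((r - mean) ** 2 for r in ratings) / len(ratings)) ** 0.5
    return sd / mean

# Hypothetical item: 8 of 10 panelists rate it "appropriate",
# and the panel's ratings on a 4-point scale are as shown.
print(round(cvr(8, 10), 2))                          # 0.6 -> meets the 0.60 cutoff
print(round(cv([4, 4, 3, 4, 4, 3, 4, 4, 4, 3]), 2))  # 0.12 -> stable (< 0.5)
```

With 10 panelists, the cutoff of 0.60 corresponds to at least 8 of the 10 rating an item as appropriate.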
Kendall’s W (KW) was used to assess the consistency of responses among the experts [15]. The KW values were analyzed using Python ver. 3.9.0 (https://www.python.org/downloads/release/python-390/). The KW value ranges from 0 to 1; a value closer to 1 indicates a high degree of agreement among the evaluators, whereas a value closer to 0 indicates little or no agreement. According to conventional criteria, values below 0.3 indicate low agreement, values between 0.3 and 0.7 indicate moderate agreement, and values above 0.7 indicate strong agreement. Values near zero therefore reflect diverse opinions among the experts and suggest a lack of consensus on the topic.
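The Kendall's W calculation can be illustrated with a minimal Python sketch (the no-ties formula, with the chi-square approximation used for significance testing; the study's own analysis may have applied a tie correction, and the rankings below are hypothetical):

```python
def kendalls_w(rankings: list[list[int]]) -> tuple[float, float]:
    """Kendall's W for m raters each ranking n items (no tie correction).

    Returns (W, chi-square statistic), where chi-square = m*(n-1)*W
    is approximately chi-square distributed with n-1 degrees of freedom.
    """
    m, n = len(rankings), len(rankings[0])
    # Sum the ranks each item received across all raters.
    rank_sums = [sum(r[j] for r in rankings) for j in range(n)]
    mean = sum(rank_sums) / n
    s = sum((rs - mean) ** 2 for rs in rank_sums)
    w = 12 * s / (m ** 2 * (n ** 3 - n))
    return w, m * (n - 1) * w

# Perfect agreement: 10 raters all rank 5 items identically -> W = 1.
w, chi_sq = kendalls_w([[1, 2, 3, 4, 5]] * 10)
print(w, chi_sq)  # 1.0 40.0

# Exactly opposed rankings cancel out -> W = 0.
print(kendalls_w([[1, 2, 3], [3, 2, 1]])[0])  # 0.0
```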
7. AI-Based Evaluation Process
Artificial intelligence (AI) technology was utilized to automate scenario evaluation. The AI system analyzes scenario texts, applies predefined evaluation criteria, and generates scores with justifications for each item. This process enhances the consistency and objectivity of the evaluation results by reducing human subjectivity and fatigue. Instant feedback was also provided, allowing evaluation results to be obtained within seconds to minutes. Google Gemini (Google LLC) was employed as the underlying AI platform to perform natural language analysis and generate automated evaluation outputs.
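The evaluation pipeline described above can be sketched as follows. The prompt wording, model name, and JSON reply format are hypothetical assumptions, not the study's actual implementation; only the prompt construction and reply parsing are shown as runnable code, with the Gemini call itself commented out because it requires an API key:

```python
import json
import re

# Hypothetical subset of evaluation items (from the tool's first domain).
CRITERIA = [
    "Are the standards of the scenario template being followed?",
    "Has all content of the template been completed?",
]

def build_prompt(scenario_text: str) -> str:
    """Combine the evaluation items and the scenario into one prompt."""
    items = "\n".join(f"{i + 1}. {c}" for i, c in enumerate(CRITERIA))
    return (
        "Score the nursing simulation scenario below against each item "
        "on a 1-4 scale and justify each score. Reply as a JSON array "
        'like [{"item": 1, "score": 3, "justification": "..."}].\n\n'
        f"Items:\n{items}\n\nScenario:\n{scenario_text}"
    )

def parse_scores(reply: str) -> list[dict]:
    """Extract the JSON array of per-item scores from the model's reply."""
    match = re.search(r"\[.*\]", reply, re.DOTALL)
    return json.loads(match.group(0)) if match else []

# The model call itself would look roughly like (requires an API key):
#   import google.generativeai as genai
#   genai.configure(api_key="...")
#   reply = genai.GenerativeModel("gemini-1.5-pro").generate_content(
#       build_prompt(scenario_text)).text
reply = '[{"item": 1, "score": 4, "justification": "Template followed."}]'
print(parse_scores(reply)[0]["score"])  # 4
```

Structuring the reply as JSON keeps the per-item scores machine-readable, which is what makes the seconds-to-minutes feedback loop practical.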
1. Round 1 Real-Time Delphi Survey Results
In Round 1 of the RTD survey, expert panel opinions were collected for 37 items evaluated using the CVR and CV. CVR values ranged from 0.20 to 1.00, and CV values ranged from 0.00 to 0.38. Several items did not meet the validity and consistency thresholds, indicating the need for further revision and review.
For item 4 (Part 1. Overview of scenario: use of evidence-based data), the question was, “Did you use evidence-based data?” Experts emphasized the importance of referencing “evidence-based data” and recommended broadening the question. This item recorded a CVR of 0.60 and a CV of 0.24, indicating the need to clarify the definition of evidence-based guidelines.
For Item 10 (Part 2. Learning objectives: timeliness of objectives presentation), which asked, “Were the objectives presented in a timely manner?”, experts stated that the definition of “timeliness” was ambiguous, leading to varying interpretations. They questioned whether timeliness refers to synchronization with simulation events or the overall timing of training. This item recorded a CVR of 0.40 and a CV of 0.38, highlighting the need to refine this definition.
For Item 11 (Part 3. Preparation & pre-briefing plan: consideration of participants’ physical and psychological aspects), which assessed, “Were participants' physical and psychological aspects considered in the design?”, experts raised concerns about when these aspects should be addressed. This item showed a CVR of 0.60 and a CV of 0.31, emphasizing the need for more concrete criteria.
Item 19 (Part 4. Make script and case information: prediction of learner scenarios before simulation starts) asked, “Could the learner predict the scenarios before the simulation started?” Some experts suggested modifying the question to assess whether the learner had sufficient information about the scenarios rather than focusing on prediction. This item recorded a CVR of 0.60 and a CV of 0.31.
For Item 24 (Part 4. Make script and case information: planning for simulator [patient] reactions based on learner assessments and critical decision-making), the question was, “Is the simulator (patient) programmed to respond based on the learner’s assessment and critical decision-making?” This item showed a CVR of 0.60 and a CV of 0.31.
Item 32 (Part 6. Facilitation: clarity of facilitator’s role) asked, “Was the facilitator’s role clearly defined?” Experts indicated that the description of the facilitator’s role was vague and recommended clarifying the terminology. This item recorded a CVR of 0.40 and a CV of 0.38.
For Item 34 (Part 7. Expected outcomes and evaluation: indication of evaluation tool reliability), which asked, “Was the reliability of the evaluation tool indicated?”, experts stated that reliability is important but that the item did not sufficiently define the methods for ensuring and demonstrating it. They recommended clearer explanations and criteria to validate the reliability of the evaluation tools. In addition, some experts expressed concerns about the difficulty of presenting objective reliability within the scenario and questioned whether indicating the scenario’s reliability was necessary. This item recorded a CVR of 0.20 and a CV of 0.33.

For Item 35 (Part 8. Development of scenario: identification of scenario developers), which asked, “Were the scenario developers identified?”, no additional feedback was given, but this item recorded a CVR of 0.40 and a CV of 0.38, indicating the need for further action. Based on expert opinions and a follow-up meeting among the researchers, we decided to retain Item 35 as an optional section rather than delete it.
Finally, Item 36 (Part 8. Development of scenario: presentation of the scenario’s foundation and theoretical framework), which asked, “Is the foundation and theoretical framework of the scenario presented?” showed a moderate level of agreement among experts. However, an additional opinion was raised suggesting that “it would be sufficient to present the scenario’s foundation based on clinically frequent occurrences or evidence-based knowledge of the task.” After a meeting among the research team members, it was concluded that “it is not necessary to present the theoretical framework,” leading to the deletion of this item.
This analysis revealed that certain items exhibited deficiencies in validity and consistency according to expert opinions, highlighting the need for further review and adjustment. In Round 1, the KW among expert responses was very low at 0.019 (chi-square [χ2]=94.26, p=.040) (Table 1).
2. Round 2 Real-Time Delphi
In the Round 2 RTD survey, expert panel opinions were collected for 36 items, which were evaluated using the CVR and CV. In this survey, CVR values ranged from 0.60 to 1.00, and CV values ranged from 0.00 to 0.13. Most items showed improvements compared with Round 1, reflecting revisions made to the nursing simulation scenario template based on expert feedback. However, one item did not fully meet the validity and consistency criteria, indicating the need for further refinement.
For instance, Item 35 (Part 8. Development of scenario: identification of scenario developers), which stated, “The scenario developers were identified,” had a CVR of 0.60 and a CV of 0.13. Although this item displayed an improved CVR compared with Round 1, the relatively low agreement indicates that further action is needed to reach a higher level of expert consensus. Based on expert feedback, we decided, through internal meetings among the researchers, to keep Item 35 as an optional item rather than remove it.
The improvements in Round 2, as demonstrated by the increased agreement and alignment for most items, indicate that the revisions made to the template after Round 1 were effective in addressing many initial concerns. In Round 2, KW among expert responses was significantly higher at 0.739 (χ2=377.022, p<.001), indicating a substantial increase in agreement compared to Round 1 (Table 1).
The final version of the completed Nursing Simulation Scenario Evaluation Tool is provided in Appendix 1.
The boxplots in Figure 3 and Figure 4 illustrate the distribution of responses for each item across the two rounds of the Delphi survey. The first set of boxplots visualizes the experts’ responses to 37 items on a 3-point scale in the first round, while the second set represents 36 items on a 4-point scale in the second round. Both boxplots highlight key statistical features, including the mean (black dots), SD (red lines), and interquartile range (IQR, orange boxes), which capture the central tendency and variability of the responses.
In the first round (Figure 3), items with broader IQRs, larger SDs, and lower mean values indicated areas with potential discrepancies in expert consensus or item clarity, particularly items 10, 32, 34, and 35. For some items, the absence of orange IQR boxes indicates a very high level of agreement among experts, where the first and third quartile values are identical. This pattern suggests strong consensus among panel members for those items (Figure 3).
In comparison, the second-round (Figure 4) boxplots reveal a reduction in the SD and IQR for most items. This indicates improved consensus and greater alignment among the experts’ opinions between the two rounds. Thus, it can be inferred that the modifications made after the first round contributed to narrowing the variability and achieving stronger agreement in the second round (Figure 4).
Although the number of visible IQR boxes increased compared with the first round, this does not indicate lower agreement. Rather, it reflects the use of a more detailed 4-point scale in the second round, which allowed experts to express their opinions more precisely.
This study aimed to develop a tool to evaluate simulation scenarios in nursing programs. The initial ET was developed based on the INACSL Standards of Best Practice, specifically focusing on Simulation Design and Participant Evaluation standards, as well as a comprehensive review of the existing literature. Using the RTD technique, eight key areas and 36 items were identified.
This study is significant because a consensus was reached on the content and tools that could be used to evaluate the validity of scenarios used in nursing simulation programs.
To date, the evaluation of simulation practice programs has primarily been used to measure learning effects regarding students’ knowledge, skills, and attitudes, as well as to analyze qualitative data such as student satisfaction and practice experience revealed during the debriefing process [2,6,16,17].
It is challenging to establish a scientific basis for evaluating the validity of an entire simulation program based solely on its effects on learning and experience. Simulation-based learning involves the development of applicable knowledge, skills, and attitudes through experiences in environments similar to real situations. The most critical influence on this process is the scenario. However, no commonly used ET exists for the simulation scenarios.
The tool was designed as a self-assessment instrument to allow simulation educators and scenario developers to independently evaluate the completeness, clarity, and validity of their simulation scenarios. A self-assessment approach offers several advantages: it enables continuous quality improvement without the need for external reviewers, fosters reflective practices among educators, and enhances practicality and accessibility in diverse educational environments. During development, the clarity, objectivity, and ease of interpretation of each item were carefully considered to minimize bias and ensure consistent application across various simulation contexts.
Therefore, reaching a consensus among experts on the tool used to assess simulation scenarios is significant. SE utilizes various methods, including low-fidelity and high-fidelity modalities, standardized patients (SPs), and virtual reality (VR). Universities often use scenarios developed by faculty within each discipline, sometimes bundled with commercial simulation programs. Although scenarios packaged with simulators apply the teaching method presented by the INACSL, they are limited when applied across different learning environments. According to Waxman [11], educators must develop scenarios that imitate real-life situations in controlled environments to cultivate optimal competencies. Developed scenarios require standardization using tools that evaluate usefulness and feasibility. These standards are essential for applying scenarios to various situations and cases, particularly in the context of interprofessional education (IPE). IPE involves collaborative learning among students from different healthcare professions, which is crucial for preparing them to work effectively in interdisciplinary teams. Thus, the scenario ET derived in this study serves as a foundational step toward establishing standardized criteria for simulation scenarios, ensuring that they are beneficial across multiple disciplines.
The scenario ET constructed in this study integrates INACSL and existing research [11,12] and includes sections such as “overview of scenario,” “learning objectives,” “preparation and pre-briefing,” and “script creation and case information.” A total of eight areas and 36 items were developed, including “debriefing plan,” “facilitation,” “expected outcomes and evaluation,” and “development of scenario.” Page-Cutrara [12] reviewed existing simulation ETs and categorized “patient-centered care and assessment,” “evidence-based nursing intervention and clinical judgment,” “communication and teamwork,” and “safety” as elements to be included in the simulation scenario. The guidelines for scenario development by Waxman [11] include “learning objectives,” “assessment plan and instrument,” “evidence base for objectivity and assessment,” “prescenario learner activities,” “general debriefing plan,” and “validation,” as well as the areas of “testing,” “facilitation,” and “debriefing.” In this study, overlapping areas were integrated based on the literature and INACSL standards.
The evaluation area presented by Page-Cutrara [12] was biased toward scenario-centered evaluations. The guidelines for developing simulation scenarios by Waxman [11] are valuable because they present not only scenarios but also process guidelines that can be used when developing simulation programs. Additionally, a checklist consisting of four areas was presented to evaluate the feasibility of the scenario: curricular integration, scenario script, simulation team information, and debriefing. However, these standards are not sufficiently specific for use as ETs, and they are difficult to apply universally to various methods and situations, such as low-fidelity, high-fidelity, SPs, and IPE.
In this study, two Delphi surveys led to the deletion and revision of one item with overlapping content and two items judged inappropriate, resulting in a final set of 36 items.
Of the original 37 items, 33 achieved a CVR of 0.80 or higher, indicating a strong expert consensus on the tool’s component areas and items. However, in the first survey, content validity was low in some subcategories of facilitation, expected outcomes and evaluation, and scenario development. Specifically, clarification of the facilitator’s role within the facilitation area was weak. SE emphasizes self-directed learning and the use of learners’ experiences, including shared needs assessment, goal setting, the development and implementation of learning plans, and evaluation of learning outcomes. In this process, the instructor guides adult learners, helping them apply their prior experience to real-world problems and understand the rationale behind learning [11]. Accordingly, evaluation of the facilitation section is essential. Through discussion, the research team revised and clarified this section; in the second survey, it achieved a CVR of 1.0. Retaining this item implies that the instructor’s role must be specified in detail when developing future simulation programs and scenarios.
Among items with a CVR of 0.60 or lower, “consideration of patient safety in scenarios” in the “overview of scenario” area and “consideration of participants’ physical and psychological aspects” in the “preparation and prebriefing plan” area conveyed duplicative content. These items were revised and integrated into the single question “Was the simulation designed considering the physical and psychological stability of the participants?” to emphasize the importance of safety and participant well-being in simulation design. The reliability of simulation-based assessments can also be supported by frameworks such as the objective structured clinical examination (OSCE), which allows consistent and objective evaluation of participants’ performance, even by assessors without prior program design or field-specific expertise [18]. Therefore, adopting structured evaluation principles similar to those of the OSCE can enhance the objectivity and reproducibility of simulation scenario assessments.
Moreover, Page-Cutrara [12] identified safety as a scenario evaluation area; therefore, in this study, the two aforementioned items were combined into one, which achieved a CVR of 0.80 in the second survey and was ultimately adopted. The items were thus modified and supplemented so that the scenarios of simulation programs can be evaluated universally.
As simulation learning involves generating practical knowledge through experience, the validity of the scenario content must have a scientific basis. Therefore, the tool developed in this study can be used effectively and universally to evaluate simulation scenarios.
This study developed a tool for evaluating simulation scenarios using a Delphi survey of experts with extensive SE experience. Through this process, the quality of SE can be improved by establishing the validity of simulation practice education and specifying standards through evaluation. The tool can serve as a standard for determining the level and scope of scenario content when developing simulation programs in each major field.
Although the RTD method has advantages such as anonymity, iterative feedback, and efficiency, it also has limitations. Reliance on a relatively small number of experts may restrict the generalizability of the findings, and repeated surveys can contribute to participant fatigue, potentially influencing the quality of responses. Furthermore, this study was conducted with a limited panel size within a specific national context, which may limit the broader applicability of the results. Future research should therefore expand the diversity and number of expert participants to strengthen the robustness and external validity of the tool.
This study developed an evaluation tool (ET) for simulation program scenarios based on consensus from a group of experts. First, an integrated tool was drafted by reviewing the existing literature, and eight areas and 36 items were confirmed by gathering experts’ opinions. This tool is significant because it improves the quality of SE by ensuring validity in simulation-based practice education and by specifying standards through evaluation. Furthermore, the NS-SET developed in this study integrates the Google Gemini AI to enable intelligent, automated scenario evaluations. This system automates the entire process (from PDF file upload to the AI’s in-depth analysis and presentation of evaluation results), substantially increasing evaluation efficiency and objectivity. The AI provides not only an evaluation of each item but also the rationale for each assessment, ensuring transparency and offering concrete insights for scenario improvement. However, because this study confirmed only the content validity of the tool using the Delphi technique, future work should assess the validity and reliability of the scenario ET. Based on this tool, we hope that applied studies of simulation programs and scenarios will continue.
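An automated pipeline of the kind summarized above (scenario upload, AI analysis, per-item results with rationale) could look roughly like the following sketch. The `google-generativeai` package usage, model name, prompt wording, and the `build_prompt`/`evaluate` helpers are illustrative assumptions, not the authors’ implementation; extraction of text from the uploaded PDF is omitted.

```python
# Illustrative sketch only: scoring a scenario's text against NS-SET
# items with the Gemini API. Model id, prompt, and item list are
# assumptions; this is not the authors' implementation.

ITEMS = [
    "The scenario was constructed according to the standard scenario template.",
    "All contents of the scenario template have been completed.",
    # ... remaining NS-SET items ...
]

def build_prompt(scenario_text, items):
    """Assemble an evaluation prompt asking for a rating plus a
    rationale per item, mirroring the tool's requirement that the
    AI justify each assessment."""
    numbered = "\n".join(f"{i}. {item}" for i, item in enumerate(items, 1))
    return (
        "Evaluate the nursing simulation scenario below against each "
        "criterion. For each, answer 'Strongly agree', 'Somewhat agree', "
        "or 'Disagree' and give a one-sentence rationale.\n\n"
        f"Criteria:\n{numbered}\n\nScenario:\n{scenario_text}"
    )

def evaluate(scenario_text, api_key):
    # Deferred import so the sketch runs without the SDK installed;
    # the package and model name are assumptions.
    import google.generativeai as genai
    genai.configure(api_key=api_key)
    model = genai.GenerativeModel("gemini-1.5-pro")  # assumed model id
    return model.generate_content(build_prompt(scenario_text, ITEMS)).text
```

Returning the rating together with its rationale in a single structured response is what makes the per-item transparency described above possible.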

Authors’ contribution

Conceptualization: EJK, BP, GMK, JYL, SKK. Methodology: EJK. Project administration: EJK, BP. Supervision: EJK. Software: SKK. Investigation: SKK. Data curation: SKK. Visualization: SKK. Formal analysis: GMK, SKK. Resources: JYL. Writing–original draft: BP, GMK, JYL. Writing–review and editing: EJK, BP. Final approval of published version: all authors.

Conflict of interest

No existing or potential conflict of interest relevant to this article was reported.

Funding

This study was supported by a National Research Foundation of Korea (NRF) grant funded by the Korean government (No. 2021R1A2C1095530).

Data availability

Please contact the corresponding author for data availability.

Acknowledgements

None.

Figure 1.
Example of round 1 Real-Time Delphi survey.
chnr-2025-022f1.jpg
Figure 2.
Flowchart of the Real-Time Delphi survey. CVR, content validity ratio; INACSL, International Nursing Association for Clinical Simulation and Learning.
chnr-2025-022f2.jpg
Figure 3.
Boxplot of experts’ responses (round 1).
chnr-2025-022f3.jpg
Figure 4.
Boxplot of experts’ responses (round 2).
chnr-2025-022f4.jpg
Table 1.
Results of rounds 1 and 2 of the Real-Time Delphi for the development of the nursing simulation scenario evaluation tool
Item Round 1 Round 2
CVR CV Decision CVR CV
Part 1. Overview of scenario
 1. Format conformity with scenario template 1.00 0.00 Retained 1.00 0.13
 2. Completeness of content in template 0.80 0.21 Retained 1.00 0.13
 3. Case plausibility 1.00 0.00 Retained 1.00 0.13
 4. Use of evidence-based data 0.60 0.24 Revised 1.00 0.13
 5. Consideration of patient safety in scenarios 1.00 0.00 Retained 1.00 0.00
Part 2. Learning objectives
 6. Specificity of learning objectives 1.00 0.00 Retained 1.00 0.13
 7. Measurability of specific skills and behaviors 1.00 0.00 Retained 1.00 0.13
 8. Attainability considering participants’ knowledge and experience 0.80 0.10 Retained 1.00 0.13
 9. Relevance to participants’ knowledge and experience 0.60 0.24 Retained 0.80 0.13
 10. Timeliness of objectives presentation 0.40 0.38 Revised 0.80 0.13
Part 3. Preparation & pre-briefing plan
 11. Consideration of participants’ physical and psychological aspects 0.60 0.31 Revised 0.80 0.13
 12. Inclusion of pre-existing knowledge and attitudes of participants 1.00 0.00 Retained 1.00 0.13
 13. Appropriate type of simulators 1.00 0.00 Retained 1.00 0.13
 14. Suitable environment and settings 1.00 0.00 Retained 1.00 0.13
 15. Adequate equipment and tools for simulation education 0.80 0.21 Retained 1.00 0.13
 16. Clear delineation of roles for both simulator and participants 1.00 0.00 Retained 1.00 0.13
 17. Provision of pre-briefing plan checklist 0.80 0.21 Retained 1.00 0.13
 18. Techniques for establishing psychological environment in pre-briefing plan 0.60 0.24 Retained 0.80 0.13
Part 4. Make script & the case information
 19. Prediction of learner scenarios before simulation starts 0.60 0.31 Revised 1.00 0.13
 20. Providing a realistic starting point 1.00 0.00 Retained 1.00 0.13
 21. Providing learner nursing interventions based on patient condition 0.80 0.10 Retained 1.00 0.13
 22. Planning for positive/negative changes in simulator situation by learner interventions 0.80 0.10 Retained 1.00 0.13
 23. Facilitating when learner interventions do not occur 1.00 0.00 Retained 1.00 0.13
 24. Planning for simulator (patient) reactions based on learner assessments and critical decision-making 0.60 0.31 Revised 1.00 0.13
 25. Confirmation of patient stability 1.00 0.00 Retained 1.00 0.13
 26. Explanation of team roles for learners 0.80 0.21 Retained 1.00 0.13
 27. Realism and adequacy of patient situation, data, and records 0.80 0.21 Retained 1.00 0.13
 28. Is the patient’s situation, data, and records sufficient to perform interventions? 1.00 0.00 Retained 1.00 0.13
Part 5. Debriefing plan
 29. Debriefing questions that confirm objectives and expected outcomes 1.00 0.00 Retained 1.00 0.13
 30. Planned debriefing methods 1.00 0.00 Retained 1.00 0.13
 31. Debriefing content easily accessible to educators 0.80 0.21 Retained 1.00 0.13
Part 6. Facilitation
 32. Clarity of facilitator’s role 0.40 0.38 Revised 1.00 0.13
Part 7. Expected outcomes & evaluation
 33. Clarity of evaluation tools for assessing goal achievement 1.00 0.00 Retained 1.00 0.13
 34. Indication of evaluation tool reliability 0.20 0.33 Revised 1.00 0.13
Part 8. Development of scenario
 35. Identification of scenario developers 0.40 0.38 Revised 0.60 0.13
 36. Presentation of scenario’s foundation and theoretical framework 0.80 0.21 Deleted - -
 37. Scenario validity 1.00 0.00 Retained 1.00 0.13
W value 0.019 0.73926
χ² value 94.26 377.022
p .040 <.001

CV, coefficient of variation; CVR, content validity ratio.
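The agreement statistics at the foot of Table 1 (a W value with an accompanying χ² test) are consistent with Kendall’s coefficient of concordance. The following is a minimal sketch of that computation with hypothetical expert ranks, not the study data; a tie correction, omitted here, is needed when ratings produce tied ranks.

```python
# Hedged sketch of Kendall's W, the concordance statistic apparently
# reported under Table 1. Ranks below are hypothetical.
def kendalls_w(ranks_by_expert):
    """ranks_by_expert: list of m lists, each ranking the same n items.
    W = 12S / (m^2 (n^3 - n)), where S is the sum of squared deviations
    of the item rank totals from their mean. W ranges from 0 (no
    agreement) to 1 (perfect agreement)."""
    m = len(ranks_by_expert)
    n = len(ranks_by_expert[0])
    totals = [sum(expert[i] for expert in ranks_by_expert) for i in range(n)]
    mean_total = sum(totals) / n
    s = sum((t - mean_total) ** 2 for t in totals)
    return 12 * s / (m ** 2 * (n ** 3 - n))

def chi_square_for_w(w, m, n):
    """Significance test for W: chi^2 = m(n-1)W with n-1 degrees of
    freedom, matching the chi-square values reported with W."""
    return m * (n - 1) * w
```

Three experts ranking four items identically gives W = 1.0; the jump from W = 0.019 in round 1 to 0.739 in round 2 reflects the convergence of expert opinion across rounds.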

Appendix 1.
Nursing Simulation Scenario Evaluation Tool, for development of an artificial intelligence-based nursing simulation scenario evaluation tool using the Real-Time Delphi method
Nursing Simulation Scenario Evaluation Tool (V2)
This tool is designed to evaluate the validity and reliability of a nursing simulation scenario. Please rate each item accordingly.
I. Overview of scenario
1. The scenario was constructed according to the standard scenario template.
 □ Strongly agree □ Somewhat agree □ Disagree
2. All contents of the scenario template have been completed.
 □ Strongly agree □ Somewhat agree □ Disagree
3. The case in the scenario is appropriate for the learning objectives.
 □ Strongly agree □ Somewhat agree □ Disagree
4. The scenario contents were developed based on practical evidence.
 □ Strongly agree □ Somewhat agree □ Disagree
5. Was patient safety considered in all aspects of the scenario?
 □ Strongly agree □ Somewhat agree □ Disagree
II. Learning objectives
6. The learning objectives were presented specifically.
 □ Strongly agree □ Somewhat agree □ Disagree
7. Specific skills and behaviors were presented to be measurable.
 □ Strongly agree □ Somewhat agree □ Disagree
8. The objectives were made achievable considering the participants' knowledge and experience.
 □ Strongly agree □ Somewhat agree □ Disagree
9. The learning objectives were presented considering the participants’ level and experience.
 □ Strongly agree □ Somewhat agree □ Disagree
10. The objectives were planned to be achieved quickly and effectively within the time constraints.
 □ Strongly agree □ Somewhat agree □ Disagree
III. Preparation & pre-briefing plan
11. The simulation was designed considering the physical and psychological stability of the participants.
 □ Strongly agree □ Somewhat agree □ Disagree
12. Information about the participants’ prior knowledge and attitudes was included.
 □ Strongly agree □ Somewhat agree □ Disagree
13. The appropriate type of simulators was selected.
 □ Strongly agree □ Somewhat agree □ Disagree
14. A suitable environment and settings were selected.
 □ Strongly agree □ Somewhat agree □ Disagree
15. All appropriate equipment and tools for simulation education were selected.
 □ Strongly agree □ Somewhat agree □ Disagree
16. The roles of both the simulators and participants were clearly defined.
 □ Strongly agree □ Somewhat agree □ Disagree
17. A checklist for the pre-briefing plan was prepared.
 □ Strongly agree □ Somewhat agree □ Disagree
18. Techniques for establishing a psychological environment in the pre-briefing plan were described.
 □ Strongly agree □ Somewhat agree □ Disagree
IV. Make script & the case information
19. Participants can predict the scenario before the simulation begins.
 □ Strongly agree □ Somewhat agree □ Disagree
20. A realistic starting point for the nursing intervention was presented.
 □ Strongly agree □ Somewhat agree □ Disagree
21. Nursing interventions based on the patient’s condition were provided.
 □ Strongly agree □ Somewhat agree □ Disagree
22. The scenario was designed to reflect positive or negative changes in the simulator based on participants’ interventions.
 □ Strongly agree □ Somewhat agree □ Disagree
23. Plans were made to facilitate interventions when participants did not intervene.
 □ Strongly agree □ Somewhat agree □ Disagree
24. Plans were made to alter the simulator’s (patient’s) response based on participants’ assessment and critical decision-making.
 □ Strongly agree □ Somewhat agree □ Disagree
25. Participants can confirm the simulator’s stable condition.
 □ Strongly agree □ Somewhat agree □ Disagree
26. The roles of the participants in teams were clearly explained.
 □ Strongly agree □ Somewhat agree □ Disagree
27. The patient’s situation data and records appeared realistic.
 □ Strongly agree □ Somewhat agree □ Disagree
28. The patient’s situation data and records were sufficiently presented to perform interventions.
 □ Strongly agree □ Somewhat agree □ Disagree
V. Debriefing plan
29. The debriefing questions were designed to confirm objectives and expected outcomes.
 □ Strongly agree □ Somewhat agree □ Disagree
30. Specific debriefing methods were planned.
 □ Strongly agree □ Somewhat agree □ Disagree
31. Debriefing content was made easily accessible to the participants.
 □ Strongly agree □ Somewhat agree □ Disagree
VI. Facilitation
32. The role of the facilitator (instructor) was clearly defined.
 □ Strongly agree □ Somewhat agree □ Disagree
VII. Expected outcomes & evaluation
33. The evaluation tool was clearly presented to assess the achievement of the objectives.
 □ Strongly agree □ Somewhat agree □ Disagree
34. The evaluation tool accurately assesses the achievement of the scenario’s objectives.
 □ Strongly agree □ Somewhat agree □ Disagree
VIII. Development of scenario
35. The scenario developers were identified (optional).
 □ Yes □ No
36. The validity of the scenario was confirmed through a process (e.g., content validity index).
 □ Yes □ No
