From 716125e36d3f67698285d33e378c7932a3fe02cf Mon Sep 17 00:00:00 2001 From: carriewright11 Date: Mon, 21 Apr 2025 18:00:53 -0600 Subject: [PATCH 01/10] updating data type chapters to split into two --- 01-data_types.Rmd | 18 +----------------- 01b-specific-data_types.Rmd | 31 +++++++++++++++++++++++++++++++ _bookdown.yml | 1 + 3 files changed, 33 insertions(+), 17 deletions(-) create mode 100644 01b-specific-data_types.Rmd diff --git a/01-data_types.Rmd b/01-data_types.Rmd index 648a108..643e234 100644 --- a/01-data_types.Rmd +++ b/01-data_types.Rmd @@ -52,21 +52,6 @@ ottrpal::include_slide("https://docs.google.com/presentation/d/1ivDTcLjb2078O0Ge When used in EHR research, both structured data and clinical notes are generally de-identified to protect patient privacy. Patient ID numbers might be replaced with new identifiers, with linkages maintained by institutional “honest brokers” [@Dhir2008] charged with providing clinical data for research purposes. In some cases, dates may be changed as well. Clinical notes are generally “de-identified” through specialized software designed to remove names, dates, locations, and other sensitive details. Researchers working with institutions to access clinical data should be sure to understand local data de-identification practices. 
- -## Specific types of clinical data - - - - -### Physiological -### Monitoring data - -### Radiology - -### Pathology - -### Synthetic Data - ## How to acquire clinical data ### Secondary Sources @@ -79,5 +64,4 @@ When used in EHR research, both structured data and clinical notes are generally ### Metadata - -## Conclusion +## Summary diff --git a/01b-specific-data_types.Rmd b/01b-specific-data_types.Rmd new file mode 100644 index 0000000..7420ff1 --- /dev/null +++ b/01b-specific-data_types.Rmd @@ -0,0 +1,31 @@ +--- +title: Clinical Data Types +--- + +```{r, include = FALSE} +ottrpal::set_knitr_image_path() +``` + + +# Specific Clinical Data Types + + +## Learning Objectives + +```{r, fig.align='center', echo = FALSE, fig.alt= "Learning Objectives: 1. Explain why clinical data is unique compared to other types of biomedical research data, 2. Describe the difference between Structured and Unstructured data, 3. List major sources and types of clinical data", out.width="100%"} +ottrpal::include_slide("https://docs.google.com/presentation/d/1ivDTcLjb2078O0GemkSeCgC1jmxk4fMsiFQaPaer9mQ/edit#slide=id.g3385bea4ad0_0_30" ) +``` + + + +### Physiological + +### Monitoring data + +### Radiology + +### Pathology + +### Synthetic Data + +## Summary \ No newline at end of file diff --git a/_bookdown.yml b/_bookdown.yml index 6f39e81..45c9727 100644 --- a/_bookdown.yml +++ b/_bookdown.yml @@ -4,6 +4,7 @@ repo: https://github.com/jhudsl/DaSL_Course_Template_Bookdown/ rmd_files: ["index.Rmd", "00-intro.Rmd", "01-data_types.Rmd", + "01b-specific-data-types.Rmd" "02-data_uses.Rmd", "03-data_management.Rmd", "04-appendixI.Rmd", From 9e70a341d53380860cc2687ffa29237042abc112 Mon Sep 17 00:00:00 2001 From: carriewright11 Date: Mon, 21 Apr 2025 18:12:21 -0600 Subject: [PATCH 02/10] updating file names --- 01b-specific-data_types.Rmd => 01b-specific_data_types.Rmd | 0 About.Rmd | 4 ++-- _bookdown.yml | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) rename 
01b-specific-data_types.Rmd => 01b-specific_data_types.Rmd (100%) diff --git a/01b-specific-data_types.Rmd b/01b-specific_data_types.Rmd similarity index 100% rename from 01b-specific-data_types.Rmd rename to 01b-specific_data_types.Rmd diff --git a/About.Rmd b/About.Rmd index bd8b31c..4b4526b 100644 --- a/About.Rmd +++ b/About.Rmd @@ -15,8 +15,8 @@ These credits are based on our [course contributors table guidelines](https://gi |Lead Content Instructor(s)|[FirstName LastName](link to personal website)| |Lecturer(s) (include chapter name/link in parentheses if only for specific chapters) - make new line if more than one chapter involved| Delivered the course in some way - video or audio| |Content Author(s) (include chapter name/link in parentheses if only for specific chapters) - make new line if more than one chapter involved | If any other authors besides lead instructor| -|Content Contributor(s) (include section name/link in parentheses) - make new line if more than one section involved| Wrote less than a chapter| -|Content Editor(s)/Reviewer(s) | Checked your content| +|Content Contributor(s) (include section name/link in parentheses) - make new line if more than one section involved| Jennifer Kelleher, Ph.D.; Abigail S. Robbertz, Ph.D.| +|Content Editor(s)/Reviewer(s) | Julia K. 
Herriott, B.S.| |Content Director(s) | Helped guide the content direction| |Content Consultants (include chapter name/link in parentheses or word "General") - make new line if more than one chapter involved | Gave high level advice on content| |Acknowledgments| Gave small assistance to content but not to the level of consulting | diff --git a/_bookdown.yml b/_bookdown.yml index 45c9727..e64f20c 100644 --- a/_bookdown.yml +++ b/_bookdown.yml @@ -4,7 +4,7 @@ repo: https://github.com/jhudsl/DaSL_Course_Template_Bookdown/ rmd_files: ["index.Rmd", "00-intro.Rmd", "01-data_types.Rmd", - "01b-specific-data-types.Rmd" + "01b-specific_data_types.Rmd", "02-data_uses.Rmd", "03-data_management.Rmd", "04-appendixI.Rmd", From 02edeef31f48b6653c22f184048998fdb7c65954 Mon Sep 17 00:00:00 2001 From: carriewright11 Date: Mon, 21 Apr 2025 18:50:08 -0600 Subject: [PATCH 03/10] updating author info --- 01b-specific_data_types.Rmd | 49 +++++++++++++++++++++++++------------ assets/style_ITN.css | 13 ++++++++++ 2 files changed, 47 insertions(+), 15 deletions(-) diff --git a/01b-specific_data_types.Rmd b/01b-specific_data_types.Rmd index 7420ff1..a3a026f 100644 --- a/01b-specific_data_types.Rmd +++ b/01b-specific_data_types.Rmd @@ -1,15 +1,5 @@ ---- -title: Clinical Data Types ---- - -```{r, include = FALSE} -ottrpal::set_knitr_image_path() -``` - - # Specific Clinical Data Types - ## Learning Objectives ```{r, fig.align='center', echo = FALSE, fig.alt= "Learning Objectives: 1. Explain why clinical data is unique compared to other types of biomedical research data, 2. Describe the difference between Structured and Unstructured data, 3. List major sources and types of clinical data", out.width="100%"} @@ -18,14 +8,43 @@ ottrpal::include_slide("https://docs.google.com/presentation/d/1ivDTcLjb2078O0Ge -### Physiological +## Physiological + +## Monitoring data + +:::bluebox +This section was written by: Jennifer Kelleher, Ph.D.^1^; Abigail S. Robbertz, Ph.D.^1^; and Meghan E. 
McGrady, Ph.D.^1,2^ + +**NOTE:** Jennifer Kelleher, Ph.D.^1^ and Abigail S. Robbertz, Ph.D.^1^ contributed equally. + +^1^ Center for Adherence and Self-Management, Division of Behavioral Medicine and Clinical Psychology, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH, USA + +^2^ Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA + +The work discussed in this section was also supported by the National Cancer Institute at the National Institutes of Health (R21CA263704, K07CA200668) to MEM. JK and ASR are supported by the Eunice Kennedy Shriver National Institute of Child Health & Human Development at the National Institutes of Health (T32HD068223). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. +::: + +Electronic monitoring devices are digital tools that can be used to track health behaviors such as: + +* Sleep +* Physical activity +* Medication-taking + +Electronic monitoring devices enable researchers to track day-to-day health behaviors in the patient’s **"real-world" setting**. This allows researchers to explore patterns or changes in a patient’s health behavior and provides a richer understanding of **daily behavior over time**. + + +### Benefits of Monitoring Data + +1) Electronic monitoring devices often include data transmission abilities that enable healthcare providers or researchers to access these data in near **real-time** potentially informing intervention and/or medical decision-making. + +1) Electronic monitoring devices also have the potential to produce **more accurate** estimates of health behaviors than alternative strategies (e.g., self-report) as they are not subject to recall bias and can detect efforts to inflate adherence due to social desirability. 
+ -### Monitoring data -### Radiology +## Radiology -### Pathology +## Pathology -### Synthetic Data +## Synthetic Data ## Summary \ No newline at end of file diff --git a/assets/style_ITN.css b/assets/style_ITN.css index ad7e3b4..6b81382 100644 --- a/assets/style_ITN.css +++ b/assets/style_ITN.css @@ -271,6 +271,19 @@ li.appendix span, li.part span { /* for TOC part names */ /* Sidebar formating --------------------------------------------*/ /* from r-pkgs.org*/ +div.bluebox{ + border: 4px #14395f; + border-style: solid; + padding: 1em; + margin: 1em 0; + padding-left: 40px; + background-size: 70px; + background-repeat: no-repeat; + background-position: 15px center; + background-color: #e8ebee; + text-align: left !important; +} + div.notice{ border: 4px #193a5c; border-style: solid; From 383244b8d8b5b90aefaad711b448e4e0f8b46236 Mon Sep 17 00:00:00 2001 From: carriewright11 Date: Mon, 21 Apr 2025 20:20:56 -0600 Subject: [PATCH 04/10] adding more content --- 01b-specific_data_types.Rmd | 31 +++++++++++++++- book.bib | 73 +++++++++++++++++++++++++++++++++++++ 2 files changed, 102 insertions(+), 2 deletions(-) diff --git a/01b-specific_data_types.Rmd b/01b-specific_data_types.Rmd index a3a026f..4a308ea 100644 --- a/01b-specific_data_types.Rmd +++ b/01b-specific_data_types.Rmd @@ -22,13 +22,21 @@ This section was written by: Jennifer Kelleher, Ph.D.^1^; Abigail S. Robbertz, P ^2^ Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA The work discussed in this section was also supported by the National Cancer Institute at the National Institutes of Health (R21CA263704, K07CA200668) to MEM. JK and ASR are supported by the Eunice Kennedy Shriver National Institute of Child Health & Human Development at the National Institutes of Health (T32HD068223). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. 
+ ::: -Electronic monitoring devices are digital tools that can be used to track health behaviors such as: +Electronic monitoring devices are digital tools that can be used to track health behaviors over time such as: * Sleep * Physical activity -* Medication-taking +* Medication adherence +* Calorie intake + +Electronic monitoring devices can also be used to assess physical health indicators including: +* Blood glucose levels +* Blood pressure +* Heart rate and heart rate variability +* Oxygen saturation Electronic monitoring devices enable researchers to track day-to-day health behaviors in the patient’s **"real-world" setting**. This allows researchers to explore patterns or changes in a patient’s health behavior and provides a richer understanding of **daily behavior over time**. @@ -39,7 +47,26 @@ Electronic monitoring devices enable researchers to track day-to-day health beha 1) Electronic monitoring devices also have the potential to produce **more accurate** estimates of health behaviors than alternative strategies (e.g., self-report) as they are not subject to recall bias and can detect efforts to inflate adherence due to social desirability. +### Considerations + +:::warning +This section is not exhaustive. Research teams are strongly encouraged to consult with experts with experience and training in collecting and analyzing data from specific devices. + +To ensure the outcome variables are aligned with the research question of interest and [ethical and age/developmental considerations](https://pmc.ncbi.nlm.nih.gov/articles/PMC10798216/) (@psihogios_ethical_2024; @modi_pediatric_2012) have been appropriately accounted for, readers are encouraged to consult with researchers in their field who have integrated these measurement strategies into their work. 
+::: + +#### Medication Adherence + +There are three major components of medication adherence (the process by which patients take their medication as prescribed): + +- Initiation: Starting a prescribed regimen +- Implementation: The extent to which a patient's medication-taking behavior corresponds with the treatment regimen or protocol +- Discontinuation: Stopping a prescribed regimen + +For more information see: +- [A new taxonomy for describing and defining adherence to medications (Vrijens et al., 2012)](https://pmc.ncbi.nlm.nih.gov/articles/PMC3403197/) +- [Pediatric self-management: A framework for research, practice, and policy. (Modi et al., 2012)](https://pmc.ncbi.nlm.nih.gov/articles/PMC9923567/) ## Radiology diff --git a/book.bib b/book.bib index 0f82691..16cd62b 100644 --- a/book.bib +++ b/book.bib @@ -9,6 +9,24 @@ @misc{What_are_clinical_trials_2023 language={en} } +@article{psihogios_ethical_2024, + title = {Ethical considerations in using sensors to remotely assess pediatric health behaviors.}, + volume = {79}, + copyright = {http://www.apa.org/pubs/journals/resources/open-access.aspx}, + issn = {1935-990X, 0003-066X}, + url = {https://doi.apa.org/doi/10.1037/amp0001196}, + doi = {10.1037/amp0001196}, + abstract = {Sensors, including accelerometer-based and electronic adherence monitoring devices, have transformed health data collection. Sensors allow for unobtrusive, real-time sampling of health behaviors that relate to psychological health, including sleep, physical activity, and medication-taking. These technical strengths have captured scholarly attention, with far less discussion about the level of human touch involved in implementing sensors. Researchers face several subjective decision points when collecting health data via sensors, with these decisions posing ethical concerns for users and the public at large. 
Using examples from pediatric sleep, physical activity, and medication adherence research, we pose critical ethical questions, practical dilemmas, and guidance for implementing health-based sensors. We focus on youth given that they are often deemed the ideal population for digital health approaches but have unique technology-related vulnerabilities and preferences. Ethical considerations are organized according to Belmont principles of respect for persons (e.g., when sensor-based data are valued above the subjective lived experiences of youth and their families), beneficence (e.g., with sensor data management and sharing), and justice (e.g., with sensor access and acceptability among minoritized pediatric populations). Recommendations include the need to increase transparency about the extent of subjective decision making with sensor data management. Without greater attention to the human factors involved in sensor research, ethical risks could outweigh the scientific promise of sensors, thereby negating their potential role in improving child health and care.}, + language = {en}, + number = {1}, + urldate = {2025-04-22}, + journal = {American Psychologist}, + author = {Psihogios, Alexandra M. and King-Dowling, Sara and Mitchell, Jonathan A. and McGrady, Meghan E. 
and Williamson, Ariel A.}, + month = jan, + year = {2024}, + pages = {39--51} +} + @misc{NIHClinicalTrialsProspectively, title={NIH Clinical Trials}, journal={Division of Research at Brown University}, @@ -207,3 +225,58 @@ @Book{Xie2020 note = {ISBN 9780367563837}, url = {https://bookdown.org/yihui/rmarkdown-cookbook}, } + + +@article{psihogios_ethical_2024, + title = {Ethical considerations in using sensors to remotely assess pediatric health behaviors.}, + volume = {79}, + copyright = {http://www.apa.org/pubs/journals/resources/open-access.aspx}, + issn = {1935-990X, 0003-066X}, + url = {https://doi.apa.org/doi/10.1037/amp0001196}, + doi = {10.1037/amp0001196}, + abstract = {Sensors, including accelerometer-based and electronic adherence monitoring devices, have transformed health data collection. Sensors allow for unobtrusive, real-time sampling of health behaviors that relate to psychological health, including sleep, physical activity, and medication-taking. These technical strengths have captured scholarly attention, with far less discussion about the level of human touch involved in implementing sensors. Researchers face several subjective decision points when collecting health data via sensors, with these decisions posing ethical concerns for users and the public at large. Using examples from pediatric sleep, physical activity, and medication adherence research, we pose critical ethical questions, practical dilemmas, and guidance for implementing health-based sensors. We focus on youth given that they are often deemed the ideal population for digital health approaches but have unique technology-related vulnerabilities and preferences. 
Ethical considerations are organized according to Belmont principles of respect for persons (e.g., when sensor-based data are valued above the subjective lived experiences of youth and their families), beneficence (e.g., with sensor data management and sharing), and justice (e.g., with sensor access and acceptability among minoritized pediatric populations). Recommendations include the need to increase transparency about the extent of subjective decision making with sensor data management. Without greater attention to the human factors involved in sensor research, ethical risks could outweigh the scientific promise of sensors, thereby negating their potential role in improving child health and care.}, + language = {en}, + number = {1}, + urldate = {2025-04-22}, + journal = {American Psychologist}, + author = {Psihogios, Alexandra M. and King-Dowling, Sara and Mitchell, Jonathan A. and McGrady, Meghan E. and Williamson, Ariel A.}, + month = jan, + year = {2024}, + pages = {39--51} + } + +@article{vrijens_new_2012, + title = {A new taxonomy for describing and defining adherence to medications}, + volume = {73}, + copyright = {http://onlinelibrary.wiley.com/termsAndConditions\#vor}, + issn = {0306-5251, 1365-2125}, + url = {https://bpspubs.onlinelibrary.wiley.com/doi/10.1111/j.1365-2125.2012.04167.x}, + doi = {10.1111/j.1365-2125.2012.04167.x}, + abstract = {Interest in patient adherence has increased in recent years, with a growing literature that shows the pervasiveness of poor adherence to appropriately prescribed medications. However, four decades of adherence research has not resulted in uniformity in the terminology used to describe deviations from prescribed therapies. The aim of this review was to propose a new taxonomy, in which adherence to medications is conceptualized, based on behavioural and pharmacological science, and which will support quantifiable parameters. 
A systematic literature review was performed using MEDLINE, EMBASE, CINAHL, the Cochrane Library and PsycINFO from database inception to 1 April 2009. The objective was to identify the different conceptual approaches to adherence research. Definitions were analyzed according to time and methodological perspectives. A taxonomic approach was subsequently derived, evaluated and discussed with international experts. More than 10 different terms describing medication‐taking behaviour were identified through the literature review, often with differing meanings. The conceptual foundation for a new, transparent taxonomy relies on three elements, which make a clear distinction between processes that describe actions through established routines (‘Adherence to medications’, ‘Management of adherence’) and the discipline that studies those processes (‘Adherence‐related sciences’). ‘Adherence to medications’ is the process by which patients take their medication as prescribed, further divided into three quantifiable phases: ‘Initiation’, ‘Implementation’ and ‘Discontinuation’. In response to the proliferation of ambiguous or unquantifiable terms in the literature on medication adherence, this research has resulted in a new conceptual foundation for a transparent taxonomy. The terms and definitions are focused on promoting consistency and quantification in terminology and methods to aid in the conduct, analysis and interpretation of scientific studies of medication adherence.}, + language = {en}, + number = {5}, + urldate = {2025-04-22}, + journal = {British Journal of Clinical Pharmacology}, + author = {Vrijens, Bernard and De Geest, Sabina and Hughes, Dyfrig A. and Przemyslaw, Kardas and Demonceau, Jenny and Ruppar, Todd and Dobbels, Fabienne and Fargher, Emily and Morrison, Valerie and Lewek, Pawel and Matyjaszczyk, Michal and Mshelia, Comfort and Clyne, Wendy and Aronson, Jeffrey K. and Urquhart, J. 
and {for the ABC Project Team}}, + month = may, + year = {2012}, + pages = {691--705} + } + + @article{modi_pediatric_2012, + title = {Pediatric {Self}-management: {A} {Framework} for {Research}, {Practice}, and {Policy}}, + volume = {129}, + issn = {0031-4005, 1098-4275}, + shorttitle = {Pediatric {Self}-management}, + url = {https://publications.aap.org/pediatrics/article/129/2/e473/32549/Pediatric-Self-management-A-Framework-for-Research}, + doi = {10.1542/peds.2011-1635}, + abstract = {Self-management of chronic pediatric conditions is a formidable challenge for patients, families, and clinicians, with research demonstrating a high prevalence of poor self-management and nonadherence across pediatric conditions. Nevertheless, effective self-management is necessary to maximize treatment efficacy and clinical outcomes and to reduce unnecessary health care utilization and costs. However, this complex behavior is poorly understood as a result of insufficient definitions, reliance on condition-specific and/or adult models of self-management, failure to consider the multitude of factors that influence patient self-management behavior, and lack of synthesis of research, clinical practice, and policy implications. To address this need, we present a comprehensive conceptual model of pediatric self-management that articulates the individual, family, community, and health care system level influences that impact self-management behavior through cognitive, emotional, and social processes. This model further describes the relationship among self-management, adherence, and outcomes at both the patient and system level. Implications for research, clinical practice, and health care policy concerning pediatric chronic care are emphasized with a particular focus on modifiable influences, evidence-based targets for intervention, and the role of clinicians in the provision of self-management support. 
We anticipate that this unified conceptual approach will equip stakeholders in pediatric health care to (1) develop evidence-based interventions to improve self-management, (2) design programs aimed at preventing the development of poor self-management behaviors, and (3) inform health care policy that will ultimately improve the health and psychosocial outcomes of children with chronic conditions.}, + language = {en}, + number = {2}, + urldate = {2025-04-22}, + journal = {Pediatrics}, + author = {Modi, Avani C. and Pai, Ahna L. and Hommel, Kevin A. and Hood, Korey K. and Cortina, Sandra and Hilliard, Marisa E. and Guilfoyle, Shanna M. and Gray, Wendy N. and Drotar, Dennis}, + month = feb, + year = {2012}, + pages = {e473--e485} + } From dbb96c8b7a8bbdd9a143aeb6066ffbde8aaa22a5 Mon Sep 17 00:00:00 2001 From: carriewright11 Date: Tue, 22 Apr 2025 09:07:50 -0600 Subject: [PATCH 05/10] fix spelling --- resources/dictionary.txt | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) diff --git a/resources/dictionary.txt b/resources/dictionary.txt index b7c6b5a..5d61f70 100644 --- a/resources/dictionary.txt +++ b/resources/dictionary.txt @@ -1,7 +1,10 @@ +al +ASR biomarker biomarkers Biomarker Biomarkers +bluebox bookdown Bookdown CDS @@ -19,6 +22,7 @@ customizations DaSL data's de +et EHR faq FHIR @@ -26,25 +30,35 @@ GARDE generalizability generalizable HAQ +HD https ICD Immunohistochemical ITCR itcrtraining ITN +JK +Kelleher Leanpub LLM LLMs LOINC Markua +McGrady +MEM +Modi +Multivariable NCI NDC NIH's NLP OMOP ontologies +participants' +perscribed Permanente personalization +Ph PHM polypharmacy pre @@ -52,12 +66,14 @@ PROMs QALY QALYs representativeness +Robbertz RMarkdown RWD RWE RxNorm SAEs sexualized +Shriver SNOMED socio SOGI @@ -67,4 +83,5 @@ Tolerability UE UE5 UMLS +Vrijens www From 587c0bf27afb6ec32d61ca7f1b38c308e796dd59 Mon Sep 17 00:00:00 2001 From: Kate Isaac <41767733+kweav@users.noreply.github.com> Date: Thu, 22 May 2025 19:31:35 -0400 Subject: [PATCH 
06/10] simplify and rearrange phrasing and add a slide --- 01-data_types.Rmd | 28 ++++++++++++++++++++-------- 1 file changed, 20 insertions(+), 8 deletions(-) diff --git a/01-data_types.Rmd b/01-data_types.Rmd index 643e234..f7e073f 100644 --- a/01-data_types.Rmd +++ b/01-data_types.Rmd @@ -20,28 +20,40 @@ ottrpal::include_slide("https://docs.google.com/presentation/d/1ivDTcLjb2078O0Ge ## Clinical data is unique -Clinical data comes from a wide variety of sources and as such, requires careful consideration when designing, collecting, and analyzing this data. Unlike domains such as Finance or other areas in the sciences which predominantly use structured data, clinical data is often heterogeneous, integrating quantitative measurements, categorical data, subjective narratives from patient notes, and objective observations from doctor notes and even possible image analysis. Free text adds a layer of complexity with unstructured information, reflecting subjective patient experiences or qualitative insights from healthcare professionals. Furthermore, the contrast between patient and doctor notes reflects the dual perspectives of symptoms and formal diagnoses. In essence, clinical data's unique blend of structured and unstructured components, along with its multidisciplinary nature, necessitates specialized methodologies for comprehensive analysis and interpretation in the realm of healthcare. +Clinical data refers to information __collected from patients__ during healthcare delivery, clinical trials, and medical research. Clinical data comes from a wide variety of sources and as such, requires careful consideration when designing, collecting, and analyzing this data. 
Unlike domains such as Finance or other areas in the sciences which predominantly use structured data with predictable and consistent formats, clinical data is often heterogeneous, integrating many forms of both structured and unstructured data: quantitative measurements, categorical data, genotyping, images, subjective narratives from patient notes, and objective observations or conclusions. The unstructured nature of free text from notes reflecting subjective patient experiences or qualitative insights from healthcare professionals especially adds a further layer of complexity. Furthermore, the contrast between patient and doctor notes reflects the dual perspectives of symptoms and formal diagnoses. In essence, clinical data's unique blend of structured and unstructured components, along with its multidisciplinary nature, necessitates specialized methodologies for comprehensive analysis and interpretation in the realm of healthcare. Further, because clinical data contains sensitive, personal information about patients, there are additional security and ethics concerns in the handling and management of clinical data. ## Major clinical data types +Clinical data can come in many different forms, including -Clinical data can come in many different forms, including demographics, diagnoses, lab results, vital signs, medication records, procedures, genetic reports, images, scanned documents, and notes written by physicians, nurses, and other clinicians. Although any of these types of information might theoretically be available for use in clinical research, some sources are more accessible than others. As they are often stored directly in electronic medical record systems, notes, demographics, observations (labs, medications, procedures, and vitals), and images are the easiest data to work with, and are the focus of most Electronic Health Record (EHR) data research efforts. 
+* patient demographics +* medical history or records such as diagnoses, lab results, vital signs, medication records, or procedure history +* genetic reports +* images +* scanned documents, and notes written by physicians, nurses, and other clinicians +* and more ... + +```{r fig.align='center', echo = FALSE, fig.alt= "Clinical data refers to information collected from patients during healthcare delivery, clinical trials, and medical research and could include demographics, medical history, lab results, imaging, treatment outcomes, and more", out.width="100%"} +ottrpal::include_slide("https://docs.google.com/presentation/d/1ivDTcLjb2078O0GemkSeCgC1jmxk4fMsiFQaPaer9mQ/edit?slide=id.g35bf5e18bfe_0_0#slide=id.g35bf5e18bfe_0_0") +``` + +Some sources of clinical data are more prevalent and readily obtainable than others. For instance, notes, demographics, images, and histories or observations/records (lab results, vitals, medications, procedures) are often stored directly in electronic medical record systems making them more easily accessible and so are the focus of most Electronic Health Record (EHR) data research efforts. ### Structured Data -Observational and demographic data are often collectively referred to as “structured data”, as they are stored in electronic health record databases and often provided to researchers in tabular form. Although details may vary based on the type of EHR being used, the customizations to the EHR for the specific environment in which data was collected, and any pre-processing that might be done by institutional research offices prior to providing data to researchers, details are likely to be familiar across contexts. Structured data types frequently used in EHR research include demographics, diagnoses, lab values, procedures, vitals, and medication records. Often provided as tables indexed off of a patient or visit ID, these tables -often include timestamps and other supporting descriptors. 
For example, medication orders might specify the drug name, class, dose, unit, quantity, route, frequency, and other instructions. +Observational records such as test results and demographic data are often collectively referred to as “structured data”, as they are stored in electronic health record databases and often provided to researchers in tabular form. Structured data types frequently used in EHR research include demographics, diagnoses, lab values, procedures, vitals, and medication records. The tables these data are stored in may be indexed by a patient or visit ID and often include timestamps and other supporting descriptors. For example, medication orders might specify the drug name, class, dose, unit, quantity, route, frequency, and other instructions. -Structured data tables often describe entries in terms of codes from standardized vocabularies. Diagnoses might be described with codes from the International Classification of Diseases (ICD) vocabulary, lab tests with Logical Observation Identifiers Names and Codes (LOINC), medications with National Drug Code (NDC), and procedures with Current Procedural Terminology (CPT) codes. These terms, or "billing codes", provide a common foundation that can be invaluable for identifying patients with a specific disease or who have received specified medications, particularly when integrating data from multiple sources. +Structured data tables often describe entries in terms of codes from standardized vocabularies. Diagnoses might be described with codes from the International Classification of Diseases (ICD) vocabulary, lab tests with Logical Observation Identifiers Names and Codes (LOINC), medications with National Drug Code (NDC), and procedures with Current Procedural Terminology (CPT) codes. 
These terms, or "billing codes", provide a common foundation that can be invaluable for identifying patients with a specific disease or who have received specified medications, particularly when integrating data from multiple sources. ```{r, fig.align='center', echo = FALSE, fig.alt= "Example of structured data, a table that includes patient id numbers, billing codes, dates, blood pressure measurements, weight measurements, height measurements, and prescribed medications coded using the National Drug Code (NDC)", out.width="100%"} ottrpal::include_slide("https://docs.google.com/presentation/d/1ivDTcLjb2078O0GemkSeCgC1jmxk4fMsiFQaPaer9mQ/edit#slide=id.g3385bea4ad0_0_0" ) ``` +As we've described it, structured clinical data is expected to have similarities, although specific details may vary based on the type of EHR being used and any customizations to the EHR for the specific environment or institution in which data was collected (e.g., any specialized pre-processing by institutional research offices prior to providing data to researchers). So it is important for researchers to consider differences present in their data if they've obtained it from multiple contexts or institutions. -### Unstructured Data / Clinical Notes +### Unstructured Data / Clinical Notes -Clinical notes are, perhaps unsurprisingly, generally shared as seemingly straightforward text files. However, the simple format should not be taken as a suggestion that the data are easy to interpret. Some EHR systems contain literally dozens of types of notes, covering specialties such as pathology or surgery; specific moments in care such as admission or discharge; particular procedures such as colonoscopies; patient-provider interactions such as telehealth or phone encounters, and many others. 
In addition to differing in content, these sources may have different layouts and formats, ranging from free-form reports to structured SOAP (subjective, objective, assessment, and plan) formats or even templated procedure reports. Understanding the types of notes available in a given context and where relevant data might be found is a key step in effectively using clinical notes. +Clinical notes are, perhaps unsurprisingly, generally shared as seemingly straightforward text files. However, the simple format should not be taken as a suggestion that the data are easy to interpret. Some EHR systems contain literally dozens of types of notes, covering specialties such as pathology or surgery; specific moments in care such as admission or discharge; particular procedures such as colonoscopies; patient-provider interactions such as telehealth or phone encounters, and many others. In addition to differing in content, these sources may have different layouts and formats, ranging from free-form reports to structured SOAP (subjective, objective, assessment, and plan) formats or even templated procedure reports. Understanding the types of notes available in a given context and where relevant data might be found is a key step in effectively using clinical notes. ```{r, fig.align='center', echo = FALSE, fig.alt= "Unstructured Data - Data without specific format: includes images, pathology reports, radiology reports, clinical notes, discharge summaries and more. Includes an image of an x-ray and some clinical notes that states: Patient reports a sharp pain in the right lower abdomen for the past 24 hours.Temperature 98.6°F, BP 120/80, tenderness in the right lower quadrant. Suspected appendicitis. 
Refer to surgery for evaluation, initiate IV fluids, and pain management.", out.width="100%"} @@ -49,7 +61,7 @@ ottrpal::include_slide("https://docs.google.com/presentation/d/1ivDTcLjb2078O0Ge ``` -When used in EHR research, both structured data and clinical notes are generally de-identified to protect patient privacy. Patient ID numbers might be replaced with new identifiers, with linkages maintained by institutional “honest brokers” [@Dhir2008] charged with providing clinical data for research purposes. In some cases, dates may be changed as well. Clinical notes are generally “de-identified” through specialized software designed to remove names, dates, locations, and other sensitive details. Researchers working with institutions to access clinical data should be sure to understand local data de-identification practices. +When used in EHR research, both structured data and clinical notes are generally de-identified to protect patient privacy. Patient ID numbers might be replaced with new identifiers, with linkages maintained by institutional “honest brokers” [@Dhir2008] charged with providing clinical data for research purposes. In some cases, dates may be changed as well. Clinical notes are generally “de-identified” through specialized software designed to remove names, dates, locations, and other sensitive details. Researchers working with institutions to access clinical data should be sure to understand local data de-identification practices. 
## How to acquire clinical data From 6214729140a3d337bbb77a497335443cf90dc87f Mon Sep 17 00:00:00 2001 From: Kate Isaac <41767733+kweav@users.noreply.github.com> Date: Thu, 22 May 2025 19:37:23 -0400 Subject: [PATCH 07/10] add to bullets --- 01-data_types.Rmd | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/01-data_types.Rmd b/01-data_types.Rmd index f7e073f..87af73f 100644 --- a/01-data_types.Rmd +++ b/01-data_types.Rmd @@ -29,8 +29,10 @@ Clinical data can come in many different forms, including * patient demographics * medical history or records such as diagnoses, lab results, vital signs, medication records, or procedure history * genetic reports +* health monitor data * images * scanned documents, and notes written by physicians, nurses, and other clinicians +* survey / case report form (CRF) responses * and more ... ```{r fig.align='center', echo = FALSE, fig.alt= "Clinical data refers to information collected from patients during healthcare delivery, clinical trials, and medical research and could include demographics, medical history, lab results, imaging, treatment outcomes, and more", out.width="100%"} @@ -39,7 +41,7 @@ ottrpal::include_slide("https://docs.google.com/presentation/d/1ivDTcLjb2078O0Ge Some sources of clinical data are more prevalent and readily obtainable than others. For instance, notes, demographics, images, and histories or observations/records (lab results, vitals, medications, procedures) are often stored directly in electronic medical record systems making them more easily accessible and so are the focus of most Electronic Health Record (EHR) data research efforts. -### Structured Data +### Structured data Observational records such as test results and demographic data are often collectively referred to as “structured data”, as they are stored in electronic health record databases and often provided to researchers in tabular form.
Structured data types frequently used in EHR research include demographics, diagnoses, lab values, procedures, vitals, and medication records. The tables these data are stored in may be indexed by a patient or visit ID and often include timestamps and other supporting descriptors. For example, medication orders might specify the drug name, class, dose, unit, quantity, route, frequency, and other instructions. @@ -51,7 +53,7 @@ ottrpal::include_slide("https://docs.google.com/presentation/d/1ivDTcLjb2078O0Ge As we've described it, structured clinical data is expected to have similarities, although specific details may vary based on the type of EHR being used and any customizations to the EHR for the specific environment or institution in which data was collected (e.g., any specialized pre-processing by institutional research offices prior to providing data to researchers). So it is important for researchers to consider differences present in their data if they've obtained it from multiple contexts or institutions. -### Unstructured Data / Clinical Notes +### Unstructured data / clinical notes Clinical notes are, perhaps unsurprisingly, generally shared as seemingly straightforward text files. However, the simple format should not be taken as a suggestion that the data are easy to interpret. Some EHR systems contain literally dozens of types of notes, covering specialties such as pathology or surgery; specific moments in care such as admission or discharge; particular procedures such as colonoscopies; patient-provider interactions such as telehealth or phone encounters, and many others. In addition to differing in content, these sources may have different layouts and formats, ranging from free-form reports to structured SOAP (subjective, objective, assessment, and plan) formats or even templated procedure reports. Understanding the types of notes available in a given context and where relevant data might be found is a key step in effectively using clinical notes. 
From 1a5625f215f13259ae7d5681f17035132b1e48d4 Mon Sep 17 00:00:00 2001 From: Kate Isaac <41767733+kweav@users.noreply.github.com> Date: Thu, 22 May 2025 19:38:43 -0400 Subject: [PATCH 08/10] run checks on stacked branches too --- .github/workflows/pull_request.yml | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/.github/workflows/pull_request.yml b/.github/workflows/pull_request.yml index 091af95..1e56449 100644 --- a/.github/workflows/pull_request.yml +++ b/.github/workflows/pull_request.yml @@ -4,8 +4,7 @@ name: Pull Request on: - pull_request: - branches: [ main, staging ] + pull_request jobs: From 7b34b05971863b442e4bc09182d4cb44cb9b42d8 Mon Sep 17 00:00:00 2001 From: Kate Isaac <41767733+kweav@users.noreply.github.com> Date: Thu, 22 May 2025 19:51:57 -0400 Subject: [PATCH 09/10] short pithy intro --- 01-data_types.Rmd | 1 + 1 file changed, 1 insertion(+) diff --git a/01-data_types.Rmd b/01-data_types.Rmd index 87af73f..36090f9 100644 --- a/01-data_types.Rmd +++ b/01-data_types.Rmd @@ -9,6 +9,7 @@ ottrpal::set_knitr_image_path() # Clinical Data Types +Clinical data is health-related information collected from patients throughout their healthcare journey. It may come in many forms and its sensitive nature requires careful management by researchers. ## Learning Objectives From 75394647f34483d41b53d4d888650be8ff7e6f58 Mon Sep 17 00:00:00 2001 From: Kate Isaac <41767733+kweav@users.noreply.github.com> Date: Tue, 27 May 2025 18:59:20 -0400 Subject: [PATCH 10/10] move EHR figure --- 01-data_types.Rmd | 10 ++++++++-- 02-data_uses.Rmd | 17 ++++++++--------- 2 files changed, 16 insertions(+), 11 deletions(-) diff --git a/01-data_types.Rmd b/01-data_types.Rmd index 36090f9..edc15f5 100644 --- a/01-data_types.Rmd +++ b/01-data_types.Rmd @@ -9,7 +9,7 @@ ottrpal::set_knitr_image_path() # Clinical Data Types -Clinical data is health-related information collected from patients throughout their healthcare journey. 
It may come in many forms and its sensitive nature requires careful management by researchers. +Clinical data is health-related information collected from patients throughout their healthcare journey. It may come in many forms, and its sensitive nature requires careful management by researchers. ## Learning Objectives @@ -44,7 +44,13 @@ Some sources of clinical data are more prevalent and readily obtainable than oth ### Structured data -Observational records such as test results and demographic data are often collectively referred to as “structured data”, as they are stored in electronic health record databases and often provided to researchers in tabular form. Structured data types frequently used in EHR research include demographics, diagnoses, lab values, procedures, vitals, and medication records. The tables these data are stored in may be indexed by a patient or visit ID and often include timestamps and other supporting descriptors. For example, medication orders might specify the drug name, class, dose, unit, quantity, route, frequency, and other instructions. +Observational records such as test results and demographic data are often collectively referred to as “structured data”, as they are stored in electronic health record databases and often provided to researchers in tabular form. Together, the structured data types frequently used in EHR research form a comprehensive longitudinal record of a patient's interactions with a healthcare system; they may include demographics, diagnoses, lab values, procedures, vitals, and medication records. + +```{r, fig.align='center', echo = FALSE, fig.alt= "Electronic health records include many different kinds of records on individuals over time, including clinical notes, family history information, lab results, images, and medication information. The image shows data on the same individual over a period of time.
", out.width="100%"} +ottrpal::include_slide("https://docs.google.com/presentation/d/1ivDTcLjb2078O0GemkSeCgC1jmxk4fMsiFQaPaer9mQ/edit#slide=id.g338f75828af_0_7" ) +``` + +The structured nature of these data allows them to be stored in tables, which may be indexed by a patient or visit ID and often include timestamps and other supporting descriptors. For example, medication orders might specify the drug name, class, dose, unit, quantity, route, frequency, and other instructions. Structured data tables often describe entries in terms of codes from standardized vocabularies. Diagnoses might be described with codes from the International Classification of Diseases (ICD) vocabulary, lab tests with Logical Observation Identifiers Names and Codes (LOINC), medications with National Drug Code (NDC), and procedures with Current Procedural Terminology (CPT) codes. These terms, or "billing codes", provide a common foundation that can be invaluable for identifying patients with a specific disease or who have received specified medications, particularly when integrating data from multiple sources. diff --git a/02-data_uses.Rmd b/02-data_uses.Rmd index 0648e42..d5748c3 100644 --- a/02-data_uses.Rmd +++ b/02-data_uses.Rmd @@ -26,12 +26,11 @@ The near universal adoption of electronic health record (EHR) systems in the US - imaging - genetic test reports -These data can be accessed not only by health professionals, but also by patients through patient portals. EHR data can also be used to enable data-driven interventions such as provider- and patient-facing clinical decision support (CDS) and population health management (PHM). - -```{r, fig.align='center', echo = FALSE, fig.alt= "Electronic health records include many different kinds of records on individuals over time, including, clinical notes, family history information, lab result, images, and medication information. The image shows data on the same individual over a period of time.
", out.width="100%"} -ottrpal::include_slide("https://docs.google.com/presentation/d/1ivDTcLjb2078O0GemkSeCgC1jmxk4fMsiFQaPaer9mQ/edit#slide=id.g338f75828af_0_7" ) -``` + ```{r, fig.align='center', echo = FALSE, fig.alt= "Electronic health records include many different kinds of records on individuals over time, including, clinical notes, family history information, lab result, images, and medication information. The image shows data on the same individual over a period of time. ", out.width="100%"} + ottrpal::include_slide("https://docs.google.com/presentation/d/1ivDTcLjb2078O0GemkSeCgC1jmxk4fMsiFQaPaer9mQ/edit#slide=id.g338f75828af_0_7" ) + ``` +These data can be accessed not only by health professionals, but also by patients through patient portals. EHR data can also be used to enable data-driven interventions such as provider- and patient-facing clinical decision support (CDS) and population health management (PHM). CDS has been defined as tools that "provide clinicians, staff, patients, or other individuals with knowledge and person-specific information, intelligently filtered or presented at appropriate times, to enhance health and health care" [@Osheroff_2007]. Examples of widely adopted CDS tools with demonstrated effectiveness for cancer prevention, diagnosis, and care include: @@ -54,7 +53,7 @@ Several PHM programs have demonstrated to be effective in increasing the uptake ottrpal::include_slide("https://docs.google.com/presentation/d/1ivDTcLjb2078O0GemkSeCgC1jmxk4fMsiFQaPaer9mQ/edit#slide=id.g338f75828af_0_28" ) ``` -Clinical Decision Support (CDS) can help patients and clinicians make decisions about an individual’s care, while Population Health Management (PHM) can help identify individuals for interventions and engagement. The image shows a single person getting a colorectal screening reminder for CDS and a population being identified for possibly needing colorectal screening for PHM. 
+Clinical Decision Support (CDS) can help patients and clinicians make decisions about an individual’s care, while Population Health Management (PHM) can help identify individuals for interventions and engagement. The image shows a single person getting a colorectal screening reminder for CDS and a population being identified for possibly needing colorectal screening for PHM. While some CDS and PHM approaches have been successfully adopted widely, emerging technologies such as the use of generative AI approaches to analyze diagnostic imaging, large language models (LLMs) to extract information from narrative texts (e.g., clinical notes), LLM-based chatbots to communicate with patients, and digital health tools such as home-based sensors are creating unprecedented opportunities for next generation CDS and PHM. These approaches have the potential to enable significant breakthroughs through the implementation of patient-tailored cancer prevention and care at a population scale. Nevertheless, substantial research is needed to ensure effective and fair implementation of these CDS and PHM interventions. @@ -291,13 +290,13 @@ Clinical Data provides a wealth of information that can be leveraged to answer a ### Retrospective analysis -Retrospective analysis involves the examination of pre-existing clinical data to answer specific research questions, explore hypotheses, and identify patterns or trends. This method leverages historical data collected from medical records, administrative databases, registries, electronic health records (EHRs), or other sources of clinical information. The types of questions that can be asked through retrospective analysis span a wide range of clinical and epidemiological domains. The questions typically focus on understanding patient characteristics, disease epidemiology, treatment outcomes, risk factors, healthcare utilization, and more. 
+Retrospective analysis involves the examination of pre-existing clinical data to answer specific research questions, explore hypotheses, and identify patterns or trends. This method leverages historical data collected from medical records, administrative databases, registries, electronic health records (EHRs), or other sources of clinical information. The types of questions that can be asked through retrospective analysis span a wide range of clinical and epidemiological domains. The questions typically focus on understanding patient characteristics, disease epidemiology, treatment outcomes, risk factors, healthcare utilization, and more. Here, we explore the different categories of questions that can be addressed using retrospective data analysis: 1. **Descriptive Questions** -Descriptive questions aim to summarize and describe the characteristics of a patient population, disease, or healthcare process. They provide a foundational understanding of a dataset and are often the starting point for more complex analyses. +Descriptive questions aim to summarize and describe the characteristics of a patient population, disease, or healthcare process. They provide a foundational understanding of a dataset and are often the starting point for more complex analyses. - **Patient Demographics and Characteristics**: What are the demographic profiles (age, gender, ethnicity, socioeconomic status) of patients diagnosed with a specific condition? What are the common comorbidities and risk factors in this population? - **Disease Prevalence and Incidence**: What is the prevalence or incidence of a specific disease or condition in a particular population or geographic area over a defined period? @@ -317,7 +316,7 @@ By examining historical data, researchers can gain insights into how different t 3. **Outcome and Prognostic Questions** -Understanding patient outcomes and prognostic factors is central to retrospective analyses. 
These questions focus on the end results of healthcare practices and patient management, including survival, complications, and quality of life: +Understanding patient outcomes and prognostic factors is central to retrospective analyses. These questions focus on the end results of healthcare practices and patient management, including survival, complications, and quality of life: - **Survival and Mortality**: What are the survival rates and mortality rates associated with a particular disease, condition, or treatment? What factors are associated with increased or decreased survival rates? - **Complication Rates**: What are the rates and types of complications associated with specific diseases, surgeries, or treatments? Are there identifiable risk factors for these complications?