An end-to-end data analysis project that sources, cleans, and visualises Covid-19 vaccination rates across the Czech Republic, presented in a custom-built, interactive web report.
The CoVax Project is a data-driven report that investigates the regional disparities in Covid-19 vaccination uptake across the Czech Republic. At the time of the project's inception, official government dashboards provided extensive data but lacked a clear, direct visualisation of the ratio of fully immunised residents to the total population per region. This project addresses that gap by answering a key question: Is vaccine uptake homogeneous across the country, or are there significant regional differences?
The project follows the complete data analysis pipeline: sourcing raw data from the Czech Open Data Repository, performing extensive data cleaning and wrangling in R, and merging healthcare data with geospatial coordinates. The final output is a choropleth map that clearly visualises the findings, which is presented on a custom-built, interactive website. The analysis revealed a clear heterogeneity in vaccination rates, with a roughly 10% difference between the highest and lowest regions, correlating with socio-economic and political factors discussed in the report.
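The core calculation described above, the share of fully immunised residents per region, can be sketched as follows. This is a minimal illustration in base R so that it runs without extra packages; the project itself uses the tidyverse. The column names (`region`, `fully_vaccinated`, `residents`) and all figures are hypothetical placeholders, not the project's data or results.

```r
# Illustrative figures only; these are NOT the project's data.
vaccinations <- data.frame(
  region = c("Region A", "Region A", "Region B", "Region B", "Region C"),
  fully_vaccinated = c(300, 350, 200, 220, 500)
)
population <- data.frame(
  region = c("Region A", "Region B", "Region C"),
  residents = c(1000, 800, 900)
)

# Aggregate record-level counts to one row per region.
by_region <- aggregate(fully_vaccinated ~ region, data = vaccinations, FUN = sum)

# Join with population totals and compute the vaccination ratio.
uptake <- merge(by_region, population, by = "region")
uptake$ratio <- uptake$fully_vaccinated / uptake$residents

# Gap between the highest and lowest region, in percentage points.
spread_pp <- (max(uptake$ratio) - min(uptake$ratio)) * 100

print(uptake)
print(spread_pp)
```

Expressing the gap in percentage points keeps it distinct from a relative (percent-of-percent) difference.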
- Automated Data Sourcing: Acquires the latest immunisation data directly from the Czech Ministry of Health's public CSV endpoint.
- Data Wrangling & Transformation: Cleans, aggregates, and merges data from multiple sources (immunisation records, population statistics, and geospatial data) using R and the tidyverse.
- Geospatial Visualisation: Generates a high-quality choropleth map using ggplot2 and sf to display vaccination ratios by region.
- Custom Interactive Front-End: Presents the entire report as a single-page web application with a clean UI, featuring custom-built, multi-level tabbed navigation.
- Responsive Design: The web report is responsive and provides an optimal viewing experience on desktops, tablets, and mobile devices.
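As a rough sketch of the geospatial step, the snippet below joins per-region ratios to polygon geometry and draws a choropleth with ggplot2 and sf. To stay self-contained it uses two toy square "regions" rather than real boundaries (the project takes those from RCzechia), and the names `square`, `regions`, `uptake`, and `map_data`, along with the ratio values, are assumptions made for illustration only.

```r
library(sf)
library(ggplot2)

# Toy square polygon starting at x0; stands in for a real regional boundary.
square <- function(x0) {
  st_polygon(list(rbind(
    c(x0, 0), c(x0 + 1, 0), c(x0 + 1, 1), c(x0, 1), c(x0, 0)
  )))
}

regions <- st_sf(
  region = c("Region A", "Region B"),
  geometry = st_sfc(square(0), square(1))
)

# Illustrative ratios only; NOT the project's results.
uptake <- data.frame(
  region = c("Region A", "Region B"),
  ratio = c(0.65, 0.525)
)

# Attribute join; the sf object goes first so the result keeps its geometry.
map_data <- merge(regions, uptake, by = "region")

# Fill each region by its vaccination ratio.
choropleth <- ggplot(map_data) +
  geom_sf(aes(fill = ratio)) +
  scale_fill_viridis_c(labels = scales::percent) +
  labs(fill = "Fully vaccinated") +
  theme_void()

print(choropleth)
```

With real data, the toy `regions` object would be replaced by actual regional polygons, joined on a shared region identifier rather than a display name.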
Skills Employed
- Data Sourcing and Cleaning
- Data Wrangling and Transformation
- R Programming
- Geospatial Data Matching and Visualisation
- Static and Interactive Data Visualisation
- Front-End Web Development
Project Area
This project is linked to Data Science, Public Health, and Information Visualisation, with elements of Social Science in the interpretation of the results.
Solution
- Languages: R, HTML5, CSS3, JavaScript
- Data Analysis & Visualisation (R/Studio)
  - Data Wrangling: tidyverse (dplyr, readr)
  - Geospatial: sf, RCzechia
  - Plotting: ggplot2
- Web Design
  - Frameworks: Bootstrap 3
  - Libraries: jQuery, highlight.js
- Deployment: GitHub
Learning Challenges
The primary challenge was transitioning the project from a static, auto-generated R Markdown document into a fully custom and interactive web application. This involved manually separating the HTML, CSS, and JavaScript, refactoring legacy code, and rebuilding the interactive tab functionality from scratch using Bootstrap's native components. This was a valuable learning experience in front-end development.
Visuals