Docker build for the ErythronDB Website and Database
The Erythron Database (ErythronDB) was a resource dedicated to facilitating better understanding of the cellular and molecular underpinnings of mammalian erythropoiesis. For more information about this project, please visit https://cstoeckert.github.io/past-projects/ErythronDB.html.
Any publications referencing ErythronDB records/reports, search results or analysis tools should cite the following publication:
Kingsley et al. 2013. Ontogeny of erythroid gene expression.
E-blood: https://doi.org/10.1182/blood-2012-04-422394
The following citations should be used to acknowledge specific datasets in ErythronDB:
- Ontogeny of erythroid gene expression
  Kingsley et al. 2013. Ontogeny of erythroid gene expression. E-blood: https://doi.org/10.1182/blood-2012-04-422394 ArrayExpress Accession: E-MTAB-1035
- EPO-regulated targets in murine erythroid progenitors
  Wojchowski, D. et al. 2017. ArrayExpress Accession: E-MTAB-5373
- EPO-regulated targets in murine E1 cells with a truncated erythropoietin receptor (EpoR)
  Wojchowski, D. et al. EPO-regulated targets in murine E1 cells with a truncated erythropoietin receptor (EpoR). ErythronDB. To be updated.
- EPO phosphorylation PTM targets in UT-7epo-E cells
  Held MA et al. 2020. Phospho-proteomic discovery of novel signal transducers including thioredoxin-interacting protein as mediators of erythropoietin-dependent human erythropoiesis. Experimental hematology: https://doi.org/10.1016/j.exphem.2020.03.003
  Held MA et al. 2020. Phospho-PTM proteomic discovery of novel EPO-modulated kinases and phosphatases, including PTPN18 as a positive regulator of EPOR/JAK2 Signaling. Cellular signalling: https://doi.org/10.1016/j.cellsig.2020.109554
The data is now available on the Dryad open data publishing platform: https://datadryad.org/stash/dataset/doi:10.5061/dryad.2bvq83bwq
- Docker or Docker Desktop
  NOTE: We recommend using `docker compose` to build the containers.
  NOTE: Depending on how `docker compose` is installed on your system, the command may be `docker-compose` instead of `docker compose`.
- Memory / Disk Space
  - A minimum of 4.5 GB of RAM is required to build the website. Once the docker build is complete, the website will run with 2 GB or less of RAM, so you can stop the `erythrondb-website` container, adjust your memory allocations accordingly, and restart.
  - The database container will require 380 MB of hard-drive space in the directories where docker images/containers are stored (usually `/var/lib/docker`).
  - The website container is larger and will require 3-4 GB of hard-drive space in the directories where docker images/containers are stored.
  - The instantiated database (location of the target `pgdata` on the host) will utilize 13-15 GB of disk space.
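To go with the note above on `docker compose` versus `docker-compose`, here is a minimal sketch of how a script might detect which form is available on a host. The function name `detect_compose` is hypothetical and not part of this project; it only relies on `docker compose version` and on `docker-compose` being on the `PATH`.

```shell
# Print the Compose command available on this host, preferring the plugin form.
# detect_compose is an illustrative helper, not something shipped with this repository.
detect_compose() {
  if docker compose version >/dev/null 2>&1; then
    echo "docker compose"
  elif command -v docker-compose >/dev/null 2>&1; then
    echo "docker-compose"
  else
    echo "neither 'docker compose' nor 'docker-compose' was found" >&2
    return 1
  fi
}

# Usage: COMPOSE="$(detect_compose)" && $COMPOSE up -d db
```

The commands in the rest of this document are written with `docker compose`; substitute whichever form this check reports.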
base directory: the directory containing the `docker-compose.yaml` file for the project (`erythrondb-docker`, or an alias specified when cloning from GitHub)
NOTE: With the exception of the `git clone` step, the example commands provided below should all be run from the base directory.
Clone the parent repository (ErythronDB/erythrondb-docker)
git clone https://github.com/ErythronDB/erythrondb-docker.git
Edit sample.env and save as .env in the base directory.
- This defines environment variables for the build environment that are required by `docker-compose.yaml`.
- Values to set are as follows:
  - `DB_INIT`: full path to the ErythronDB database dump; if using Docker Desktop, the host path containing the `DB_INIT` file must be a directory or subdirectory for which file sharing is enabled. See Note 1, below.
  - `DB_DATA`: target path on host for mounting the PostgreSQL database (stores the data); if using Docker Desktop, the host `DB_DATA` target must be a directory or subdirectory for which file sharing is enabled. See Notes 1 & 2, below.
  - `DB_PORT`: mapped host port for the PostgreSQL database (default=5432)
  - `POSTGRES_INIT_PASSWORD`: placeholder for DB admin credentials; needed to initialize the database. NOTE: `POSTGRES_USER` in the `docker-compose.yaml` file should not be changed; it must be `postgres`.
  - `TOMCAT_PORT`: mapped host port for tomcat (default=8080)
  - `TOMCAT_LOG`: target path on host for mounting the tomcat log directory; enables logs to be viewed outside of the container; if using Docker Desktop, the host `TOMCAT_LOG` target must be a directory or subdirectory for which file sharing is enabled.
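As a quick sanity check before building, the following sketch lists which of the variables named above are missing or empty in an env file. The helper name `missing_env_vars` is hypothetical. It assumes the file uses plain `NAME=value` lines, and it treats the two port variables as required even though `docker-compose.yaml` may supply defaults for them.

```shell
# List the variables named in this README that are missing or empty in an env file.
# Usage: missing_env_vars [path-to-env-file]   (defaults to ./.env)
# missing_env_vars is an illustrative helper, not part of the repository.
missing_env_vars() {
  env_file="${1:-.env}"
  for name in DB_INIT DB_DATA DB_PORT POSTGRES_INIT_PASSWORD TOMCAT_PORT TOMCAT_LOG; do
    # A variable counts as set when a line reads NAME=<at least one character>.
    grep -Eq "^${name}=.+" "$env_file" 2>/dev/null || echo "$name"
  done
}
```

Run from the base directory, `missing_env_vars` prints nothing when every variable has a value. A nonexistent `.env` file results in all six names being printed.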
NOTE 1 - BEST PRACTICES: Pick a location outside of the code pulled from the repository for data storage; i.e., the target directories for the database init file and the PostgreSQL `pgdata` directory should not be subdirectories of the base directory.
NOTE 2: The database will NOT initialize if the `DB_DATA` directory already exists and has contents. For example, if you set `DB_DATA=/erythrondb/data/pgdata`, the path `/erythrondb/data/` should exist on the host, but the target directory `pgdata` should not; the docker build will create it. If you need to reinitialize the database, first remove the `DB_DATA` target directory (here, `pgdata`).
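The condition in Note 2 can be checked before starting the database container. The sketch below uses a hypothetical helper, `check_db_data`, and takes the conservative reading of the note: any existing target (even an empty one) is reported so that it can be inspected or removed first.

```shell
# Report whether a DB_DATA target is ready for first-time initialization.
# Prints one of: ok, missing-parent, exists
# check_db_data is an illustrative helper, not part of the repository.
check_db_data() {
  target="$1"
  if [ -e "$target" ]; then
    # The target already exists; initialization may be skipped by the database container.
    echo "exists"
  elif [ ! -d "$(dirname "$target")" ]; then
    # The parent directory must be created on the host beforehand.
    echo "missing-parent"
  else
    echo "ok"
  fi
}

# Usage: check_db_data /erythrondb/data/pgdata
```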
Edit site-admin.properties.sample and save in place as site-admin.properties.
- this file defines values that are needed to generate the website configuration during build-time, some of which would be security risks if set as environmental variables in the container (e.g., passwords).
- The values to set are as follows:
  - `WEB_DB_PASSWORD`: will be provided as part of a data access request
  - `SITE_ADMIN_EMAIL`: email address to which `Contact Us` messages should be sent
  - `TOMCAT_MANAGER_PASSWORD`: should be changed from the default. The user name is `tomcat-admin`.
- Leave all other property values in the file unchanged, unless adding an Apache layer on the host to enable SSL/HTTPS. See CORS section below.
WARNING: DO NOT COMMIT the modified `.env` and `site-admin.properties` files to the repository, as they may contain database passwords. Currently both are included in `.gitignore`, but if you accidentally commit either file, you should change the database passwords and let us know so we can change the default database passwords in the distributed versions.
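One way to confirm that both files are still covered by `.gitignore` before committing is `git check-ignore`. The wrapper below, `all_ignored`, is a hypothetical helper. The path given for `site-admin.properties` in the usage line is an assumption; adjust it to wherever `site-admin.properties.sample` lives in your checkout.

```shell
# Succeed only if every given path is ignored by git in the current repository.
# all_ignored is an illustrative helper, not part of the repository.
all_ignored() {
  for path in "$@"; do
    # git check-ignore -q exits 0 when the path matches an ignore rule.
    git check-ignore -q "$path" || { echo "NOT ignored: $path" >&2; return 1; }
  done
}

# Usage (from the base directory): all_ignored .env site-admin.properties
```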
SSL/HTTPS can be enabled by putting an Apache proxy pass/reverse proxy pass on the host machine. Instructions for doing this are outside the scope of this project, but can be easily found on the web.
The CORS_ALLOWED_ORIGINS setting in the site-admin.properties file will need to be updated to allow https from the hostname (or host IP). See the site-admin.properties.sample file for details.
- Create the `DB_INIT` target directory on the host. If using Docker Desktop, make sure file sharing is enabled either for the target directory or its parent.
- Create the parent directory for the `DB_DATA` target (e.g., if `DB_DATA=/erythrondb/data/pgdata`, then `mkdir -p /erythrondb/data`). If using Docker Desktop, make sure file sharing is enabled for the parent directory.
- Fetch the database dump (`erythrondb.sql.gz`) and save it in the `DB_INIT` directory (dump available upon request).
- Build the database container and initialize the database by executing:
docker compose up -d db
or from outside of the base directory:
docker compose -f "<path to base_directory>/docker-compose.yaml" up -d --build db
NOTE: The database must finish initializing before starting the web container for the first time. This may take 10-30 minutes depending on the available resources on the host machine.
You can track the database initialization progress using the docker logs as follows:
docker compose logs --follow --timestamps | grep erythrondb-db
The log should report `database system is ready to accept connections` when the database is fully built. Again, this should be after about 15-20 minutes of executing commands from the `erythrondb.sql.gz` database dump. If the database build completes sooner (in a matter of seconds), it is likely that either:
- The `DB_DATA` directory already existed or was not empty. Remove the `DB_DATA` directory and try again.
- The `POSTGRES_INIT_USER` was changed. It must be set to `postgres`. Correct the setting and try again.
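To check the logs for the readiness message without watching them continuously, a small filter such as the one below may help. `ready_count` is a hypothetical helper that counts occurrences of the message in log text read from standard input. A count is used rather than a yes/no test because, if the database image follows the startup behaviour of the official `postgres` image (an assumption; this README does not say), the message may be logged once by a temporary server during initialization and again at final startup.

```shell
# Count how many times the readiness message appears in log text read from stdin.
# ready_count is an illustrative helper, not part of the repository.
ready_count() {
  # grep -c prints 0 and exits non-zero when nothing matches; keep the exit status clean.
  grep -c "database system is ready to accept connections" || true
}

# Usage (from the base directory; `db` is the service name used in this README):
#   docker compose logs db | ready_count
```

Combined with the timing guidance above, a non-zero count that appears within seconds of starting the container is a cue to review the two causes just listed.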
- Create the `TOMCAT_LOG` directory on the host. If using Docker Desktop, make sure file sharing is enabled either for the target directory or its parent.
- Build the website and start the tomcat application by executing:
docker compose up -d web
or from outside of the base directory:
docker compose -f "<path to base directory>/docker-compose.yaml" up -d --build web
If the erythrondb-website container has started successfully, you should be able to access the website from http://localhost:8080/ErythronDB. If you specified a custom TOMCAT_PORT in the .env file, substitute that value for 8080 in the URL.
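The URL check described above can also be done from the command line. This is a minimal sketch using `curl`; the helper names `site_url` and `site_status` are hypothetical, and the port argument stands in for whatever `TOMCAT_PORT` was set to in `.env`.

```shell
# Build the site URL for a given host port (default 8080, as in this README).
# site_url and site_status are illustrative helpers, not part of the repository.
site_url() {
  echo "http://localhost:${1:-8080}/ErythronDB"
}

# Print the final HTTP status code returned by the site, following redirects.
site_status() {
  curl -s -L -o /dev/null -w '%{http_code}' "$(site_url "$1")"
}

# Usage: site_status        # default port
#        site_status 9090   # custom TOMCAT_PORT
```

A `404` status corresponds to the configuration problem covered in the troubleshooting notes, and `000` indicates that nothing is listening on the port yet.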
- Build is taking a long time and appears to have hung.
  The docker build may take 30 minutes or more the first time on a system with limited resources. Allocating more RAM to `docker` (or to `WSL2` if using Windows) may speed things up.
- Website build fails during JavaScript bundling.
  More RAM is needed. A minimum of 4.5 GB of RAM is needed to build the website. Allocate more memory to `docker` (or to `WSL2` if using Windows) and try again.
- `Access denied` or other permissions errors when trying to access the tomcat logs on the host (`TOMCAT_LOG` directory)
  The log files are owned by the `root` user in the docker container. If you do not have `root` or `sudo` access on the host machine, run the following command on the host to change the permissions:
  docker exec -it erythrondb-web bash -c "chmod -R 777 /usr/local/tomcat/logs"
- `http://localhost:8080/ErythronDB` gives a `404` error (or `http://localhost:TOMCAT_PORT/ErythronDB`)
  This is most likely due to a problem with the site configuration.
  - Review `$TOMCAT_LOG/erythrondb/wdk.log4j` to determine the errors in the configuration file
  - Update `site-admin.properties`
  - Stop the `erythrondb-web` container and remove the associated image (see docker documentation for commands)
  - Run `docker system prune -a` to remove the build caches (see docker documentation for more specific commands)
  - Rebuild the container
- Something else?
  Please post a question on our issue tracker, including the following:
  - OS (Mac, Windows, Linux)
  - error message or docker log (run `docker logs erythrondb-db`), when relevant
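For the `404` troubleshooting steps above, the stop / prune / rebuild sequence might be scripted roughly as follows, once `site-admin.properties` has been corrected. This is a sketch under assumptions: `run` and `rebuild_web` are hypothetical helpers, `web` is the Compose service name used elsewhere in this document, and the sketch relies on `docker system prune -a` to remove the now-unused website image rather than removing it by name. Setting `DRY_RUN=1` prints the commands without executing them.

```shell
# Run a command, or only print it when DRY_RUN=1.
# run and rebuild_web are illustrative helpers, not part of the repository.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Stop and remove the web container, clear unused images and build cache, then rebuild.
rebuild_web() {
  run docker compose stop web
  run docker compose rm -f web
  run docker system prune -a
  run docker compose up -d --build web
}

# Usage (from the base directory): DRY_RUN=1 rebuild_web   # preview the commands first
```

Note that `docker system prune -a` asks for confirmation and acts on the whole host, not just this project: it also removes other stopped containers and unused images, so review what is present on the machine before running it.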