Merged
6 changes: 3 additions & 3 deletions README.md
@@ -10,7 +10,7 @@ This repository proposes a possible next step in the evolution of free-text data

**CogStack-NiFi** demonstrates how to use [Apache NiFi](https://nifi.apache.org/) as the central data workflow engine for clinical document processing, integrating services such as text extraction and natural language processing (NLP). Each component runs as a standalone service, with NiFi handling data routing between components and data sources/sinks.

-All NLP services are expected to implement a uniform RESTful API, allowing seamless integration into existing pipelines—making it easy to incorporate any NLP application into the stack.
+All NLP/ML/DATA services are expected to implement a uniform RESTful API, allowing seamless integration into existing pipelines—making it easy to incorporate any NLP application into the stack.

---

@@ -48,13 +48,13 @@ Need help? Feel free to:
**Prerequisites**:

- Docker (mandatory)
-- Basic knowledge of Python and Linux/UNIX systems
+- Basic knowledge of Python and Linux/UNIX systems (Bash (simple commands only, we promise))

📖 Official documentation: [cogstack-nifi.readthedocs.io](https://cogstack-nifi.readthedocs.io/en/latest/)

🚀 New to the project? Start with the [deployment guide](https://cogstack-nifi.readthedocs.io/en/latest/deploy/main.html) for example setups and workflows.

-🐞 For troubleshooting or bug reports, consult the [Known Issues section](https://cogstack-nifi.readthedocs.io/en/latest/deploy/main.html) before opening a ticket.
+🐞 For troubleshooting or bug reports, consult the [known issues section](https://cogstack-nifi.readthedocs.io/en/latest/deploy/troubleshooting.html) before opening a ticket.

---

4 changes: 2 additions & 2 deletions deploy/Makefile
@@ -73,7 +73,7 @@ start-medcat-service-deid:
$(WITH_ENV) docker compose -f ../services/cogstack-nlp/medcat-service/docker/docker-compose.yml $(DC_START_CMD) nlp-medcat-service-production-deid

start-medcat-trainer:
-$(WITH_ENV) docker compose -f../services/cogstack-nlp/medcat-trainer/docker-compose-prod.yml $(DC_START_CMD) medcattrainer nginx solr
+$(WITH_ENV) docker compose -f ../services/cogstack-nlp/medcat-trainer/docker-compose-prod.yml $(DC_START_CMD) medcattrainer nginx solr

start-production-db:
$(WITH_ENV) docker compose -f services.yml ${DC_START_CMD} cogstack-databank-db
@@ -136,7 +136,7 @@ stop-jupyter:
$(WITH_ENV) docker compose -f ../services/cogstack-jupyter-hub/docker/docker-compose.yml $(DC_STOP_CMD) cogstack-jupyter-hub

stop-medcat-trainer:
-$(WITH_ENV) docker compose -f../services/cogstack-nlp/medcat-trainer/docker-compose-prod.yml $(DC_STOP_CMD) medcattrainer nginx solr
+$(WITH_ENV) docker compose -f ../services/cogstack-nlp/medcat-trainer/docker-compose-prod.yml $(DC_STOP_CMD) medcattrainer nginx solr

stop-medcat-service:
$(WITH_ENV) docker compose -f ../services/cogstack-nlp/medcat-service/docker/docker-compose.yml $(DC_STOP_CMD) nlp-medcat-service-production
2 changes: 1 addition & 1 deletion deploy/services.yml
@@ -215,7 +215,7 @@ services:
- databank-vol:/var/lib/postgresql/data
command: postgres -c "max_connections=${POSTGRES_DB_MAX_CONNECTIONS:-100}"
ports:
-- 5556:5432
+- 5558:5432
expose:
- 5432
networks:
884 changes: 432 additions & 452 deletions docs/deploy/services.md

Large diffs are not rendered by default.

20 changes: 9 additions & 11 deletions docs/deploy/troubleshooting.md
@@ -1,4 +1,4 @@
-# Troubleshooting
+# 📛 Troubleshooting

Always start with fresh containers and volumes, to make sure that there are no volumes from previous experimentations, make sure to always delete all/any cogstack running containers by executing:

@@ -8,13 +8,11 @@ followed by a cleanup or dangling volumes (careful as this will remove all volum

`docker volume prune -f` <strong> WARNING THIS WILL DELETE ALL UNUSED VOLUMES ON YOUR MACHINE!</strong>. Check the volume names used in services.yml file and delete them as necessary `dockr volume rm volume_name`

-## Known Issues/errors
+## 🐞 Known Issues/errors

Common issues that can be encountered across services.
-<br>
-<br>

-### **Apple Silicon**
+### 🍎 **Apple Silicon**

Many services cannot run natively on Apple Silicon (such as M1 and M2 architectures). Common error messages related to Apple silicon follow patterns similar to:
<br /><br/>
@@ -24,7 +22,7 @@ Many services cannot run natively on Apple Silicon (such as M1 and M2 architectu
- `no matching manifest for linux/arm64/v8 in the manifest list entries`
<br /><br/>
<br /><br/>
-- `image with reference cogstacksystems/cogstack-ocr-service:0.2.4 was found but does not match the specified platform: wanted linux/arm64, actual: linux/amd64`
+- `image with reference cogstacksystems/cogstack-ocr-service:1.0.2 was found but does not match the specified platform: wanted linux/arm64, actual: linux/amd64`
<br /><br/>
To solve these issues; Rosetta is required and enabled in Docker Desktop. Finally an environment variable is required to be set.

@@ -42,7 +40,7 @@ export DOCKER_DEFAULT_PLATFORM=linux/amd64

to set the environment variable. These issues are known to occur on the "cogstack-nifi", "cogstack-ocr-services" and "jupyter-hub" services and may occur on others.

-### **NiFi**
+### 🔧 **NiFi**

When dealing with contaminated deployments ( containers using volumes from previous instances ) :
<br /><br/>
@@ -63,9 +61,9 @@ When dealing with contaminated deployments ( containers using volumes from previ
<br /><br/>
- `Unable to connect to ElasticSearch` using the `ElasticSearchClientService` NiFi controller, make sure the settings are correct (username,password,certificates, etc.) and then click `Apply`, disregard the errors and click `Enable` on the controller to forcefully reload the controller, stop it and then validate the settings, start it again after and it should work.

-### **Elasticsearch Errors**
+### 🛢️ **Elasticsearch Errors**

-#### **VM memory errors, failed bootstrap check**
+#### **VM memory errors, failed bootstrap check**

It is quite a common issue for both opensearch and native-ES to error out when it comes to virtual memory allocation, this error typically comes in the form of :

@@ -91,7 +89,7 @@ For more on this issue please read: https://www.elastic.co/guide/en/elasticsearc

<br>

-#### **OpenSearch: validating opensearch.yml hosts**
+#### 📄 **OpenSearch: validating opensearch.yml hosts**

```bash
FATAL Error: [config validation of [opensearch].hosts]: types that failed validation:
@@ -118,7 +116,7 @@ Alternatively (if the script executes without issues):
make start-elastic
```

-### DB-samples issues
+### 🗃️ DB-samples issues

```bash
No table data for samples_db
4 changes: 2 additions & 2 deletions docs/index.rst
@@ -12,12 +12,12 @@ Welcome to CogStack-Nifi's documentation!

main.md
news.md
-nifi/main.md
-security/main.md
deploy/main.md
deploy/deployment.md
deploy/troubleshooting.md
deploy/workflows.md
+nifi/main.md
+security/main.md

Indices and tables
==================
6 changes: 3 additions & 3 deletions docs/news.md
@@ -1,10 +1,10 @@
-# News
+# 📰 News

<strong>This document covers important news with regards to the components of CogStack as a whole, any major security issues or major changes that might break existing deployments are covered here along with how to handle them.</strong>
</br>
</br>

-## 13-12-2021 LOG4J Vulnerabity
+## 🛑 13-12-2021 LOG4J Vulnerabity

Since the discovery of the Log4J package vulnerability (https://www.ncsc.gov.uk/news/apache-log4j-vulnerability) it is necessary and recommended to update all existing deployments of CogStack.

@@ -22,7 +22,7 @@ For NiFI:
- re-pull (docker pull cogstacksystems/cogstack-nifi:latest)
- re-pull the tika image (docker pull cogstacksystems/tika-service:latest)

-## 01-10-2025 NiFi 2.0 Release
+## 🚀 01-10-2025 NiFi 2.0 Release

New version of NiFi along with the long awaited NiFi registry flow released:
