diff --git a/deploy/nifi.env b/deploy/nifi.env
index 9333b975f..94ec578d8 100644
--- a/deploy/nifi.env
+++ b/deploy/nifi.env
@@ -51,7 +51,7 @@ NIFI_PYTHON_EXTENSIONS_SOURCE_DIRECTORY_DEFAULT="/opt/nifi/nifi-current/python_e
# nifi.python.working.directory=/opt/nifi/user-scripts
NIFI_PYTHON_WORKING_DIRECTORY="/opt/nifi/user-scripts"
-LOG_LEVEL="ERROR"
+NIFI_LOG_LEVEL="ERROR"
NIFI_AUTH=tls
diff --git a/deploy/services-dev.yml b/deploy/services-dev.yml
index 14fb11977..24d944bd3 100644
--- a/deploy/services-dev.yml
+++ b/deploy/services-dev.yml
@@ -40,9 +40,6 @@ services:
- NIFI_SECURITY_DIR=${NIFI_SECURITY_DIR:-../security/nifi_certificates/}
- ELASTICSEARCH_SECURITY_DIR=${ELASTICSEARCH_SECURITY_DIR:-../security/es_certificates/}
volumes:
- # INFO: mapping custom development directory
- - ../nifi/devel:/opt/nifi/devel
-
# INFO: drivers folder
- ../nifi/drivers:/opt/nifi/drivers
diff --git a/deploy/services.yml b/deploy/services.yml
index ae8ff6f2b..788de3052 100644
--- a/deploy/services.yml
+++ b/deploy/services.yml
@@ -444,12 +444,9 @@ services:
- NIFI_OUTPUT_PORT=${NIFI_OUTPUT_PORT:-8082}
- NIFI_INPUT_SOCKET_PORT=${NIFI_INPUT_SOCKET_PORT:-10000}
volumes:
- # INFO: mapping custom development directory
- - ../nifi/devel:/opt/nifi/devel
-
# INFO: drivers folder
- ../nifi/drivers:/opt/nifi/drivers
-
+
# INFO: if there are local changes, map these content from local host to container
# (normally, these 3 directories below are bundled with our NiFi image)
# N.B. The container user may not have the permission to read these directories/files.
diff --git a/docs/main.md b/docs/main.md
index d78678c27..9ab7119c7 100644
--- a/docs/main.md
+++ b/docs/main.md
@@ -1,46 +1,2 @@
-# Introduction
-This repository proposes a possible next step for the free-text data processing capabilities implemented as [CogStack-Pipeline](https://github.com/CogStack/CogStack-Pipeline), shaping the solution more towards Platform-as-a-Service.
-CogStack-NiFi contains example recipes using [Apache NiFi](https://nifi.apache.org/) as the key data workflow engine with a set of services for documents processing with NLP.
-Each component implementing key functionality, such as Text Extraction or Natural Language Processing, runs as a service where the data routing between the components and data source/sink is handled by Apache NiFi.
-Moreover, NLP services are expected to implement an uniform RESTful API to enable easy plugging-in into existing document processing pipelines, making it possible to use any NLP application in the stack.
-
-## Development
-
-Please note that the project is under constant improvement, brining new features or services that might impact current deployments, please be aware as this might affect you, the user, when making upgrades, so be sure to check the release notes and the documentation beforehand.
-
-If you wish to contribute to the project, submit a pull request and we will review it.
-
-## Asking questions
-Feel free to ask questions on the github issue tracker or on our [discourse website](https://discourse.cogstack.org) which is frequently used by our development team!
-
-
-## Project organisation
-The project is organised in the following directories:
-- [`nifi`](https://github.com/CogStack/CogStack-NiFi/tree/main/nifi/) - custom Docker image of Apache NiFi with configuration files, drivers, example workflows and custom user resources.
-- [`security`](https://github.com/CogStack/CogStack-NiFi/tree/main/security/) - scripts to generate SSL keys and certificates for Apache NiFi and related services (when needed) with other security-related requirements.
-- [`services`](https://github.com/CogStack/CogStack-NiFi/tree/main/services/) - available services with their corresponding configuration files and resources.
-- [`deploy`](https://github.com/CogStack/CogStack-NiFi/tree/main/deploy/) - an example deployment of Apache NiFi with related services.
-- [`scripts`](https://github.com/CogStack/CogStack-NiFi/tree/main/scripts/) - helper scripts containing setup tools, sample ES ingestion, bash ingestion into DB samples etc.
-- [`data`](https://github.com/CogStack/CogStack-NiFi/tree/main/data/) - any data that you wish to ingest should be placed here.
-
-### Branches
-
-- main: main branch, production releases.
-- devel: this branch contains experimental/unstable docker images may cause irregular behaviour or crashes.
-
-## Documentation and getting started
-
-Knowledge requirements: Docker usage (mandatory), Python, Linux/UNIX understarting.
-
-Official documentation now available [here](https://cogstack-nifi.readthedocs.io/en/latest/).
-
-As a good starting point, [deployment](https://cogstack-nifi.readthedocs.io/en/latest/deploy/main.html) walks through an example deployment with some workflow examples.
-
-It is essential that a careful read through the [NiFi](https://cogstack-nifi.readthedocs.io/en/latest/nifi/main.html) section is done as it explains all the details of how NiFi is setup, the configuration and production setup tips.
-
-All issues are tracked in [README](https://cogstack-nifi.readthedocs.io/en/latest/deploy/main.html), check that section before opening a bug report ticket.
-
-## Important news and updates
-
-Please check [IMPORTANT_NEWS](https://cogstack-nifi.readthedocs.io/en/latest/news.html) for any major changes that might affect your deployment and security problems that have been discovered.
\ No newline at end of file
+```{include} ../README.md
diff --git a/docs/news.md b/docs/news.md
index 9a9add040..53f8de7d2 100644
--- a/docs/news.md
+++ b/docs/news.md
@@ -1,4 +1,5 @@
# News
+
This document covers important news with regards to the components of CogStack as a whole, any major security issues or major changes that might break existing deployments are covered here along with how to handle them.
@@ -10,12 +11,28 @@ Since the discovery of the Log4J package vulnerability (https://www.ncsc.gov.uk/
A summary of the steps needed to easily upgrade any CogStack components on an existing deployment:
For both instances (old and NiFI versions of the pipeline):
-
+
- make sure to update Elasticsearch to version 7.16.1+ if you are using the native version, if you are using OpenDistro it will be 1.13.3, and for OpenSearch it would be 1.2.1, all of these versions with their compose config can be found on the main branch of the NiFI repo, all that needs to be done is just a simple version change/increment in the docker-compose file (e.g https://github.com/CogStack/CogStack-NiFi/blob/main/deploy/services.yml , see the kibana/elasticsearch sections), followed by the pulling of the new images.
For the Old pipeline:
+
- re-pull the latest docker image (docker pull cogstacksystems/cogstack-pipeline:latest)
For NiFI:
+
- re-pull (docker pull cogstacksystems/cogstack-nifi:latest)
- re-pull the tika image (docker pull cogstacksystems/tika-service:latest)
-
\ No newline at end of file
+
+## 01-10-2025 NiFi 2.0 Release
+
+A new version of NiFi has been released, along with the long-awaited NiFi Registry flow:
+
+ - massive repository structure changes
+ - new scripts and cert generation
+ - new NiFi processors and a revision of old scripts
+ - restructured services and references to them
+
+Things to consider before upgrading:
+
+ - PLEASE back up a copy of your current NiFi repository before upgrading.
+ - PLEASE BE AWARE THAT THE TEMPLATES FROM NIFI 1.0 ARE NOT COMPATIBLE WITH NIFI 2.0. Back them up!
+ - If you wish to migrate the NiFi templates, you will have to keep multiple versions operational.
diff --git a/docs/requirements.txt b/docs/requirements.txt
index 6bdae63c3..94bf3cc6f 100644
--- a/docs/requirements.txt
+++ b/docs/requirements.txt
@@ -1,3 +1,3 @@
-Sphinx~=5.0
-sphinx-rtd-theme~=1.0
-myst-parser~=0.17
\ No newline at end of file
+Sphinx==8.2.3
+sphinx-rtd-theme==3.0.2
+myst-parser==4.0.1
\ No newline at end of file
diff --git a/docs/security.md b/docs/security.md
deleted file mode 100644
index fc63d8ba2..000000000
--- a/docs/security.md
+++ /dev/null
@@ -1,321 +0,0 @@
-# Security
-
-In [the example deployment](deploy/services.md), for the ease of deployment and demo purposes, all the services have SSL security disabled and are using the default built-in users with passwords.
-
-With NiFi 1.15+ HTTPS is enforced, this requires users to generate their own certificates. Some default publicly availble certificates are available in this repo as part of the demo but users should ALWAYS generate their own in production environment setups.
-
-The Elasticsearch instances are now setup also with certificates, mainly cause this would most likely always be a requirement as part of a production deployment.
-
-**IMPORTANT:
-Please note that the actual security configuration will depend on the requirements of the user/team/organisation planning to use the services stack.
-The information provided in this README hence should be only considered as a hint and consulted with the key stakeholders before considering any production use.**
-
-## Directory structure
-
-```
-./security
-├── certificates_elasticsearch.env <---- env vars for ES certificates, same vars are used for both native ES & Opensearch
-├── certificates_general.env <---------- root-ca env vars
-├── certificates_nifi.env <------------- NiFi cert env vars
-├── create_es_native_certs.sh <--------- Use this to create certificates for Elasticsearch Native (NOT FOR OPENSEARCH!)
-├── create_es_native_credentials.sh <--- Use this after starting up the ES containers to create the base users for ES (NOT FOR OPENSEARCH!)
-├── create_keystore.sh <---------------- Used for opensearch node cert generation
-├── create_opensearch_admin_cert.sh <-- Admin certs for Opensearch Kibana
-├── create_opensearch_client_cert.sh <-- Generates certificates for client apps to access ES
-├── create_opensearch_internal_passwords.sh <- Optional way of generating passwords for Opensearch Admin & Kibana accounts
-├── create_opensearch_node_cert.sh <---- Use this to create certificates for the OpenSearch ES nodes
-├── create_opensearch_users.sh <-------- Script to set up users for Opensearch after start-up, needs manual execution.
-├── create_root_ca_cert.sh <------------ Script for generating root CA, used for NiFi/OpenSearch/Jupyterhub/OCR service
-├── database_users.env <---------------- DB users env vars, for both production and samples DB
-├── elasticsearch_users.env <----------- OpenSearch/ES native users, used in 'deploy/services.yml' and 'elasticsearch.yml' files for Kibana/ES and 'metricbeat.yml'
-├── es_certificates <------------------- This is where OpenSearch/Elasticsearch certificates will go once generated.
-├── es_native_cert_generator.sh <------- This is the script used to generate native ES certificates (NOT for Opensearch), used in create_es_native_credentials.sh
-├── es_roles <-------------------------- This folder stores Elasticsearch native/Opensearch account roles and role_mappings.
-├── nginx_users.env <------------------- Nginx users
-├── nifi_certificates <----------------- Location of NiFi cerficiates post-generation.
-├── nifi_init_create_user_auth.sh <----- Script used to start the NiFi container for singler user account creation
-├── nifi_create_single_user_auth.sh <--- Script used create single user credentials for NiFi (executed inside the container)
-├── nifi_toolkit_security.sh <---------- Script for generating NiFi certificates
-├── root-ca-truststore.key <------------ all `root-ca` files are generated by the `create_root_ca_cert.sh` script
-├── root-ca.key <------------------------|
-├── root-ca.keystore.jks <---------------|
-├── root-ca.p12 <------------------------|
-├── root-ca.pem <------------------------|
-├── root-ca.srl <------------------------|
-└── ssl-extensions-x509.cnf <----------- x509 settings used in OpenSearch admin cert and node cert script(s)
-```
-
-The `.env` files are used to define local env variables that are used in the services.yml file and for certificate generation.
-The ones that are used and should be modified depending on the deployment are:
- - `certificates_nifi.env` - nifi certificates vars
- - `certificates_elasticsearch.env` - ES certificate definitions, an important bit here are the ES_INSTANCE_NAME_1/2/3 vars, which control the location of the certificates in the `services.yml` file and also the location of the certificates in the `es_certificates` folder.
- - `database_users.env` - production and sample DB users, the user should be changed for a production environment
- - `elasticsearch_users.env` - all users used for ES native and OpenSearch deployments are declared here.
-
-## IMPORTANT NOTE
-
- IMPORTANT: RUN EVERY TIME YOU UPDATE ANY SECURITY ENV VARIABLES.
-
-Assuming you are in the `security` folder:
-1. run `cd ../deploy`
-2. run `source export_env_vars.sh` <-- needed to set the env vars if you have modified them in the above files.
-3. run `cd ../security`
-
-## Generation of self-signed certificates
-
-Assuming that one needs to generate self-signed certificates for the services, there are provided some useful scripts:
-
-- `create_root_ca_cert.sh` - creates root CA key and certificate, used for NiFi, MedCAT service, Jupyterhub, ocr-service etc.
-- `create_opensearch_client_cert.sh` - creates the client key and certificate for external apps
-- `create_keystore.sh` - creates the JKS keystore using previously generated (client) certificates, used in `create_opensearch_node_cert.sh`
-- `create_opensearch_users.sh` - creates system users for OpenSearch instances, to be used after finishing the container startup(s)
-- `create_opensearch_admin_cert.sh` - creates certs for OpenSearch Dashboard (Kibana)
-- `create_opensearch_node_cert.sh` - creates certificates for OpenSearch nodes
-- `create_es_native_certs.sh` - creates certificates for pure Elasticsearch (ES native) nodes only
-
-### Root CA
-
-Using `create_root_ca_cert.sh` the key files that are generated are:
-
-- key: `root-ca.key`
-- certificate: `root-ca.pem`
-- keystore: `root-ca.keystore.jks`
-- p12 cert: `root-ca.p12`
-- pem cert: `root-ca.pem`
-
-### Generating the base certificates for NiFi/Nginx/JupyterHub/OCR-service/Tika/MedCAT service certificates
-
-Configure certificate settings for NiFi in [certificates_nifi.env](../security/certificates_nifi.env) and for the root CA in [certificates_general.env](../security/certificates_general.env).
-
-Assuming you are in the `security` folder:
-
-1. run `cd ../deploy`
-2. run `source export_env_vars.sh` <-- needed to set the env vars if you have modified them in the above files.
-3. run `cd ../security`
-4. run `bash create_root_ca_cert.sh`
-5. run `bash nifi_toolkit_security.sh`
-
-You must run them in the above order as the root CA is required by the NiFi toolkit.
-
-## ELK stack
-
-Follow the instructions carefully, there are a few sections detailing the differences between Elastic versions.
-
-### Generating Elasticsearch native/OpenSearch + KIBANA/OpenSearch Dashboard CERTS
-
-### Elasticsearch/OpenSearch Security Requirements
-
-Each version has it's own scripts for generating the necessary certificates.
-All security variables used within the `.sh` scripts for `CERTIFICATE GENERATION` are set in the following files:
-
-- `./certificates_elasticsearch.env`
-- `./certificates_general.env`
-- `./certificates_nifi.env`
-
-Please pay attention to the following sections, they describe what is needed to secure each version of ES deployments(Opensearch/Native ES).
-
-#### Common certificates used for all ES types
-
-Certificate namings are now common across ES versions, the deployment requires the following certificates, available in the [security](security/) folder:
-
-- `es_certificates/${ELASTICSEARCH_VERSION}/elasticsearch/elastic-stack-ca.crt.pem`
-- `es_certificates/${ELASTICSEARCH_VERSION}/elasticsearch/elastic-stack-ca.key.pem`
-- `es_certificates/${ELASTICSEARCH_VERSION}/elasticsearch/elasticsearch-1/http-elasticsearch-1.crt`
-- `es_certificates/${ELASTICSEARCH_VERSION}/elasticsearch/elasticsearch-1/http-elasticsearch-1.p12`
-- `es_certificates/${ELASTICSEARCH_VERSION}/elasticsearch/elasticsearch-1/http-elasticsearch-1.key`
-- `es_certificates/${ELASTICSEARCH_VERSION}/elasticsearch/elasticsearch-2/http-elasticsearch-2.crt`
-- `es_certificates/${ELASTICSEARCH_VERSION}/elasticsearch/elasticsearch-2/http-elasticsearch-2.key`
-- `es_certificates/${ELASTICSEARCH_VERSION}/elasticsearch/elasticsearch-2/http-elasticsearch-2.p12`
-- `es_certificates/${ELASTICSEARCH_VERSION}/elasticsearch/elasticsearch-3/http-elasticsearch-3.crt`
-- `es_certificates/${ELASTICSEARCH_VERSION}/elasticsearch/elasticsearch-3/http-elasticsearch-3.p12`
-- `es_certificates/${ELASTICSEARCH_VERSION}/elasticsearch/elasticsearch-3/http-elasticsearch-3.key`
-
-The `${ELASTICSEARCH_VERSION}` MUST be set in the `deploy/elastiscsearch.env` before starting any container! it will also mount all the certificates seaminglessly according to the ES version, for native ES the certificate files are in `security/es_certicates/elasticsearch`, OpenSearch variant in `security/es_certificates/opensearch/`.
-
-#### IMPORTANT NOTE: the `es_certifcates` folder is mounted inside NiFi so that you can load certificates seamlessly without the need to restart the NiFi service.
-
-
-
-### For OpenSearch
-
-For information on OpenSearch security features and their configuration please refer to [the official documentation](https://opensearch.org/docs/latest/security-plugin/index/).
-
-We have to make sure to execute the following commands `bash ./create_opensearch_node_cert.sh elasticsearch-1 && bash ./create_opensearch_node_cert.sh elasticsearch-2 && bash ./create_opensearch_node_cert.sh elasticsearch-3` this will generate the certificates for all 3 nodes, make sure to generate the ADMIN authorization certificate by doing `bash ./create_opensearch_admin_cert.sh`.
-
-The keystore/truststore certificates are also generated when creating the node certificates, these are used in the NiFi workflows.
-
-
-
-### For Elasticsearch Native
-
-We also provide as part of our deployment the native Elastisearch version since it is used across many organisations in production environments [documentation](https://www.elastic.co/).
-Please note that the deployment of native ES version requires different settings to be changed from the current repository state.
-
-To generate the above certificates all that is needed is to run the [`create_es_native_certs.sh`](../security/create_es_native_certs.sh).
-
-
-
-There are a few variables related to the certificate names, pleas read the following carefully:
-
-- `ES_INSTANCE_NAME_1`, this variable is usually set to the same name as `ELASTICSEARCH_NODE_1_NAME` from `/deploy/elasticsearch.env`, it is used to determine the certificate paths, and also in the certificate hostname SUBJ lines, there are two other vars with the same name aside from the numbering for each node.
-- `ES_INSTANCE_ALTERNATIVE_1_NAME`, this is used along with `ES_INSTANCE_NAME_1` to provide additional hostnames forr the certificate generation, also useful incase the node name is different from the elastic search hostname.
-- `ES_HOSTNAMES`, set all your hostnames here, they should include the names of the nodes and also additional hostnames & DNS-es, please follow the exact indentation as it is in the `.env` file. If it does not work, then manually do :
- `export ES_HOSTNAMES="- elasticsearch-1`
-
`- elasticsearch-2`
-
`- elasticsearch-3`
-`"`
-- `ES_CLIENT_SUBJ_ALT_NAMES` and `ES_NODE_SUBJ_ALT_NAMES`, set these with additional domain names as needed, both client and node should have the nodes and the kibana hostname instances added.
-
-### Kibana
-
-- `es_certificates/${ELASTICSEARCH_VERSION}/elasticsearch/elasticsearch-1/elasticsearch-1.crt`
-- `es_certificates/${ELASTICSEARCH_VERSION}/elasticsearch/elasticsearch-1/elasticsearch-1.key`
-- `es_certificates/${ELASTICSEARCH_VERSION}/elasticsearch/elasticsearch-2/elasticsearch-2.crt`
-- `es_certificates/${ELASTICSEARCH_VERSION}/elasticsearch/elasticsearch-2/elasticsearch-2.key`
-- `es_certificates/${ELASTICSEARCH_VERSION}/elasticsearch/elasticsearch-3/elasticsearch-3.crt`
-- `es_certificates/${ELASTICSEARCH_VERSION}/elasticsearch/elasticsearch-3/elasticsearch-3.key`
-- `es_certificates/ca/ca.crt`
-
-These certificates are generates by the steps mentioned in the above Elasticsearch Native section.
-
-
-
-### OpenDashboard (OpenSearch version of Kibana)
-
- OpenDashboard requires:
-
-- `admin.pem`
-- `admin-key.pem`
-- `es_kibana_client.pem`
-- `es_kibana_client.key`
-- like the Kibana section above, all the certificates are used under the same names, of course, they will come from `es_certificates/opensearch/` folder.
-
-Once generated, the files can be further referenced in `services/kibana/config/kibana_opensearch.yml` and/or linked directly in the Docker compose file with services configuration.
-
-
-
-### Users and roles in ElasticSearch/OpenSearch
-
-### Generating users
-
-### Users and passwords enironment variables
-
-The sample users and passwords are specified in the following `.env` files in `security/` directory:
-
-- `elasticsearch_users.env` - contains passwords for ElasticSearch internal users.
-- `database_users.env` - containes account details for both production and samples DB instances
-- `nginx_users.env` - nginx account
-
-#### Setting up OpenSearch
-
-Please see the `security/opensearch` folder for the roles mappings and internal users for user data. You can also use the `create_es_users.sh` script for this.
-
-On the first run, after changing the default passwords, one should change the default `admin` and `kibanaserver` passwords as specified in the [OpenSearch documentation](https://opensearch.org/docs/latest/security-plugin/access-control/users-roles/).
-
-To do so, one can:
-
-- run the script `generate_opensearch_internal_passwords.sh` to generate hashes,
-- modify the `internal_users.yml` file with the generated hashes,
-- restart the stack, but with using `docker-compose down -v` to remove the volume data.
-
-Following, one should modify the default passwords for the other build-in users (`logstash`, `kibanaro`, `readall`, `snapshotrestore`) and to create custom users (`cogstack_pipeline`, `cogstack_user`, `nifi`), as specified below.
-The script `create_es_users.sh` creates and sets up example users and roles in ElasticSearch cluster.
-
-#### Setting up Elasticsearch
-
-For configuring default users, please see the following env files:
-
-- `./elasticsearch_users.env` which is used in the `create_es_native_credentials.sh` script post ES container startup, it creates all the default users. If you wish to add more users make sure to take a look at the official documentation on how to create roles and accounts.
-- This script also creates a `SERVICE ACCOUNT TOKEN` which can be used for Kibana configuration. Please copy the token manually into the `elasticsearch.env` `ELASTICSEARCH_SERVICE_ACCOUNT_TOKEN` variable.
-
-### New roles
-
-Example new roles that will be created after running `create_es_users.sh`:
-
-- `ingest` - used for data ingestion, only `cogstack_*` and `nifi_*` indices can be used,
-- `cogstack_accesss` - used for read-only access to the data only from `cogstack_*` and `nifi_*` indices.
-
-### New users
-
-Example new users will be created after running `create_es_users.sh`:
-
-- `cogstack_pipeline` - uses `ingest` role (deprecated),
-- `nifi` - uses `ingest` role,
-- `cogstack_user` - uses `cogstack_access` role.
-
-## JupyterHub
-
-Similarly, as in case of ELK stack, one should obtain certificates for JupyterHub to secure the access to the exposed endpoint.
-The generated certificates (by `create_root_ca_cert.sh`) can be referenced directly in `services.yml` file in the example deployment or directly in the internal JupyterHub configuration file.
-The COOKIE secret is a key used to encrypt browser cookies, please use the [`generate_cookie_secret.sh`](../services/jupyter-hub/scripts/generate_cookie_secret.sh) script to generate a new key, make sure it is done before starting the container.
-
-One should also configure and set up users, since the default user is `admin`, and the password is set the first time the account is logged in to (be careful, if there is a mistake delete the jupyter container and its volumes and restart).
-See example deployment [services](deploy/services.md) for more details.
-
-Once the container is started up you can create your users and also assing them to groups.
-
-You can create users before hand by adding newlines in the `userlist`(services/jupyter-hub/config/userlist) file, users with admin roles will need to have their role specificed on the same line, e.g: `user_name admin`.
-
-If you want to create shared folder for users to use add them to the `teamlist`(services/jupyter-hub/config/teamlist) file, the first column is the shared folder name and the rest are just the usernames assigned to it.
-
-For more information on JupyterHub security features and their configuration please refer to [the official documentation](https://jupyterhub.readthedocs.io/en/stable/getting-started/security-basics.html).
-
-## Apache NiFi
-
-For securing Apache NiFi endpoint with self-signed certificates please refer to [the official documentation](https://nifi.apache.org/docs/nifi-docs/html/walkthroughs.html#securing-nifi-with-provided-certificates).
-
-Regarding connecting to services that use self-signed certificates (such as Elasticsearch), it is required that these certificates use JKS keystore format.
-The certificates can be generated using `create_keystore.sh`. Usage: bash create_keystore.sh | the password is optional.
-
-Before starting the NIFI container it's important to take note of the following things if we wish to enable HTTPS functionality:
-
-- this step is optional (as you might have done it before from configuring other certificates), run `create_root_ca_cert.sh` to create the ROOT certificates, these will be used by NiFi/OpenSearch/OCR_service/Tika/MedcatService/Jupyterhub etc.
-
-- the `nifi_toolkit_security.sh` script is used to download the nifi toolkit and generate new certificates and keys that are used by the container, take note that inside the `localhost` folder there is another nifi.properties file that is generated, we must look to the following setttings which are generated randomly and copy them to the `nifi/conf/nifi.properties` file.
-- the trust/store keys generated for production will be in the `nifi_certificates/localhost` folder and the `nifi-cert.pem` + `nifi-key.key` files. in the baes `nifi_certificates` folder.
-
-- as port of the security process the `nifi.sensitive.props.key` should be set to a random string or a password of minimum 12 characters. Once this is set do NOT modify it as all the other sensitive passwords will be hashed with this string. By default this is set to ```cogstackNiFipass```
-
-Example:
-
-```
- nifi.security.keystorePasswd=ZFD4i4UDvod8++XwWzTg+3J6WJF6DRSZO33lbb7hAgc
- nifi.security.keyPasswd=ZFD4i4UDvod8++XwWzTg+3J6WJF6DRSZO33lbb7hAgc
- nifi.security.truststore=./conf/truststore.jks
- nifi.security.truststoreType=jks
- nifi.security.truststorePasswd=lzMGadNB1JXQjgQEnFStLiNkJ6Wbbgw0bFdCTICKtKo
-```
-
-### Setting up access via user account (SINGLE USER CREDETIAL)
-
-This is entirely optional, if you have configered the security certs as described in ```security/README.md``` then you are good to go.
-
-Default username :
-
-
-```
-username: admin
-password: cogstackNiFi
-```
-
-- the `login-identity-providers.xml` file in `/nifi/conf/` stores the password for the user account, to generate a password one must use the following command within the container : `/opt/nifi/nifi-current/bin/nifi.sh set-single-user-credentials USERNAME PASSWORD`, once done, you would need to copy the file from `/opt/nifi/nifi-current/conf/login-identity-providers.xml` locally with docker cp and replace the one in the `nifi/conf` folder and rebuild the container.
-
-- alternative to the above step: go into the `/security` folder, set the desired nifi username & password in the `/security/nifi_users.env` file. Make sure to STOP any running NiFi containers `docker stop cogstack-nifi` and execute the following script: `bash /security/nifi_init_create_user_auth.sh`, this script will start a NiFi container for the time of the account creation and then remove itself, after it finishes, go back to the `/deploy` folder and start your NiFi container, all should be working!
-
-URL:
-
-Troubleshooting Security : if you encounter errors related to sensitive key properties not being set please clear/delete the docker volumes of the nifi container or delete all volumes of inactive containers `docker volume prune`.
-
-### Disabling the login screen
-
-If for some reason you do not wish to authenticate every time you connect to NiFi, you can enable the client certificates in the [nginx.conf](../services/nginx/config/nginx.conf) line 86-87 and delete the commented lines.
-
-## `nifi-nginx`
-
-Alternatively, one can secure the access to selected services by using NGINX reverse proxy.
-This may be essential in case some of the web services that need to be exposed to end-users do not offer SSL encryption.
-See [the official documentation](https://docs.nginx.com/nginx/admin-guide/security-controls/securing-http-traffic-upstream/) for more details on using NGINX for that.
-
-Nginx only requires the root-CA certificate by default, so use the above [generate cert](#generating-the-base-certificates-for-nifinginxjupyterhubocr-servicetikamedcat-service-certificates) section to create it.
-
-In order to be able to properly access the nifi instance securely, you also need to start the nifi-nginx container as it is configured to provide access from any source to nifi, available at .
diff --git a/docs/security/certificates.md b/docs/security/certificates.md
new file mode 100644
index 000000000..c10267ce0
--- /dev/null
+++ b/docs/security/certificates.md
@@ -0,0 +1,210 @@
+# Certificates and Root CA
+
+This section describes the full structure of the `security/certificates/` directory and explains how certificates are generated, organized, and used across CogStack-NiFi services.
+
+All certificates originate from the **Root Certificate Authority (CA)**, generated via `create_root_ca_cert.sh`.
+
+This Root CA signs all service certificates (NiFi, OpenSearch, Kibana, JupyterHub, Gitea, etc.), ensuring consistent trust across the stack. The exception is native Elasticsearch, which uses Elastic's built-in certificate generation scripts instead.
+
+---
+
+## 📂 Directory structure
+
+```text
+security/
+└── certificates/
+ ├── elastic/ # Certificates for Elasticsearch / OpenSearch clusters
+ │ ├── elasticsearch/ # Native Elasticsearch certificates
+ │ │ ├── elastic-stack-ca.* # CA for Elasticsearch (self-signed or derived from root)
+ │ │ ├── elasticsearch/ # Node certificates for Elasticsearch instances
+ │ │ │ ├── elasticsearch-1,2,3/ and *-dev/ variants
+ │ │ │ │ ├── *.crt, *.key, *.p12 # Node certs for each instance
+ │ │ │ │ ├── http-elasticsearch-*.csr/key # HTTP service certs for HTTPS APIs
+ │ │ │ │ ├── sample-elasticsearch.yml # Example ES configuration
+ │ │ │ │ └── README.txt # Node-level info
+ │ │ ├── elasticsearch-ssl-http.zip # Bundled certs for HTTP layer
+ │ │ ├── es_native_certs_bundle*.zip # Bundled native ES certs
+ │ │ ├── instances.yml # Defines node names and hostnames
+ │ │ └── kibana/ # Certificates for Kibana dashboard
+ │ │ ├── sample-kibana.yml
+ │ │ └── README.txt
+ │ │
+ │ └── opensearch/ # OpenSearch and OpenSearch Dashboard certs
+ │ ├── admin.*, es_kibana_client.*, root-ca.* # Admin + dashboard + CA certs
+ │ ├── elasticsearch/ # Node certs for OpenSearch nodes
+ │ │ ├── elasticsearch-{1,2,3}/ # Per-node certs, keystore/truststore
+ │ │ │ ├── *.crt, *.key, *.p12, *.csr
+ │ │ │ ├── elasticsearch-*-keystore.jks # Keystores for OpenSearch nodes
+ │ │ │ ├── elasticsearch-*-truststore.key # Truststores
+ │ │ │ └── http-elasticsearch-*.csr/key # HTTP layer certs
+ │ ├── es_kibana_client.{pem,key,p12,csr} # Kibana client certs
+ │ ├── elastic-stack-ca.* # OpenSearch cluster CA
+ │ └── root-ca.* # Root CA reference for OpenSearch
+ │
+ ├── nifi/ # NiFi HTTPS and toolkit certificates
+ │ ├── nifi.{crt,key,p12,pem,csr} # Primary NiFi node certificates
+ │ ├── nifi-keystore.jks # Java keystore for NiFi server
+ │ ├── nifi-truststore.jks # Truststore for verifying other services
+ │
+ └── root/ # Root Certificate Authority (CA)
+ ├── root-ca.key, root-ca.pem # Private key and public cert
+ ├── root-ca.p12, root-ca.keystore.jks # PKCS#12 and Java Keystore formats
+ ├── root-ca-truststore.jks # Truststore for client-side verification
+ └── root-ca.csr, root-ca.srl # Certificate signing request and serial
+```
+
+---
+
+## ⚙️ Environment configuration
+
+All certificate-generation scripts source variables from `.env` files under `security/env/`:
+
+| File | Description |
+|------|--------------|
+| `certificates_general.env` | Global Root CA options (CN, expiry, key size). |
+| `certificates_elasticsearch.env` | Node names, SAN hostnames, version control for ES/OS. |
+| `certificates_nifi.env` | NiFi keystore/truststore names and passwords. |
+| `users_*.env` | Default credentials used by generation scripts. |
+
+## 📜 openssl-x509.conf
+
+Set up a reusable certificate config to define the SANs and subject. It is used by all services except native Elasticsearch.
+Feel free to add custom DNS entries.
+Note that the settings here affect services that rely on Distinguished Name (DN) attributes for authentication, such as NiFi Registry Flow.
+
+```ini
+# =========================================================================================
+# 📜 OpenSSL X.509 v3 Extensions Configuration
+# For: Root CA and Node/Client Certificates
+# =========================================================================================
+
+[v3_ca]
+subjectKeyIdentifier = hash
+authorityKeyIdentifier = keyid:always,issuer
+basicConstraints = critical, CA:TRUE
+keyUsage = critical, keyCertSign, cRLSign
+subjectAltName=DNS:nifi,DNS:elasticsearch-1,DNS:elasticsearch-2,DNS:elasticsearch-3,DNS:cogstack,DNS:*.cogstack
+
+[v3_leaf]
+basicConstraints = critical, CA:FALSE
+keyUsage = critical, digitalSignature, keyEncipherment
+extendedKeyUsage = serverAuth, clientAuth
+subjectAltName = @alt_names
+subjectKeyIdentifier = hash
+authorityKeyIdentifier = keyid,issuer
+
+[alt_names]
+DNS.1 = nifi
+DNS.2 = nifi-registry-flow
+DNS.3 = nifi-registry
+
+DNS.4 = nifi-nginx
+DNS.5 = elasticsearch-1
+DNS.6 = elasticsearch-2
+DNS.7 = elasticsearch-3
+DNS.8 = ocr-service
+DNS.9 = ocr-service-text-only
+DNS.10 = medcat-trainer-nginx
+DNS.11 = medcat-trainer-ui
+DNS.12 = nlp-medcat-service-production
+DNS.13 = nlp-medcat-service-production-deid
+DNS.14 = cogstack-kibana
+DNS.15 = cogstack-cohort
+DNS.16 = cogstack-elasticsearch-1
+DNS.17 = cogstack-elasticsearch-2
+DNS.18 = cogstack-elasticsearch-3
+DNS.19 = cogstack-nifi
+DNS.20 = cogstack-nifi-nginx
+DNS.21 = cogstack-nifi-registry-flow
+DNS.22 = cogstack-auth-service
+DNS.23 = cogstack
+DNS.24 = *.cogstack
+DNS.25 = localhost
+IP.1 = 127.0.0.1
+email.1 = admin@cogstack.net
+
+[req]
+default_bits = 4096
+string_mask = utf8only
+prompt = no
+distinguished_name = req_distinguished_name
+x509_extensions = v3_leaf
+default_md = sha256
+
+[req_distinguished_name]
+C = UK
+ST = London
+L = UK
+O = cogstack
+OU = cogstack
+CN = cogstack
+```
+
+> 💡 **Tip:**
+> Always reload environment variables before running any script:
+> ```bash
+> cd ../deploy
+> source export_env_vars.sh
+> cd ../security
+> ```
+> or manually if you just want to test out one file:
+> ```bash
+> source file.env
+> ```
+
+---
+
+## 🛠️ Generation workflow
+
+1. **Generate Root CA**
+
+ ```bash
+ cd security/scripts
+ bash create_root_ca_cert.sh
+ ```
+
+2. **Generate service certificates**
+
+ ```bash
+ # Elasticsearch
+ bash create_es_native_certs.sh
+
+ # OpenSearch
+ bash create_opensearch_node_cert.sh elasticsearch-1 elasticsearch-2 elasticsearch-3
+
+ # Kibana / Dashboards
+ bash create_opensearch_client_admin_certs.sh
+
+ # NiFi (only needed for NiFi versions < 2.0; not needed as of version 2.0+).
+ # Make sure to change the NIFI_TOOLKIT_VERSION env var in ../deploy/nifi.env first.
+ bash nifi_toolkit_security.sh
+
+ ```
+
+3. **(Optional) Create custom JKS keystores**
+
+ ```bash
+ bash create_keystore.sh mycert.pem mystore.jks mypassword
+ ```
+
+4. **Re-export environment variables and restart services**
+
+ ```bash
+ cd ../deploy
+ source export_env_vars.sh
+ make start-
+ ```
+
+---
+
+## 🧠 Best practices
+
+- **Do not commit** private keys (`*.key`, `*.p12`, `*.jks`) to version control.
+- **Back up** the Root CA files securely — they’re your trust anchor.
+- **Rotate** certificates regularly (every 2 years) or whenever hostnames change.
+- **Use unique CN/SANs** per environment (`dev`, `staging`, `prod`).
+- **Verify** certificate chains before deployment, e.g.:
+
+```bash
+ openssl verify -CAfile security/certificates/root/root-ca.pem security/certificates/elastic/opensearch/elasticsearch/elasticsearch-1/elasticsearch-1.crt
+```
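
The chain check above can also be scripted, for example from a deployment helper. The following is a minimal sketch and not part of the repository: the function names `build_verify_command` and `verify_chain` are hypothetical, and it assumes that the `openssl` binary is on the `PATH` and that a non-zero exit status signals a failed verification.

```python
import subprocess


def build_verify_command(ca_file: str, cert_file: str) -> list[str]:
    """Build the same `openssl verify` invocation as the shell example."""
    return ["openssl", "verify", "-CAfile", ca_file, cert_file]


def verify_chain(ca_file: str, cert_file: str) -> bool:
    """Return True if openssl reports the certificate chains to the CA.

    Assumption: openssl exits with a non-zero status when verification fails.
    """
    result = subprocess.run(
        build_verify_command(ca_file, cert_file),
        capture_output=True,
        text=True,
        check=False,  # inspect the return code instead of raising
    )
    return result.returncode == 0


if __name__ == "__main__":
    # Paths taken from the directory structure documented above.
    ok = verify_chain(
        "security/certificates/root/root-ca.pem",
        "security/certificates/elastic/opensearch/elasticsearch/"
        "elasticsearch-1/elasticsearch-1.crt",
    )
    print("chain OK" if ok else "chain verification FAILED")
```

Keeping the command construction separate from its execution makes it possible to check the exact invocation without having certificates on disk.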
diff --git a/docs/security/main.md b/docs/security/main.md
new file mode 100644
index 000000000..6d62b5433
--- /dev/null
+++ b/docs/security/main.md
@@ -0,0 +1,82 @@
+# Security Overview
+
+All core CogStack-NiFi services — including **NiFi**, **Elasticsearch/OpenSearch**, **Kibana/OpenSearch Dashboards**, **JupyterHub**, **NGINX** and **Gitea** — are now deployed with **HTTPS enabled by default**.
+Each component is provisioned with its own X.509 certificates issued by the shared root CA generated via the `create_root_ca_cert.sh` script.
+
+This ensures full end-to-end encryption across the stack for essential operations, including service-to-service communication and user-facing endpoints.
+
+Security is achieved through:
+
+- A unified **root Certificate Authority (CA)**,
+- Per-service certificate generation and signing scripts,
+- Environment variable management for secrets and credentials, and
+- Optional reverse-proxy enforcement via **NGINX**.
+
+> ⚠️ **Important:** Always generate unique certificates and credentials for each deployment.
+> The repository provides sample certificates for demonstration only.
+
+## Components secured with HTTPS
+
+| Service | HTTPS/TLS Enabled | Certificate Location | Script(s) Used |
+|----------|------------------|----------------------|----------------|
+| NiFi | ✅ | `security/certificates/nifi/` | `nifi_toolkit_security.sh` |
+| NiFi Registry Flow | ✅ | `security/certificates/nifi/` | `nifi_toolkit_security.sh` |
+| Elasticsearch / OpenSearch | ✅ | `security/certificates/elastic/(elasticsearch or opensearch)/` | `create_es_native_certs.sh`, `create_opensearch_node_cert.sh` |
+| Kibana / OpenSearch Dashboards | ✅ | `security/certificates/elastic/(elasticsearch or opensearch)/` | `create_opensearch_client_admin_certs.sh` |
+| JupyterHub | ✅ | `security/certificates/root/` | `create_root_ca_cert.sh` |
+| Gitea | ✅ | `security/certificates/root/` | `create_root_ca_cert.sh` |
+| NGINX | ✅ | `security/certificates/root/` | `create_root_ca_cert.sh` |
+
+---
+
+## Folder structure
+
+The `security/` directory centralizes all certificate, credential, and role management for CogStack-NiFi.
+Below is the high-level structure with explanations for each sub-folder.
+
+```text
+security/
+├── certificates/ # All generated certificates and keystores
+│ ├── elastic/ # Elasticsearch / OpenSearch + Kibana certs
+│ ├── nifi/ # Apache NiFi certificates (generated via NiFi Toolkit)
+│ └── root/ # Root CA files and truststores
+│
+├── env/ # Environment variable definitions for certs and users
+│ ├── certificates_*.env # Variables controlling certificate generation
+│ └── users_*.env # Default credentials for each service
+│
+├── es_roles/ # Role and role mapping definitions for ES / OpenSearch
+│ ├── elasticsearch/ # Native Elasticsearch roles
+│ └── opensearch/ # OpenSearch Security Plugin configs
+│
+├── scripts/ # Shell utilities for creating certs and credentials
+│ ├── create_root_ca_cert.sh # Generates the shared root CA (trust anchor)
+│ ├── create_es_native_certs.sh # Elasticsearch node and client certs
+│ ├── create_es_native_credentials.sh # Runs post-deployment to create default Elasticsearch system users and tokens
+│ ├── create_opensearch_node_cert.sh # Generates certificates and JKS stores for each OpenSearch node
+│ ├── create_opensearch_client_admin_certs.sh # Creates admin + client certificates for OpenSearch Dashboards (Kibana equivalent)
+│ ├── create_opensearch_internal_passwords.sh # Generates bcrypt password hashes for OpenSearch internal_users.yml
+│ ├── create_opensearch_users.sh # Creates OpenSearch internal users and role mappings (manual execution post-startup)
+│ ├── nifi_toolkit_security.sh # Generates NiFi HTTPS certs using NiFi Toolkit (for NiFi < 2.0, no longer used for certs as of 2.0+)
+│ ├── nifi_init_create_user_auth.sh # Bootstraps a temporary NiFi container to create a single-user authentication file
+│ ├── nifi_create_single_user_auth.sh # Helper script executed inside the container to generate NiFi single-user credentials
+│ ├── es_native_cert_generator.sh # Helper called by create_es_native_certs.sh to assemble ES cert bundles
+│ └── create_keystore.sh # Builds Java KeyStores (JKS) from PEM or PKCS#12 certificates
+│
+└── templates/ # OpenSSL / X.509 configuration templates
+ └── ssl-extensions-x509.cnf # SAN extensions used across certificate scripts
+```
+
+---
+
+## Next steps
+
+Refer to the detailed pages for each topic:
+
+```{toctree}
+:maxdepth: 2
+:caption: Security Topics
+
+certificates
+services
+nifi
diff --git a/docs/nifi-tls-setup.md b/docs/security/nifi.md
similarity index 51%
rename from docs/nifi-tls-setup.md
rename to docs/security/nifi.md
index c112a1731..9f609a51a 100644
--- a/docs/nifi-tls-setup.md
+++ b/docs/security/nifi.md
@@ -1,12 +1,14 @@
-# 🔐 NiFi Registry TLS & Admin Access Setup (with NGINX)
+# 🔐 NiFi/NiFi Registry TLS & Admin Access Setup (with NGINX)
This guide documents how to configure TLS and certificate-based admin access for Apache NiFi Registry (v1.26+) using OpenSSL-generated certificates and NGINX as a reverse proxy.
+For background on how these certificates are generated, see [Certificates and Root CA](certificates.md).
+This section focuses on applying those certificates to secure **NiFi** and **NiFi Registry Flow**, including admin identity configuration and reverse-proxy integration.
---
## 📁 Folder Structure
-```bash
+```text
security/certificates
├── elastic
├── nifi
@@ -30,43 +32,59 @@ security/certificates
---
-## 📜 openssl-x509.conf
-
-Set up a reusable certificate config to define SANs and subject.
-
-```ini
-[req]
-default_bits = 4096
-prompt = no
-default_md = sha256
-distinguished_name = req_distinguished_name
-x509_extensions = v3_leaf
-
-[req_distinguished_name]
-C = UK
-ST = London
-L = UK
-O = cogstack
-OU = cogstack
-CN = cogstack
-
-[v3_leaf]
-basicConstraints = critical, CA:FALSE
-keyUsage = critical, digitalSignature, keyEncipherment
-extendedKeyUsage = serverAuth, clientAuth
-subjectKeyIdentifier = hash
-authorityKeyIdentifier = keyid,issuer
-subjectAltName = @alt_names
-
-[alt_names]
-DNS.1 = cogstack
-DNS.2 = nifi
-DNS.3 = nifi-registry
-DNS.4 = localhost
-DNS.5 = *.cogstack
-IP.1 = 127.0.0.1
-email.1 = admin@cogstack.net
+To secure the Apache NiFi endpoint with self-signed certificates, please refer to [the official documentation](https://nifi.apache.org/docs/nifi-docs/html/walkthroughs.html#securing-nifi-with-provided-certificates).
+
+Before starting the NiFi container, take note of the following if you wish to enable HTTPS:
+
+- run `create_root_ca_cert.sh` to create the root certificates, which are used by NiFi, NiFi Registry Flow, OpenSearch, etc. This step is optional if you have already done it while configuring other certificates.
+
+- **(OPTIONAL, DO NOT USE FOR NIFI VERSION >= 2.0)** the `nifi_toolkit_security.sh` script downloads the NiFi toolkit and generates new certificates and keys that are used by the container. Note that it also generates another `nifi.properties` file inside the `localhost` folder; the randomly generated settings in that file (see the example below) must be copied to the `nifi/conf/nifi.properties` file.
+- the truststore and keystore files generated for production will be in the `nifi_certificates/localhost` folder, and the `nifi-cert.pem` and `nifi-key.key` files will be in the base `nifi_certificates` folder.
+
+- as part of the security process, `nifi.sensitive.props.key` should be set to a random string or a password of at least 12 characters. Once it is set, do NOT modify it, as all other sensitive values are encrypted with this key. By default it is set to `cogstackNiFipass`.
+
+Example (`nifi/conf/nifi.properties`):
+
+```properties
+ nifi.security.keystorePasswd=ZFD4i4UDvod8++XwWzTg+3J6WJF6DRSZO33lbb7hAgc
+ nifi.security.keyPasswd=ZFD4i4UDvod8++XwWzTg+3J6WJF6DRSZO33lbb7hAgc
+ nifi.security.truststore=./conf/truststore.jks
+ nifi.security.truststoreType=jks
+ nifi.security.truststorePasswd=lzMGadNB1JXQjgQEnFStLiNkJ6Wbbgw0bFdCTICKtKo
+```
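
For the `nifi.sensitive.props.key` requirement described above (a random string of at least 12 characters), one way to produce a suitable value is sketched here. This is an illustration only: the helper name `generate_sensitive_props_key` and the default length of 24 are assumptions, not something the repository ships.

```python
import secrets
import string

# Minimum length stated in the documentation above.
MIN_KEY_LENGTH = 12


def generate_sensitive_props_key(length: int = 24) -> str:
    """Return a random alphanumeric string for nifi.sensitive.props.key."""
    if length < MIN_KEY_LENGTH:
        raise ValueError(f"key must be at least {MIN_KEY_LENGTH} characters")
    alphabet = string.ascii_letters + string.digits
    # secrets.choice draws from a cryptographically strong random source.
    return "".join(secrets.choice(alphabet) for _ in range(length))


if __name__ == "__main__":
    print(f"nifi.sensitive.props.key={generate_sensitive_props_key()}")
```

The sketch restricts the alphabet to letters and digits so the value can be pasted into a properties file without escaping. Generate the key once and store it safely, since it must not change afterwards.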
+
+### Setting up access via user account (SINGLE USER CREDENTIAL)
+
+This is entirely optional; if you have configured the security certs as described in `security/README.md`, then you are good to go.
+
+Default credentials:
+
+
```
+username: admin
+password: cogstackNiFi
+```
+
+- the `login-identity-providers.xml` file in `/nifi/conf/` stores the password for the user account. To generate a password, run the following command within the container: `/opt/nifi/nifi-current/bin/nifi.sh set-single-user-credentials USERNAME PASSWORD`. Once done, copy the file from `/opt/nifi/nifi-current/conf/login-identity-providers.xml` to your local machine with `docker cp`, replace the one in the `nifi/conf` folder, and rebuild the container.
+
+- alternatively, go into the `/security` folder and set the desired NiFi username and password in the `/security/nifi_users.env` file. Make sure to STOP any running NiFi containers (`docker stop cogstack-nifi`), then execute `bash /security/nifi_init_create_user_auth.sh`. This script starts a NiFi container for the duration of the account creation and then removes it. After it finishes, go back to the `/deploy` folder and start your NiFi container; everything should then be working.
+
+URL:
+
+Troubleshooting security: if you encounter errors related to sensitive key properties not being set, clear the Docker volumes of the NiFi container, or delete all volumes of inactive containers with `docker volume prune`.
+
+### Disabling the login screen
+
+If for some reason you do not wish to authenticate every time you connect to NiFi, you can enable the client certificates in the [nginx.conf](../services/nginx/config/nginx.conf) line 86-87 and delete the commented lines.
+
+## `nifi-nginx`
+
+Alternatively, one can secure the access to selected services by using NGINX reverse proxy.
+This may be essential in case some of the web services that need to be exposed to end-users do not offer SSL encryption.
+See [the official documentation](https://docs.nginx.com/nginx/admin-guide/security-controls/securing-http-traffic-upstream/) for more details on using NGINX for that.
+
+Nginx only requires the root-CA certificate by default, so use the above [generate cert](#generating-the-base-certificates-for-nifinginxjupyterhubocr-servicetikamedcat-service-certificates) section to create it.
+
+In order to be able to properly access the nifi instance securely, you also need to start the nifi-nginx container as it is configured to provide access from any source to nifi, available at .
---
@@ -173,7 +191,7 @@ curl -vk --cert ./nifi.pem --key ./nifi.key https://localhost:18443/nifi-r
---
-Maintained by: `cogstack-dev@kcl.ac.uk`
+Maintained by: `admin@cogstack.org`
### 🔐 Admin Identity Consistency Between NiFi and NiFi Registry
diff --git a/nifi/devel/.empty b/docs/security/services.md
similarity index 100%
rename from nifi/devel/.empty
rename to docs/security/services.md
diff --git a/nifi/drivers/mssql-jdbc-12.10.0.jre11.jar b/nifi/drivers/mssql-jdbc-12.10.0.jre11.jar
deleted file mode 100644
index aaabff9c1..000000000
Binary files a/nifi/drivers/mssql-jdbc-12.10.0.jre11.jar and /dev/null differ
diff --git a/nifi/drivers/mssql-jdbc-12.10.0.jre8.jar b/nifi/drivers/mssql-jdbc-12.10.0.jre8.jar
deleted file mode 100644
index 9231af389..000000000
Binary files a/nifi/drivers/mssql-jdbc-12.10.0.jre8.jar and /dev/null differ
diff --git a/nifi/drivers/mssql-jdbc-13.2.0.jre11.jar b/nifi/drivers/mssql-jdbc-13.2.0.jre11.jar
new file mode 100644
index 000000000..684454a40
Binary files /dev/null and b/nifi/drivers/mssql-jdbc-13.2.0.jre11.jar differ
diff --git a/nifi/drivers/mysql-connector-j-9.3.0.jar b/nifi/drivers/mysql-connector-j-9.4.0.jar
similarity index 55%
rename from nifi/drivers/mysql-connector-j-9.3.0.jar
rename to nifi/drivers/mysql-connector-j-9.4.0.jar
index b7f8142b7..465d6575d 100644
Binary files a/nifi/drivers/mysql-connector-j-9.3.0.jar and b/nifi/drivers/mysql-connector-j-9.4.0.jar differ
diff --git a/nifi/requirements.txt b/nifi/requirements.txt
index 3f238cbf6..ae4bcc11d 100644
--- a/nifi/requirements.txt
+++ b/nifi/requirements.txt
@@ -16,7 +16,7 @@ py4j==0.10.9.9
rancoord==0.0.6
geocoder==1.38.1
avro==1.12.0
-nipyapi==0.22.0
+nipyapi==1.0.0
py7zr==1.0.0
ipyparallel==9.0.1
cython==3.1.3
diff --git a/nifi/user-python-extensions/convert_avro_binary_field_to_base64.py b/nifi/user-python-extensions/convert_avro_binary_field_to_base64.py
index 660f45f99..5dec18b7f 100644
--- a/nifi/user-python-extensions/convert_avro_binary_field_to_base64.py
+++ b/nifi/user-python-extensions/convert_avro_binary_field_to_base64.py
@@ -58,8 +58,10 @@ def __init__(self, jvm: JVMView):
validators=[StandardValidators.NON_EMPTY_VALIDATOR]),
]
- def getPropertyDescriptors(self):
- return self._properties
+ self.descriptors: list[PropertyDescriptor] = self._properties
+
+ def getPropertyDescriptors(self) -> list[PropertyDescriptor]:
+ return self.descriptors
def set_logger(self, logger: Logger):
self.logger = logger
diff --git a/nifi/user-python-extensions/convert_elasticsearch_schema.py b/nifi/user-python-extensions/convert_json_record_schema.py
similarity index 89%
rename from nifi/user-python-extensions/convert_elasticsearch_schema.py
rename to nifi/user-python-extensions/convert_json_record_schema.py
index f75d2dc67..071bf5448 100644
--- a/nifi/user-python-extensions/convert_elasticsearch_schema.py
+++ b/nifi/user-python-extensions/convert_json_record_schema.py
@@ -11,7 +11,7 @@
from py4j.java_gateway import JavaObject, JVMView
-class ConvertElasticSearchRecordSchema(FlowFileTransform):
+class ConvertJsonRecordSchema(FlowFileTransform):
identifier = None
logger: Logger = Logger(__qualname__)
@@ -35,7 +35,9 @@ def __init__(self, jvm: JVMView):
# this is directly mirrored to the UI
self._properties = [
PropertyDescriptor(name="json_mapper_schema_path",
- description="The path to the json schema mapping file, the schema directory is mounted as a volume in the nifi container in the /opt/nifi/user-schemas/ folder",
+ description="The path to the json schema mapping file, " \
+ "the schema directory is mounted as a volume in" \
+ " the nifi container in the /opt/nifi/user-schemas/ folder",
default_value="/opt/nifi/user-schemas/cogstack_common_schema_mapping.json",
required=True,
validators=[StandardValidators.NON_EMPTY_VALIDATOR]),
@@ -47,8 +49,10 @@ def __init__(self, jvm: JVMView):
validators=[StandardValidators.BOOLEAN_VALIDATOR])
]
- def getPropertyDescriptors(self):
- return self._properties
+ self.descriptors: list[PropertyDescriptor] = self._properties
+
+ def getPropertyDescriptors(self) -> list[PropertyDescriptor]:
+ return self.descriptors
def set_logger(self, logger: Logger):
self.logger = logger
@@ -78,9 +82,12 @@ def map_record(self, record: dict, json_mapper_schema: dict) -> dict:
"""
new_record: dict = {}
+
+ new_schema_field_names: list = [str(x).lower() for x in json_mapper_schema.keys()]
for curr_field_name, curr_field_value in record.items():
- if curr_field_name in json_mapper_schema.keys():
+ curr_field_name = str(curr_field_name).lower()
+ if curr_field_name in new_schema_field_names:
# check if the mapping is not a dict (nested field)
if isinstance(json_mapper_schema[curr_field_name], str):
new_record.update({json_mapper_schema[curr_field_name] : curr_field_value})
diff --git a/nifi/user-python-extensions/parse_service_response.py b/nifi/user-python-extensions/parse_service_response.py
index e1e0eaf72..e1e8ed552 100644
--- a/nifi/user-python-extensions/parse_service_response.py
+++ b/nifi/user-python-extensions/parse_service_response.py
@@ -74,8 +74,10 @@ def __init__(self, jvm: JVMView):
)
]
- def getPropertyDescriptors(self):
- return self._properties
+ self.descriptors: list[PropertyDescriptor] = self._properties
+
+ def getPropertyDescriptors(self) -> list[PropertyDescriptor]:
+ return self.descriptors
def set_logger(self, logger: Logger):
self.logger = logger
diff --git a/nifi/user-python-extensions/prepare_record_for_nlp.py b/nifi/user-python-extensions/prepare_record_for_nlp.py
index fbc195c36..b0fd0221e 100644
--- a/nifi/user-python-extensions/prepare_record_for_nlp.py
+++ b/nifi/user-python-extensions/prepare_record_for_nlp.py
@@ -58,8 +58,11 @@ def __init__(self, jvm: JVMView):
]
- def getPropertyDescriptors(self):
- return self._properties
+ self.descriptors: list[PropertyDescriptor] = self._properties
+
+ def getPropertyDescriptors(self) -> list[PropertyDescriptor]:
+ return self.descriptors
+
def set_logger(self, logger: Logger):
self.logger = logger
diff --git a/nifi/user-python-extensions/prepare_record_for_ocr.py b/nifi/user-python-extensions/prepare_record_for_ocr.py
index 57cbd5244..1c4f9d0a0 100644
--- a/nifi/user-python-extensions/prepare_record_for_ocr.py
+++ b/nifi/user-python-extensions/prepare_record_for_ocr.py
@@ -19,7 +19,7 @@
# we need to add it to the sys imports
sys.path.insert(0, "/opt/nifi/user-scripts")
-from utils.avro_json_encoder import AvroJSONEncoder
+from utils.avro_json_encoder import AvroJSONEncoder # noqa: I001,E402
class PrepareRecordForOcr(FlowFileTransform):
@@ -72,9 +72,11 @@ def __init__(self, jvm: JVMView):
required=True,
allowable_values=["avro", "json"]),
]
+ self.descriptors: list[PropertyDescriptor] = self._properties
+
+ def getPropertyDescriptors(self) -> list[PropertyDescriptor]:
+ return self.descriptors
- def getPropertyDescriptors(self):
- return self._properties
def set_logger(self, logger: Logger):
self.logger = logger
diff --git a/nifi/user-python-extensions/record_decompress_cerner_blob.py b/nifi/user-python-extensions/record_decompress_cerner_blob.py
index 8c641fa3d..ffee6dbae 100644
--- a/nifi/user-python-extensions/record_decompress_cerner_blob.py
+++ b/nifi/user-python-extensions/record_decompress_cerner_blob.py
@@ -12,12 +12,21 @@
)
from py4j.java_gateway import JavaObject, JVMView
+""" This script decompresses Cerner LZW compressed blobs from a JSON input stream.
+ It expects a JSON array of records, each containing a field with the binary data.
+ All RECORDS are expected to have the same fields, and presumably belonging to the same DOCUMENT.
+ All the fields of these records should have the same field values, except for the binary data field.
+ The binary data field is expected to be a base64 encoded string, which will be concatenated according to
+ the blob_sequence_order_field_name field, preserving the order of the blobs and composing the whole document (supposedly).
+ The final base64 encoded string will be decoded back to binary data, then decompressed using LZW algorithm.
+"""
+
# this script is using a custom utility for decompressing Cerner blobs
# from nifi/user-python-extensions/record_decompress_cerner_blob.py
# we need to add it to the sys imports
sys.path.insert(0, "/opt/nifi/user-scripts")
-from utils.cerner_blob import DecompressLzwCernerBlob # # noqa: I001
+from utils.cerner_blob import DecompressLzwCernerBlob # noqa: I001,E402
class JsonRecordDecompressCernerBlob(FlowFileTransform):
@@ -92,8 +101,10 @@ def __init__(self, jvm: JVMView):
default_value="blob_sequence_num"),
]
- def getPropertyDescriptors(self):
- return self._properties
+ self.descriptors: list[PropertyDescriptor] = self._properties
+
+ def getPropertyDescriptors(self) -> list[PropertyDescriptor]:
+ return self.descriptors
def set_logger(self, logger: Logger):
self.logger = logger
diff --git a/nifi/user-python-extensions/test_processor.py b/nifi/user-python-extensions/test_processor.py
index 00a46c867..b8e4b4618 100644
--- a/nifi/user-python-extensions/test_processor.py
+++ b/nifi/user-python-extensions/test_processor.py
@@ -55,8 +55,10 @@ def __init__(self, jvm: JVMView):
validators=StandardValidators.NON_EMPTY_VALIDATOR)
]
- def getPropertyDescriptors(self):
- return self._properties
+ self.descriptors: list[PropertyDescriptor] = self._properties
+
+ def getPropertyDescriptors(self) -> list[PropertyDescriptor]:
+ return self.descriptors
def set_logger(self, logger: Logger):
self.logger = logger
diff --git a/nifi/user-scripts/dto/nifi_api_config.py b/nifi/user-scripts/dto/nifi_api_config.py
new file mode 100644
index 000000000..303bdd1b1
--- /dev/null
+++ b/nifi/user-scripts/dto/nifi_api_config.py
@@ -0,0 +1,45 @@
+import os
+
+
+class NiFiAPIConfig:
+ NIFI_URL_SCHEME: str = "https"
+ NIFI_HOST: str = "localhost"
+ NIFI_PORT: int = 8443
+ NIFI_REGISTRY_PORT: int = 18443
+
+ NIFI_USERNAME: str = os.environ.get("NIFI_SINGLE_USER_CREDENTIALS_USERNAME", "admin")
+ NIFI_PASSWORD: str = os.environ.get("NIFI_SINGLE_USER_CREDENTIALS_PASSWORD", "cogstackNiFi")
+
+ ROOT_CERT_CA_PATH: str = os.path.abspath("../../../../security/certificates/root/root-ca.pem")
+ NIFI_CERT_PEM_PATH: str = os.path.abspath("../../../../security/certificates/nifi/nifi.pem")
+ NIFI_CERT_KEY_PATH: str = os.path.abspath("../../../../security/certificates/nifi/nifi.key")
+
+ VERIFY_SSL: bool = True
+
+ @property
+ def nifi_base_url(self) -> str:
+ """Full NiFi base URL, e.g. https://localhost:8443"""
+ return f"{self.NIFI_URL_SCHEME}://{self.NIFI_HOST}:{self.NIFI_PORT}"
+
+ @property
+ def nifi_api_url(self) -> str:
+ """NiFi REST API root, e.g. https://localhost:8443/nifi-api"""
+ return f"{self.nifi_base_url}/nifi-api"
+
+ @property
+ def nifi_registry_base_url(self) -> str:
+ """NiFi Registry base URL, e.g. https://localhost:18443/nifi-registry/"""
+ return f"{self.NIFI_URL_SCHEME}://{self.NIFI_HOST}:{self.NIFI_REGISTRY_PORT}/nifi-registry/"
+
+ @property
+ def nifi_registry_api_url(self) -> str:
+ """NiFi Registry REST API root, e.g. https://localhost:18443/nifi-registry-api"""
+ return f"{self.NIFI_URL_SCHEME}://{self.NIFI_HOST}:{self.NIFI_REGISTRY_PORT}/nifi-registry-api"
+
+ def auth_credentials(self) -> tuple[str, str]:
+ """Convenience for requests auth=(user, password)."""
+ return (self.NIFI_USERNAME, self.NIFI_PASSWORD)
+
+ def get_nifi_ssl_certs(self) -> tuple[str, str]:
+ """Convenience for requests cert=(cert_path, key_path)."""
+ return (self.NIFI_CERT_PEM_PATH, self.NIFI_CERT_KEY_PATH)
diff --git a/nifi/user-scripts/dto/pg_config.py b/nifi/user-scripts/dto/pg_config.py
new file mode 100644
index 000000000..19f15d029
--- /dev/null
+++ b/nifi/user-scripts/dto/pg_config.py
@@ -0,0 +1,10 @@
+from pydantic import BaseModel, Field
+
+
+class PGConfig(BaseModel):
+ host: str = Field(default="localhost")
+ port: int = Field(default=5432)
+ db: str = Field(default="samples_db")
+ user: str = Field(default="test")
+ password: str = Field(default="test")
+ timeout: int = Field(default=50)
diff --git a/nifi/user-scripts/dto/service_health.py b/nifi/user-scripts/dto/service_health.py
new file mode 100644
index 000000000..5f6455dbb
--- /dev/null
+++ b/nifi/user-scripts/dto/service_health.py
@@ -0,0 +1,51 @@
+from datetime import datetime
+from typing import Literal
+
+from pydantic import BaseModel, Field
+
+
+class ServiceHealth(BaseModel):
+ """
+ Base health check model shared by all services.
+ """
+
+ service: str = Field(..., description="Service name, e.g. NiFi, PostgreSQL, OpenSearch/ElasticSearch, etc.")
+ status: Literal["healthy", "unhealthy", "degraded"] = Field(
+ ..., description="Current service status"
+ )
+ message: str | None = Field(None, description="Optional status message")
+ timestamp: datetime = Field(default_factory=datetime.utcnow)
+ avg_processing_ms: float | None = Field(None)
+ service_info: str | None = Field(None)
+ connected: bool | None = Field(None)
+
+ class Config:
+ extra = "ignore"
+
+class MLServiceHealth(ServiceHealth):
+ model_name: str | None = Field(None, description="Name of the ML model")
+ model_version: str | None = Field(None, description="Version of the ML model")
+ model_card: str | None = Field(None, description="URL or path to the model card")
+
+class NiFiHealth(ServiceHealth):
+ active_threads: int | None = Field(None, description="Number of active threads")
+ queued_bytes: int | None = Field(None, description="Total queued bytes")
+ queued_count: int | None = Field(None, description="Number of queued flowfiles")
+
+class ElasticsearchHealth(ServiceHealth):
+ cluster_status: str | None = Field(None, description="Cluster health status")
+ node_count: int | None = Field(None)
+ active_shards: int | None = Field(None)
+
+class PostgresHealth(ServiceHealth):
+ version: str | None = Field(None)
+ latency_ms: float | None = Field(None, description="Ping latency in milliseconds")
+ db_name: str | None = Field(None, description="Database name")
+
+class MedCATTrainerHealth(ServiceHealth):
+ """Health check model for MedCAT Trainer web service."""
+ app_version: str | None = Field(None, description="MedCAT Trainer app version")
+
+class CogstackCohortHealth(ServiceHealth):
+ """Health check model for CogStack Cohort service."""
+ pass
diff --git a/nifi/user-scripts/legacy_scripts/record_decompress_cerner_blob.py b/nifi/user-scripts/legacy_scripts/record_decompress_cerner_blob.py
deleted file mode 100644
index 92909c1af..000000000
--- a/nifi/user-scripts/legacy_scripts/record_decompress_cerner_blob.py
+++ /dev/null
@@ -1,98 +0,0 @@
-import base64
-import json
-import sys
-
-from utils.cerner_blob import DecompressLzwCernerBlob
-
-""" This script decompresses Cerner LZW compressed blobs from a JSON input stream.
- It expects a JSON array of records, each containing a field with the binary data.
- All RECORDS are expected to have the same fields, and presumably belonging to the same DOCUMENT.
- All the fields of these records should have the same field values, except for the binary data field.
- The binary data field is expected to be a base64 encoded string, which will be concatenated according to
- the blob_sequence_order_field_name field, preserving the order of the blobs and composing the whole document (supposedly).
- The final base64 enncoded string will be decoded back to binary data, then decompressed using LZW algorithm.
-"""
-
-# This needs to be investigated, records might have different charsets,
-# currently only tested with "iso-8859-1"
-# other frequently used encodings: "utf-16le", "utf-16be"
-# In some cases you will need to figure this out yourself, depending on
-# the data source
-INPUT_CHARSET = "iso-8859-1"
-
-# expected (optional)
-OUTPUT_CHARSET = "windows-1252"
-
-# possible values:
-# - base64: output base64 code
-# - string: output string after decompression
-OUTPUT_MODE = "base64"
-
-BINARY_FIELD_NAME = "binary_data"
-
-BINARY_FIELD_SOURCE_ENCODING = "base64"
-
-BLOB_SEQUENCE_ORDER_FIELD_NAME = "blob_sequence_num"
-
-for arg in sys.argv:
- _arg = arg.split("=", 1)
- if _arg[0] == "output_mode":
- OUTPUT_MODE = _arg[1]
- elif _arg[0] == "input_charset":
- INPUT_CHARSET = _arg[1]
- elif _arg[0] == "output_charset":
- OUTPUT_CHARSET = _arg[1]
- elif _arg[0] == "log_file_name":
- LOG_FILE_NAME = _arg[1]
- elif _arg[0] == "binary_field_name":
- BINARY_FIELD_NAME = _arg[1]
- elif _arg[0] == "binary_field_source_encoding":
- BINARY_FIELD_SOURCE_ENCODING = _arg[1]
- elif _arg[0] == "blob_sequence_order_field_name":
- BLOB_SEQUENCE_ORDER_FIELD_NAME = _arg[1]
-
-records = json.loads(sys.stdin.read())
-
-if not isinstance(records, list):
- records = [records]
-
-concatenated_blob_sequence_order = {}
-
-output_merged_record = {}
-
-for record in records:
- if BLOB_SEQUENCE_ORDER_FIELD_NAME in record.keys():
- concatenated_blob_sequence_order[int(record[BLOB_SEQUENCE_ORDER_FIELD_NAME])] = record[BINARY_FIELD_NAME]
-
-
-# take fields from the first record, doesn't matter which one,
-# as they are expected to be the same except for the binary data field
-for k, v in records[0].items():
- if k not in output_merged_record.keys() and k != BINARY_FIELD_NAME:
- output_merged_record[k] = v
-
-del records
-
-concatenated_blob_sequence_order = sorted(concatenated_blob_sequence_order.items())
-
-output_merged_record[BINARY_FIELD_NAME] = bytearray()
-for i in concatenated_blob_sequence_order:
- try:
- temporary_blob = i[1]
- if BINARY_FIELD_SOURCE_ENCODING == "base64":
- temporary_blob: bytes = base64.b64decode(temporary_blob)
-
- decompress_blob = DecompressLzwCernerBlob()
- decompress_blob.decompress(temporary_blob) # type: ignore
- output_merged_record[BINARY_FIELD_NAME].extend(bytes(decompress_blob.output_stream))
- except Exception as exception:
- sys.stderr.write(f"Error decompressing blob with sequence order {i[0]}: {str(exception)}\n")
- sys.stderr.flush()
- raise
-
-del concatenated_blob_sequence_order
-
-if OUTPUT_MODE == "base64":
- output_merged_record[BINARY_FIELD_NAME] = base64.b64encode(output_merged_record[BINARY_FIELD_NAME]).decode(OUTPUT_CHARSET)
-
-sys.stdout.write(json.dumps(output_merged_record))
diff --git a/nifi/user-scripts/test_avro.py b/nifi/user-scripts/test_avro.py
deleted file mode 100644
index 7f893747e..000000000
--- a/nifi/user-scripts/test_avro.py
+++ /dev/null
@@ -1,61 +0,0 @@
-import io
-import json
-
-import avro
-from avro.datafile import DataFileWriter
-from avro.io import DatumWriter
-
-"""
- Use this script to test avro schemas etc with python3
-"""
-
-stream = object()
-
-json_mapper_schema = json.loads(open("../user-schemas/cogstack_common_schema_mapping.json").read())
-avro_cogstack_schema = avro.schema.parse(open("../user-schemas/cogstack_common_schema_full.avsc", "rb").read(), validate_enum_symbols=False)
-
-test_records = [{ "docid" : "1",
- "sampleid" : 1041,
- "dct" : "2020-05-11 10:52:25.273518",
- "binarydoc": "blablabla" },
- { "docid" : "1",
- "sampleid" : 1041,
- "dct" : "2020-05-11 10:52:25.273518",
- "binarydoc": "blablabla" }]
-
-schema_fields = avro_cogstack_schema.props["fields"]
-dict_fields_types = {}
-for field in schema_fields:
- dict_fields_types[field.name] = ""
- tmp_list = json.loads(str(field.type))
- if len(tmp_list) > 1 and type(tmp_list) is not str:
- if type(tmp_list[1]) is dict:
- dict_fields_types[field.name] = tmp_list[1]["type"]
- else:
- dict_fields_types[field.name] = tmp_list[1]
- else:
- dict_fields_types[field.name] = field.type
-
-available_mapping_keys = {}
-for k,v in json_mapper_schema.items():
- if v:
- available_mapping_keys[k] = v
-
-bytes_io = io.BytesIO(bytes("", encoding="UTF-8"))
-
-type_mapping = {"boolean": "bool", "long": "int", "int": "int", "float" : "float", "byte":"bytes", "string": "str", "double": "float"}
-
-
-print(avro_cogstack_schema)
-
-with DataFileWriter(bytes_io, DatumWriter(), avro_cogstack_schema) as writer:
- # re-map the value to the new keys
-
- for _record in test_records:
- record = {}
-
- for k, v in available_mapping_keys.items():
- if v in _record.keys():
- record[k] = _record[v] #getattr(__builtins__, type_mapping[dict_fields_types[k]])(_record[v])
-
- writer.append(record)
diff --git a/nifi/user-scripts/utils/helpers/logging.py b/nifi/user-scripts/utils/helpers/logging.py
new file mode 100644
index 000000000..eaa6a7bdd
--- /dev/null
+++ b/nifi/user-scripts/utils/helpers/logging.py
@@ -0,0 +1,22 @@
+import logging
+import os
+import sys
+
+
+def get_logger(name: str) -> logging.Logger:
+    """Return a configured logger shared across all NiFi clients."""
+    level_name = os.getenv("NIFI_LOG_LEVEL", "INFO").upper()
+    level = getattr(logging, level_name, logging.INFO)
+
+    logger = logging.getLogger(name)
+    if not logger.handlers:
+        handler = logging.StreamHandler(sys.stdout)
+        fmt = logging.Formatter(
+            "[%(asctime)s] [%(levelname)s] [%(name)s] %(message)s",
+            "%Y-%m-%d %H:%M:%S",
+        )
+        handler.setFormatter(fmt)
+        logger.addHandler(handler)
+    logger.setLevel(level)
+    logger.propagate = False
+    return logger
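A minimal usage sketch for the `get_logger` helper added above. It copies the function so the snippet runs on its own, and it assumes `setLevel` and `propagate` are applied on every call, since the indentation of those two lines is not recoverable from this diff. The logger name `demo.client` is hypothetical.

```python
import logging
import os
import sys


def get_logger(name: str) -> logging.Logger:
    """Copy of the helper from utils/helpers/logging.py, for illustration only."""
    level_name = os.getenv("NIFI_LOG_LEVEL", "INFO").upper()
    # Unknown level names fall back to INFO.
    level = getattr(logging, level_name, logging.INFO)

    logger = logging.getLogger(name)
    if not logger.handlers:
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(
            logging.Formatter(
                "[%(asctime)s] [%(levelname)s] [%(name)s] %(message)s",
                "%Y-%m-%d %H:%M:%S",
            )
        )
        logger.addHandler(handler)
    logger.setLevel(level)
    logger.propagate = False
    return logger


# NIFI_LOG_LEVEL is read on each call; it is set in-process here for the demo.
os.environ["NIFI_LOG_LEVEL"] = "error"
log = get_logger("demo.client")  # hypothetical logger name
log.info("suppressed at ERROR level")
log.error("written to stdout")

# A second call returns the same logger and does not attach a second handler.
same = get_logger("demo.client")
```

The `if not logger.handlers` guard is what keeps repeated calls from duplicating output lines.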
diff --git a/nifi/user-scripts/utils/helpers/nifi_api_client.py b/nifi/user-scripts/utils/helpers/nifi_api_client.py
new file mode 100644
index 000000000..372d7ef06
--- /dev/null
+++ b/nifi/user-scripts/utils/helpers/nifi_api_client.py
@@ -0,0 +1,83 @@
+from logging import Logger
+from typing import List # noqa: UP035
+
+from dto.nifi_api_config import NiFiAPIConfig
+from nipyapi import canvas, security
+from nipyapi.nifi import ApiClient, ProcessGroupsApi
+from nipyapi.nifi.configuration import Configuration as NiFiConfiguration
+from nipyapi.nifi.models.process_group_entity import ProcessGroupEntity
+from nipyapi.nifi.models.processor_entity import ProcessorEntity
+from nipyapi.registry import ApiClient as RegistryApiClient
+from nipyapi.registry import BucketsApi
+from nipyapi.registry.configuration import Configuration as RegistryConfiguration
+from utils.helpers.logging import get_logger
+
+
+class NiFiRegistryClient:
+    def __init__(self, config: NiFiAPIConfig) -> None:
+        self.config = config or NiFiAPIConfig()
+        self.nipyapi_config = RegistryConfiguration()
+        self.nipyapi_config.host = self.config.nifi_registry_api_url
+        self.nipyapi_config.verify_ssl = self.config.VERIFY_SSL
+        self.nipyapi_config.cert_file = self.config.NIFI_CERT_PEM_PATH  # type: ignore
+        self.nipyapi_config.key_file = self.config.NIFI_CERT_KEY_PATH  # type: ignore
+        self.nipyapi_config.ssl_ca_cert = self.config.ROOT_CERT_CA_PATH  # type: ignore
+
+        self.logger: Logger = get_logger(self.__class__.__name__)
+
+        self.api_client = RegistryApiClient(self.nipyapi_config.host)
+        self.buckets_api = BucketsApi(self.api_client)
+
+    def list_buckets(self):
+        buckets = self.buckets_api.get_buckets()
+        for b in buckets:
+            self.logger.info("Bucket: %s (%s)", b.name, b.identifier)
+        return buckets
+
+
+class NiFiClient:
+    def __init__(self, config: NiFiAPIConfig) -> None:
+        self.config = config or NiFiAPIConfig()
+        self.nipyapi_config = NiFiConfiguration()
+        self.nipyapi_config.host = self.config.nifi_api_url
+        self.nipyapi_config.verify_ssl = self.config.VERIFY_SSL
+        self.nipyapi_config.cert_file = self.config.NIFI_CERT_PEM_PATH  # type: ignore
+        self.nipyapi_config.key_file = self.config.NIFI_CERT_KEY_PATH  # type: ignore
+        self.nipyapi_config.ssl_ca_cert = self.config.ROOT_CERT_CA_PATH  # type: ignore
+
+        self.logger: Logger = get_logger(self.__class__.__name__)
+
+        self.api_client = ApiClient(self.nipyapi_config)
+        self.process_group_api = ProcessGroupsApi(self.api_client)
+
+        self._login()
+
+    def _login(self) -> None:
+        security.service_login(
+            service='nifi',
+            username=self.config.NIFI_USERNAME,
+            password=self.config.NIFI_PASSWORD
+        )
+        self.logger.info("✅ Logged in to NiFi")
+
+    def get_root_process_group_id(self) -> str:
+        return canvas.get_root_pg_id()
+
+    def get_process_group_by_name(self, process_group_name: str) -> None | List[object] | object:
+        return canvas.get_process_group(process_group_name, identifier_type="name")
+
+    def get_process_group_by_id(self, process_group_id: str) -> ProcessGroupEntity:
+        return canvas.get_process_group(process_group_id, identifier_type="id")
+
+    def start_process_group(self, process_group_id: str) -> bool:
+        return canvas.schedule_process_group(process_group_id, True)
+
+    def stop_process_group(self, process_group_id: str) -> bool:
+        return canvas.schedule_process_group(process_group_id, False)
+
+    def get_child_process_groups_from_parent_id(self, parent_process_group_id: str) -> List[ProcessGroupEntity]:
+        parent_pg = canvas.get_process_group(parent_process_group_id, identifier_type="id")
+        return canvas.list_all_process_groups(parent_pg.id)
+
+    def get_all_processors_in_process_group(self, process_group_id: str) -> List[ProcessorEntity]:
+        return canvas.list_all_processors(process_group_id)
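The `start_process_group` and `stop_process_group` methods above could be combined into a restart step. The sketch below is one possible shape, not part of this change: `restart_process_group`, `ProcessGroupScheduler`, `FakeClient` and the id `pg-123` are all hypothetical, and the fake client stands in for `NiFiClient` so the snippet runs without nipyapi or a live NiFi instance.

```python
from typing import Protocol


class ProcessGroupScheduler(Protocol):
    """The two NiFiClient methods this sketch relies on."""

    def start_process_group(self, process_group_id: str) -> bool: ...

    def stop_process_group(self, process_group_id: str) -> bool: ...


def restart_process_group(client: ProcessGroupScheduler, process_group_id: str) -> bool:
    """Hypothetical helper: stop a process group, then start it again.

    Returns False without starting if the stop request did not succeed.
    """
    if not client.stop_process_group(process_group_id):
        return False
    return client.start_process_group(process_group_id)


class FakeClient:
    """In-memory stand-in that records calls instead of contacting NiFi."""

    def __init__(self) -> None:
        self.calls = []

    def start_process_group(self, process_group_id: str) -> bool:
        self.calls.append(("start", process_group_id))
        return True

    def stop_process_group(self, process_group_id: str) -> bool:
        self.calls.append(("stop", process_group_id))
        return True


fake = FakeClient()
restarted = restart_process_group(fake, "pg-123")  # "pg-123" is a made-up id
```

Typing the helper against a small `Protocol` rather than `NiFiClient` itself keeps it testable without TLS certificates or credentials.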
diff --git a/pyproject.toml b/pyproject.toml
index a05724622..8b058ce6a 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -3,9 +3,37 @@
[tool.ruff]
line-length = 120
exclude = ["nifi/user-scripts/legacy_scripts"]
-target-version = "py38"
+target-version = "py311"
+indent-width = 4
[tool.ruff.lint]
-# E=pycodestyle, F=pyflakes, B=bugbear, I=isort
-select = ["E", "F", "B", "I"]
+# Enabled rule families: pycodestyle, Pyflakes, pyupgrade, flake8-bugbear, flake8-simplify, isort.
+select = [
+ # pycodestyle
+ "E",
+ # Pyflakes
+ "F",
+ # pyupgrade
+ "UP",
+ # flake8-bugbear
+ "B",
+ # flake8-simplify
+ "SIM",
+ # isort
+ "I",
+]
+
fixable = ["ALL"]
+
+[tool.mypy]
+plugins = ["pydantic.mypy"]
+ignore_missing_imports = true
+strict = false
+
+[tool.isort]
+line_length = 120
+skip = ["venv", "venv-test", "envs", "docker", "models"]
+
+[tool.flake8]
+max-line-length = 120
+exclude = ["venv", "venv-test", "envs", "docker", "models"]
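With the `UP` (pyupgrade) family selected and `target-version = "py311"`, ruff flags older typing idioms such as `from typing import List` (rule UP035, which is why `nifi_api_client.py` carries a `# noqa: UP035`). As a rough illustration of the spellings these rules steer towards, with `first_or_none` being a hypothetical function:

```python
from __future__ import annotations


def first_or_none(items: list[str]) -> str | None:
    """Return the first item, or None for an empty list."""
    # Built-in generics (list[str]) and PEP 604 unions (str | None)
    # replace typing.List and typing.Optional.
    return items[0] if items else None


print(first_or_none(["a", "b"]))
print(first_or_none([]))
```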
diff --git a/scripts/git_freeze_security.sh b/scripts/git_freeze_security.sh
index 3b890055a..f0fd7d9ff 100644
--- a/scripts/git_freeze_security.sh
+++ b/scripts/git_freeze_security.sh
@@ -30,4 +30,4 @@ for path in "${CERT_AND_CONFIG_PATHS[@]}"; do
git ls-files -z "$path" 2>/dev/null | xargs -0 git update-index --skip-worktree || true
done
-echo "✅ Freeze complete — all sensitive or deployment-specific files are now ignored by Git"
\ No newline at end of file
+echo "✅ Freeze complete — all sensitive or deployment-specific files are now ignored by Git"
diff --git a/scripts/git_update_submodules_in_repo.sh b/scripts/git_update_submodules_in_repo.sh
index b40986edb..5e2c628ab 100644
--- a/scripts/git_update_submodules_in_repo.sh
+++ b/scripts/git_update_submodules_in_repo.sh
@@ -26,4 +26,5 @@ git submodule foreach '
#git submodule foreach git pull origin main
git add $(git config -f .gitmodules --get-regexp '^submodule\..*\.path$' | awk '{print $2}') || true
-git commit -m "Update submodules to latest release tags (or main)" || echo "ℹ️ No changes to commit."
\ No newline at end of file
+git commit -m "Update submodules to latest release tags (or main)" || echo "ℹ️ No changes to commit."
+echo "✅ Submodule update complete."
diff --git a/services/nginx/config/nginx.conf b/services/nginx/config/nginx.conf
index 7ad773e9f..486d70328 100644
--- a/services/nginx/config/nginx.conf
+++ b/services/nginx/config/nginx.conf
@@ -238,14 +238,14 @@ http {
# proxy_pass https://nifi$1;
# }
- location ^~ /nifi-standard-content-viewer-2.5.0/ {
+ location ^~ /nifi-standard-content-viewer-2.6.0/ {
proxy_set_header Host nifi;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-ProxyHost $host;
proxy_set_header X-ProxyPort 8443;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-ProxyScheme $scheme;
- proxy_pass https://nifi/nifi-standard-content-viewer-2.5.0/;
+ proxy_pass https://nifi/nifi-standard-content-viewer-2.6.0/;
}
}
}