Merged
2 changes: 1 addition & 1 deletion deploy/nifi.env
@@ -51,7 +51,7 @@ NIFI_PYTHON_EXTENSIONS_SOURCE_DIRECTORY_DEFAULT="/opt/nifi/nifi-current/python_e
# nifi.python.working.directory=/opt/nifi/user-scripts
NIFI_PYTHON_WORKING_DIRECTORY="/opt/nifi/user-scripts"

-LOG_LEVEL="ERROR"
+NIFI_LOG_LEVEL="ERROR"

NIFI_AUTH=tls

3 changes: 0 additions & 3 deletions deploy/services-dev.yml
@@ -40,9 +40,6 @@ services:
- NIFI_SECURITY_DIR=${NIFI_SECURITY_DIR:-../security/nifi_certificates/}
- ELASTICSEARCH_SECURITY_DIR=${ELASTICSEARCH_SECURITY_DIR:-../security/es_certificates/}
volumes:
# INFO: mapping custom development directory
- ../nifi/devel:/opt/nifi/devel

# INFO: drivers folder
- ../nifi/drivers:/opt/nifi/drivers

5 changes: 1 addition & 4 deletions deploy/services.yml
@@ -444,12 +444,9 @@ services:
- NIFI_OUTPUT_PORT=${NIFI_OUTPUT_PORT:-8082}
- NIFI_INPUT_SOCKET_PORT=${NIFI_INPUT_SOCKET_PORT:-10000}
volumes:
# INFO: mapping custom development directory
- ../nifi/devel:/opt/nifi/devel

# INFO: drivers folder
- ../nifi/drivers:/opt/nifi/drivers

# INFO: if there are local changes, map these content from local host to container
# (normally, these 3 directories below are bundled with our NiFi image)
# N.B. The container user may not have the permission to read these directories/files.
46 changes: 1 addition & 45 deletions docs/main.md
@@ -1,46 +1,2 @@
# Introduction
This repository proposes a possible next step for the free-text data processing capabilities implemented as [CogStack-Pipeline](https://github.com/CogStack/CogStack-Pipeline), shaping the solution more towards Platform-as-a-Service.

CogStack-NiFi contains example recipes using [Apache NiFi](https://nifi.apache.org/) as the key data workflow engine with a set of services for documents processing with NLP.
Each component implementing key functionality, such as Text Extraction or Natural Language Processing, runs as a service where the data routing between the components and data source/sink is handled by Apache NiFi.
Moreover, NLP services are expected to implement a uniform RESTful API so that they plug easily into existing document processing pipelines, making it possible to use any NLP application in the stack.

## Development

Please note that the project is under constant improvement, bringing new features or services that might impact current deployments. This may affect you when upgrading, so be sure to check the release notes and the documentation beforehand.

If you wish to contribute to the project, submit a pull request and we will review it.

## Asking questions
Feel free to ask questions on the GitHub issue tracker or on our [discourse website](https://discourse.cogstack.org), which is frequently used by our development team!
<br>

## Project organisation
The project is organised in the following directories:
- [`nifi`](https://github.com/CogStack/CogStack-NiFi/tree/main/nifi/) - custom Docker image of Apache NiFi with configuration files, drivers, example workflows and custom user resources.
- [`security`](https://github.com/CogStack/CogStack-NiFi/tree/main/security/) - scripts to generate SSL keys and certificates for Apache NiFi and related services (when needed) with other security-related requirements.
- [`services`](https://github.com/CogStack/CogStack-NiFi/tree/main/services/) - available services with their corresponding configuration files and resources.
- [`deploy`](https://github.com/CogStack/CogStack-NiFi/tree/main/deploy/) - an example deployment of Apache NiFi with related services.
- [`scripts`](https://github.com/CogStack/CogStack-NiFi/tree/main/scripts/) - helper scripts containing setup tools, sample ES ingestion, bash ingestion into DB samples etc.
- [`data`](https://github.com/CogStack/CogStack-NiFi/tree/main/data/) - any data that you wish to ingest should be placed here.

### Branches

- main: main branch, production releases.
- devel: this branch contains experimental/unstable Docker images that may cause irregular behaviour or crashes.

## Documentation and getting started

Knowledge requirements: Docker usage (mandatory), Python, Linux/UNIX understanding.

The official documentation is available [here](https://cogstack-nifi.readthedocs.io/en/latest/).

As a good starting point, [deployment](https://cogstack-nifi.readthedocs.io/en/latest/deploy/main.html) walks through an example deployment with some workflow examples.

Read the [NiFi](https://cogstack-nifi.readthedocs.io/en/latest/nifi/main.html) section carefully, as it explains how NiFi is set up, its configuration, and production setup tips.

All issues are tracked in [README](https://cogstack-nifi.readthedocs.io/en/latest/deploy/main.html); check that section before opening a bug report ticket.

## Important news and updates

Please check [IMPORTANT_NEWS](https://cogstack-nifi.readthedocs.io/en/latest/news.html) for any major changes that might affect your deployment and <strong>security problems</strong> that have been discovered.
```{include} ../README.md
```
21 changes: 19 additions & 2 deletions docs/news.md
@@ -1,4 +1,5 @@
# News

<strong>This document covers important news about the components of CogStack as a whole. Major security issues and major changes that might break existing deployments are covered here, along with how to handle them.</strong>
<br>
<br>
@@ -10,12 +11,28 @@ Since the discovery of the Log4J package vulnerability (https://www.ncsc.gov.uk/
A summary of the steps needed to easily upgrade any CogStack components on an existing deployment:

For both instances (old and NiFi versions of the pipeline):
<br>

- update Elasticsearch to version 7.16.1 or later if you are using the native version; for OpenDistro use 1.13.3, and for OpenSearch use 1.2.1. All of these versions, with their compose configuration, are on the main branch of the NiFi repo. The only change needed is a version increment in the docker-compose file (e.g. https://github.com/CogStack/CogStack-NiFi/blob/main/deploy/services.yml , see the kibana/elasticsearch sections), followed by pulling the new images.

For the old pipeline:

- re-pull the latest Docker image (`docker pull cogstacksystems/cogstack-pipeline:latest`)

For NiFi:

- re-pull the NiFi image (`docker pull cogstacksystems/cogstack-nifi:latest`)
- re-pull the tika image (`docker pull cogstacksystems/tika-service:latest`)
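
The NiFi re-pull steps above can be scripted. This is only a minimal sketch, assuming a POSIX shell and a local Docker CLI; the function name `repull_images` and the `DOCKER` override are illustrative and not part of the repository. The image names are the ones listed above.

```shell
#!/usr/bin/env sh
# Images named in the NiFi upgrade steps above; adjust the tags for your deployment.
IMAGES="cogstacksystems/cogstack-nifi:latest cogstacksystems/tika-service:latest"

# DOCKER is a hypothetical override: set DOCKER=echo to print the commands
# instead of running them.
DOCKER="${DOCKER:-docker}"

# Pull each image in turn and stop at the first failure.
repull_images() {
    for image in $IMAGES; do
        "$DOCKER" pull "$image" || return 1
    done
}
```

After sourcing the file, call `repull_images`. Setting `DOCKER=echo` first gives a dry run that only prints the `pull` commands.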


## 01-10-2025 NiFi 2.0 Release

A new version of NiFi, along with the long-awaited NiFi registry flow, has been released:

- massive repository structure changes
- new scripts and certificate generation
- new NiFi processors and a revision of old scripts
- restructured services and references to them

Things to consider before upgrading:

- PLEASE back up a copy of your current NiFi repository before upgrading.
- PLEASE BE AWARE THAT THE TEMPLATES FROM NIFI 1.0 ARE NOT COMPATIBLE WITH NIFI 2.0; back them up!
- If you wish to migrate the NiFi templates, you will have to keep multiple versions operational.
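
One way to follow the backup advice above is to archive the directory before upgrading. The sketch below assumes that "NiFi repository" means a directory on disk (for example, your checkout of the repo) and that `tar` is available. The function name `backup_nifi_repo` and both paths are hypothetical.

```shell
#!/usr/bin/env sh
# Archive a directory into a timestamped .tar.gz and print the archive path.
# Usage: backup_nifi_repo <source-directory> <backup-directory>
backup_nifi_repo() {
    src="$1"
    dest_dir="$2"

    # Refuse to continue if the source directory does not exist.
    [ -d "$src" ] || { echo "not a directory: $src" >&2; return 1; }
    mkdir -p "$dest_dir" || return 1

    stamp="$(date +%Y%m%d-%H%M%S)"
    archive="$dest_dir/nifi-backup-$stamp.tar.gz"

    # -C keeps only the top-level directory name inside the archive,
    # rather than the full absolute path.
    tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" && echo "$archive"
}
```

For example, `backup_nifi_repo ./CogStack-NiFi "$HOME/backups"` (both paths are placeholders) writes the archive and prints its location. Keep the backup directory outside the directory being archived.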
6 changes: 3 additions & 3 deletions docs/requirements.txt
@@ -1,3 +1,3 @@
-Sphinx~=5.0
-sphinx-rtd-theme~=1.0
-myst-parser~=0.17
+Sphinx==8.2.3
+sphinx-rtd-theme==3.0.2
+myst-parser==4.0.1