MQuery is an HTTP API server for mining language corpora using the Manatee-Open engine. Unlike other Manatee-based solutions, MQuery uses more fine-tuned C bindings that do not rely on SWIG, and it is built around a worker queue architecture for efficient and scalable query processing.
The simplest way to run MQuery is using Docker Compose, which automatically sets up the server, worker, and Redis:
- Clone the repository:
    git clone https://github.com/czcorpus/mquery.git
    cd mquery
- Create a Docker configuration file conf-docker.json based on conf.sample.json:
    cp conf.sample.json conf-docker.json
- Edit conf-docker.json to match your setup:
  - Set listenAddress to 0.0.0.0 (to accept connections from outside the container)
  - Set listenPort to 8989
  - Set Redis host to redis (the service name in docker-compose.yml)
  - Configure your corpora paths:
    - registryDir: /var/lib/manatee/registry
    - splitCorporaDir: /var/lib/manatee/split
- Place your corpus data and registry files in directories that will be mounted:
  - The docker-compose setup creates volumes for corpus data at /var/lib/manatee
  - You can modify the volume mounts in docker-compose.yml to point to your existing corpus directories
- Start the services:
    docker-compose up -d
- Access the API at http://localhost:8989
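Before starting the containers, it can help to confirm that conf-docker.json really contains the values listed in the configuration step. The following is a minimal sketch, not part of MQuery: check_conf is a hypothetical helper, and it assumes the file is pretty-printed so that each key and its value share one line. The Redis host is not checked because its exact key name is not given here.

```shell
#!/bin/sh
# check_conf FILE: succeed only if FILE contains the expected settings.
# Hypothetical helper; assumes one "key": value pair per line.
check_conf() {
    conf="$1"
    status=0
    # Each pattern is a key from the steps above followed by its expected value.
    for pattern in \
        '"listenAddress"[[:space:]]*:[[:space:]]*"0\.0\.0\.0"' \
        '"listenPort"[[:space:]]*:[[:space:]]*8989([^0-9]|$)' \
        '"registryDir"[[:space:]]*:[[:space:]]*"/var/lib/manatee/registry"' \
        '"splitCorporaDir"[[:space:]]*:[[:space:]]*"/var/lib/manatee/split"'
    do
        if ! grep -Eq "$pattern" "$conf"; then
            echo "missing or different: $pattern" >&2
            status=1
        fi
    done
    return "$status"
}

# Example:
#   check_conf conf-docker.json && docker-compose up -d
```

The check is purely textual, so it does not validate the JSON itself or the nesting of the keys; it only catches a forgotten or mistyped value.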
The Docker Compose setup includes:
- mquery-server: HTTP API server (port 8989)
- mquery-worker: Background worker for processing corpus queries
- redis: Redis database for job queuing and results caching
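Once the services are up, a quick way to confirm that the server is accepting connections on port 8989 is to poll it with curl. This is a sketch under assumptions: wait_for_api is a hypothetical helper, the retry count and delay are arbitrary defaults, and any HTTP response (whatever its status code) is treated as "ready" because the routes served at the root URL are not described here.

```shell
#!/bin/sh
# wait_for_api URL [ATTEMPTS] [DELAY]: poll URL until any HTTP response arrives.
# Hypothetical helper; returns 0 when the server answers, 1 otherwise.
wait_for_api() {
    url="$1"
    attempts="${2:-30}"
    delay="${3:-1}"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # Without -f, curl exits 0 for any HTTP status, so this only tests
        # that the server accepts the connection and answers.
        # --noproxy '*' makes curl talk to the server directly.
        if curl -s -o /dev/null --max-time 2 --noproxy '*' "$url"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Example:
#   wait_for_api http://localhost:8989 && echo "MQuery is reachable"
```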
To manage the running services:
- View logs: docker-compose logs -f
- Stop services: docker-compose down
- Rebuild after code changes: docker-compose up -d --build
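The commands above use the standalone docker-compose binary. Newer Docker installations ship Compose as a plugin that is invoked as docker compose instead. The sketch below, with the hypothetical helper name detect_compose, picks whichever form is available so the same commands work on both kinds of installation.

```shell
#!/bin/sh
# detect_compose: print the Compose command available on this machine.
# Prefers the "docker compose" plugin and falls back to docker-compose.
detect_compose() {
    if docker compose version >/dev/null 2>&1; then
        echo "docker compose"
    elif command -v docker-compose >/dev/null 2>&1; then
        echo "docker-compose"
    else
        echo "no Compose command found" >&2
        return 1
    fi
}

# Example (word splitting of $compose is intentional):
#   compose=$(detect_compose) && $compose logs -f
```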
If you prefer to install MQuery manually without Docker, you will need:
- a working Linux server with the Manatee-open library installed
- Redis database
- Go language compiler and tools
- (optional) an HTTP proxy server (Nginx, Apache, ...)
Installation steps:
- Install the Go language environment, either via a package manager or manually from the Go download page
  - make sure /usr/local/go/bin and ~/go/bin are in your $PATH so you can run any installed Go tools without specifying a full path
- Install Manatee-open from the download page. No specific language bindings are required:
    ./configure --with-pcre --disable-python && make && sudo make install && sudo ldconfig
- Get MQuery sources (git clone --depth 1 https://github.com/czcorpus/mquery.git)
- Run ./configure
- Run make
- Run make install
  - the application will be installed in /opt/mquery
  - for data and registry, the /var/opt/corpora/data and /var/opt/corpora/registry directories will be created
  - systemd units mquery-server.service and mquery-worker-all.target will be created
- Copy at least one corpus and its configuration (registry) into the respective directories (/var/opt/corpora/data, /var/opt/corpora/registry)
- Update the corpora entries in the /opt/mquery/conf.json file to match your installed corpora
- Start the services:
    systemctl start mquery-server
    systemctl start mquery-worker-all.target
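Starting the services before any corpus is in place is an easy mistake to make, so a small pre-flight check may be useful. The sketch below is not part of MQuery: check_corpora is a hypothetical helper, and it assumes that each registry entry is a regular file placed directly in the registry directory.

```shell
#!/bin/sh
# check_corpora DATA_DIR REGISTRY_DIR: succeed if both directories exist
# and the registry directory contains at least one regular file.
check_corpora() {
    data_dir="$1"
    registry_dir="$2"
    for d in "$data_dir" "$registry_dir"; do
        if [ ! -d "$d" ]; then
            echo "missing directory: $d" >&2
            return 1
        fi
    done
    for f in "$registry_dir"/*; do
        # An unmatched glob stays literal, so test that the entry is a file.
        if [ -f "$f" ]; then
            return 0
        fi
    done
    echo "no registry files in $registry_dir" >&2
    return 1
}

# Example:
#   check_corpora /var/opt/corpora/data /var/opt/corpora/registry \
#       && sudo systemctl start mquery-server mquery-worker-all.target
```

The check does not confirm that the entries in /opt/mquery/conf.json match the installed corpora; that still has to be verified by hand.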
For the most recent API Docs, please see https://korpus.cz/mquery-test/docs/