This guide describes the available methods for interacting with a deployed instance of Sleeper.
It assumes you have a deployed instance of Sleeper and have installed the CLI as described in the getting started guide. See the deployment guide for more information on deploying an instance.
If you just want to test locally, see the documentation on deploying to LocalStack. This has very limited functionality compared to a deployed instance.
It's currently necessary to build the system before any of the clients will work. In the future we may publish pre-built artefacts that will make this unnecessary.
We will build the system in a container invoked with sleeper builder. Run the following commands:
sleeper builder # Create a Docker container with a workspace mounted in from the host directory ~/.sleeper/builder
git clone https://github.com/gchq/sleeper.git --branch main # Get the latest release version of Sleeper
cd sleeper # Navigate into the Git repository
./scripts/build/build.sh
This will take 20-40 minutes. This script creates the necessary artefacts and prepares the workspace to run the other scripts. The build persists across invocations of sleeper builder, and you can run any of Sleeper's scripts and tools from the command line this gives you.
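Because the build takes a while, it can help to fail fast when the build script is not where you expect it, and to record how long the run took. The following is a minimal sketch rather than part of Sleeper: the function name `run_sleeper_build` and its single argument (the path to the cloned repository) are hypothetical, and it assumes it is run from inside the sleeper builder container.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Sleeper): runs the build script from a
# cloned repository and reports how long it took.
run_sleeper_build() {
    local repo_dir="${1:-.}"
    local build_script="$repo_dir/scripts/build/build.sh"

    # Fail fast if the repository has not been cloned where we expect it.
    if [ ! -f "$build_script" ]; then
        echo "Build script not found: $build_script" >&2
        return 1
    fi

    local start end status=0
    start=$(date +%s)
    # Run in a subshell so the caller's working directory is unchanged.
    (cd "$repo_dir" && ./scripts/build/build.sh) || status=$?
    end=$(date +%s)

    echo "Build finished with status $status after $(( (end - start) / 60 )) minute(s)"
    return "$status"
}
```

After sourcing this, `run_sleeper_build ./sleeper` would run the build from the directory created by the clone step. The function returns the build script's own exit status, so it can be chained with other commands.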
Details of all Sleeper configuration properties are available here: Properties. These can be edited in the administration client detailed below, or set during deployment. Also see Sleeper instance configuration.
Data in Sleeper is held in tables. You can always add or remove Sleeper tables from an instance. See the tables documentation for how to define and edit a table.
Data is ingested in large, sorted files which are then added to a Sleeper table. There are a number of options available for creating these files and adding data to the system. See the ingest documentation for details.
See the data retrieval documentation for ways to query a Sleeper table.
In the future it will be possible to export Sleeper table data in bulk. See the data export documentation.
There are clients and scripts in the scripts/deploy and scripts/utility directories that can be used to work with an
existing instance.
Also see the tables documentation for scripts to add/edit Sleeper tables.
We have provided a command line client that will enable you to:
- List Sleeper instance properties
- List Sleeper table names
- List Sleeper table properties
- Change an instance/table property
- Get status reports (also see checking the status of the system)
This client will prompt you for details such as your instance ID and/or the name of the table you want to look at. To adjust property values, it opens a temporary file in a text editor.
You can run this client with the following command:
./scripts/utility/adminClient.sh ${INSTANCE_ID}
If you want to fully compact all files in leaf partitions, but the compaction strategy is not compacting files in a partition, you can run the following script to force compactions to be created for files in leaf partitions that were skipped by the compaction strategy:
./scripts/utility/compactAllFiles.sh ${INSTANCE_ID} <table-name-1> <table-name-2> ...
The clients module can be used as a dependency for an application to interact with Sleeper. This is not currently published but is built with Maven. We have a class SleeperClient that can be used as an entrypoint for direct access to an instance of Sleeper. This requires permissions to interact with the underlying AWS resources. We have an open issue to introduce a REST API that may simplify this in the future (#1786).
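Returning to the utility scripts: the compactAllFiles.sh invocation described earlier takes an instance ID followed by table names. If you call it from your own automation, a thin wrapper can check the arguments first and offer a dry run. This is only a sketch: the function `force_compaction` and the `DRY_RUN` variable are hypothetical names, and it assumes at least one table name must be given and that it is run from the root of the Sleeper repository.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (not part of Sleeper) around compactAllFiles.sh.
# Set DRY_RUN=1 to print the command instead of running it.
force_compaction() {
    # Assumption: an instance ID and at least one table name are required.
    if [ "$#" -lt 2 ]; then
        echo "Usage: force_compaction <instance-id> <table-name> [<table-name> ...]" >&2
        return 2
    fi

    local instance_id="$1"
    shift
    # Build the command as an array so table names are passed through intact.
    local cmd=(./scripts/utility/compactAllFiles.sh "$instance_id" "$@")

    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "${cmd[*]}"
    else
        "${cmd[@]}"
    fi
}
```

Running `DRY_RUN=1 force_compaction my-instance table-a table-b` prints the command that would be executed, which is a cheap way to check the arguments before forcing compactions on a real instance.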
See the Python API documentation for details of the Python client library for Sleeper.
Experimental integrations are available to interact with Sleeper via Athena and Trino.
The fine-grained security documentation contains information on implementing fine-grained security in an application with data held in Sleeper.