Azul Metastore enables storage and retrieval of binary files and plugin execution results
- Store plugin results + info via ingestors
- Deletion of old data via age-off
- Retry of failed execution
- Manual data deletion
- Expose functionality via restapi using azul-restapi-server
Supplies REST API endpoints for:
- binary submission (malpz and cart support)
- download of files (selectable neutering format)
- download of other plugin artifacts, reports, etc.
- query whether content (still) exists for binary
- re-enqueuing processing
- metadata interaction / search
Most functionality is available via the command line script.
$ azul-metastore --help
Usage: azul-metastore [OPTIONS] COMMAND [ARGS]...
Entrypoint to the program.
Options:
--help Show this message and exit.
Commands:
age-off Delete expired indices.
force-update-templates Force opensearch templates to be added.
ingest-binary Ingest binary events from dispatcher.
ingest-plugin Ingest plugin events from dispatcher.
ingest-status Ingest status events from dispatcher.
process-lost-tasks Retry failed processing tasks from dispatcher.
purge Purge metadata and data.
apply-opensearch-config Create roles in Opensearch that are required by Azul to function.
age-off: Deletes the expired indices from OpenSearch, based on the source configuration. This minimises the amount of data held in OpenSearch by removing data that is no longer required.
force-update-templates: Updates the OpenSearch templates for the current OpenSearch indices. Useful when a significant change has been made to the OpenSearch model and you need to increment the index prefix. This command creates the new templates to be used on the new indices.
ingest-binary: Runs as a pod in Azul and queries Kafka through dispatcher for binary topics that have new data. That data is indexed, transformed into OpenSearch documents, and then inserted into OpenSearch.
ingest-plugin: Same as the binary ingestor, but for plugin events.
ingest-status: Same as the binary ingestor, but for status events.
process-lost-tasks: Runs as a pod in Azul and looks for tasks that have a dequeued event but no associated completion event. When such a task is found, a message is sent to dispatcher to retry it.
purge: Removes all metadata and binary data about a particular hash from Azul. It does this by deleting all the data out of S3 and OpenSearch through dispatcher and metastore.
apply-opensearch-config: Creates the roles in OpenSearch associated with the current security configuration, along with the necessary default roles. To modify the roles this command creates, update the security labels.
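The check behind process-lost-tasks (a dequeued event with no matching completion event) can be pictured with a small in-memory sketch. This is an illustration only, not the metastore's implementation: the StatusEvent class, its field names, the status strings, and the timeout parameter are all assumptions made for the example, and the real command queries OpenSearch rather than a Python list.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class StatusEvent:
    """Hypothetical, simplified status event; not Azul's real schema."""

    task_id: str
    status: str  # "dequeued" or "completed" in this sketch
    timestamp: datetime


def find_lost_tasks(events, now, timeout):
    """Return ids of tasks dequeued more than `timeout` ago with no completion event.

    The timeout is an assumption of this sketch, used so that tasks which
    were only just dequeued are not reported as lost.
    """
    dequeued_at = {}
    completed = set()
    for event in events:
        if event.status == "dequeued":
            # Keep the most recent dequeue time for each task.
            previous = dequeued_at.get(event.task_id)
            if previous is None or event.timestamp > previous:
                dequeued_at[event.task_id] = event.timestamp
        elif event.status == "completed":
            completed.add(event.task_id)
    return sorted(
        task_id
        for task_id, seen in dequeued_at.items()
        if task_id not in completed and now - seen > timeout
    )


if __name__ == "__main__":
    now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    sample = [
        StatusEvent("a", "dequeued", now - timedelta(hours=2)),
        StatusEvent("a", "completed", now - timedelta(hours=1)),
        StatusEvent("b", "dequeued", now - timedelta(hours=2)),
    ]
    # Task "b" was dequeued but never completed, so it would be retried.
    print(find_lost_tasks(sample, now, timedelta(minutes=30)))
```

In the real system each id returned by a check like this would result in a retry message being sent to dispatcher.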
There is also a restapi component that can only be used via the azul-restapi-server project.
Configuration is controlled through environment variables. See azul_metastore/settings.py for more info.
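As a rough illustration of environment-driven configuration, the sketch below reads two values from the environment with fallbacks. The variable names (EXAMPLE_METASTORE_OPENSEARCH_URL, EXAMPLE_METASTORE_INDEX_PREFIX), the defaults, and the ExampleSettings class are hypothetical and are not taken from azul_metastore/settings.py, which remains the authoritative list of settings.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ExampleSettings:
    """Hypothetical settings; the real names live in azul_metastore/settings.py."""

    opensearch_url: str
    index_prefix: str


def load_settings(environ=None):
    """Build settings from a mapping of environment variables.

    Passing a plain dict makes the function easy to exercise in tests;
    when nothing is passed, the process environment is used.
    """
    env = os.environ if environ is None else environ
    return ExampleSettings(
        # Both variable names and defaults are assumptions for this sketch.
        opensearch_url=env.get("EXAMPLE_METASTORE_OPENSEARCH_URL", "http://localhost:9200"),
        index_prefix=env.get("EXAMPLE_METASTORE_INDEX_PREFIX", "azul"),
    )


if __name__ == "__main__":
    print(load_settings({"EXAMPLE_METASTORE_INDEX_PREFIX": "dev"}))
```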
The Azul team do not recommend using the metastore as a library, as there is no guarantee of the stability of any public functions.
Run unit tests via pytest tests/unit
To set up a local instance of OpenSearch, please see demo-cluster/readme.md.
Run all tests via pytest tests.
Classes and utilities shared between other parts of the project.
Handles conversion between the metastore searchable format and the dispatcher message format.
Handles the querying of data from OpenSearch.
Exposes azul-metastore functions via the REST API.
Assorted scripts to assist different kinds of development. Not intended for use in production systems.
To run metastore's restapi locally you should install azul-restapi-server and a development version of metastore.
Refer to azul-restapi-server for how to start the server locally.
Dependencies are managed in the pyproject.toml and debian.txt files.
Version pinning is achieved using the uv.lock file.
Because the uv.lock file is configured to use a private registry, external developers using uv will need to delete the existing uv.lock file and update the project configuration to point to the publicly available PyPI registry instead.
To add new dependencies, it's recommended to use uv with the command uv add <new-package>, or for a dev package, uv add --dev <new-dev-package>.
Linting and code style are handled by ruff, which is configured via pyproject.toml.
The debian.txt file manages the debian dependencies that need to be installed on development systems and docker images.
Sometimes the debian.txt file is insufficient. In that case, the Dockerfile may need to be modified directly to install complex dependencies.