
Installing the ingestor

Prerequisites: Docker and Docker Compose installed on the target machine, and familiarity with both; see https://docs.docker.com/compose/

Basic Setup and Configuration

1. Check out the deployment repository at https://github.com/SwissOpenEM/openem-deployment

git clone https://github.com/SwissOpenEM/openem-deployment

2. Modify the .env.example and save it as .env

The .env.example file contains configuration values. Some are specific to the PSI endpoints and can stay as they are; others are facility-specific and need to be adapted.

| Parameter | Example Value | Description | Facility Specific |
|---|---|---|---|
| FACILITY | myfacility | Facility name; used in some naming conventions | Yes |
| INGESTOR_VERSION | v1.0.0 or latest | Version of the ingestor | Yes |
| INGESTOR_DOMAIN | https://ingestor.facility.com | URL of the facility’s ingestor | Yes |
| HOST_COLLECTION_PATH | /server/data | Path to the data directory on the host system | Yes |
| HOST_COLLECTION_NAME | DataServer | Name of the data directory as it appears in the UI | Yes |
| GLOBUS_SOURCE_FACILITY | DCIL | Globus source facility tag; one of DCIL, UNIBAS, UNIGE, UNIBE | Yes |
| GLOBUS_COLLECTION_ROOT_PATH | /server/data | Path of the collection passed to Globus; must match HOST_COLLECTION_PATH | Yes |
| KEYCLOAK_CLIENT_ID | openem-ingestor-DCIL | Keycloak client for this facility’s ingestor; one of openem-ingestor-DCIL, openem-ingestor-UNIBAS, etc. | Yes |
| LIFESCIENCE_EXTRACTOR_ADDITIONAL_PARAMS | --cs 2.7 | Optional; additional parameters for the life science metadata extractor | Yes |
| GLOBUS_DESTINATION_FACILITY | PSI | Destination facility for Globus; one of PSI, PSI_QA, PS_DEV | No |
| SCICAT_BACKEND_URL | https://dacat.psi.ch | URL of SciCat’s backend | No |
| SCICAT_FRONTEND_URL | https://discovery.psi.ch | URL of SciCat’s frontend | No |
| GLOBUS_TRANSFER_PROXY_URL | https://globus-proxy..psi.ch | URL of the Globus proxy | No |
| KEYCLOAK_URL | https://kc.psi.ch | URL of the Keycloak instance | No |
| KEYCLOAK_REALM | awi | Name of the Keycloak realm | No |
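
Before starting the service, it can help to confirm that the .env file exists and that the facility-specific values are filled in. The following is a minimal sketch, not part of the deployment repository: the helper names init_env_file, env_value and check_env_file are hypothetical, and it assumes plain KEY=value lines without quoting.

```shell
# Create .env from .env.example unless .env already exists.
# $1: directory containing .env.example (assumed location; defaults to ".")
init_env_file() {
    dir=${1:-.}
    if [ -e "$dir/.env" ]; then
        echo "$dir/.env already exists, leaving it untouched"
        return 0
    fi
    cp "$dir/.env.example" "$dir/.env"
}

# Print the value assigned to key $2 in env file $1 (last assignment wins).
env_value() {
    sed -n "s/^$2=//p" "$1" | tail -n 1
}

# Check that the required facility-specific parameters are non-empty and
# that GLOBUS_COLLECTION_ROOT_PATH matches HOST_COLLECTION_PATH.
# LIFESCIENCE_EXTRACTOR_ADDITIONAL_PARAMS is optional and not checked.
check_env_file() {
    file=$1
    status=0
    for key in FACILITY INGESTOR_VERSION INGESTOR_DOMAIN \
               HOST_COLLECTION_PATH HOST_COLLECTION_NAME \
               GLOBUS_SOURCE_FACILITY GLOBUS_COLLECTION_ROOT_PATH \
               KEYCLOAK_CLIENT_ID; do
        if [ -z "$(env_value "$file" "$key")" ]; then
            echo "missing or empty: $key"
            status=1
        fi
    done
    host_path=$(env_value "$file" HOST_COLLECTION_PATH)
    globus_path=$(env_value "$file" GLOBUS_COLLECTION_ROOT_PATH)
    if [ "$host_path" != "$globus_path" ]; then
        echo "GLOBUS_COLLECTION_ROOT_PATH must match HOST_COLLECTION_PATH"
        status=1
    fi
    return "$status"
}

# Example usage (run from the directory that holds .env.example):
#   init_env_file . && check_env_file ./.env
```

The check only looks at the parameters marked as facility-specific in the table; the PSI endpoint values are left as shipped.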

The complete configuration of the ingestor can be found in the docker-compose file of the ingestor, see https://raw.githubusercontent.com/SwissOpenEM/openem-deployment/refs/heads/main/services/ingestor/compose.yaml.

It is not generally necessary to modify the configuration except for updating the Metadata Extractor, see section Advanced Configuration.

3. Start the service and verify that it is running without errors

Start in detached mode

docker compose up -d

This starts a container named openem-ingestor. Check its logs:

docker logs openem-ingestor

There should be no messages with an ERROR tag in the logs.
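
For long logs, the check can be done by counting matching lines rather than reading them. This is a rough sketch: the helper name count_error_lines is hypothetical, and it assumes that error messages contain the literal string ERROR, since the exact log format is not described here.

```shell
# Count the lines on standard input that contain the string ERROR.
# grep -c exits non-zero when the count is 0, so "|| true" keeps the
# helper usable in scripts that run with "set -e".
count_error_lines() {
    grep -c 'ERROR' || true
}

# Example usage (docker logs writes to both stdout and stderr):
#   docker logs openem-ingestor 2>&1 | count_error_lines
```

A result of 0 corresponds to the expectation stated above.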

Verify that the container is reachable by opening

https://<INGESTOR_DOMAIN>/version

in a browser.
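
The same reachability check can be scripted, for example on a headless machine. The sketch below is an assumption-laden illustration: wait_for_url is a hypothetical helper, it only checks that the endpoint answers with an HTTP success status, and it makes no assumption about the content of the response.

```shell
# Poll URL $1 until it answers with an HTTP success status.
# $2: number of attempts (default 10)
# $3: seconds to wait between attempts (default 3)
wait_for_url() {
    url=$1
    attempts=${2:-10}
    delay=${3:-3}
    i=1
    while [ "$i" -le "$attempts" ]; do
        # -f: fail on HTTP errors, -sS: quiet but still report errors
        if curl -fsS --max-time 5 -o /dev/null "$url"; then
            echo "reachable: $url"
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "not reachable after $attempts attempts: $url" >&2
    return 1
}

# Example usage, with the example domain from the parameter table:
#   wait_for_url "https://ingestor.facility.com/version"
```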

Advanced Configuration

Detailed information about the configuration of the ingestor can be found in its repository: https://github.com/SwissOpenEM/Ingestor

Caddy Reverse Proxy

Installing an additional reverse proxy is generally not needed if Globus Connect Server is installed alongside the ingestor.

Metadata Extractors

Updating an extractor requires changing its version and checksum in the configuration within the Compose file (compose.yaml). Both can be found on the extractor's release page, e.g. https://github.com/osc-em/oscem-extractor-life/releases

See https://github.com/SwissOpenEM/Ingestor for a more detailed description.
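
When copying a checksum from a release page, it can be worth comparing it against a locally downloaded release asset first. The sketch below assumes the published checksum is a SHA-256 hex digest, which is not stated here and should be confirmed on the release page; verify_sha256 is a hypothetical helper.

```shell
# Compare the SHA-256 digest of file $1 with the expected hex digest $2.
# Assumption: the release page publishes SHA-256 checksums.
verify_sha256() {
    actual=$(sha256sum "$1" | cut -d ' ' -f 1)
    if [ "$actual" = "$2" ]; then
        echo "checksum OK"
    else
        echo "checksum mismatch: got $actual" >&2
        return 1
    fi
}

# Example usage (file name and digest are placeholders):
#   verify_sha256 ./downloaded-extractor-asset "<checksum from release page>"
```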

Schemas can be updated by restarting the ingestor if the schema URLs point to latest rather than to a specific version. Otherwise, stop the ingestor, adapt the URL, and start the ingestor again.

User Identity

If the ingestor needs to run as a specific user, add the following variables to the .env file:

| Parameter | Example Value | Description | Facility Specific |
|---|---|---|---|
| UID | 1001 | User id | Yes |
| GID | 1001 | Group id | Yes |
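
The numeric ids for an existing account can be looked up with the standard id command. The following sketch prints them in the KEY=value form used by the .env file; identity_env_lines is a hypothetical helper, and the account name in the usage example is a placeholder.

```shell
# Print UID= and GID= lines for user $1 (default: the current user).
# The values are only printed; bash treats UID as a read-only shell
# variable, so it is deliberately not assigned here.
identity_env_lines() {
    user=${1:-$(id -un)}
    printf 'UID=%s\nGID=%s\n' "$(id -u "$user")" "$(id -g "$user")"
}

# Example usage ("ingestor" is a placeholder account name):
#   identity_env_lines ingestor
```

Review the output before adding it to .env, so that existing UID or GID entries are not duplicated.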