NOMAD Oasis: test users unauthorized in example setup (local keycloak)

I am struggling to get the example setup provided by the Oasis documentation running.

I have taken the Docker setup with keycloak with these few modifications:

  • Disabled (commented out) North in the Docker YAML and the nginx config (North is not a priority right now)
  • In docker-compose.yaml, set KEYCLOAK_FRONTEND_URL=https://my-oasis.org/keycloak/auth (as pointed out in the install docs)
  • In configs/nomad.yaml, set public_server_url: 'https://my-oasis.org/keycloak/auth/' (also according to the docs)
  • Changed the admin credentials

Issue 1: “Unknown user (401)”

The provided keycloak realm config brings a test user. Login works.

If I now attempt Publish → Uploads, it won’t proceed, but fails with the message “You are logged in with an unknown user (401)”.

If I log in with the admin credentials on https://my-public-hostname/keycloak/auth , I can configure the realm. I created another user, but it behaves identically.

The logs of the nginx proxy confirm that there is indeed an access issue:

$SERVER_IP - - [22/Feb/2024:14:49:10 +0000] "GET /nomad-oasis/api/v1/uploads?page_size=10&page=1&order_by=upload_create_time&order=desc HTTP/1.1" 401 51 "https://my-oasis.org/nomad-oasis/gui/user/uploads" "$browser" "$CLIENT_IP"
$SERVER_IP - - [22/Feb/2024:14:49:10 +0000] "GET /nomad-oasis/api/v1/uploads?is_published=false&roles=main_author&page_size=10000&order_by=upload_create_time&order=desc HTTP/1.1" 401 51 "https://my-oasis.org/nomad-oasis/gui/user/uploads" "$browser" "$CLIENT_IP"
$SERVER_IP - - [22/Feb/2024:14:49:10 +0000] "GET /nomad-oasis/api/v1/uploads/command-examples HTTP/1.1" 401 51 "https://my-oasis.org/nomad-oasis/gui/user/uploads" "$browser" "$CLIENT_IP"
$SERVER_IP - - [22/Feb/2024:14:49:11 +0000] "GET /nomad-oasis/api/v1/uploads?page_size=10&page=1&order_by=upload_create_time&order=desc HTTP/1.1" 401 51 "https://my-oasis.org/nomad-oasis/gui/user/uploads" "$browser" "$CLIENT_IP"
$SERVER_IP - - [22/Feb/2024:14:49:11 +0000] "GET /nomad-oasis/api/v1/uploads/command-examples HTTP/1.1" 401 51 "https://my-oasis.org/nomad-oasis/gui/user/uploads" "$browser" "$CLIENT_IP"
$SERVER_IP - - [22/Feb/2024:14:49:11 +0000] "GET /nomad-oasis/api/v1/uploads?is_published=false&roles=main_author&page_size=10000&order_by=upload_create_time&order=desc HTTP/1.1" 401 51 "https://my-oasis.org/nomad-oasis/gui/user/uploads" "$browser" "$CLIENT_IP"
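
For longer logs, a small script can summarize which API paths are rejected. The following is only a sketch: it assumes the nginx log uses the combined-style format shown above, the function name `tally_statuses` is made up for illustration, and the sample lines are abbreviated copies of the entries above.

```python
import re
from collections import Counter

# Matches the request line and status code of a combined-style access log entry.
LOG_PATTERN = re.compile(
    r'"(?P<method>[A-Z]+) (?P<target>\S+) HTTP/[\d.]+" (?P<status>\d{3}) '
)


def tally_statuses(lines):
    """Count (path, status) pairs, ignoring query strings."""
    counts = Counter()
    for line in lines:
        match = LOG_PATTERN.search(line)
        if match is None:
            continue  # not an access log entry
        path = match.group("target").split("?", 1)[0]
        counts[(path, int(match.group("status")))] += 1
    return counts


# Abbreviated sample entries (query strings and referrers shortened).
sample = [
    '$SERVER_IP - - [22/Feb/2024:14:49:10 +0000] "GET /nomad-oasis/api/v1/uploads?page_size=10&page=1 HTTP/1.1" 401 51 "-" "$browser" "$CLIENT_IP"',
    '$SERVER_IP - - [22/Feb/2024:14:49:10 +0000] "GET /nomad-oasis/api/v1/uploads/command-examples HTTP/1.1" 401 51 "-" "$browser" "$CLIENT_IP"',
]

for (path, status), count in sorted(tally_statuses(sample).items()):
    print(status, count, path)
```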

Which property must be set for a user to have access to the upload process? Why is this disabled in the reference setup from the documentation?

What puzzles me is that I had played around with another test Oasis before and successfully uploaded data there (example datasets etc.).

Issue 2: No user access to keycloak account console

If logged in as a user, NOMAD offers a link to the account settings at the top right (https://my-public-hostname/keycloak/auth/realms/nomad/account/). It displays the error message “failed to initialize keycloak”, and then is just stuck at the “Account Console loading …” screen.

A look into the logs of the nginx proxy:

$SERVER_IP - - [22/Feb/2024:14:37:20 +0000] "GET /keycloak/auth/realms/nomad/protocol/openid-connect/login-status-iframe.html/init?client_id=account-console&origin=https%3A%2F%2Fmy-oasis.org HTTP/1.1" 403 0 "-" "$browser" "$CLIENT_IP"

This means that unprivileged users currently have no access to their account data.


I only have a superficial understanding of how the authentication process in NOMAD Oasis works. The experiences so far were not the most encouraging for the more sophisticated, production-grade setup that we are aiming at.

I have no doubt that a working setup is possible for self-hosted keycloak instances. It’s just that the example setup given in the documentation is not working out of the box, and a first debugging round has not led me to any insightful progress on why.

All containers have been pulled today. I will provide detailed version information if necessary.

The “Unknown user (401)” suggests that the keycloak+realm that the backend uses is not the same as the one that UI is using to log you in.

Can you check that, when you log in on the Oasis UI, you are actually forwarded to your keycloak (https://my-oasis.org/keycloak/auth/) and not to the global NOMAD keycloak (https://nomad-lab.eu/fairdi/keycloak/auth)? The URL where you see the login also contains the realm name. Please also check that this is the correct one.
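
If it helps, this check can be done on a copied login URL with a few lines of Python. This is only a sketch: the helper name `check_login_url` is made up, and it assumes the usual Keycloak URL layout in which the realm name follows `/realms/` directly after the configured base path.

```python
from urllib.parse import urlsplit


def check_login_url(login_url, expected_base, expected_realm):
    """Return a list of problems found in the login URL (empty if it looks right)."""
    problems = []
    url = urlsplit(login_url)
    base = urlsplit(expected_base)

    # The login page must be served by the Oasis' own keycloak host.
    if url.netloc != base.netloc:
        problems.append(f"host is {url.netloc!r}, expected {base.netloc!r}")

    # Assumed layout: <base path>/realms/<realm name>/...
    prefix = base.path.rstrip("/") + "/realms/"
    if not url.path.startswith(prefix):
        problems.append(f"path does not start with {prefix!r}")
    else:
        realm = url.path[len(prefix):].split("/", 1)[0]
        if realm != expected_realm:
            problems.append(f"realm is {realm!r}, expected {expected_realm!r}")
    return problems


# Paste the URL from the browser's address bar on the login page here.
login_url = "https://my-oasis.org/keycloak/auth/realms/nomad/protocol/openid-connect/auth?client_id=nomad_public"
print(check_login_url(login_url, "https://my-oasis.org/keycloak/auth/", "nomad"))
```

An empty list means that both the host and the realm match the values from the nomad.yaml.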

Can you post the nomad.yaml and docker-compose.yaml with your changes?

Thank you very much for your assistance!

As mentioned before, my modifications to the example configs are minor. Find the configs here:

docker-compose.yaml
version: "3.4"

services:
  # keycloak user management
  keycloak:
    restart: unless-stopped
    image: jboss/keycloak:16.1.1
    container_name: nomad_oasis_keycloak
    environment:
      - PROXY_ADDRESS_FORWARDING=true
      - KEYCLOAK_USER=admin
      - KEYCLOAK_PASSWORD=verysecretpassword
      - KEYCLOAK_FRONTEND_URL=https://my-oasis.org/keycloak/auth
      - KEYCLOAK_IMPORT="/tmp/nomad-realm.json"
    command:
      - "-Dkeycloak.import=/tmp/nomad-realm.json -Dkeycloak.migration.strategy=IGNORE_EXISTING"
    volumes:
      - keycloak:/opt/jboss/keycloak/standalone/data
      - ./configs/nomad-realm.json:/tmp/nomad-realm.json
    healthcheck:
      test:
        - "CMD"
        - "curl"
        - "--fail"
        - "--silent"
        - "http://127.0.0.1:9990/health/live"
      interval: 10s
      timeout: 10s
      retries: 30
      start_period: 30s

  # broker for celery
  rabbitmq:
    restart: unless-stopped
    image: rabbitmq:3.11.5
    container_name: nomad_oasis_rabbitmq
    environment:
      - RABBITMQ_ERLANG_COOKIE=SWQOKODSQALRPCLNMEQG
      - RABBITMQ_DEFAULT_USER=rabbitmq
      - RABBITMQ_DEFAULT_PASS=rabbitmq
      - RABBITMQ_DEFAULT_VHOST=/
    volumes:
      - rabbitmq:/var/lib/rabbitmq
    healthcheck:
      test: ["CMD", "rabbitmq-diagnostics", "--silent", "--quiet", "ping"]
      interval: 10s
      timeout: 10s
      retries: 30
      start_period: 10s

  # the search engine
  elastic:
    restart: unless-stopped
    image: docker.elastic.co/elasticsearch/elasticsearch:7.17.1
    container_name: nomad_oasis_elastic
    environment:
      - ES_JAVA_OPTS=-Xms512m -Xmx512m
      - discovery.type=single-node
    volumes:
      - elastic:/usr/share/elasticsearch/data
    healthcheck:
      test:
        - "CMD"
        - "curl"
        - "--fail"
        - "--silent"
        - "http://elastic:9200/_cat/health"
      interval: 10s
      timeout: 10s
      retries: 30
      start_period: 60s

  # the user data db
  mongo:
    restart: unless-stopped
    image: mongo:5.0.6
    container_name: nomad_oasis_mongo
    environment:
      - MONGO_DATA_DIR=/data/db
      - MONGO_LOG_DIR=/dev/null
    volumes:
      - mongo:/data/db
      - ./.volumes/mongo:/backup
    command: mongod --logpath=/dev/null # --quiet
    healthcheck:
      test:
        - "CMD"
        - "mongo"
        - "mongo:27017/test"
        - "--quiet"
        - "--eval"
        - "'db.runCommand({ping:1}).ok'"
      interval: 10s
      timeout: 10s
      retries: 30
      start_period: 10s

  # nomad worker (processing)
  worker:
    restart: unless-stopped
    image: gitlab-registry.mpcdf.mpg.de/nomad-lab/nomad-fair:latest
    container_name: nomad_oasis_worker
    environment:
      NOMAD_SERVICE: nomad_oasis_worker
      NOMAD_RABBITMQ_HOST: rabbitmq
      NOMAD_ELASTIC_HOST: elastic
      NOMAD_MONGO_HOST: mongo
      #NOMAD_LOGSTASH_HOST: logtransfer
    depends_on:
      - rabbitmq
      - elastic
      - mongo
    volumes:
      - ./configs/nomad.yaml:/app/nomad.yaml
      - ./.volumes/fs:/app/.volumes/fs
    command: python -m celery -A nomad.processing worker -l info -Q celery

  # nomad app (api + proxy)
  app:
    restart: unless-stopped
    image: gitlab-registry.mpcdf.mpg.de/nomad-lab/nomad-fair:latest
    container_name: nomad_oasis_app
    environment:
      NOMAD_SERVICE: nomad_oasis_app
      NOMAD_SERVICES_API_PORT: 8000
      NOMAD_FS_EXTERNAL_WORKING_DIRECTORY: "$PWD"
      NOMAD_RABBITMQ_HOST: rabbitmq
      NOMAD_ELASTIC_HOST: elastic
      NOMAD_MONGO_HOST: mongo
      #NOMAD_LOGSTASH_HOST: logtransfer
      #NOMAD_NORTH_HUB_HOST: north
    depends_on:
      - rabbitmq
      - elastic
      - mongo
      #- north
      - keycloak
    volumes:
      - ./configs/nomad.yaml:/app/nomad.yaml
      - ./.volumes/fs:/app/.volumes/fs
    command: ./run.sh
    healthcheck:
      test:
        - "CMD"
        - "curl"
        - "--fail"
        - "--silent"
        - "http://localhost:8000/-/health"
      interval: 10s
      timeout: 10s
      retries: 30
      start_period: 10s
    ports:
      - 8000:8000

  # nomad proxy (a reverse proxy for nomad)
  proxy:
    restart: unless-stopped
    image: nginx:1.13.9-alpine
    container_name: nomad_oasis_proxy
    command: nginx -g 'daemon off;'
    volumes:
      - ./configs/nginx.conf:/etc/nginx/conf.d/default.conf
    depends_on:
      - keycloak
      - app
      - worker
      #- north
    ports:
      - 8073:8073

volumes:
  mongo:
    name: "nomad_oasis_mongo"
  elastic:
    name: "nomad_oasis_elastic"
  rabbitmq:
    name: "nomad_oasis_rabbitmq"
  keycloak:
    name: "nomad_oasis_keycloak"

configs/nomad.yaml
services:
  api_host: 'localhost'
  api_base_path: '/nomad-oasis'

oasis:
  is_oasis: true
  uses_central_user_management: false

keycloak:
  server_url: 'http://keycloak:8080/auth/'
  public_server_url: 'https://my-oasis.org/keycloak/auth/'
  realm_name: nomad
  username: 'admin'
  password: 'verysecretpassword'

meta:
  deployment: 'oasisname'
  deployment_url: 'https://my-oasis.org/api'
  maintainer_email: '[email protected]'

mongo:
    db_name: nomad_oasis_v1

elastic:
    entries_index: nomad_oasis_entries_v1
    materials_index: nomad_oasis_materials_v1

From the UI front page https://my-oasis.org/nomad-oasis/gui/about/information , the “Login/Register” button carries me to https://my-oasis.org/keycloak/auth/realms/nomad/protocol/openid-connect/auth?client_id=nomad_public&redirect_uri=

I am certain that our local keycloak is used as the authentication source, since I have manually created a new user there and can log in with its credentials. What is left is an authorization issue (users, be it the one created by the example files or newly created ones, somehow have to be explicitly allowed to do things).

A peek into the logs of keycloak:

09:46:17,147 WARN  [org.keycloak.events] (default task-9) type=LOGIN_ERROR, realmId=nomad, clientId=admin-cli, userId=asfdasdfasdfasdf, ipAddress=$ClientIP, error=invalid_user_credentials, auth_method=openid-connect, grant_type=password, client_auth_method=client-secret, username=admin, authSessionParentId=asdfasdfasdfasdf, authSessionTabId=asdfasdf
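
To read such an event at a glance, the key=value fields can be split into a dictionary. This is a minimal sketch; the function name `parse_keycloak_event` is made up, and it assumes the fields are separated by ", " as in the line above.

```python
def parse_keycloak_event(line):
    """Split the 'key=value, key=value' tail of a keycloak event log line into a dict."""
    # Everything from the first 'type=' onwards is the comma-separated event payload.
    start = line.find("type=")
    if start == -1:
        return {}
    fields = {}
    for part in line[start:].split(", "):
        key, sep, value = part.partition("=")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


line = (
    "09:46:17,147 WARN  [org.keycloak.events] (default task-9) type=LOGIN_ERROR, realmId=nomad, "
    "clientId=admin-cli, userId=asfdasdfasdfasdf, ipAddress=$ClientIP, error=invalid_user_credentials, "
    "auth_method=openid-connect, grant_type=password, client_auth_method=client-secret, username=admin, "
    "authSessionParentId=asdfasdfasdfasdf, authSessionTabId=asdfasdf"
)
event = parse_keycloak_event(line)
print(event["type"], event["realmId"], event["clientId"], event["username"], event["error"])
# prints: LOGIN_ERROR nomad admin-cli admin invalid_user_credentials
```

So the failing login is not one of the GUI users: it is the user "admin" in the realm "nomad", coming in through the admin-cli client with a password grant.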

What puzzles me is that I had a NOMAD Oasis running before (in December), where the default test user and manually created users were permitted R/W access to the database, but this is no longer the case with fresh deployments.


BTW, I had to modify the outwards HTTP port of the proxy (to add another layer of reverse proxying via Apache ProxyPass). But this is not likely to be a source of either the 401 or the keycloak account issues, since it behaves identically if I access the Oasis by explicit IP and port.

The warning from the keycloak logs is a good clue.

The NOMAD GUI uses a keycloak client called nomad_public that is installed together with the example realm that we provide. Here everything seems to work as you can login with the GUI.

The NOMAD backend talks to keycloak with the admin-cli client. This is a built-in client that every keycloak realm has. Here NOMAD uses the “admin” user of the “nomad” realm. This is what keycloak.username and keycloak.password in the nomad.yaml refer to. NOMAD needs this to get a list of all available users, something that normal users cannot do. This is necessary to share your data with other accounts.

This “admin” user in the “nomad” realm is different from the “admin” user in the “master” realm. The admin@master user is the one you use to log into the admin console. That is the one specified in the docker-compose under services.keycloak.environment.KEYCLOAK_USER and services.keycloak.environment.KEYCLOAK_PASSWORD. On the first keycloak start, this password is used to create the admin@master user.

In our examples from the documentation, we use “password” as the password on both users. If you changed the password to “verysecretpassword” make sure you actually change it for both the admin@master and the admin@nomad user. They don’t have to use the same password, but you have to set services.keycloak.environment.KEYCLOAK_PASSWORD in the docker-compose to the admin@master password and the keycloak.password in the nomad.yaml to the admin@nomad password respectively.

When you change the passwords in your config files, the admin@master password is set automatically on first start. But the admin@nomad user comes from the realm file we provided in the .zip file. If you change its password in the nomad.yaml, you also need to change it in the nomad realm (via the keycloak admin console, http:///keycloak/auth).
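
To check whether the admin@nomad password in the nomad.yaml is the one keycloak actually knows, you can request a token the same way your keycloak log shows it (password grant, admin-cli client). The following is only a sketch, not NOMAD code: the function names are made up, and it assumes keycloak's standard OpenID Connect token endpoint under the configured server URL.

```python
import json
import urllib.error
import urllib.parse
import urllib.request


def build_token_request(server_url, realm, username, password):
    """Build a password-grant token request for the admin-cli client of a realm."""
    # Assumed endpoint layout: <server_url>/realms/<realm>/protocol/openid-connect/token
    url = f"{server_url.rstrip('/')}/realms/{realm}/protocol/openid-connect/token"
    body = urllib.parse.urlencode({
        "grant_type": "password",
        "client_id": "admin-cli",
        "username": username,
        "password": password,
    }).encode("ascii")
    return urllib.request.Request(url, data=body, method="POST")


def credentials_work(server_url, realm, username, password):
    """Return True if keycloak issues a token, False if it rejects the request (4xx)."""
    request = build_token_request(server_url, realm, username, password)
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return "access_token" in json.load(response)
    except urllib.error.HTTPError as error:
        if 400 <= error.code < 500:
            return False  # e.g. invalid user credentials
        raise


# Example call (needs network access to your keycloak):
# credentials_work("https://my-oasis.org/keycloak/auth/", "nomad", "admin", "verysecretpassword")
```

If the call returns False for the “nomad” realm but True for the “master” realm, the password was only changed for admin@master.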

Thanks a lot! I properly set the passwords of the admin user in master and nomad realm, and now the user interaction with the database works, and my understanding of realms and the authentication hierarchy deepened. This solves my first problem.

What did not change, though, is that users (including the admin user) still cannot access the account console. The behaviour of the web server is unchanged:
Click the name on the top right, or access https://my-oasis.org/keycloak/auth/realms/nomad/account/ and the page stalls at “failed to initialize keycloak”. nginx logs the status 403 (see my initial post).

When I then shorten the URL to https://my-oasis.org/keycloak/auth/ and try to access the Administration console, it no longer asks me to log in; instead, nginx responds with a “502 Bad Gateway”. I have not found a way to get rid of that error other than restarting the keycloak container.

I think this is a problem with the default configuration that we provide. I can reproduce this problem. I opened an issue about this: “The keycloak account console is not working with the nomad-oasis-with-keycloak default config.” (#1910, nomad-lab / nomad-FAIR on GitLab)

Someone will look into it, but it might take a while. It’s not the most critical feature, as login seems to work. How much does this impede your ability to use the Oasis?

I don’t know what I am missing :wink: If it is really just user settings (like password change), then we might not need it at all, not for the test phase (accounts managed manually) nor in production (read-only access to external LDAP).

This is really just about changing your name or password. And changing your password you can even do with the “forgot password” feature.