Conversation
Locally, MinIO already has more parquet files than on the test server.
Note that the previous strategy no longer worked if the server returned a parquet file, which is the case for the new local setup.
This means it is not reliant on the evaluation engine processing the dataset. Interestingly, the database state seems to purposely keep the last task's dataset "in preparation" explicitly (by marking processing as done but having no dataset_status entry).
Codecov Report
❌ Patch coverage is

Additional details and impacted files

```
@@            Coverage Diff             @@
##             main    #1630      +/-   ##
==========================================
+ Coverage   52.04%   52.79%   +0.74%
==========================================
  Files          36       36
  Lines        4333     4336       +3
==========================================
+ Hits         2255     2289      +34
+ Misses       2078     2047      -31
```
geetu040 left a comment
Thanks for these fixes.
I hope the test-server can be fully replicated locally for the tests now.
Left a few questions, and the tests/files/localhost:8080 file needs to be revised.
```python
task_cache_directory = openml.utils._create_cache_directory_for_id(
    TASKS_CACHE_DIR_NAME, task_id
)
task_cache_directory_existed = task_cache_directory.exists()
```
`task_cache_directory` now points to the folder with all the tasks instead of the folder for the specific task ...
No, it does not. The `_create_cache_directory_for_id` function, as the name implies, returns the cache directory for the specific identifier, e.g., [...]/org/openml/test/tasks/1882.
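For illustration, the behavior described above can be sketched roughly as follows. This is a hypothetical re-implementation, not the actual `openml.utils._create_cache_directory_for_id`; the function name, argument order, and cache layout in openml-python may differ:

```python
import tempfile
from pathlib import Path


def create_cache_directory_for_id(cache_root: Path, key: str, id_: int) -> Path:
    """Return (and create) the cache directory for one specific id,
    e.g. <cache_root>/tasks/1882 -- not the shared <cache_root>/tasks folder."""
    directory = cache_root / key / str(id_)
    directory.mkdir(parents=True, exist_ok=True)
    return directory


# Hypothetical cache root mirroring the [...]/org/openml/test layout.
cache_root = Path(tempfile.mkdtemp()) / "org" / "openml" / "test"
task_dir = create_cache_directory_for_id(cache_root, "tasks", 1882)
print(task_dir.relative_to(cache_root))  # tasks/1882
```

Under this reading, removing `task_dir` deletes only the one task's cache, not the whole `tasks` folder.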
```diff
-if not tid_cache_dir_existed:
-    openml.utils._remove_cache_dir_for_id(TASKS_CACHE_DIR_NAME, tid_cache_dir)
+if not task_cache_directory_existed:
+    openml.utils._remove_cache_dir_for_id(TASKS_CACHE_DIR_NAME, task_cache_directory)
```
... and therefore removes the complete directory instead of the specific task. Is that intentional?
```diff
@@ -0,0 +1 @@
+org/openml/test
```
Why is this symlink uploaded? Probably a mistake?
Or is it used for linking the remote-test-server cache with the local-test-server cache? If this was intentional, can I ask what its purpose is? Also, this path can't be tracked on Windows because of the `:`,
which leads to a CI failure on Windows: job/61983305389
> or is this used for linking remote-test-server cache with local-test-server cache.
Yes, it is. There is a cache folder checked into the repository here. Because cache lookups depend on the configured server, the cached files are not found if the server is configured to be localhost:8000; this symlink solves that.
I suppose we can update the cache directory resolution to drop the port number (and update the symlink file accordingly); then it should continue to work while Windows PCs can still check out the repository.
Update: done.
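To illustrate what the symlink achieves (the paths here are hypothetical; the actual checked-in cache layout and symlink target may differ), lookups for the local server resolve under a `localhost` directory, and a relative symlink redirects them to the checked-in `org/openml/test` cache:

```python
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())

# Checked-in cache for the remote test server (hypothetical contents).
remote_cache = root / "org" / "openml" / "test"
(remote_cache / "tasks" / "1882").mkdir(parents=True)

# With the port dropped from cache-path resolution, the local server
# resolves to a "localhost" directory; a relative symlink points it at
# the checked-in cache so the same files are found.
(root / "localhost").symlink_to(Path("org") / "openml" / "test")

print((root / "localhost" / "tasks" / "1882").exists())  # True
```

Note that creating symlinks like this requires a filesystem and OS that support them, which is exactly why the `:` in the previous `localhost:8080` target broke checkouts on Windows.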
Thanks for the update.
How safe is the symlink here? I was getting errors like "could not update file in this path", but I'll have a look again.
```diff
-TEST_SERVER_URL = "https://test.openml.org"
+TEST_SERVER_URL = "http://localhost:8000"
```
unintentional mistake here:

```diff
-TEST_SERVER_URL = "https://test.openml.org"
-TEST_SERVER_URL = "http://localhost:8000"
+TEST_SERVER_URL = "https://test.openml.org"
```
```diff
-reversed_url_suffix = os.sep.join(url_suffix.split(".")[::-1])  # noqa: PTH118
+url_parts = url_suffix.split(".")[::-1]
+url_parts_no_port = [part.split(":")[0] for part in url_parts]
+reversed_url_suffix = os.sep.join(url_parts_no_port)  # noqa: PTH118
```
This ignores the port completely; as a result, APIs running on different ports will lead to the same path, e.g. a php-api on port 8080 and a python-api on port 8082 would both resolve to ~/.cache/openml/localhost, which should not really be the case.
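The collision can be traced by running the quoted resolution logic in isolation (a standalone re-implementation of just this snippet, not the full openml-python cache code):

```python
import os


def cache_subdir(url_suffix: str) -> str:
    # Reverse the dot-separated host parts and strip any :port,
    # mirroring the snippet under discussion.
    url_parts = url_suffix.split(".")[::-1]
    url_parts_no_port = [part.split(":")[0] for part in url_parts]
    return os.sep.join(url_parts_no_port)


print(cache_subdir("test.openml.org"))  # org/openml/test (on POSIX)
print(cache_subdir("localhost:8080"))   # localhost
print(cache_subdir("localhost:8082"))   # localhost -- same path, port lost
```

Two servers on different ports of the same host therefore share one cache directory, which is the concern raised above.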
Update the tests to allow connecting to a local test server instead of a remote one (requires openml/services#13).

Running the tests locally:

1. Locally start the services (as defined in "Update to function as out-of-the-box test server", services#13) using `docker compose --profile "rest-api" --profile "evaluation-engine" up -d`. Startup can take a few minutes, as currently the PHP container still builds the ES indices from scratch. I noticed that the `start_period` for some services isn't sufficient on my M1 Mac, possibly due to some containers requiring Rosetta to run, slowing things down. You can recognize this by the services reporting "Error" while the container remains running. To avoid this, you can either increase the `start_period` of the services (mostly Elasticsearch and the PHP API), or simply run the command again (those services are then already in a healthy state, and the services that depended on them can start successfully). The following containers should run: openml-test-database, openml-php-rest-api, openml-nginx, openml-evaluation-engine, openml-elasticsearch, openml-minio.
2. Update the `TEST_SERVER_URL` variable in `openml/config.py` to `"http://localhost:8000"`.
3. Run the tests (`python -m pytest -m "not production" tests`).

This PR builds off unmerged PR #1620.