Skip to content

feat: support XDG Base Directory Specification (closes #46)#148

Open
mvalentsev wants to merge 2 commits intoMemPalace:developfrom
mvalentsev:feat/xdg-base-directory
Open

feat: support XDG Base Directory Specification (closes #46)#148
mvalentsev wants to merge 2 commits intoMemPalace:developfrom
mvalentsev:feat/xdg-base-directory

Conversation

@mvalentsev
Copy link
Copy Markdown
Contributor

@mvalentsev mvalentsev commented Apr 7, 2026

What does this PR do?

Closes #46 by moving the default config directory to follow the XDG Base Directory Specification, with a careful back-compat path so existing installs keep working.

Resolution order:

  1. $MEMPALACE_CONFIG_DIR explicit override
  2. ~/.mempalace if it contains config.json, people_map.json, or palace/chroma.sqlite3 (back-compat)
  3. $XDG_CONFIG_HOME/mempalace (absolute paths only, per spec)
  4. ~/.config/mempalace fallback

Edge cases handled per XDG spec: empty, whitespace, and relative XDG_CONFIG_HOME values are ignored. An empty ~/.mempalace left behind by some other tool no longer hijacks the XDG fallback.

palace_path now defaults to <config_dir>/palace so the data directory stays next to the config wherever XDG places it. The legacy DEFAULT_PALACE_PATH module constant is kept for any external importers but marked deprecated.

How to test

python -m pytest tests/test_config.py -v
ruff check .
ruff format --check .

The test suite adds coverage for all four resolution branches, precedence between env override and legacy, the empty-legacy-dir edge case, and empty / whitespace / relative XDG values.

Manual smoke:

# New XDG path
XDG_CONFIG_HOME=/tmp/xdg_test mempalace init /tmp/some_project
ls /tmp/xdg_test/mempalace/

# Legacy back-compat: existing ~/.mempalace installs get the old location.

Checklist

  • Tests pass (python -m pytest tests/ -v - 32 passed)
  • No hardcoded paths (legacy constant marked deprecated)
  • Linter passes (ruff check ., ruff format --check .)
  • Python 3.9 compatible

Scope note

This PR is scoped to MempalaceConfig and its direct callers in line with the issue text (which asks about configuration storage). Other modules that still hardcode ~/.mempalace directly (layers.py, knowledge_graph.py, onboarding.py, split_mega_files.py, the shell hooks) can be migrated in a follow-up PR so this one stays reviewable.

@bgauryy
Copy link
Copy Markdown

bgauryy commented Apr 8, 2026

PR Review: feat: support XDG Base Directory Specification (closes #46)

Executive Summary

Aspect Value
PR Goal Move default config directory to follow XDG Base Directory Specification with backward-compatible fallback to ~/.mempalace
Files Changed 3 (config.py, cli.py, tests/test_config.py)
Risk Level 🟡 MEDIUM - Changes config resolution for every module that instantiates MempalaceConfig(), but back-compat path is solid
Review Effort 3/5 - Core logic is straightforward, but flow impact across modules requires verification
Recommendation ✅ APPROVE with comments

Affected Areas: mempalace/config.py (core logic), mempalace/cli.py (help text), tests/test_config.py (177 new test lines)

Business Impact: New installs get XDG-compliant paths (~/.config/mempalace/). Existing installs with ~/.mempalace/ containing config.json, people_map.json, or palace keep working transparently.

Flow Changes: All 5 modules that call MempalaceConfig() without config_dir (cli.py, convo_miner.py, layers.py, mcp_server.py, palace_db.py) now resolve config via the new 4-step XDG-aware resolution instead of hardcoded ~/.mempalace.

Ratings

Aspect Score
Correctness 4/5
Security 5/5
Performance 5/5
Maintainability 4/5

PR Health

High Priority Issues

None — no blocking defects found.


Medium Priority Issues

🔄 #1: layers.py hardcodes ~/.mempalace/identity.txt — bypasses XDG resolution

Location: mempalace/layers.py:43, 60, 324 | Confidence: ✅ HIGH

Layer0.__init__ and MemoryStack.__init__ hardcode the identity path to ~/.mempalace/identity.txt via os.path.expanduser(), completely bypassing the new XDG config resolution. After this PR, a fresh XDG install would have its config at ~/.config/mempalace/ but Layer0 would still look for identity at ~/.mempalace/identity.txt (which doesn't exist), and its error message tells users to "Create ~/.mempalace/identity.txt" instead of the actual config directory.

- identity_path = os.path.expanduser("~/.mempalace/identity.txt")
+ cfg = MempalaceConfig()
+ identity_path = str(Path(cfg._config_dir) / "identity.txt")

Note: this is existing code, not introduced by this PR. However, the PR creates a behavioral gap: config resolves via XDG but identity does not. Consider addressing in a follow-up.


🔄 #2: entity_registry.py hardcodes DEFAULT_PATH to legacy location

Location: mempalace/entity_registry.py:289 | Confidence: ✅ HIGH

EntityRegistry.DEFAULT_PATH is set to Path.home() / ".mempalace" / "entity_registry.json", which won't follow XDG. Same behavioral gap as #1 — a fresh XDG install would look for the entity registry in the wrong directory.

- DEFAULT_PATH = Path.home() / ".mempalace" / "entity_registry.json"
+ # Consider: DEFAULT_PATH = _default_config_dir() / "entity_registry.json"

Also existing code. Consider a follow-up to unify all path resolution through MempalaceConfig.


🎨 #3: Tests use raw os.environ manipulation instead of monkeypatch

Location: tests/test_config.py:12-36 (all new XDG tests) | Confidence: ⚠️ MED

All 9 new tests manipulate os.environ directly with _save_env/_restore_env helpers. Per AGENTS.md, the project runs pytest -n auto for parallel execution. Since os.environ is process-global, parallel test workers sharing the same process could see environment mutations from other tests, causing flaky failures. The project's conftest.py already provides tmp_path-based fixtures for isolation.

- saved = _save_env()
- try:
-     os.environ["XDG_CONFIG_HOME"] = xdg
-     ...
- finally:
-     _restore_env(saved)
+ def test_default_config_dir_uses_xdg_when_set(monkeypatch, tmp_path):
+     monkeypatch.setenv("XDG_CONFIG_HOME", str(tmp_path / "xdg"))
+     monkeypatch.delenv("MEMPALACE_CONFIG_DIR", raising=False)
+     ...

In practice, pytest-xdist runs tests in separate processes so this may not bite today, but monkeypatch is the standard pattern and prevents surprises if the runner changes.


Low Priority Issues

🏗️ #4: Palace data stored under XDG_CONFIG_HOME instead of XDG_DATA_HOME

Location: mempalace/config.py:47 (_default_config_dir) | Confidence: ⚠️ MED

The XDG spec separates config (XDG_CONFIG_HOME) from data (XDG_DATA_HOME). The palace is a ChromaDB data store, not configuration. Storing it under XDG_CONFIG_HOME/mempalace/palace conflates the two. Strict XDG compliance would place the palace at $XDG_DATA_HOME/mempalace/palace (~/.local/share/mempalace/palace).

This is an intentional design choice (palace follows config for simplicity), and splitting would add complexity. Worth documenting the rationale explicitly.


🎨 #5: DEFAULT_PALACE_PATH deprecation lacks runtime warning

Location: mempalace/config.py:50 | Confidence: ⚠️ MED

The constant is marked deprecated via a comment but external importers won't see it unless they read source. A warnings.warn() on access would signal the deprecation to consumers.

import warnings
def _get_default_palace_path():
    warnings.warn("DEFAULT_PALACE_PATH is deprecated; use MempalaceConfig().palace_path", DeprecationWarning, stacklevel=2)
    return os.path.expanduser("~/.mempalace/palace")

Low urgency — the constant is only imported by tests in the current codebase.


🎨 #6: AGENTS.md and hook scripts still reference hardcoded ~/.mempalace

Location: AGENTS.md:22,42,80-83, hooks/*.sh | Confidence: ✅ HIGH

Documentation and shell hooks reference ~/.mempalace/ as the canonical location in 10+ places. After this PR, the default location for new installs is ~/.config/mempalace/. The docs should mention XDG resolution to avoid confusion.

Not blocking — follow-up documentation PR.


Flow Impact Analysis

MempalaceConfig()  ──new──▶  _default_config_dir()
       │                         │
       │                    ┌────┴────────────────────┐
       │                    │ 1. $MEMPALACE_CONFIG_DIR │
       │                    │ 2. ~/.mempalace (legacy) │
       │                    │ 3. $XDG_CONFIG_HOME/mp   │
       │                    │ 4. ~/.config/mempalace   │
       │                    └─────────────────────────┘
       │
       ├── cli.py (6 call sites)           ✅ follows XDG
       ├── convo_miner.py (1 call site)    ✅ follows XDG
       ├── layers.py (4 call sites)        ⚠️ identity.txt hardcoded
       ├── mcp_server.py (1 call site)     ✅ follows XDG
       ├── palace_db.py (3 call sites)     ✅ follows XDG
       └── entity_registry.py              ⚠️ DEFAULT_PATH hardcoded

Created by Octocode MCP https://octocode.ai

@mvalentsev mvalentsev force-pushed the feat/xdg-base-directory branch from 37dfbde to 96cce3e Compare April 9, 2026 06:23
@mvalentsev
Copy link
Copy Markdown
Contributor Author

Rebased on latest main and refactored the XDG tests to use monkeypatch + tmp_path. The custom _save_env / _restore_env helpers are gone, and each test sets both HOME and USERPROFILE via monkeypatch.setenv so Path.home() resolves consistently on POSIX and Windows runners. The session-level HOME isolation added in conftest.py stays intact — per-test overrides layer on top of it.

The layers.py / entity_registry.py / AGENTS.md points are already noted as a scope follow-up in the PR body: this change is intentionally limited to MempalaceConfig.

@mvalentsev mvalentsev force-pushed the feat/xdg-base-directory branch 6 times, most recently from c1fafe9 to 391f1ad Compare April 10, 2026 15:52
@mvalentsev mvalentsev force-pushed the feat/xdg-base-directory branch from 391f1ad to 5ebd068 Compare April 10, 2026 16:43
Copy link
Copy Markdown

@web3guru888 web3guru888 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: XDG Base Directory Specification Support

Clean, focused PR that closes a long-standing request. The resolution order is well-documented and the back-compat story is solid.

What's done well

The resolution order is correct and clearly documented. MEMPALACE_CONFIG_DIR > legacy ~/.mempalace (if populated) > $XDG_CONFIG_HOME/mempalace > ~/.config/mempalace is exactly the right priority. The "legacy if populated" check using _LEGACY_MARKERS prevents false positives from empty leftover directories.

No migration required for existing users. Existing installs with ~/.mempalace/config.json or ~/.mempalace/people_map.json transparently keep using that path. New installs get XDG. This is the smoothest possible transition.

test_config.py monkeypatch cleanup. Changing the test_env_override test to use monkeypatch.setenv instead of os.environ["..."] = ... + manual delete is a genuine correctness fix — the old version leaked state if the test raised.

_set_home helper correctly sets both HOME and USERPROFILE. Good cross-platform hygiene.

Issues found

$XDG_DATA_HOME and $XDG_CACHE_HOME are not addressed. The XDG spec defines three variables: XDG_DATA_HOME (default ~/.local/share), XDG_CONFIG_HOME (default ~/.config), and XDG_CACHE_HOME (default ~/.cache). This PR only handles XDG_CONFIG_HOME. If MemPalace's palace data (the ChromaDB/LanceDB files and the SQLite KG) lives in the config dir, that's fine — but if these are meant to be user data, they probably belong in XDG_DATA_HOME. Worth at least a comment in config.py explaining why XDG_DATA_HOME is out of scope.

Windows fallback. On Windows, $XDG_CONFIG_HOME is almost never set, and ~/.config doesn't exist by convention. The fallback silently creates ~/.config/mempalace on Windows, which is unexpected. The XDG spec says "if not set or empty, use the default", but on Windows the conventional path is %APPDATA%\mempalace. Consider: if sys.platform == "win32" and not os.environ.get("XDG_CONFIG_HOME") and not legacy.exists(): return Path(os.environ.get("APPDATA", Path.home())) / "mempalace".

_LEGACY_MARKERS doesn't include palace/ as a directory. The code checks (legacy / "palace").exists() but if someone has a palace directory but no config files, the check passes the any(...) guard. Wait — it does include "palace" in the tuple. But it uses .exists() which returns True for both files and directories. A bare palace directory created by some other tool would trigger the legacy path. Consider checking (legacy / "palace").is_dir() or (legacy / "palace" / "chroma.sqlite3").exists() for a tighter guard.

No test for Windows APPDATA fallback (if you add it). But this is a non-blocker.

_default_config_dir() is not cached. It calls Path.home() and stat checks on every call. In our integration, config resolution happens on every search and mine call. Wrapping with functools.lru_cache(maxsize=1) would eliminate repeated syscalls.

Suggestions

  1. Add a comment explaining the XDG_DATA_HOME/XDG_CACHE_HOME scope decision
  2. Add a Windows APPDATA fallback or at least a warning
  3. Tighten the "palace" marker check to .is_dir() rather than .exists()
  4. Cache _default_config_dir() with lru_cache or a module-level lazy singleton

Overall

Solid, well-scoped implementation. The XDG compliance covers the 90% case (Linux/macOS users with XDG set) and doesn't break anyone. The issues above are mostly edge cases and polish. The tests are well-structured and use monkeypatch correctly throughout.

APPROVED — issues are non-blocking, happy to see this land.


Reviewed by MemPalace-AGI — autonomous research system with perfect memory

@mvalentsev
Copy link
Copy Markdown
Contributor Author

Tightened the legacy palace marker as suggested. _LEGACY_MARKERS no longer includes the bare palace name -- the check is now in a _has_legacy_install helper that requires palace/chroma.sqlite3 to exist before treating the directory as a real install. Added two tests: one for the bare-dir case (falls through to XDG) and one for the palace-with-ChromaDB case (wins over XDG).

The other points (XDG_DATA_HOME / XDG_CACHE_HOME scope, Windows APPDATA fallback, lru_cache on _default_config_dir) are reasonable as follow-ups. This PR is scoped to the XDG_CONFIG_HOME branch per the issue text; the data-home split and platform-specific Windows path deserve their own PRs so the diff stays reviewable.

@web3guru888
Copy link
Copy Markdown

Well-designed implementation. The resolution order is correct and the _has_legacy_install() guard is the key detail that makes this safe to merge.

_has_legacy_install() refinement is the right call. The original PR would have treated any empty ~/.mempalace directory as a legacy install, which means tools that create that directory (Docker volumes, dotfile managers) could silently prevent XDG adoption. Requiring config.json, people_map.json, or palace/chroma.sqlite3 is tight enough to distinguish a real install from an empty dir.

palace_path default now derives from _config_dir (str(self._config_dir / "palace")) rather than hardcoding DEFAULT_PALACE_PATH. This is the important correctness fix — without it, a fresh install on a clean machine would create ~/.config/mempalace/config.json but point the palace at ~/.mempalace/palace, which is a confusing split state.

Test coverage is thorough. The edge cases covered (empty dir, bare palace dir without chroma.sqlite3, relative XDG path, whitespace XDG value) are exactly the ones that would cause silent failures in production.

A few observations:

  • DEFAULT_PALACE_PATH is kept with a deprecation comment but is frozen at the legacy location. Any external tool that reads it programmatically will silently get the wrong path on a clean XDG install. Consider a DeprecationWarning when it's imported or accessed — though that might be too noisy. Fine as-is for this PR's scope.
  • _default_config_dir() is not cached (lru_cache). On a server or test-heavy environment, this gets called for every MempalaceConfig() instantiation, which does a Path.home() + is_dir() + is_file() stat chain. Probably not a hot path in practice, but worth noting.

Both points are reasonable follow-ups. This PR is scoped correctly. LGTM.

@mvalentsev mvalentsev force-pushed the feat/xdg-base-directory branch from c2148e1 to 453d3e0 Compare April 11, 2026 11:27
@bensig bensig changed the base branch from main to develop April 11, 2026 22:23
@bensig bensig requested a review from igorls as a code owner April 11, 2026 22:23
@mvalentsev mvalentsev force-pushed the feat/xdg-base-directory branch 6 times, most recently from dd80beb to f639e16 Compare April 14, 2026 03:31
@igorls igorls added the enhancement New feature or request label Apr 14, 2026
@mvalentsev mvalentsev force-pushed the feat/xdg-base-directory branch 3 times, most recently from e325dbb to c354f1c Compare April 15, 2026 18:32
The default config directory now resolves via a chain that keeps
existing installs working and matches the XDG spec:

1. $MEMPALACE_CONFIG_DIR (explicit override for tests and CI)
2. ~/.mempalace if it contains config.json, people_map.json, or palace
3. $XDG_CONFIG_HOME/mempalace (absolute paths only, per spec)
4. ~/.config/mempalace

Empty, whitespace, or relative XDG_CONFIG_HOME values fall through to
the default, matching the spec wording. An empty ~/.mempalace directory
no longer hijacks the XDG fallback: at least one marker file or the
palace sub-directory must be present for legacy detection to kick in.

palace_path now defaults to <config_dir>/palace so data stays next to
the config wherever XDG places it. The module-level DEFAULT_PALACE_PATH
constant is kept for external importers but marked deprecated.

The new tests use the standard pytest monkeypatch + tmp_path fixtures
for environment and filesystem isolation on top of the session-level
HOME isolation already provided by conftest.py. HOME is set via both
HOME and USERPROFILE so the Path.home() resolution is consistent on
POSIX and Windows test runners.

Other modules (layers, knowledge_graph, onboarding, hooks) still hard
code the legacy ~/.mempalace path; migration can follow later.

Closes MemPalace#46
A bare ~/.mempalace/palace directory left behind by some other tool
would pass the previous .exists() check and hijack the XDG path.
Require the palace sub-directory to contain chroma.sqlite3 before
treating it as a legacy install, and move the detection into a
dedicated _has_legacy_install helper.

Tests cover the new behaviour end-to-end: a bare palace/ dir falls
through to XDG, while palace/chroma.sqlite3 still wins over XDG.
@mvalentsev mvalentsev force-pushed the feat/xdg-base-directory branch from c354f1c to 1299906 Compare April 16, 2026 05:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Feature request: Support XDG base directory for configuration

4 participants