mps-cli-py: complete binary (.mpb) persistency implementation #54
Merged
Prithvi686 merged 33 commits into main on Apr 15, 2026
- Refactored binary persistency implementation to separate constants and low-level reader utilities.
- Fixed model header parsing to correctly handle model-reference kind vs model-id kind according to the MPS binary persistency format.
- Correctly reconstruct the model UUID with the 'r:' prefix.
- Updated low-level test expectations to reflect fully-qualified model names.
… persistency and added low-level tests covering imports
… read_reference
- Integrated node loading into SModelBuilderBinaryPersistency
- Added root_nodes structure
- Extended tests to validate full model tree parsing
…rchitecture
- Removed registry dict usage
- Integrated index_2_* maps from base builder
- Construct real SConcept, SProperty, SNode instances
- Unified binary builder structure with XML persistency
- Updated tests to validate object-based model structure
- Implement full binary (.mpb) model parsing
- Load model header, registry, used languages and imports
- Build concept/property/reference/containment index maps
- Parse node tree including containment roles and properties
- Added support for reference kind validation and resolve_info
- Applied node id encoding during parsing
- Added repository-level completeness and resolution tests
…ode ids and also corrected existing test case failure
…nodes.py to fix the failing test
… node info in mpb files
1) Fixed the wrong field order in nodes.py so that node info in .mpb files is parsed correctly.
2) Corrected the parser to handle all the structural variants encountered across real plugin .mpb files with the V3 stream format (0x00000500) and .mpb files that use a DEPENDENCY_V1 byte.
3) Implemented complete binary persistency for real plugin .mpb files; extensive Python tests are still pending.
4) Corrected a few issues with parsing the model UUID and implemented building models in parallel instead of one by one, using separate processes via the concurrent.futures.ProcessPoolExecutor API to get around Python's speed limits.
5) Improved parser performance by first peeking into jar files to determine whether .msd files are present and only then extracting the jars to parse .mpb files. This reduced the parse execution time from ~248 seconds to ~9 seconds.
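The "peek before extracting" idea in item 5 can be sketched as follows. This is a minimal illustration, not the code from this PR: the helper names `jar_contains_suffix` and `relevant_jars` are hypothetical, and the sketch assumes that a jar's relevance can be decided from its entry names alone.

```python
import zipfile
from pathlib import Path


def jar_contains_suffix(jar_path, suffixes=(".msd",)):
    """Return True if any entry name in the jar ends with one of `suffixes`.

    Only the zip directory listing is read; no entry is extracted.
    (Hypothetical helper, not part of mps-cli-py.)
    """
    try:
        with zipfile.ZipFile(jar_path) as jar:
            return any(name.endswith(tuple(suffixes)) for name in jar.namelist())
    except zipfile.BadZipFile:
        # Treat unreadable jars as irrelevant instead of aborting the scan.
        return False


def relevant_jars(plugins_dir):
    """Yield only the jars under `plugins_dir` that are worth extracting."""
    for jar_path in sorted(Path(plugins_dir).rglob("*.jar")):
        if jar_contains_suffix(jar_path):
            yield jar_path
```

Listing entry names is much cheaper than unpacking every jar, which is presumably where most of the reported speed-up comes from.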
04d648d to 9efb3d5
…rs, imports, node trees, libraries, references, registry entries, language extraction, and registry parsing performance; fixed two failing tests and made a few minor cleanups
…lderBinaryPersistency to correctly format UUIDs
…iles in parallel is already handled by SSolutionBuilder
…onality is already contained within node_id_utils.py, plus a few minor cleanups
…_for_binary_persistency' of https://github.com/mbeddr/mps-cli into feature/E3AARCHAI-23018_enhance_mps_cli_py_with_support_for_binary_persistency
…t models from .mpl files in jar files
- Added _jar_is_relevant() to accept jars containing either .msd or .mpl (language jars have no .msd, so they were previously filtered out silently)
- SLanguageBuilder.load_from_mpl() reads the language namespace, uuid, and languageVersion from the .mpl file and populates the existing SLanguage registry entry
- SLanguageBuilder._load_aspect_models() parses all .mpb aspect files found in the models directory inside the jar and attaches them to the SLanguage
- SLanguage now has two new fields: language_version (int) and models (list)
- demo.py now shows language aspect model counts and sample concept names
- Added one new test file (test_binary_language_from_mpl.py) that uses jetbrains.mps.build.tips-src.jar as test data under the mps_cli_binary_persistency_language folder
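Reading the three attributes mentioned above from a .mpl file inside a jar could look roughly like this. It is a sketch under assumptions: `LanguageInfo` and `read_language_info` are hypothetical stand-ins (not the project's SLanguage or SLanguageBuilder.load_from_mpl()), and the .mpl file is assumed to be an XML document whose root element carries `namespace`, `uuid`, and `languageVersion` attributes.

```python
import xml.etree.ElementTree as ET
import zipfile
from dataclasses import dataclass


@dataclass
class LanguageInfo:  # hypothetical stand-in, not the project's SLanguage
    namespace: str
    uuid: str
    language_version: int


def read_language_info(jar_path):
    """Read namespace, uuid and languageVersion from the first .mpl in a jar.

    Returns None when the jar contains no .mpl entry.
    """
    with zipfile.ZipFile(jar_path) as jar:
        mpl_names = [n for n in jar.namelist() if n.endswith(".mpl")]
        if not mpl_names:
            return None
        # Parse the raw bytes so an XML encoding declaration is honoured.
        root = ET.fromstring(jar.read(mpl_names[0]))
    return LanguageInfo(
        namespace=root.get("namespace", ""),
        uuid=root.get("uuid", ""),
        # Assumption: a missing languageVersion is treated as 0.
        language_version=int(root.get("languageVersion", "0")),
    )
```

The descriptor is read directly from the archive, so the jar does not need to be extracted just to identify the language.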
…aspect information to a markdown file
…_for_binary_persistency' of https://github.com/mbeddr/mps-cli into feature/E3AARCHAI-23018_enhance_mps_cli_py_with_support_for_binary_persistency
danielratiu (Member) reviewed on Apr 13, 2026 and left a comment:
I have left some comments which we need to address
- Separated concerns: extracted ModelCache, MpbBatchParser
- Extended parametrized tests to cover the .mpb format, removed weak assertions, and merged the completeness test into the parametrized suite
- Added the black formatter and pytest config to the VS Code settings.json, extensions.json, and pyproject.toml files, and updated .gitignore as well
…_for_binary_persistency' of https://github.com/mbeddr/mps-cli into feature/E3AARCHAI-23018_enhance_mps_cli_py_with_support_for_binary_persistency
Collaborator
Author
Hi @danielratiu, I have addressed all the comments, thank you.
danielratiu (Member) reviewed on Apr 15, 2026 and left a comment:
Please also move the tests specific to binary persistency into a subfolder under tests: "tests/binary".
danielratiu reviewed on Apr 15, 2026
danielratiu (Member) approved these changes on Apr 15, 2026 and left a comment:
Thank you for this important extension.
LGTM now.
- Moved the jar_is_relevant logic to a separate utility
- Moved all binary persistency related tests to a separate folder and added a new readme explaining this
Added full binary (.mpb) model persistency support
MPS stores models in three formats: XML (.mps), file-per-root (.model directories), and binary (.mpb). The first two were already supported. This PR adds complete support for the binary format.
Newly Added:
SModelBuilderBinaryPersistency.py: Top-level .mpb parser that parses the header, registry, model properties, and node tree.
binary/registry.py: Registry section parser that populates 'index_2_concept', 'index_2_property', 'index_2_reference_role', 'index_2_child_role_in_parent', and 'concept_id_2_concept'.
binary/nodes.py: Node tree parser containing the methods 'read_children', 'read_node', '_read_reference', and '_read_node_id'.
binary/node_id_utils.py: NodeIdEncodingUtils class that encodes/decodes MPS node IDs between raw long and Base64-variant strings
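To illustrate what converting between a raw long and a Base64-variant string involves, here is a minimal sketch. It is not the NodeIdEncodingUtils implementation: the 64-character alphabet below is an assumption modelled on the "Java-friendly" Base64 style (digits, lowercase, uppercase, `$`, `_`), and the function names are hypothetical.

```python
# Assumed alphabet; the project's NodeIdEncodingUtils may use a different one.
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ$_"
INDEX = {ch: i for i, ch in enumerate(ALPHABET)}
MASK_64 = (1 << 64) - 1


def encode_node_id(value):
    """Encode a signed 64-bit id as base-64 digits, most significant first,
    without leading zero digits."""
    unsigned = value & MASK_64  # view the signed long as unsigned
    digits = []
    while True:
        digits.append(ALPHABET[unsigned & 0x3F])  # lowest 6 bits
        unsigned >>= 6
        if unsigned == 0:
            break
    return "".join(reversed(digits))


def decode_node_id(text):
    """Decode a string produced by encode_node_id back to a signed 64-bit id."""
    unsigned = 0
    for ch in text:
        unsigned = (unsigned << 6) | INDEX[ch]
    if unsigned > MASK_64:
        raise ValueError(f"node id does not fit into 64 bits: {text!r}")
    # Map back to the signed range used by Java longs.
    return unsigned - (1 << 64) if unsigned >= (1 << 63) else unsigned
```

Under this assumed alphabet the two functions are inverses of each other for every value in the signed 64-bit range, which is the property a round-trip test would check.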
Modified:
SSolutionBuilder.py: 'build_all()' now processes '.mpb' files in parallel via 'ProcessPoolExecutor' and provides new 'USE_CACHE', 'CACHE_LOAD_FN', and 'CACHE_SAVE_FN' hooks.
SSolutionsRepositoryBuilder.py: Three performance optimisations:
demo.py: parses a plugins directory or test project, prints a structured summary of all solutions/models/nodes, runs a verification pass, and writes output to a timestamped log file
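The combination of parallel parsing and cache hooks described for 'build_all()' might be structured along these lines. This is a sketch under assumptions rather than the actual SSolutionBuilder code: `parse_model` is a hypothetical placeholder for the .mpb parser, the hook semantics (load returns None on a miss, save receives path and model) are guessed from the hook names, and the `executor_cls` parameter is added here only so the sketch can also run with threads.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def parse_model(path):
    """Hypothetical stand-in for parsing one .mpb file into a model."""
    return {"path": path, "name_length": len(path)}


def build_all(paths, use_cache=False, cache_load_fn=None, cache_save_fn=None,
              executor_cls=ProcessPoolExecutor, max_workers=None):
    """Parse all paths, consulting the cache first and parsing misses in parallel."""
    results = {}
    pending = []
    for path in paths:
        # Assumption: the load hook returns None when nothing is cached.
        cached = cache_load_fn(path) if use_cache and cache_load_fn else None
        if cached is not None:
            results[path] = cached
        else:
            pending.append(path)
    if pending:
        with executor_cls(max_workers=max_workers) as pool:
            # pool.map preserves input order, so zip pairs each path with its model.
            for path, model in zip(pending, pool.map(parse_model, pending)):
                results[path] = model
                if use_cache and cache_save_fn:
                    cache_save_fn(path, model)
    return [results[path] for path in paths]
```

With the default ProcessPoolExecutor, the worker function must be importable at module level and the call should sit behind an `if __name__ == "__main__":` guard on platforms that spawn worker processes. Passing `executor_cls=ThreadPoolExecutor` is a convenient way to exercise the same logic in a quick test.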
What is parsed and stored in SModel:
Every .mpb file becomes one 'SModel' containing:
Also extended SSolutionsRepositoryBuilder to load SLanguage aspect models from .mpl files in jar files, and added a new demo_language_extraction.py that writes each language's aspect information to a markdown file so that the aspects can be checked for correct population.
Nine new test files (~75 new test methods) have been added to verify the parsing scenarios.