RDKB-64487: L2 tests for race thread handling fixes #353
RDKB-64487: Code changes for thread hardening and safety under concur…
Pull request overview
Adds new L2 functional coverage around recent thread/race hardening work and dcautil behaviors, and introduces a repo skill to analyze L2 test/feature coverage gaps.
Changes:
- Extend test/run_l2.sh to execute the new race-condition and dcautil L2 suites (and include dcautil in the TSan rerun subset).
- Add new L2 pytest suites for profile race-condition regressions and dcautil integration scenarios, plus shared concurrency helpers.
- Add/refresh Gherkin feature documentation and introduce an l2-test-gap-analyzer skill to report feature↔test and source↔test gaps.
Reviewed changes
Copilot reviewed 16 out of 16 changed files in this pull request and generated 11 comments.
| File | Description |
|---|---|
| test/run_l2.sh | Runs the new L2 suites and adds dcautil to the optional TSan rerun list. |
| test/functional-tests/tests/test_profile_race_conditions.py | New L2 regression tests covering profile/thread race-condition fixes. |
| test/functional-tests/tests/concurrency_helpers.py | Shared helpers for concurrent operations and daemon health checks used by race-condition tests. |
| test/functional-tests/tests/test_dcautil.py | New L2 functional tests for dcautil grep/top markers, rotation, and incremental processing. |
| test/functional-tests/tests/README_RACE_CONDITIONS.md | Documentation describing the race-condition tests and how to run them. |
| test/functional-tests/features/xconf_communications.feature | Adds/updates feature coverage documentation for xconf L2 scenarios. |
| test/functional-tests/features/telemetry_runs_as_daemon.feature | Adds additional feature scenarios for daemon start/duplicate instance behavior. |
| test/functional-tests/features/telemetry_process_tempProfile.feature | Adds additional feature scenarios for temp profile processing. |
| test/functional-tests/features/telemetry_dcautil.feature | New feature spec documenting dcautil scenarios covered by tests. |
| test/functional-tests/features/telemetry_bootup_sequence.feature | Adds additional feature scenarios for boot sequence coverage. |
| test/functional-tests/features/profile_race_conditions.feature | New feature spec documenting race-condition scenarios covered by tests. |
| test/functional-tests/features/multiprofile_msgpacket.feature | New/updated feature spec documenting multiprofile msgpack scenarios. |
| .github/skills/l2-test-gap-analyzer/analyze.py | New script for L2 test gap analysis and optional feature sync generation. |
| .github/skills/l2-test-gap-analyzer/SKILL.md | Skill documentation describing usage and expectations. |
| .github/skills/l2-test-gap-analyzer/README.md | Expanded skill README with examples and workflow guidance. |
| .github/skills/l2-test-gap-analyzer/references/example-output.md | Example output for the gap analysis report. |
```python
# Count which modules are tested based on feature files
for impl in self.missing_implementations:
    feature_name = impl['feature'].replace('.feature', '')
    implemented = impl.get('implemented', 0)
    total = impl.get('total', len(impl.get('scenarios', [])))

    # Map to modules
    for feature_key, modules in module_mapping.items():
        if feature_key in feature_name:
            coverage_pct = (implemented / total * 100) if total > 0 else 0

            for module in modules:
                if module not in module_tests:
                    module_tests[module] = {'scenarios': 0, 'tested': 0}
                module_tests[module]['scenarios'] += total
                module_tests[module]['tested'] += implemented
```
_calculate_functional_module_coverage() derives module coverage only from self.missing_implementations, which excludes fully-implemented features. This will incorrectly report 0% coverage for areas that are complete. Compute coverage from all feature↔test mappings (including complete ones), or maintain a separate structure that records implemented vs total scenarios for every feature.
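A minimal sketch of one possible fix, assuming the analyzer keeps (or can build) implemented/total scenario counts for every feature; the feature_coverage mapping below is hypothetical, while module_mapping and module_tests are the names already used in the snippet above:

```python
# Hypothetical sketch: derive module coverage from ALL feature↔test mappings,
# not only from self.missing_implementations.
# feature_coverage is assumed to map feature file -> counts, e.g.
# {"telemetry_dcautil.feature": {"implemented": 8, "total": 10}, ...}
module_tests = {}
for feature_file, counts in feature_coverage.items():
    feature_name = feature_file.replace('.feature', '')
    implemented = counts.get('implemented', 0)
    total = counts.get('total', 0)

    for feature_key, modules in module_mapping.items():
        if feature_key in feature_name:
            for module in modules:
                entry = module_tests.setdefault(module, {'scenarios': 0, 'tested': 0})
                entry['scenarios'] += total
                entry['tested'] += implemented
```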
```python
# Calculate functional test coverage (scenarios implemented)
total_scenarios = sum(item.get('total', len(item.get('scenarios', []))) for item in [])
implemented_scenarios = 0
for impl in self.missing_implementations:
    total_scenarios += impl.get('total', len(impl.get('scenarios', [])))
    implemented_scenarios += impl.get('implemented', 0)

functional_coverage = (implemented_scenarios / total_scenarios * 100) if total_scenarios > 0 else 0
```
In run_analysis(), functional coverage is computed before identify_gaps() populates self.missing_implementations, and the initial sum iterates over an empty list (for item in []), so total_scenarios always starts at 0. This makes the printed functional coverage misleading. Move the functional coverage calculation after identify_gaps() (or remove the pre-gap printout).
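A sketch of the suggested reordering, assuming run_analysis() drives the steps in sequence; the discovery method names are hypothetical, while identify_gaps() and self.missing_implementations come from the snippet above:

```python
def run_analysis(self):
    self.discover_features()   # hypothetical discovery step
    self.discover_tests()      # hypothetical discovery step
    self.identify_gaps()       # populates self.missing_implementations

    # Compute functional coverage only after the gap data exists.
    total_scenarios = 0
    implemented_scenarios = 0
    for impl in self.missing_implementations:
        total_scenarios += impl.get('total', len(impl.get('scenarios', [])))
        implemented_scenarios += impl.get('implemented', 0)

    functional_coverage = (implemented_scenarios / total_scenarios * 100) if total_scenarios > 0 else 0
    print(f"Functional scenario coverage: {functional_coverage:.1f}%")
```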
```python
- **L2 Functional Tests**: {test_count} test functions across 6 test files
- **Test Type**: Integration/functional tests (black-box system testing)
- **Test Approach**: End-to-end testing via RBUS, HTTP, process lifecycle, filesystem state
- **Note**: These are NOT unit tests. They test the entire Telemetry daemon as a black box."""
```
The header hardcodes "across 6 test files", which will become inaccurate as tests are added/removed. Derive the test file count dynamically (e.g., from len(list(self.tests_dir.glob('test_*.py'))) or the discovered tests mapping).
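A possible fix, sketched under the assumption that self.tests_dir is a pathlib.Path pointing at the L2 tests directory (the header_line variable is illustrative):

```python
# Count test files dynamically instead of hardcoding "6".
test_file_count = len(list(self.tests_dir.glob('test_*.py')))
header_line = (
    f"- **L2 Functional Tests**: {test_count} test functions "
    f"across {test_file_count} test files"
)
```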
```python
# Trigger report generation
print("Triggering on-demand report (will be slow)...")
trigger_on_demand("slow_report_profile")
```
trigger_on_demand() is not defined in this test method (it’s only a nested function inside test_reportInProgress_concurrent_access). This will raise a NameError and fail the suite. Use the shared helper (e.g., trigger_on_demand_report() from concurrency_helpers) or define a local helper in this method.
```diff
-trigger_on_demand("slow_report_profile")
+trigger_on_demand_report("slow_report_profile")
```
```python
import random
scenario = random.choice(['immediate', 'idle', 'active'])

if scenario == 'immediate':
    time.sleep(0.1)  # Very brief
elif scenario == 'idle':
    time.sleep(2)  # Thread idle in wait
else:  # active
    trigger_on_demand(f"cycle_profile_{cycle}")
    time.sleep(0.3)  # Let report start
```
This test introduces randomness (random.choice(...)) which can make the L2 suite flaky/non-reproducible. Prefer deterministic coverage (e.g., parametrize the three scenarios or iterate through them in a fixed order) so failures can be reliably reproduced.
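A deterministic alternative could parametrize the three scenarios so each one runs on every execution. This is a sketch only: the test name is illustrative, and trigger_on_demand_report() is assumed to be importable from the shared concurrency_helpers module mentioned above.

```python
import time

import pytest

from concurrency_helpers import trigger_on_demand_report  # assumed shared helper


@pytest.mark.parametrize("scenario", ["immediate", "idle", "active"])
def test_profile_lifecycle_scenario(scenario):
    """Cover each timing scenario in a fixed, reproducible order."""
    if scenario == "immediate":
        time.sleep(0.1)   # very brief pause before the next profile operation
    elif scenario == "idle":
        time.sleep(2)     # let the reporting thread sit idle in its wait
    else:  # active
        trigger_on_demand_report("cycle_profile_active")  # report in flight
        time.sleep(0.3)   # let the report start before continuing
```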
```python
def set_profile(profile_dict):
    """Set a telemetry profile via RBUS."""
    profile_json = json.dumps([profile_dict])
```
set_profile() serializes the profile as a JSON list ([profile_dict]), but other tests and helpers use the expected schema { "profiles": [ ... ] } for Device.X_RDKCENTRAL-COM_T2.ReportProfiles. This mismatch is likely to cause the profile set to be ignored/parsed incorrectly. Serialize as { "profiles": [profile_dict] } for consistency with clear_profile() and existing tests.
```diff
-    profile_json = json.dumps([profile_dict])
+    profile_json = json.dumps({"profiles": [profile_dict]})
```
```python
# Verify incremental processing (new marker should appear)
report = grep_T2logs("cJSON Report")
assert "ERROR_MARKER" in report or "Report" in report, "New markers should be processed"
```
This assertion is too weak to validate incremental processing: grep_T2logs("cJSON Report") will almost always contain the word "Report", so the test can pass even if ERROR_MARKER was not newly processed. Tighten the check to specifically assert the expected marker/value appears in the second report (and ideally that previously processed content is not duplicated).
```diff
-# Verify incremental processing (new marker should appear)
-report = grep_T2logs("cJSON Report")
-assert "ERROR_MARKER" in report or "Report" in report, "New markers should be processed"
+# Verify incremental processing using the second report contents
+report = grep_T2logs("cJSON Report")
+assert "ERROR_MARKER" in report, "Second report should include the newly appended ERROR_MARKER"
+assert "OLD_MARKER" not in report, "Second report should not duplicate previously processed OLD_MARKER content"
+assert "BOOT_COMPLETE" not in report, "Second report should not duplicate previously processed BOOT_COMPLETE content"
```
| """ | ||
| Test that large log files (>1MB) are processed efficiently. | ||
|
|
||
| Validates: | ||
| - Large files don't cause memory issues | ||
| - Processing completes in reasonable time | ||
| - Markers found in large files | ||
| """ | ||
| # Clear existing state | ||
| clear_T2logs() | ||
| clear_profile() | ||
|
|
||
| # Create large log file | ||
| large_log = [] | ||
| for i in range(10000): | ||
| large_log.append(f"Log entry {i}: Normal system operation") | ||
|
|
||
| # Add marker at end | ||
| large_log.append("BOOT_COMPLETE: Found at end of large file") | ||
|
|
||
| create_test_log(large_log) | ||
|
|
||
| # Verify file size is significant | ||
| file_size = os.path.getsize(TEST_LOG_FILE) | ||
| assert file_size > 500000, f"Test log should be > 500KB, got {file_size}" | ||
|
|
Docstring says this validates large log files >1MB, but the assertion only checks >500KB. Either increase the generated size to match the documented threshold or update the docstring so the test’s criteria are accurate.
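If the docstring's 1MB threshold is the intended behavior, one option is to pad the generated lines so the file clearly exceeds it; a sketch, with the padding amount chosen arbitrarily:

```python
# Pad each entry so 10,000 lines comfortably exceed 1 MB, matching the docstring.
large_log = [
    f"Log entry {i}: Normal system operation {'x' * 100}"
    for i in range(10000)
]
large_log.append("BOOT_COMPLETE: Found at end of large file")
create_test_log(large_log)

file_size = os.path.getsize(TEST_LOG_FILE)
assert file_size > 1_000_000, f"Test log should be > 1MB, got {file_size}"
```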
```python
@pytest.mark.order(1)
def test_grep_marker_collection():
```
This file uses @pytest.mark.order(...), but the rest of the L2 suite uses @pytest.mark.run(order=...). If only the existing ordering plugin is present, these marks may be ignored and the intended execution order won’t apply. Align with the convention already used in the other L2 tests.
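If the suite standardizes on the marker style already used by the other L2 tests (per the comment above), the mark would look like this sketch:

```python
import pytest


@pytest.mark.run(order=1)  # convention used by the existing L2 suites
def test_grep_marker_collection():
    ...
```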
```python
# At least one of each type should be present
assert has_grep or has_top, "Mixed markers should appear in report"
```
The test intent says both grep and top markers should appear in the same report, but the assertion only requires one (has_grep or has_top). This can pass even if one marker type is missing. Require both (or assert each independently) to actually validate the mixed-marker behavior.
```diff
-# At least one of each type should be present
-assert has_grep or has_top, "Mixed markers should appear in report"
+# Both configured marker types should be present in the same report
+assert has_grep, "Grep marker should appear in mixed-marker report"
+assert has_top, "Top marker should appear in mixed-marker report"
```
Changes:
1] Include L2 tests for race-condition handling
2] Add a skill for L2 test gap analysis
3] Correct the test documentation gaps
4] Add L2 tests for dcautil