Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the review settings. Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
Walkthrough
This PR updates many docstrings, removes or comments out several utility exports, and adjusts probabilistic threshold functions and empty-reference/missing-case handling.
Changes
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks | ✅ 1 | ❌ 2
❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
✨ Finishing Touches: 🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Actionable comments posted: 6
🧹 Nitpick comments (1)
MetricsReloaded/processes/overall_process.py (1)
644-644: Replace mutable default argument {} with None in label_aggregation. Mutable default arguments are shared across all function calls, leading to unexpected behavior. Initialize the default value inside the function instead.
Proposed fix
-def label_aggregation(self, option='average',dict_args={}):
+def label_aggregation(self, option='average', dict_args=None):
+    if dict_args is None:
+        dict_args = {}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@MetricsReloaded/processes/overall_process.py` at line 644, The function label_aggregation currently uses a mutable default dict_args={} which can be shared across calls; change the signature to use dict_args=None and inside label_aggregation initialize dict_args = {} if dict_args is None, then proceed to use dict_args as before (reference the label_aggregation function and its option parameter) so behavior remains identical but avoids the mutable default trap.
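As a standalone illustration of why this fix matters (the broken/fixed pair below is hypothetical, not the project's code): the default dict is created once, at function definition time, so every call without an explicit argument mutates the same shared object.

```python
def broken(key, cache={}):
    # the default dict is created once, at definition time,
    # so every call without an explicit cache shares the same object
    cache[key] = True
    return cache

def fixed(key, cache=None):
    # a fresh dict is created on each call when none is supplied
    if cache is None:
        cache = {}
    cache[key] = True
    return cache

a = broken("x")
b = broken("y")  # b is the same dict as a, now holding both keys
c = fixed("x")
d = fixed("y")   # c and d are independent dicts
```

This is exactly the trap the proposed signature change avoids for label_aggregation.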
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/python-app.yml:
- Line 12: The workflow push trigger lists branches: [ main, docs_test ] but the
repo’s active branch is named docs_tests (plural); update the branches array to
include the correct branch name (replace or add "docs_tests") so the push event
will trigger CI for that branch (look for the branches: [ main, docs_test ] line
in the workflow YAML and change "docs_test" to "docs_tests").
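For reference, the corrected trigger would read as follows, assuming docs_tests is indeed the intended branch name:

```yaml
on:
  push:
    branches: [ main, docs_tests ]
```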
In `@MetricsReloaded/metrics/calibration_measures.py`:
- Around line 111-112: Several docstrings in
MetricsReloaded/metrics/calibration_measures.py contain LaTeX math with
unescaped backslashes (e.g., the cwECE formula), causing W605 errors; fix by
making those docstrings raw strings (prefix with r) or escaping backslashes
(e.g., \\dfrac → \\\\dfrac). Update the docstrings that include the formulas
referenced around the cwECE expression and the other occurrences noted (lines
near 167-168, 214-215, 262, 282, 295-296, 406, 444, 469, 501-502) so the
functions/classes that contain them (search for the docstrings in
calibration_measures.py and identifiers like cwECE) use r"..." or properly
escaped backslashes throughout. Ensure you change only the docstring delimiters
and escapes, not the formula text itself.
In `@MetricsReloaded/metrics/pairwise_measures.py`:
- Around line 94-95: normalised_expected_cost currently divides by naive_cost
without checking for zero/invalid values; update the function
normalised_expected_cost to validate naive_cost (the variable used in the
denominator) by ensuring it is finite and > 0 (e.g., via
math.isfinite/np.isfinite and a >0 check) and if invalid return a safe sentinel
(e.g., float('nan') or 0) instead of performing the division; ensure the guard
covers all code paths where naive_cost is used (see the division in
normalised_expected_cost and related lines around where naive_cost is computed).
- Around line 99-113: The docstrings in
MetricsReloaded/metrics/pairwise_measures.py that contain LaTeX math (e.g., the
Expected Cost docstring with EC and \sum, and other blocks referenced around
lines 739–747, 876–880, 1017–1018, 1410–1411) must be converted to raw string
literals so backslashes are not treated as escape sequences; update each
triple-quoted docstring ("""...""") that contains LaTeX commands like \sum,
\dfrac, \text, \cap, \cup, etc., to use the raw prefix r"""...""" (preserve the
exact content and formatting inside the docstring) throughout
pairwise_measures.py to eliminate W605 warnings.
In `@MetricsReloaded/processes/overall_process.py`:
- Around line 586-587: The code inconsistently uses the keys 'ref_missing' and
'ref_missing_pred', causing skipped processing or KeyError; update
complete_missing_cases() and all checks (e.g., the if guard that currently reads
len(self.data['ref_missing_pred']) == 0 and the other occurrences around the
598-599 area) to read the values via self.data.get('ref_missing') or
self.data.get('ref_missing_pred') (prefer a single unified key by normalizing:
value = self.data.get('ref_missing') or self.data.get('ref_missing_pred') or [])
and act on that normalized list; also update the constructor gate to call
complete_missing_cases() when either key exists (e.g., if 'ref_missing' in
self.data or 'ref_missing_pred' in self.data: self.complete_missing_cases()) so
both schemas are handled safely.
In `@pyproject.toml`:
- Line 4: The build requirement string in pyproject.toml is missing a closing
quote (the entry currently reads "setuptools < 82.0.0), causing TOML parsing to
fail; update the build-system.requires entry so the setuptools requirement is a
properly quoted string (close the trailing double quote on the "setuptools <
82.0.0" token) ensuring the value is a valid TOML string.
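A minimal well-formed build-system table would look like the sketch below; the setuptools pin comes from the review comment, while the build-backend line is an assumption about this project's setup:

```toml
[build-system]
requires = ["setuptools < 82.0.0"]
build-backend = "setuptools.build_meta"
```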
---
Nitpick comments:
In `@MetricsReloaded/processes/overall_process.py`:
- Line 644: The function label_aggregation currently uses a mutable default
dict_args={} which can be shared across calls; change the signature to use
dict_args=None and inside label_aggregation initialize dict_args = {} if
dict_args is None, then proceed to use dict_args as before (reference the
label_aggregation function and its option parameter) so behavior remains
identical but avoids the mutable default trap.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 6a4f87c8-270b-49c7-a690-7ff609bf0f31
📒 Files selected for processing (9)
.github/workflows/python-app.yml
MetricsReloaded/metrics/calibration_measures.py
MetricsReloaded/metrics/pairwise_measures.py
MetricsReloaded/metrics/prob_pairwise_measures.py
MetricsReloaded/processes/mixed_measures_processes.py
MetricsReloaded/processes/overall_process.py
MetricsReloaded/utility/assignment_localization.py
MetricsReloaded/utility/utils.py
pyproject.toml
cwECE = \\dfrac{1}{K}\sum_{k=1}^{K}\sum_{i=1}^{N}\\dfrac{\\vert B_{i,k} \\vert}{N} \\left(y_{k}(B_{i,k}) - p_{k}(B_{i,k})\\right)
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
ruff check MetricsReloaded/metrics/calibration_measures.py --select W605 -n
Repository: Project-MONAI/MetricsReloaded
Length of output: 22957
Resolve W605 invalid escape sequences in formula docstrings.
The math expressions in docstrings contain unescaped backslashes. Convert to raw strings (prefix with r) or escape backslashes (e.g., \\dfrac → \\\\dfrac) to resolve these violations.
Also applies to: 167-168, 214-215, 262, 282, 295-296, 406, 444, 469, 501-502
🧰 Tools
🪛 Ruff (0.15.9)
[warning] 111-111: Invalid escape sequence: \s
Add backslash to escape sequence
(W605)
[warning] 111-111: Invalid escape sequence: \s
Add backslash to escape sequence
(W605)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@MetricsReloaded/metrics/calibration_measures.py` around lines 111 - 112,
Several docstrings in MetricsReloaded/metrics/calibration_measures.py contain
LaTeX math with unescaped backslashes (e.g., the cwECE formula), causing W605
errors; fix by making those docstrings raw strings (prefix with r) or escaping
backslashes (e.g., \\dfrac → \\\\dfrac). Update the docstrings that include the
formulas referenced around the cwECE expression and the other occurrences noted
(lines near 167-168, 214-215, 262, 282, 295-296, 406, 444, 469, 501-502) so the
functions/classes that contain them (search for the docstrings in
calibration_measures.py and identifiers like cwECE) use r"..." or properly
escaped backslashes throughout. Ensure you change only the docstring delimiters
and escapes, not the formula text itself.
"nec": (self.normalised_expected_cost,"NEC"),
Guard NEC against zero/invalid naive baseline before division.
With line 94 adding "nec", normalised_expected_cost() becomes part of the default metric surface. Line 153 currently divides by naive_cost unguarded, which can yield inf/nan for degenerate cost setups.
💡 Proposed fix
 def normalised_expected_cost(self):
 @@
     naive_cost = self.best_naive_ec()
     ec = self.expected_cost()
-    return ec / naive_cost
+    if np.isnan(naive_cost) or naive_cost == 0:
+        warnings.warn("Naive expected cost is zero/undefined; NEC is not defined")
+        return np.nan
+    return ec / naive_cost

Also applies to: 147-153
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@MetricsReloaded/metrics/pairwise_measures.py` around lines 94 - 95,
normalised_expected_cost currently divides by naive_cost without checking for
zero/invalid values; update the function normalised_expected_cost to validate
naive_cost (the variable used in the denominator) by ensuring it is finite and >
0 (e.g., via math.isfinite/np.isfinite and a >0 check) and if invalid return a
safe sentinel (e.g., float('nan') or 0) instead of performing the division;
ensure the guard covers all code paths where naive_cost is used (see the
division in normalised_expected_cost and related lines around where naive_cost
is computed).
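The guard can be sketched as a free function over plain numbers; safe_nec below is a hypothetical helper mirroring the proposed change (stdlib math in place of numpy), not the class method itself:

```python
import math
import warnings

def safe_nec(ec, naive_cost):
    # refuse to divide by a non-finite, zero, or negative baseline,
    # returning NaN as the "undefined" sentinel suggested in the review
    if not math.isfinite(naive_cost) or naive_cost <= 0:
        warnings.warn("Naive expected cost is zero/undefined; NEC is not defined")
        return float("nan")
    return ec / naive_cost
```

Returning NaN rather than 0 keeps degenerate setups visibly distinct from a genuinely perfect score in downstream aggregation.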
"""
Calculates the expected cost defined as:

Luciana Ferrer - Analysis and comparison of classification metrics - https://arxiv.org/pdf/2209.05355

.. math::

    EC = \sum_{r}\sum_p c_{rp} P_r D_{rp}

where :math:`c_{rp}` is the cost of misclassifying class r as class p, :math:`P_r` is the probability of class r in the reference data, and :math:`D_{rp}` is the fraction of samples of class r that are classified as class p.
"""
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
ruff check MetricsReloaded/metrics/pairwise_measures.py --select W605 -nRepository: Project-MONAI/MetricsReloaded
Length of output: 31191
Use raw docstrings (r"""...""") for all math blocks with LaTeX escape sequences to fix W605 errors across the file.
The file contains 57 invalid escape sequence warnings in docstrings with LaTeX math notation. While the specific ranges mentioned (99–113, 739–747, 876–880, 1017–1018, 1410–1411) are accurate, the problem is more widespread. All docstrings containing LaTeX patterns like \sum, \dfrac, \text, \cap, \cup, and similar should use raw string literals to prevent Python from interpreting backslashes as escape characters. This will eliminate all W605 violations and prevent lint-gated CI failures.
🧰 Tools
🪛 Ruff (0.15.9)
[warning] 106-106: Invalid escape sequence: \s
Use a raw string literal
(W605)
[warning] 106-106: Invalid escape sequence: \s
Use a raw string literal
(W605)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@MetricsReloaded/metrics/pairwise_measures.py` around lines 99 - 113, The
docstrings in MetricsReloaded/metrics/pairwise_measures.py that contain LaTeX
math (e.g., the Expected Cost docstring with EC and \sum, and other blocks
referenced around lines 739–747, 876–880, 1017–1018, 1410–1411) must be
converted to raw string literals so backslashes are not treated as escape
sequences; update each triple-quoted docstring ("""...""") that contains LaTeX
commands like \sum, \dfrac, \text, \cap, \cup, etc., to use the raw prefix
r"""...""" (preserve the exact content and formatting inside the docstring)
throughout pairwise_measures.py to eliminate W605 warnings.
if len(self.data['ref_missing_pred']) == 0:
    return
Unify missing-case key handling to avoid skipped processing/KeyError.
Line 586 assumes ref_missing_pred, while the constructor gate still checks ref_missing. This can silently bypass completion or fail at runtime depending on input schema.
💡 Proposed fix
 def complete_missing_cases(self):
 @@
-    if len(self.data['ref_missing_pred']) == 0:
+    missing_cases = self.data.get("ref_missing_pred", self.data.get("ref_missing", []))
+    if len(missing_cases) == 0:
         return
 @@
-    for (i,f) in enumerate(self.data['ref_missing_pred']):
+    for (i, f) in enumerate(missing_cases):

Also update the constructor gate to call complete_missing_cases() when either key exists.
Also applies to: 598-599
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@MetricsReloaded/processes/overall_process.py` around lines 586 - 587, The
code inconsistently uses the keys 'ref_missing' and 'ref_missing_pred', causing
skipped processing or KeyError; update complete_missing_cases() and all checks
(e.g., the if guard that currently reads len(self.data['ref_missing_pred']) == 0
and the other occurrences around the 598-599 area) to read the values via
self.data.get('ref_missing') or self.data.get('ref_missing_pred') (prefer a
single unified key by normalizing: value = self.data.get('ref_missing') or
self.data.get('ref_missing_pred') or []) and act on that normalized list; also
update the constructor gate to call complete_missing_cases() when either key
exists (e.g., if 'ref_missing' in self.data or 'ref_missing_pred' in self.data:
self.complete_missing_cases()) so both schemas are handled safely.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
MetricsReloaded/metrics/calibration_measures.py (2)
463-491: ⚠️ Potential issue | 🟡 Minor — Parameter order mismatch in documentation.
The method signature is dirichlet_kernel(self, j, i) but the docstring documents parameters in the order (i, j). While not incorrect (both are documented), it may cause confusion.
📝 Suggested fix to align parameter order
-:param i: index of first sample to consider
-:param j: index of second sample to consider
+:param j: index of first sample to consider
+:param i: index of second sample to consider
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@MetricsReloaded/metrics/calibration_measures.py` around lines 463 - 491, The docstring for dirichlet_kernel documents parameters as (i, j) but the method signature is dirichlet_kernel(self, j, i); update the docstring to list parameters in the same order (j then i) and adjust the :param descriptions to match (e.g., :param j: index of first sample, :param i: index of second sample), so the documented parameter names and order match the function signature (dirichlet_kernel).
518-529: ⚠️ Potential issue | 🟡 Minor — Missing parameter documentation for fmt.
The docstring does not document the fmt parameter that appears in the method signature. Although it appears to be unused in the implementation (line 527 is commented out), it should still be documented.
📝 Suggested fix
 def to_dict_meas(self, fmt="{:.4f}"):
     """
     Given the selected metrics provides a dictionary with relevant metrics
+    :param fmt: format string for results (currently unused)
     :return: result_dict dictionary of results
     """
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@MetricsReloaded/metrics/calibration_measures.py` around lines 518 - 529, The method to_dict_meas has an undocumented parameter fmt and currently doesn't use it; update the to_dict_meas docstring to include a concise description of the fmt parameter (e.g., expected format string like "{:.4f}" and that it is applied to numeric metric results), and then either (a) re-enable and use fmt to format each metric result before assigning to result_dict (apply fmt.format(result)) or (b) remove the fmt parameter from the signature if formatting is not desired; adjust the implementation in to_dict_meas accordingly so the docstring and signature/behavior remain consistent with each other.
♻️ Duplicate comments (1)
MetricsReloaded/metrics/calibration_measures.py (1)
111-111: ⚠️ Potential issue | 🟡 Minor — Unresolved W605 invalid escape sequences in LaTeX formula.
The LaTeX formula contains unescaped backslashes that produce W605 warnings. This issue was previously flagged but remains unresolved. Convert to a raw string by prefixing with r, or escape all backslashes.
🔧 Proposed fix
-    cwECE = \\dfrac{1}{K}\sum_{k=1}^{K}\sum_{i=1}^{N}\\dfrac{\\vert B_{i,k} \\vert}{N} \\left(y_{k}(B_{i,k}) - p_{k}(B_{i,k})\\right)
+    cwECE = \\dfrac{1}{K}\\sum_{k=1}^{K}\\sum_{i=1}^{N}\\dfrac{\\vert B_{i,k} \\vert}{N} \\left(y_{k}(B_{i,k}) - p_{k}(B_{i,k})\\right)
Alternatively (preferred), make the whole docstring a raw string so the .. math:: block needs no doubled backslashes at all.
Note: The same fix pattern applies to all other formulas with invalid escape sequences in this file.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@MetricsReloaded/metrics/calibration_measures.py` at line 111, The LaTeX formula string for cwECE uses unescaped backslashes causing W605; update the literal containing "cwECE = \\dfrac{1}{K}..." in calibration_measures.py to be a raw string (prefix with r) or escape every backslash so Python won't treat them as invalid escape sequences, and apply the same fix to all other formula strings in the file (look for other LaTeX strings like those referencing B_{i,k}, y_{k} and p_{k}).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/python-app.yml:
- Around line 12-16: The workflow file contains unresolved Git conflict markers
(<<<<<<<, =======, >>>>>>>) around the branches array; remove the conflict
markers and keep a single valid YAML line for the branches key (e.g., branches:
[ main, docs_tests ] or branches: [ main, docs_test ] depending on the intended
branch name) so that the branches array is valid YAML and the GitHub Actions
workflow can parse; ensure there are no leftover markers anywhere in the file
and run a YAML lint or quick git diff to confirm only one branches entry
remains.
---
Outside diff comments:
In `@MetricsReloaded/metrics/calibration_measures.py`:
- Around line 463-491: The docstring for dirichlet_kernel documents parameters
as (i, j) but the method signature is dirichlet_kernel(self, j, i); update the
docstring to list parameters in the same order (j then i) and adjust the :param
descriptions to match (e.g., :param j: index of first sample, :param i: index of
second sample), so the documented parameter names and order match the function
signature (dirichlet_kernel).
- Around line 518-529: The method to_dict_meas has an undocumented parameter fmt
and currently doesn't use it; update the to_dict_meas docstring to include a
concise description of the fmt parameter (e.g., expected format string like
"{:.4f}" and that it is applied to numeric metric results), and then either (a)
re-enable and use fmt to format each metric result before assigning to
result_dict (apply fmt.format(result)) or (b) remove the fmt parameter from the
signature if formatting is not desired; adjust the implementation in
to_dict_meas accordingly so the docstring and signature/behavior remain
consistent with each other.
---
Duplicate comments:
In `@MetricsReloaded/metrics/calibration_measures.py`:
- Line 111: The LaTeX formula string for cwECE uses unescaped backslashes
causing W605; update the literal containing "cwECE = \\dfrac{1}{K}..." in
calibration_measures.py to be a raw string (prefix with r) or escape every
backslash so Python won't treat them as invalid escape sequences, and apply the
same fix to all other formula strings in the file (look for other LaTeX strings
like those referencing B_{i,k}, y_{k} and p_{k}).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 11d7b665-8b1b-4a2c-8f7f-84ccaac29225
📒 Files selected for processing (2)
.github/workflows/python-app.yml
MetricsReloaded/metrics/calibration_measures.py
Actionable comments posted: 5
♻️ Duplicate comments (2)
MetricsReloaded/metrics/pairwise_measures.py (2)
94-95: ⚠️ Potential issue | 🟠 Major — Guard NEC denominator before division.
"nec" is now part of the selectable multiclass metrics, but NEC still divides by naive_cost without validating a finite, non-zero denominator. This can return inf/nan for degenerate class/cost setups.
💡 Proposed fix
 def normalised_expected_cost(self):
     """
     Calculates the normalised expected cost as the ratio of the expected cost to the naive expected cost.
     """
     naive_cost = self.best_naive_ec()
     ec = self.expected_cost()
-    return ec / naive_cost
+    if not np.isfinite(naive_cost) or naive_cost <= 0:
+        warnings.warn("Naive expected cost is non-finite or <= 0; NEC is not defined")
+        return np.nan
+    return ec / naive_cost
Also applies to: 147-153
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@MetricsReloaded/metrics/pairwise_measures.py` around lines 94 - 95, The NEC computation uses self.normalised_expected_cost which divides by naive_cost without checking the denominator; update the code that computes normalised_expected_cost (and the same block around the other NEC calculation at lines ~147-153) to guard the division by testing that naive_cost is finite and non-zero (e.g. use numpy.isfinite and naive_cost != 0); if the check fails, set the normalized NEC result to a safe value (e.g. 0.0 or np.nan consistently) instead of performing the division so you never return inf/nan for degenerate class/cost setups.
99-113: ⚠️ Potential issue | 🟠 Major — Convert LaTeX docstrings to raw strings to fix W605.
These changed docstrings contain LaTeX backslashes in non-raw triple-quoted strings, triggering invalid escape warnings and risking lint/doc-check failures.
💡 Proposed fix (pattern)
-def expected_cost(self):
-    """
+def expected_cost(self):
+    r"""
@@
-def expected_matching_ck(self):
-    """
+def expected_matching_ck(self):
+    r"""
@@
-def dsc(self):
-    """
+def dsc(self):
+    r"""
@@
-def intersection_over_reference(self):
-    """
+def intersection_over_reference(self):
+    r"""
@@
-def measured_masd(self):
-    """
+def measured_masd(self):
+    r"""
#!/bin/bash
# Verify invalid escape warnings in the reviewed file only (read-only).
ruff check MetricsReloaded/metrics/pairwise_measures.py --select W605 -n
Also applies to: 740-748, 876-881, 1018-1018, 1411-1412
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@MetricsReloaded/metrics/pairwise_measures.py` around lines 99 - 113, The docstrings containing LaTeX backslashes (e.g., the one starting "Calculates the expected cost" and the other docstrings containing math backslashes) must be converted to raw triple-quoted strings to avoid W605 invalid-escape warnings: change the opening quotes from """ to r""" for those docstrings (the ones that include sequences like \sum, \math::, or other LaTeX backslashes) and ensure no other string concatenation is broken; after updating, run a W605 check (ruff --select W605) to confirm all occurrences (including the other reported docstrings) are fixed.
🧹 Nitpick comments (1)
test/test_utility/test_utils.py (1)
83-87: Drop these debug prints before merging. They do not support any assertion and will just add noise to pytest output when this test runs.
🧹 Proposed cleanup
-print(df_sensdscnsd)
-print(val_merge4)
-print(df3)
-print(df5)
-print(df4)
🤖 Prompt for AI Agents
- print(df_sensdscnsd) - print(val_merge4) - print(df3) - print(df5) - print(df4)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/test_utility/test_utils.py` around lines 83 - 87, Remove the leftover debug print statements printing df_sensdscnsd, val_merge4, df3, df5, and df4 in test_utility/test_utils.py so the test does not emit noisy output; locate the prints (print(df_sensdscnsd), print(val_merge4), print(df3), print(df5), print(df4)) and delete them, ensuring the test contains only assertions and setup/teardown logic.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@MetricsReloaded/metrics/pairwise_measures.py`:
- Around line 307-315: Update the docstring entry for the existing :param empty:
in the pairwise_measures docblock to explicitly state what the parameter
represents, its expected type and accepted values, the default value, and how it
affects metric computation (e.g., how empty/missing pixels or regions are
treated or substituted when computing measures). Locate the docstring containing
the parameters pred, ref, measures, connectivity_type, pixdim, empty, dict_args
in pairwise_measures.py (the function or class where those params are declared)
and replace the blank :param empty: line with a concise description mentioning
its role and behavior.
In `@MetricsReloaded/utility/utils.py`:
- Around line 115-128: A public method was removed by commenting it out, causing
runtime AttributeError for callers; restore the original implementation of
MorphologyOps.border_map2 (the method that computes border using ndimage.shift
over self.binary_map and returns border) so the public API remains available, or
if this was intentional, revert the comment change and instead open a dedicated
breaking-change PR that either (a) removes border_map2 after announcing the
breaking change, or (b) adds a compatibility shim named
MorphologyOps.border_map2 that raises a clear DeprecationWarning and directs
users to the new API; locate the symbol MorphologyOps.border_map2 in
MetricsReloaded/utility/utils.py and either re-enable the original code block
(west/east/north/south/top/bottom, cumulative, border) or replace it with the
deprecation shim and ensure tests/documentation are updated accordingly.
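Option (b), the compatibility shim, might look like the sketch below. The class is a minimal stand-in showing only the shim mechanics; border_map standing in as the surviving implementation is an assumption about the final API:

```python
import warnings

class MorphologyOps:
    # minimal stand-in: only the deprecation mechanics are shown here;
    # border_map is assumed to be the surviving replacement API
    def border_map(self):
        return "border"

    def border_map2(self):
        # keep the old name importable, but steer callers to the new one
        warnings.warn(
            "MorphologyOps.border_map2 is deprecated; use border_map instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.border_map()
```

stacklevel=2 makes the warning point at the caller's line rather than the shim itself, which is what users need to locate and update their code.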
- Around line 420-517: The module's __all__ includes to_string_count,
to_string_dist, to_string_mt, and to_dict_meas_ but their implementations are
commented out; either restore these functions (uncomment and ensure
signatures/behavior match what importers expect) or remove those four names from
__all__ so the public API is accurate; update the module accordingly and run
tests/linters to confirm no import-time failures referencing to_string_count,
to_string_dist, to_string_mt, or to_dict_meas_.
- Around line 557-568: The function merge_list_df currently emits raw prints
(e.g., print(f, ' not present'), print(len(list_fin))) which pollute runtime
output; update merge_list_df to remove these print() calls and instead use the
module logger: add an import logging and a logger = logging.getLogger(__name__)
(or reuse existing), replace the prints with logger.debug(...) containing the
same diagnostic text, and remove any stray commented print lines so the function
no longer writes to stdout during normal execution while preserving debug
visibility via logging.
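The print-to-logger swap follows the standard module-logger pattern; this hypothetical skeleton (the actual merge logic of merge_list_df is elided) shows only the diagnostic side of the change:

```python
import logging

logger = logging.getLogger(__name__)

def merge_list_df_sketch(list_df, names):
    # hypothetical skeleton of merge_list_df: absent frames are reported
    # at debug level instead of being printed to stdout
    list_fin = []
    for f, df in zip(names, list_df):
        if df is None:
            logger.debug("%s not present", f)
            continue
        list_fin.append(df)
    logger.debug("merged %d frames", len(list_fin))
    return list_fin
```

Because debug records are suppressed by default, normal runs stay quiet while a caller can re-enable the diagnostics with logging.basicConfig(level=logging.DEBUG).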
In `@test/test_processes/test_overall_process.py`:
- Line 46: The test changed the input key to 'ref_missing_pred' which prevents
ProcessEvaluation (in MetricsReloaded/processes/overall_process.py) from
detecting ref_missing and calling complete_missing_cases(), causing the
assertion to reflect the broken path; modify the test to restore the original
key name 'ref_missing' for data_miss (i.e., revert data_miss['ref_missing_pred']
back to data_miss['ref_missing']) and restore the expected result to [3,8] so
the test asserts the completed-missing-cases behavior checked by
ProcessEvaluation.complete_missing_cases().
---
Duplicate comments:
In `@MetricsReloaded/metrics/pairwise_measures.py`:
- Around line 94-95: The NEC computation uses self.normalised_expected_cost
which divides by naive_cost without checking the denominator; update the code
that computes normalised_expected_cost (and the same block around the other NEC
calculation at lines ~147-153) to guard the division by testing that naive_cost
is finite and non-zero (e.g. use numpy.isfinite and naive_cost != 0); if the
check fails, set the normalized NEC result to a safe value (e.g. 0.0 or np.nan
consistently) instead of performing the division so you never return inf/nan for
degenerate class/cost setups.
- Around line 99-113: The docstrings containing LaTeX backslashes (e.g., the one
starting "Calculates the expected cost" and the other docstrings containing math
backslashes) must be converted to raw triple-quoted strings to avoid W605
invalid-escape warnings: change the opening quotes from """ to r""" for those
docstrings (the ones that include sequences like \sum, \math::, or other LaTeX
backslashes) and ensure no other string concatenation is broken; after updating,
run a W605 check (ruff --select W605) to confirm all occurrences (including the
other reported docstrings) are fixed.
---
Nitpick comments:
In `@test/test_utility/test_utils.py`:
- Around line 83-87: Remove the leftover debug print statements printing
df_sensdscnsd, val_merge4, df3, df5, and df4 in test_utility/test_utils.py so
the test does not emit noisy output; locate the prints (print(df_sensdscnsd),
print(val_merge4), print(df3), print(df5), print(df4)) and delete them, ensuring
the test contains only assertions and setup/teardown logic.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 1b463270-6054-4711-852c-81cfb4650109
📒 Files selected for processing (4)
MetricsReloaded/metrics/pairwise_measures.py
MetricsReloaded/utility/utils.py
test/test_processes/test_overall_process.py
test/test_utility/test_utils.py
Input includes:
:param pred: Prediction
:param ref: Reference
:param measures: list of measures to extract
:param connectivity_type: Type of connectivity to use
:param pixdim: list of pixel dimensions
:param empty:
:param dict_args: Dictionary with additional arguments for the different metrics
"""
Complete the empty parameter documentation.
The class docstring leaves :param empty: without any description, so the “all parameters explained” objective is still incomplete.
💡 Proposed fix
- :param empty:
+ :param empty: Whether empty-reference/prediction handling flags are enabled for metric computation.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| Input includes: | |
| :param pred: Prediction | |
| :param ref: Reference | |
| :param measures: list of measures to extract | |
| :param connectivity_type: Type of connectivity to use | |
| :param pixdim: list of pixel dimensions | |
| :param empty: | |
| :param dict_args: Dictionary with additional arguments for the different metrics | |
| """ | |
| Input includes: | |
| :param pred: Prediction | |
| :param ref: Reference | |
| :param measures: list of measures to extract | |
| :param connectivity_type: Type of connectivity to use | |
| :param pixdim: list of pixel dimensions | |
| :param empty: Whether empty-reference/prediction handling flags are enabled for metric computation. | |
| :param dict_args: Dictionary with additional arguments for the different metrics | |
| """ |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@MetricsReloaded/metrics/pairwise_measures.py` around lines 307 - 315, Update
the docstring entry for the existing :param empty: in the pairwise_measures
docblock to explicitly state what the parameter represents, its expected type
and accepted values, the default value, and how it affects metric computation
(e.g., how empty/missing pixels or regions are treated or substituted when
computing measures). Locate the docstring containing the parameters pred, ref,
measures, connectivity_type, pixdim, empty, dict_args in pairwise_measures.py
(the function or class where those params are declared) and replace the blank
:param empty: line with a concise description mentioning its role and behavior.
| # def border_map2(self): | ||
| # """ | ||
| # Creates the border for a 3D image | ||
| # :return: | ||
| # """ | ||
| # west = ndimage.shift(self.binary_map, [-1, 0, 0], order=0) | ||
| # east = ndimage.shift(self.binary_map, [1, 0, 0], order=0) | ||
| # north = ndimage.shift(self.binary_map, [0, 1, 0], order=0) | ||
| # south = ndimage.shift(self.binary_map, [0, -1, 0], order=0) | ||
| # top = ndimage.shift(self.binary_map, [0, 0, 1], order=0) | ||
| # bottom = ndimage.shift(self.binary_map, [0, 0, -1], order=0) | ||
| # cumulative = west + east + north + south + top + bottom | ||
| # border = ((cumulative < 6) * self.binary_map) == 1 | ||
| # return border |
Keep MorphologyOps.border_map2() live or remove it in a dedicated breaking-change PR.
Commenting out the method removes a public API from MorphologyOps, so any existing caller now gets AttributeError. That is a runtime regression, not a docs-only change.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@MetricsReloaded/utility/utils.py` around lines 115 - 128, A public method was
removed by commenting it out, causing runtime AttributeError for callers;
restore the original implementation of MorphologyOps.border_map2 (the method
that computes border using ndimage.shift over self.binary_map and returns
border) so the public API remains available, or if this was intentional, revert
the comment change and instead open a dedicated breaking-change PR that either
(a) removes border_map2 after announcing the breaking change, or (b) adds a
compatibility shim named MorphologyOps.border_map2 that raises a clear
DeprecationWarning and directs users to the new API; locate the symbol
MorphologyOps.border_map2 in MetricsReloaded/utility/utils.py and either
re-enable the original code block (west/east/north/south/top/bottom, cumulative,
border) or replace it with the deprecation shim and ensure tests/documentation
are updated accordingly.
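The deprecation-shim option mentioned in the prompt could look like the following minimal sketch. The class and method names come from the review; treating `border_map()` as the surviving API is an assumption for illustration.

```python
import warnings

class MorphologyOps:
    """Stand-in class; only the compatibility shim is illustrated."""

    def border_map(self):
        # Placeholder for the retained border-map implementation.
        return "border"

    def border_map2(self):
        # Backward-compatibility shim: warn loudly, then delegate to the
        # surviving API so existing callers keep working.
        warnings.warn(
            "MorphologyOps.border_map2() is deprecated; use border_map() instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.border_map()
```

This keeps the public attribute in place for one release cycle while steering callers toward the replacement.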
| print(f, ' not present') | ||
| break | ||
| if flag_on: | ||
| list_fin.append(k) | ||
| if len(list_fin) == 0: | ||
| return None | ||
| elif len(list_fin) == 1: | ||
| return list_fin[0] | ||
| else: | ||
| print("list fin is ",list_fin) | ||
| #print("list fin is ",list_fin) | ||
| df_fin = list_fin[0] | ||
| print(len(list_fin)) |
Remove the raw print() calls from merge_list_df.
This helper now writes debug output on normal execution, which will leak into library consumers and test logs. If you still need diagnostics here, use a logger at debug level instead.
✂️ Proposed cleanup
for f in on:
if f not in k.columns:
flag_on = False
- print(f, ' not present')
break
@@
else:
- #print("list fin is ",list_fin)
df_fin = list_fin[0]
- print(len(list_fin))
for k in list_fin[1:]:
df_fin = pd.merge(df_fin, k, on=on)
📝 Committable suggestion
| print(f, ' not present') | |
| break | |
| if flag_on: | |
| list_fin.append(k) | |
| if len(list_fin) == 0: | |
| return None | |
| elif len(list_fin) == 1: | |
| return list_fin[0] | |
| else: | |
| print("list fin is ",list_fin) | |
| #print("list fin is ",list_fin) | |
| df_fin = list_fin[0] | |
| print(len(list_fin)) | |
| break | |
| if flag_on: | |
| list_fin.append(k) | |
| if len(list_fin) == 0: | |
| return None | |
| elif len(list_fin) == 1: | |
| return list_fin[0] | |
| else: | |
| df_fin = list_fin[0] | |
| for k in list_fin[1:]: | |
| df_fin = pd.merge(df_fin, k, on=on) |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@MetricsReloaded/utility/utils.py` around lines 557 - 568, The function
merge_list_df currently emits raw prints (e.g., print(f, ' not present'),
print(len(list_fin))) which pollute runtime output; update merge_list_df to
remove these print() calls and instead use the module logger: add an import
logging and a logger = logging.getLogger(__name__) (or reuse existing), replace
the prints with logger.debug(...) containing the same diagnostic text, and
remove any stray commented print lines so the function no longer writes to
stdout during normal execution while preserving debug visibility via logging.
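The logger pattern described in the prompt can be sketched in isolation like this. The helper below is a simplified stand-in for `merge_list_df` (the column-check loop is abridged and the actual `pd.merge` step is omitted); the point is only that diagnostics go through a module logger instead of `print()`.

```python
import logging

logger = logging.getLogger(__name__)

def filter_mergeable(list_df, on):
    """Keep only frames that carry all the merge columns.

    Diagnostic messages go through the module logger at DEBUG level,
    so library consumers see nothing unless they opt in via logging config.
    """
    list_fin = []
    for k in list_df:
        missing = [f for f in on if f not in k.columns]
        if missing:
            logger.debug("%s not present", missing)  # was: print(f, ' not present')
            continue
        list_fin.append(k)
    logger.debug("list fin has %d frames", len(list_fin))  # was: print(len(list_fin))
    return list_fin
```

A consumer who wants the diagnostics back enables them with `logging.basicConfig(level=logging.DEBUG)`; everyone else gets silent execution.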
| data_miss['list_values'] = [1] | ||
| data_miss['pred_prob'] = [None,None] | ||
| data_miss['ref_missing'] = [ref3] | ||
| data_miss['ref_missing_pred'] = [ref3] |
Test is currently codifying a regression in missing-case handling.
Line 46 switches to ref_missing_pred, but ProcessEvaluation still checks for ref_missing before calling complete_missing_cases() (in MetricsReloaded/processes/overall_process.py Lines 359-365).
Because of that, Line 76’s new expectation [2,8] likely reflects the broken path (missing case not appended) rather than intended behavior. Please keep this test asserting the missing-case completion result (previously [3,8]) and remove the temporary “to modify” expectation.
[suggested fix in test]
Patch to keep regression coverage strict
- assert_allclose(pe.grouped_lab.shape,[2,8]) # to modify
+ assert_allclose(pe.grouped_lab.shape,[3,8])
Also applies to: 76-76
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/test_processes/test_overall_process.py` at line 46, The test changed the
input key to 'ref_missing_pred' which prevents ProcessEvaluation (in
MetricsReloaded/processes/overall_process.py) from detecting ref_missing and
calling complete_missing_cases(), causing the assertion to reflect the broken
path; modify the test to restore the original key name 'ref_missing' for
data_miss (i.e., revert data_miss['ref_missing_pred'] back to
data_miss['ref_missing']) and restore the expected result to [3,8] so the test
asserts the completed-missing-cases behavior checked by
ProcessEvaluation.complete_missing_cases().
Actionable comments posted: 8
♻️ Duplicate comments (4)
test/test_processes/test_overall_process.py (1)
39-39: ⚠️ Potential issue | 🟠 Major
Test expectation change may codify a regression in missing-case handling.
The fixture key was changed from `ref_missing` to `ref_missing_pred` (line 39), and the expected shape changed from `[3,8]` to `[2,8]` (line 69). Combined with the constructor gate still checking `ref_missing` in `overall_process.py` (line 360), this test now passes because `complete_missing_cases()` is never called.
The `# to modify` comment suggests this is a known interim state. Once the key inconsistency in `overall_process.py` is fixed, restore the expected shape to `[3,8]`.
Also applies to: 69-69
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/test_processes/test_overall_process.py` at line 39, The test switched the fixture key from "ref_missing" to "ref_missing_pred" and adjusted the expected output shape to [2,8], which masks a bug: the constructor in overall_process.py still checks for "ref_missing" so complete_missing_cases() never runs; update overall_process.py to look for the new key "ref_missing_pred" (or make both keys acceptable) in the constructor/gating logic that determines whether to call complete_missing_cases(), then revert the test expectation back to [3,8] to reflect the correct missing-case handling; ensure references to "ref_missing" and "ref_missing_pred" and the method complete_missing_cases() are consistently updated.
MetricsReloaded/processes/overall_process.py (1)
360-361: ⚠️ Potential issue | 🟠 Major
Key name inconsistency between constructor gate and `complete_missing_cases()`.
The constructor at line 360 checks for `'ref_missing'` in `self.data.keys()`, but `complete_missing_cases()` at line 585 uses `self.data['ref_missing_pred']`. This mismatch can cause:
- The method to be called when data contains `ref_missing` but then fail with `KeyError` because `ref_missing_pred` doesn't exist
- The method to be skipped entirely if only `ref_missing_pred` is provided (not `ref_missing`)
💡 Proposed fix
  if self.flag_valid:
      self.process_data()
- if 'ref_missing' in self.data.keys():
+ if 'ref_missing_pred' in self.data.keys():
      self.complete_missing_cases()
Or unify both keys with a fallback:
  def complete_missing_cases(self):
-     if len(self.data['ref_missing_pred']) == 0:
+     missing_cases = self.data.get('ref_missing_pred', self.data.get('ref_missing', []))
+     if len(missing_cases) == 0:
          return
Also applies to: 585-586
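The fallback variant shown above reduces to a nested `dict.get`; a standalone sketch of just the lookup (helper name is illustrative, not from the library):

```python
def get_missing_cases(data):
    # Prefer the new key, fall back to the legacy one, default to empty.
    # Note: if 'ref_missing_pred' is present but empty, it still wins.
    return data.get("ref_missing_pred", data.get("ref_missing", []))
```

This makes the method tolerant of either key while keeping a single code path for the empty-list early return.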
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@MetricsReloaded/processes/overall_process.py` around lines 360 - 361, The constructor currently checks for 'ref_missing' in self.data but complete_missing_cases() reads self.data['ref_missing_pred'], causing KeyError or skipped execution; update the gate and the method to use a single canonical key (e.g., 'ref_missing_pred') or add a fallback: modify the initial conditional that decides to call complete_missing_cases() to check for either 'ref_missing' or 'ref_missing_pred', and inside complete_missing_cases() replace direct indexing self.data['ref_missing_pred'] with a safe lookup (e.g., self.data.get('ref_missing_pred', self.data.get('ref_missing'))) so the method always finds the data regardless of which key was provided.MetricsReloaded/metrics/pairwise_measures.py (2)
142-148: ⚠️ Potential issue | 🟠 Major
Guard `normalised_expected_cost()` against zero/invalid naive baseline before division.
The `normalised_expected_cost()` method divides by `naive_cost` at line 148 without checking if it's zero or NaN, which can produce `inf` or `nan` for degenerate cost setups. Now that `"nec"` is added to the measures_dict (line 89), this becomes part of the default metric surface.
💡 Proposed fix
  def normalised_expected_cost(self):
      """
      Calculates the normalised expected cost as the ratio of the expected cost to the naive expected cost.
      """
      naive_cost = self.best_naive_ec()
      ec = self.expected_cost()
+     if np.isnan(naive_cost) or naive_cost == 0:
+         warnings.warn("Naive expected cost is zero/undefined; NEC is not defined")
+         return np.nan
      return ec / naive_cost
Verify each finding against the current code and only fix it if needed. In `@MetricsReloaded/metrics/pairwise_measures.py` around lines 142 - 148, normalised_expected_cost currently divides ec by naive_cost without guarding for zero/invalid baselines; update the normalised_expected_cost method to check the value returned by best_naive_ec() (and handle None) and guard against zero/NaN before performing the division (use math.isnan or numpy.isnan as appropriate) and return a safe sentinel (e.g. float('nan') or raise a clear ValueError) instead of producing inf/nan; ensure this change keeps the existing call to expected_cost() and note that "nec" in measures_dict will now behave safely when the naive baseline is degenerate.
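The guard pattern is independent of the metric itself; a minimal stdlib sketch (the helper name is illustrative, not from the library):

```python
import math
import warnings

def safe_ratio(numerator, denominator):
    # Return NaN with a warning instead of letting a degenerate
    # baseline produce inf, nan, or a ZeroDivisionError downstream.
    if denominator is None or math.isnan(denominator) or denominator == 0:
        warnings.warn("baseline is zero/undefined; ratio is not defined", stacklevel=2)
        return float("nan")
    return numerator / denominator
```

Returning NaN (rather than raising) matches how the surrounding metrics signal an undefined value, so downstream aggregation can skip it.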
308-308: ⚠️ Potential issue | 🟡 Minor
Complete the `empty` parameter documentation.
The `:param empty:` entry is still missing its description, leaving the PR's "all parameters explained" objective incomplete.
💡 Proposed fix
- :param empty:
+ :param empty: Flag indicating whether empty-reference/prediction handling is enabled for metric computation.
Verify each finding against the current code and only fix it if needed. In `@MetricsReloaded/metrics/pairwise_measures.py` at line 308, Complete the missing :param empty: docstring for the function in MetricsReloaded/metrics/pairwise_measures.py by describing what the empty parameter controls (how the function behaves when given empty inputs), enumerate the accepted values/types for empty and the effect of each option (e.g., return value, raise error, or fill with a sentinel), and state the default value; reference the parameter name "empty" in the description so future readers can locate it easily.
🧹 Nitpick comments (7)
MetricsReloaded/processes/overall_process.py (3)
643-643: Avoid mutable default argument `dict_args={}`.
Mutable default arguments are shared across all calls and can lead to unexpected behavior if modified.
♻️ Proposed fix
- def label_aggregation(self, option='average',dict_args={}):
+ def label_aggregation(self, option='average', dict_args=None):
+     if dict_args is None:
+         dict_args = {}
Verify each finding against the current code and only fix it if needed. In `@MetricsReloaded/processes/overall_process.py` at line 643, The function label_aggregation currently uses a mutable default argument dict_args={} which can be shared across calls; change the signature of label_aggregation (the method named label_aggregation) to use dict_args=None and inside the method set dict_args = {} if dict_args is None, updating any logic that expects dict_args to exist; ensure any callers that rely on a persistent dict are updated accordingly so no behavior changes occur.
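The hazard behind this nitpick is easy to demonstrate in isolation (hypothetical functions, not the library's):

```python
def bad_aggregate(label, dict_args={}):
    # Anti-pattern: the SAME dict object is shared across every call,
    # so state written here leaks into later calls.
    dict_args["calls"] = dict_args.get("calls", 0) + 1
    return dict_args["calls"]

def good_aggregate(label, dict_args=None):
    # Fix: create the default inside the body, once per call.
    if dict_args is None:
        dict_args = {}
    dict_args["calls"] = dict_args.get("calls", 0) + 1
    return dict_args["calls"]
```

The default dict is evaluated once at function definition time, which is why the first version accumulates state between unrelated calls.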
597-597: Rename unused loop variable `f` to `_f`.
The loop variable `f` representing the filename/case is not used within the loop body. Prefix with underscore to indicate it's intentionally unused.
♻️ Proposed fix
- for (i,f) in enumerate(self.data['ref_missing_pred']):
+ for (i, _f) in enumerate(self.data['ref_missing_pred']):
Verify each finding against the current code and only fix it if needed. In `@MetricsReloaded/processes/overall_process.py` at line 597, The loop in overall_process.py iterates with (i, f) over self.data['ref_missing_pred'] but never uses f; rename the unused variable to _f to signal it's intentionally unused (i,_f) in the for loop header (the loop that currently reads "for (i,f) in enumerate(self.data['ref_missing_pred']):") so linters and readers understand it is unused.
561-562: Add `stacklevel=2` to the `warnings.warn()` call.
Without explicit stacklevel, the warning will point to this line rather than the caller's location, making debugging harder.
♻️ Proposed fix
- warnings.warn("No need to identify empty reference with image level classification - only suitable for instance segmentation, object detection and image segmentation")
+ warnings.warn("No need to identify empty reference with image level classification - only suitable for instance segmentation, object detection and image segmentation", stacklevel=2)
Verify each finding against the current code and only fix it if needed. In `@MetricsReloaded/processes/overall_process.py` around lines 561 - 562, Update the warnings.warn call so the warning originates at the caller by adding stacklevel=2: in the block that currently calls warnings.warn("No need to identify empty reference with image level classification - only suitable for instance segmentation, object detection and image segmentation") (just before returning list_empty), pass stacklevel=2 as an additional argument to warnings.warn; keep the message and return behavior unchanged.
test/test_utility/test_utils.py (2)
107-107: Prefix unused unpacked variables with underscore.
Variables `list_ind_lab` and `list_com` are unpacked but never used. Prefix with underscore to indicate intentional disuse.
♻️ Proposed fix
- list_ind_lab, list_volumes, list_com = mo.list_foreground_component()
+ _list_ind_lab, list_volumes, _list_com = mo.list_foreground_component()
Verify each finding against the current code and only fix it if needed. In `@test/test_utility/test_utils.py` at line 107, The test unpacks three values from mo.list_foreground_component() but only uses list_volumes; rename the unused unpacked variables to start with an underscore (e.g., _list_ind_lab and _list_com) to signal intentional non-use and silence linter warnings; update the unpacking assignment where list_foreground_component() is called in the test (replace list_ind_lab, list_volumes, list_com with _list_ind_lab, list_volumes, _list_com).
82-86: Remove debug print statements from test.
These print statements add noise to test output and were likely added during development/debugging.
♻️ Proposed fix
- print(df_sensdscnsd)
- print(val_merge4)
- print(df3)
- print(df5)
- print(df4)
-
Verify each finding against the current code and only fix it if needed. In `@test/test_utility/test_utils.py` around lines 82 - 86, Remove the debug print statements in the test that print df_sensdscnsd, val_merge4, df3, df5, and df4; locate the prints (print(df_sensdscnsd), print(val_merge4), print(df3), print(df5), print(df4)) in test_utils.py and delete them so the test output is clean, leaving any assertions or setup code intact.
test/test_metrics/test_pairwise_measures.py (1)
791-791: Consider fixing the dict_args key instead of commenting out.
The commented line uses `dict_args={"fbeta": 2}`, but per the implementation in `pairwise_measures.py` (line 908), the expected key is `"beta"`, not `"fbeta"`. Consider uncommenting with the correct key to maintain test coverage for custom beta values.
♻️ Proposed fix
- # bpm2 = PM(p_pred, p_ref, dict_args={"fbeta": 2})
+ bpm2 = PM(p_pred, p_ref, dict_args={"beta": 2})
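The failure mode behind this nitpick — a misspelled dict_args key being silently ignored — can be shown with a toy lookup (hypothetical helper, not the library's PM class):

```python
def resolve_beta(dict_args=None):
    # The implementation reads "beta"; any other key falls through
    # to the default without an error, which is how the typo hid.
    dict_args = dict_args or {}
    return dict_args.get("beta", 1)
```

Because `dict.get` swallows unknown keys, a test passing `{"fbeta": 2}` exercises only the default path, which is why restoring the test with the correct key matters.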
Verify each finding against the current code and only fix it if needed. In `@test/test_metrics/test_pairwise_measures.py` at line 791, Uncomment the test line that constructs a PM with custom beta and fix the dict_args key from "fbeta" to "beta" so it matches the PM implementation; specifically update the commented call to PM(p_pred, p_ref, dict_args={"beta": 2}) in the test for pairwise measures to restore coverage of custom-beta behavior.
MetricsReloaded/utility/utils.py (1)
521-533: Remove debug `print()` calls from `combine_df()`.
Similar to `merge_list_df()`, these print statements will pollute library output.
♻️ Proposed fix
  if df1 is None or df1.shape[0]==0:
-     print('Nothing in first')
      if df2 is None:
          return None
@@
  else:
-     print("Performing concatenation")
      return pd.concat([df1, df2])
Verify each finding against the current code and only fix it if needed. In `@MetricsReloaded/utility/utils.py` around lines 521 - 533, The combine_df function contains debug print() calls ("Nothing in first" and "Performing concatenation") that should be removed to avoid polluting library output; edit the combine_df implementation to delete those print() lines (leaving all conditional logic and returns intact) or replace them with appropriate logger.debug calls if project logging is required, ensuring behavior of combine_df and interplay with merge_list_df remains unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@MetricsReloaded/processes/mixed_measures_processes.py`:
- Around line 850-854: Fix the typo in the method docstring that currently reads
"mutilabel"—update it to "multilabel" in the function whose docstring describes
creation of the multilabel results and returns pd_mcc, pd_cal so the summary
reads "Creation of the multilabel results with multilabel counting metrics and
calibration metrics returned as separate dictionaries".
- Around line 551-555: The docstring's return type is incorrect: it currently
lists "dict_det, dict_mt" but the function actually returns two pandas DataFrame
objects; update the Returns section to state that the function returns two
pd.DataFrame (e.g., "Returns: Tuple[pd.DataFrame, pd.DataFrame]" or a brief
description like "Two pandas DataFrames: dict_det and dict_mt") and mention the
variable names dict_det and dict_mt so readers can map names to types; adjust
any type annotations or import hints in the docstring accordingly.
- Around line 736-740: Update the docstring return type: the current return line
"dict_bin, dict_mt" is incorrect—these variables are pandas DataFrame objects,
not dicts; modify the docstring in the function that builds and returns dict_bin
and dict_mt to state it returns two pd.DataFrame objects (e.g., "Returns:
tuple[pd.DataFrame, pd.DataFrame]: dict_bin, dict_mt") and import or reference
pandas as pd in the docstring for clarity.
- Around line 313-318: Update the processing-steps docstring in
MetricsReloaded/processes/mixed_measures_processes.py by correcting the typo
"assigment" to "assignment" in the numbered list (the paragraph describing steps
1–5). Locate the triple-quoted docstring that contains the list including
"assigment" and replace that word so the entry reads "assignment procedure based
on the segmentation images".
In `@test/test_metrics/test_prob_pairwise_measures.py`:
- Line 32: The unpacking from ppm.all_multi_threshold_values() currently binds
unused variables (fp, se, sp, pp); rename those to underscore-prefixed names to
satisfy Ruff RUF059. Update the unpackings (the statement calling
ppm.all_multi_threshold_values()) so only the used variable remains
non-underscored (e.g., keep t) and prefix the other bindings with an underscore
(e.g., _se, _sp, _pp, _fp) at each occurrence (the lines where t, se, sp, pp are
assigned and the other two reported occurrences).
- Around line 79-80: Replace brittle exact-equality float assertions with
tolerant comparisons: instead of `assert value_test == expected_value` and
`assert value_test2 == expected_value`, use numpy.testing.assert_allclose (or
pytest.approx) to compare `value_test` and `value_test2` to `expected_value`
with a reasonable absolute tolerance (e.g., atol=1e-8) to stabilize tests; apply
the same change to the other listed assertions (lines referencing the same
pattern).
- Line 4: Remove the debug-only print() calls in test_prob_pairwise_measures.py
(the three prints mentioned) and remove the debug-only call to
precision_recall_curve (imported as prc) that exists solely to print results; if
that prc() call assigns to variables that are unused after removal, delete those
assignments as well so the test contains only assertions and necessary
computations (refer to the test functions in this module and the prc import to
locate the debug statements).
In `@test/test_utility/test_utils.py`:
- Around line 11-13: Rename the local variable named `map` to `test_map` to
avoid shadowing Python's builtin `map`, and update all references to it
(including in the tests `test_foreground_component` and
`test_list_foreground_component`) to use `test_map` so the tests use the new
variable name consistently.
---
Duplicate comments:
In `@MetricsReloaded/metrics/pairwise_measures.py`:
- Around line 142-148: normalised_expected_cost currently divides ec by
naive_cost without guarding for zero/invalid baselines; update the
normalised_expected_cost method to check the value returned by best_naive_ec()
(and handle None) and guard against zero/NaN before performing the division (use
math.isnan or numpy.isnan as appropriate) and return a safe sentinel (e.g.
float('nan') or raise a clear ValueError) instead of producing inf/nan; ensure
this change keeps the existing call to expected_cost() and note that "nec" in
measures_dict will now behave safely when the naive baseline is degenerate.
- Line 308: Complete the missing :param empty: docstring for the function in
MetricsReloaded/metrics/pairwise_measures.py by describing what the empty
parameter controls (how the function behaves when given empty inputs), enumerate
the accepted values/types for empty and the effect of each option (e.g., return
value, raise error, or fill with a sentinel), and state the default value;
reference the parameter name "empty" in the description so future readers can
locate it easily.
In `@MetricsReloaded/processes/overall_process.py`:
- Around line 360-361: The constructor currently checks for 'ref_missing' in
self.data but complete_missing_cases() reads self.data['ref_missing_pred'],
causing KeyError or skipped execution; update the gate and the method to use a
single canonical key (e.g., 'ref_missing_pred') or add a fallback: modify the
initial conditional that decides to call complete_missing_cases() to check for
either 'ref_missing' or 'ref_missing_pred', and inside complete_missing_cases()
replace direct indexing self.data['ref_missing_pred'] with a safe lookup (e.g.,
self.data.get('ref_missing_pred', self.data.get('ref_missing'))) so the method
always finds the data regardless of which key was provided.
In `@test/test_processes/test_overall_process.py`:
- Line 39: The test switched the fixture key from "ref_missing" to
"ref_missing_pred" and adjusted the expected output shape to [2,8], which masks
a bug: the constructor in overall_process.py still checks for "ref_missing" so
complete_missing_cases() never runs; update overall_process.py to look for the
new key "ref_missing_pred" (or make both keys acceptable) in the
constructor/gating logic that determines whether to call
complete_missing_cases(), then revert the test expectation back to [3,8] to
reflect the correct missing-case handling; ensure references to "ref_missing"
and "ref_missing_pred" and the method complete_missing_cases() are consistently
updated.
---
Nitpick comments:
In `@MetricsReloaded/processes/overall_process.py`:
- Line 643: The function label_aggregation currently uses a mutable default
argument dict_args={} which can be shared across calls; change the signature of
label_aggregation (the method named label_aggregation) to use dict_args=None and
inside the method set dict_args = {} if dict_args is None, updating any logic
that expects dict_args to exist; ensure any callers that rely on a persistent
dict are updated accordingly so no behavior changes occur.
- Line 597: The loop in overall_process.py iterates with (i, f) over
self.data['ref_missing_pred'] but never uses f; rename the unused variable to _f
to signal it's intentionally unused (i,_f) in the for loop header (the loop that
currently reads "for (i,f) in enumerate(self.data['ref_missing_pred']):") so
linters and readers understand it is unused.
- Around line 561-562: Update the warnings.warn call so the warning originates
at the caller by adding stacklevel=2: in the block that currently calls
warnings.warn("No need to identify empty reference with image level
classification - only suitable for instance segmentation, object detection and
image segmentation") (just before returning list_empty), pass stacklevel=2 as an
additional argument to warnings.warn; keep the message and return behavior
unchanged.
In `@MetricsReloaded/utility/utils.py`:
- Around line 521-533: The combine_df function contains debug print() calls
("Nothing in first" and "Performing concatenation") that should be removed to
avoid polluting library output; edit the combine_df implementation to delete
those print() lines (leaving all conditional logic and returns intact) or
replace them with appropriate logger.debug calls if project logging is required,
ensuring behavior of combine_df and interplay with merge_list_df remains
unchanged.
In `@test/test_metrics/test_pairwise_measures.py`:
- Line 791: Uncomment the test line that constructs a PM with custom beta and
fix the dict_args key from "fbeta" to "beta" so it matches the PM
implementation; specifically update the commented call to PM(p_pred, p_ref,
dict_args={"beta": 2}) in the test for pairwise measures to restore coverage of
custom-beta behavior.
In `@test/test_utility/test_utils.py`:
- Line 107: The test unpacks three values from mo.list_foreground_component()
but only uses list_volumes; rename the unused unpacked variables to start with
an underscore (e.g., _list_ind_lab and _list_com) to signal intentional non-use
and silence linter warnings; update the unpacking assignment where
list_foreground_component() is called in the test (replace list_ind_lab,
list_volumes, list_com with _list_ind_lab, list_volumes, _list_com).
- Around line 82-86: Remove the debug print statements in the test that print
df_sensdscnsd, val_merge4, df3, df5, and df4; locate the prints
(print(df_sensdscnsd), print(val_merge4), print(df3), print(df5), print(df4)) in
test_utils.py and delete them so the test output is clean, leaving any
assertions or setup code intact.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: dc66db1f-4acd-4eac-887d-077b7aa97055
📒 Files selected for processing (14)
- MetricsReloaded/metrics/calibration_measures.py
- MetricsReloaded/metrics/pairwise_measures.py
- MetricsReloaded/metrics/prob_pairwise_measures.py
- MetricsReloaded/processes/mixed_measures_processes.py
- MetricsReloaded/processes/overall_process.py
- MetricsReloaded/utility/assignment_localization.py
- MetricsReloaded/utility/utils.py
- test/test_metrics/test_calibration_metrics.py
- test/test_metrics/test_pairwise_measures.py
- test/test_metrics/test_prob_pairwise_measures.py
- test/test_processes/test_mixed_measures_processes.py
- test/test_processes/test_overall_process.py
- test/test_utility/test_assignment_localization.py
- test/test_utility/test_utils.py
💤 Files with no reviewable changes (1)
- test/test_processes/test_mixed_measures_processes.py
✅ Files skipped from review due to trivial changes (3)
- test/test_metrics/test_calibration_metrics.py
- test/test_utility/test_assignment_localization.py
- MetricsReloaded/utility/assignment_localization.py
🚧 Files skipped from review as they are similar to previous changes (1)
- MetricsReloaded/metrics/prob_pairwise_measures.py
|
|
||
| 1. identification for each predicted case of the items considered as of the class specified by label considered | ||
| 2. identification for each associated case, the items considered as of the class specified by the considered label | ||
| 3. creation of the associated list of individual images of elements selected beforehand both in the prediction images and the reference images (the images are listed in the same order with one element per image) | ||
| 4. assigment procedure based on the segmentation images | ||
| 5. derivation of metrics either on a case by case basis or grouping all cases together. |
Fix typo in processing steps docstring.
Line 317 has a typo (assigment → assignment).
Suggested doc fix
- 4. assigment procedure based on the segmentation images
+ 4. assignment procedure based on the segmentation images
📝 Committable suggestion
1. identification for each predicted case of the items considered as of the class specified by label considered
2. identification for each associated case, the items considered as of the class specified by the considered label
3. creation of the associated list of individual images of elements selected beforehand both in the prediction images and the reference images (the images are listed in the same order with one element per image)
4. assigment procedure based on the segmentation images
5. derivation of metrics either on a case by case basis or grouping all cases together.

1. identification for each predicted case of the items considered as of the class specified by label considered
2. identification for each associated case, the items considered as of the class specified by the considered label
3. creation of the associated list of individual images of elements selected beforehand both in the prediction images and the reference images (the images are listed in the same order with one element per image)
4. assignment procedure based on the segmentation images
5. derivation of metrics either on a case by case basis or grouping all cases together.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@MetricsReloaded/processes/mixed_measures_processes.py` around lines 313 -
318, Update the processing-steps docstring in
MetricsReloaded/processes/mixed_measures_processes.py by correcting the typo
"assigment" to "assignment" in the numbered list (the paragraph describing steps
1–5). Locate the triple-quoted docstring that contains the list including
"assigment" and replace that word so the entry reads "assignment procedure based
on the segmentation images".
| """ | ||
| Process allowing for the creation of dictionaries with the resulting measures for per label detection metrics and multi threshold metrics | ||
|
|
||
| :return: dict_det, dict_mt | ||
| """ |
Return type in docstring is inaccurate.
Line 554 says dict_det, dict_mt, but this method returns two pd.DataFrame objects.
Suggested doc fix
- :return: dict_det, dict_mt
+ :return: pd.DataFrame, pd.DataFrame
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@MetricsReloaded/processes/mixed_measures_processes.py` around lines 551 -
555, The docstring's return type is incorrect: it currently lists "dict_det,
dict_mt" but the function actually returns two pandas DataFrame objects; update
the Returns section to state that the function returns two pd.DataFrame (e.g.,
"Returns: Tuple[pd.DataFrame, pd.DataFrame]" or a brief description like "Two
pandas DataFrames: dict_det and dict_mt") and mention the variable names
dict_det and dict_mt so readers can map names to types; adjust any type
annotations or import hints in the docstring accordingly.
| """ | ||
| Process allowing for the creation of dictionaries with the resulting measures for per label binary pairwise metrics and multi threshold metrics | ||
|
|
||
| :return: dict_bin, dict_mt | ||
| """ |
Return type in docstring is inaccurate.
Line 739 says dict_bin, dict_mt, but this method returns two pd.DataFrame objects.
Suggested doc fix
- :return: dict_bin, dict_mt
+ :return: pd.DataFrame, pd.DataFrame
📝 Committable suggestion
| """ | |
| Process allowing for the creation of dictionaries with the resulting measures for per label binary pairwise metrics and multi threshold metrics | |
| :return: dict_bin, dict_mt | |
| """ | |
| """ | |
| Process allowing for the creation of dictionaries with the resulting measures for per label binary pairwise metrics and multi threshold metrics | |
| :return: pd.DataFrame, pd.DataFrame | |
| """ |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@MetricsReloaded/processes/mixed_measures_processes.py` around lines 736 -
740, Update the docstring return type: the current return line "dict_bin,
dict_mt" is incorrect—these variables are pandas DataFrame objects, not dicts;
modify the docstring in the function that builds and returns dict_bin and
dict_mt to state it returns two pd.DataFrame objects (e.g., "Returns:
tuple[pd.DataFrame, pd.DataFrame]: dict_bin, dict_mt") and import or reference
pandas as pd in the docstring for clarity.
| """ | ||
| Creation of the multilabel results with mutilabel counting metrics and calibration metrics returned as separate dictionaries | ||
|
|
||
| :return: pd_mcc, pd_cal | ||
| """ |
Fix typo in method summary.
Line 851 uses mutilabel; should be multilabel.
Suggested doc fix
- Creation of the multilabel results with mutilabel counting metrics and calibration metrics returned as separate dictionaries
+ Creation of the multilabel results with multilabel counting metrics and calibration metrics returned as separate dictionaries
📝 Committable suggestion
| """ | |
| Creation of the multilabel results with mutilabel counting metrics and calibration metrics returned as separate dictionaries | |
| :return: pd_mcc, pd_cal | |
| """ | |
| """ | |
| Creation of the multilabel results with multilabel counting metrics and calibration metrics returned as separate dictionaries | |
| :return: pd_mcc, pd_cal | |
| """ |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@MetricsReloaded/processes/mixed_measures_processes.py` around lines 850 -
854, Fix the typo in the method docstring that currently reads
"mutilabel"—update it to "multilabel" in the function whose docstring describes
creation of the multilabel results and returns pd_mcc, pd_cal so the summary
reads "Creation of the multilabel results with multilabel counting metrics and
calibration metrics returned as separate dictionaries".
from numpy.testing import assert_allclose
from sklearn.metrics import cohen_kappa_score as cks
from sklearn.metrics import matthews_corrcoef as mcc
from sklearn.metrics import precision_recall_curve as prc
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Candidate dependency files =="
fd -i 'pyproject.toml|setup.py|setup.cfg|requirements.*|environment.*\.ya?ml'
echo
echo "== Mentions of sklearn/scikit-learn in dependency files =="
rg -n --iglob 'pyproject.toml' --iglob 'setup.py' --iglob 'setup.cfg' --iglob 'requirements*' --iglob 'environment*.yml' --iglob 'environment*.yaml' 'scikit-learn|sklearn' || true
echo
echo "== precision_recall_curve + print usage in this test file =="
rg -n -C2 'precision_recall_curve|prc\(|print\(' test/test_metrics/test_prob_pairwise_measures.py
Repository: Project-MONAI/MetricsReloaded
Length of output: 1238
Remove debug-only print statements from this test module.
The three print() calls on lines 122, 140, and 151 serve only for debugging and add noise to CI output. Line 140's prc() call is exclusively used for debug printing. These can be safely removed.
♻️ Proposed cleanup
-from sklearn.metrics import precision_recall_curve as prc
@@
- print(value_test)
@@
- print("From SK", prc(ref, pred_proba))
@@
- print(dict_ppm.keys())
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/test_metrics/test_prob_pairwise_measures.py` at line 4, Remove the
debug-only print() calls in test_prob_pairwise_measures.py (the three prints
mentioned) and remove the debug-only call to precision_recall_curve (imported as
prc) that exists solely to print results; if that prc() call assigns to
variables that are unused after removal, delete those assignments as well so the
test contains only assertions and necessary computations (refer to the test
functions in this module and the prc import to locate the debug statements).
list_ppv_exp = [1, 1, 1, 0.71, 0.5]
unique_thresh = [1.9, 0.9, 0.65, 0.45, 0.2]

t, se, sp, pp, fp = ppm.all_multi_threshold_values()
Address unused unpacked values flagged by Ruff (RUF059).
Unused unpacked variables (fp, se, sp, pp) should be prefixed with _ to keep lint clean.
🧹 Minimal lint-safe edits
- t, se, sp, pp, fp = ppm.all_multi_threshold_values()
+ t, se, sp, pp, _fp = ppm.all_multi_threshold_values()
@@
- t, se, sp, pp, fp = ppm.all_multi_threshold_values()
+ t, se, sp, pp, _fp = ppm.all_multi_threshold_values()
@@
- t, se, sp, pp, fp = ppm.all_multi_threshold_values(max_number_samples=50, max_number_thresh=10)
+ t, _se, _sp, _pp, _fp = ppm.all_multi_threshold_values(max_number_samples=50, max_number_thresh=10)
Also applies to: 54-54, 108-108
🧰 Tools
🪛 Ruff (0.15.10)
[warning] 32-32: Unpacked variable fp is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/test_metrics/test_prob_pairwise_measures.py` at line 32, The unpacking
from ppm.all_multi_threshold_values() currently binds unused variables (fp, se,
sp, pp); rename those to underscore-prefixed names to satisfy Ruff RUF059.
Update the unpackings (the statement calling ppm.all_multi_threshold_values())
so only the used variable remains non-underscored (e.g., keep t) and prefix the
other bindings with an underscore (e.g., _se, _sp, _pp, _fp) at each occurrence
(the lines where t, se, sp, pp are assigned and the other two reported
occurrences).
assert value_test == expected_value
assert value_test2 == expected_value
Use tolerant assertions for floating-point metric outputs.
Direct == on floats is brittle; these should use assert_allclose(..., atol=...) for stability.
✅ Suggested assertion pattern
- assert value_test == expected_value
- assert value_test2 == expected_value
+ assert_allclose(value_test, expected_value, atol=1e-8)
+ assert_allclose(value_test2, expected_value, atol=1e-8)
Also applies to: 194-194, 206-206, 218-218, 230-230, 241-241, 253-253
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/test_metrics/test_prob_pairwise_measures.py` around lines 79 - 80,
Replace brittle exact-equality float assertions with tolerant comparisons:
instead of `assert value_test == expected_value` and `assert value_test2 ==
expected_value`, use numpy.testing.assert_allclose (or pytest.approx) to compare
`value_test` and `value_test2` to `expected_value` with a reasonable absolute
tolerance (e.g., atol=1e-8) to stabilize tests; apply the same change to the
other listed assertions (lines referencing the same pattern).
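The brittleness of exact float equality is easy to demonstrate; a minimal sketch with illustrative values (not taken from the test suite):

```python
from numpy.testing import assert_allclose

value = 0.1 + 0.2   # binary floating point gives 0.30000000000000004
expected = 0.3

# Exact equality fails even though the values are mathematically equal.
print(value == expected)  # False

# A tolerant comparison with an absolute tolerance passes.
assert_allclose(value, expected, atol=1e-8)
```

The same pattern applies to any metric computed through floating-point arithmetic: `pytest.approx` would be an equally reasonable choice.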
map = np.zeros([10,10])
map[3:5,5:8] = 1
map[7:9,0:3] = 1
Rename map to avoid shadowing Python builtin.
The variable name map shadows the built-in map() function, which could cause confusion and unexpected behavior if map() is used later in the file.
♻️ Proposed fix
-map = np.zeros([10,10])
-map[3:5,5:8] = 1
-map[7:9,0:3] = 1
+test_map = np.zeros([10,10])
+test_map[3:5,5:8] = 1
+test_map[7:9,0:3] = 1
Also update references in test_foreground_component and test_list_foreground_component to use test_map.
🧰 Tools
🪛 Ruff (0.15.10)
[error] 11-11: Variable map is shadowing a Python builtin
(A001)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/test_utility/test_utils.py` around lines 11 - 13, Rename the local
variable named `map` to `test_map` to avoid shadowing Python's builtin `map`,
and update all references to it (including in the tests
`test_foreground_component` and `test_list_foreground_component`) to use
`test_map` so the tests use the new variable name consistently.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
MetricsReloaded/utility/assignment_localization.py (1)
623-630: ⚠️ Potential issue | 🟠 Major
Operator precedence bug: condition may not work as intended.
Line 626 has an operator precedence issue. The `and` in `df_ambiguous_ref.shape[0] == 0 and df_ambiguous_pred.shape[0] == 0` binds more tightly than the preceding `or`, causing unintended logic. The likely intent is to check whether either `df_ambiguous_ref` is None/empty OR `df_ambiguous_pred` is empty.
🐛 Proposed fix to add parentheses for clarity
  if (
      df_ambiguous_ref is None
      or df_ambiguous_ref.shape[0] == 0
-     and df_ambiguous_pred.shape[0] == 0
+     or df_ambiguous_pred.shape[0] == 0
  ):
Or if the intent is to require both to be empty:
  if (
      df_ambiguous_ref is None
-     or df_ambiguous_ref.shape[0] == 0
-     and df_ambiguous_pred.shape[0] == 0
+     or (df_ambiguous_ref.shape[0] == 0 and df_ambiguous_pred.shape[0] == 0)
  ):
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@MetricsReloaded/utility/assignment_localization.py` around lines 623 - 630, The conditional has an operator-precedence bug: change the test so the checks for df_ambiguous_ref being None/empty are grouped before the OR with df_ambiguous_pred; e.g., replace the current condition with one that explicitly groups (df_ambiguous_ref is None or df_ambiguous_ref.shape[0] == 0) or df_ambiguous_pred.shape[0] == 0 so the intent "either df_ambiguous_ref is missing/empty OR df_ambiguous_pred is empty" is enforced; update the if in assignment_localization.py (the block that prints "No ambiguity in matching" and returns df_matching_all) accordingly.
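The ambiguity comes from Python giving `and` higher precedence than `or`; a quick illustration with plain booleans:

```python
# `and` binds tighter than `or`: A or B and C parses as A or (B and C).
A, B, C = True, False, False

implicit = A or B and C    # A or (B and C) -> True (short-circuits on A)
explicit = (A or B) and C  # different grouping -> False

print(implicit, explicit)  # True False
```

This is why explicit parentheses, as in the proposed fix above, make the intended grouping unambiguous.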
🧹 Nitpick comments (6)
test/test_utility/test_assignment_localization.py (3)
415-415: Remove commented-out dead code.
If `m22` is no longer needed for assertions, remove the commented line.
♻️ Suggested cleanup
  m12 = match1[match1["label"] == 1]
  m21 = match2[match2["label"] == 0]
- # m22 = match2[match2["label"] == 1]
  print(m12)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/test_utility/test_assignment_localization.py` at line 415, Remove the commented-out dead code line referencing m22 (the commented line "# m22 = match2[match2[\"label\"] == 1]") from test_assignment_localization.py; locate the occurrence of m22 in the test function or surrounding block and delete the commented line so the test file contains no unused commented code.
374-376: Prefix unused variables with underscore.
The `list_val`, `list_val_h`, and `list_val_p` variables are unpacked but never used. Prefix them with `_` to indicate they are intentionally unused.
♻️ Suggested fix
- df_match, list_val = asm.resolve_ambiguities_matching()
- df_match_h, list_val_h = asm_h.resolve_ambiguities_matching()
- df_match_p, list_val_p = asm_p.resolve_ambiguities_matching()
+ df_match, _list_val = asm.resolve_ambiguities_matching()
+ df_match_h, _list_val_h = asm_h.resolve_ambiguities_matching()
+ df_match_p, _list_val_p = asm_p.resolve_ambiguities_matching()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/test_utility/test_assignment_localization.py` around lines 374 - 376, The unpacked return values list_val, list_val_h, and list_val_p from calls to resolve_ambiguities_matching() are unused; rename them to _list_val, _list_val_h, and _list_val_p (or simply _ ) where the assignments occur (df_match, _list_val = asm.resolve_ambiguities_matching(), etc.) to indicate intentional unused variables and silence linter warnings.
125-134: Remove unused commented-out variables.
The `ref_com` and `ref_coms` variables are commented out and not used in the test. Consider removing them entirely.
♻️ Suggested cleanup
  def test_check_localization():
      ref_box = [[2,2,4,4]]
-     # ref_com = [[3,3]]
      pred_box = [[2,2,4,4]]
      pred_com = [[3,3]]
      ref_mask = np.zeros([14,14])
      pred_mask = np.zeros([14,14])
      ref_mask[2:5,2:5]=1
      pred_mask[2:5,2:5]=1
      ref_boxes = np.vstack([ref_box])
      ref_masks = np.asarray([ref_mask])
-     # ref_coms = np.vstack([ref_com])
      pred_coms = np.vstack([pred_com])
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/test_utility/test_assignment_localization.py` around lines 125 - 134, The test contains unused commented-out variables ref_com and ref_coms; remove the commented lines referencing ref_com and ref_coms from test_assignment_localization.py (the lines that define "# ref_com = [[3,3]]" and "# ref_coms = np.vstack([ref_com])") so the test only keeps the active variables pred_box, pred_com, ref_mask, pred_mask, ref_boxes, and ref_masks; ensure no other references to ref_com or ref_coms remain in the surrounding test code.MetricsReloaded/utility/assignment_localization.py (3)
379-380: Remove commented-out dead code.
These commented lines appear to be remnants of previous logic. If the code is no longer needed, consider removing it to improve readability. If it's temporarily disabled for debugging, add a TODO comment explaining when to restore it.
♻️ Suggested cleanup
  if self.flag_refmod:
      ref_boxes = self.ref_loc_mod
- # if self.flag_predmod:
- #     pred_points = self.pred_loc_mod
  matrix_pinb = np.zeros([pred_points.shape[0],ref_boxes.shape[0]])
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@MetricsReloaded/utility/assignment_localization.py` around lines 379 - 380, Remove the dead commented lines referencing self.flag_predmod and pred_points = self.pred_loc_mod in assignment_localization.py; either delete them outright to clean up the codebase or, if they’re intentionally disabled temporarily, replace the commented block with a short TODO comment explaining when/why to restore it and referencing the symbols self.flag_predmod and pred_loc_mod so future readers know the intent.
175-181: Add `stacklevel=2` to `warnings.warn()` calls.
Without `stacklevel`, the warning message will point to this internal location rather than the user's call site, making debugging harder.
♻️ Suggested fix
- warnings.warn("No adequate localization strategy chosen - not going ahead")
+ warnings.warn("No adequate localization strategy chosen - not going ahead", stacklevel=2)
  else:
      print(' not valid localisation ')
      self.flag_usable = False
      self.df_matching = None
      self.valid = None
- warnings.warn("No adequate localization strategy chosen - not going ahead")
+ warnings.warn("No adequate localization strategy chosen - not going ahead", stacklevel=2)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@MetricsReloaded/utility/assignment_localization.py` around lines 175 - 181, The two warnings.warn(...) calls in assignment_localization.py (the branch that sets self.flag_usable = False, self.df_matching = None, self.valid = None and the earlier else branch) should include stacklevel=2 so the warning points to the user's call site; update both warnings.warn calls to warnings.warn("No adequate localization strategy chosen - not going ahead", stacklevel=2). Ensure you modify both occurrences that correspond to the block manipulating self.flag_usable, self.df_matching, and self.valid.
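The effect of `stacklevel` can be observed with the standard `warnings` machinery; a small sketch (the helper function here is made up, not the library code):

```python
import warnings

def choose_strategy():
    # stacklevel=2 attributes the warning to the caller of this function
    # rather than to this line inside the library internals.
    warnings.warn("No adequate localization strategy chosen - not going ahead", stacklevel=2)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    choose_strategy()

print(len(caught), caught[0].message)
```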
396-399: Remove commented-out dead code.
Same as above - if this code is no longer needed, remove it for clarity.
♻️ Suggested cleanup
  ref_masks = self.ref_loc
  pred_points = self.pred_loc
- # if self.flag_refmod:
- #     ref_masks = self.ref_loc_mod
- # if self.flag_predmod:
- #     pred_points = self.pred_loc_mod
  matrix_pinm = np.zeros([pred_points.shape[0],ref_masks.shape[0]])
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@MetricsReloaded/utility/assignment_localization.py` around lines 396 - 399, Remove the commented-out dead code relating to conditional assignments for localization modification flags: delete the commented lines referencing self.flag_refmod, ref_masks = self.ref_loc_mod, self.flag_predmod, and pred_points = self.pred_loc_mod in assignment_localization.py; ensure no other code depends on these commented branches (search for self.flag_refmod, self.ref_loc_mod, self.flag_predmod, self.pred_loc_mod) and if needed replace with a clear, minimal comment explaining current behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@MetricsReloaded/utility/assignment_localization.py`:
- Around line 638-642: The DataFrame filter in assignment_localization.py is
wrong due to operator precedence: change the df_tmp assignment so both equality
comparisons are grouped before applying the bitwise AND, e.g. build the boolean
mask as (df_matching["pred"] == list_valid[r]) & (df_matching["ref"] == c) and
use that to index df_matching; update the expression that currently uses
list_valid[r] & (df_matching["ref"] == c) so df_tmp and subsequent list_matching
append use the corrected mask (variables: df_matching, list_valid, r, c, df_tmp,
list_matching).
---
Outside diff comments:
In `@MetricsReloaded/utility/assignment_localization.py`:
- Around line 623-630: The conditional has an operator-precedence bug: change
the test so the checks for df_ambiguous_ref being None/empty are grouped before
the OR with df_ambiguous_pred; e.g., replace the current condition with one that
explicitly groups (df_ambiguous_ref is None or df_ambiguous_ref.shape[0] == 0)
or df_ambiguous_pred.shape[0] == 0 so the intent "either df_ambiguous_ref is
missing/empty OR df_ambiguous_pred is empty" is enforced; update the if in
assignment_localization.py (the block that prints "No ambiguity in matching" and
returns df_matching_all) accordingly.
---
Nitpick comments:
In `@MetricsReloaded/utility/assignment_localization.py`:
- Around line 379-380: Remove the dead commented lines referencing
self.flag_predmod and pred_points = self.pred_loc_mod in
assignment_localization.py; either delete them outright to clean up the codebase
or, if they’re intentionally disabled temporarily, replace the commented block
with a short TODO comment explaining when/why to restore it and referencing the
symbols self.flag_predmod and pred_loc_mod so future readers know the intent.
- Around line 175-181: The two warnings.warn(...) calls in
assignment_localization.py (the branch that sets self.flag_usable = False,
self.df_matching = None, self.valid = None and the earlier else branch) should
include stacklevel=2 so the warning points to the user's call site; update both
warnings.warn calls to warnings.warn("No adequate localization strategy chosen -
not going ahead", stacklevel=2). Ensure you modify both occurrences that
correspond to the block manipulating self.flag_usable, self.df_matching, and
self.valid.
- Around line 396-399: Remove the commented-out dead code relating to
conditional assignments for localization modification flags: delete the
commented lines referencing self.flag_refmod, ref_masks = self.ref_loc_mod,
self.flag_predmod, and pred_points = self.pred_loc_mod in
assignment_localization.py; ensure no other code depends on these commented
branches (search for self.flag_refmod, self.ref_loc_mod, self.flag_predmod,
self.pred_loc_mod) and if needed replace with a clear, minimal comment
explaining current behavior.
In `@test/test_utility/test_assignment_localization.py`:
- Line 415: Remove the commented-out dead code line referencing m22 (the
commented line "# m22 = match2[match2[\"label\"] == 1]") from
test_assignment_localization.py; locate the occurrence of m22 in the test
function or surrounding block and delete the commented line so the test file
contains no unused commented code.
- Around line 374-376: The unpacked return values list_val, list_val_h, and
list_val_p from calls to resolve_ambiguities_matching() are unused; rename them
to _list_val, _list_val_h, and _list_val_p (or simply _ ) where the assignments
occur (df_match, _list_val = asm.resolve_ambiguities_matching(), etc.) to
indicate intentional unused variables and silence linter warnings.
- Around line 125-134: The test contains unused commented-out variables ref_com
and ref_coms; remove the commented lines referencing ref_com and ref_coms from
test_assignment_localization.py (the lines that define "# ref_com = [[3,3]]" and
"# ref_coms = np.vstack([ref_com])") so the test only keeps the active variables
pred_box, pred_com, ref_mask, pred_mask, ref_boxes, and ref_masks; ensure no
other references to ref_com or ref_coms remain in the surrounding test code.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 92a20a0d-c4a9-43d4-9945-1a0dfeab5047
📒 Files selected for processing (2)
- MetricsReloaded/utility/assignment_localization.py
- test/test_utility/test_assignment_localization.py
for (r, c) in zip(row, col):
    df_tmp = df_matching[
-       df_matching["seg"] == list_valid[r] & (df_matching["ref"] == c)
+       df_matching["pred"] == list_valid[r] & (df_matching["ref"] == c)
    ]
    list_matching.append(df_tmp)
Critical: Operator precedence bug in DataFrame filter.
Line 640 has incorrect operator precedence. The & operator binds more tightly than ==, so list_valid[r] & (df_matching["ref"] == c) is evaluated first, producing incorrect filtering results.
🐛 Proposed fix to add parentheses
df_tmp = df_matching[
- df_matching["pred"] == list_valid[r] & (df_matching["ref"] == c)
+ (df_matching["pred"] == list_valid[r]) & (df_matching["ref"] == c)
]
📝 Committable suggestion
for (r, c) in zip(row, col):
    df_tmp = df_matching[
        df_matching["seg"] == list_valid[r] & (df_matching["ref"] == c)
        df_matching["pred"] == list_valid[r] & (df_matching["ref"] == c)
    ]
    list_matching.append(df_tmp)

for (r, c) in zip(row, col):
    df_tmp = df_matching[
        (df_matching["pred"] == list_valid[r]) & (df_matching["ref"] == c)
    ]
    list_matching.append(df_tmp)
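The precedence trap is easy to reproduce on a toy DataFrame (column names and values below are illustrative, not from the PR):

```python
import pandas as pd

df = pd.DataFrame({"pred": [2, 2, 3], "ref": [1, 0, 1]})

# & binds tighter than ==, so this computes 2 & (df["ref"] == 1) first:
# a bitwise AND of the scalar 2 with a boolean Series, yielding all zeros,
# and the filter then matches nothing.
wrong = df[df["pred"] == 2 & (df["ref"] == 1)]

# Parenthesising both comparisons yields the intended row selection.
right = df[(df["pred"] == 2) & (df["ref"] == 1)]

print(len(wrong), len(right))
```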
🧰 Tools
🪛 Ruff (0.15.10)
[warning] 638-638: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@MetricsReloaded/utility/assignment_localization.py` around lines 638 - 642,
The DataFrame filter in assignment_localization.py is wrong due to operator
precedence: change the df_tmp assignment so both equality comparisons are
grouped before applying the bitwise AND, e.g. build the boolean mask as
(df_matching["pred"] == list_valid[r]) & (df_matching["ref"] == c) and use that
to index df_matching; update the expression that currently uses list_valid[r] &
(df_matching["ref"] == c) so df_tmp and subsequent list_matching append use the
corrected mask (variables: df_matching, list_valid, r, c, df_tmp,
list_matching).
Update of all documentation to ensure all parameters are explained and all functions are listed
Summary by CodeRabbit
New Features
Bug Fixes
Documentation
Chores / Other
Tests