Debug Final Energy [by Sector]|Transportation#20
Merged
Conversation
…by-sectortransportation
…s with Mock-Network to new statistics-calls
Owner
Author
|
@sourcery-ai review |
Contributor
Reviewer's GuideRefactors the transportation final energy statistic to be carrier-specific using standardized PyPSA statistics utilities, introduces shared statistics kwargs and domestic-share helpers, wires optional energy_totals into the processor, and updates mocks/tests to support the new behavior and richer MultiIndex structures. Sequence diagram for transportation final energy calculation via NetworkProcessorsequenceDiagram
actor User
participant NetworkProcessor
participant VariableFunction as Final_Energy_by_Sector__Transportation
participant Utils as utils
participant Net as PyPSA_Network
participant Stats as PyPSA_statistics_API
User->>NetworkProcessor: request_variable(variable_name)
NetworkProcessor->>NetworkProcessor: _execute_function_for_variable(n, func, variable, config)
NetworkProcessor->>NetworkProcessor: build kwargs (aggregate_per_year, energy_totals_path)
NetworkProcessor->>VariableFunction: call(n, aggregate_per_year, energy_totals)
VariableFunction->>Utils: get_energy_totals_domestic_share(energy_totals, aviation)
Utils-->>VariableFunction: domestic_aviation_fraction
VariableFunction->>Utils: get_energy_totals_domestic_share(energy_totals, navigation)
Utils-->>VariableFunction: domestic_navigation_fraction
VariableFunction->>Net: n.statistics
Net-->>VariableFunction: Stats
VariableFunction->>Stats: withdrawal(bus_carrier=low_voltage, carrier=BEV_charger,...)
Stats-->>VariableFunction: elec
VariableFunction->>Stats: energy_balance(carrier=BEV_charger, components=Link, at_port=bus1,...)
Stats-->>VariableFunction: charging_out_raw
VariableFunction->>Stats: energy_balance(carrier=BEV_charger, components=Link, at_port=bus0,...)
Stats-->>VariableFunction: charging_in_raw
VariableFunction->>Stats: energy_balance(carrier=V2G, components=Link, at_port=bus0,...)
Stats-->>VariableFunction: v2g_in_raw
VariableFunction->>VariableFunction: compute EV_charging_losses
VariableFunction->>Stats: withdrawal(carrier=land_transport_fuel_cell,...)
Stats-->>VariableFunction: h2
VariableFunction->>Stats: withdrawal(carrier=kerosene_for_aviation,...)
Stats-->>VariableFunction: aviation_liquids_raw
VariableFunction->>VariableFunction: apply domestic_aviation_fraction
VariableFunction->>Stats: withdrawal(carrier=[shipping_oil, shipping_methanol],...)
Stats-->>VariableFunction: navigation_liquids_raw
VariableFunction->>VariableFunction: apply domestic_navigation_fraction
VariableFunction->>Stats: withdrawal(carrier=land_transport_oil,...)
Stats-->>VariableFunction: land_transport_liquids
VariableFunction->>VariableFunction: reduce series_list and groupby statistics_grouping_index
VariableFunction-->>NetworkProcessor: total_transportation_final_energy
NetworkProcessor-->>User: result
Class diagram for updated transportation statistics and utilitiesclassDiagram
class NetworkProcessor {
Path network_results_path
bool aggregate_per_year
_execute_function_for_variable(n: Network, func: function, variable: str, config: dict) pd.Series
}
class statistics_functions {
Final_Energy_by_Sector__Transportation(n: Network, aggregate_per_year: bool, energy_totals: DataFrame) Series
Final_Energy_by_Carrier__Electricity(n: Network, aggregate_per_year: bool) Series
}
class utils {
<<module>>
statistics_kwargs: dict
statistics_grouping_index: list
get_energy_totals_domestic_share(energy_totals: Path, kind: str) Series
}
class PyPSANetwork {
statistics: PyPSAStatisticsAPI
}
class PyPSAStatisticsAPI {
withdrawal(bus_carrier: str, carrier: str, components: str, aggregate_time: bool, groupby: list, nice_names: bool) Series
energy_balance(carrier: str, components: str, at_port: list, groupby_time: bool, groupby: list, nice_names: bool) Series
}
class get_energy_totals_domestic_share
NetworkProcessor --> statistics_functions : calls
NetworkProcessor --> utils : uses energy_totals path
statistics_functions --> PyPSANetwork : uses
PyPSANetwork --> PyPSAStatisticsAPI : exposes statistics
statistics_functions --> utils : imports
utils --> get_energy_totals_domestic_share : defines
statistics_functions --> get_energy_totals_domestic_share : calls
File-Level Changes
Assessment against linked issues
Possibly linked issues
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
Contributor
There was a problem hiding this comment.
Hey - I've found 4 issues, and left some high level feedback:
- The
energy_totalsparameter is inconsistently typed/used (Final_Energy_by_Sector__Transportationhintspd.DataFrame | Nonebutget_energy_totals_domestic_sharereads it as a CSVPathand hard-codesAT/2020); consider aligning types, validating non-None input explicitly, and moving the country/year selection out of the utility to avoid hidden assumptions. - In
Final_Energy_by_Sector__Transportation,series_list[0]will raise anIndexErrorif all underlying statistics calls return empty results; consider short-circuiting and returning an empty Series/DataFrame whenseries_listis empty instead of relying on at least one non-empty carrier. - The
test_raises_without_energy_totalstest asserts on the exact pandas error message string fromread_csv, which is brittle across pandas versions; it would be more robust to assert only that aValueErroris raised or raise a custom error before callingread_csvwhenenergy_totalsis missing.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The `energy_totals` parameter is inconsistently typed/used (`Final_Energy_by_Sector__Transportation` hints `pd.DataFrame | None` but `get_energy_totals_domestic_share` reads it as a CSV `Path` and hard-codes `AT`/2020); consider aligning types, validating non-None input explicitly, and moving the country/year selection out of the utility to avoid hidden assumptions.
- In `Final_Energy_by_Sector__Transportation`, `series_list[0]` will raise an `IndexError` if all underlying statistics calls return empty results; consider short-circuiting and returning an empty Series/DataFrame when `series_list` is empty instead of relying on at least one non-empty carrier.
- The `test_raises_without_energy_totals` test asserts on the exact pandas error message string from `read_csv`, which is brittle across pandas versions; it would be more robust to assert only that a `ValueError` is raised or raise a custom error before calling `read_csv` when `energy_totals` is missing.
## Individual Comments
### Comment 1
<location path="pypsa_validation_processing/statistics_functions.py" line_range="89-92" />
<code_context>
def Final_Energy_by_Sector__Transportation(
n: pypsa.Network,
aggregate_per_year: bool = True,
+ energy_totals: pd.DataFrame | None = None,
) -> pd.Series | pd.DataFrame:
- """Extract transportation-sector final energy from a PyPSA Network.
</code_context>
<issue_to_address>
**issue (bug_risk):** Align `energy_totals` typing and handling with actual usage (path vs DataFrame and `None` case).
`energy_totals` is annotated and defaulted as `pd.DataFrame | None`, but is passed directly to `get_energy_totals_domestic_share`, which uses `pd.read_csv(energy_totals, ...)` and thus expects a path-like object and not `None`. Please either:
- Change the annotation to a path type (e.g. `Path | str`) to match usage, and
- Make `energy_totals` required or add an explicit `None` check with a clear error at the top of the function.
This will avoid confusing runtime errors and keep the interface consistent with `_execute_function_for_variable`, which currently passes a path.
</issue_to_address>
### Comment 2
<location path="pypsa_validation_processing/utils.py" line_range="146-155" />
<code_context>
+## UTILS FUNCTIONS
+
+
+def get_energy_totals_domestic_share(
+ energy_totals: Path,
+ kind: str,
+) -> pd.Series:
+ """
+ Return the domestic share of energy totals for a given kind.
+
+ Parameters
+ ----------
+ energy_totals
+ The energy totals data frame filtered to one energy year.
+ kind: {'aviation', 'navigation'}
+ The kind of energy totals to calculate the factor for.
+
+ Returns
+ -------
+ :
+ The share of national aviation or navigation per country.
+ """
+ # TODO generalize energy totals for all countries
+ energy_totals = pd.read_csv(energy_totals, index_col="country").loc["AT"]
+ energy_totals = energy_totals[energy_totals.year == 2020]
+ domestic = energy_totals[f"total domestic {kind}"]
+ international = energy_totals[f"total international {kind}"]
+ return (domestic / (domestic + international)).values[0]
</code_context>
<issue_to_address>
**issue:** Fix type annotations and docstring of `get_energy_totals_domestic_share` to match implementation and return value.
There are mismatches between the type hints/docstring and the actual behavior:
- `energy_totals` is annotated as `Path`, but the docstring describes a filtered `DataFrame` while the code uses `pd.read_csv(energy_totals, ...)`. Either update the docstring to describe a CSV path, or change the function to accept a `DataFrame` and move `read_csv` outside.
- The return type is annotated as `pd.Series`, but the function returns a scalar (`.values[0]`). Please either change the annotation to a scalar type (e.g. `float`) or return a `Series`/`DataFrame` consistently.
Making these consistent will avoid confusion and potential misuse of the function.
</issue_to_address>
### Comment 3
<location path="pypsa_validation_processing/class_definitions.py" line_range="291-292" />
<code_context>
kwargs["config"] = config
if "aggregate_per_year" in params:
kwargs["aggregate_per_year"] = self.aggregate_per_year
+ if "energy_totals" in params:
+ kwargs["energy_totals"] = (
+ self.network_results_path / "resources" / "energy_totals.csv"
+ )
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Clarify that `energy_totals` is a file path and keep it consistent with the called function.
Here `energy_totals` is always a file path (`self.network_results_path / "resources" / "energy_totals.csv"`), which matches `get_energy_totals_domestic_share` (it calls `pd.read_csv`) but conflicts with `Final_Energy_by_Sector__Transportation`, which types `energy_totals` as `pd.DataFrame | None`. Please either:
- Standardize on a path-like type (and update the function annotation), or
- Load the CSV here and pass a `DataFrame` consistently.
Aligning the type across caller and callee will prevent confusion and type-related bugs.
Suggested implementation:
```python
if "energy_totals" in params:
# Pass energy_totals as a path-like pointing to the CSV so that
# all downstream functions handle loading consistently.
kwargs["energy_totals"] = (
self.network_results_path / "resources" / "energy_totals.csv"
)
return func(n, **kwargs)
```
```python
from os import PathLike
from pathlib import Path
def Final_Energy_by_Sector__Transportation(
network: pypsa.Network,
scenario: str,
energy_totals: str | Path | PathLike | None = None,
aggregate_per_year: bool | None = None,
) -> pd.DataFrame:
```
```python
def Final_Energy_by_Sector__Transportation(
network: pypsa.Network,
scenario: str,
energy_totals: str | Path | PathLike | None = None,
aggregate_per_year: bool | None = None,
) -> pd.DataFrame:
"""
...
Parameters
----------
network : pypsa.Network
...
scenario : str
...
energy_totals : str | Path | os.PathLike | None, optional
Path to the ``energy_totals.csv`` file. If None, the default location
will be used. The CSV will be loaded internally as a DataFrame.
aggregate_per_year : bool | None, optional
...
"""
if energy_totals is None:
energy_totals_df = load_energy_totals_default()
else:
energy_totals_df = pd.read_csv(energy_totals)
# use energy_totals_df instead of energy_totals below
...
```
```python
def get_energy_totals_domestic_share(
energy_totals: str | Path | PathLike,
country: str,
) -> pd.DataFrame:
"""
...
Parameters
----------
energy_totals : str | Path | os.PathLike
Path to the ``energy_totals.csv`` file.
country : str
Country code or name to filter domestic shares.
"""
return pd.read_csv(energy_totals).query("country == @country")
```
I assumed the signatures and docstrings of `Final_Energy_by_Sector__Transportation` and `get_energy_totals_domestic_share` based on your description. You will need to:
1. Adjust the `SEARCH` blocks to match the exact existing function definitions and docstrings in `class_definitions.py`.
2. Replace the placeholder `...` in the function bodies with your existing implementation, making sure all internal uses of `energy_totals` are switched to `energy_totals_df` after loading.
3. Ensure `PathLike` / `Path` imports do not conflict with any existing imports; if `PathLike` or `Path` are already imported elsewhere, remove the duplicate import lines from the replacement block.
</issue_to_address>
### Comment 4
<location path="tests/test_statistics_functions.py" line_range="133-136" />
<code_context>
+ assert "unit" in result.index.names
assert len(result) > 0
+ def test_raises_without_energy_totals(self, mock_network: MockPyPSANetwork):
+ """Current implementation requires energy_totals to compute domestic shares."""
+ with pytest.raises(ValueError, match="Invalid file path or buffer object type"):
+ Final_Energy_by_Sector__Transportation(mock_network)
+
</code_context>
<issue_to_address>
**suggestion (testing):** Error-path test for missing `energy_totals` is tightly coupled to pandas’ error message rather than the function’s intended contract
The test currently asserts `pytest.raises(ValueError, match="Invalid file path or buffer object type")`, which ties it to pandas’ `read_csv` wording instead of the function’s contract that `energy_totals` is required. This is brittle and obscures the intended behavior.
Consider either:
- Raising a clearer, domain-specific error when `energy_totals` is `None`/invalid and asserting on that, or
- Relaxing or removing `match=` and asserting only that a `ValueError` is raised when `energy_totals` is omitted.
This will decouple the test from pandas internals and better express the contract.
```suggestion
def test_raises_without_energy_totals(self, mock_network: MockPyPSANetwork):
"""Current implementation requires energy_totals to compute domestic shares."""
with pytest.raises(ValueError):
Final_Energy_by_Sector__Transportation(mock_network)
```
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
2 tasks
Collaborator
pworschischek-aggmag
left a comment
There was a problem hiding this comment.
Let's discuss any time you are ready.
pworschischek-aggmag
approved these changes
Apr 23, 2026
Collaborator
pworschischek-aggmag
left a comment
There was a problem hiding this comment.
ready to merge. Just a note on how we treat unexpected stuff in PyPSA-AT and a small simplification suggestion.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Implement logic from pypsa-de functions and split Transportation carrier-wise.
Summary by Sourcery
Refine transportation final energy statistics to be carrier-specific, incorporate domestic aviation/navigation shares, and wire energy_totals support through the processing pipeline.
New Features:
Enhancements:
Documentation:
Tests: