Skip to content

Debug Final Energy [by Sector]|Transportation#20

Merged
maxnutz merged 13 commits intomainfrom
19-enhance-variable-statistics-final-energy-by-sectortransportation
Apr 23, 2026
Merged

Debug Final Energy [by Sector]|Transportation#20
maxnutz merged 13 commits intomainfrom
19-enhance-variable-statistics-final-energy-by-sectortransportation

Conversation

@maxnutz
Copy link
Copy Markdown
Owner

@maxnutz maxnutz commented Mar 30, 2026

Implement logic from pypsa-de functions and split Transportation carrier-wise.

Summary by Sourcery

Refine transportation final energy statistics to be carrier-specific, incorporate domestic aviation/navigation shares, and wire energy_totals support through the processing pipeline.

New Features:

  • Add carrier-wise transportation final energy calculation using PyPSA statistics.withdrawal and energy_balance APIs, including EV charging losses and separate hydrogen and liquid fuel components.

Enhancements:

  • Introduce shared statistics configuration kwargs and a standard grouping index to unify PyPSA statistics groupby behavior and MultiIndex structure across functions.
  • Add a utility to derive domestic energy shares for aviation and navigation from an energy_totals.csv file and apply these factors in transportation calculations.
  • Extend Network_Processor to automatically pass the energy_totals.csv path into variable functions that accept an energy_totals parameter.

Documentation:

  • Document optional function parameters (config and energy_totals) in the README, including the default energy_totals.csv location.

Tests:

  • Expand statistics mocks to cover withdrawal calls, additional grouping fields, and deterministic link-port signs to support EV charging loss logic.
  • Update and extend transportation-sector tests, including behavior when aggregate_per_year is False and when energy_totals is missing.
  • Add tests ensuring Network_Processor correctly forwards the energy_totals path to eligible functions.

@maxnutz maxnutz linked an issue Mar 30, 2026 that may be closed by this pull request
sourcery-ai[bot]

This comment was marked as outdated.

@maxnutz maxnutz marked this pull request as draft April 10, 2026 17:02
Repository owner deleted a comment from sourcery-ai Bot Apr 20, 2026
@maxnutz
Copy link
Copy Markdown
Owner Author

maxnutz commented Apr 20, 2026

@sourcery-ai review

sourcery-ai[bot]

This comment was marked as outdated.

@maxnutz maxnutz marked this pull request as ready for review April 21, 2026 10:36
Repository owner deleted a comment from sourcery-ai Bot Apr 21, 2026
@sourcery-ai
Copy link
Copy Markdown
Contributor

sourcery-ai Bot commented Apr 21, 2026

Reviewer's Guide

Refactors the transportation final energy statistic to be carrier-specific using standardized PyPSA statistics utilities, introduces shared statistics kwargs and domestic-share helpers, wires optional energy_totals into the processor, and updates mocks/tests to support the new behavior and richer MultiIndex structures.

Sequence diagram for transportation final energy calculation via NetworkProcessor

sequenceDiagram
    actor User
    participant NetworkProcessor
    participant VariableFunction as Final_Energy_by_Sector__Transportation
    participant Utils as utils
    participant Net as PyPSA_Network
    participant Stats as PyPSA_statistics_API

    User->>NetworkProcessor: request_variable(variable_name)
    NetworkProcessor->>NetworkProcessor: _execute_function_for_variable(n, func, variable, config)
    NetworkProcessor->>NetworkProcessor: build kwargs (aggregate_per_year, energy_totals_path)
    NetworkProcessor->>VariableFunction: call(n, aggregate_per_year, energy_totals)

    VariableFunction->>Utils: get_energy_totals_domestic_share(energy_totals, aviation)
    Utils-->>VariableFunction: domestic_aviation_fraction
    VariableFunction->>Utils: get_energy_totals_domestic_share(energy_totals, navigation)
    Utils-->>VariableFunction: domestic_navigation_fraction

    VariableFunction->>Net: n.statistics
    Net-->>VariableFunction: Stats

    VariableFunction->>Stats: withdrawal(bus_carrier=low_voltage, carrier=BEV_charger,...)
    Stats-->>VariableFunction: elec

    VariableFunction->>Stats: energy_balance(carrier=BEV_charger, components=Link, at_port=bus1,...)
    Stats-->>VariableFunction: charging_out_raw
    VariableFunction->>Stats: energy_balance(carrier=BEV_charger, components=Link, at_port=bus0,...)
    Stats-->>VariableFunction: charging_in_raw
    VariableFunction->>Stats: energy_balance(carrier=V2G, components=Link, at_port=bus0,...)
    Stats-->>VariableFunction: v2g_in_raw
    VariableFunction->>VariableFunction: compute EV_charging_losses

    VariableFunction->>Stats: withdrawal(carrier=land_transport_fuel_cell,...)
    Stats-->>VariableFunction: h2

    VariableFunction->>Stats: withdrawal(carrier=kerosene_for_aviation,...)
    Stats-->>VariableFunction: aviation_liquids_raw
    VariableFunction->>VariableFunction: apply domestic_aviation_fraction

    VariableFunction->>Stats: withdrawal(carrier=[shipping_oil, shipping_methanol],...)
    Stats-->>VariableFunction: navigation_liquids_raw
    VariableFunction->>VariableFunction: apply domestic_navigation_fraction

    VariableFunction->>Stats: withdrawal(carrier=land_transport_oil,...)
    Stats-->>VariableFunction: land_transport_liquids

    VariableFunction->>VariableFunction: reduce series_list and groupby statistics_grouping_index
    VariableFunction-->>NetworkProcessor: total_transportation_final_energy
    NetworkProcessor-->>User: result
Loading

Class diagram for updated transportation statistics and utilities

classDiagram
    class NetworkProcessor {
        Path network_results_path
        bool aggregate_per_year
        _execute_function_for_variable(n: Network, func: function, variable: str, config: dict) pd.Series
    }

    class statistics_functions {
        Final_Energy_by_Sector__Transportation(n: Network, aggregate_per_year: bool, energy_totals: DataFrame) Series
        Final_Energy_by_Carrier__Electricity(n: Network, aggregate_per_year: bool) Series
    }

    class utils {
        <<module>>
        statistics_kwargs: dict
        statistics_grouping_index: list
        get_energy_totals_domestic_share(energy_totals: Path, kind: str) Series
    }

    class PyPSANetwork {
        statistics: PyPSAStatisticsAPI
    }

    class PyPSAStatisticsAPI {
        withdrawal(bus_carrier: str, carrier: str, components: str, aggregate_time: bool, groupby: list, nice_names: bool) Series
        energy_balance(carrier: str, components: str, at_port: list, groupby_time: bool, groupby: list, nice_names: bool) Series
    }

    class get_energy_totals_domestic_share

    NetworkProcessor --> statistics_functions : calls
    NetworkProcessor --> utils : uses energy_totals path

    statistics_functions --> PyPSANetwork : uses
    PyPSANetwork --> PyPSAStatisticsAPI : exposes statistics

    statistics_functions --> utils : imports
    utils --> get_energy_totals_domestic_share : defines
    statistics_functions --> get_energy_totals_domestic_share : calls
Loading

File-Level Changes

Change Details Files
Refine Final_Energy_by_Sector__Transportation to compute carrier-wise transportation final energy using PyPSA statistics.withdrawal/energy_balance, including EV charging losses and domestic aviation/navigation shares.
  • Replace single Load-based aggregation with separate withdrawals for electricity (BEV charger), hydrogen (fuel cell), and multiple liquid fuel carriers.
  • Add explicit EV charging loss calculation using Link energy balances at bus0/bus1, including V2G handling and sign validation.
  • Apply domestic share factors for aviation and navigation based on external energy_totals data and combine all carriers via additive reduction with consistent grouping to a standardized MultiIndex.
pypsa_validation_processing/statistics_functions.py
Introduce shared statistics utilities and energy_totals domestic-share helper used by transportation statistics.
  • Add statistics_kwargs dict and statistics_grouping_index list to standardize PyPSA statistics groupby fields and MultiIndex layout across calls.
  • Implement get_energy_totals_domestic_share to read an energy_totals CSV and compute domestic vs international aviation/navigation share factors for a fixed country/year.
  • Extend unit normalization mapping to include 'land transport' entries.
pypsa_validation_processing/utils.py
Wire optional energy_totals parameter into function execution and document it in the README.
  • Update Network_Processor._execute_function_for_variable to detect an energy_totals parameter in the target function signature and pass a fixed energy_totals.csv path from network_results_path/resources.
  • Add a new test to verify that functions accepting energy_totals receive the correct path.
  • Document energy_totals: Path as an optional function parameter in the README, alongside the existing config argument.
pypsa_validation_processing/class_definitions.py
tests/test_network_processor.py
README.md
Enhance test suite and mocks to support new transportation logic, standardized statistics kwargs, and aggregate_per_year=False behavior.
  • Update Final_Energy_by_Sector__Transportation tests to pass energy_totals, relax strict index name assertions to only require presence of 'location' and 'unit', and add a negative test when energy_totals is omitted.
  • Extend MockStatisticsAccessor.energy_balance to accept extra kwargs, produce richer groupby indices including name and bus, and supply deterministic signs for Link ports used in EV-loss calculations.
  • Add a withdrawal adapter method on MockStatisticsAccessor that delegates to energy_balance using aggregate_time and groupby_time mapping.
  • Introduce an energy_totals_csv fixture producing a minimal energy_totals.csv file, and add a dedicated test ensuring transportation returns a DataFrame when aggregate_per_year=False while maintaining snapshot-indexed columns.
  • Remove Transportation from the generic aggregate_per_year=False parametrized test to handle it with the new dedicated test.
tests/test_statistics_functions.py
tests/conftest.py

Assessment against linked issues

Issue Objective Addressed Explanation
#19 Refactor Final_Energy_by_Sector__Transportation to use pypsa-de style logic, computing transportation final energy carrier-wise (electricity, hydrogen, and liquid fuels) instead of a simple summed Load over carriers.
#19 Compute and return the total transportation final energy by summing all carrier-wise components, including EV charging losses and applying domestic-share factors for aviation and navigation.
#19 Introduce and connect required dependencies/utilities (e.g. standardized statistics kwargs/index, energy_totals domestic-share helper, processor wiring, tests, and docs) so the transportation statistic runs correctly within the existing processing framework.

Possibly linked issues

  • #Enhance Variable Statistics: Final Energy [by Sector]|Transportation: PR implements the requested renewed transportation final energy statistics, splitting by carriers and summing with pypsa-de logic.
  • #New Variable Statistics: Final Energy [by Sector]|Transportation: PR delivers the requested Final Energy by Sector|Transportation function, wiring, and tests, plus related utility enhancements.

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

Copy link
Copy Markdown
Contributor

@sourcery-ai sourcery-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 4 issues, and left some high level feedback:

  • The energy_totals parameter is inconsistently typed/used (Final_Energy_by_Sector__Transportation hints pd.DataFrame | None but get_energy_totals_domestic_share reads it as a CSV Path and hard-codes AT/2020); consider aligning types, validating non-None input explicitly, and moving the country/year selection out of the utility to avoid hidden assumptions.
  • In Final_Energy_by_Sector__Transportation, series_list[0] will raise an IndexError if all underlying statistics calls return empty results; consider short-circuiting and returning an empty Series/DataFrame when series_list is empty instead of relying on at least one non-empty carrier.
  • The test_raises_without_energy_totals test asserts on the exact pandas error message string from read_csv, which is brittle across pandas versions; it would be more robust to assert only that a ValueError is raised or raise a custom error before calling read_csv when energy_totals is missing.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The `energy_totals` parameter is inconsistently typed/used (`Final_Energy_by_Sector__Transportation` hints `pd.DataFrame | None` but `get_energy_totals_domestic_share` reads it as a CSV `Path` and hard-codes `AT`/2020); consider aligning types, validating non-None input explicitly, and moving the country/year selection out of the utility to avoid hidden assumptions.
- In `Final_Energy_by_Sector__Transportation`, `series_list[0]` will raise an `IndexError` if all underlying statistics calls return empty results; consider short-circuiting and returning an empty Series/DataFrame when `series_list` is empty instead of relying on at least one non-empty carrier.
- The `test_raises_without_energy_totals` test asserts on the exact pandas error message string from `read_csv`, which is brittle across pandas versions; it would be more robust to assert only that a `ValueError` is raised or raise a custom error before calling `read_csv` when `energy_totals` is missing.

## Individual Comments

### Comment 1
<location path="pypsa_validation_processing/statistics_functions.py" line_range="89-92" />
<code_context>
 def Final_Energy_by_Sector__Transportation(
     n: pypsa.Network,
     aggregate_per_year: bool = True,
+    energy_totals: pd.DataFrame | None = None,
 ) -> pd.Series | pd.DataFrame:
-    """Extract transportation-sector final energy from a PyPSA Network.
</code_context>
<issue_to_address>
**issue (bug_risk):** Align `energy_totals` typing and handling with actual usage (path vs DataFrame and `None` case).

`energy_totals` is annotated and defaulted as `pd.DataFrame | None`, but is passed directly to `get_energy_totals_domestic_share`, which uses `pd.read_csv(energy_totals, ...)` and thus expects a path-like object and not `None`. Please either:
- Change the annotation to a path type (e.g. `Path | str`) to match usage, and
- Make `energy_totals` required or add an explicit `None` check with a clear error at the top of the function.

This will avoid confusing runtime errors and keep the interface consistent with `_execute_function_for_variable`, which currently passes a path.
</issue_to_address>

### Comment 2
<location path="pypsa_validation_processing/utils.py" line_range="146-155" />
<code_context>
+## UTILS FUNCTIONS
+
+
+def get_energy_totals_domestic_share(
+    energy_totals: Path,
+    kind: str,
+) -> pd.Series:
+    """
+    Return the domestic share of energy totals for a given kind.
+
+    Parameters
+    ----------
+    energy_totals
+        The energy totals data frame filtered to one energy year.
+    kind: {'aviation', 'navigation'}
+        The kind of energy totals to calculate the factor for.
+
+    Returns
+    -------
+    :
+        The share of national aviation or navigation per country.
+    """
+    # TODO generalize energy totals for all countries
+    energy_totals = pd.read_csv(energy_totals, index_col="country").loc["AT"]
+    energy_totals = energy_totals[energy_totals.year == 2020]
+    domestic = energy_totals[f"total domestic {kind}"]
+    international = energy_totals[f"total international {kind}"]
+    return (domestic / (domestic + international)).values[0]
</code_context>
<issue_to_address>
**issue:** Fix type annotations and docstring of `get_energy_totals_domestic_share` to match implementation and return value.

There are mismatches between the type hints/docstring and the actual behavior:
- `energy_totals` is annotated as `Path`, but the docstring describes a filtered `DataFrame` while the code uses `pd.read_csv(energy_totals, ...)`. Either update the docstring to describe a CSV path, or change the function to accept a `DataFrame` and move `read_csv` outside.
- The return type is annotated as `pd.Series`, but the function returns a scalar (`.values[0]`). Please either change the annotation to a scalar type (e.g. `float`) or return a `Series`/`DataFrame` consistently.
Making these consistent will avoid confusion and potential misuse of the function.
</issue_to_address>

### Comment 3
<location path="pypsa_validation_processing/class_definitions.py" line_range="291-292" />
<code_context>
             kwargs["config"] = config
         if "aggregate_per_year" in params:
             kwargs["aggregate_per_year"] = self.aggregate_per_year
+        if "energy_totals" in params:
+            kwargs["energy_totals"] = (
+                self.network_results_path / "resources" / "energy_totals.csv"
+            )
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Clarify that `energy_totals` is a file path and keep it consistent with the called function.

Here `energy_totals` is always a file path (`self.network_results_path / "resources" / "energy_totals.csv"`), which matches `get_energy_totals_domestic_share` (it calls `pd.read_csv`) but conflicts with `Final_Energy_by_Sector__Transportation`, which types `energy_totals` as `pd.DataFrame | None`. Please either:

- Standardize on a path-like type (and update the function annotation), or
- Load the CSV here and pass a `DataFrame` consistently.

Aligning the type across caller and callee will prevent confusion and type-related bugs.

Suggested implementation:

```python
        if "energy_totals" in params:
            # Pass energy_totals as a path-like pointing to the CSV so that
            # all downstream functions handle loading consistently.
            kwargs["energy_totals"] = (
                self.network_results_path / "resources" / "energy_totals.csv"
            )

        return func(n, **kwargs)

```

```python
from os import PathLike
from pathlib import Path

def Final_Energy_by_Sector__Transportation(
    network: pypsa.Network,
    scenario: str,
    energy_totals: str | Path | PathLike | None = None,
    aggregate_per_year: bool | None = None,
) -> pd.DataFrame:

```

```python
def Final_Energy_by_Sector__Transportation(
    network: pypsa.Network,
    scenario: str,
    energy_totals: str | Path | PathLike | None = None,
    aggregate_per_year: bool | None = None,
) -> pd.DataFrame:
    """
    ...
    Parameters
    ----------
    network : pypsa.Network
        ...
    scenario : str
        ...
    energy_totals : str | Path | os.PathLike | None, optional
        Path to the ``energy_totals.csv`` file. If None, the default location
        will be used. The CSV will be loaded internally as a DataFrame.
    aggregate_per_year : bool | None, optional
        ...
    """
    if energy_totals is None:
        energy_totals_df = load_energy_totals_default()
    else:
        energy_totals_df = pd.read_csv(energy_totals)

    # use energy_totals_df instead of energy_totals below
    ...

```

```python
def get_energy_totals_domestic_share(
    energy_totals: str | Path | PathLike,
    country: str,
) -> pd.DataFrame:
    """
    ...
    Parameters
    ----------
    energy_totals : str | Path | os.PathLike
        Path to the ``energy_totals.csv`` file.
    country : str
        Country code or name to filter domestic shares.
    """
    return pd.read_csv(energy_totals).query("country == @country")

```

I assumed the signatures and docstrings of `Final_Energy_by_Sector__Transportation` and `get_energy_totals_domestic_share` based on your description. You will need to:

1. Adjust the `SEARCH` blocks to match the exact existing function definitions and docstrings in `class_definitions.py`.
2. Replace the placeholder `...` in the function bodies with your existing implementation, making sure all internal uses of `energy_totals` are switched to `energy_totals_df` after loading.
3. Ensure `PathLike` / `Path` imports do not conflict with any existing imports; if `PathLike` or `Path` are already imported elsewhere, remove the duplicate import lines from the replacement block.
</issue_to_address>

### Comment 4
<location path="tests/test_statistics_functions.py" line_range="133-136" />
<code_context>
+            assert "unit" in result.index.names
             assert len(result) > 0

+    def test_raises_without_energy_totals(self, mock_network: MockPyPSANetwork):
+        """Current implementation requires energy_totals to compute domestic shares."""
+        with pytest.raises(ValueError, match="Invalid file path or buffer object type"):
+            Final_Energy_by_Sector__Transportation(mock_network)
+

</code_context>
<issue_to_address>
**suggestion (testing):** Error-path test for missing `energy_totals` is tightly coupled to pandas’ error message rather than the function’s intended contract

The test currently asserts `pytest.raises(ValueError, match="Invalid file path or buffer object type")`, which ties it to pandas’ `read_csv` wording instead of the function’s contract that `energy_totals` is required. This is brittle and obscures the intended behavior.

Consider either:
- Raising a clearer, domain-specific error when `energy_totals` is `None`/invalid and asserting on that, or
- Relaxing or removing `match=` and asserting only that a `ValueError` is raised when `energy_totals` is omitted.

This will decouple the test from pandas internals and better express the contract.

```suggestion
    def test_raises_without_energy_totals(self, mock_network: MockPyPSANetwork):
        """Current implementation requires energy_totals to compute domestic shares."""
        with pytest.raises(ValueError):
            Final_Energy_by_Sector__Transportation(mock_network)
```
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment thread pypsa_validation_processing/statistics_functions.py Outdated
Comment thread pypsa_validation_processing/utils.py
Comment thread pypsa_validation_processing/class_definitions.py
Comment thread tests/test_statistics_functions.py
Copy link
Copy Markdown
Collaborator

@pworschischek-aggmag pworschischek-aggmag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's discuss any time you are ready.

Comment thread pypsa_validation_processing/class_definitions.py
Comment thread pypsa_validation_processing/statistics_functions.py Outdated
Comment thread pypsa_validation_processing/statistics_functions.py Outdated
Comment thread pypsa_validation_processing/statistics_functions.py Outdated
Comment thread pypsa_validation_processing/statistics_functions.py Outdated
Comment thread pypsa_validation_processing/statistics_functions.py Outdated
Comment thread pypsa_validation_processing/statistics_functions.py Outdated
Comment thread tests/test_statistics_functions.py
Comment thread pypsa_validation_processing/statistics_functions.py
Copy link
Copy Markdown
Collaborator

@pworschischek-aggmag pworschischek-aggmag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ready to merge. Just a note on how we treat unexpected stuff in PyPSA-AT and a small simplification suggestion.

Comment thread pypsa_validation_processing/statistics_functions.py
Comment thread pypsa_validation_processing/statistics_functions.py
Comment thread pypsa_validation_processing/statistics_functions.py
@maxnutz maxnutz merged commit b87aa2a into main Apr 23, 2026
2 checks passed
@maxnutz maxnutz deleted the 19-enhance-variable-statistics-final-energy-by-sectortransportation branch April 23, 2026 11:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Enhance Variable Statistics: Final Energy [by Sector]|Transportation

2 participants