Skip to content

Transcode image format#102

Open
cargocultprogramming wants to merge 11 commits intopelican-plugins:mainfrom
cargocultprogramming:transcode-image-format
Open

Transcode image format#102
cargocultprogramming wants to merge 11 commits intopelican-plugins:mainfrom
cargocultprogramming:transcode-image-format

Conversation

@cargocultprogramming
Copy link
Copy Markdown

Pull request for #101

Feature implemented with new configuration options as proposed by @patrickfournier in #101 (comment).

The implementation of the feature caused two functions to get too long for the linter, so I had to refactor them into smaller chunks, which is why this PR got a bit heavy even though I made no unrelated changes. I added a new test file to test the format conversion and also did a limited manual test in an actual pelican project, though some more thorough testing probably won't hurt.

The entire setup is backwards-compatible, so if you don't change any settings, you'll end up with the same results as in the earlier version.

Hope this is useful - feel free to comment and change whatever needs commenting and changing.

@minchinweb
Copy link
Copy Markdown
Member

Thank you @cargocultprogramming ! I will take a bit more time to review the code, but here are some thoughts from my first pass review; please comment if any of these things are proving to be frustrating to resolve on your side.

  • thank you for providing tests!
  • most of the existing tests include a binary match with the pre-rendered image in test_data/results (here). Can you add such a test image and a matching (unit) test?
  • can you also make sure that test image is generated by our generate_test_images() function? (see here)
  • can you update the README.md file (which doubles as our documentation) with how to use it? You can probably start with the text of the Issue you raised earlier. It would also help to expand that to note an example of when this might be helpful (e.g. "This may be useful to convert from your camera's raw file format to something generally supported on the web"... or something).
  • (if known), would you also add a comment about which input and export formats are supported? (in the README/documentation)
  • what are you using to do the format conversions? Just Pillow?
  • this is probably a "minor" (aka feature) release, rather than a "major" release, as the changes are (meant to be) backwards compatible. Can you update this?

Copy link
Copy Markdown
Collaborator

@patrickfournier patrickfournier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your work, this is very much appreciated.

Code

Overall, the implementation looks good. I left a few comments, mostly related to falling back to a default setting for the picture transform type. Maybe it was not clear from my comment in the related issue, but I was thinking the default output-format could be used with all transform types. So for pictures, we would have:

IMAGE_PROCESS = {
    "example-pict": {
        "type": "picture",
        "output-format": "webp",  # Default format
        "sources": [
            {
                "name": "default",
                "media": "(min-width: 640px)",
                "srcset": [
                    ("640w", ["scale_in 640 480 True"]),
                    ("1024w", ["scale_in 1024 683 True"]),
                    ("1600w", ["scale_in 1600 1200 True"]),
                ],
                "sizes": "100vw",
            },
            {
                "name": "source-1",
                "srcset": [
                    ("1x", ["crop 100 100 200 200"]),
                    ("2x", ["crop 100 100 300 300"]),
                ]
            },
        ],
        "default": ("default", "640w"),
    },
}

Tests

Please test all possible syntaxes:

  • Image replacement: please add a test for the short syntax ("article-image": (["scale_in 300 300 True"], 'webp') .
  • Responsive image: please add a test to check proper behavior when output-format is not specified.
  • Picture set: please add a test that uses the default output-format
  • For responsive images and/or picture, it would be nice to have a test for the case where default specifies a transform instead of a name, for example: "default": (["scale_in 1600 1200 True"], "webp") instead of "default": "800w"

Documentation

As @minchinweb mentioned, please update the README to document the feature and the new syntax.

Comment thread pelican/plugins/image_process/image_process.py Outdated
elif isinstance(default[1], list):
elif isinstance(default[1], (list, tuple)):
default_item_name = "default"
default_item_format = get_target_format(default[1])
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here: shouldn't we fallback to IMAGE_PROCESS[derivative]['output-format'] ?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment thread pelican/plugins/image_process/image_process.py
Comment thread pelican/plugins/image_process/image_process.py
Comment thread pelican/plugins/image_process/image_process.py Outdated
elif isinstance(default[1], list):
elif isinstance(default[1], (list, tuple)):
default_item_name = "default"
default_item_format = get_target_format(default[1])
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here: shouldn't we fallback to IMAGE_PROCESS[derivative]['output-format'] if get_target_format() returns None?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think this comment is addressed. If we have:

"picture_formats": {
        "type": "picture",
        "sources": [
            {
                "name": "webp-src",
                "output-format": "webp",
                "srcset": [
                    ("640w", ["scale_in 640 480 True"]),
                    ("1024w", ["scale_in 1024 768 True"]),
                ],
            },
        ],
        "default": ("webp-src", ["scale_in 500 500 True"]),
    },

default_item_format will not fallback to webp from output-format.

Comment thread pelican/plugins/image_process/image_process.py
@cargocultprogramming
Copy link
Copy Markdown
Author

cargocultprogramming commented Apr 12, 2026

I've added a number of commits today (1e959d5 to e22ee04), to address the comments of @minchinweb :

  • I removed the new test file and added tests for transcoding images to the main test file, including test image generation using generate_test_images() and the actual result images.
  • I updated the documentation in README.md with an explanation of the feature and a section about image file formats (and threw in a svg for illustration for good measure) and also changed the RELEASE.md as requested.

The tests that I added are:

  • Actual simple image transformations from one format to another and binary comparison similar to existing tests in main.
  • Complex transformations to show how specification of top-level file format transformation "output-format": "webp" works with per-item overrides in srcset and <picture>. However these tests only check if the urls found by beautifulsoup match the specified extensions and do not actually create and compare images. I think this approach is in-line with the other tests, but I'm not entirely sure. For example, this does not explicitly check if a given extension will actually be accepted by PIL (extension "foobar" will also work). Guidance appreciated.

@patrickfournier thanks for the code comments, I'll address those later, I'm all out of steam for today.

@cargocultprogramming
Copy link
Copy Markdown
Author

@patrickfournier thank you again for your comments. I hopefully addressed everything - see below.

Overall, the implementation looks good. I left a few comments, mostly related to falling back to a default setting for the picture transform type. Maybe it was not clear from my comment in the related issue, but I was thinking the default output-format could be used with all transform types. So for pictures, we would have:

IMAGE_PROCESS = {
    "example-pict": {
        "type": "picture",
        "output-format": "webp",  # Default format
        "sources": [
            {
                "name": "default",
                "media": "(min-width: 640px)",
                "srcset": [
                    ("640w", ["scale_in 640 480 True"]),
                    ("1024w", ["scale_in 1024 683 True"]),
                    ("1600w", ["scale_in 1600 1200 True"]),
                ],
                "sizes": "100vw",
            },
            {
                "name": "source-1",
                "srcset": [
                    ("1x", ["crop 100 100 200 200"]),
                    ("2x", ["crop 100 100 300 300"]),
                ]
            },
        ],
        "default": ("default", "640w"),
    },
}

I've added this syntax to the tests.

Tests

Please test all possible syntaxes:

  • Image replacement: please add a test for the short syntax ("article-image": (["scale_in 300 300 True"], 'webp') .

Added a test for this. This case was actually not covered, so I added this functionality.

  • Responsive image: please add a test to check proper behavior when output-format is not specified.

Test added.

  • Picture set: please add a test that uses the default output-format

Test added.

  • For responsive images and/or picture, it would be nice to have a test for the case where default specifies a transform instead of a name, for example: "default": (["scale_in 1600 1200 True"], "webp") instead of "default": "800w"

Test added. This case was also not covered yet in the code, but i'ts supported now.

Documentation

As @minchinweb mentioned, please update the README to document the feature and the new syntax.

Already covered in earlier commits.

Regarding your code comments / change requests - I think I covered everything in the last couple of commits. There is a fallback now when get_target_format() returns None, which was the major point. I commented the other points directly.

Copy link
Copy Markdown
Collaborator

@patrickfournier patrickfournier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fixes and the doc.

Unless I am mistaken, I think there are still two cases where we do not correctly fallback on output-format.

I also added a request for change in the test code, and a few corrections to the README. English is not my native language, so someone else should review this as well.

elif isinstance(default[1], list):
elif isinstance(default[1], (list, tuple)):
default_item_name = "default"
default_item_format = get_target_format(default[1])
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think this comment is addressed. If we have:

"picture_formats": {
        "type": "picture",
        "sources": [
            {
                "name": "webp-src",
                "output-format": "webp",
                "srcset": [
                    ("640w", ["scale_in 640 480 True"]),
                    ("1024w", ["scale_in 1024 768 True"]),
                ],
            },
        ],
        "default": ("webp-src", ["scale_in 500 500 True"]),
    },

default_item_format will not fallback to webp from output-format.

elif isinstance(default[1], list):
elif isinstance(default[1], (list, tuple)):
default_item_name = "default"
default_item_format = get_target_format(default[1])
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

transform_config = COMPLEX_FORMAT_TRANSFORMS[transform_id]
for url in urls:
ext = Path(url).suffix.lower()
expected_ext = self._determine_expected_ext(
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of having code to find the expected extension, we should have an explicit dict {transform_id1: [file1, file2, ...], ...} of expected file names.

The reason is, currently, the test is testing that _determine_expected_ext achieves the same result as harvest_images_in_fragment. It tests both the plugin code and the test code. If there is a mismatch, we need to hunt for the bug at both places. If we change the plugin code, we need to change the (not so simple) test code. An explicit list would be much easier to maintain.

Comment thread pelican/plugins/image_process/image_process.py
Comment thread pelican/plugins/image_process/image_process.py
Comment thread README.md
this example, it will use the image `640w` from the source `default`.
A list of operations could have been specified instead of `640w`.

Similar to `responsive image` described above, also `<picture>` allows the
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Similar to `responsive image` described above, also `<picture>` allows the
Similar to `responsive image` described above, `<picture>` also allows the

Comment thread README.md

### Image File Formats

*Image Process* uses python's pillow library (PIL) to read and write files. The
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
*Image Process* uses python's pillow library (PIL) to read and write files. The
*Image Process* uses Python's Pillow library (PIL) to read and write files. The

Comment thread README.md
### Image File Formats

*Image Process* uses python's pillow library (PIL) to read and write files. The
file formats, that pillow can read and write depend on libraries/plugins that
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
file formats, that pillow can read and write depend on libraries/plugins that
file formats that Pillow can read and write depend on libraries/plugins that

Comment thread README.md
`webp`), uncommon formats may cause issues depending on the system you are
working on.

To specify an image format for the derivative image, pillow will infer the image
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
To specify an image format for the derivative image, pillow will infer the image
To specify an image format for the derivative image, Pillow will infer the image

Comment thread README.md

The *SVG* image format is omitted on purpose from the list above; it is a
*vector* image format (as opposed to the others, which are *raster* formats),
that is best used for logos and illustrations and you should not blindly convert
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
that is best used for logos and illustrations and you should not blindly convert
that is best used for logos and illustrations. You should not blindly convert

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants