Help with strings in byte buffer slices by tjpalmer · Pull Request #409 · temperlang/temper

tjpalmer · 2026-04-27T19:39:00Z

Support external and possible future needs for easy string storage in struct slices
Allow encoding into and decoding from byte buffer slices in single statements and/or expressions
Run direct mvn tests for be-java temper-core
The unit tests are mostly written by AI, but they look reasonable to me

Signed-off-by: Tom <tom@temper.systems>

tjpalmer · 2026-04-27T21:57:39Z

+        ByteBuffer source,
+        int sourceStart,
+        int sourceLength,
+        CharsetDecoder decoder


The charset decoder allows configuring how to handle incomplete chars at the end.
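As a rough illustration of that configurability, the sketch below decodes a field whose last character is cut off, once with a lenient decoder and once with a strict one. `DecoderConfigSketch`, `decodeSlice`, and `utf8Decoder` are hypothetical names made up for this example; `decodeSlice` only mirrors the parameter list in the diff and is not the PR's implementation.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class DecoderConfigSketch {
    // Hypothetical stand-in for a slice decoder, not the code from this PR.
    static String decodeSlice(ByteBuffer source, int start, int length, CharsetDecoder decoder)
            throws CharacterCodingException {
        ByteBuffer slice = source.duplicate(); // leave the caller's position and limit alone
        slice.limit(start + length);
        slice.position(start);
        return decoder.decode(slice).toString();
    }

    // Builds a UTF-8 decoder with one policy for malformed or truncated input.
    static CharsetDecoder utf8Decoder(CodingErrorAction action) {
        return StandardCharsets.UTF_8.newDecoder()
            .onMalformedInput(action)
            .onUnmappableCharacter(action);
    }

    public static void main(String[] args) {
        // "a" followed by only the first two bytes of a three-byte UTF-8 sequence.
        ByteBuffer field = ByteBuffer.wrap(new byte[] {'a', (byte) 0xE2, (byte) 0x9C});
        try {
            // IGNORE silently drops the incomplete trailing char.
            System.out.println(decodeSlice(field, 0, 3, utf8Decoder(CodingErrorAction.IGNORE)));
            // REPORT turns the same input into an exception.
            decodeSlice(field, 0, 3, utf8Decoder(CodingErrorAction.REPORT));
        } catch (CharacterCodingException e) {
            System.out.println("strict decoder rejected the truncated char: " + e);
        }
    }
}
```

`CodingErrorAction.REPLACE` is the third option; it substitutes the decoder's replacement string for the bad bytes instead of dropping them.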

tjpalmer · 2026-04-27T21:58:57Z

+        CharsetDecoder decoder
+    ) throws CharacterCodingException {
+        if (decoder == null) {
+            decoder = StandardCharsets.UTF_8.newDecoder();


We likely either want platform encoding or utf8 in most cases. Probably depends on how often we expect to want structs for interchange vs just in process.

tjpalmer · 2026-04-27T21:59:47Z

+        String s,
+        ByteBuffer target,
+        int targetStart,
+        int targetLength,


Seems like length is better than end here since this is about fitting field definitions in data structs.

tjpalmer · 2026-04-27T22:00:15Z

+        int targetStart,
+        int targetLength,
+        CharsetEncoder encoder,
+        byte padByte


Could make overloads that default coder to null and pad to zero, but I haven't done that yet.
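The overloads mentioned here might look roughly like the sketch below. The six-argument method body is a stand-in written for this example (only its parameter list is taken from the diff), the class name `SliceOverloadSketch` is hypothetical, and treating a null encoder as UTF-8 is an assumption made by analogy with the decoder default.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

class SliceOverloadSketch {
    // Stand-in for the full-argument method; not the PR's implementation.
    static int encodeToSlice(String s, ByteBuffer target, int targetStart, int targetLength,
            CharsetEncoder encoder, byte padByte) {
        if (encoder == null) {
            encoder = StandardCharsets.UTF_8.newEncoder(); // assumed default
        }
        ByteBuffer slice = target.duplicate(); // keep the caller's position and limit intact
        slice.limit(targetStart + targetLength);
        slice.position(targetStart);
        encoder.reset();
        encoder.encode(CharBuffer.wrap(s), slice, true);
        encoder.flush(slice);
        int written = slice.position() - targetStart;
        while (slice.hasRemaining()) {
            slice.put(padByte); // fill the rest of the field
        }
        return written;
    }

    // Convenience overload: caller picks the encoder, padding defaults to zero bytes.
    static int encodeToSlice(String s, ByteBuffer target, int targetStart, int targetLength,
            CharsetEncoder encoder) {
        return encodeToSlice(s, target, targetStart, targetLength, encoder, (byte) 0);
    }

    // Convenience overload: null encoder (UTF-8 in this sketch) and zero padding.
    static int encodeToSlice(String s, ByteBuffer target, int targetStart, int targetLength) {
        return encodeToSlice(s, target, targetStart, targetLength, null, (byte) 0);
    }
}
```

With these in place, a call such as `encodeToSlice(name, record, 2, 4)` would cover the common case in one expression, while the full form stays available for other charsets or pad bytes.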

tjpalmer · 2026-04-27T22:01:53Z

+            while (target.hasRemaining()) {
+                target.put(padByte);
+            }
+            return written;


Returning the number of bytes written means the caller has to compare it with the number of bytes wanted to know whether everything fit. Maybe different overloads that report different things would be useful?

tjpalmer · 2026-04-27T22:12:04Z

+        CharsetDecoder decoder = StandardCharsets.ISO_8859_1.newDecoder();
+        String result = Core.decodeFromSlice(buffer, 0, 1, decoder);
+        assertEquals("£", result, "Should decode correctly using Latin-1");
+    }


At first, I only got UTF-8 tests, so I requested others.

tjpalmer · 2026-04-27T22:12:42Z

+        String text = "✨😀";
+        ByteBuffer buffer = ByteBuffer.allocate(10);
+        // Slice of 5 bytes at offset 0
+        // Result: "✨" (3 bytes) fits, "😀" (4 bytes) fails, 2 bytes padding


I also specifically requested a case for a partially fitting multibyte char.
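For readers who want to see the partial-fit behavior in isolation, here is a small sketch using only the standard `CharsetEncoder` API. `PartialFitSketch` and `encodeWhatFits` are hypothetical names for this example and are not part of the PR; the string is written with Unicode escapes so the source file encoding does not matter.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

class PartialFitSketch {
    // Encodes as many whole characters as fit between the buffer's position and limit.
    // Returns true only if the entire string fit.
    static boolean encodeWhatFits(String text, ByteBuffer slice) {
        CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder();
        CoderResult result = encoder.encode(CharBuffer.wrap(text), slice, true);
        return result.isUnderflow(); // overflow means the output ran out of room
    }

    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.allocate(10);
        buffer.limit(5); // 5-byte slice at offset 0
        // U+2728 (3 bytes in UTF-8) followed by U+1F600 (4 bytes in UTF-8).
        boolean fit = encodeWhatFits("\u2728\uD83D\uDE00", buffer);
        System.out.println("fit=" + fit + ", bytes used=" + buffer.position());
    }
}
```

The encoder stops before the emoji rather than writing a fragment of it, so the slice holds the three bytes of the first character and the remaining two bytes are left for padding, as the test comment describes.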

tjpalmer · 2026-04-27T22:14:28Z

+        byte[] data = buffer.array();
+        assertEquals((byte)0xA3, data[0], "Latin-1 encoding for £");
+        assertEquals((byte)'?', data[1], "Replacement char for unmappable emoji");
+        assertEquals((byte)'.', data[2], "Padding");


I guess all this demonstrates why "how many bytes got used" and "did the entire string fit" are both potentially interesting questions, depending on the use case. I don't know how to return both without allocation. Or maybe someone could pass in an object to receive the info, which could be reused throughout a loop.

But what I have so far is likely good enough for my current needs.
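One possible shape for the reusable receiver idea floated above is sketched here. Everything in it (`ReusableResultSketch`, the nested `Result` class, and this variant of `encodeToSlice`) is hypothetical and written for illustration only. It avoids allocating a result per call, but it still creates short-lived buffer views internally, so it is not allocation-free overall.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

class ReusableResultSketch {
    // Mutable receiver the caller allocates once and reuses across calls.
    static final class Result {
        int bytesWritten;
        boolean complete;
    }

    // Hypothetical variant that reports both answers through `out`.
    static void encodeToSlice(String s, ByteBuffer target, int start, int length,
            CharsetEncoder encoder, byte padByte, Result out) {
        ByteBuffer slice = target.duplicate();
        slice.limit(start + length);
        slice.position(start);
        encoder.reset();
        CoderResult result = encoder.encode(CharBuffer.wrap(s), slice, true);
        if (result.isUnderflow()) {
            result = encoder.flush(slice);
        }
        out.bytesWritten = slice.position() - start;
        // False on overflow, and also on coding errors when the encoder reports them.
        out.complete = result.isUnderflow();
        while (slice.hasRemaining()) {
            slice.put(padByte);
        }
    }

    public static void main(String[] args) {
        String[] names = {"hi", "\u2728\uD83D\uDE00"};
        int fieldLength = 5;
        ByteBuffer records = ByteBuffer.allocate(names.length * fieldLength);
        CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder();
        Result result = new Result(); // one receiver for the whole loop
        for (int i = 0; i < names.length; i++) {
            encodeToSlice(names[i], records, i * fieldLength, fieldLength, encoder, (byte) '.', result);
            System.out.println(result.bytesWritten + " bytes, complete=" + result.complete);
        }
    }
}
```

Note that with an encoder configured to replace unmappable characters, `complete` would still be true for a lossy but fully written string, so a caller who cares about lossiness would need a stricter encoder.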

tjpalmer · 2026-04-27T22:15:04Z

+        <groupId>org.apache.maven.plugins</groupId>
+        <artifactId>maven-surefire-plugin</artifactId>
+        <version>3.0.0</version>
+      </plugin>


I used the same versions that we generate into temper-built java projects.

tjpalmer · 2026-04-27T22:56:55Z

+
+tasks.named("check") {
+    dependsOn tasks.named("testJavaTemperCore")
+}


Test results here.

[INFO] Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.144 s - in temper.core.SliceCoderTest

tjpalmer added 6 commits April 27, 2026 09:18

Add slice string coder helpers

db67a9e

Signed-off-by: Tom <tom@temper.systems>

Default to utf8; test others

68cd5cf

Signed-off-by: Tom <tom@temper.systems>

Test java temper-core

19f646a

Signed-off-by: Tom <tom@temper.systems>

Use double quotes

4a7c3aa

Signed-off-by: Tom <tom@temper.systems>

Fix test registration

0d0c010

Signed-off-by: Tom <tom@temper.systems>

Avoid download progress logging

637b55c

Signed-off-by: Tom <tom@temper.systems>

tjpalmer marked this pull request as ready for review April 27, 2026 22:58

tjpalmer wants to merge 6 commits into main from struct-encode-string
