v1.0.3: Fix critical DMA bugs in crt0 startup code #105
Open
CTalkobt wants to merge 37 commits into
The crt0 startup code uses STZ to clear DMA registers ($D700, $D703), but the Z register is not guaranteed to be zero at program entry. If Z holds a non-zero value from the prior context, the DMA list address and trigger register receive incorrect values, causing silent corruption during ZP save/restore.
Fix: add `ldz #0` before the first DMA invocation in both crt0.s (stack convention) and crt0_zp.s (ZP convention).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The DMA list address was written to the wrong F018B registers:
- MSB was going to $D702 (bank register)
- $D703 was being cleared (control register, should not be used as trigger)
- LSB was going to $D701 (MSB register)
- $D700 was cleared to 0 (triggering DMA with wrong list address)
Fixed to use the correct F018B sequence:
1. $D703 ← EN018B=1 (enable F018B enhanced DMA list format)
2. $D702 ← 0 (bank)
3. $D701 ← MSB of DMA list address
4. $D700 ← LSB of DMA list address (triggers DMA)
Applied to both crt0.s (stack convention) and crt0_zp.s (ZP convention).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
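
The corrected write order can be modeled on a host machine in C. This is only a sketch of the four-step sequence described in the commit message: `dma_start` and the 4-byte `regs` window standing in for $D700-$D703 are hypothetical names, not code taken from crt0.s.

```c
#include <stdint.h>

/* Register offsets relative to $D700, as named in the commit message. */
enum {
    DMA_ADDR_LSB  = 0,  /* $D700: list address LSB, writing it triggers the DMA */
    DMA_ADDR_MSB  = 1,  /* $D701: list address MSB */
    DMA_ADDR_BANK = 2,  /* $D702: list address bank */
    DMA_EN018B    = 3   /* $D703: EN018B enable */
};

/* Hypothetical helper: regs points at a 4-byte window modeling $D700-$D703. */
void dma_start(volatile uint8_t *regs, uint16_t list_addr)
{
    regs[DMA_EN018B]    = 1;                           /* 1. enable F018B list format */
    regs[DMA_ADDR_BANK] = 0;                           /* 2. bank 0 */
    regs[DMA_ADDR_MSB]  = (uint8_t)(list_addr >> 8);   /* 3. MSB of list address */
    regs[DMA_ADDR_LSB]  = (uint8_t)(list_addr & 0xFF); /* 4. LSB last: the trigger write */
}
```

The point of the ordering is that the LSB write comes last, so the full list address is already in place when the transfer starts.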
The DMA lists are each one byte short of an F018B list (modulo is implemented as a …
…, optimizer BRK corruption
Four bugs caused the game_of_life clear_grid/step crash:
1. Proc parameter offset assignment reversed (AssemblerParser.cpp)
The proc directive assigned stack offsets in reverse iteration order, causing the first declared param to get the highest offset. With the compiler's right-to-left push, the first param is closest to SP (lowest offset). This swapped memset's dest/count params, filling 10832 bytes from $03E8 and overwriting the code segment. Fix: changed loop from reverse to forward iteration.
2. rtn #0 skips callee stack cleanup (AssemblerGenerator.cpp, AssemblerParser.cpp)
Hand-written stdlib functions use rtn #0 which emitted plain RTS, but the callee-cleanup convention requires RTS #N to pop params. Fix: rtn #N now auto-adds currentProc->totalParamSize.
3. memcpy return value offset wrong after PHA restore (memcpy.s)
After PLZ/STZ restored saved ZP bytes, the return value loading still used +2 offsets for the (now-popped) PHA saves. Fix: use __sp_base+_p_dest (no +2) for post-restore access.
4. Optimizer tail-dedup BRK corruption (cherry-pick of 8514aed from dev_v1.1)
Optimizer-created BRA/label/RTS statements had empty segmentName, causing the generator to skip them (segment filter mismatch), leaving BRK gap bytes. This corrupted step()'s PLZ frame cleanup.
Also: palette_fade Makefile now auto-builds lib as a dependency.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
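
Bug 1 comes down to which direction the offset loop runs. The sketch below models the forward assignment described above, where the first declared parameter gets the lowest offset. `assign_param_offsets` is a hypothetical name, and offsets are taken relative to the first parameter's slot; the real frame layout (return address, saved bytes) is not modeled.

```c
#include <stddef.h>

/* Assign stack offsets in declaration order: with right-to-left pushes the
 * first declared parameter is pushed last and so sits closest to SP.
 * Returns the total parameter size in bytes. */
size_t assign_param_offsets(const size_t *sizes, size_t count, size_t *offsets)
{
    size_t next = 0;
    for (size_t i = 0; i < count; i++) {  /* forward iteration (the fix) */
        offsets[i] = next;
        next += sizes[i];
    }
    return next;
}
```

Iterating in reverse over the same sizes would hand the first parameter the highest offset, which is the kind of swap that exchanged memset's dest and count.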
memset used `cpy $05` (compare Y with count_lo) to detect loop end, but Y tracks page offset while count decrements independently. When count reaches 0, Y != count_lo in most cases, so the loop overflows, writing ~65000 extra bytes and corrupting memory.
Fix: changed `cpy $05` to `ldx $05` to check if count_lo is zero (when count_hi is already zero). Applied to both stack and ZP variants.
Added examples/c/memset_screen: fills screen with each character value 0-255 in a loop, exercising memset with a 1000-byte fill.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
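
The corrected termination test can be illustrated with a C model of a fill loop that keeps its 16-bit count as two separate bytes. This is a sketch of the idea only, not a translation of the memset assembly; `fill_bytes` is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/* Fill dest with value, counting down a 16-bit count held as two bytes.
 * Returns the number of bytes written. */
size_t fill_bytes(uint8_t *dest, uint8_t value, uint8_t count_hi, uint8_t count_lo)
{
    size_t written = 0;
    /* The loop ends only when both halves are zero. Testing count_lo for
     * zero is meaningful once count_hi has already reached zero; comparing
     * it against an unrelated index is not. */
    while (count_hi != 0 || count_lo != 0) {
        dest[written++] = value;
        if (count_lo == 0) {
            count_hi--;          /* borrow from the high byte */
        }
        count_lo--;              /* wraps from 0 to 255 after a borrow */
    }
    return written;
}
```

With count_hi = $03 and count_lo = $E8 (1000 bytes, the size used by the memset_screen example) the model writes exactly 1000 bytes and stops.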
RTS #N ($62 nn) is unreliable on some 45GS02 hardware, causing stack leaks that corrupt return addresses after repeated function calls.
Changed calling convention from callee-cleanup (RTS #N) to caller-cleanup (PLZ x N after JSR):
- IRCodeGen.cpp: emit PLZ instructions after each stack-convention call to pop argument bytes, instead of relying on callee's RTS #N
- AssemblerGenerator.cpp: endproc now always emits plain RTS ($60); reverted rtn auto-add of procParamSize
- AssemblerParser.cpp: endproc sizing always 1 byte; reverted rtn sizing changes
- memcpy.s: use plain rts instead of rtn #0
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add AZ-pair frame load/store ops (ldaz.fp, staz.fp) that avoid ZP scratch usage. IRCodeGen now loads I16 hi byte into Z for frame destinations. stax.fp rewritten to use pha/txa/taz/pla instead of ZP scratch.
NOTE: Multiple mmemu execution tests currently failing; I16 frame store changes need debugging before this is ready.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The previous staz.fp changes broke I16 frame stores by:
1. Loading hi byte into Z instead of X for frame-destined constants, but canSkipTransfer optimization left X unloaded even when b1!=b0
2. storeVreg checked if Z was "known" (any value) rather than whether Z actually held the hi byte
Fix: Add valueByte_[4] tracking to IRCodeGen that records which register was actually loaded with each value byte. CONST always loads X for I16 hi byte (standard AX convention). storeVreg checks valueByte_[1] to pick staz.fp (when Z holds hi) vs stax.fp (safe default).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Root cause: The linked crt0 uses F018B DMA to save/restore ZP ($08-$FF). The mmemu emulator's DMA handler executes the copy correctly but then terminates the program instead of resuming CPU execution, so _main() never runs.
The direct-compile path (cc45 → ca45) uses an inline startup stub with a loop-based ZP save (no DMA), which works fine in mmemu.
Fix:
- Add compile_direct_test() helper to test_mmemu.sh
- Switch all 5 mega65.h tests from compile_link_test to compile_direct_test
- Keyboard test: inline C reimplementation of key_pressed() since direct compile doesn't link the stdlib
- Align DMA job data in crt0.s as defensive measure for linked mode
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Restore linked compilation path for mega65.h hardware register tests so they exercise the full crt0 + stdlib pipeline. These tests will fail until mmemu#45 (F018B DMA CPU halt bug) is resolved. The compile_direct_test helper is retained for future use.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When a CONST I16 is immediately followed by STORE to a frame-allocated local, fuse into lda #lo / ldz #hi / staz.fp. This loads the hi byte into Z directly, avoiding the pha/txa/taz/pla transfer that stax.fp requires.
Also set valueByte_[0..1] in the STORE handler's store-forwarding path so non-CONST frame stores correctly identify A:X as the value source.
Saves ~5 bytes per constant frame initialization.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New AY-pair frame load/store ops for use when the Z register holds a value that must be preserved (e.g., loop counter, I32 byte 3):
lday.fp offset: load 16-bit from frame into A (lo) and Y (hi)
stay.fp offset: store A (lo) and Y (hi) to frame
Completes the register-pair frame access family:
ldax.fp/stax.fp: AX pair (standard, X→Z transfer in stax.fp)
lday.fp/stay.fp: AY pair (Z-preserving)
ldaz.fp/staz.fp: AZ pair (X-free, preferred for constants)
ldaxyz.fp/staxyz.fp: AXYZ quad (32-bit)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed 32-bit right shifts by 8, 16, or 24 bits now use byte shuffles with sign extension instead of looping through single-bit asr.32 ops.
Before: >> 16 emitted 16 iterations of asr.32 .AXYZ (~160 bytes)
After: >> 16 emits 10 instructions (~15 bytes) with sign extension
Sign extension uses CMP #$80; LDA #0; SBC #0 to produce $FF (negative) or $00 (positive) for the vacated high bytes.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
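
The byte-shuffle idea for a shift by 16 can be shown in C on a 4-byte little-endian value. This is a sketch of the technique, not the emitted instruction sequence; `asr32_by_16` is a hypothetical name, and the sign byte is computed with an ordinary test of bit 7 rather than the carry-based idiom named in the commit.

```c
#include <stdint.h>

/* Arithmetic shift right by 16 of a 32-bit value stored as four bytes,
 * b[0] = least significant, b[3] = most significant. */
void asr32_by_16(uint8_t b[4])
{
    /* Fill byte for the vacated high positions: $FF if negative, $00 otherwise. */
    uint8_t sign = (b[3] & 0x80) ? 0xFF : 0x00;

    b[0] = b[2];   /* old byte 2 becomes the new low byte */
    b[1] = b[3];   /* old byte 3 becomes the new byte 1 */
    b[2] = sign;
    b[3] = sign;
}
```

Shifts by 8 and 24 follow the same pattern with the bytes moved by one or three positions.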
Document that the O(N²×Z) expiry loop is a theoretical concern only: N rarely exceeds 200 vregs, Z is capped at 64 ZP slots, so worst case is ~12K iterations, which is microseconds in practice.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New tool and common library for creating, reading, and manipulating Commodore disk images:
Library (DiskImage base + 4 format implementations):
- D64: C64 1541, 35 tracks, variable sectors/track (170KB)
- D71: C128 1571, 70 tracks, double-sided D64 (340KB)
- D81: C65/MEGA65 1581, 80×40 uniform (800KB)
- D65: MEGA65 native, 162 tracks, double-sided D81 (1.6MB)
Common operations: format, add/remove/extract files, list directory, BAM management, PETSCII filename conversion, sector chain traversal.
CLI tool (disk45):
disk45 create <image> [-n name] [-i id]
disk45 list|info <image>
disk45 add <image> <file> [cbm_name]
disk45 extract <image> <cbm_name> <file>
disk45 remove <image> <cbm_name>
All four formats verified with round-trip data integrity tests.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
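
The capacities quoted for D64 and D81 follow from their geometry. The sketch below works them out, assuming the standard 1541 zone layout (21/19/18/17 sectors per track) and 256-byte sectors; the function names are hypothetical and are not part of the DiskImage API.

```c
#include <stdint.h>

/* Sectors on a given track of a 35-track 1541 disk (tracks numbered from 1). */
unsigned d64_sectors_on_track(unsigned track)
{
    if (track >= 1  && track <= 17) return 21;
    if (track >= 18 && track <= 24) return 19;
    if (track >= 25 && track <= 30) return 18;
    if (track >= 31 && track <= 35) return 17;
    return 0;  /* not a valid D64 track */
}

/* Total size of a plain D64 image in bytes. */
unsigned long d64_image_bytes(void)
{
    unsigned long sectors = 0;
    for (unsigned t = 1; t <= 35; t++) {
        sectors += d64_sectors_on_track(t);
    }
    return sectors * 256UL;
}

/* Total size of a D81 image: 80 tracks of 40 sectors each. */
unsigned long d81_image_bytes(void)
{
    return 80UL * 40UL * 256UL;
}
```

This gives 683 sectors (174,848 bytes) for a D64 and 819,200 bytes for a D81, consistent with the 170KB and 800KB figures in the list.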
GZ compression (transparent):
- Any disk image can be compressed: .d64.gz, .d81.gz, .d65.gz etc.
- Auto-detected on load (magic bytes 1F 8B), .gz extension on save
- Uses system zlib for inflate/deflate
- 819KB D81 → ~900 bytes when empty, proportional with content
ARK (Arkive) archive format:
- Uncompressed CBM file collection (29-byte directory entries)
- Full read/write/add/remove/extract support
- Block-aligned (254 bytes) data storage
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
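
Auto-detection by magic bytes needs only the first two bytes of the file. A minimal sketch, with `looks_like_gzip` as a hypothetical helper name rather than the library's actual function:

```c
#include <stddef.h>
#include <stdint.h>

/* Returns nonzero if the buffer starts with the gzip magic bytes 1F 8B. */
int looks_like_gzip(const uint8_t *data, size_t len)
{
    return len >= 2 && data[0] == 0x1F && data[1] == 0x8B;
}
```

Checking content rather than the file extension means a compressed image is still recognized if it has been renamed.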
Commodore ARC archive format with full decompression:
- Mode 0: Stored (uncompressed)
- Mode 1: Packed (RLE with configurable control byte)
- Mode 2: Squeezed (Huffman coding)
- Mode 3: Crunched (LZW 12-bit, Terry Welch algorithm)
- Mode 4: Squeezed + Packed (Huffman + RLE)
- Mode 5: Crunched one-pass (LZW with trailing checksum)
Write support uses stored mode (mode 0).
SDA (Self-Dissolving Archive) headers are detected and skipped automatically.
Format details based on Peter Schepers' ARC.TXT specification and the cbmconvert unarc.c reference (clean-room reimplementation).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Comprehensive documentation covering:
- All commands (create, list, info, add, extract, remove)
- All disk formats (D64, D71, D81, D65) with capacity/layout details
- Archive formats (ARK, ARC/SDA) with compression mode table
- GZ transparent compression
- PETSCII filename handling
- Makefile integration examples
- C++ library API reference
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Document disk45 in the main codebase reference: supported formats, commands, usage examples, and library API.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
LNX (Lynx) archive format:
- BASIC stub with "LYNX" signature + ASCII directory + 254-byte
block-aligned data
- Full read/write/add/remove/extract support
- Directory entries stored as ASCII (CR-terminated fields)
New dump command for disk images:
disk45 dump <image> # BAM/header hex dump
disk45 dump <image> <track> # all sectors on a track
disk45 dump <image> <track> <sec> # single sector hex dump
Output includes hex bytes + ASCII printable characters.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
LNX (Lynx) archive format:
- BASIC stub + ASCII header + 254-byte block-aligned file data
- Full read/write/add/remove/extract support
New commands:
disk45 rename <image> <old> <new> — rename file in directory
disk45 label <image> [-n name] [-i id] — change disk name/ID
disk45 validate <image> — check BAM consistency
Reports: cross-linked sectors, orphaned sectors, broken chains
disk45 bam <image> — visual sector allocation map
Shows per-track free/used with . and # characters
setDiskName/setDiskId added to all four disk format classes.
renameFile and validate added to DiskImage base class.
isSectorFree made public for BAM visualization.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When adding or extracting SEQ files, -p converts between ASCII and PETSCII text encoding:
disk45 add image.d81 readme.seq "README" -p # ASCII → PETSCII
disk45 extract image.d81 "README" out.seq -p # PETSCII → ASCII
Conversion handles:
- Case mapping: a-z ↔ $41-$5A, A-Z ↔ $C1-$DA
- Line endings: LF ↔ CR (strips \r for CRLF input)
- Special chars: ~ ↔ π, | ↔ bar, \ ↔ £
Round-trip verified: ASCII → PETSCII → ASCII preserves content.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
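
The case and line-ending rules above can be sketched per byte in C. This covers only the letter ranges and the LF/CR swap stated in the commit message; the special-character pairs and CRLF stripping are left out, and both function names are hypothetical.

```c
#include <stdint.h>

/* ASCII to PETSCII: a-z map to $41-$5A, A-Z map to $C1-$DA, LF maps to CR. */
uint8_t ascii_to_petscii(uint8_t c)
{
    if (c >= 'a' && c <= 'z') return (uint8_t)(c - 'a' + 0x41);
    if (c >= 'A' && c <= 'Z') return (uint8_t)(c - 'A' + 0xC1);
    if (c == '\n') return 0x0D;
    return c;  /* everything else passes through in this sketch */
}

/* PETSCII to ASCII: the inverse of the mapping above. */
uint8_t petscii_to_ascii(uint8_t c)
{
    if (c >= 0x41 && c <= 0x5A) return (uint8_t)(c - 0x41 + 'a');
    if (c >= 0xC1 && c <= 0xDA) return (uint8_t)(c - 0xC1 + 'A');
    if (c == 0x0D) return '\n';
    return c;
}
```

Because each rule pairs one range with exactly one other range, converting a letter or newline to PETSCII and back returns the original byte, which is the round-trip property the commit reports.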
Complete documentation refresh covering all 11 commands, LNX format, -p/--petscii flag, command summary table, and library API updates for renameFile/setDiskName/setDiskId/validate.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
disk45 lock <image> <name> Set bit 6 of file type (locked)
disk45 unlock <image> <name> Clear bit 6 (unlocked)
Locked files display with '<' suffix in directory listing (CBM convention).
lockFile() method added to DiskImage base class.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
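
Locking and unlocking are single-bit operations on the directory entry's file type byte. A minimal sketch of the bit manipulation, using hypothetical helper names rather than the actual lockFile() implementation:

```c
#include <stdint.h>

#define FILE_TYPE_LOCKED 0x40u  /* bit 6 of the file type byte */

/* Set the lock bit, leaving all other bits unchanged. */
uint8_t lock_file_type(uint8_t type)
{
    return (uint8_t)(type | FILE_TYPE_LOCKED);
}

/* Clear the lock bit, leaving all other bits unchanged. */
uint8_t unlock_file_type(uint8_t type)
{
    return (uint8_t)(type & ~FILE_TYPE_LOCKED);
}

/* Nonzero if the lock bit is set. */
int is_locked(uint8_t type)
{
    return (type & FILE_TYPE_LOCKED) != 0;
}
```

For example, a type byte of $82 becomes $C2 when locked and returns to $82 when unlocked.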
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adapted from SDCC's GCC 8.2 torture test subset (gte/ directory), filtered for cc45 compatibility (no float, long long, printf, malloc).
Current status: 39/480 compile, remainder blocked by:
- #106: type specifier combinations (long int, short int), ~15 tests
- #107: multi-variable declarations (int a, b, c), ~70 tests
- #108: implicit int return type / K&R functions, ~140 tests
- #109: anonymous/inline struct declarations, ~31 tests
Tests use testfwk.h adapter which maps abort() → $4000=0xFF and exit(0) → $4000=0xAA for mmemu validation.
adapt_tests.sh can re-generate from a fresh SDCC SVN checkout.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…tc.)
Accept standard C type specifier combinations where multiple keywords form a single type:
long int, short int, signed int, unsigned int, signed long, unsigned long, signed short, unsigned short, signed char, unsigned char
After matching LONG or SHORT, an optional trailing INT token is consumed. Applied to all type-parsing locations: variable declarations, function return types, parameters, casts, sizeof, alignof, va_arg, typedef, and function pointer signatures.
Fixes #106. Unblocks GTE torture tests using combined type specifiers.
GTE compile count: 39 → 40 (remainder blocked by #107/#108/#109).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Standard C library functions for program termination:
abort(): weak, calls __abort (BRK). Override for pre-abort hooks.
exit(): weak, calls __exit (ZP restore + RTS). Override for atexit.
_exit(): strong, always calls __exit directly. Non-overridable.
__abort: core abort implementation in crt0 (BRK instruction).
__exit: core exit implementation in crt0 (ZP restore + SP restore + RTS).
Design: users override the weak abort()/exit() for pre-termination hooks (cleanup, logging), then call __abort/__exit for actual termination.
Both stack and zpCall convention versions provided.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
testfwk.h now just includes <stdlib.h> and <string.h>. No macro overrides of abort/exit; tests use the real library functions.
GTE tests should be compiled with -c and linked with c45.lib to get abort() and exit() from the standard library.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Parse comma-separated declarators after the type specifier:
int a, b = 3, *c, d[4];
static int x = 1, y = 0;
unsigned char m = 0xAA, n = 0xBB;
Each additional declarator reuses the base type, qualifiers, and signedness. Pointer levels, array dims, and initializers are parsed independently per declarator. Multiple declarations are wrapped in a CompoundStatement.
Global multi-var declarations propagate isGlobal/isStatic/isExtern/isSigned flags to all declarators in the compound.
Fixes #107. GTE compile count: 54 → 72 (+18 tests unblocked).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Implicit int (C89):
foo() { return 42; } — return type defaults to int
static bar(int x) { ... } — works with storage class specifiers
K&R parameter declarations:
int add(a, b)
int a;
int b;
{ return a + b; }
Parameters in the K&R list default to int if no type declaration
follows. Type declarations after ')' update matching parameter
names with the declared type, pointer level, and qualifiers.
Also fixed: the top-level parser now consumes extern/static/inline
tokens before calling parseFunctionDeclaration() for the implicit-int
code path.
Fixes #108. GTE compile count: 72 → 80 (+8 tests unblocked).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Support inline struct/union definitions in all contexts:
struct { int x; int y; } point; — anonymous struct variable
struct RGB { int r,g,b; } color; — named struct + variable
typedef struct { int x; } Point; — typedef with anon struct
typedef struct Vec { int dx; } Vec; — typedef with named struct
void foo(struct { int a; } *p) { ... } — in parameters/returns/casts
Anonymous structs get auto-generated names (<anon_struct_N>).
Inline definitions are registered via pendingDefinitions and emitted
before the variable declaration that uses them.
Applied to all 9 type-parsing locations: variable declarations, function
return types, parameters, casts, sizeof, alignof, va_arg, typedef, and
function pointer signatures.
Also handles struct Name { ... } var; at both global and local scope
(previously only struct Name { ... }; was accepted).
Fixes #109. GTE compile count: 80 → 87 (+7 tests unblocked).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…hers
Parse GCC-style __attribute__((...)) in function declarations, variable declarations, and after struct definitions.
Recognized attributes (silently accepted):
- noinline: prevent function inlining (no-op, cc45 doesn't auto-inline aggressively)
- noclone: prevent function cloning (no-op, cc45 doesn't clone)
- packed: packed struct layout (already default in cc45)
Warned and skipped (#110-#114):
- noipa: no interprocedural analysis (cc45 has no IPA)
- aligned: alignment control (use _Alignas instead)
- mode: force type width (QI/HI/SI/byte)
- vector_size: SIMD vectors (not applicable to 8-bit)
- __may_alias__: type punning (cc45 has no TBAA)
Unknown attributes emit a warning and are skipped.
Both __attribute__ and __attribute forms accepted. Handles double-paren syntax ((attr)), comma-separated lists, and parenthesized arguments.
GTE compile count: 87 → 91 (+4 tests unblocked).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Compiler-level (parsed in parsePrimary):
__builtin_constant_p(x) — returns 1 for literals/constant expressions,
0 for variables/calls. Evaluates at parse time using AST node types.
__builtin_expect(x, v) — returns x (branch hint, no-op)
__builtin_trap() — maps to BRK via __abort
__builtin_unreachable() — no-op (UB if reached)
Library-level (individual source files in lib/stdlib/):
String: __builtin_memcpy, __builtin_memset, __builtin_memmove,
__builtin_memcmp, __builtin_strlen, __builtin_strcpy, __builtin_strcmp
— each wraps the corresponding stdlib function
Math: __builtin_abs, __builtin_labs — inline C
Bit ops: __builtin_ffs, __builtin_clz, __builtin_ctz, __builtin_popcount
— hand-written 45GS02 assembly
Other: __builtin_bswap16 (C), __builtin_trap (asm → __abort)
Both stack and zpCall convention versions provided.
Declarations added to <stdlib.h> and <string.h>.
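
For readers unfamiliar with the bit-operation builtins, the following portable C models show what popcount, ctz, and ffs compute. They are reference sketches only, not the hand-written 45GS02 assembly; the `ref_` names are hypothetical and the 16-bit operand width is an assumption.

```c
#include <stdint.h>

/* Number of set bits. */
int ref_popcount16(uint16_t x)
{
    int n = 0;
    while (x != 0) {
        x &= (uint16_t)(x - 1);  /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Number of trailing zero bits; returns 16 for x == 0 in this sketch
 * (GCC leaves that case undefined). */
int ref_ctz16(uint16_t x)
{
    int n = 0;
    if (x == 0) return 16;
    while ((x & 1u) == 0) {
        x >>= 1;
        n++;
    }
    return n;
}

/* One-based index of the lowest set bit, or 0 if no bit is set. */
int ref_ffs16(uint16_t x)
{
    return x == 0 ? 0 : ref_ctz16(x) + 1;
}
```

Models like these can serve as an oracle when checking an assembly implementation against a range of inputs.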
GTE compile count: 91 → 97 (+6 tests unblocked).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Unnamed function parameters (valid in prototypes):
void foo(int, char *);
int bar(int, int, ...);
Auto-generates internal names (__unnamed_N) for unnamed params.
Multi-variable struct/union member declarations:
struct Point { long p_x, p_y; };
struct RGB { unsigned char r, g, b; };
struct Mixed { int a, *b, c[3]; };
Each additional member after comma reuses the base type/qualifiers.
Supports pointer levels, array dims, and bitfield widths per member.
GTE compile count: 97 → 116 (+19 tests unblocked).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
__attribute__ now accepted in all standard GCC positions:
- Between return type and function name: void __attribute__((noinline)) foo()
- After struct/union keyword: struct __attribute__((packed)) S { }
- Before/after typedef alias: typedef int __attribute__((aligned)) T;
- After variable name: int x __attribute__((unused));
- After struct member type: struct { int __attribute__((packed)) x; }
- After struct def before variable: struct S { } __attribute__((packed)) var;
Implicit function declarations (C89):
- Calling undeclared functions now emits a warning instead of an error
- Function is assumed to return int with unspecified parameters
- Enables compilation of code that relies on C89 implicit declarations
GTE compile count: 116 → 202 (+86 tests, 42.1%).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ositions
__extension__ (GCC): Recognized as a token and silently skipped in expressions, declarations, and qualifier loops. No semantic effect; it just suppresses pedantic warnings in GCC, which cc45 doesn't need.
void* local declarations: void *p = ...; now accepted in local scope (void was missing from the statement-level type list that triggers parseVariableDeclaration). Also added void to parseVariableDeclaration's type matching.
GTE compile count: 202 → 208 (+6 tests, 43.3%).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
`$D700` was triggered with 0 instead of the actual list address. Corrected to: bank → `$D702`, MSB → `$D701`, LSB → `$D700` (trigger). Also explicitly enables EN018B mode via `$D703`.
`stz` clears without validation. Added `ldz #0` before first DMA invocation.
Both fixes applied to `crt0.s` (stack convention) and `crt0_zp.s` (ZP convention).
Test plan
`make test`)
🤖 Generated with Claude Code