Bug Description
When iterating over note entries inside a coredump ELF binary (ET_CORE type) using iter_notes(), the notes parser unconditionally decodes note type 3 as a process information note (NT_PRPSINFO), which expects a 124-byte struct payload (Elf_Prpsinfo).
However, type 3 inside a coredump note segment is only NT_PRPSINFO if the note name is 'CORE'. If the note name is 'GNU', type 3 represents a standard NT_GNU_BUILD_ID note containing a raw binary build UUID (typically 16 or 20 bytes long).
Because pyelftools ignores the note name and selects the enum map solely from the ELF type ET_CORE, it attempts to parse the short GNU Build ID payload (20 bytes, for example) as the 124-byte Elf_Prpsinfo structure. The parser reads past the end of the available data and raises a fatal exception:
elftools.common.exceptions.ELFParseError: expected 4, found 0
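To make the name/type dependency concrete, here is a minimal standard-library sketch that decodes a single 32-bit little-endian note record. It is not pyelftools code; the helper name `parse_note` is hypothetical and is used only for illustration. It shows that `n_type` alone is just the integer 3, and only the accompanying name tells a reader how to interpret it.

```python
import struct

def parse_note(buf, offset=0):
    """Decode one 32-bit little-endian ELF note record (hypothetical helper)."""
    # Note header: n_namesz, n_descsz, n_type (three 32-bit words).
    n_namesz, n_descsz, n_type = struct.unpack_from("<III", buf, offset)
    offset += 12
    # The name is NUL-terminated and padded to a 4-byte boundary.
    name = bytes(buf[offset:offset + n_namesz]).rstrip(b"\x00").decode("ascii")
    offset += (n_namesz + 3) & ~3
    # The descriptor follows the padded name.
    desc = bytes(buf[offset:offset + n_descsz])
    return name, n_type, desc

# Same note bytes as in the reproduction script: name 'GNU', type 3, 8-byte build ID.
note = struct.pack("<III", 4, 8, 3) + b"GNU\x00" + bytes.fromhex("deadc0dedeadc0de")
name, n_type, desc = parse_note(note)
print(name, n_type, desc.hex())  # prints: GNU 3 deadc0dedeadc0de
```

A note with the name 'CORE' and the same `n_type` value would carry an Elf_Prpsinfo payload instead, which is why the name has to be consulted before the type is given a symbolic meaning.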
Steps to Reproduce
Run the following minimal, self-contained Python script to reproduce the crash. It builds a valid 32-bit little-endian ARM coredump (ET_CORE type) containing a standard GNU Build ID note segment (type 3, name 'GNU'), and attempts to parse it using iter_notes():
import subprocess
from elftools.elf.elffile import ELFFile
from elftools.elf.segments import NoteSegment
# Construct a minimal, valid binary coredump containing a GNU Build ID note segment (type 3, name GNU)
# Using a 32-bit little-endian EM_ARM core header
data = bytearray()
# ELF Header (32-bit, little-endian)
data.extend(b"\x7fELF\x01\x01\x01\x00" + b"\x00" * 8) # e_ident
data.extend(b"\x04\x00") # e_type = ET_CORE (4)
data.extend(b"\x28\x00") # e_machine = EM_ARM (40)
data.extend(b"\x01\x00\x00\x00") # e_version = 1
data.extend(b"\x00\x00\x00\x00") # e_entry = 0
data.extend(b"\x34\x00\x00\x00") # e_phoff = 52
data.extend(b"\x00\x00\x00\x00") # e_shoff = 0
data.extend(b"\x00\x00\x00\x00") # e_flags = 0
data.extend(b"\x34\x00") # e_ehsize = 52
data.extend(b"\x20\x00") # e_phentsize = 32
data.extend(b"\x01\x00") # e_phnum = 1
data.extend(b"\x00\x00") # e_shentsize = 0
data.extend(b"\x00\x00") # e_shnum = 0
data.extend(b"\x00\x00") # e_shstrndx = 0
# Program Header (PT_NOTE Segment)
data.extend(b"\x04\x00\x00\x00") # p_type = PT_NOTE (4)
data.extend(b"\x54\x00\x00\x00") # p_offset = 84
data.extend(b"\x54\x00\x00\x00") # p_vaddr = 84
data.extend(b"\x54\x00\x00\x00") # p_paddr = 84
data.extend(b"\x18\x00\x00\x00") # p_filesz = 24
data.extend(b"\x18\x00\x00\x00") # p_memsz = 24
data.extend(b"\x00\x00\x00\x00") # p_flags = 0
data.extend(b"\x04\x00\x00\x00") # p_align = 4
# Note Header (n_namesz=4, n_descsz=8, n_type=3)
data.extend(b"\x04\x00\x00\x00") # n_namesz = 4
data.extend(b"\x08\x00\x00\x00") # n_descsz = 8
data.extend(b"\x03\x00\x00\x00") # n_type = 3
data.extend(b"GNU\x00") # n_name = 'GNU'
data.extend(b"\xde\xad\xc0\xde\xde\xad\xc0\xde") # n_desc = 8 bytes Build ID (deadc0dedeadc0de)
# Write the constructed ELF data to disk
elf_filename = 'mock_core.elf'
with open(elf_filename, 'wb') as f:
    f.write(data)
print(f"Written mock core ELF to '{elf_filename}' ({len(data)} bytes)")
# Execute 'readelf --notes' on the written file
print("\nExecuting 'readelf --notes %s':" % elf_filename)
result = subprocess.run(['readelf', '--notes', elf_filename], capture_output=True, text=True, check=True)
print("--- readelf output start ---")
print(result.stdout.strip())
print("--- readelf output end ---")
# Open the file from disk and parse with pyelftools
print("\nParsing with pyelftools:")
with open(elf_filename, 'rb') as f:
    elf = ELFFile(f)
    note_seg = next(seg for seg in elf.iter_segments() if isinstance(seg, NoteSegment))
    notes = list(note_seg.iter_notes())
assert len(notes) == 1, "Expected 1 note, got %d" % len(notes)
note = notes[0]
print("Parsed Note Type:", note['n_type'])
print("Parsed Note Name:", note['n_name'])
print("Parsed Note Desc:", note['n_desc'])
assert note['n_type'] == 'NT_GNU_BUILD_ID', "Expected NT_GNU_BUILD_ID, got %r" % note['n_type']
assert note['n_name'] == 'GNU', "Expected 'GNU', got %r" % note['n_name']
expected_desc = 'deadc0dedeadc0de'
assert note['n_desc'] == expected_desc, "Expected %r, got %r" % (expected_desc, note['n_desc'])
print("Success!")
Expected Behavior
iter_notes() should successfully parse the note segment. It should yield the GNU Build ID note entry with type NT_GNU_BUILD_ID and its descriptor payload ('deadc0dedeadc0de' in the script above).
Actual Behavior
iter_notes() crashes with ELFParseError:
  File ".../site-packages/elftools/elf/notes.py", line 45, in iter_notes
    note['n_desc'] = struct_parse(elffile.structs.Elf_Prpsinfo, elffile.stream, offset)
  File ".../site-packages/elftools/common/utils.py", line 45, in struct_parse
    raise ELFParseError(str(e))
elftools.common.exceptions.ELFParseError: expected 4, found 0
Root Cause Analysis
Inside elftools/elf/structs.py on lines 440-447:
self.Elf_Nhdr = Struct('Elf_Nhdr',
    self.Elf_word('n_namesz'),
    self.Elf_word('n_descsz'),
    Enum(self.Elf_word('n_type'),
        **(ENUM_NOTE_N_TYPE if e_type != "ET_CORE"
           else ENUM_CORE_NOTE_N_TYPE)),
)
The note header type enum mapping is switched unconditionally based on the binary's e_type. If e_type == "ET_CORE", it maps note type 3 to 'NT_PRPSINFO' instead of 'NT_GNU_BUILD_ID'.
This dates back to #147 ("Better support for core dumps"), which assumed that every note segment inside a coredump ELF has coredump-specific types, overlooking the fact that coredumps can also contain standard generic GNU note segments.
Suggested Fix Strategy
The enum mapping used to decode note types should not be selected solely from e_type. For coredump files, it should also take the note name (n_name) into account:
- If n_name == 'GNU', use ENUM_NOTE_N_TYPE to decode the type (resolving type 3 to NT_GNU_BUILD_ID).
- If n_name == 'CORE', use ENUM_CORE_NOTE_N_TYPE (resolving type 3 to NT_PRPSINFO).
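The two rules above can be sketched as a name-aware lookup. This is only an illustration under stated assumptions, not a patch against pyelftools: `decode_note_type`, `GNU_NOTE_TYPES`, and `CORE_NOTE_TYPES` are hypothetical names, the tables are small subsets of the real type values, and falling back to the core table for any non-'GNU' name in a coredump is an assumption that simply preserves the current behavior for those notes.

```python
# Illustrative subsets only; pyelftools' real ENUM_* tables are larger.
GNU_NOTE_TYPES = {
    1: "NT_GNU_ABI_TAG",
    2: "NT_GNU_HWCAP",
    3: "NT_GNU_BUILD_ID",
}
CORE_NOTE_TYPES = {
    1: "NT_PRSTATUS",
    2: "NT_FPREGSET",
    3: "NT_PRPSINFO",
    6: "NT_AUXV",
}

def decode_note_type(e_type, n_name, n_type):
    """Pick the enum table from both the ELF type and the note name (hypothetical)."""
    if e_type == "ET_CORE" and n_name != "GNU":
        # Assumption: non-GNU notes in a coredump keep the core-specific table.
        table = CORE_NOTE_TYPES
    else:
        table = GNU_NOTE_TYPES
    # Unknown values are returned as the raw integer.
    return table.get(n_type, n_type)

print(decode_note_type("ET_CORE", "GNU", 3))   # prints: NT_GNU_BUILD_ID
print(decode_note_type("ET_CORE", "CORE", 3))  # prints: NT_PRPSINFO
```

One consequence of this approach is that the type can no longer be resolved while parsing the fixed-size note header alone, since n_name is read after n_type; the symbolic name would have to be assigned once the name has been read.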
Footnote
Note that it is rare, but valid, for a coredump to directly contain a NT_GNU_BUILD_ID note. Linux coredumps will instead typically contain the first page of all memory-mapped ELF files, and tools like GDB will discover the build ID(s) from those snapshots. However, other custom coredump producers (particularly embedded) may not have any memory-mapped ELF headers, and will emit the NT_GNU_BUILD_ID note directly.