aarch64: Restrict address mappings in loader.rs#464
aarch64: Restrict address mappings in loader.rs#464Ivan-Velickovic merged 3 commits intoseL4:mainfrom
Conversation
midnightveil
left a comment
There was a problem hiding this comment.
Fine. Just some minor code cleanups to do.
| .expect("Could not find 'end' symbol"); | ||
|
|
||
| if Aarch64::lvl1_index(text_addr) != Aarch64::lvl1_index(end_addr) { | ||
| eprintln!("ERROR: We only map 1GiB, but elfloader paddr range covers multiple GiB"); |
There was a problem hiding this comment.
I think we tend to use panic!("INTERNAL: <message>") here.
I'd also reword this: specifically the issue is the microkit loader (not elfloader) crosses a GiB boundary [or really that it requires multiple level 1 page tables].
It likely makes sense to do this change to both AArch64 and Riscv64, no sense it being different for either.
There was a problem hiding this comment.
I am not familiar with the Riscv64 MMU tables, and I do not have access to a testing platform. Is there any volunteer to do the adaptation?
There was a problem hiding this comment.
That's fair. We're planning on changing a few things after these are good to go, anyway, and having to retest.
There was a problem hiding this comment.
thank you. I noticed the CI doesn’t build for riscv64 without uart_addr. I’ll guard it to fix the build before your changes.
| if let Ok((uart_addr, _)) = elf.find_symbol("uart_addr") { | ||
| let data = elf | ||
| .get_data(uart_addr, 8) | ||
| .expect("uart_addr not initialized"); | ||
| let uart_base = u64::from_le_bytes(data[0..8].try_into().unwrap()); |
There was a problem hiding this comment.
(Not something you need to fix, just making this comment so I can link to it later):
It'd be nice if the kernel exposed the virtual address of where it wants its UART so we can map that in and get this feature working again.
|
|
||
| let mut boot_lvl2_lower: [u8; PAGE_TABLE_SIZE] = [0; PAGE_TABLE_SIZE]; | ||
|
|
||
| let pt_entry = (boot_lvl2_lower_addr | 3).to_le_bytes(); |
There was a problem hiding this comment.
can we be consistent with the other examples and split out the meanings of the 3 flags into the individual bits?
There was a problem hiding this comment.
This this is a 2 bit marker for the 1GB L1 table entry here, not individual bits
There was a problem hiding this comment.
Ah, right, we do |3 in a bunch of places below. I'll change that at a later point.
There was a problem hiding this comment.
Thanks, better to handle it globally yes. I’ll add a comment since “3” means different things depending on the granule (page or table)
0b1d9ca to
12dd82c
Compare
| let pt_entry: u64 = (entry_idx as u64 + start_addr) | | ||
| (1 << 10) | // Access flag | ||
| (3 << 8) | // Sharable | ||
| (3 << 2) | // MT_NORMAL memory |
There was a problem hiding this comment.
Looking at this more. Did you intentionally make this 3 << 2? That would correspond to MT_NORMAL_NC, not MT_NORMAL.
There was a problem hiding this comment.
Argh no! That’s an early debugging trace… thanks for noticing.
There was a problem hiding this comment.
Fixed: Note that I also had to update #465 to flush the cache before jumping to the secondary core
9dbc0d7 to
b62c440
Compare
|
EDIT: no this was just a dirty build. rerunning it everything works in debug mode (for non-smp) log
|
Limit the 1:1 address mapping in the loader to cover only the necessary blocks using level-2 translation tables and with 2MB blocks. Don't map device memory at all, therefore, printf must not be used in the elfloader after the MMU is enabled. This prevents speculative accesses to device or secure memory which otherwise would cause faults or unwanted side-effects. Signed-off-by: Christian Bruel <christian.bruel@foss.st.com>
The previous commit restricts the loader to only mapping itself rather than the first 1GB of memory. This is fine for most platforms as the only other thing they need to access is the UART which is now also separately mapped in. The one exception is the BCM2711/Raspberry Pi 4B where the a spin table is used for SMP booting which is located in the first page of memory, so we need to explicitly map that in as well. This code is a bit of hack since it assumes the CPU release addrs are always in the first 2MB of memory but the only platform we have right now that uses spin tables is the Raspberry Pi 4B. So we can merge the previous patch, we add this so that we don't have to regress the Raspberry Pi 4B. Signed-off-by: Ivan Velickovic <i.velickovic@unsw.edu.au>
38a2907 to
2ca916e
Compare
|
Rebased on main + pushed a temporary fix for the Raspberry Pi 4B so we don't regress that platform with this change. Also changed the mapping code so that it still maps the loader as strongly ordered, we did this because we don't want to worry about flushing caches for this PR (without flushing caches we'd still have issues on Raspberry Pi 4B). |
Signed-off-by: Ivan Velickovic <i.velickovic@unsw.edu.au>
I should add that the reason I say 'temporary' is since @midnightveil has in-progress patches to re-write all of this loader mapping code to be easier to read/better structured so we'll revisit as part of that. |
|
I'm also going to note down here why this was necessary, as it's not completely obvious from the fact, as the 1GB mapping the loader used was already See this discussion: seL4/seL4_tools#244 (comment) The conclusion is that ARM seems to allow speculative reads of instructions even in References in ARM ARM DDI 0487L.B, under §D8.4.4 "Effects on instruction execution permissions and restrictions on instruction fetch":
In contrast, under §B2.10.2 "Device memory", it states:
|
|
Just noticed that the commit log was copied from the CAmKes tool loader and the sentence:
doesn't apply for Microkit elf loader. Could you please amend the committed log and remove this sentence ? sorry about this, thank you. |
Hello,
This is a rework of the existing support committed in elfloader:
seL4/seL4_tools#244
and discussed here:
https://lists.sel4.systems/hyperkitty/list/devel@sel4.systems/thread/DJNDO5CUQBKGIA4SQHXCNQ3T6SKZEY4A/
This patch limits the 1:1 address mapping in the loader to cover only the necessary blocks using level-2 translation tables with 2MB blocks. The UART MMIO address is now exposed as a global uart_addr variable.
This prevents speculative accesses to device or secure memory, which could otherwise cause faults or unwanted side effects.
This has been tested only on STM32MP2. Testers on other platforms would be greatly appreciated.
Thank you for your review.