Errors #41

Open
dabrahams wants to merge 49 commits into main from errors

@dabrahams
Collaborator

No description provided.

Member

@camio left a comment

Hopefully this submits the comments I made here. They may have gotten lost?

@camio
Member

camio commented Jan 28, 2026

Looks like you made a different PR? Here are my comments on the old one: #39 (review)

@dabrahams
Collaborator Author

> Looks like you made a different PR? Here are my comments on the old one: #39 (review)

Thanks; they were good. I mistakenly deleted the branch on the server, which closed the PR.

@RishabhRD
Contributor

I was wondering whether it would be worthwhile to offer some wisdom on a few common error-handling (mis)conceptions that are widespread in industry:

  1. try/catch just to log and rethrow.
  2. Enriching the error with "context". Rust's anyhow::Context is an example and thus is becoming common practice. Usually done by throwing a custom error type that wraps the error thrown by functions called by callee.
  3. Adding StackTrace to the error.

I have noticed the above patterns are common in "backend" applications (applications running on a server, interacting with a DB, etc.).

I know the points I mentioned have nothing to do with the correctness of the program and might be very contextual, but the reason for pointing this out is that I have noticed multiple people exchanging multiple theories/libraries around similar ideas (an example from r/rust) without even talking about design by contract. Maybe addressing this would help them.

Comment on lines +555 to +563
A useful middle ground is to describe reported errors at the module
level, e.g.

> Any `ThisModule` function that `throws` may report a
> `ThisModule.Error`.

A description like the one above does not preclude reporting other
errors, such as those thrown by a dependency like `Foundation`, but
calls attention to the error type introduced by `ThisModule`.

Comment

I like this middle ground. Any error that doesn't conform to the mentioned error protocol can usually still be handled by the same handling method; it just wouldn't be as helpful. E.g. if something doesn't conform to the LocalizedStringConvertible protocol as suggested by the method's documentation, we can still display the non-localized description to the user. In a robotics system, any non-conforming error can be regarded as a fatal subsystem failure, triggering the emergency stop procedure at the top level.

I think there would also be value in a language feature that adds a similar annotation for the module: "All throwing functions in module/scope/file X, unless otherwise specified, must throw something of a given type." Then we can express our intent more explicitly, e.g. with a required isFatalSubsystemFailure flag.
When writing a compiler, we generally want to throw Diagnostics, and it's useful to know whether a function can throw anything else or throws only the expected Diagnostics and isn't doing any sketchy stuff. With that information, we can wrap a throwing function in something that returns a partial AST and a set of diagnostics, without the need to rethrow any extra errors.
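
The wrapping described above might look roughly like the following minimal sketch. `Diagnostic`, `ParseResult`, `parseStatement`, and `parseAll` are hypothetical names invented for illustration, not part of any real compiler. Note that `parseAll` still has to be marked `throws`, because nothing tells the compiler that `parseStatement` reports only `Diagnostic`s.

```swift
/// A hypothetical diagnostic type; a real one would carry locations, severities, etc.
struct Diagnostic: Error, Equatable {
  var message: String
}

/// A partial AST (here just strings) plus the diagnostics collected along the way.
struct ParseResult {
  var ast: [String] = []
  var diagnostics: [Diagnostic] = []
}

/// Parses one statement, reporting a `Diagnostic` if it is empty.
func parseStatement(_ source: String) throws -> String {
  if source.isEmpty { throw Diagnostic(message: "empty statement") }
  return "stmt(\(source))"
}

/// Parses every statement, collecting `Diagnostic`s instead of stopping.
/// Any other kind of error is not caught here and propagates to the caller.
func parseAll(_ statements: [String]) throws -> ParseResult {
  var result = ParseResult()
  for s in statements {
    do {
      let node = try parseStatement(s)
      result.ast.append(node)
    } catch let d as Diagnostic {
      result.diagnostics.append(d)
    }
  }
  return result
}
```

With a guarantee that only `Diagnostic`s are thrown, `parseAll` could drop its own `throws` and always return a `ParseResult`.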

Collaborator Author

I don't understand what you're getting at with isFatalSubsystemFailure, but fatal errors are distinct from the recoverable kind we report by throwing. You don't want to unwind the stack if the condition is going to be fatal.

Also, is there a suggestion for the text here or are we just chatting?

Comment

Fatal means slightly different things in a robotics context than in a regular application. When an application experiences a bug, it is reasonable to trap and expect the user to restart the application if they want. However, in a robotics system, we may need to perform safety measures, such as lowering a motorized arm slowly to avoid falling and damaging components. Often, the emergency stop also doesn't just cut the power but e.g. keeps holding onto suction cups so the robot doesn't drop a 10kg glass window.

Also, fatal subsystem failures can likely happen due to broken sensors, failed communication with motors, and I think should be distinguished from our regular fatal failures caused by bugs/precondition violations. Subsystem failures can lead to the graceful termination of that specific subsystem, while the same controller may go on with finishing some remaining tasks of other subsystems.

(I don't have any suggestions to the text, just discussing.)

Collaborator Author

Well, this discussion put me off of trying to recommend emergency shutdown measures (which were part of our general philosophy), but your robotics example is excellent. I think you should post about it in that thread. The biggest problem that I have for the book is that there's no way in Swift to do emergency shutdown other than by a monitor process that runs the program as a subprocess (which is extremely limiting and not even available on some OSes like iOS). That said, I see no reason to perform unwinding in these cases; it seems to me the program should go directly to the emergency shutdown procedure before terminating.

Broken sensors aren't fatal to the program if it is going to continue, so I personally wouldn't use the word “fatal.” If that is the term of art in Robotics, so be it.


```swift
extension Array {
/// Exchanges the first and last elements.
```

Comment

If we change the condition count() == 1 to count() <= 1, we can easily make this function safe even if the precondition is turned off. I think this is orthogonal here, but still slightly distracting/weakening the argument.

Comment

Maybe it would be worth mentioning this example still though. What's the tradeoff between writing the two different conditions, and what is the ideal precondition of this function?

Collaborator Author

That is really not the point of the example, so I don't want to get into that here. However, I would welcome a better example that doesn't raise the concern (which occurred to me too).

Comment

A simple out of bounds access check for an element accessor could be sufficient to illustrate the problem. Maybe something that's not a subscript is even better, in case someone didn't know about subscripts.

Collaborator Author

We can't hold back from using fundamental language features. We do expect to have an appendix that gives an introduction to the Swift features we use.

What kind of element accessor would you suggest? I thought of middleElement but that's got such a weird precondition (that the length is odd)…

In most cases, the only acceptable behavior at that point is to
present an error report to the user and leave their data unchanged,
i.e. the program must provide the strong guarantee. That in turn
means—unless the data is all in a transactional database—a program

Comment

I wrote a transactional undo/redo framework that lets us compose commands much as we compose functions, ensuring transactional guarantees at every level. It may be useful for general programming tasks beyond implementing undo/redo in editors, as we often may not want to discard the changes to the whole document, just the changes made in a specific scope.

See an example at: https://github.com/tothambrus11/undonete-swift/blob/master/Tests/UndoneteTests/UndoneteTests.swift

Collaborator Author

@dabrahams Feb 2, 2026

Transactional guarantees don't compose, so using them at every level is incredibly inefficient. Is this a general remark or do you think the text should change somehow?

Comment

No need to change the text, just discussing.

My composite command system achieves composable transactionality, assuming all low-level commands are correctly implemented to be transactional with an undo, a redo and initial execution method.

Higher level commands are composed of the execution of lower level commands (which may be themselves composite). Once any command fails to execute, it throws an exception, and the composite command undoes all previously executed subcommands, so that the composite command either fully succeeds or leaves the state in the original state.

There is a bit of syntactic overhead over regular programming, but I think explicit undoable commands can be generally useful when making a copy of the original state is unfeasible while requiring transactionality, and it makes code much easier to reason about, as it takes care of a semantically correct unwinding.
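
As a rough illustration of the composite scheme described above (not the actual framework's API, which isn't reproduced here), a composite can undo its already-executed parts in reverse order when a later part throws. `Command` and `composite` are hypothetical names, and the sketch assumes each part leaves the state unchanged when its own `execute` throws.

```swift
/// A hypothetical undoable command over some `State`.
/// Assumption: if `execute` throws, it has left the state unchanged.
struct Command<State> {
  var execute: (inout State) throws -> Void
  var undo: (inout State) -> Void
}

/// Builds a command that runs `parts` in order. If one of them throws,
/// the parts that already ran are undone in reverse order and the error
/// is rethrown, so the composite either fully succeeds or has no effect.
func composite<State>(_ parts: [Command<State>]) -> Command<State> {
  Command(
    execute: { state in
      var done: [Command<State>] = []
      do {
        for part in parts {
          try part.execute(&state)
          done.append(part)
        }
      } catch {
        for part in done.reversed() { part.undo(&state) }
        throw error
      }
    },
    undo: { state in
      // Undoing a successful composite undoes every part, last first.
      for part in parts.reversed() { part.undo(&state) }
    })
}
```

Because a composite is itself a `Command`, composites nest, which is where both the convenience and the extra bookkeeping per level come from.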

Collaborator Author

Undoing is usually much less efficient and much more error prone and much much more code than rolling back to a snapshot. You just use a persistent data structure to represent snapshots and then small changes don't take much storage.

Comment

Thank you, those are great observations! Undoing is indeed pretty error prone. It's enough for one of the low-level components to be incorrectly implemented, and it would make all the commands that depend on it non-transactional. If the transactionality is too fine-grained, my model also adds extra layers of undo stacks, so it also adds a large memory overhead. Persistent data structures sound like they abstract over all of these bits; I will need to study them more.

Collaborator Author

All you have to do is use CoW at a reasonably fine-grained level, and you get a persistent data structure with mutable value semantics. “Persistent data structure” is just a fancy word for that. There are such things as “persistent arrays” that make it more fine-grained than “the whole array” and preserve the big-O of arrays, which might be worth looking into. However, beware the constant factors. I would venture that a CoW deque would beat this data structure in many cases.
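
The snapshot approach can be sketched in a few lines. This is a minimal sketch, assuming the state is built from copy-on-write value types such as `Array`; `transact` and `Document` are hypothetical names, not anything from the book or the standard library.

```swift
/// A hypothetical document type whose storage is copy-on-write.
struct Document: Equatable {
  var lines: [String] = []
}

/// Runs `body` on `state`. If `body` throws, `state` is restored to the
/// value it had on entry and the error is rethrown (the strong guarantee).
func transact<State>(
  _ state: inout State, _ body: (inout State) throws -> Void
) rethrows {
  // Copying a CoW value only shares its storage; elements are copied
  // later, if and when `body` first mutates the shared buffer.
  let snapshot = state
  do {
    try body(&state)
  } catch {
    state = snapshot  // Roll back to the snapshot.
    throw error
  }
}
```

With a plain `Array`, the first mutation inside `body` copies the whole buffer, which is why the granularity of the CoW pieces matters for large documents.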

dabrahams and others added 3 commits February 2, 2026 10:33
Co-authored-by: Rishabh Dwivedi <rishabhdwivedi17@gmail.com>
Co-authored-by: Ambrus Tóth <32463042+tothambrus11@users.noreply.github.com>
@dabrahams
Collaborator Author

dabrahams commented Feb 2, 2026

@RishabhRD I would need some suggestions of what specific wisdom to offer. I thought about all this and it seems mostly irrelevant to the core issues, so I didn't know what to say about it. If you want to wrap your errors with context information, go ahead; it can be useful… I guess the one piece of advice I'd give is to use a higher order function, e.g.

try withContext("What I'm doing right now") {
  // the code
}

but even that seems like I'd have to set up a lot of context in the text to even mention it.
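
For concreteness, such a higher-order function could be written roughly as follows. This is only a sketch: `withContext` and `ContextualError` are hypothetical, not an existing library API.

```swift
/// A hypothetical wrapper that pairs an error with a description of
/// what the program was doing when the error was reported.
struct ContextualError: Error {
  var context: String
  var underlying: any Error
}

/// Runs `body`, wrapping any error it throws together with `context`.
func withContext<T>(
  _ context: String, _ body: () throws -> T
) throws -> T {
  do {
    return try body()
  } catch {
    throw ContextualError(context: context, underlying: error)
  }
}
```

The context is stated once, at the point where it is known, and the wrapped code stays free of per-call wrapping logic.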

Comment on lines +55 to +57
[Perhaps the earliest use
](https://dl.acm.org/doi/10.1145/800028.808489) of the term “error
recovery” was in the domain of compilers, where the challenge, after

Member

I have no reason to believe that the term error recovery was initially used in compiler design. See, for example, this 1959 paper, which talks about error recovery (or failure recovery) in the context of hardware issues. A quick search in Google Scholar found the term used as early as 1937.

Collaborator Author

Your 1937 paper appears to be from 1971. It says so on the first page.
The other reference does not mention “error recovery,” and I used the word “perhaps.”
Hardware issues are beyond the scope of this chapter.
What would you suggest I do? That would be actionable feedback.

compromising security. If user data is quietly corrupted and
subsequently saved, the damage becomes permanent.

In any case, unless the program has no mutable state and no external

Member

Suggested change:
- In any case, unless the program has no mutable state and no external
+ In any case, unless the program lacks mutable state and external

Collaborator Author

@camio This suggestion causes me mental friction because of what I can best describe as the passiveness of the word “lacks.” The properties of having no mutable state or external effects have to be actively upheld in order to make this strategy work. I realize this is a subtle thing and could potentially be convinced that my reaction is misplaced.


Assertions are checked only in debug builds, compiling to nothing in
release builds, thereby encouraging liberal use of `assert` without
concern for slowing down release builds.

Member

Aside: Swift got the defaults wrong here. 1) I don't think preconditions and inline assertions are fundamentally different such that one is in release and one isn't 2) For assert I think there should be two variations (assert and always_assert), but I prefer Rust's spelling: assert and debug_assert.

Comment

Perhaps focusing on the desired properties in different contexts, rather than taking Swift's naming and compilation scheme as given, is more useful to the general audience. The precondition identifier also serves a documentation purpose, and it's a bit awkward to replace it with assert just to save performance in release builds.

It may be worth adding the explicit imaginary alternative precondition(condition, errorMessage, debugOnly=False), and disclose that different languages use different names and defaults with regards to tagging.

At the same time, it's useful to mention where to write assert and where to write precondition with regards to their semantic/documenting purpose.

Collaborator Author

@dabrahams Feb 6, 2026

@camio The rationale for pairing the name assert with the semantics it has goes the other way: if you're going to have a check that turns off in release builds, it should be named the same as checks with the same semantics from other languages. Precedent matters. Internally to the standard library there's _debugPrecondition which is used for precondition checks upon which safety does not depend (it also needs to be implemented differently from assert for… reasons). We decided against exposing something with that name to users because there was already going to be assert with identical semantics.

Now, you might take issue with the implication that self-checks for soundness can reasonably be disabled in release builds. That, however, is a critique of the way the chapter is written, not of Swift.

Collaborator Author

@tothambrus11 I thought the text was “focusing on the desired properties in different contexts.” If you have an idea how that could be strengthened, perhaps suggest an edit. Likewise for “it's useful to mention where to write assert and where to write precondition with regards to their semantic/documenting purpose.”

I was conscious of the awkwardness of using assert for a precondition check, but I absolutely don't want to add anything imaginary and write examples in terms of that. People should be able to test the examples. To that end, I spent a bunch of time building an implementation of preconditionUncheckedInRelease (and postconditionUncheckedInRelease) to put in the text and found that they required implementation tricks that would need to be explained. In all, it seemed very heavy for the amount of benefit it was bringing so I replaced them with the use of assert with a message that you see in the text.
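
To illustrate the pattern being discussed, an `assert` with a message can stand in for a precondition check that should cost nothing in release builds, while `precondition` stays checked in both debug and release (`-O`) builds. The function names below are hypothetical and only sketch the idea; they are not the book's examples.

```swift
extension Array {
  /// Returns the element at offset `i` from the start.
  ///
  /// - Precondition: `0 <= i && i < count`
  func element(atOffset i: Int) -> Element {
    // Checked in debug builds only; not evaluated in -O builds.
    assert(i >= 0 && i < count, "Precondition failed: offset out of bounds")
    return self[i]
  }

  /// Same contract, but the check stays on in release (-O) builds.
  func checkedElement(atOffset i: Int) -> Element {
    precondition(i >= 0 && i < count, "Precondition failed: offset out of bounds")
    return self[i]
  }
}
```

In this particular sketch `Array`'s own subscript still checks bounds, so memory safety never depends on the `assert`; the message only documents which contract was broken.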

code in an unfinished state):

1. Something your function uses has a precondition that you can't
be sure would be satisfied:

Member

I'm having a lot of trouble understanding these examples. I'll try to think of something that's more straightforward.

Collaborator Author

The first example is not great and I wanted to find something better. I'm surprised you had trouble with the 2nd one though.

In general, when a condition *C* is necessary for fulfilling your
postcondition, there are three possible choices:

1. You can make *C* a precondition of your function

Member

I think it is worth clarifying (unless this is covered later) that option 1 is to add a precondition D such that when D holds, C holds.

It is sometimes desirable to add a precondition that is a strict superset of C, e.g. when it simplifies the function's interface.

Collaborator Author

When you make D a precondition, you are making C a precondition. Isn't the value of simple contracts already sufficiently covered in the previous chapter?

Member

@sean-parent left a comment

Looking very good. My notes are relatively minor.

> has a bug.

In the interest of progressive disclosure, we didn't look closely at
the idea, because behind that simple word lies a chapter's worth of

Member

I would emphasize error in the quote, because even though you mention the
concept of errors above, it isn't clear whether error is what "that simple word" is
referencing.

Collaborator Author

Really? It's in italics in the sentence that introduces the quote!

Comment on lines +194 to +196
[^techniques]: Techniques for ensuring that restarting is seamless,
such as saving incremental backup files, are well-known, but outside
the scope of this book.

Member

Since persistent transactions are a way to achieve both the strong guarantee and
to ensure restarting is seamless, they are probably worth a longer reference in
the text. Naming them as "persistent transactions" also gives the reader something
to research.

Collaborator Author

I don't know what you have in mind. Please suggest a specific edit.

Comment on lines +666 to +667
proper `x.randomShuffle()` would, and is not guaranteed to
preserve the same randomness properties. Perhaps more

Member

We should state this more strongly than "not guaranteed." With quick or
introsort you get a high probability that the pivot element is near the center,
and the next pivot element is near either the 1/4 or 3/4 mark, and so on.

Collaborator Author

It's not plainly obvious to me how much those facts harm the randomness. If it's obvious to you, please suggest “highly unlikely to” or something as an edit and I'll accept it.

dabrahams and others added 2 commits February 2, 2026 16:27
Co-authored-by: David Sankel <camior@gmail.com>
Very useful; thanks @camio and @sean-parent!

Co-authored-by: Sean Parent <sean.parent@stlab.cc>
Co-authored-by: David Sankel <camior@gmail.com>
5 participants