In my experience, the interfaces between components (where most errors occur) are exactly where fragmented documentation fails. I implemented a hierarchical documentation system for my team that organizes knowledge as a conceptual tree, and the accuracy of code generation with AI assistants improved notably.
Formal verification tools and structured documentation are complementary: verification ensures algorithmic correctness while MECE documentation guarantees conceptual and contextual correctness. I wonder if AWS has experimented with structured documentation systems specifically for AI, or if this remains an area to explore.
"This is just a matter of discipline" is not very convincing, especially when the discipline involves long unpaid afterhours.
The examples I've seen in this report from AWS are mostly about one-shot events (helping teams get through important changes). It's good to see formal methods used in these cases, of course, but I'd really like to read stories about how sustained use of formal methods helps recoup the high cost of the initial investment as the actual system evolves alongside the model.
There are formal methods where the underlying model is mathematically sound. There are semi-formal methods where the underlying model is structured but not proven to be sound.
For example, in your case ("organizes knowledge") a formal method would be ologs from category theory. That method ensures that the concepts and their relationships in your knowledge base are mathematically sound.
When you want to transfer that knowledge base into a documentation system you might want to look for a mathematically sound model, but I'm afraid there is no such model, so what's left is a semi-formal method / a likely-correct model.
Right now I'm building such a likely-correct model for writing; contact me for more info.
[1] https://www.osequi.com/slides/formalism-correctness-cost/for...
I'd love to explore how your "likely-correct model for writing" might complement the MECE-based documentation system I've been developing. Since you're building a model focused on writing correctness, perhaps there's an opportunity to collaborate?
My approach addresses the organizational structure of documentation, while your work seems to focus on ensuring correctness at the content level. Together, these could potentially create a more comprehensive solution that combines hierarchical organization with formal correctness principles.
Would you be interested in discussing further? I'm curious to learn more about your model.
For example, in an e-commerce API:

- Level 1: Core domains (Products, Orders, Users)
- Level 2: Sub-domains (Product variants, Order fulfillment, User authentication)
- Level 3: Implementation details (Variant pricing logic, Fulfillment status transitions)
I'm currently using plain Markdown with a specific folder structure, making it friendly to version control and AI tools. The key innovation is the organization pattern rather than the tooling - ensuring each concept belongs in exactly one place (mutually exclusive) while collectively covering all needed context (collectively exhaustive).
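To make that concrete, a hypothetical slice of such a folder layout might look like this (illustrative names, not my actual repo):

```
docs/
├── products/
│   ├── products.md            # Level 1: core domain overview
│   └── variants/
│       ├── variants.md        # Level 2: sub-domain
│       └── pricing-logic.md   # Level 3: implementation detail
├── orders/
│   ├── orders.md
│   └── fulfillment/
│       ├── fulfillment.md
│       └── status-transitions.md
└── users/
    ├── users.md
    └── authentication/
        └── authentication.md
```

Each concept lives in exactly one file (mutually exclusive), and together the tree covers the whole API surface (collectively exhaustive).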
I should note that this is a work in progress - I've documented the vision and principles, but the implementation is still evolving. The early results are promising though: AI assistants navigating this conceptual tree find the context needed for specific coding tasks more effectively, significantly improving the quality of generated code.
MECE (Mutually Exclusive, Collectively Exhaustive) comes from management consulting and focuses on organizing concepts without overlap or gaps. Divio's system focuses on documentation types (tutorials, how-to, reference, explanation).
In the AI era, your question raises a fascinating point: if AI can dynamically adapt content to the user's needs, do we still need multiple presentation formats? I believe we do, but with a shift in approach.
With AI tools, we can maintain a single MECE-structured knowledge base (optimized for conceptual clarity and AI consumption) and then use AI to dynamically generate Divio-style presentations based on user needs. Rather than manually creating four different document types, we can have AI generate the appropriate format on demand.
In my experiments, I've found that a well-structured MECE knowledge base allows AI to generate much more accurate tutorials, how-tos, references, or explanations on demand. The AI adapts the presentation while drawing from a single source of truth.
This hybrid approach gives us the best of both worlds: conceptual clarity for AI consumption, and appropriate presentation for human needs - all while reducing the maintenance burden of multiple document versions.
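As a minimal sketch of what I mean (the file layout and names are hypothetical, and the actual LLM call is left out), the generation step can be as simple as gathering one MECE branch and wrapping it in a format instruction:

```python
# Minimal sketch, assuming a hypothetical docs/ tree: gather one MECE branch
# of the knowledge base and wrap it in an instruction for a Divio-style format.
from pathlib import Path

DIVIO_STYLES = {
    "tutorial": "a step-by-step lesson for a newcomer",
    "how-to": "a goal-oriented recipe for a practitioner",
    "reference": "a dry, complete technical description",
    "explanation": "a discussion of background and design rationale",
}

def build_prompt(kb_root: Path, topic: str, style: str) -> str:
    """Collect every Markdown file under the topic's subtree (its MECE branch)
    and build a prompt asking for the requested presentation format."""
    sources = sorted((kb_root / topic).rglob("*.md"))
    context = "\n\n".join(p.read_text() for p in sources)
    return (
        f"Using ONLY the context below, write {DIVIO_STYLES[style]} "
        f"about '{topic}'.\n\n--- CONTEXT ---\n{context}"
    )

# e.g. prompt = build_prompt(Path("docs"), "orders/fulfillment", "how-to"),
# then hand the prompt to whatever LLM client you use.
```

The point is that the four Divio documents become views generated on demand, while the MECE tree stays the single maintained artifact.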
(I'm currently just trying to get our team to write any documentation at all; once I've got that, I'll start trying to organize it along exactly these principles)
https://www.youtube.com/watch?v=tsSDvflzJbc
> Coding isn't Programming - Closing Keynote with Leslie Lamport - SCaLE 22x
I find it a bit surprising that TLA+ with PlusCal can be challenging to learn for your average software engineer. Could anyone with experience in P show an example of something that is difficult to express in TLA+ but significantly easier in P?
TLA+ requires you to think about problems very differently. It is very much not programming.
It took me a long time (and many false starts) to be able to think in TLA. To quote Lamport, the real challenge of TLA+ is learning to think abstractly.
To that point, I think PlusCal and the teaching materials around it do a disservice to TLA+. The familiar syntax hides the wildly different semantics.
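To make the gap concrete: a TLA+ spec is not a program you run but a state space you exhaust, an initial-state predicate plus a next-state relation plus invariants. A toy illustration of that mindset (my own sketch, not from Lamport's materials):

```python
# Toy illustration of the TLA+ mindset: a "spec" is Init + Next + an invariant,
# and checking it means exhaustively exploring every reachable state.
from collections import deque

# Two processes incrementing a shared counter non-atomically (read, then write).
# State: (counter, pc1, tmp1, pc2, tmp2), where pc is "read"/"write"/"done".
INIT = [(0, "read", None, "read", None)]

def next_states(s):
    counter, pc1, t1, pc2, t2 = s
    out = []
    if pc1 == "read":
        out.append((counter, "write", counter, pc2, t2))
    elif pc1 == "write":
        out.append((t1 + 1, "done", t1, pc2, t2))
    if pc2 == "read":
        out.append((counter, pc1, t1, "write", counter))
    elif pc2 == "write":
        out.append((t2 + 1, pc1, t1, "done", t2))
    return out

def invariant(s):  # when both processes are done, the counter must be 2
    counter, pc1, _, pc2, _ = s
    return not (pc1 == pc2 == "done") or counter == 2

# Breadth-first search of the whole state space, the way TLC explores a model.
seen, queue = set(INIT), deque(INIT)
while queue:
    s = queue.popleft()
    if not invariant(s):
        print("Invariant violated in state:", s)  # finds the lost update
    for t in next_states(s):
        if t not in seen:
            seen.add(t)
            queue.append(t)
```

Nothing here "executes" in program order; every interleaving is a first-class behavior, which is exactly the shift in thinking that trips people up.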
This issue appears to present an intrinsically unsolvable problem, implying that a formally verified system could still contain bugs due to potential issues in the verification software.
While this perspective doesn't necessarily render formal verification impractical, it does introduce certain caveats that, in my experience, are not frequently addressed in discussions about these systems.
I can't recall exactly, but I think CakeML (https://cakeml.org/jfp14.pdf) had a precursor Lisp prover, written in a verified Lisp, so it would be amenable to this approach.
EDIT: think it was this one https://www.schemeworkshop.org/2016/verified-lisp-2016-schem...
It’s like worrying about compiler bugs instead of semantic programming errors. If you are getting to the point where you have machine checked proofs, you are already doing great. It’s way more likely that you have a specification bug or modeling error than an error in the solver.
If you are worried about it, there are very compelling solutions: LCF-style theorem provers like HOL Light with tiny kernels. But for any real-world system, none of this matters; use any automated solver you like, and it will be fine.
The issue is writing the formal specification. Formal verification is, in a nutshell, just proving the equivalence of two different formal constructs: one that is called the spec and one that is called the system (or program). Writing specs is hard.
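A toy illustration of that split (my own example, not from the article): the spec states what must hold, the system states how, and verification checks one against the other.

```python
# Toy split between spec and system: the spec states WHAT must hold, the
# implementation states HOW. Checking one against the other (here by brute
# force over a small finite domain) is the verification step in miniature.
from itertools import product
from collections import Counter

def spec_sorted(inp, out):
    """Spec: output is ordered and is a permutation of the input."""
    ordered = all(a <= b for a, b in zip(out, out[1:]))
    permutation = Counter(out) == Counter(inp)
    return ordered and permutation

def system(inp):
    """System under verification: insertion sort."""
    out = []
    for x in inp:
        i = 0
        while i < len(out) and out[i] <= x:
            i += 1
        out.insert(i, x)
    return out

# Exhaustive check over all inputs of length < 5 drawn from {0, 1, 2}.
for n in range(5):
    for inp in product(range(3), repeat=n):
        assert spec_sorted(list(inp), system(list(inp)))
print("system satisfies spec on all checked inputs")
```

Even in this trivial case, notice that the spec needed two clauses (ordered AND a permutation); forget the second and a system that returns an empty list "verifies". That is the sense in which writing specs is the hard part.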
With ologs (a formal method for knowledge representation from category theory), the result is always mathematically sound, yet every olog is prefixed with "According to the author's world view ..."
On the other hand, truth, even if it's not absolute, is very expensive.
Lately AWS has been advocating a middle ground, lightweight formal methods, which are much cheaper than full formal methods yet deliver good-enough correctness for their business case.
In the same way, MIT CSAIL's Daniel Jackson advocates a semi-formal approach to designing likely-correct apps (Concept Design).
It seems there is a push for better correctness in software, without the aim of perfectionism.
What are those?
As far as I can tell, people don't deploy verified or high assurance systems without testing or without online monitoring and independent fail safes.
We call these things axioms. The verifier must be assumed to be correct.
For all the axioms in mathematics... we don't know if any of those axioms are actually fully true. They seem to be true because we haven't witnessed a case where they aren't. This is similar to how we verify programs with unit tests. For the verifier it's the same thing: you treat it like an axiom, and you "verify" it with tests.
Axioms are true by their own definition. Therefore, discovering an axiom to be false is a concept that is inherently illogical.
Discovering that a formal verification system produced an incorrect output due to a bug in its implementation is a perfectly well-defined concept and doesn't lead to any logical contradictions, unless you axiomatically define the output of formal verification systems to be necessarily correct.
I believe this definition doesn't make sense in the general case, considering the number of vectors that can introduce errors into software.
Who makes a definition? Are definitions intrinsic to reality? No. Definitions are made-up concepts from language, which is itself another arbitrary made-up thing. When something is defined to be an axiom, a human is making something up and arbitrarily assigning truth to that concept.
> I believe this definition doesn't make sense in the general case, considering the number of vectors that can introduce errors into software.
You’re making up a definition here. Which is the same as an axiom.
Your question is what verifies the verifier? But that question is endless because what verifies the verifier of the verifier? It’s like what proves the axiom? And what proves the proof of the axiom?
The verifier defines the universe of what is correct and what is incorrect. It is the axiom. It is an assumption at its core. You can pretty it up by calling it a more advanced thing like “definition” but in the end it is the exact same thing as assuming and making up a fact that something is true without any proof.
https://www.osequi.com/studies/list/list.html
The structure (ontology, taxonomy) is created with ologs, a formal method from category theory. The behavior (choreography) is created with a semi-formal implementation (XState) of a formal method (finite state machines).
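XState itself lives in TypeScript, but the underlying idea is just a transition table; roughly, in Python (an illustrative machine, not the one from the study):

```python
# Illustrative finite state machine for a list component's choreography:
# states and events form a transition table, and anything not in the table
# is an illegal transition, which is the correctness guarantee FSMs buy you.
TRANSITIONS = {
    ("idle", "FETCH"): "loading",
    ("loading", "RESOLVE"): "success",
    ("loading", "REJECT"): "error",
    ("error", "RETRY"): "loading",
}

def step(state: str, event: str) -> str:
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"illegal transition: {event} in state {state!r}")

state = "idle"
for event in ["FETCH", "REJECT", "RETRY", "RESOLVE"]:
    state = step(state, event)
print(state)  # -> success
```

Because the table is finite and explicit, it can be drawn as a statechart diagram that is the behavior, not just documentation of it.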
The user-facing aspect of the software is designed with Concept Design, a semi-formal method from MIT CSAIL.
Creating software with these methods is refreshing and fun. Maybe one day we can reach Tonsky's "Diagrams are code" vision.
Stack plus dynamic memory and other system resource modeling would need proper models, with memory-lifetime visualization in particular being an open research problem, and resource tracking being unsolved as well (Valgrind offers no flexible API, has license restrictions, is incomplete, and platforms other than Linux are even less complete, etc.).
Reads and writes conditioned on arithmetic have all the issues related to complex arithmetic, and I am unaware of (timing) models for speculation or instruction reordering, let alone for applied compiler optimizations.
The results align with your experience - structured conceptual foundations make development more intuitive and documentation more valuable for both humans and AI tools.
Since then, Amazon's hiring practices have only gotten worse. Can you invert a tree? Can you respond "tree" or "hash map" when you're asked what is the best data structure for a specific situation? Can you solve a riddle or code an ill-explained l33tcode problem? Are you sure you can parse HTML with regexes? You're Amazon material.
Did you pay attention to the lecture about formal proofs, TLA+, or Coq/Kami? That's great, but it won't help you get a job at Amazon.
The idea that formal proofs are used anywhere but the most obscure corners of AWS is laughable.
Although... it is a nice paper. Props to Amazon for supporting Ph.D.'s doing pure research that will never impact AWS' systems or processes.
* The average Amazon engineer is not expected to have awareness of CS fundamentals that go beyond LeetCode-y challenges
* The average Amazon engineer builds using tools which are each developed by a small core team who _are_ expected to know those fundamentals, and who package up those concepts to be usefully usable without full understanding
(I did 10 years on CDO, and my experience matches yours when interacting with my peers, but every time I interacted with the actual Builder Tools teams I was very aware that we lived in different worlds)
I find this a bit unsettling. There are dozens of great CS schools in the US. Even non-elite BSc programs in EU sometimes teach formal methods.
There are also some good introductory books now, e.g. [1]. Perhaps it's time to interview more on concepts and less on algorithmic tricks favored by LeetCode?
I doubt current undergrads can't go beyond LeetCode-like challenges.
[1] Formal Methods, An Appetizer. https://link.springer.com/book/10.1007/978-3-030-05156-3
My experience with recent CS grads is that it's easier to hire Art and Political Science grads and take the time to teach them programming and all its fundamentals. At least they won't argue with you when you tell them not to use regexes to parse HTML.
I will say that time to productivity with P is low, and, in my experience, it emphasizes practical applicability more so than traditional modeling languages like TLA+ (this point is perhaps somewhat unsubstantiated, but the specific tooling we used is still internal). That is to say, it is fairly easy to sell to management, and I can see momentum already starting to pick up internally. No Ph.D. in formal methods required.
And re: the hiring bar, I agree that the bulk of the distribution lies a bit further left than one would hope/imagine, but there is a long tail, and it only takes a handful of engineers to spearhead a project like this. For maintainability, we are betting on the approachable learning curve, but time will tell.
In your greenfield big data system, what were the state machines? What states could they be in?
Concrete example: the system sees its queue utilization is high, so it throttles incoming requests, but for too long; the queue looks super healthy on the next check, so the throttle is removed, but again it waits too long before checking, and now the utilization is too high once more.
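That kind of control-loop flapping is easy to reproduce in a back-of-the-envelope simulation (all numbers made up), and it's exactly the class of timing-dependent behavior a state-machine model lets you explore exhaustively instead of discovering in production:

```python
# Made-up simulation of the throttle oscillation described above: a single
# threshold plus a slow control loop flips the throttle on and off forever.
arrivals, service_rate, queue = 12, 10, 0
throttled = False

for tick in range(18):
    admitted = arrivals // 2 if throttled else arrivals
    queue = max(0, queue + admitted - service_rate)
    if tick % 3 == 0:            # control loop only checks every 3 ticks (too slow)
        throttled = queue > 8    # single threshold, no hysteresis
    print(f"t={tick:2d} queue={queue:3d} throttled={throttled}")

# A common fix is hysteresis: throttle above a high-water mark and only
# un-throttle below a separate low-water mark, so the controller can't flap.
```

In a model checker you'd express the check interval and thresholds as parameters and let the tool find the interleavings where the queue oscillates, rather than eyeballing one trace.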
> Teams across AWS that build some of its flagship products—from storage (e.g., Amazon S3, EBS), to databases (e.g., Amazon DynamoDB, MemoryDB, Aurora), to compute (e.g., EC2, IoT)—have been using P to reason about the correctness of their system designs.
Later it more specifically mentions these (I probably missed a few):
* S3 index
* S3 ShardStore
* Firecracker VMM
* Aurora DB engine commit protocol
* the RSA implementation on Graviton2
(EDIT: formatting)
GitHub: https://github.com/p-org/P