Two big problems in this document:
- it conflates data race protection with memory safety, and it does so inconsistently. Java and C# are mentioned as MSLs and yet they totally let you race. More fundamentally, data races aren’t the thing that attackers exploit except when those data races do lead to actual memory corruption (like use after free, double free, out of bounds, access to allocator metadata etc). So it’s more precise to not mention data races freedom as a requirement for memory safety, both because otherwise languages like Java and C# don’t meet the definition despite being included in the list and because data races in the presence of memory safety are not a big deal from a security standpoint.
- The document fails to mention to mention Fil-C. It would be understandable if it was mentioned with caveats (“new project”, “performance blah blah”) but not mentioning it at all is silly.
> More fundamentally, data races aren’t the thing that attackers exploit except when those data races do lead to actual memory corruption (like use after free, double free, out of bounds, access to allocator metadata etc).
This is absolutely not true. One of the classic data races is a non-atomic read-modify-write on shared state: for example, two transactions that each read an account balance, compute a new value, and write it back.
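A deterministic Python sketch of that lost-update interleaving (the account values are hypothetical; the two "transactions" are shown sequentially in the order two threads could actually interleave, so the bug reproduces every time):

```python
# Two transactions debit the same account; each performs a
# non-atomic read-modify-write. Shown sequentially in one
# possible thread interleaving so the lost update is deterministic.
balance = 1001

t1_read = balance          # T1 reads 1001 (wants to debit $1000)
t2_read = balance          # T2 reads 1001 (wants to debit $1), before T1 writes
balance = t1_read - 1000   # T1 writes 1
balance = t2_read - 1      # T2 writes 1000, clobbering T1's debit

print(balance)             # 1000: only the $1 debit survived; correct result is 0
```

No memory is corrupted anywhere in this program, yet the account state is wrong in an attacker-favorable way.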
This is a huge security vulnerability because it lets people double spend. Alice buys something for $1000 and something for $1, and instead of debiting her account by $1001 it debits it by $1, because the write for the second transaction clobbers the balance reduction from the first one.

Another common one is symbolic links. You check the target of a symbolic link and then access it, but between the check and the access the link changed, and now you're leaking secrets or overwriting privileged data.
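The symlink case has the classic check-then-use shape. A minimal Python sketch (the function name and the idea of a "config" path are hypothetical):

```python
import os

def read_config(path):
    # TOCTOU bug: the symlink check (time of check) and the open
    # (time of use) are separate steps. An attacker who can swap
    # `path` for a symlink between the two steps defeats the check.
    if os.path.islink(path):           # time of check
        raise PermissionError("refusing to follow symlink")
    with open(path) as f:              # time of use
        return f.read()
```

On POSIX systems the window can be closed by making the kernel do the check atomically with the open, e.g. `os.open(path, os.O_RDONLY | os.O_NOFOLLOW)`, which fails if the final path component is a symlink.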
Data races are serious vulnerabilities completely independent of memory safety.
That filesystem example is a TOCTOU race, not a data race, so it can happen in Rust or similar languages just the same. Although both TOCTOU races and data races are types of race condition, they are not the same thing. Many race conditions are an ordinary part of our lived experience. If you've ever thought "Oh, I need to buy more milk" and gone to the store, while meanwhile your house mate, partner, or colleague was also out buying more milk, then when you got back there was too much milk. Oops: that's a race condition, specifically a TOCTOU race. We checked the milk, then we purchased more milk, and meanwhile someone else changed how much milk was there.
Data races aren't like any real world experience. The way the machine actually works is too alien for us to get our heads around so we're provided with a grossly simplified "sequentially consistent" illusion when writing high level languages like C - in which things happen in some order. Data races are reality "bleeding through" if we don't follow the rules to preserve that illusion.
> Java and C# are mentioned as MSLs and yet they totally let you race.
In Java a data race means loss of sequential consistency. Humans generally don't understand programs which lack sequential consistency so a typical Java team probably can't debug the program, but the program still always has well defined behaviour - and chances are you don't want to debug the weird non-sequentially consistent behaviour anyway, you just want them to fix the data race.
In C# data races are not too dangerous for trivial objects which are valid for all bit patterns. If you race an integer k, well, now k is smashed, don't think too hard about the value of k, it does have some value but it won't go well for you to try to reason about the value. For a complex object like a hash table, it's Undefined Behaviour.
Meanwhile in C or C++ all data races are immediate UB, you lose, game over.
A definition of memory safety without data race freedom may be more precise but arguably less complete.
It is correct that data races in a garbage collected language are difficult to turn into exploits.
The problem is that data races in C and C++ do in fact get combined with other memory safety bugs into exploits.
A definition from first principles is still missing, but imagine it takes the form of "all memory access is free from UB". Then whether the pointer is in-bounds, or whether no thread is concurrently mutating the location seem to be quite similar constraints.
Rust does give ways to control concurrency, eg via expressing exclusive access through &mut reference. So there is also precedent that the same mechanisms can be used to ensure validity of reference (not dangling) as well as absence of concurrent access.
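As a hedged illustration of that discipline outside Rust, the racy debit from upthread becomes safe once a lock gives one thread exclusive access for the whole read-modify-write (Python; the `Account` class and amounts are hypothetical):

```python
import threading

class Account:
    def __init__(self, balance):
        self._balance = balance
        self._lock = threading.Lock()

    def debit(self, amount):
        # Holding the lock gives this thread exclusive access for the
        # entire read-modify-write, so concurrent debits cannot clobber
        # each other the way the non-atomic version can.
        with self._lock:
            self._balance = self._balance - amount

    def balance(self):
        with self._lock:
            return self._balance

acct = Account(1001)
threads = [threading.Thread(target=acct.debit, args=(a,)) for a in (1000, 1)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(acct.balance())  # 0: both debits applied, in either order
```

Rust gets the same guarantee statically: a `&mut` reference (or a held `MutexGuard`) proves exclusive access at compile time instead of at runtime.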
> because data races in the presence of memory safety are not a big deal from a security standpoint.
Note though that data races can make otherwise memory-safe programs not actually memory safe. See for example Go, where a race on a multiword value such as a slice or an interface can tear its pointer and length (or type) words and produce real memory corruption.
They're not going to mention a single-person experimental project that has 900 stars on GitHub.
This is meant to be a practical strategy that can be implemented nation-wide, without turning into another https://xkcd.com/2347
> They're not going to mention a single-person experimental project that has 900 stars on GitHub.
Seems like a bad way to pick technology.
They do mention things like TRACTOR. Fil-C is far ahead of any project under the TRACTOR umbrella.
> This is meant to be a practical strategy that can be implemented nation-wide, without turning into another https://xkcd.com/2347
The solution to that is funding the thing that is essential, rather than complaining that an essential thing is unfunded. The DOD could do that.
Preach it brother! :)
Hmm, I take it that the situation is that there are a number of vendors/providers/distros/repos who could be distributing your memory-safe builds, but are currently still distributing unsafe builds?
I wonder if an organization like the Tor project [1] would be more motivated to "officially" distribute a Fil-C build, being that security is the whole point of their product. (I'm talking just their "onion router" [2], not (necessarily) the whole browser.)
I could imagine that once some organizations start officially shipping Fil-C builds, adoption might accelerate.
Also, have you talked to the Ladybird browser people? They seemed to be taking an interest in Fil-C.
[1] https://www.torproject.org/
[2] https://gitlab.torproject.org/tpo/core/tor
> Memory safety is like the global warming of the software industry.
So it's an insidious long term issue that challenges our systems which reward short term thinking, and will slowly crush us, if we don't do anything about it?
I fully agree.
jart putting Carmack and Musk at the same level is a bit sad and revealing, no wonder the downvotes.
https://x.com/ID_AA_Carmack/status/1935353905149341968
Ah, so it looks like the Rust mole in the government survived the DOGE purges, I was wondering what happened to him ;)
It's not that there's some special interest group pushing this particular technology, it's just the best available solution so if you want to solve the problem this is how you do it.
> Out of the 58 in-the-wild zero-days discovered in 2021, 67% were memory safety vulnerabilities.
About where that number was twenty years previous.
The big difference is that twenty years ago, the enemy was script kiddies. Now it's competent teams funded by multiple nation-states.
It is also worth mentioning that not all memory safety vulns are exploitable or have a theoretical exploitation vector. Many these days are similar to theoretical crypto vulns in that "some day" the capability might be developed. It isn't just exploit mitigations but secure development practices that make it hard enough to where even theoretical exploitation isn't viable.
Why is Delphi/Object Pascal an MSL?
Why shouldn't it be?
What's stopping you from UAF or OOB array access in Delphi?
Delphi arrays are bounds checked. UAF is mitigated by having ref counted strings and such. Not fully safe but much safer than C and C++.
Usually a runtime or compile-time check. At least for OOB in Ada.
https://www.jdoodle.com/ia/1IgW
So, Perl with its tainted-data tracking mechanism is not considered safe. Weird.
Does it explicitly say so? I could not find "Perl" in the PDF. There are only examples of MSLs. Perl not making the example list does not mean it is not one. It is a non-exhaustive list.
A big thing missing is swapping out dependencies in unsafe languages for ones written in safe languages.
Usually there are only a couple places that actually deal with user controlled data, so switching to safe dependencies for things like making thumbnails for pdf files can be effective.
Edit: One more thing that was not mentioned is compiling unsafe code to WebAssembly, or sandboxing it in other ways.
Incremental replacement of critical dependencies also offers a practical migration path for large legacy codebases where complete rewrites are economically infeasible.
reducing security incidents for modern software developments
> MSLs such as Ada, C#, Delphi/Object Pascal, Go, Java, Python, Ruby, Rust, and Swift offer built-in protections against memory safety issues
They offer default protections that can be easily overridden in most of those languages. Some of them require you to use those overrides to implement common data structures.
> MSLs can prevent entire classes of vulnerabilities, such as buffer overflows, dangling pointers, and numerous other Common Weakness Enumeration (CWE) vulnerabilities.
If used a certain way.
> Android team made a strategic decision to prioritize MSLs, specifically Rust and Java, for all new development
Was that /all/ they did?
> Invest initially in training, tools, and refactoring. This investment can usually be offset by long-term savings through reduced downtime, fewer vulnerabilities, and enhanced developer efficiency.
That is an exceedingly dubious claim to make in general.