Two big problems in this document:
- it conflates data race protection with memory safety, and it does so inconsistently. Java and C# are mentioned as MSLs and yet they totally let you race. More fundamentally, data races aren’t the thing that attackers exploit except when those data races do lead to actual memory corruption (like use after free, double free, out of bounds, access to allocator metadata etc). So it’s more precise to not mention data races freedom as a requirement for memory safety, both because otherwise languages like Java and C# don’t meet the definition despite being included in the list and because data races in the presence of memory safety are not a big deal from a security standpoint.
- The document fails to mention to mention Fil-C. It would be understandable if it was mentioned with caveats (“new project”, “performance blah blah”) but not mentioning it at all is silly.
> More fundamentally, data races aren’t the thing that attackers exploit except when those data races do lead to actual memory corruption (like use after free, double free, out of bounds, access to allocator metadata etc).
This is absolutely not true. One of the classic data races is a non-atomic read-modify-write on shared state: for example, two transactions that each read an account balance, compute a new value, and write it back.
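A deterministic Python sketch of that lost-update interleaving (the account values are hypothetical; the two "transactions" are shown sequentially in the order two threads could actually interleave, so the bug reproduces every time):

```python
# Two transactions debit the same account; each performs a
# non-atomic read-modify-write. Shown sequentially in one
# possible thread interleaving so the lost update is deterministic.
balance = 1001

t1_read = balance          # T1 reads 1001 (wants to debit $1000)
t2_read = balance          # T2 reads 1001 (wants to debit $1), before T1 writes
balance = t1_read - 1000   # T1 writes 1
balance = t2_read - 1      # T2 writes 1000, clobbering T1's debit

print(balance)             # 1000: only the $1 debit survived; correct result is 0
```

No memory is corrupted anywhere in this program, yet the account state is wrong in an attacker-favorable way.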
This is a huge security vulnerability because it lets people double spend. Alice buys something for $1000 and something for $1, and instead of debiting her account by $1001 it debits it by $1, because the write for the second transaction clobbers the balance reduction from the first one.

Another common one is symbolic links. You check the target of a symbolic link and then access it, but between the check and the access the link changed, and now you're leaking secrets or overwriting privileged data.
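The symlink case has the classic check-then-use shape. A minimal Python sketch (the function name and the idea of a "config" path are hypothetical):

```python
import os

def read_config(path):
    # TOCTOU bug: the symlink check (time of check) and the open
    # (time of use) are separate steps. An attacker who can swap
    # `path` for a symlink between the two steps defeats the check.
    if os.path.islink(path):           # time of check
        raise PermissionError("refusing to follow symlink")
    with open(path) as f:              # time of use
        return f.read()
```

On POSIX systems the window can be closed by making the kernel do the check atomically with the open, e.g. `os.open(path, os.O_RDONLY | os.O_NOFOLLOW)`, which fails if the final path component is a symlink.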
Data races are serious vulnerabilities completely independent of memory safety.
That filesystem example is a TOCTOU race, not a data race, so it can happen in Rust or similar languages just the same. Although both TOCTOU races and data races are types of race condition, they are not the same thing. Many race conditions are an ordinary part of our lived experience. If you've ever thought "Oh, I need to buy more milk" and gone to the store, while meanwhile your house mate, partner, or colleague was also out buying more milk, then when you got back there was too much milk. Oops: that's a race condition, specifically a TOCTOU race. We checked the milk, then we purchased more milk, and meanwhile someone else changed how much milk was there.
Data races aren't like any real world experience. The way the machine actually works is too alien for us to get our heads around so we're provided with a grossly simplified "sequentially consistent" illusion when writing high level languages like C - in which things happen in some order. Data races are reality "bleeding through" if we don't follow the rules to preserve that illusion.
> Java and C# are mentioned as MSLs and yet they totally let you race.
In Java a data race means loss of sequential consistency. Humans generally don't understand programs which lack sequential consistency so a typical Java team probably can't debug the program, but the program still always has well defined behaviour - and chances are you don't want to debug the weird non-sequentially consistent behaviour anyway, you just want them to fix the data race.
In C# data races are not too dangerous for trivial objects which are valid for all bit patterns. If you race an integer k, well, now k is smashed, don't think too hard about the value of k, it does have some value but it won't go well for you to try to reason about the value. For a complex object like a hash table, it's Undefined Behaviour.
Meanwhile in C or C++ all data races are immediate UB, you lose, game over.
A definition of memory safety without data race freedom may be more precise but arguably less complete.
It is correct that data races in a garbage collected language are difficult to turn into exploits.
The problem is that data races in C and C++ do in fact get combined with other memory safety bugs into exploits.
A definition from first principles is still missing, but imagine it takes the form of "all memory access is free from UB". Then whether the pointer is in-bounds, or whether no thread is concurrently mutating the location seem to be quite similar constraints.
Rust does give ways to control concurrency, eg via expressing exclusive access through &mut reference. So there is also precedent that the same mechanisms can be used to ensure validity of reference (not dangling) as well as absence of concurrent access.
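As a hedged illustration of that discipline outside Rust, the racy debit from upthread becomes safe once a lock gives one thread exclusive access for the whole read-modify-write (Python; the `Account` class and amounts are hypothetical):

```python
import threading

class Account:
    def __init__(self, balance):
        self._balance = balance
        self._lock = threading.Lock()

    def debit(self, amount):
        # Holding the lock gives this thread exclusive access for the
        # entire read-modify-write, so concurrent debits cannot clobber
        # each other the way the non-atomic version can.
        with self._lock:
            self._balance = self._balance - amount

    def balance(self):
        with self._lock:
            return self._balance

acct = Account(1001)
threads = [threading.Thread(target=acct.debit, args=(a,)) for a in (1000, 1)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(acct.balance())  # 0: both debits applied, in either order
```

Rust gets the same guarantee statically: a `&mut` reference (or a held `MutexGuard`) proves exclusive access at compile time instead of at runtime.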
> because data races in the presence of memory safety are not a big deal from a security standpoint.
Note though that data races can make otherwise memory-safe programs not actually memory safe. See for example Go, where a race on a multiword value such as a slice or an interface can tear its pointer and length (or type) words and produce real memory corruption.
They're not going to mention a single-person experimental project that has 900 stars on GitHub.
This is meant to be a practical strategy that can be implemented nation-wide, without turning into another https://xkcd.com/2347
> They're not going to mention a single-person experimental project that has 900 stars on GitHub.
Seems like a bad way to pick technology.
They do mention things like TRACTOR. Fil-C is far ahead of any project under the TRACTOR umbrella.
> This is meant to be a practical strategy that can be implemented nation-wide, without turning into another https://xkcd.com/2347
The solution to that is funding the thing that is essential, rather than complaining that an essential thing is unfunded. The DOD could do that.
Preach it brother! :)
Hmm, I take it that the situation is that there are a number of vendors/providers/distros/repos who could be distributing your memory-safe builds, but are currently still distributing unsafe builds?
I wonder if an organization like the Tor project [1] would be more motivated to "officially" distribute a Fil-C build, being that security is the whole point of their product. (I'm talking just their "onion router" [2], not (necessarily) the whole browser.)
I could imagine that once some organizations start officially shipping Fil-C builds, adoption might accelerate.
Also, have you talked to the Ladybird browser people? They seemed to be taking an interest in Fil-C.
[1] https://www.torproject.org/
[2] https://gitlab.torproject.org/tpo/core/tor
> Memory safety is like the global warming of the software industry.
So it's an insidious long term issue that challenges our systems which reward short term thinking, and will slowly crush us, if we don't do anything about it?
I fully agree.
jart putting Carmack and Musk at the same level is a bit sad and revealing, no wonder the downvotes.
https://x.com/ID_AA_Carmack/status/1935353905149341968
Ah, so it looks like the Rust mole in the government survived the DOGE purges, I was wondering what happened to him ;)
It's not that there's some special interest group pushing this particular technology, it's just the best available solution so if you want to solve the problem this is how you do it.
> Out of the 58 in-the-wild zero-days discovered in 2021, 67% were memory safety vulnerabilities.
About where that number was twenty years previous.
The big difference is that twenty years ago, the enemy was script kiddies. Now it's competent teams funded by multiple nation-states.
It is also worth mentioning that not all memory safety vulns are exploitable or have a theoretical exploitation vector. Many these days are similar to theoretical crypto vulns in that "some day" the capability might be developed. It isn't just exploit mitigations but secure development practices that make it hard enough to where even theoretical exploitation isn't viable.
Why is Delphi/Object Pascal an MSL?
Why shouldn't it be?
What's stopping you from UAF or OOB array access in Delphi?
Delphi arrays are bounds checked. UAF is mitigated by having ref counted strings and such. Not fully safe but much safer than C and C++.
Usually a runtime or compile-time check. At least for OOB in Ada.
https://www.jdoodle.com/ia/1IgW
So, Perl with its tainted-data tracking mechanism is not considered safe. Weird.
Does it explicitly say so? I could not find "Perl" in the PDF. There are only examples of MSLs. Perl not making the example list does not mean it is not one. It is a non-exhaustive list.
A big thing missing is swapping out dependencies in unsafe languages for ones written in safe languages.
Usually there are only a couple places that actually deal with user controlled data, so switching to safe dependencies for things like making thumbnails for pdf files can be effective.
Edit: One more thing that was not mentioned is compiling unsafe code to WebAssembly, or sandboxing it in other ways.
Incremental replacement of critical dependencies also offers a practical migration path for large legacy codebases where complete rewrites are economically infeasible.
reducing security incidents for modern software developments
> MSLs such as Ada, C#, Delphi/Object Pascal, Go, Java, Python, Ruby, Rust, and Swift offer built-in protections against memory safety issues
They offer default protections that can be easily overridden in most of those languages. Some of them require you to use those overrides to implement common data structures.
> MSLs can prevent entire classes of vulnerabilities, such as buffer overflows, dangling pointers, and numerous other Common Weakness Enumeration (CWE) vulnerabilities.
If used a certain way.
> Android team made a strategic decision to prioritize MSLs, specifically Rust and Java, for all new development
Was that /all/ they did?
> Invest initially in training, tools, and refactoring. This investment can usually be offset by long-term savings through reduced downtime, fewer vulnerabilities, and enhanced developer efficiency.
That is an exceedingly dubious claim to make in general.