class: center, middle

# Three lies your operating system tells you about memory

???

- Thanks to Jen for organizing our Lunch & Learns!
- This topic will be most useful to C/C++ programmers, but hopefully it's an interesting peek under the hood for non-programmers as well.
- There's a lot of history here, which I'm going to mostly skip over. Some things I say won't always have been that way, and some things I call lies were once true. If we go down all those rabbit holes, this would be like a 6-hour talk.
- Feel free to stop me and ask questions

---

class: center, middle

# What do we ~~know~~ believe as programmers or users?

---

# Let's start from CS 101

.fit[]

???

In school we're taught (or at least I was, I'm old enough that we started in C) that there are two areas of memory, the _stack_ and the _heap_.

For non-programmers (or those who didn't learn it this way): the _stack_ is a scratch-pad-like area where the current function's local variables are stored until the function returns and they go out of scope.

- Like a stack of index cards: add a card when you call a new function, throw it out when the function is done.

Conversely, the heap is a nebulous area where things live as long as you need them to - either manually managed by the programmer, or managed for you by a garbage collector in languages like C#, Python, or JavaScript.

---

# Linear memory

.fit[]

- Memory is a linear collection of bytes
- Numbered from 0 to 2<sup>32</sup> or 2<sup>64</sup>
- Not all of the numbers are valid; your computer doesn't have that much RAM
- There's no such thing as memory at address 0 (aka `NULL` or `nullptr`)
- My program is given access to certain ranges for its stack and heap (and other sections)

???

As we advance as programmers, we learn that it's all one big area called "main memory"

- linear address space, with a valid range the size of physical RAM
  - on 32-bit systems, zero to 4 billion
  - on 64-bit systems, zero to 18 quintillion
- _something_ doles out portions of it to be the stack & heap
  - gives more as the heap needs to grow, until you're OOM
- other programs are getting portions of that RAM too
  - too many and we'll run out quickly

This is the mental model that many programmers end up keeping.

---

# Who gives you memory?

- OS Program Loader
  - Creates a stack area
  - Also loads code segments
  - And static, global, constant data

???

Let's dive into that "something" that doles out memory...

- Stack: the OS program loader
- Also gives the program memory for:
  - Code segments
  - Static and global variables
  - (often forgotten when teachers generalize to "stack" and "heap")

--

- The allocator
  - In charge of the heap
  - Function of the C library
  - Uses several ways to get memory from the OS (`mmap`, `sbrk` on Unix, `HeapCreate` on Windows)

???

- Heap: the allocator

As game programmers, you probably know more about the allocator than your average programmer.

- Default one is part of the C or C++ standard libs
- There are plenty of alternate allocators out there: jemalloc, dlmalloc, etc.
- Uses different ways to get memory from the OS

---

class: middle, center

# And now for a bunch of lies

---

# Lie 1: Memory is linear

???

Ok, the first lie is linear memory.

- Seems to be the one that more people know at least something about.
For example:

- For a long time, my mental model of Virtual Memory was what I think of as the "Windows 3.1" model - it's that disk space I tell Windows to use in addition to my physical RAM so that I can run this new game

--

name: lie1

### Virtual memory

- ***Physical addresses*** are what's used on the computer's memory bus
  - Set by the manufacturer (or firmware)
  - Full of holes and things that aren't RAM (like MMIO)

???

First let's talk about where your memory really is.

- It's all chips connected to the memory bus
  - bus addresses chosen by your computer's manufacturer
- But the memory bus is used for more than just memory.
  - memory-mapped I/O is the most common way to do I/O on modern machines

---

```
00 0000 0000 - 00 0005 8000 (352 KiB)  Conventional memory
00 0005 8000 - 00 0005 9000 (4 KiB)    Reserved by the firmware
00 0005 9000 - 00 0009 e000 (276 KiB)  Conventional memory
00 0009 e000 - 00 000a 0000 (8 KiB)    Reserved by the firmware
00 0010 0000 - 00 651e 4000 (1.6 GiB)  Conventional memory
00 651e 4000 - 00 6522 4000 (256 KiB)  EFI Boot Services data
00 6522 4000 - 00 72d7 a000 (219 MiB)  Conventional memory
00 72d7 a000 - 00 750f d000 (35 MiB)   EFI data
00 750f d000 - 00 750f e000 (4 KiB)    ACPI data
00 750f e000 - 00 7e5e b000 (148 MiB)  EFI data
00 7e5e b000 - 00 7e84 2000 (2 MiB)    Conventional memory
00 7e84 2000 - 00 7f19 e000 (9 MiB)    EFI code
00 7f19 e000 - 00 7f66 0000 (4 MiB)    Reserved by the firmware
00 7f66 0000 - 00 7f6a b000 (300 KiB)  ACPI data (reclaimable)
00 7f6a b000 - 00 7fac 5000 (4 MiB)    ACPI data
00 7fac 5000 - 00 8000 0000 (4 MiB)    EFI data & code
00 e000 0000 - 00 f000 0000 (256 MiB)  MMIO (PCI Express)
00 fe00 0000 - 00 fe01 1000 (68 KiB)   MMIO (??)
00 fec0 0000 - 00 fec0 1000 (4 KiB)    MMIO (??)
00 fed0 0000 - 00 fed0 1000 (4 KiB)    MMIO (HPET)
00 fee0 0000 - 00 fee0 1000 (4 KiB)    MMIO (CPU local APIC)
00 ff00 0000 - 01 0000 0000 (16 MiB)   MMIO (UART/Serial)
01 0000 0000 - 01 4000 0000 (1 GiB)    Conventional memory
01 4000 0000 - 01 4002 b000 (172 KiB)  EFI Bootloader code
01 4002 b000 - 08 7100 0000 (28.7 GiB) Conventional memory
```

???

This is the memory map I pulled from the firmware of a NUC I have (one of the ones that ran TL3 servers at conventions in 2018!).

We can see it's got a few contiguous regions of actual memory:

- 640 KiB starting at address 0
- 2 GiB (minus 1 MiB) starting 1 MiB into the address space
- 30 GiB (minus 240 MiB) starting 4 GiB into the address space

And between the 2 GiB and 30 GiB blocks we can see a range of addresses used for MMIO.

---

template: lie1

- ***Virtual Addresses*** are what we use in our programs
- Translated to physical addresses by the MMU in 4 KiB\* _pages_
- _Page Tables_ created by the OS define the virtual -> physical mapping
- Page Tables are (usually) unique to a process - every process thinks it has the entirety of the virtual address space to itself
- When accessing a mapping that doesn't exist, the CPU raises a _page fault_ to the OS

???

Ok, back to linear memory:

- So, the MMU translates virtual addresses into physical
  - Usually based on page tables - which are themselves in memory!
- This setup allows for _memory protection_
  - Page tables are swapped on process switch
  - Each process thinks it's the only one in memory - impossible to write to another process' memory
  - Page tables also allow the OS to specify R/W/X per-page
  - When violated, the CPU raises the dreaded _page fault_

Now the mechanism of the _page fault_ might give a hint about the next lie...
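
Before we move on - if a quick demo of "page tables are per-process" helps, here's a minimal POSIX-only sketch (the global `value` is just made up for this demo). After `fork()`, parent and child print the same virtual address for the same variable, yet the child's write never shows up in the parent: two sets of page tables, one virtual address.

```cpp
// Minimal POSIX-only demo: the same virtual address holds different values
// in two processes, because each process has its own page tables.
#include <cstdio>
#include <sys/wait.h>
#include <unistd.h>

static int value = 42;

int main() {
    pid_t pid = fork();
    if (pid == 0) {            // child: copy-on-write gives it its own physical page
        value = 1337;
        std::printf("child:  &value = %p, value = %d\n", static_cast<void *>(&value), value);
        return 0;
    }
    waitpid(pid, nullptr, 0);  // wait so the two lines don't interleave
    std::printf("parent: &value = %p, value = %d\n", static_cast<void *>(&value), value);
    return 0;
}
```

Both lines print the same `%p` address, but `42` in the parent and `1337` in the child.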
---

name: lie2

# Lie 2: The OS gave you some memory

```cpp
static constexpr size_t GiB = 0x4000'0000;

char *novel = new char [2 * GiB];
```

???

So, we talked about how your memory allocator gets memory from the OS. Let's dive into that a little more.

Let's say I'm allocating a text buffer big enough to write my new novel.

What is the value of `novel` here? (It's a pointer. To what?)

--

name: lie21

- The OS doesn't allocate any pages, only a _virtual address_ range
- Writing to the memory address in `novel` causes a page fault

???

So if the OS didn't allocate any actual memory, what happens when I try to start writing my novel?

There's no physical page mapped to the address in `novel`, so boom, I get a page fault and everything explodes, right?

--

.center[]

???

This is, actually, fine.

---

template: lie21

- The OS catches the page fault and allocates a new page to that address
- The CPU jumps back to the failed instruction and tries again

???

This is how your CPU's page faults are intended to be used. Broadly, there are two kinds of CPU errors when running code: ones the OS can't do anything useful about, like _division by zero_ or an _unknown opcode_, and _faults_, which are things the OS can try to fix.

So the CPU issues the fault to your OS, then tries the instruction again and _et voilà_, the page is there this time and everything goes on like nothing happened... until you hit the next page and we start again.

Maybe you can see how this all ties in to the last lie...

---

layout: true

# Lie 3: You aren't out of memory

---

name: lie31

```cpp
static constexpr size_t GiB = 0x4000'0000;

char *novel = new char [2 * GiB];

if (novel == nullptr)
    std::cout << "novel is null!" << std::endl;
else
    std::cout << "novel is not null!" << std::endl;
```

What does this program print when your OS reports that you have 1 GiB of RAM free?

???

I don't know about you, but I had it drilled into me early in my days of learning C that you always, always check the return value of allocating memory because it will return NULL when out of memory.

And you should, there are plenty of reasons you might get a NULL value back - but being out of memory is pretty much never one of them.

---

## Swapping Pages

.fit[]

- The most well-known part of this lie
- Write the contents of a page to disk, then give that page to another process
- On many platforms the CPU happily helps out by setting "accessed" or "dirty" flags

???

You're probably all aware of "paging" or "swapping". This goes hand-in-hand with the "Windows 3.1" mental model of virtual memory: you run out of RAM and suddenly your hard disk starts grinding as the OS swaps pages from memory to disk to make room.

---

template: lie31

- What does "free" even mean? How do you count what's _NOT_ free?

1. Allocated by `malloc()` or `new`?
2. Committed to a process through a page fault?
3. Ready to be replaced in swap?

???

So given all that, what does "free" even mean?

On top of this list, your OS will also make use of physical pages that aren't currently being used as a cache to speed up things like file I/O. Are those cache pages "free" or "in use"?

The discrepancy between #1 and #2 here points us at a more subtle part of this lie...
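
One way to watch that discrepancy between #1 and #2 live - a minimal Linux-only sketch (the `vm_rss` helper is just for this demo, and it assumes a couple of spare GiB of RAM or swap). The big `new[]` barely moves the resident set size; it's only when the pages are actually touched that they get committed, one page fault at a time.

```cpp
// Linux-only sketch: a large new[] only reserves virtual address space;
// physical pages get committed as we touch them. Resident set size is
// read from /proc/self/status (the VmRSS line).
#include <cstddef>
#include <cstring>
#include <fstream>
#include <iostream>
#include <string>

static std::string vm_rss() {
    std::ifstream status("/proc/self/status");
    for (std::string line; std::getline(status, line);)
        if (line.rfind("VmRSS:", 0) == 0) return line;
    return "VmRSS: ?";
}

int main() {
    constexpr std::size_t GiB = 0x4000'0000;
    std::cout << "before new:   " << vm_rss() << '\n';

    char *novel = new char[2 * GiB];    // a virtual address range, nothing more
    std::cout << "after new:    " << vm_rss() << '\n';

    std::memset(novel, 'x', 2 * GiB);   // now every page faults in
    std::cout << "after writes: " << vm_rss() << '\n';

    delete[] novel;
    return 0;
}
```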
---

## Memory Overcommit

- Processes generally don't use all the memory they allocate right away, if ever
- This lets modern OSes "overcommit" memory - handing out more allocations than the amount of physical RAM
- The bet is that physical RAM usage won't actually go above the physical limit
- The OS can rely on swap to cover the gap if it does

???

Memory overcommitment is the shell game that the OS is playing when it comes to managing physical RAM.

Usually this is a good thing and lets you make more use of your system's resources. The problems only start when you get too close to filling your RAM _and_ your swap space.

---

template: lie31

- Linux: `novel` will _never_\* be null for OOM reasons
- Windows: `novel` will only be null once you've filled both RAM and swap

???

This one is especially OS-dependent.

- Linux is technically configurable, but the default that no one ever changes is that it will never say no to "reasonable" requests.
  - If you try to allocate 10 GiB on a machine with 4 GiB of RAM it will say no
  - So `novel` is never NULL for OOM reasons
  - But if you run out of RAM and swap space, the kernel's OOM killer will just start killing processes it deems expendable.
- Windows is a bit more conservative
  - It won't overcommit past the amount that can be backed by swap
  - But this does mean there can be situations where RAM is left under-utilized
- But still always check the return value of allocations, seriously.

---

layout: true

---

class: middle

# Lies your OS tells you about memory

## 1. Memory is linear

## 2. The OS gave you some memory

## 3. You aren't out of memory

???

Any questions?
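
If anyone asks how far the OS will actually let allocations go, a small probe I'd keep as backup (my own sketch, standard library only): keep doubling a `malloc()` request until it's refused. Where the "NULL" line lands depends on the OS's overcommit policy (on Linux, `/proc/sys/vm/overcommit_memory`), not on how much physical RAM is currently free.

```cpp
// Backup demo: double a malloc() request until the OS finally says no.
// The cutoff reflects the overcommit policy, not free physical RAM.
#include <cstddef>
#include <cstdio>
#include <cstdlib>

int main() {
    constexpr std::size_t GiB = 0x4000'0000;
    for (std::size_t size = GiB; size != 0; size *= 2) {
        void *p = std::malloc(size);
        std::printf("%zu GiB -> %s\n", size / GiB, p ? "ok" : "NULL");
        if (p == nullptr)
            break;
        std::free(p);   // hand the (purely virtual) range straight back
    }
    return 0;
}
```

On a default Linux configuration this typically keeps printing "ok" well past the machine's physical RAM; Windows stops once the request can no longer be backed by RAM plus the page file.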