Someone will hopefully come along with a more complete answer, but this will get you started. I would write more, but this topic typically fills a 200-level CS course, and I’m on mobile. I also may do a poor job of explaining. Here it goes…
What you’re missing is that Unix-like operating systems (and most, if not all, other general purpose operating systems) make use of different CPU contexts, which have different privilege levels. These are effectively security controls, and they’re enforced by the CPU itself.
Note: when I say “context”, I mean both the general register state AND the control registers which dictate privilege, not just the state that gets saved and restored in a simple “context switch”, if you know the term.
Yes, you’re correct that the instructions run on the same CPU, but there are instructions that can only execute in the kernel context (and registers that are only accessible in that context as well).
These instructions will fault when executed in userland, as userland is not privileged. The CPU raises an exception, control passes to the kernel, and the kernel typically kills the offending process.
It’s no coincidence that these privileged instructions (and registers) are critical to performing I/O as well as setting up userland memory mappings, which act as security boundaries (no mapping means no access).
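To see “no mapping means no access” in action, here is a rough sketch (not a definitive test) that reads from address 0 in a child process. It assumes a POSIX system such as Linux, where address 0 is normally unmapped and a process killed by a signal reports a negative exit status. The names CHILD_CODE and run_unmapped_read are made up for this example.

```python
import signal
import subprocess
import sys

# Child program: read a C string starting at address 0, which is normally
# not mapped into a userland process (assumption: typical Linux/POSIX setup).
CHILD_CODE = "import ctypes; ctypes.string_at(0)"


def run_unmapped_read():
    """Run the faulting read in a child process and return its exit status."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        stderr=subprocess.DEVNULL,
    )
    return result.returncode


if __name__ == "__main__":
    status = run_unmapped_read()
    # On POSIX, a negative status means the child was killed by that signal.
    if status == -signal.SIGSEGV:
        print("child was killed by SIGSEGV after touching unmapped memory")
    else:
        print("unexpected exit status:", status)
```

The read runs in a child process so that the fault only kills the child. The parent just inspects the exit status the kernel reports.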
Things you need to learn about to understand this:
- Protection rings, specifically ring 3 (userland context) and ring 0 (kernel context) in the case of Unix-like operating systems as well as Windows. “Rings” is x86 terminology; other architectures have equivalent privilege levels under different names
- The MMU, which is critical for restricting userland access to privileged memory
- Virtual Memory, closely related to the MMU
- Interrupts, which can be used as a signal that a userland process wants to perform a privileged operation. They trigger a context switch into interrupt handlers, implemented by the kernel
- System Calls, the interface for userland to request specific operations implemented in the kernel. Linux implements several hundred of them (the exact count depends on the architecture and kernel version). Examples of system calls include open(), read(), write(), etc. Invoking a system call from userland typically involves setting a specific general purpose register to a system call number and then executing a trap instruction, which depends on the CPU architecture (for example, int 0x80 on 32-bit x86, syscall on x86-64, svc on ARM)
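To make the system call part concrete, here is a minimal sketch that invokes getpid by its system call number instead of by name. It assumes Linux and a native 64-bit process. The numbers in SYS_GETPID are the Linux values for x86-64 and AArch64, and the helper raw_getpid is a made-up name for this example. On any other platform it simply returns None.

```python
import ctypes
import os
import platform
import sys

# Linux system call numbers for getpid; they differ per architecture.
# Assumption: the process is native 64-bit (a 32-bit process uses other numbers).
SYS_GETPID = {"x86_64": 39, "aarch64": 172}


def raw_getpid():
    """Invoke getpid by system call number, or return None if unsupported here."""
    number = SYS_GETPID.get(platform.machine())
    if not sys.platform.startswith("linux") or number is None:
        return None
    # CDLL(None) gives access to symbols already loaded in the process, libc included.
    libc = ctypes.CDLL(None, use_errno=True)
    # libc's syscall() places the number and arguments in the registers the
    # kernel expects, then executes the architecture's trap instruction.
    return libc.syscall(number)


if __name__ == "__main__":
    print("raw system call:", raw_getpid())
    print("os.getpid():    ", os.getpid())
```

Both lines should print the same PID on a supported system, since os.getpid() ends up making the same request to the kernel through the usual libc wrapper.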