How Windows Privilege Escalation Works

Every Windows process runs at a privilege level. The kernel enforces those levels through a combination of integrity levels, access tokens, and a component called the Security Reference Monitor. A privilege escalation exploit does not break that enforcement. It finds a path through it — a sequence of operations the kernel treats as individually legitimate that together produce an outcome the security model was designed to prevent.

Understanding why that is possible requires understanding how the enforcement model actually works.

Mandatory Integrity Control

Windows assigns every process, thread, and securable object an integrity level. The five levels — Untrusted, Low, Medium, High, and System — form an ordered hierarchy. Standard user processes run at Medium. Processes elevated through UAC run at High. Kernel drivers and system services run at System.

The kernel applies these levels through a policy called Mandatory Integrity Control. MIC operates as a pre-check that runs before the kernel even examines the Discretionary Access Control List on an object. If MIC blocks the access, the DACL is never consulted.

The default MIC policy is No Write Up: a process running at a lower integrity level cannot write to an object at a higher integrity level. A Medium-integrity process cannot modify a High-integrity object. A High-integrity process cannot modify a System-integrity object. The check is mandatory and not overridable by DACL entries.

No Read Up works differently. A lower-integrity process can read a higher-integrity object unless the object's System Access Control List explicitly sets a No Read Up flag. That flag must be configured deliberately — it is not the default. In practice, most objects are readable across integrity boundaries; MIC's primary enforcement lever is write access.
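The two rules above can be modeled in a few lines. What follows is a toy model in Python, not Windows code: `IntegrityLevel`, `mic_allows`, and the `no_read_up` parameter are names invented for illustration. Only the numeric values come from Windows, where they are the RIDs of the well-known S-1-16-x integrity SIDs.

```python
from enum import IntEnum


class IntegrityLevel(IntEnum):
    # Values are the RIDs of the well-known integrity SIDs (S-1-16-<RID>).
    UNTRUSTED = 0x0000
    LOW = 0x1000
    MEDIUM = 0x2000
    HIGH = 0x3000
    SYSTEM = 0x4000


def mic_allows(caller, target, access, no_read_up=False):
    """Toy MIC pre-check: True means the request may proceed to the DACL.

    `no_read_up` stands in for the No Read Up flag on the object's
    mandatory label; it is a hypothetical parameter for this sketch.
    """
    if access not in ("read", "write"):
        raise ValueError(f"unsupported access type: {access!r}")
    if caller >= target:
        return True  # Equal or dominant caller: MIC raises no objection.
    if access == "write":
        return False  # No Write Up is the default policy.
    return not no_read_up  # Reads cross the boundary unless the label forbids it.
```

With these definitions, `mic_allows(IntegrityLevel.MEDIUM, IntegrityLevel.HIGH, "write")` is `False`, while the same call with `"read"` is `True` unless `no_read_up=True` is passed. A `True` result only means the request reaches the DACL check; it does not grant access by itself.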

System is the top of the hierarchy, so no MIC enforcement applies above a kernel driver. In practice the driver operates outside the hierarchy altogether: requests issued from kernel mode normally skip the access check, MIC included. Any code executing in that context, including code a privilege escalation exploit has injected or manipulated into running there, inherits that position.

Access Tokens and How the Kernel Reads Them

An access token is the credential attached to a process or thread. Every time a process requests access to a securable object, the kernel reads the token to determine what the caller is allowed to do. The token carries the caller's user SID, group memberships, privilege flags, and integrity level.

Windows distinguishes between two token types. A primary token is assigned to a process at creation and represents its security identity for the duration of its lifetime. An impersonation token is a temporary credential a thread can adopt to act on behalf of a different security context — typically a client that the thread is serving.

Impersonation is a deliberate design feature. A service process handling requests from multiple users needs to perform file access, registry operations, and network calls on behalf of whichever client it is currently serving, without requiring that client's password. SeImpersonatePrivilege grants a process the right to take on another user's identity this way.

The seam this creates is structural. If a lower-privileged process — say, one running as LOCAL SERVICE, which holds SeImpersonatePrivilege — can cause a System-level process to authenticate against it over a local channel such as a named pipe or a COM object, it can capture the resulting System token. A call to ImpersonateLoggedOnUser() is then sufficient to make the thread execute as System. The access check that follows will see a System-level token. The DACL check will pass. The operation proceeds.
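The reason a captured token is enough can be shown with a small model of how a thread's effective identity is chosen. This is a conceptual sketch, not Windows code: the `Token` and `Thread` classes and their methods are hypothetical, and the privilege check is a deliberate simplification of the real rules.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Token:
    user: str
    privileges: frozenset = frozenset()


@dataclass
class Thread:
    primary: Token  # Token of the owning process.
    impersonation: Optional[Token] = None  # Temporary per-thread identity.

    def effective_token(self) -> Token:
        # Access checks see the impersonation token whenever one is attached.
        if self.impersonation is not None:
            return self.impersonation
        return self.primary

    def impersonate(self, client: Token) -> None:
        # Simplification: model the privilege as a hard requirement.
        if "SeImpersonatePrivilege" not in self.primary.privileges:
            raise PermissionError("SeImpersonatePrivilege not held")
        self.impersonation = client

    def revert_to_self(self) -> None:
        self.impersonation = None
```

In this model a thread whose primary token belongs to LOCAL SERVICE reports SYSTEM as its effective identity after `impersonate(Token("SYSTEM"))`, and LOCAL SERVICE again after `revert_to_self()`. Nothing in the later access check records how the token was obtained.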

The Security Reference Monitor

The Security Reference Monitor is the kernel component that arbitrates access decisions. Every access check in Windows — whether a process opening a file, a thread duplicating a handle, or a driver accessing a kernel object — routes through the SRM.

The call path from a user-mode access request to an SRM decision follows a consistent sequence. A user-mode application calls a Win32 API, such as CreateFile. That call crosses into the kernel via a system call, reaching the I/O Manager. The I/O Manager passes the request to the Object Manager, which invokes ObOpenObjectByName. The Object Manager calls SeAccessCheck on the SRM. The SRM compares the subject — the caller's access token, including its integrity level — against the object — the security descriptor, containing the DACL and any applicable SACL entries. If the check passes, the Object Manager returns a handle. If it fails, access is denied before the I/O Manager processes the request further.
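The comparison the SRM performs can be approximated as follows. This is a simplified model under stated assumptions, not the real SeAccessCheck: the `Ace`, `SecurityDescriptor`, `Token`, and `access_check` names are invented for the example, only two access bits exist, and owner rights, privileges, null DACLs, and No Read Up are all left out.

```python
from dataclasses import dataclass, field

READ, WRITE = 0x1, 0x2  # Hypothetical access bits for this sketch.


@dataclass
class Ace:
    allow: bool  # True for an allow entry, False for a deny entry.
    sid: str
    mask: int


@dataclass
class SecurityDescriptor:
    integrity: int
    dacl: list = field(default_factory=list)


@dataclass
class Token:
    sids: set
    integrity: int


def access_check(token, sd, desired):
    # Step 1: mandatory integrity pre-check (No Write Up only).
    if desired & WRITE and token.integrity < sd.integrity:
        return False
    # Step 2: walk the DACL in order until every requested bit is granted
    # or a matching deny entry covers a bit that is still outstanding.
    remaining = desired
    for ace in sd.dacl:
        if ace.sid not in token.sids:
            continue
        if not ace.allow and ace.mask & remaining:
            return False
        if ace.allow:
            remaining &= ~ace.mask
        if remaining == 0:
            return True
    return remaining == 0
```

The ordering is the point of the sketch: a Medium-integrity token that the DACL explicitly allows to write is still refused on a High-integrity object, because the first step returns before the DACL is walked.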

The SRM's check is a point-in-time decision. It evaluates the token against the security descriptor at a single moment and produces a binary result. What happens after that result (the actual I/O operation, file system mutation, or object modification) is a separate sequence of events. The gap between those two moments is where race condition vulnerabilities live.

Kernel Object Races

Synchronization in kernel space is expensive. Acquiring a lock carries overhead measured in CPU cycles; on a system handling thousands of concurrent I/O operations, that overhead accumulates. Kernel developers frequently make the engineering decision to check access at one point, release locks to allow parallel operations, then execute the operation — trusting that the state validated during the check will still hold when the operation runs.

That trust is the attack surface.

A time-of-check to time-of-use race — TOCTOU — occurs when an attacker can modify the object state after the access check has passed but before the actual operation executes. The check saw a valid state and returned success. The operation executes against a different state. The kernel proceeds because, as far as its internal logic is concerned, the check already passed.
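The pattern is easier to see in a deterministic simulation than in a real race. The sketch below is a conceptual model only: a dictionary stands in for the object namespace, and `FileObject`, `vulnerable_write`, `safe_write`, and the `during_window` callback are hypothetical names. The callback represents whatever other actors do while locks are released.

```python
class FileObject:
    def __init__(self, name, protected):
        self.name = name
        self.protected = protected
        self.data = b""


def vulnerable_write(namespace, path, data, during_window):
    target = namespace[path]  # Time of check: resolve the path and validate.
    if target.protected:
        raise PermissionError(path)
    during_window()  # Locks released; other actors may run here.
    namespace[path].data = data  # Time of use: the path is resolved AGAIN.


def safe_write(namespace, path, data, during_window):
    target = namespace[path]  # Resolve once.
    if target.protected:
        raise PermissionError(path)
    during_window()
    target.data = data  # Use the same object that was checked.
```

If `during_window` rebinds the path to a protected object, `vulnerable_write` modifies the protected object even though its check passed against an unprotected one. `safe_write` is unaffected because it never resolves the name a second time, which mirrors the usual remedy of operating on the object that was validated rather than on its name.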

The challenge of eliminating TOCTOU races in kernel drivers is not ignorance of the pattern — it is the structure of the kernel environment itself. Shared state is accessed across multiple processor cores simultaneously. Interrupt handlers can preempt execution at nearly any instruction boundary. Deferred procedure calls execute asynchronously. The kernel's own scheduler is a participant in the race, determining which thread runs when and for how long. Eliminating a race often requires holding locks across operations that were specifically designed to release them, reintroducing the performance cost the original design was trying to avoid.

cldflt.sys as a Case Study

The Windows Cloud Files Mini Filter Driver — cldflt.sys — illustrates the pattern concretely. The driver intercepts I/O requests for OneDrive placeholder files: reparse points that appear as ordinary files in Explorer but contain only metadata, with the actual content living in cloud storage. When an application accesses a placeholder, the kernel traps the I/O request and routes it to cldflt.sys for handling.

In routines like HsmOsBlockPlaceholderAccess, the driver follows a sequence: pause the incoming I/O operation, examine the access token of the requesting thread, communicate with the user-mode OneDrive sync engine to initiate file hydration, then complete the original I/O operation once the content has arrived. This sequence is asynchronous and complex. The driver releases its locks during the hydration phase to allow the sync engine to work.

The race window opens there. An attacker can substitute the placeholder file — replacing it with a hardlink or symlink pointing to a protected system file such as cmd.exe or a sensitive registry hive — after the access check on the original placeholder has already passed. The thread completing the hydration routine runs as System. It performs the final write operation against whatever the file path now resolves to. The DACL on the target file is never consulted, because the SRM's check already completed against the placeholder. The attacker has written to a System-protected file from an unprivileged process.

Exploitation through this mechanism is not deterministic: each attempt succeeds only if the attacker wins the race. But on a modern multi-core system under normal load, the window is wide enough to be reliably exploitable with repeated attempts.

The Broader LPE Landscape

A local privilege escalation exploit is a second-stage primitive. On its own, it requires the attacker to already have a foothold — some form of code execution on the target system at a lower privilege level. Its value is in what it enables: moving from an unprivileged user session or a sandboxed process to System-level access, from which credential harvesting, persistence installation, and lateral movement become straightforward.

The categories of Windows LPE are varied, but they map to a small number of underlying mechanisms:

Kernel object races: the TOCTOU pattern applied to any kernel driver that performs access checks separately from the operations they gate. cldflt.sys is one example. The pattern appears across many drivers.
Token manipulation: abusing SeImpersonatePrivilege or related privileges to capture a higher-privileged token through forced authentication over a local channel. The Potato exploit family is the most developed branch of this approach, with working exploits spanning more than a decade of Windows versions.
Named pipe impersonation: a specific instance of token manipulation where a service or privileged process is induced to connect to a pipe controlled by the attacker. The attacker calls ImpersonateNamedPipeClient() to adopt the connecting process's token.
DLL hijacking at elevated paths: placing a malicious DLL in a directory that a SYSTEM-level process searches before it reaches the legitimate one. This requires write access to the target directory, which is sometimes obtainable without elevation.
Service misconfigurations: writable service binary paths, weak ACLs on service executables, or unquoted service paths that allow substitution of the binary a privileged service will execute.
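The unquoted service path case in the last item lends itself to a small audit helper. When a command line containing spaces is not quoted, Windows tries each space-delimited prefix as an executable name, in order. The sketch below lists those candidates; `unquoted_path_candidates` is a hypothetical helper, and it assumes the intended binary is the first prefix ending in `.exe`.

```python
def unquoted_path_candidates(command_line):
    """List the executables that would be tried, in order, for an
    unquoted command line. Returns [] when the path is quoted.

    Assumption for this sketch: the first prefix that already ends in
    ".exe" is the intended binary, and anything after it is arguments.
    """
    command_line = command_line.strip()
    if command_line.startswith('"'):
        return []  # Properly quoted: no ambiguity to audit.
    tokens = command_line.split(" ")
    candidates = []
    for end in range(1, len(tokens) + 1):
        if tokens[end - 1] == "":
            continue  # Repeated spaces add no new candidate.
        prefix = " ".join(tokens[:end])
        if prefix.lower().endswith(".exe"):
            candidates.append(prefix)  # The intended binary; stop here.
            break
        candidates.append(prefix + ".exe")
    return candidates
```

For `C:\Program Files\My App\svc.exe -k run` the function returns `C:\Program.exe`, `C:\Program Files\My.exe`, and finally the intended `C:\Program Files\My App\svc.exe`. Every entry before the last is a location worth checking for write access by unprivileged users.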

The common thread across all of them is the same structural reality: the Windows privilege enforcement model is correct at the level of individual checks, but the enforcement points are discrete. Between one check and the next, object state can change, tokens can be substituted, and paths can be redirected. The kernel operates on what was true at check time. What is true at execution time can be different.

That gap does not close with patches to individual CVEs. It is a property of how the enforcement model is built. Mitigations narrow specific exploitation paths; the attack surface is the architecture.

Detection, Not Prevention

The practical implication for defenders is that "fully patched" is a weaker guarantee than it appears. Silent fixes ship without CVEs. Patches regress. Novel exploitation paths in the same bug classes continue to appear. The gap between a working exploit and an available patch is measured in weeks on an optimistic timeline and months on a realistic one.

What that means operationally: the defensive investment that reliably outlasts any individual LPE primitive is behavioral detection on privilege transitions. An EDR that monitors for unexpected integrity level changes, anomalous token duplication calls, or processes attempting named pipe server creation from low-privilege contexts will flag exploitation attempts regardless of which specific CVE is being used. Application isolation — running exposed services at Low integrity or in AppContainer sandboxes — raises the cost of escalation without depending on the patch state of any particular driver.
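A behavioral rule of this kind can be sketched independently of any product. The example below is illustrative only: the `ProcessStart` event schema, the `uac_consent` field, and `is_suspicious` are all hypothetical and do not correspond to any particular EDR's telemetry.

```python
from dataclasses import dataclass

LEVELS = {"Untrusted": 0, "Low": 1, "Medium": 2, "High": 3, "System": 4}


@dataclass
class ProcessStart:
    image: str
    integrity: str
    parent_image: str
    parent_integrity: str
    uac_consent: bool = False  # Hypothetical: a consent event was recorded.


def is_suspicious(event):
    """Flag a child process that starts above its parent's integrity level
    without a recorded, expected elevation path."""
    rise = LEVELS[event.integrity] - LEVELS[event.parent_integrity]
    if rise <= 0:
        return False  # Same or lower integrity: not an escalation.
    if event.integrity == "High" and event.uac_consent:
        return False  # Consented UAC elevation is the expected route to High.
    return True
```

The rule keys on the transition rather than on any exploit signature, so it says nothing about which vulnerability produced a Medium-to-System jump. A production rule would need a much richer allow list, since legitimate brokers and service starts also cross integrity levels.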

Privileged access workstations, which confine high-privilege credentials to hardware that never touches general user workflows, address a different part of the problem: they limit the value of a successful escalation by ensuring that SYSTEM access on a workstation does not propagate to domain admin credentials. The LPE still works. What it buys the attacker is less.

The security boundary model is sound. The exploitable seams are in the implementation of the transitions between enforcement points — and in the assumption that what the kernel checked a moment ago still describes what exists now.