Tech Analysis: CrowdStrike’s Kernel Access and Security Architecture
Context
In today’s rapidly evolving threat landscape, dynamic security measures are critical. Because of Windows’s current architecture and design, security products running on the platform, particularly those involved in endpoint protection, require kernel access to provide the highest level of visibility, enforcement and tamper-resistance while meeting the strict performance envelopes demanded by large enterprise clients.
In this blog, we’ll explore the CrowdStrike Falcon sensor architecture on the Windows operating system and the corresponding benefits to users’ protection, performance and adaptability to emerging threats. We’ll look at and dispel some common misinterpretations about kernel access, specifically regarding the architecture of the Falcon sensor.
Kernel Access: A Widely Adopted Industry Practice for Comprehensive Security, Including Enforcement
The history and practice of security product presence in the Windows kernel has already been commented on by Microsoft and touches upon the same core benefits we introduce in this blog. The Falcon sensor’s architecture follows these principles and reflects the evolutionary path of security-focused capabilities and vendor API access on Windows.
When CrowdStrike started in 2011, the architecture and product teams took an advanced stance for an enterprise product at the time: supporting only Windows 7 and later Windows versions, and being fully compatible with Microsoft’s Kernel Patch Protection (KPP, also called PatchGuard). Prior to Windows Vista, almost all security products were composed of multiple complex kernel drivers that performed dangerous actions such as hooking kernel system calls, changing processor configuration hardware registers and patching kernel code. These products had no other choice, because there were no documented, exposed vendor-level capabilities for providing the type of security guarantees that endpoint protection products require. In Windows Vista, and later Windows 7, Microsoft introduced replacement technologies, such as object callbacks. Kernel mode code signing (KMCS) was also introduced to guarantee the integrity of kernel code.
CrowdStrike chose to avoid use of those legacy techniques and instead utilized these new capabilities that Microsoft had introduced. These APIs provide documented access to important OS operations/flows and include interfaces like the filter manager, registry filtering and process manager callbacks. These mechanisms established the foundation for standardized and unified programming models that all security vendors now follow. This approach significantly reduced instability, security and compatibility issues.
Microsoft has continued to expand the functionality available to kernel-mode security products. Examples include the image signature verification callback, the second-generation thread notification callbacks and named pipe filtering. The latest version of Windows 11, 22H2, has numerous advancements for user-mode visibility capabilities, while including new support for endpoint security product needs, such as kernel mode file copy.
CrowdStrike made an early architectural decision to minimize kernel-invasive approaches. As Microsoft has introduced new capabilities that allow for the safe and tamper-evident analysis of security-related data solely from user mode, CrowdStrike has sought to take advantage of these features and, whenever possible, deprecated legacy mechanisms if customer impact was low. Our priority at all times is maximizing security, and we will rely on user-mode security components only when reliability and security can be reasonably assured.
Kernel Access Is a Requirement for Early Boot Protection
Adversaries continue to employ increasingly sophisticated techniques to target and infiltrate systems at various stages, including system startup. Bootkits, rootkits, hypervisors and firmware implants are all malicious technologies that the industry has seen more of as attacks grow more complex in reaction to defenders’ postures.
By having kernel-mode drivers configured and operational early, security products like the Falcon sensor can expand their operational time window, starting before user-mode processes and services as well as before other regular system drivers. Capabilities like firmware analysis or device control would not be possible without this design. Microsoft directly supports and endorses such capabilities in security products, namely through the Early Launch Anti Malware (ELAM) architecture, which was specifically built in Windows 8 to enable such types of monitoring and enforcement.
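The ordering argument above can be illustrated with a toy model. This is a simplified sketch, not how the Windows loader is implemented: the `Component` class, the `early_launch` flag and all component names are hypothetical, and the only real values are the standard service start types (0 for boot-start, 1 for system-start, 2 for auto-start). It assumes an ELAM driver is a boot-start driver that is initialized ahead of other boot-start drivers.

```python
from dataclasses import dataclass

# Standard Windows service "Start" values; lower values start earlier.
BOOT_START = 0
SYSTEM_START = 1
AUTO_START = 2


@dataclass(frozen=True)
class Component:
    name: str                   # hypothetical component name
    start: int                  # one of the start values above
    early_launch: bool = False  # hypothetical flag marking an ELAM driver


def load_order(components):
    """Return components in a simplified start order.

    Components are ordered by start type; within the same start type,
    an early-launch driver is placed first (assumption of this sketch).
    """
    return sorted(components, key=lambda c: (c.start, not c.early_launch))


components = [
    Component("user_mode_service", AUTO_START),
    Component("regular_boot_driver", BOOT_START),
    Component("system_driver", SYSTEM_START),
    Component("elam_driver", BOOT_START, early_launch=True),
]
order = [c.name for c in load_order(components)]
print(order)
```

In this model the early-launch driver is already running when every other component starts, which is the widened time window the paragraph describes.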
Furthermore, numerous protections and capabilities that user-mode applications benefit from are set up by early components in the Windows boot chain, which form part of its Trusted Computing Base (TCB). If the sensor loaded too late, an earlier malicious component could already have tampered with the system in a way that modifies the behavior of components the sensor relies on. An early and comprehensive kernel presence gives the Falcon sensor insight that this boot chain is trusted and behaves as expected, and allows Falcon to monitor system execution from the very start, enabling detection of unexpected behavior.
User-Mode-Only Security Products Could Be Tampered With
Until new advances first introduced in Windows 8.1, such as Protected Process Light (PPL), an elevated user had full permissions to easily hinder the operation of a security product. Furthermore, such a user could also tamper with the various sources of information that user-mode components can use to derive information about the state of the system (such as Event Tracing for Windows, or ETW).
A security product experiencing a lapse in service results in protection and visibility gaps. This is why the Falcon sensor leverages its kernel-mode presence to avoid relying solely on user-mode sources, and to continue operating (and attempt to restart the service) even if an attacker tampers with its user-mode service. When Microsoft introduced the PPL capability, CrowdStrike quickly moved to adopt this architecture and was one of the first security vendors to do so.
Later, in Windows 10, Microsoft introduced a tamper-resistant ETW mechanism called “Secure ETW,” further improving the posture of user-mode security components, followed by the “Threat Intelligence” ETW channel in the Anniversary Update. The Falcon sensor’s architecture moved quickly to support these new capabilities. Windows 10 also introduced the Anti-Malware Scanning Interface (AMSI), which the Falcon sensor uses heavily, removing the need to do complex script, document and/or macro parsing.
CrowdStrike makes every effort to make use of these and other modern capabilities as they are introduced. At the same time, CrowdStrike is also committed to supporting its customers that are still running Windows 7 and Server 2008 R2 under appropriate support plans, and for simplicity ships a single kernel-level sensor across all operating systems. This means that although newer user-mode capabilities might exist in modern Windows versions, code to support key capabilities in legacy operating systems must still run in the kernel until such support is deprecated. CrowdStrike and Microsoft continue to work together to enhance anti-tampering capabilities for user-mode processes. By sharing feedback and innovative approaches, we’re working to improve security and availability for security products.
User-Mode Processing
For some extremely high-performance paths in the networking and file system stacks, the only mechanism that can scale to customer workloads today is a kernel driver. As one example, ransomware can currently be written so that its file system writes (encrypting your files) are only visible as “paging I/O,” an operation that is almost completely impractical to defer to a user-mode component (since that component could itself be pageable).
Furthermore, high-performance operations, such as Winsock’s kernel file transfer capability, happen solely in the Windows kernel. Other mechanisms, such as BypassIO, used by DirectStorage, purposefully bypass user-mode visible notification mechanisms and require a kernel presence. These, and many other key optimizations, cannot be safely pushed back up into user space without performance costs and, in some cases, compatibility and safety issues.
Expanding Our Security Presence to User Mode
In addition to the numerous user-mode security visibility capabilities named earlier, CrowdStrike was also the first third-party security product to create a Protected Process Light sandbox leveraging AppContainer technology on Windows when it released its next-gen antivirus (NGAV) machine learning engine in Spring 2017. This architecture uses the principle of least privilege, eliminating the need to have full “SYSTEM” privileges in many Falcon sandboxes.
Some classes of bugs in user-mode services can also result in machine crashes, on top of other local denial-of-service scenarios, and the security impact of vulnerabilities is similar. As such, the mere act of running in user mode does not necessarily increase reliability or security unless design decisions are paired with a holistic threat model and a secure-by-design architecture.
Fully Compliant with the Most Rigorous Partner Certifications for Kernel Drivers
CrowdStrike fully adheres to Microsoft’s partner certification for kernel drivers and is a full member of the Microsoft Virus Initiative (MVI), which adds additional requirements. For each kernel driver release, CrowdStrike conducts extensive testing through all required tests in Microsoft’s Hardware Lab Kit (HLK) and Microsoft’s Hardware Certification Kit (HCK), and submits results to Microsoft Windows Hardware Quality Labs (WHQL) for certification.
It is important to understand the difference between driver attestation (attestation signing) and WHQL verification, which CrowdStrike pursues. WHQL verification is far more rigorous than attestation signing and requires copious testing through the HCK/HLK process. This designation is embedded in the Microsoft signature attached to the kernel driver in the form of an Enhanced Key Usage (EKU). The differences in certification labeling should not be confused — CrowdStrike drivers are certified as WHQL drivers. Furthermore, customers can leverage Windows Defender Application Control (WDAC) to harden their systems to only allow WHQL-signed drivers.
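The EKU distinction above can be sketched as a small classifier. This is an illustrative sketch only, not a signature verifier: it assumes the EKU object identifiers have already been extracted from a validated Microsoft signature by some other tool, and the function and enum names are hypothetical. The two OID values are the ones the Windows SDK header wincrypt.h defines as szOID_WHQL_CRYPTO and szOID_ATTEST_WHQL_CRYPTO; check them against your SDK before relying on them.

```python
from enum import Enum

# EKU OIDs as defined in wincrypt.h (verify against your SDK version).
WHQL_EKU = "1.3.6.1.4.1.311.10.3.5"        # szOID_WHQL_CRYPTO
ATTESTED_EKU = "1.3.6.1.4.1.311.10.3.5.1"  # szOID_ATTEST_WHQL_CRYPTO


class DriverSigning(Enum):
    WHQL = "whql"
    ATTESTED = "attested"
    OTHER = "other"


def classify_driver_signing(eku_oids):
    """Classify a driver from the EKU OIDs found in its Microsoft signature.

    Assumption of this sketch: the attested EKU is checked first, so a
    signature is reported as WHQL only when it carries the WHQL EKU and
    does not carry the attested one.
    """
    oids = set(eku_oids)
    if ATTESTED_EKU in oids:
        return DriverSigning.ATTESTED
    if WHQL_EKU in oids:
        return DriverSigning.WHQL
    return DriverSigning.OTHER


print(classify_driver_signing([WHQL_EKU]))
print(classify_driver_signing([ATTESTED_EKU]))
```

A policy that only allows WHQL-signed drivers would, in this model, accept the first result and reject the second.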
In addition, all Falcon driver code is compatible with the memory integrity feature of Windows, also sometimes called hypervisor-enforced code integrity (HVCI), ensuring that kernel memory pages are only made executable after passing code integrity checks and that executable pages are never writable. CrowdStrike does not have any sort of virtual-machine-like interpreter capable of general-purpose compute in its driver — despite sharing the .SYS driver file extension, rapid response channel files are not executable machine code like real drivers and only contain configuration data and inputs to existing hard-coded rules. All Windows executable code that CrowdStrike ships is signed.
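The two memory integrity guarantees described above (pages become executable only after a code integrity check, and executable pages are never writable) can be expressed as a toy policy model. This is a conceptual sketch under stated assumptions, not how the hypervisor enforces the policy: the `Page` class, the function names and the boolean standing in for a code integrity check are all hypothetical.

```python
class PolicyViolation(Exception):
    """Raised when a requested permission change breaks the modeled policy."""


class Page:
    """A kernel memory page in this toy model; starts writable, not executable."""

    def __init__(self):
        self.writable = True
        self.executable = False


def make_executable(page, passed_code_integrity):
    """Grant execute permission only after a (simulated) integrity check."""
    if not passed_code_integrity:
        raise PolicyViolation("page did not pass code integrity checks")
    page.writable = False  # executable pages are never writable
    page.executable = True


def make_writable(page):
    """Grant write permission only to pages that are not executable."""
    if page.executable:
        raise PolicyViolation("executable pages are never writable")
    page.writable = True


page = Page()
make_executable(page, passed_code_integrity=True)
try:
    make_writable(page)
except PolicyViolation as exc:
    print("rejected:", exc)
```

Under this model, data loaded at runtime can never become new executable code unless it passes the same integrity check as any other driver code.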
As part of MVI, CrowdStrike must also meet requirements above and beyond WHQL testing in order to satisfy the strict performance and stability requirements that enable access to technologies such as ELAM.
Finally, if the Falcon sensor recognizes that it is running on a version of the OS kernel that CrowdStrike has not fully tested against, the sensor disables certain functionality in the interest of stability and avoiding crashes. This is a fundamental part of the philosophy that has guided design of the Falcon architecture from Day One.
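The behavior described above is a form of version-gated feature enablement, which can be sketched as follows. This is a minimal illustration and not the Falcon sensor’s implementation: the build numbers, feature names and the `enabled_features` function are all hypothetical placeholders, and the sketch assumes each feature simply declares whether it needs a fully tested kernel.

```python
# Placeholder build numbers; not an actual test matrix.
TESTED_KERNEL_BUILDS = {19045, 22621, 22631}

# Hypothetical features, mapped to whether they require a tested kernel.
FEATURES = {
    "core_telemetry": False,
    "deep_kernel_inspection": True,
}


def enabled_features(kernel_build):
    """Return the features to enable for the given kernel build.

    On an untested build, only features that do not require a tested
    kernel stay enabled, favoring stability over functionality.
    """
    tested = kernel_build in TESTED_KERNEL_BUILDS
    return sorted(
        name
        for name, needs_tested_kernel in FEATURES.items()
        if tested or not needs_tested_kernel
    )


print(enabled_features(22621))  # tested build in this sketch
print(enabled_features(99999))  # untested build in this sketch
```

The design choice illustrated is to fail toward reduced functionality rather than risk a crash on a kernel whose internals have not been validated.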
Conclusion
CrowdStrike prioritizes security and remains dedicated to the continuous improvement of our product and processes for our customers. We adhere strictly to Microsoft’s certification and testing procedures.
Windows offers versatile extensibility, enabling the creation of innovative security solutions. By providing both user-mode and kernel-mode capabilities, developers can find the right architectural balance between security, resilience and availability. Our architecture is designed to provide the optimal balance of security and performance, with a strong emphasis on minimizing risks associated with kernel-level operation, as well as running in more constrained execution environments in user mode. CrowdStrike made an early architectural decision to minimize kernel-invasive approaches and continues to prioritize user-mode approaches when possible by thoroughly reviewing all decisions around component placement.
As Microsoft continues to build new technologies for user-mode tamper-evident consumption, visibility and enforcement of security operations, the Falcon sensor will continue to leverage such capabilities, while maintaining the level of security on legacy supported systems that customers expect CrowdStrike to provide them with.
We welcome the opportunity to continue working with Microsoft and other partners to strengthen such abilities for safe, comprehensive, performant and reliable processing of security-related data solely utilizing user-mode capabilities, if and when they become available for Windows, as we have done on macOS with the Endpoint Security Framework and on Linux with BPF.
We hope this post provides a clearer understanding of our commitment to excellence and innovation in cybersecurity to our customers and our partners. This level of transparency, dialogue and collaboration is essential to dispel common misperceptions, advance the industry and help ensure the highest levels of security and reliability for all customers.