.. Blame (line 1): xj, b04a402, 2021-11-25 15:01:52 +0800

iTLB multihit
=============

iTLB multihit is an erratum where some processors may incur a machine check
error, possibly resulting in an unrecoverable CPU lockup, when an
instruction fetch hits multiple entries in the instruction TLB. This can
occur when the page size is changed along with either the physical address
or cache type. A malicious guest running on a virtualized system can
exploit this erratum to perform a denial of service attack.


Affected processors
-------------------

Variations of this erratum are present on most Intel Core and Xeon processor
models. The erratum is not present on:

- non-Intel processors

- Some Atoms (Airmont, Bonnell, Goldmont, GoldmontPlus, Saltwell, Silvermont)

- Intel processors that have the PSCHANGE_MC_NO bit set in the
  IA32_ARCH_CAPABILITIES MSR.


Related CVEs
------------

The following CVE entry is related to this issue:

==============  =================================================
CVE-2018-12207  Machine Check Error Avoidance on Page Size Change
==============  =================================================


Problem
-------

Privileged software, including the OS and virtual machine managers (VMM), is
in charge of memory management. A key component in memory management is the
control of the page tables. Modern processors use virtual memory, a technique
that creates the illusion of a very large memory for processors. This virtual
space is split into pages of a given size. Page tables translate virtual
addresses to physical addresses.

To reduce latency when performing a virtual to physical address translation,
processors include a structure, called a TLB, that caches recent translations.
There are separate TLBs for instructions (iTLB) and data (dTLB).

Under this erratum, instructions are fetched from a linear address translated
using a 4 KB translation cached in the iTLB. Privileged software then modifies
the paging structure so that the same linear address uses a large page size
(2 MB, 4 MB, 1 GB) with a different physical address or memory type. After the
page structure modification but before the software invalidates any iTLB
entries for the linear address, a code fetch that happens on the same linear
address may cause a machine-check error which can result in a system hang or
shutdown.


Attack scenarios
----------------

Attacks against the iTLB multihit erratum can be mounted from malicious
guests in a virtualized system.


iTLB multihit system information
--------------------------------

The Linux kernel provides a sysfs interface to enumerate the current iTLB
multihit status of the system: whether the system is vulnerable and which
mitigations are active. The relevant sysfs file is:

/sys/devices/system/cpu/vulnerabilities/itlb_multihit

The possible values in this file are:

.. list-table::

   * - Not affected
     - The processor is not vulnerable.
   * - KVM: Mitigation: Split huge pages
     - Software changes mitigate this issue.
   * - KVM: Vulnerable
     - The processor is vulnerable, but no mitigation is enabled.
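
The status file can be checked from a script. The following is a minimal
sketch, not part of the kernel: the function names and the returned labels
(``not_affected``, ``mitigated``, ``vulnerable``, ``unknown``) are hypothetical.
It assumes the file contains one of the strings listed above and treats
anything else by substring match or as unknown.

```python
from pathlib import Path

# Path taken from the text above; the file is absent on kernels that do not
# report this vulnerability.
SYSFS_FILE = Path("/sys/devices/system/cpu/vulnerabilities/itlb_multihit")


def classify_status(text):
    """Map the sysfs string to a coarse label (labels are hypothetical)."""
    status = text.strip()
    if status == "Not affected":
        return "not_affected"
    lowered = status.lower()
    if "mitigation" in lowered:
        return "mitigated"
    if "vulnerable" in lowered:
        return "vulnerable"
    return "unknown"


def read_status(path=SYSFS_FILE):
    """Classify the current system, or return 'unknown' if the file is unreadable."""
    try:
        return classify_status(path.read_text())
    except OSError:
        return "unknown"
```

Calling ``read_status()`` on a host returns one of the four labels;
``classify_status()`` can be used directly on a string collected elsewhere.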


Enumeration of the erratum
--------------------------

A new bit, PSCHANGE_MC_NO, has been allocated in the IA32_ARCH_CAPABILITIES MSR
and will be set on CPUs which are mitigated against this issue.

=======================================   ===========   ================================
IA32_ARCH_CAPABILITIES MSR                Not present   Possibly vulnerable, check model
IA32_ARCH_CAPABILITIES[PSCHANGE_MC_NO]    '0'           Likely vulnerable, check model
IA32_ARCH_CAPABILITIES[PSCHANGE_MC_NO]    '1'           Not vulnerable
=======================================   ===========   ================================
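
The table above can be expressed as a small decoding function. This is a
sketch under assumptions: the MSR index (0x10a) and the bit position (6) are
not given in this document and are taken from the Linux kernel's MSR
definitions; the function name is hypothetical, and obtaining the raw MSR
value is left outside the example.

```python
# Assumptions, not stated in this document: MSR index and bit position as
# defined in the Linux kernel's msr-index.h.
MSR_IA32_ARCH_CAPABILITIES = 0x10A
PSCHANGE_MC_NO_BIT = 6


def enumerate_itlb_multihit(arch_capabilities):
    """Classify a raw IA32_ARCH_CAPABILITIES value.

    Pass None when the MSR is not present on the processor.
    """
    if arch_capabilities is None:
        return "Possibly vulnerable, check model"
    if (arch_capabilities >> PSCHANGE_MC_NO_BIT) & 1:
        return "Not vulnerable"
    return "Likely vulnerable, check model"
```

A caller that has read the MSR by some privileged means passes the 64-bit
value; a caller that found the MSR unsupported passes ``None``.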


Mitigation mechanism
--------------------

This erratum can be mitigated by restricting the use of large page sizes to
non-executable pages.  This forces all iTLB entries to be 4K, and removes
the possibility of multiple hits.

In order to mitigate the vulnerability, KVM initially marks all huge pages
as non-executable. If the guest attempts to execute in one of those pages,
the page is broken down into 4K pages, which are then marked executable.
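
The split-on-execute behaviour can be illustrated with a toy model. This is
only a sketch of the idea described above and does not reflect how KVM's MMU
is implemented: the ``ToyMapping`` class, its methods and the dictionary
layout are hypothetical, and only 2 MB huge pages are modelled.

```python
PAGE_4K = 4 * 1024
PAGE_2M = 2 * 1024 * 1024


class ToyMapping:
    """Hypothetical model: page base address -> (page size, executable)."""

    def __init__(self):
        self.entries = {}

    def map_huge(self, base):
        # Huge pages start out non-executable.
        assert base % PAGE_2M == 0
        self.entries[base] = (PAGE_2M, False)

    def exec_fault(self, addr):
        """Handle an instruction fetch from addr inside a mapped region."""
        base = addr - addr % PAGE_2M
        size, executable = self.entries[base]
        if size == PAGE_2M and not executable:
            del self.entries[base]
            # Break the huge page into 4K pages and mark them executable.
            for small in range(base, base + PAGE_2M, PAGE_4K):
                self.entries[small] = (PAGE_4K, True)

    def executable_sizes(self):
        """Return the set of page sizes that are currently executable."""
        return {size for size, executable in self.entries.values() if executable}
```

In this model the only executable page size is ever 4K, which is the
property that rules out a multihit between a 4K and a large iTLB entry.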

If EPT is disabled or not available on the host, KVM is in control of TLB
flushes and the problematic situation cannot happen.  However, the shadow
EPT paging mechanism used by nested virtualization is vulnerable, because
the nested guest can trigger multiple iTLB hits by modifying its own
(non-nested) page tables.  For simplicity, KVM will make large pages
non-executable in all shadow paging modes.

Mitigation control on the kernel command line and KVM - module parameter
------------------------------------------------------------------------

The KVM hypervisor mitigation mechanism for marking huge pages as
non-executable can be controlled with a module parameter "nx_huge_pages=".
The kernel command line allows controlling the iTLB multihit mitigations at
boot time with the option "kvm.nx_huge_pages=".

The valid arguments for these options are:

==========  ================================================================
force       Mitigation is enabled. In this case, the mitigation implements
            non-executable huge pages in the Linux kernel KVM module. All
            huge pages in the EPT are marked as non-executable.
            If a guest attempts to execute in one of those pages, the page is
            broken down into 4K pages, which are then marked executable.

off         Mitigation is disabled.

auto        Enable mitigation only if the platform is affected and the kernel
            was not booted with the "mitigations=off" command line parameter.
            This is the default option.
==========  ================================================================
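
The effect of the three arguments can be summarised as a decision function.
This is a sketch of the table above, not kernel code: the function name and
its boolean parameters are hypothetical stand-ins for "the platform is
affected" and "the kernel was booted with mitigations=off".

```python
def nx_huge_pages_enabled(mode, platform_affected, mitigations_off):
    """Return True if the mitigation would be active for the given argument."""
    if mode == "force":
        return True
    if mode == "off":
        return False
    if mode == "auto":
        # Default: enable only on affected platforms, unless the kernel was
        # booted with "mitigations=off".
        return platform_affected and not mitigations_off
    raise ValueError(f"unknown nx_huge_pages value: {mode!r}")
```

For example, an affected host booted with "kvm.nx_huge_pages=auto" and
without "mitigations=off" corresponds to
``nx_huge_pages_enabled("auto", True, False)``.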


Mitigation selection guide
--------------------------

1. No virtualization in use
^^^^^^^^^^^^^^^^^^^^^^^^^^^

The system is protected by the kernel unconditionally and no further
action is required.

2. Virtualization with trusted guests
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If the guest comes from a trusted source, you may assume that the guest will
not attempt to maliciously exploit this erratum and no further action is
required.

3. Virtualization with untrusted guests
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If the guest comes from an untrusted source, the host kernel will need
to apply iTLB multihit mitigation via the kernel command line or kvm
module parameter.