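Continuing from the previous post on C-states
The previous post mentioned that besides saving energy with C-states, you can also cut power with P-states. Let's take a look at P-states.
P-states
P-states only apply while the CPU is in C-state C0. In other words, they matter only while the CPU is executing instructions; the rest of the time P-states get to slack off. A P-state can essentially only change the voltage and the CPU frequency to reduce energy consumption. Of the available voltage/frequency pairs, the one with the highest frequency and voltage is called P0, the highest-performance state. In practice, P-states are for when the system is running but does not need to go full throttle (all performance unlocked): the CPU can then drop to a lower voltage and frequency that is just enough to keep the system going. The core idea is thrift. As the saying goes, eat popcorn at the dance hall: save where you can, spend where you must.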
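To get a feel for why lowering both voltage and frequency pays off, here is a rough sketch. It relies on the common first-order approximation that dynamic CPU power scales with C·V²·f, which is a textbook rule of thumb rather than something measured on this machine. The function name and the two operating points are made up for illustration, and leakage power is ignored.

```python
def relative_dynamic_power(voltage, freq_mhz, ref_voltage, ref_freq_mhz):
    """Return dynamic power relative to a reference operating point.

    Uses the first-order approximation P ~ C * V^2 * f. The capacitance C
    cancels out in the ratio. Leakage (static) power is ignored.
    """
    if ref_voltage <= 0 or ref_freq_mhz <= 0:
        raise ValueError("reference voltage and frequency must be positive")
    return (voltage ** 2 * freq_mhz) / (ref_voltage ** 2 * ref_freq_mhz)


if __name__ == "__main__":
    # Hypothetical operating points, chosen only for illustration.
    p0 = (1.20, 3600.0)   # (volts, MHz): a high-performance state
    low = (0.70, 800.0)   # (volts, MHz): a low-power state
    ratio = relative_dynamic_power(low[0], low[1], p0[0], p0[1])
    print(f"low state draws about {ratio:.1%} of the P0 dynamic power")
```

Because voltage enters squared, dropping it together with the frequency saves far more than lowering the frequency alone.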
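Who controls P-states
OS-controlled
Here the operating system keeps track of the P-states and requests a specific one. Essentially, it picks a CPU frequency; the CPU then adjusts its voltage to sustain that frequency, and the clock generator locks onto it.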
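On Linux, the OS side of this is visible through the cpufreq sysfs files. The sketch below only reads them. It assumes a Linux machine with a cpufreq driver loaded and looks at CPU 0 only; the function name `read_cpufreq` is made up for this example.

```python
from pathlib import Path

# Standard Linux cpufreq sysfs location for CPU 0. Assumption: the machine
# runs Linux with a cpufreq driver loaded; otherwise the files are absent.
CPUFREQ_DIR = Path("/sys/devices/system/cpu/cpu0/cpufreq")


def read_cpufreq(cpufreq_dir=CPUFREQ_DIR):
    """Return the governor and frequency values (kHz) found in cpufreq_dir.

    Missing or unreadable files are reported as None.
    """
    names = ("scaling_governor", "scaling_min_freq",
             "scaling_max_freq", "scaling_cur_freq")
    info = {}
    for name in names:
        try:
            text = (Path(cpufreq_dir) / name).read_text().strip()
        except OSError:
            info[name] = None
            continue
        # Frequencies are plain integers in kHz; the governor is a word.
        info[name] = int(text) if text.isdigit() else text
    return info


if __name__ == "__main__":
    for key, value in read_cpufreq().items():
        print(f"{key}: {value}")
```

Passing a different directory (for example `cpu3/cpufreq`) inspects another core.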
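HW-controlled
In short, the hardware itself picks the P-state, provided it supports doing so; this is HWP (Hardware-Controlled Performance States). The OS only passes along the workload's requirements and does not change the P-state itself. The output further down includes information on this hardware-assisted feature.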
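A quick way to check whether a Linux box advertises HWP is to look for the `hwp*` flags in `/proc/cpuinfo`. This is a minimal sketch under that assumption (Linux, Intel CPU); `hwp_flags` is a hypothetical helper name, and the absence of flags only means the kernel did not report them.

```python
def hwp_flags(cpuinfo_text):
    """Return the set of HWP-related flags found in /proc/cpuinfo text.

    Only the first "flags" line is inspected, which describes the first
    logical CPU listed.
    """
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return {flag for flag in value.split() if flag.startswith("hwp")}
    return set()


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as fh:
            flags = hwp_flags(fh.read())
    except OSError:
        flags = set()
    print("HWP flags:", sorted(flags) if flags else "none found")
```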
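Turbo (Intel® Turbo Boost)
As the name suggests, this technology lets the CPU run above its base frequency when conditions allow. TDP (thermal design power) is the power the CPU is designed to sustain under load. When power draw is below that value and certain other conditions are met, the CPU frequency can exceed the base frequency (2711 MHz on this machine, whose non-turbo range is 803 to 2711 MHz). Turbo Boost can temporarily raise power up to PL2, but only for a short time; in the output below, PL2 is 56 W for the package and 130 W for the platform.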
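If the machine uses the Linux `intel_pstate` driver, whether turbo is currently allowed can be read from its `no_turbo` file. The sketch below assumes that driver is active; with other drivers the file does not exist and the function returns `None`. The name `turbo_allowed` is made up for this example.

```python
from pathlib import Path

# Exposed by the Linux intel_pstate driver; absent with other drivers.
NO_TURBO = Path("/sys/devices/system/cpu/intel_pstate/no_turbo")


def turbo_allowed(no_turbo_path=NO_TURBO):
    """Return True if turbo is allowed, False if disabled, None if unknown.

    The file contains "0" when turbo is allowed and "1" when it is disabled.
    """
    try:
        return Path(no_turbo_path).read_text().strip() == "0"
    except OSError:
        return None


if __name__ == "__main__":
    state = turbo_allowed()
    if state is None:
        print("no_turbo not available (intel_pstate not in use?)")
    else:
        print("turbo allowed" if state else "turbo disabled")
```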
A peek at the CPU's features
I knew you'd like this part. Time to serve the dish:
Processor [Intel(R) Core(TM) i7-6820HQ CPU @ 2.70GHz]
|- Architecture [Skylake/S]
|- Vendor ID [GenuineIntel]
|- Microcode [0x000000ea]
|- Signature [ 06_5E]
|- Stepping [ 3]
|- Online CPU [ 8/ 8]
|- Base Clock [100.438]
|- Frequency (MHz) Ratio
Min 803.51 < 8 >
Max 2711.83 < 27 >
|- Factory [100.000]
2700 [ 27 ]
|- Performance
|- P-State
TGT 803.51 < 8 >
|- HWP
Min 803.51 < 8 >
Max 3615.77 < 36 >
TGT AUTO < 0 >
|- Turbo Boost [ UNLOCK]
1C 3615.77 < 36 >
2C 3414.90 < 34 >
3C 3314.46 < 33 >
4C 3214.02 < 32 >
|- Uncore [ UNLOCK]
Min 803.51 < 8 >
Max 3615.77 < 36 >
|- TDP Level < 0:3 >
|- Programmable [ UNLOCK]
|- Configuration [ UNLOCK]
|- Turbo Activation [ UNLOCK]
Nominal 2711.83 [ 27 ]
Level1 2209.64 [ 22 ]
Turbo 3515.34 < 35 >
Instruction Set Extensions
|- 3DNow!/Ext [N/N] ADX [Y] AES [Y] AVX/AVX2 [Y/Y]
|- AVX512-F [N] AVX512-DQ [N] AVX512-IFMA [N] AVX512-PF [N]
|- AVX512-ER [N] AVX512-CD [N] AVX512-BW [N] AVX512-VL [N]
|- AVX512-VBMI [N] AVX512-VBMI2 [N] AVX512-VNMI [N] AVX512-ALG [N]
|- AVX512-VPOP [N] AVX512-VNNIW [N] AVX512-FMAPS [N] AVX512-VP2I [N]
|- AVX512-BF16 [N] BMI1/BMI2 [Y/Y] CLWB [N] CLFLUSH/O [Y/Y]
|- CLAC-STAC [Y] CMOV [Y] CMPXCHG8B [Y] CMPXCHG16B [Y]
|- F16C [Y] FPU [Y] FXSR [Y] LAHF-SAHF [Y]
|- MMX/Ext [Y/N] MON/MWAITX [Y/N] MOVBE [Y] PCLMULQDQ [Y]
|- POPCNT [Y] RDRAND [Y] RDSEED [Y] RDTSCP [Y]
|- SEP [Y] SHA [N] SSE [Y] SSE2 [Y]
|- SSE3 [Y] SSSE3 [Y] SSE4.1/4A [Y/N] SSE4.2 [Y]
|- SERIALIZE [N] SYSCALL [Y] SGX [Y] RDPID [N]
Features
|- 1 GB Pages Support 1GB-PAGES [Capable]
|- Advanced Configuration & Power Interface ACPI [Capable]
|- Advanced Programmable Interrupt Controller APIC [Capable]
|- APIC Timer Invariance ARAT [Capable]
|- Core Multi-Processing CMP Legacy [Missing]
|- L1 Data Cache Context ID CNXT-ID [Missing]
|- Direct Cache Access DCA [Missing]
|- Debugging Extension DE [Capable]
|- Debug Store & Precise Event Based Sampling DS, PEBS [Capable]
|- CPL Qualified Debug Store DS-CPL [Capable]
|- 64-Bit Debug Store DTES64 [Capable]
|- Fast-String Operation Fast-Strings [Capable]
****** Too many features, the rest are omitted ******
Technologies
|- Data Cache Unit
|- L1 Prefetcher L1 HW < ON>
|- L1 IP Prefetcher L1 HW IP < ON>
|- L2 Prefetcher L2 HW < ON>
|- L2 Line Prefetcher L2 HW CL < ON>
|- System Management Mode SMM-Dual [ ON]
|- Hyper-Threading HTT [ ON]
|- SpeedStep EIST < ON>
|- Dynamic Acceleration IDA [ ON]
|- Turbo Boost TURBO < ON>
|- Energy Efficiency Optimization EEO < ON>
|- Race To Halt Optimization R2H < ON>
|- Watchdog Timer TCO < ON>
|- Virtualization VMX [ ON]
|- I/O MMU VT-d [ ON]
|- Version [ 1.0]
|- Hypervisor [OFF]
|- Vendor ID [ N/A]
Performance Monitoring
|- Version PM [ 4]
|- Counters: General Fixed
| 4 x 48 bits 3 x 48 bits
|- Enhanced Halt State C1E <OFF>
|- C1 Auto Demotion C1A < ON>
|- C3 Auto Demotion C3A < ON>
|- C1 UnDemotion C1U < ON>
|- C3 UnDemotion C3U < ON>
|- C6 Core Demotion CC6 <OFF>
|- C6 Module Demotion MC6 <OFF>
|- Legacy Frequency ID control FID [OFF]
|- Legacy Voltage ID control VID [OFF]
|- P-State Hardware Coordination Feedback MPERF/APERF [ ON]
|- Hardware-Controlled Performance States HWP < ON>
|- Capabilities (MHz) Ratio
Lowest 100.44 [ 1 ]
Efficient 803.51 [ 8 ]
Guaranteed 2711.83 [ 27 ]
Highest 3615.78 [ 36 ]
|- Hardware Duty Cycling HDC < ON>
|- Package C-States
|- Configuration Control CONFIG [ LOCK]
|- Lowest C-State LIMIT < C8>
|- I/O MWAIT Redirection IOMWAIT <Disable>
|- Max C-State Inclusion RANGE < C8>
|- Core C-States
|- C-States Base Address BAR [ 0x1814]
|- MONITOR/MWAIT
|- State index: #0 #1 #2 #3 #4 #5 #6 #7
|- Sub C-State: 0 2 1 2 4 1 1 1
|- Core Cycles [Capable]
|- Instructions Retired [Capable]
|- Reference Cycles [Capable]
|- Last Level Cache References [Capable]
|- Last Level Cache Misses [Capable]
|- Branch Instructions Retired [Capable]
|- Branch Mispredicts Retired [Capable]
|- Top-down slots Counter [Capable]
Power, Current & Thermal
|- Clock Modulation ODCM <Disable>
|- DutyCycle [ 0.00%]
|- Power Management PWR MGMT [ LOCK]
|- Energy Policy Bias Hint < 6>
|- Energy Policy HWP EPP < 128>
|- Temperature Offset:Junction TjMax [ 3:100 C]
|- Digital Thermal Sensor DTS [Capable]
|- Power Limit Notification PLN [Capable]
|- Package Thermal Management PTM [Capable]
|- Thermal Monitor 1 TM1 [ Enable]
|- Thermal Monitor 2 TM2 [Capable]
|- Thermal Design Power TDP [ 45 W]
|- Minimum Power Min [Missing]
|- Maximum Power Max [Missing]
|- Thermal Design Power Package < Enable>
|- Power Limit (28 sec) PL1 < 45 W>
|- Power Limit (1 sec) PL2 < 56 W>
|- Thermal Design Power Core <Disable>
|- Power Limit PL1 [Missing]
|- Thermal Design Power Uncore <Disable>
|- Power Limit PL1 [Missing]
|- Thermal Design Power DRAM <Disable>
|- Power Limit PL1 [Missing]
|- Thermal Design Power Platform < Enable>
|- Power Limit (28 sec) PL1 < 45 W>
|- Power Limit (1 sec) PL2 < 130 W>
|- Electrical Design Current EDC [Missing]
|- Thermal Design Current TDC [Missing]
|- Core Thermal Point
|- DTS Threshold #1 Threshold [Missing]
|- DTS Threshold #2 Threshold [Missing]
|- Package Thermal Point
|- DTS Threshold #1 Threshold [Missing]
|- DTS Threshold #2 Threshold [Missing]
The output above contains the P-state and C-state information, along with Turbo Boost and related data.
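The MHz figures in the output line up with ratio × base clock, so the ratio column can be turned back into a frequency by hand. The sketch below assumes that is how the tool derives them; the numbers agree with the listing to within rounding. `ratio_to_mhz` is a made-up helper name.

```python
BASE_CLOCK_MHZ = 100.438  # "Base Clock" value from the output above


def ratio_to_mhz(ratio, base_clock_mhz=BASE_CLOCK_MHZ):
    """Convert a frequency ratio (multiplier) to MHz."""
    return ratio * base_clock_mhz


if __name__ == "__main__":
    # Ratios taken from the output above: minimum, base, and 1-core turbo.
    for label, ratio in (("Min", 8), ("Max", 27), ("Turbo 1C", 36)):
        print(f"{label:9s} ratio {ratio:2d} -> {ratio_to_mhz(ratio):8.2f} MHz")
```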
Current CPU frequency and voltage:
CPU  Freq(MHz)   VID     Min     Vcore     Max
000    34.30       0   0.0000   0.0000   0.0000
001    13.65    7835   0.6978   0.9564   1.2351
002    77.69       0   0.0000   0.0000   0.0000
003    12.96       0   0.0000   0.0000   0.0000
004    33.66       0   0.0000   0.0000   0.0000
005    16.38       0   0.0000   0.0000   0.0000
006    18.85       0   0.0000   0.0000   0.0000
007    17.84       0   0.0000   0.0000   0.0000
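The VID and Vcore columns for CPU 001 are consistent with a step of 1/8192 V per VID unit (7835 / 8192 ≈ 0.9564). The sketch below assumes that step size, which matches how Intel documents the voltage field of the performance status MSR on recent Core processors; whether the tool computes Vcore exactly this way is an assumption. `vid_to_volts` is a made-up helper name.

```python
VID_STEP_VOLTS = 1.0 / 8192  # assumed step: 2^-13 V per VID unit


def vid_to_volts(vid, step=VID_STEP_VOLTS):
    """Convert a raw VID reading to volts under the assumed step size."""
    if vid < 0:
        raise ValueError("VID must be non-negative")
    return vid * step


if __name__ == "__main__":
    # VID value for CPU 001 from the table above.
    print(f"VID 7835 -> {vid_to_volts(7835):.4f} V")
```

A VID of 0, as shown for the other cores, simply maps to 0 V here; the table likely reports it that way because no reading was taken for those cores.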
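Summary
Most machines today already rely on hardware assistance to optimize P-states, so there is little room left for users to tune anything. Feel like you learned all this for nothing? That's exactly right. More and more techniques will be taken over by hardware, just as automation replaced manual labor. All we can do is get crushed by the rolling wheels of history, or study hard and run naked down the road on the way to being crushed.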