CPU features
		
		
		
		
		
Revision as of 08:42, 24 March 2023
Compiler CPU flags
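The TensorFlow source build can be customized to take advantage of CPU features available on the target system, which can speed up the execution of TensorFlow code.
The CPU flags available on the target system can be listed with the following command. The Linux kernel source file arch/x86/include/asm/cpufeatures.h (https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/arch/x86/include/asm/cpufeatures.h) helps explain the meaning of each flag: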
$ more /proc/cpuinfo | grep flags
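The same check can be scripted. The following is a minimal Python sketch, not part of the original build instructions: the helper name `parse_cpu_flags` and the `FLAGS_OF_INTEREST` tuple are hypothetical, and the script assumes an x86 Linux system where /proc/cpuinfo contains a line whose key is exactly `flags`. Matching the key exactly also skips similarly named lines such as `vmx flags`, which a plain `grep flags` would include.

```python
from pathlib import Path

# Flags discussed on this page (hypothetical selection for illustration).
FLAGS_OF_INTEREST = ("ssse3", "sse4_1", "sse4_2", "fma", "cx16", "popcnt", "avx", "avx2")


def parse_cpu_flags(cpuinfo_text):
    """Return the set of flags on the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Compare the whole key so that lines like 'vmx flags' are ignored.
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    # Fall back to an empty string on systems without /proc/cpuinfo.
    text = path.read_text() if path.exists() else ""
    flags = parse_cpu_flags(text)
    for name in FLAGS_OF_INTEREST:
        print(f"{name:8s} {'yes' if name in flags else 'no'}")
```

Run on the target system, the script prints one line per flag of interest with `yes` or `no`.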
| No | Flag | CPU Feature | Additional Info | 
|---|---|---|---|
| 1 | ssse3 | Supplemental Streaming SIMD Extensions 3 (SSSE-3) instruction set | |
| 2 | sse4_1 | Streaming SIMD Extensions 4.1 (SSE-4.1) instruction set | |
| 3 | sse4_2 | Streaming SIMD Extensions 4.2 (SSE-4.2) instruction set | |
| 4 | fma | Fused multiply-add (FMA) instruction set | |
| 5 | cx16 | CMPXCHG16B instruction (double-width compare-and-swap) | |
| 6 | popcnt | Population count instruction (count number of bits set to 1) | |
| 7 | avx | Advanced Vector Extensions (AVX) | |
| 8 | avx2 | Advanced Vector Extensions 2 (AVX2) | |
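To turn the flags in the table into build options, one possible approach is to map each flag to the corresponding GCC/Clang `-m` option and pass it to Bazel through `--copt`. The sketch below is an illustration under that assumption. The mapping and the helper name `copt_args` are not taken from this page, and the exact set of options appropriate for a given TensorFlow version should be checked against its build documentation.

```python
# Assumed mapping from /proc/cpuinfo flag names to GCC/Clang machine options.
GCC_OPTION = {
    "ssse3": "-mssse3",
    "sse4_1": "-msse4.1",
    "sse4_2": "-msse4.2",
    "fma": "-mfma",
    "cx16": "-mcx16",
    "popcnt": "-mpopcnt",
    "avx": "-mavx",
    "avx2": "-mavx2",
}


def copt_args(available_flags):
    """Return Bazel --copt arguments for the table flags the CPU reports."""
    return [
        f"--copt={option}"
        for flag, option in GCC_OPTION.items()
        if flag in available_flags
    ]


# Example: a CPU reporting fma, avx and avx2 (fpu is not in the table and is skipped).
print(" ".join(copt_args({"fma", "avx", "avx2", "fpu"})))
# prints: --copt=-mfma --copt=-mavx --copt=-mavx2
```

Only flags that the target CPU actually reports should be enabled, since a binary built with an unsupported instruction set will fail with an illegal-instruction error on that machine.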