std::hardware_destructive_interference_size, std::hardware_constructive_interference_size
From cppreference.com
Defined in header <new> | | |
inline constexpr std::size_t hardware_destructive_interference_size = /* implementation-defined */; | (1) | (since C++17) |
inline constexpr std::size_t hardware_constructive_interference_size = /* implementation-defined */; | (2) | (since C++17) |
1) Minimum offset between two objects to avoid false sharing. Guaranteed to be at least alignof(std::max_align_t).
    struct keep_apart
    {
        alignas(std::hardware_destructive_interference_size) std::atomic<int> cat;
        alignas(std::hardware_destructive_interference_size) std::atomic<int> dog;
    };
2) Maximum size of contiguous memory to promote true sharing. Guaranteed to be at least alignof(std::max_align_t).
    struct together
    {
        std::atomic<int> dog;
        int puppy;
    };

    struct kennel
    {
        // Other data members...

        alignas(sizeof(together)) together pack;

        // Other data members...
    };

    static_assert(sizeof(together) <= std::hardware_constructive_interference_size);
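The two guarantees above can be combined in a small sketch. It is only an illustration under stated assumptions: the names `destructive_size`, `constructive_size`, `padded_counter`, `hot_pair` and `total` are hypothetical and not part of the standard library, and the 64-byte fallback is an assumption for toolchains that do not define the feature-test macro.

```cpp
#include <atomic>
#include <cstddef>
#include <new>

// Assumption: fall back to 64 bytes when the library does not provide the constants.
#ifdef __cpp_lib_hardware_interference_size
inline constexpr std::size_t destructive_size = std::hardware_destructive_interference_size;
inline constexpr std::size_t constructive_size = std::hardware_constructive_interference_size;
#else
inline constexpr std::size_t destructive_size = 64;
inline constexpr std::size_t constructive_size = 64;
#endif

// Hypothetical helper: aligning the whole type also pads sizeof() up to a
// multiple of the alignment, so adjacent array elements are kept apart.
struct alignas(destructive_size) padded_counter
{
    std::atomic<std::size_t> value{0};
};

// Hypothetical helper: data that is meant to be used together should fit
// within the constructive interference size.
struct hot_pair
{
    std::atomic<int> flag{0};
    int payload{0};
};
static_assert(sizeof(hot_pair) <= constructive_size);

// Sums all counters; each element could be written by a different thread.
template<std::size_t N>
std::size_t total(const padded_counter (&counters)[N])
{
    std::size_t sum = 0;
    for (const padded_counter& c : counters)
        sum += c.value.load(std::memory_order_relaxed);
    return sum;
}
```

An array such as `padded_counter counters[4];` then gives each thread its own element to update, and `total(counters)` reads the combined result once the threads have finished.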
Notes
These constants provide a portable way to access the L1 data cache line size.
Feature-test macro | Value | Std | Feature |
---|---|---|---|
__cpp_lib_hardware_interference_size | 201703L | (C++17) | constexpr std::hardware_constructive_interference_size and constexpr std::hardware_destructive_interference_size |
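Example
The program uses two threads that atomically write to the data members of the given global objects. The first object fits in one cache line, which results in "hardware interference". The second object keeps its data members on separate cache lines, so the possible "cache synchronization" after thread writes is avoided.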
    #include <atomic>
    #include <chrono>
    #include <cstddef>
    #include <iostream>
    #include <new>
    #include <thread>

    #ifdef __cpp_lib_hardware_interference_size
    using std::hardware_constructive_interference_size;
    using std::hardware_destructive_interference_size;
    #else
    // 64 bytes on x86-64 │ L1_CACHE_BYTES │ L1_CACHE_SHIFT │ __cacheline_aligned │ ...
    constexpr std::size_t hardware_constructive_interference_size = 64;
    constexpr std::size_t hardware_destructive_interference_size = 64;
    #endif

    // Both members are adjacent, so they normally end up on the same cache line.
    struct one_cache_liner
    {
        ::std::atomic_uint64_t x{};
        ::std::atomic_uint64_t y{};
    };

    // Each member starts on its own cache line.
    struct two_cache_liner
    {
        alignas(hardware_destructive_interference_size) ::std::atomic_uint64_t x{};
        alignas(hardware_destructive_interference_size) ::std::atomic_uint64_t y{};
    };

    // Returns a callable that increments the given counter many times.
    inline auto increment_thread(::std::atomic_uint64_t& u)
    {
        return [&u]
        {
            constexpr ::std::size_t max_write_iterations{10'000'000};
            for (::std::size_t i{}; i < max_write_iterations; ++i)
                u.fetch_add(1, ::std::memory_order_relaxed);
        };
    }

    // Runs one writer per member; both jthreads are joined on scope exit.
    template<typename T>
    auto parallel_increment(T&& t)
    {
        ::std::jthread th1{increment_thread(t.x)};
        ::std::jthread th2{increment_thread(t.y)};
    }

    // Prints the elapsed time when it goes out of scope (C++20 duration output).
    struct timer
    {
        timer() : beg(::std::chrono::high_resolution_clock::now()) {}
        ~timer() { ::std::cout << ::std::chrono::high_resolution_clock::now() - beg << '\n'; }
        ::std::chrono::high_resolution_clock::time_point beg;
    };

    int main()
    {
        ::std::cout << "hardware_constructive_interference_size = "
                    << hardware_constructive_interference_size << '\n';
        ::std::cout << "hardware_destructive_interference_size = "
                    << hardware_destructive_interference_size << '\n';

        {
            ::std::cout << "sizeof(one_cache_liner) = " << sizeof(one_cache_liner) << '\n';
            timer t;
            parallel_increment(one_cache_liner{});
        }
        {
            ::std::cout << "sizeof(two_cache_liner) = " << sizeof(two_cache_liner) << '\n';
            timer t;
            parallel_increment(two_cache_liner{});
        }
    }
Possible output:
    hardware_constructive_interference_size = 64
    hardware_destructive_interference_size = 64
    sizeof(one_cache_liner) = 16
    182019200ns
    sizeof(two_cache_liner) = 128
    35766400ns
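The difference in timing comes from where the two members sit in memory. As a rough sketch of that reasoning, the snippet below checks whether two addresses fall into the same block of `line_size` bytes. The names `line_size`, `compact_pair`, `spread_pair` and `same_block` are hypothetical; the two structs only mirror the layouts used in the example, and treating the destructive interference size (or the assumed 64-byte fallback) as the block size is an approximation of the real cache line size.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <new>

// Assumption: fall back to 64 bytes when the library does not provide the constant.
#ifdef __cpp_lib_hardware_interference_size
inline constexpr std::size_t line_size = std::hardware_destructive_interference_size;
#else
inline constexpr std::size_t line_size = 64;
#endif

// Mirrors one_cache_liner: the members are adjacent.
struct compact_pair
{
    std::atomic_uint64_t x{};
    std::atomic_uint64_t y{};
};

// Mirrors two_cache_liner: each member starts at a multiple of line_size.
struct spread_pair
{
    alignas(line_size) std::atomic_uint64_t x{};
    alignas(line_size) std::atomic_uint64_t y{};
};

// Hypothetical helper: true if both addresses lie in the same
// line_size-aligned block of memory.
inline bool same_block(const void* a, const void* b)
{
    const auto pa = reinterpret_cast<std::uintptr_t>(a);
    const auto pb = reinterpret_cast<std::uintptr_t>(b);
    return pa / line_size == pb / line_size;
}
```

For an object declared as `alignas(line_size) compact_pair c;`, `same_block(&c.x, &c.y)` holds, while for any `spread_pair s` the members are `line_size` bytes apart and `same_block(&s.x, &s.y)` does not hold. Without the explicit alignment a `compact_pair` could straddle a block boundary, so the check depends on where the object happens to be placed.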
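See also
hardware_concurrency [static] | returns the number of concurrent threads supported by the implementation (public static member function of std::thread) |
hardware_concurrency [static] | returns the number of concurrent threads supported by the implementation (public static member function of std::jthread) |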