I'm trying to parallelize the C++ code below, which computes a continuous Fourier transform of a set of points modeled as Dirac impulses. It compiles and produces correct results, but it only uses one thread. Is there something else I need to do to get multiple threads working? This is on a Mac with 4 cores (8 hardware threads), compiled with GCC 10.
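For reference, this is the quantity the loop evaluates, written out as a formula. It is only my reading of the code: $N$ stands for `num_samples`, $(p_{k,x}, p_{k,y})$ for `pts[k].x` and `pts[k].y`, and $(x, y)$ for the frequency sampled at pixel `(i, j)`.

```latex
F(x, y) = \sum_{k=0}^{N-1} e^{-2\pi i \,(x\, p_{k,x} + y\, p_{k,y})},
\qquad
f_x = \operatorname{Re} F = \sum_{k=0}^{N-1} \cos\!\bigl(-2\pi (x\, p_{k,x} + y\, p_{k,y})\bigr),
\qquad
f_y = \operatorname{Im} F = \sum_{k=0}^{N-1} \sin\!\bigl(-2\pi (x\, p_{k,x} + y\, p_{k,y})\bigr)

\text{fourier\_img}[i \cdot \text{res} + j] = \sqrt{\frac{f_x^2 + f_y^2}{N}} = \frac{|F(x, y)|}{\sqrt{N}}
```

Each output pixel depends only on its own $(x, y)$, so the rows are independent and should be safe to compute in parallel.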
vector<double> GetFourierImage(const Point pts[],
                               const int num_samples,
                               const int res,
                               const double freq_step) {
    vector<double> fourier_img(res*res, 0.0);
    double half_res = 0.5 * res;
    vector<int> rows(res);
    std::iota(rows.begin(), rows.end(), 0);
    std::for_each( // Why doesn't this parallelize?
        std::execution::par_unseq,
        rows.begin(), rows.end(),
        [&](int i) {
            double y = freq_step * (i - half_res);
            for (int j = 0; j < res; j++) {
                double x = freq_step * (j - half_res);
                double fx = 0.0, fy = 0.0;
                for (int pt_idx = 0; pt_idx < num_samples; pt_idx++) {
                    double dot = (x * pts[pt_idx].x) + (y * pts[pt_idx].y);
                    double exp = -2.0 * M_PI * dot;
                    fx += cos(exp);
                    fy += sin(exp);
                }
                fourier_img[i*res + j] = sqrt((fx*fx + fy*fy) / num_samples);
            }
        });
    return fourier_img;
}
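To confirm how many threads a loop actually runs on, a small diagnostic along these lines might help. This is a sketch only: `CountDistinctThreads` and `CountThreadsSerial` are hypothetical helper names, not part of the original code, and the serial runner is just a baseline that should report one thread.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <set>
#include <thread>

// Hypothetical helper (not from the original code): calls `run_loop` with a
// per-iteration callback that records the ID of the calling thread, then
// returns how many distinct threads were observed.
template <typename RunLoop>
std::size_t CountDistinctThreads(RunLoop run_loop) {
    std::mutex mtx;                    // protects `seen` across threads
    std::set<std::thread::id> seen;    // IDs of threads that ran an iteration
    auto record = [&](int /*index*/) {
        std::lock_guard<std::mutex> lock(mtx);
        seen.insert(std::this_thread::get_id());
    };
    run_loop(record);
    return seen.size();
}

// Baseline runner: a plain serial loop over n iterations, which can only
// ever touch the calling thread.
inline std::size_t CountThreadsSerial(int n) {
    assert(n >= 0);
    return CountDistinctThreads([n](auto&& body) {
        for (int i = 0; i < n; ++i) {
            body(i);
        }
    });
}
```

To probe the real loop, the runner lambda could instead call `std::for_each(std::execution::par, rows.begin(), rows.end(), body)` (with `<execution>` included). `par` is used here rather than `par_unseq` because locking a mutex is not allowed under the unsequenced policies. If the result is still 1, the standard library is running the algorithm serially. With GCC's libstdc++ the parallel policies depend on Intel TBB, so it may be worth checking that TBB is installed and that the program is linked with `-ltbb`.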
question from:
https://stackoverflow.com/questions/65866491/why-is-my-parallel-stdfor-each-only-using-1-thread