
hw07 done! #1

Open
yangyueren wants to merge 1 commit into parallel101:main from yangyueren:main


@yangyueren


Please see ANSWER.md.

@archibate archibate left a comment

Contributor


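Thanks for being the first to submit the homework!

  • Meets the basic homework requirements: 42/50 points
  • Explains the answers in your own words in ANSWER.md: 23/25 points
  • Code is consistently formatted and cross-platform: 4/5 points
  • Has original ideas of your own: 11/20 points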

Comment thread main.cpp
// These two are temporaries; what can be optimized here? (5 points)
Matrix Rt, RtA;
// ans: Make them static variables with their storage allocated up front.
static Matrix Rt(std::array<std::size_t, 2>{1024, 1024}), RtA(std::array<std::size_t, 2>{1024, 1024});


Suggested change
static Matrix Rt(std::array<std::size_t, 2>{1024, 1024}), RtA(std::array<std::size_t, 2>{1024, 1024});
static thread_local Matrix Rt, RtA;

I think it is fine for them to start out empty. thread_local ensures nothing goes wrong if multiple threads access them.
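A minimal sketch of the suggested pattern. The homework's `Matrix` class is not shown in this thread, so a `std::vector<float>` stands in for it, and `sum_of_doubled` is a hypothetical function made up for illustration. The buffer starts empty, grows on first use, and each thread gets its own copy.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the PR): doubles every input value into a
// scratch buffer, then returns the sum of the doubled values.
float sum_of_doubled(const std::vector<float>& in) {
    // One instance per thread, so concurrent callers never share storage.
    // It starts empty; resize() only allocates when the capacity must grow,
    // so repeated calls on the same thread reuse the earlier allocation.
    static thread_local std::vector<float> scratch;
    scratch.resize(in.size());

    for (std::size_t i = 0; i < in.size(); ++i) {
        scratch[i] = 2.0f * in[i];
    }

    float sum = 0.0f;
    for (float v : scratch) {
        sum += v;
    }
    return sum;
}
```

Compared with pre-sizing the buffers to 1024 x 1024, this form does not hard-code a problem size, at the cost of one allocation on the first call per thread.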

Comment thread main.cpp
for(int i=0; i<nx; i+=32){
for(int t=0; t<nt; t++){
for(int i_block=i; i_block<i+32; i_block++){
out(i,j) += lhs(i_block, t) * rhs(t, j);


Suggested change
out(i,j) += lhs(i_block, t) * rhs(t, j);
out(i_block,j) += lhs(i_block, t) * rhs(t, j);

Missed one?
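For reference, a self-contained sketch of the blocked loop with the corrected index. `Mat` is a hypothetical stand-in for the homework's `Matrix`, assumed here to store the first index contiguously (as the author's answer states for the x axis), and `matmul_blocked` is an illustrative name. The key point is that the innermost loop variable indexes both `out` and `lhs`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for Matrix: element (i, j) lives at data[i + j * rows],
// so the first index is the contiguous one (an assumption for this sketch).
struct Mat {
    std::size_t rows, cols;
    std::vector<float> data;
    Mat(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0f) {}
    float& operator()(std::size_t i, std::size_t j) { return data[i + j * rows]; }
    float operator()(std::size_t i, std::size_t j) const { return data[i + j * rows]; }
};

// Computes out = lhs * rhs, walking the first index in blocks of `block`.
Mat matmul_blocked(const Mat& lhs, const Mat& rhs, std::size_t block = 32) {
    assert(lhs.cols == rhs.rows);
    assert(block > 0);
    Mat out(lhs.rows, rhs.cols);  // the constructor already zero-fills
    for (std::size_t j = 0; j < rhs.cols; ++j) {
        for (std::size_t i = 0; i < lhs.rows; i += block) {
            // Clamp the block so sizes that are not multiples of `block` work.
            const std::size_t i_end = std::min(i + block, lhs.rows);
            for (std::size_t t = 0; t < lhs.cols; ++t) {
                for (std::size_t ib = i; ib < i_end; ++ib) {
                    // Both out and lhs use ib, so the inner loop is contiguous.
                    out(ib, j) += lhs(ib, t) * rhs(t, j);
                }
            }
        }
    }
    return out;
}
```

Writing to `out(i, j)` instead would accumulate a whole block of products into a single element and leave the rest of the block untouched.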

Comment thread main.cpp
for (int y = 0; y < ny; y++) {
float val = wangsrng(x, y).next_float();
out(x, y) = val;
// ans: The matrix is stored contiguously along the x axis, but the inner loop runs over y, so the accesses are strided, which is bad for the cache;


10
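A small sketch of the loop order the answer implies, assuming element (x, y) is stored at `x + y * nx` so that x is the contiguous axis. `fill_grid` and the value formula are placeholders made up for illustration; the homework fills the matrix from `wangsrng` instead.

```cpp
#include <cstddef>
#include <vector>

// Fills an nx-by-ny grid stored with x contiguous: (x, y) is at x + y * nx.
std::vector<float> fill_grid(std::size_t nx, std::size_t ny) {
    std::vector<float> out(nx * ny);
    for (std::size_t y = 0; y < ny; ++y) {
        // x is the inner loop, so consecutive iterations touch adjacent floats.
        for (std::size_t x = 0; x < nx; ++x) {
            // Placeholder value standing in for the random number generator.
            out[x + y * nx] = static_cast<float>(x) + 10.0f * static_cast<float>(y);
        }
    }
    return out;
}
```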

Comment thread main.cpp
out(y, x) = in(x, y);
}
}
// ans: out is accessed contiguously, but in is accessed with a stride and does not fit in the cache. This should be changed to a blocked transpose.


15
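The blocked transpose mentioned in the answer could look roughly like the sketch below. It assumes flat storage with the first index contiguous, and the name `transpose_blocked` and the default block size of 32 are illustrative choices, not taken from the PR. Within one tile, both the rows read from `in` and the columns written to `out` stay within a small region that can remain in cache.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Transposes an nx-by-ny grid (element (x, y) at x + y * nx) into an
// ny-by-nx grid (element (y, x) at y + x * ny), one tile at a time.
std::vector<float> transpose_blocked(const std::vector<float>& in,
                                     std::size_t nx, std::size_t ny,
                                     std::size_t block = 32) {
    assert(in.size() == nx * ny);
    assert(block > 0);
    std::vector<float> out(nx * ny);
    for (std::size_t yb = 0; yb < ny; yb += block) {
        for (std::size_t xb = 0; xb < nx; xb += block) {
            // Clamp the tile so sizes that are not multiples of `block` work.
            const std::size_t y_end = std::min(yb + block, ny);
            const std::size_t x_end = std::min(xb + block, nx);
            for (std::size_t x = xb; x < x_end; ++x) {
                for (std::size_t y = yb; y < y_end; ++y) {
                    out[y + x * ny] = in[x + y * nx];
                }
            }
        }
    }
    return out;
}
```

The best block size depends on the cache sizes of the target machine, so 32 is only a starting point to measure against.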

Comment thread main.cpp
out(x, y) = 0; // Is manual initialization necessary? (5 points)
for (int t = 0; t < nt; t++) {
out(x, y) += lhs(x, t) * rhs(t, y);
// ans: lhs is accessed with a stride, rhs is accessed contiguously, and out stays fixed, so the loop cannot be vectorized.


9, one index was missed.

Comment thread main.cpp
TICK(matrix_RtAR);
// These two are temporaries; what can be optimized here? (5 points)
Matrix Rt, RtA;
// ans: Make them static variables with their storage allocated up front.


3, thread_local should be added, and there is no need to initialize the size.

Comment thread main.cpp
// #pragma omp parallel for collapse(2)
// for (int y = 0; y < ny; y++) {
// for (int x = 0; x < nx; x++) {
// out(x, y) = 0; // Is manual initialization necessary? (5 points)


5
