fix: Correct or supplement the content of Chapters 1 and 2 #1
Open
whx-6 wants to merge 23 commits into
myrfy001
reviewed
May 20, 2026
| - <strong>Large cache hierarchy</strong>: L1, L2, and L3 caches that reduce the average latency of instruction and data accesses | ||
|
|
||
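| This design lets the CPU efficiently run programs with complex control flow, heavy branching, and unpredictable memory access patterns. A typical modern CPU core can execute dozens of threads simultaneously (usually 2-4 hardware threads). | ||
| This design lets the CPU efficiently run programs with complex control flow, heavy branching, and unpredictable memory access patterns. A typical modern (multi-core) CPU can execute dozens of hardware threads simultaneously in total, with each core usually supporting 1~2 threads. |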
Contributor
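After this change, the sentence reads as too colloquial and no longer matches the style of the rest of the text. The description also needs to be checked for correctness. Please compare the wording before and after the change carefully and confirm whether each version describes hardware threads rigorously.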
| > | ||
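| > (Although the 16x16 (256-thread) block size is arbitrary in this case, it is a common choice. The grid is created with enough thread blocks to have one thread per matrix element, as before. For simplicity, this example assumes that the number of threads per grid in each dimension is evenly divisible by the number of threads per block in that dimension, although that need not be the case.) | ||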
|
|
||
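| **Note**: The code above uses the `float A[N][N]` syntax, which assumes `N` is a compile-time constant, so it is illustrative only. In real code, matrix data is usually accessed through a one-dimensional `float*` pointer with explicit index computation; see the actual code in Section 2.7.3. |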
Contributor
On this point, could we add an exercise asking readers why a one-dimensional pointer is used instead of the array syntax?
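For readers following this thread, the distinction can be sketched in plain host-side C++. This is a minimal illustration, not the book's code from Section 2.7.3: the names `flat_index`, `blocks_needed`, and `mat_add` are hypothetical. It shows row-major indexing through a flat `float*` with a matrix size known only at run time, and the ceiling division commonly used to size a grid when the matrix dimension is not evenly divisible by the block dimension.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major offset of element (row, col) in an n x n matrix stored in one
// flat buffer. Hypothetical helper name, for illustration only.
inline std::size_t flat_index(std::size_t row, std::size_t col, std::size_t n) {
    return row * n + col;
}

// Number of blocks needed to cover n elements with block_dim threads per
// block, using ceiling division so that a partial last block is counted.
inline std::size_t blocks_needed(std::size_t n, std::size_t block_dim) {
    assert(block_dim > 0);
    return (n + block_dim - 1) / block_dim;
}

// C = A + B for n x n matrices passed as flat pointers. Unlike a
// float A[N][N] parameter, n does not have to be a compile-time constant.
void mat_add(const float* a, const float* b, float* c, std::size_t n) {
    for (std::size_t row = 0; row < n; ++row) {
        for (std::size_t col = 0; col < n; ++col) {
            const std::size_t i = flat_index(row, col, n);
            c[i] = a[i] + b[i];
        }
    }
}
```

A CUDA kernel would typically use the same `row * n + col` expression, with `row` and `col` derived from the block and thread indices and guarded by a bounds check when the grid is rounded up.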
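1. Fix the description of the CPU thread count
2. Correct the CUDA release year: 2007→2006
3. Improve the layout of tables and text
4. Add an explanation of the memory-fence role of `__syncthreads()`
5. Add a VLA warning note to the Chapter 2 matrix addition example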