Merge pull request #59 from CloudOS-Group3/feature/gpu
fea: doc
Showing 4 changed files with 274 additions and 43 deletions.
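@@ -0,0 +1,71 @@
# GPU

### GPU job submission flow

1. **`kubectl` creates a GPU job from a configuration file**

The main configuration parameters are `job-name`, `ntasks-per-node`, and the other parameters required by the supercomputing platform, plus `sourcePath`, the path where the code and build files are stored.

2. **`apiServer` stores the GPU job and sends a pod-creation message**

3. **`kubelet` creates the pod instance**

When the container is created, the **parameters** from the configuration file are passed in as **environment variables**.

The required CUDA source code and build files are passed into the container through a **volume mapping**.

The image used to create the container includes **`gpu_server.py`**, which does the following:

- Reads the parameters with `os.getenv` (environment variable names are upper case with underscores)
- Writes the `.slurm` job script from those parameters
- Connects to the supercomputing platform and transfers the required files
- Loads the runtime environment, compiles the program, and submits the job
- Polls for completion by checking whether a result file has appeared at the specified path
- On completion, sends a message (job-name and status) to the apiServer; the result files are returned to the host through the volume mapping

4. **`apiServer` receives the job-completion message and updates the job status**
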
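The first, second, and fifth tasks of `gpu_server.py` listed above can be sketched roughly as follows. This is a minimal illustration, not the actual script: the variable names `JOB_NAME` and `NTASKS_PER_NODE`, their defaults, the helper names, and the `./matrix_mul` command are assumptions chosen to match the "upper case with underscores" rule. Only `#SBATCH --job-name` and `#SBATCH --ntasks-per-node` are standard Slurm directives.

```python
import os
import time
from pathlib import Path


def read_job_params(env=os.environ):
    """Collect job parameters from upper-case, underscore-separated variables.

    The variable names and defaults are hypothetical.
    """
    return {
        "job-name": env.get("JOB_NAME", "gpu-job"),
        "ntasks-per-node": env.get("NTASKS_PER_NODE", "1"),
    }


def build_slurm_script(params, run_command):
    """Render a minimal .slurm batch script from the job parameters."""
    lines = ["#!/bin/bash"]
    for key, value in params.items():
        lines.append(f"#SBATCH --{key}={value}")
    lines.append("")
    lines.append(run_command)
    return "\n".join(lines) + "\n"


def wait_for_result(result_path, interval=1.0, timeout=60.0):
    """Poll until result_path exists; return True if it appeared in time."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if Path(result_path).exists():
            return True
        time.sleep(interval)
    return Path(result_path).exists()


if __name__ == "__main__":
    # Print the script that would be written for the current environment.
    print(build_slurm_script(read_job_params(), "./matrix_mul"))
```

Passing the environment as a plain mapping keeps the script generation testable without touching the real process environment. The timeout in `wait_for_result` is an addition of this sketch; the source only says that the script polls for the result file.
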
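### CUDA program notes

The sample program is located in `minik8s/testdata/Gpu`.

The program consists of the following parts:

1. CUDA kernel function

The kernel uses the row and column indices of the current thread to implement the computation.

```c
// IDX2C is defined elsewhere in the sample; the indexing below is consistent
// with a column-major macro of the form IDX2C(i, j, ld) = j * ld + i (assumed).
__global__ void matrixMulKernel(float *A, float *B, float *C, int m, int k, int n) {
    int row = blockDim.y * blockIdx.y + threadIdx.y;
    int col = blockDim.x * blockIdx.x + threadIdx.x;

    // Threads that fall outside the m x n result matrix do nothing.
    if (row < m && col < n) {
        float sum = 0.0f;
        for (int i = 0; i < k; ++i) {
            sum += A[IDX2C(row, i, m)] * B[IDX2C(i, col, k)];
        }
        C[IDX2C(row, col, m)] = sum;
    }
}
```
2. Memory allocation and initialization
3. Configuring the kernel launch parameters
This is the key to exploiting the GPU's parallelism: configure the thread-block size and the grid size.
```c
// Configure the kernel launch parameters: 16 x 16 threads per block,
// with the grid rounded up so that it covers all N columns and M rows.
dim3 threadsPerBlock(16, 16);
dim3 blocksPerGrid((N + threadsPerBlock.x - 1) / threadsPerBlock.x,
                   (M + threadsPerBlock.y - 1) / threadsPerBlock.y);
// Launch the matrix multiplication kernel.
matrixMulKernel<<<blocksPerGrid, threadsPerBlock>>>(devPtrA, devPtrB, devPtrC, M, K, N);
```

4. Copying the result back to the host and freeing memory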
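
To make the indexing and the grid-size arithmetic concrete, the following sketch emulates the kernel launch on the CPU, with one loop iteration standing in for each CUDA thread. It assumes that `IDX2C(i, j, ld)` is the usual column-major macro `j * ld + i`, which the excerpt does not show; the function names are hypothetical.

```python
def idx2c(i, j, ld):
    """Column-major offset of element (row i, column j), leading dimension ld.

    Assumed to mirror the IDX2C macro used by the kernel.
    """
    return j * ld + i


def ceil_div(a, b):
    """Number of size-b blocks needed to cover a elements."""
    return (a + b - 1) // b


def matmul_grid_emulation(A, B, m, k, n, block=(16, 16)):
    """Compute C = A * B by walking the same grid of blocks and threads."""
    C = [0.0] * (m * n)
    grid_x = ceil_div(n, block[0])  # blocks along the column axis
    grid_y = ceil_div(m, block[1])  # blocks along the row axis
    for by in range(grid_y):
        for bx in range(grid_x):
            for ty in range(block[1]):
                for tx in range(block[0]):
                    row = block[1] * by + ty
                    col = block[0] * bx + tx
                    # Same bounds check as the kernel: extra threads are idle.
                    if row < m and col < n:
                        C[idx2c(row, col, m)] = sum(
                            A[idx2c(row, i, m)] * B[idx2c(i, col, k)]
                            for i in range(k)
                        )
    return C
```

Because the grid is rounded up, a 17 x 17 result with 16 x 16 blocks needs a 2 x 2 grid, and most threads in the extra blocks fail the bounds check. This is why the `row < m && col < n` guard in the kernel is required.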
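@@ -0,0 +1,32 @@
# Pod

### Usage

1. `kubectl get` retrieves pod status; a namespace and a name can be specified
2. `kubectl apply -f <file-path>` creates a pod from a configuration file
3. `kubectl delete pod -n <namespace> <name>` deletes the specified pod
4. Containers within a pod can reach each other via localhost
5. A pod can be reached from outside via its pod IP and port

### Implementation

1. Pod creation
   - `apiserver` receives the request, stores it in etcd, and sends a creation message to Kafka
   - `scheduler` listens for pod-creation messages, schedules the pod, and writes the chosen node into the pod configuration
   - `kubelet` listens for pod-creation messages (filtering out pods that have not been scheduled) and creates the pod instance
     - Pulls the required image
     - Creates the pause container; the CNI plugin (flannel) assigns it an IP address
     - Creates the remaining containers and configures env, command, volume, and other parameters
   - `kubelet` reports the pod IP back through the heartbeat mechanism
2. Communication inside a pod
   - flannel assigns an IP address to the pause container (used as the pod IP); the pod IP is valid across the cluster
   - The other containers are configured with the Linux ipc, uts, and network namespaces so that they share the pause container's namespaces (for example, `/proc/<pause_pid>/ns/ipc`)
   - Containers within a pod can reach each other via localhost
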
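The namespace sharing described above comes down to pointing every other container at the pause container's namespace files. A minimal sketch of how those paths could be derived and checked is given below. The function names are hypothetical, and the comparison helper relies on the Linux convention that two processes share a namespace when their `/proc/<pid>/ns/<kind>` links resolve to the same target.

```python
import os


def pause_namespace_paths(pause_pid, kinds=("ipc", "uts", "net")):
    """Return the namespace files of the pause container, keyed by kind."""
    return {kind: f"/proc/{pause_pid}/ns/{kind}" for kind in kinds}


def same_namespace(pid_a, pid_b, kind):
    """Return True if both processes are in the same namespace (Linux only)."""
    link_a = os.readlink(f"/proc/{pid_a}/ns/{kind}")
    link_b = os.readlink(f"/proc/{pid_b}/ns/{kind}")
    return link_a == link_b


if __name__ == "__main__":
    # Show the paths for the current process as a stand-in for a pause PID.
    for kind, path in pause_namespace_paths(os.getpid()).items():
        print(kind, path)
```

Once a container has joined the pause container's network namespace, it sees the same loopback interface and the same pod IP, which is what makes localhost access between containers work.
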
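### CNI plugin

flannel is used. The flannel plugin is set up in the same way as for a Kubernetes cluster, and containers are attached to it by passing `--network flannel` at creation time. Its main role is to assign the pod IP (the pause container's IP).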
@@ -0,0 +1,42 @@
# Service & kubeproxy
### Service definition

A service is created from a YAML file.

It defines the selector rules, the port mappings, the ClusterIP (a user-specified address is reassigned if it is already in use), and the nodePort for external access.

### ClusterIP allocation

IP range: 10.96.0.0/16. ClusterIP usage is stored in etcd (a map ClusterIP --> ServiceName).

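The allocation rule above (honor a requested ClusterIP unless it is taken, otherwise assign another one from the range) can be sketched as follows. This is an illustration under assumptions: the class and method names are hypothetical, a plain dict stands in for the map kept in etcd, and picking the lowest free address is a guess, since the source does not say how the replacement address is chosen.

```python
import ipaddress


class ClusterIPAllocator:
    """Hand out ClusterIPs from a CIDR range, tracking ClusterIP -> ServiceName."""

    def __init__(self, cidr="10.96.0.0/16"):
        self.network = ipaddress.ip_network(cidr)
        self.in_use = {}  # stands in for the map stored in etcd

    def allocate(self, service_name, requested=None):
        """Return the requested IP if it is free and in range, else the lowest free IP."""
        if requested is not None:
            ip = ipaddress.ip_address(requested)
            if ip in self.network and str(ip) not in self.in_use:
                self.in_use[str(ip)] = service_name
                return str(ip)
        # hosts() skips the network and broadcast addresses of the range.
        for ip in self.network.hosts():
            if str(ip) not in self.in_use:
                self.in_use[str(ip)] = service_name
                return str(ip)
        raise RuntimeError("ClusterIP range exhausted")

    def release(self, cluster_ip):
        """Free a ClusterIP when its service is deleted."""
        self.in_use.pop(cluster_ip, None)
```

For example, if two services both request `10.96.0.2`, the first one gets it and the second is moved to the lowest free address in the range.
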
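A dummy network interface is set up in advance, and the service ClusterIP is added to the network.

```shell
# Create the dummy interface.
ip link add minik8s0 type dummy

# Add a service ClusterIP (the device name after "dev" is missing in the source).
ip addr add 10.96.0.2/32 dev

# Enable connection tracking for IPVS.
echo 1 > /proc/sys/net/ipv4/vs/conntrack
```
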
### Traffic control

IPVS is used to control traffic forwarding, with round-robin (`rr`) load balancing.

1. Creating a service: add the service (ClusterIP:port) to IPVS. Before adding, check whether the service already exists in IPVS and skip it if so.
   Equivalent command: `ipvsadm -A -t <ClusterIP>:<Port> -s rr`
2. Creating forwarding rules for the service's endpoints
   `ipvsadm -a -t <ClusterIP>:<Port> -r <PodIP>:<PodPort> -m`
3. Deleting a service
   `ipvsadm -D -t <ClusterIP>:<Port>`
4. Deleting a forwarding rule
   `ipvsadm -d -t <ClusterIP>:<Port> -r <PodIP>:<PodPort>`
5. NodePort configuration:
   1. Traffic arriving on the host port is forwarded to the service ClusterIP:Port
   2. Command: `iptables -t nat -A PREROUTING -p tcp --dport <NodePort> -j DNAT --to-destination <ClusterIP>:<Port>`
6. Related components:

- `endpoint_controller`: keeps the endpoints up to date as they change
- `kubeproxy`: listens for service creation, deletion, and other requests
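
The commands listed above can be assembled from a service description along the following lines. This sketch only builds argument lists and does not run anything. The function names and the shape of the `endpoints` argument are assumptions, while the `ipvsadm` and `iptables` flags are exactly the ones given in the steps above.

```python
def service_create_commands(cluster_ip, port, endpoints, node_port=None):
    """Build the commands that expose a service and its endpoints.

    endpoints is a list of (pod_ip, pod_port) pairs.
    """
    vip = f"{cluster_ip}:{port}"
    commands = [["ipvsadm", "-A", "-t", vip, "-s", "rr"]]
    for pod_ip, pod_port in endpoints:
        commands.append(
            ["ipvsadm", "-a", "-t", vip, "-r", f"{pod_ip}:{pod_port}", "-m"]
        )
    if node_port is not None:
        commands.append([
            "iptables", "-t", "nat", "-A", "PREROUTING", "-p", "tcp",
            "--dport", str(node_port), "-j", "DNAT", "--to-destination", vip,
        ])
    return commands


def service_delete_commands(cluster_ip, port, endpoints):
    """Build the commands that remove the forwarding rules, then the service."""
    vip = f"{cluster_ip}:{port}"
    commands = [
        ["ipvsadm", "-d", "-t", vip, "-r", f"{pod_ip}:{pod_port}"]
        for pod_ip, pod_port in endpoints
    ]
    commands.append(["ipvsadm", "-D", "-t", vip])
    return commands


if __name__ == "__main__":
    # Example addresses are illustrative only.
    for cmd in service_create_commands("10.96.0.2", 80, [("10.244.1.5", 8080)], 30080):
        print(" ".join(cmd))
```

Keeping each command as an argument list rather than a single string means it could be passed to `subprocess.run` without shell quoting issues, should a caller choose to execute it.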