Commit 026b9ef

Liangliang-Ma, tjruwase, and loadams authored and committed
[XPU] Support DeepNVMe new code structure (deepspeedai#6532)
The DeepNVMe GDS update refactored many functions into a more abstract form and added several new files. These changes broke ZeRO-Infinity on XPU. This PR restores the feature by:

1. Updating the aio op builder to include the new files.
2. Adding a custom cpu_op_desc_t for XPU users (XPU does not handle buffer alignment here).

---------

Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
1 parent 71e0720 commit 026b9ef

1 file changed

+51 -0 lines changed

csrc/xpu/aio/deepspeed_cpu_op.cpp

Lines changed: 51 additions & 0 deletions
@@ -0,0 +1,51 @@
+// Copyright (c) Microsoft Corporation.
+// SPDX-License-Identifier: Apache-2.0
+
+// DeepSpeed Team
+
+#include "deepspeed_cpu_op.h"
+
+using namespace std;
+
+cpu_op_desc_t::cpu_op_desc_t(const bool read_op,
+                             const torch::Tensor& buffer,
+                             const int fd,
+                             const char* filename,
+                             const long long int file_num_bytes,
+                             const int num_threads,
+                             const bool validate)
+    : io_op_desc_t(read_op, buffer, fd, filename, file_num_bytes, num_threads, validate),
+      _cpu_buffer(buffer)
+{
+    // XPU don't handle buffer here. See XPU Accelerator pin_memory.
+    _contiguous_buffer = _cpu_buffer.contiguous();
+}
+
+char* cpu_op_desc_t::data_ptr() const { return (char*)_contiguous_buffer.data_ptr(); }
+
+void cpu_op_desc_t::finish()
+{
+    if (_read_op && _buffer.is_xpu()) { _buffer.copy_(_cpu_buffer.to(torch::kXPU)); }
+}
+
+void cpu_op_desc_t::validate()
+{
+    validate_aio_operation(_read_op, _filename.c_str(), data_ptr(), _file_num_bytes);
+}
+
+void cpu_op_desc_t::run(const int tid,
+                        std::unique_ptr<aio_context>& aio_ctxt,
+                        deepspeed_aio_config_t* aio_config)
+{
+    assert(tid < _num_threads);
+    const auto base_offset = _num_bytes_per_thread * tid;
+
+    std::unique_ptr<io_xfer_ctxt> xfer_ctxt(
+        new io_xfer_ctxt(_fd, base_offset, _num_bytes_per_thread, data_ptr()));
+
+    if (aio_config->_overlap_events) {
+        do_aio_operation_overlap(_read_op, aio_ctxt, xfer_ctxt, aio_config, nullptr);
+    } else {
+        do_aio_operation_sequential(_read_op, aio_ctxt, xfer_ctxt, aio_config, nullptr);
+    }
+}
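
The per-thread partitioning in `run()` can be illustrated with a small standalone sketch. It is not DeepSpeed code: `slice_t`, `slice_for_thread`, and `copy_in_slices` are hypothetical names, `memcpy` stands in for the real asynchronous file I/O, and the sketch assumes that `_num_bytes_per_thread` is the total byte count divided evenly by the thread count (the diff does not show where that member is set).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for io_xfer_ctxt: one worker's slice of the transfer.
struct slice_t {
    std::size_t base_offset;  // byte offset where this worker's slice starts
    std::size_t num_bytes;    // number of bytes this worker moves
};

// Mirrors `base_offset = _num_bytes_per_thread * tid` from run().
// Assumption: total_bytes is an exact multiple of num_threads.
inline slice_t slice_for_thread(std::size_t total_bytes, int num_threads, int tid)
{
    assert(tid >= 0 && tid < num_threads);
    const std::size_t bytes_per_thread = total_bytes / static_cast<std::size_t>(num_threads);
    return {bytes_per_thread * static_cast<std::size_t>(tid), bytes_per_thread};
}

// Copies src into dst one slice at a time. Each tid touches only its own
// byte range, so the slices never overlap. The loop here is sequential;
// in the commit, each tid is expected to call run() from its own thread.
inline void copy_in_slices(const std::vector<char>& src, std::vector<char>& dst, int num_threads)
{
    assert(src.size() == dst.size());
    for (int tid = 0; tid < num_threads; ++tid) {
        const slice_t s = slice_for_thread(src.size(), num_threads, tid);
        std::memcpy(dst.data() + s.base_offset, src.data() + s.base_offset, s.num_bytes);
    }
}
```

Because every slice starts at `bytes_per_thread * tid` and spans `bytes_per_thread` bytes, the workers cover the buffer exactly once with no overlap, which is why no locking is needed around the shared destination buffer in this sketch.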
