Support inference value2disk #119

Open · wants to merge 4 commits into base: master
Changes from 2 commits
1 change: 1 addition & 0 deletions include/tpu_mlir/Support/ModuleInterpreter.h
@@ -82,6 +82,7 @@ class ModuleInterpreter {
std::map<std::string, Value> value_map;
std::map<std::string, std::shared_ptr<InferenceParameter>> inference_map;
std::map<std::string, std::shared_ptr<std::vector<float>>> mem_map;
std::vector<size_t> store_disk_shape;
Collaborator:
Do we need this variable?

};

} // namespace tpu_mlir
46 changes: 34 additions & 12 deletions lib/Support/ModuleInterpreter.cpp
@@ -19,6 +19,7 @@
#include <functional>
#include <memory>
#include <numeric>
#include <fstream>

#define DEBUG_TYPE "interpreter"

@@ -240,6 +241,9 @@ void ModuleInterpreter::allocate_all_tensor_in_disk() {
}
});
module::detachWeightFile(); // to free weight memory
  // count the inference ops in this function
  func.walk([&](InferenceInterface infer_op) { num_infer_op++; });
}
}
void ModuleInterpreter::allocate_all_tensor_in_mem() {
@@ -361,6 +365,13 @@ void ModuleInterpreter::invoke(bool express_type) {
case mem_mode_t::PART_TENSOR_IN_MEM:
invoke_part_in_mem(express_type);
break;
case mem_mode_t::ALL_TENSOR_IN_DISK:
Collaborator:
Lines 37-47 never set mem_mode to ALL_TENSOR_IN_DISK, so this branch cannot be executed?

Author:
Yes, there is no corresponding case here. For testing I set mem_mode = mem_mode_t::ALL_TENSOR_IN_DISK manually; perhaps a threshold greater than 16 GB is needed to activate this case. I don't know what the threshold should be, so I hope you can set it. With that change, the test above passed.
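For illustration only, a minimal sketch of what such a threshold-based selection could look like; the helper name, both thresholds, and the ALL_TENSOR_IN_MEM value are assumptions, not code from this PR:

```cpp
#include <cstddef>

// Stand-in for the interpreter's mem_mode_t (ALL_TENSOR_IN_DISK and
// PART_TENSOR_IN_MEM appear in the diff; ALL_TENSOR_IN_MEM is assumed).
enum class mem_mode_t { ALL_TENSOR_IN_MEM, PART_TENSOR_IN_MEM, ALL_TENSOR_IN_DISK };

constexpr std::size_t GiB = 1ull << 30;

// Hypothetical selection logic; the 16 GiB / 4 GiB thresholds are placeholders.
mem_mode_t pick_mem_mode(std::size_t total_tensor_bytes) {
  if (total_tensor_bytes > 16 * GiB)
    return mem_mode_t::ALL_TENSOR_IN_DISK; // spill all tensors to disk
  if (total_tensor_bytes > 4 * GiB)
    return mem_mode_t::PART_TENSOR_IN_MEM; // keep only part in memory
  return mem_mode_t::ALL_TENSOR_IN_MEM;
}
```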

  if (FILE *file = fopen("./value2disk.npz", "rb")) {
    // remove a stale dump from a previous run before appending
    fclose(file);
    std::remove("./value2disk.npz");
  }
  invoke_to_disk("./value2disk.npz", express_type);
Collaborator:
This hard-coded file name "value2disk.npz" could be improved using llvm TempFile or a related strategy.

Author:
In the new commit the llvm TempFile class is used, which turned out to be somewhat more complex: because the file name is randomly generated, an additional pybind interface was added so that the Python side can obtain it.
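A minimal sketch of the llvm::sys::fs::TempFile approach this reply describes; the function name and the %%%%%% model string are assumptions, not the PR's exact code:

```cpp
#include "llvm/Support/Error.h"
#include "llvm/Support/FileSystem.h"
#include <string>

// Create a uniquely named npz dump file and keep it on disk so the Python
// side can read it back; the returned name would be exposed through pybind.
std::string createDumpFile() {
  auto tmp = llvm::sys::fs::TempFile::create("value2disk-%%%%%%.npz");
  if (!tmp) {
    llvm::consumeError(tmp.takeError());
    return {};
  }
  std::string name = tmp->TmpName;
  // keep() disarms auto-deletion; otherwise the file vanishes on destruction.
  if (llvm::Error err = tmp->keep()) {
    llvm::consumeError(std::move(err));
    return {};
  }
  return name;
}
```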

break;
default:
llvm_unreachable("Mem not enough, please use invoke_to_disk");
break;
@@ -450,17 +461,19 @@ void ModuleInterpreter::value_to_disk(const std::string &filename,
const std::string &name,
std::vector<float> &data,
bool express_type) {
// auto value = value_map.at(name);
// if (express_type && module::isState(module::State::TPU_LOWERED)) {
// if (module::isUniformQuantized(value)) {
// auto qtype = module::getUniformQuantizedType(value);
// for (auto &d : data) {
// d = (d - (float)qtype.getZeroPoint()) * (float)qtype.getScale();
// }
// }
// }
// cnpy::npz_save(filename, name, data, "a");
llvm_unreachable("Not Implemented");
auto value = value_map.at(name);
if (express_type && module::isState(module::State::TPU_LOWERED)) {
if (module::isUniformQuantized(value)) {
auto qtype = module::getUniformQuantizedType(value);
for (auto &d : data) {
d = (d - (float)qtype.getZeroPoint()) * (float)qtype.getScale();
}
}
}
  if (!store_disk_shape.empty())
    store_disk_shape.clear();
  store_disk_shape.push_back(data.size());
  cnpy::npz_save(filename, name, &data[0], store_disk_shape, "a");
}
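The express_type branch above applies the usual affine dequantization before appending the tensor to the npz archive (the "a" flag appends to an existing file). As a standalone per-element reference, a sketch rather than PR code:

```cpp
// real_value = (quantized_value - zero_point) * scale
inline float dequantize(float q, long zero_point, double scale) {
  return (q - static_cast<float>(zero_point)) * static_cast<float>(scale);
}
```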

void ModuleInterpreter::invoke_to_disk(const std::string &filename,
@@ -487,7 +500,16 @@
}
auto iter = mem_uses.find(name);
if (iter == mem_uses.end()) {
continue;
      if (auto weight_op = dyn_cast_or_null<top::WeightOp>(in.getDefiningOp())) {
        int num_uses = std::distance(in.user_begin(), in.user_end());
        if (num_uses == 1) { // sole use of this weight: free it right away
          to_free.push_back(name);
        } else { // first use seen: record the remaining use count
          mem_uses[name] = num_uses - 1;
        }
        continue; // iter is end() here, so skip the decrement below
      }
      continue;
}
iter->second--;
if (iter->second == 0) {
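The hunk above extends the reference-count bookkeeping to weights: a weight with a single user is freed immediately, while a first-seen weight with several users is entered into mem_uses with its remaining use count. A simplified, self-contained sketch of that pattern, with assumed helper names rather than the PR's code:

```cpp
#include <map>
#include <string>
#include <vector>

std::map<std::string, int> mem_uses; // remaining uses per tensor name
std::vector<std::string> to_free;    // names whose buffers can be released

// Called once per consumed operand; total_users is the operand's user count.
void note_use(const std::string &name, int total_users) {
  auto it = mem_uses.find(name);
  if (it == mem_uses.end()) {
    if (total_users == 1) {
      to_free.push_back(name); // only use: free immediately
    } else {
      mem_uses[name] = total_users - 1; // first use seen: record the rest
    }
    return;
  }
  if (--it->second == 0) {
    to_free.push_back(name); // last use just happened
    mem_uses.erase(it);
  }
}
```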
5 changes: 5 additions & 0 deletions python/tools/model_runner.py
@@ -232,6 +232,11 @@ def mlir_inference(inputs: dict, mlir_file: str, dump_all: bool = True, debug=No
pre_op = parser.get_pre_op_by_op_name(name)[0]
if pre_op in tensors:
outputs[pre_op] = tensors[pre_op]
else:
    # if the dump file exists, read the tensor back for comparison
    if os.path.isfile("./value2disk.npz"):
        x = np.load("./value2disk.npz")
        outputs[pre_op] = x[pre_op]
return outputs

