UDF based on existing function infra #4804
Co-authored-by: Wey Gu <weyl.gu@gmail.com> Co-authored-by: Cheng Xuntao <7731943+xtcyclist@users.noreply.github.com>
@zhaojunnana I added an example of building the UDF, based on some downstream work from @xtcyclist. Nice job: we tested it and it worked like a charm :) @xtcyclist g++ is used so that Red Hat distributions can build it, too.
Docs: Prepare the build environment as described in the referenced guide.
Enter the code repository and write the UDF:
$ tree udf
udf
|-- Makefile
|-- standard_deviation.cpp
`-- standard_deviation.h
Here we implement a function that computes the standard deviation of a given list:
#ifndef UDF_PROJECT_STANDARD_DEVIATION_H
#define UDF_PROJECT_STANDARD_DEVIATION_H

#include "../src/common/function/GraphFunction.h"

class standard_deviation : public GraphFunction {
 public:
  char *name() override;
  std::vector<std::vector<nebula::Value::Type>> inputType() override;
  nebula::Value::Type returnType() override;
  size_t minArity() override;
  size_t maxArity() override;
  bool isPure() override;
  nebula::Value body(const std::vector<std::reference_wrapper<const nebula::Value>> &args) override;
};

#endif  // UDF_PROJECT_STANDARD_DEVIATION_H
/* Copyright (c) 2020 vesoft inc. All rights reserved.
*
* This source code is licensed under Apache 2.0 License.
*/
#include <cmath>
#include <vector>
#include "standard_deviation.h"
#include "../src/common/datatypes/List.h"
#include "../src/common/datatypes/Value.h"
extern "C" GraphFunction *create() {
  return new standard_deviation;
}

extern "C" void destroy(GraphFunction *function) {
  delete function;
}

char *standard_deviation::name() {
  const char *name = "standard_deviation";
  return const_cast<char *>(name);
}

std::vector<std::vector<nebula::Value::Type>> standard_deviation::inputType() {
  std::vector<nebula::Value::Type> vtp = {nebula::Value::Type::LIST};
  std::vector<std::vector<nebula::Value::Type>> vvtp = {vtp};
  return vvtp;
}

nebula::Value::Type standard_deviation::returnType() {
  return nebula::Value::Type::FLOAT;
}

size_t standard_deviation::minArity() {
  return 1;
}

size_t standard_deviation::maxArity() {
  return 1;
}

bool standard_deviation::isPure() {
  return true;
}

// Population standard deviation: the variance is divided by N, not N - 1.
// Note: an empty input divides by zero and yields NaN.
double standardDeviation(const std::vector<double> &numbers) {
  double sum = 0;
  for (double number : numbers) {
    sum += number;
  }
  double average = sum / numbers.size();
  double variance = 0;
  for (double number : numbers) {
    double difference = number - average;
    variance += difference * difference;
  }
  variance /= numbers.size();
  return std::sqrt(variance);
}

nebula::Value standard_deviation::body(
    const std::vector<std::reference_wrapper<const nebula::Value>> &args) {
  switch (args[0].get().type()) {
    case nebula::Value::Type::NULLVALUE: {
      return nebula::Value::kNullValue;
    }
    case nebula::Value::Type::LIST: {
      std::vector<double> numbers;
      // Bind by const reference to avoid copying the list.
      const auto &list = args[0].get().getList();
      auto size = list.size();
      // Use an unsigned index to match the type of size.
      for (size_t i = 0; i < size; i++) {
        auto &value = list[i];
        if (value.isInt()) {
          numbers.push_back(static_cast<double>(value.getInt()));
        } else if (value.isFloat()) {
          numbers.push_back(value.getFloat());
        } else {
          // Any non-numeric element makes the whole result null.
          return nebula::Value::kNullValue;
        }
      }
      return nebula::Value(standardDeviation(numbers));
    }
    default: {
      return nebula::Value::kNullValue;
    }
  }
}
Then compile:
$ cd udf; make
clang++-10 ./standard_deviation.cpp -c -o standard_deviation.o -I ../src/ -fPIC -I ../build/third-party/install/include/
clang++-10 -shared -o standard_deviation.so standard_deviation.o
The sources are now compiled into binaries:
$ tree .
.
|-- Makefile
|-- standard_deviation.cpp
|-- standard_deviation.h
|-- standard_deviation.o
`-- standard_deviation.so
Load the UDF into GraphD
Assume the code repository is checked out under the directory referenced by --udf_path below:
$ tail /usr/local/nebula/etc/nebula-graphd.conf -n 5
# enable udf, c++ only
--enable_udf=true
# set the directory where the .so of udf are stored
--udf_path=/home/foobar/dev/nebula/udf/
Restart graphd
sudo /usr/local/nebula/scripts/nebula.service restart graphd
Test the UDF
Connect to the restarted GraphD and test:
$ nebula-console-3.0 -addr 127.0.0.1 -port 9669 -user root -p nebula
(root@nebula) [(none)]> yield standard_deviation([1,2,3])
+-----------------------------+
| standard_deviation([1,2,3]) |
+-----------------------------+
| 0.816496580927726 |
+-----------------------------+
Got 1 rows (time spent 9944/18471 us)
(root@nebula) [(none)]> yield standard_deviation([1,1,1])
+-----------------------------+
| standard_deviation([1,1,1]) |
+-----------------------------+
| 0.0 |
+-----------------------------+
Got 1 rows (time spent 4559/12630 us)
Fri, 03 Feb 2023 16:11:24 CST
(root@nebula) [basketballplayer]> GO 1 TO 2 STEPS FROM "player100" OVER follow YIELD properties(edge).degree AS d | yield collect($-.d)
+--------------------------+
| collect($-.d) |
+--------------------------+
| [95, 95, 95, 90, 95, 90] |
+--------------------------+
Got 1 rows (time spent 6424/14165 us)
Fri, 03 Feb 2023 16:12:05 CST
(root@nebula) [basketballplayer]> GO 1 TO 2 STEPS FROM "player100" OVER follow YIELD properties(edge).degree AS d | yield collect($-.d) AS d | yield standard_deviation($-.d)
+--------------------------+
| standard_deviation($-.d) |
+--------------------------+
| 2.357022603955158 |
+--------------------------+
Got 1 rows (time spent 9809/15682 us)
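The console results above can be checked by hand. The following is a minimal standalone sketch, not code from this PR: it recomputes the population standard deviation (divide by N) that the UDF uses and contrasts it with the sample variant (divide by N - 1). The names `sumSquaredDeviations`, `populationStdDev`, and `sampleStdDev` are made up for illustration, and returning NaN for inputs that are too short is an assumption of this sketch.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Sum of squared deviations from the mean; shared by both variants below.
// Callers must pass a non-empty vector.
static double sumSquaredDeviations(const std::vector<double> &xs) {
  double sum = 0.0;
  for (double x : xs) sum += x;
  const double mean = sum / static_cast<double>(xs.size());
  double ss = 0.0;
  for (double x : xs) ss += (x - mean) * (x - mean);
  return ss;
}

// Population standard deviation: divide by N, like the UDF in this thread.
// Assumption of this sketch: an empty input returns NaN.
double populationStdDev(const std::vector<double> &xs) {
  if (xs.empty()) return std::numeric_limits<double>::quiet_NaN();
  return std::sqrt(sumSquaredDeviations(xs) / static_cast<double>(xs.size()));
}

// Sample standard deviation: divide by N - 1 (Bessel's correction).
// Assumption of this sketch: fewer than two values returns NaN.
double sampleStdDev(const std::vector<double> &xs) {
  if (xs.size() < 2) return std::numeric_limits<double>::quiet_NaN();
  return std::sqrt(sumSquaredDeviations(xs) /
                   static_cast<double>(xs.size() - 1));
}
```

For `{1, 2, 3}` the mean is 2 and the squared deviations sum to 2, so the population value is sqrt(2/3), about 0.8165, which matches the console output, while the sample value is sqrt(2/2) = 1. For `{95, 95, 95, 90, 95, 90}` the squared deviations sum to 100/3, giving sqrt(100/18), about 2.357.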
Codecov Report
Base: 77.69% // Head: 78.61% // Project coverage increases; see the diff below.
Additional details and impacted files
@@ Coverage Diff @@
## master #4804 +/- ##
==========================================
+ Coverage 77.69% 78.61% +0.91%
==========================================
Files 1119 1120 +1
Lines 83665 83797 +132
==========================================
+ Hits 65005 65878 +873
+ Misses 18660 17919 -741
This contribution on udf is very much appreciated.
Good job
Thanks a lot, @zhaojunnana!
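@abby-cyber I realized I used the wrong name: the example I gave computes the standard deviation, not the variance. I have edited it. Also, as discussed on the call, for the make part please refer to the latest PR.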
What type of PR is this?
What problem(s) does this PR solve?
Issue(s) number:
#4793
Close #5337
Description:
Users can define functions by implementing GraphFunction
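The UDF example in this thread exports `extern "C"` `create` and `destroy` functions from a shared object, which suggests the usual POSIX `dlopen`/`dlsym` plugin pattern. The following is a minimal sketch of that pattern only, under that assumption; it is not the loader code in this PR. `PluginFunction`, `LoadedPlugin`, `loadPlugin`, and `unloadPlugin` are hypothetical names, and `PluginFunction` is a stand-in for the real `GraphFunction` interface.

```cpp
#include <dlfcn.h>  // dlopen, dlsym, dlclose, dlerror (POSIX)
#include <string>

// Hypothetical stand-in for the plugin base class (not the real GraphFunction).
class PluginFunction {
 public:
  virtual ~PluginFunction() = default;
  virtual const char *name() = 0;
};

using CreateFn = PluginFunction *(*)();
using DestroyFn = void (*)(PluginFunction *);

// Everything needed to use and later release one loaded shared object.
struct LoadedPlugin {
  void *handle = nullptr;
  PluginFunction *function = nullptr;
  DestroyFn destroy = nullptr;
  std::string error;
  bool ok() const { return function != nullptr; }
};

// Open the shared object, resolve the two C entry points, and create the
// function object. On failure, `error` describes what went wrong.
LoadedPlugin loadPlugin(const std::string &path) {
  LoadedPlugin result;
  result.handle = dlopen(path.c_str(), RTLD_NOW);
  if (result.handle == nullptr) {
    const char *msg = dlerror();
    result.error = (msg != nullptr) ? msg : "dlopen failed";
    return result;
  }
  auto create = reinterpret_cast<CreateFn>(dlsym(result.handle, "create"));
  result.destroy = reinterpret_cast<DestroyFn>(dlsym(result.handle, "destroy"));
  if (create == nullptr || result.destroy == nullptr) {
    result.error = "missing create/destroy symbol";
    dlclose(result.handle);
    result.handle = nullptr;
    result.destroy = nullptr;
    return result;
  }
  result.function = create();
  return result;
}

// Destroy the object with the plugin's own destroy(), then close the library.
void unloadPlugin(LoadedPlugin &plugin) {
  if (plugin.function != nullptr && plugin.destroy != nullptr) {
    plugin.destroy(plugin.function);
  }
  if (plugin.handle != nullptr) {
    dlclose(plugin.handle);
  }
  plugin = LoadedPlugin{};
}
```

Pairing `create` with a `destroy` exported by the same library keeps allocation and deallocation inside the plugin, which avoids mismatched allocators between the host and the shared object. On older glibc versions this sketch may need to be linked with `-ldl`.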
How do you solve it?
Special notes for your reviewer, ex. impact of this fix, design document, etc:
Checklist:
Tests:
Affects:
see UDF based on existing function infra #4804
Release notes:
Please confirm whether to be reflected in release notes and how to describe: