Flash-481 Arrow encode #279

Merged
merged 150 commits into from
Oct 25, 2019
Changes from 146 commits
Commits
aa92f4e
basic framework for coprocessor support in tiflash
windtalker Jul 30, 2019
4f37218
basic support for InterpreterDagRequestV2
windtalker Jul 30, 2019
85bfd5c
code refine
windtalker Jul 30, 2019
e1700c3
tipb submodule use tipb master branch
windtalker Jul 31, 2019
0f82665
rewrite build flow in InterpreterDagRequest
windtalker Jul 31, 2019
a7655bc
rename Dag to DAG
windtalker Jul 31, 2019
f516f00
Update tipb submodule
zanmato1984 Aug 1, 2019
3b520c9
basic support for selection/limit/topn executor in InterpreterDAGRequest
windtalker Aug 2, 2019
9591d26
Merge branch 'cop' of https://github.com/pingcap/tics into cop
windtalker Aug 2, 2019
ead9609
basic support for selection/limit/topn executor in InterpreterDAGRequ…
windtalker Aug 2, 2019
bed0bd4
merge pingcap/cop branch
windtalker Aug 2, 2019
526cad9
Code reorg
zanmato1984 Aug 4, 2019
be4d80c
Format
zanmato1984 Aug 4, 2019
64a45a9
merge pingcap/cop
windtalker Aug 5, 2019
a76fdb3
merge pingcap/cop
windtalker Aug 5, 2019
0cfe045
Refine code
zanmato1984 Aug 5, 2019
e9b216c
Merge branch 'cop' of https://github.com/pingcap/tics into cop
windtalker Aug 5, 2019
3617a87
basic support for dag agg executor
windtalker Aug 5, 2019
cb55df4
Code refine
zanmato1984 Aug 5, 2019
ed41c93
Merge master into cop
zanmato1984 Aug 5, 2019
08b7142
Refine code
zanmato1984 Aug 5, 2019
bc25942
Another way of getting codec flag
zanmato1984 Aug 5, 2019
059f267
fix cop test regression (#157)
windtalker Aug 6, 2019
e59e8f3
fix npe during dag execute (#160)
windtalker Aug 6, 2019
a618cb5
Add tipb cpp gen in build script
zanmato1984 Aug 6, 2019
4f797fe
Merge branch 'master' into cop
zanmato1984 Aug 6, 2019
bb51749
Fix build error and adjust some formats
zanmato1984 Aug 6, 2019
da1cb0e
Fix build error
zanmato1984 Aug 6, 2019
816ef4b
Fix build error
zanmato1984 Aug 6, 2019
f18fcdd
Update flash configs
zanmato1984 Aug 6, 2019
2ade1cb
Format
zanmato1984 Aug 6, 2019
3870d93
Merge branch 'master' into cop
zanmato1984 Aug 7, 2019
7cb9e71
throw exception when meet error during cop request handling (#162)
windtalker Aug 7, 2019
5fe66ee
Merge branch 'master' into cop
zanmato1984 Aug 8, 2019
0174b7e
add DAGContext so InterpreterDAG can exchange information with DAGDri…
windtalker Aug 8, 2019
9a1dd23
columnref index is based on executor output schema (#167)
windtalker Aug 8, 2019
26e20d5
Move flash/cop/dag to individual library
zanmato1984 Aug 8, 2019
bf67d9d
Merge cop lib
zanmato1984 Aug 8, 2019
62ced38
DAG planner fix and mock dag request (#169)
zanmato1984 Aug 9, 2019
b346a24
Merge branch 'master' into cop
zanmato1984 Aug 9, 2019
57cd382
Fix DAG get and lock storage
zanmato1984 Aug 9, 2019
4a76e91
handle error in cop request (#171)
windtalker Aug 12, 2019
2d093a8
code refine && several minor bug fix (#174)
windtalker Aug 12, 2019
c8cd3d7
Fix region id in mock dag
zanmato1984 Aug 12, 2019
0492af6
support udf in (#175)
windtalker Aug 14, 2019
4a6bad8
Merge branch 'master' into cop
zanmato1984 Aug 14, 2019
8713ff2
1. fix decode literal expr error, 2. add all scalar function sig in s…
windtalker Aug 14, 2019
7759af1
Merge branch 'master' into cop
zanmato1984 Aug 15, 2019
b25d1cc
some bug fix (#179)
windtalker Aug 15, 2019
3d38b7b
Support all DAG operator types in mock SQL -> DAG parser (#176)
zanmato1984 Aug 15, 2019
cbcfdb0
filter column must be uint8 in tiflash (#180)
windtalker Aug 16, 2019
d87e2d5
1. fix encode null error, 2. fix empty field type generated by TiFlas…
windtalker Aug 16, 2019
17f7fcb
Merge branch 'master' into cop
zanmato1984 Aug 16, 2019
5853b91
check validation of dag exprs field type (#183)
windtalker Aug 19, 2019
0a6767a
Merge branch 'master' into cop
zanmato1984 Aug 19, 2019
d53ca34
Merge branch 'master' into cop
zanmato1984 Aug 20, 2019
5de0ec6
add more coprocessor mock tests (#185)
windtalker Aug 20, 2019
6196171
add some log about implicit cast (#188)
windtalker Aug 21, 2019
960cc56
Merge branch 'master' into cop
zanmato1984 Aug 24, 2019
08bacd7
Pass DAG tests after merging master (#199)
zanmato1984 Aug 24, 2019
e8b4198
Fix date/datetime/bit encode error (#200)
zanmato1984 Aug 26, 2019
61cdc8f
improve dag execution time collection (#202)
windtalker Aug 26, 2019
53dcd1f
Merge branch 'master' into cop
zanmato1984 Aug 27, 2019
10e3883
column id in table scan operator may be -1 (#205)
windtalker Aug 27, 2019
39d1994
quick fix for decimal encode (#210)
windtalker Aug 30, 2019
8a0fb66
support udf like with 3 arguments (#212)
windtalker Sep 2, 2019
ff9a1de
Flash-473 optimize date and datetime comparison (#221)
windtalker Sep 5, 2019
17aacde
Merge master
zanmato1984 Sep 5, 2019
6b14b38
FLASH-479 select from empty table throw error in tiflash (#223)
windtalker Sep 6, 2019
548e519
Update flash service port
zanmato1984 Sep 6, 2019
a1b8444
fix bug in DAGBlockOutputStream
windtalker Sep 10, 2019
fce3676
fix bug in DAGBlockOutputStream (#230)
windtalker Sep 10, 2019
a9f9b48
FLASH-475: Support BATCH COMMANDS in flash service (#232)
zanmato1984 Sep 12, 2019
bdc7d57
init change for array encode
windtalker Sep 12, 2019
516d340
merge pingcap/tics/cop
windtalker Sep 12, 2019
1ccfbd4
Merge branch 'master' into cop
zhexuany Sep 12, 2019
df07939
FLASH-483: Combine raft service and flash service (#235)
zanmato1984 Sep 16, 2019
99f26c0
Merge master
zanmato1984 Sep 16, 2019
0bb7991
Fix build error
zanmato1984 Sep 16, 2019
f41f853
Fix test regression
zanmato1984 Sep 16, 2019
259ec77
Fix null value bug in datum
zanmato1984 Sep 17, 2019
ef65514
Merge branch 'master' into cop
zanmato1984 Sep 17, 2019
708d52f
FLASH-490: Fix table scan with -1 column ID and no agg (#240)
zanmato1984 Sep 23, 2019
3656a95
Merge branch 'master' into cop
zanmato1984 Sep 23, 2019
a4c1074
throw error if the cop request is not based on full region scan (#247)
windtalker Sep 24, 2019
b57656c
Merge branch 'master' into cop
zanmato1984 Sep 25, 2019
3a43942
FLASH-437 Support time zone in coprocessor (#259)
windtalker Sep 27, 2019
01caa55
Merge branch 'master' into cop
zanmato1984 Sep 27, 2019
8d2576e
Address comment
zanmato1984 Sep 29, 2019
8ec5380
Merge branch 'cop' of https://github.com/pingcap/tics into array_encode
windtalker Sep 29, 2019
2e3b1c1
use the new date implementation
windtalker Sep 29, 2019
d33a278
FLASH-489 support key condition for coprocessor query (#261)
windtalker Sep 30, 2019
087faee
Merge branch 'master' into cop
zanmato1984 Sep 30, 2019
4aa2b58
only return execute summaries if requested (#264)
windtalker Sep 30, 2019
aed5e84
Merge branch 'cop' of https://github.com/pingcap/tics into array_encode
windtalker Oct 8, 2019
8663811
refine code
windtalker Oct 8, 2019
80f6f35
Refine service init (#265)
zanmato1984 Oct 8, 2019
0b737dc
fix bug
windtalker Oct 9, 2019
d3af009
fix bug
windtalker Oct 9, 2019
004f7c5
Merge branch 'cop' of https://github.com/pingcap/tics into arrow_encode
windtalker Oct 9, 2019
f255362
FLASH-554 cop check range should be based on region range (#270)
windtalker Oct 10, 2019
170f652
add ut for arrow encode
windtalker Oct 11, 2019
c53e456
Merge branch 'cop' of https://github.com/pingcap/tics into arrow_encode
windtalker Oct 11, 2019
7fc53ad
minor improve (#273)
windtalker Oct 11, 2019
22ad2d3
Merge branch 'master' into cop
zanmato1984 Oct 11, 2019
b01ccb3
update tipb
windtalker Oct 11, 2019
a1304ae
Fix mutex on timezone retrieval (#276)
ilovesoup2000 Oct 11, 2019
687dcbe
Fix race condition of batch command handling (#277)
zanmato1984 Oct 12, 2019
4dd5e1e
Merge branch 'cop' of https://github.com/pingcap/tics into arrow_encode
windtalker Oct 12, 2019
80c20b2
update tipb version
windtalker Oct 12, 2019
7c5bea6
set default record_per_chunk to 1024
windtalker Oct 13, 2019
939b8cf
address comment
windtalker Oct 14, 2019
d25dadc
address comments
windtalker Oct 14, 2019
512fa8e
refine code
windtalker Oct 14, 2019
ff9bf8f
Merge branch 'cop' of https://github.com/pingcap/tics into arrow_encode
windtalker Oct 14, 2019
a6f6dda
refine code
windtalker Oct 14, 2019
a943e8d
add mock_dag test
windtalker Oct 14, 2019
41272da
code refine
windtalker Oct 14, 2019
00dac75
code refine
windtalker Oct 14, 2019
4080fba
address comments
windtalker Oct 14, 2019
1188e69
Merge branch 'cop' of https://github.com/pingcap/tics into arrow_encode
windtalker Oct 14, 2019
d2890e3
Fix NULL order for dag (#281)
zanmato1984 Oct 14, 2019
bc075c5
refine get actions in DAGExpressionAnalyzer, fix bug in dbgFuncCoproc…
windtalker Oct 15, 2019
4dbff78
Merge branch 'cop' of https://github.com/pingcap/tics into arrow_encode
windtalker Oct 15, 2019
fbcbdc0
remove duplicate agg funcs (#283)
windtalker Oct 15, 2019
8f2bfaf
Merge branch 'cop' of https://github.com/pingcap/tics into arrow_encode
windtalker Oct 16, 2019
3716b98
refine code
windtalker Oct 16, 2019
fa42c69
remove useless code
windtalker Oct 16, 2019
7bbe8c0
address comments
windtalker Oct 16, 2019
31973bf
remove useless include
windtalker Oct 16, 2019
d968c09
address comments
windtalker Oct 16, 2019
edf32d4
Merge branch 'cop' of https://github.com/pingcap/tics into arrow_encode
windtalker Oct 16, 2019
f1256bd
refine code
windtalker Oct 17, 2019
73befbd
address comments
windtalker Oct 17, 2019
3188c07
format code
windtalker Oct 17, 2019
87955d1
fix typo
windtalker Oct 17, 2019
4f58878
Update dbms/src/Flash/BatchCommandsHandler.cpp
zanmato1984 Oct 17, 2019
92c16c2
revert unnecessary changes
windtalker Oct 17, 2019
0f6f0a6
Merge branch 'cop' of https://github.com/pingcap/tics into arrow_encode
windtalker Oct 17, 2019
d550644
refine code
windtalker Oct 17, 2019
bac7951
fix build error
windtalker Oct 17, 2019
4a251b0
refine code
windtalker Oct 17, 2019
e8b92b4
Merge branch 'master' into cop
zanmato1984 Oct 17, 2019
48dd7bd
Merge master
zanmato1984 Oct 18, 2019
a8cba5f
Merge remote-tracking branch 'origin/cop' into arrow_encode_2
windtalker Oct 18, 2019
e3232af
Merge branch 'master' of https://github.com/pingcap/tics into arrow_e…
windtalker Oct 21, 2019
4d5e5d4
address comments
windtalker Oct 21, 2019
c7d8d4e
refine code
windtalker Oct 22, 2019
0b1ed77
address comments
windtalker Oct 25, 2019
683e7e0
Merge branch 'master' into arrow_encode
zanmato1984 Oct 25, 2019
2 changes: 1 addition & 1 deletion contrib/tipb
2 changes: 1 addition & 1 deletion dbms/src/Core/Defines.h
@@ -29,7 +29,7 @@
#define DEFAULT_MAX_READ_TSO 0xFFFFFFFFFFFFFFFF
#define DEFAULT_UNSPECIFIED_SCHEMA_VERSION -1

-#define DEFAULT_DAG_RECORDS_PER_CHUNK 64L
+#define DEFAULT_DAG_RECORDS_PER_CHUNK 1024L

/** Which blocks by default read the data (by number of rows).
* Smaller values give better cache locality, less consumption of RAM, but more overhead to process the query.
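
Judging by its name and by the commit "set default record_per_chunk to 1024", this constant presumably caps how many rows go into one encoded response chunk. The sketch below only illustrates the arithmetic of such a cap. `splitIntoChunks` is a hypothetical helper, not a function from this PR, and the half-open `[start, end)` ranges are an assumption modelled on the `encode(block, start, end)` signature that appears later in the diff.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not part of the PR): split the row range [0, total_rows)
// into half-open [start, end) ranges of at most records_per_chunk rows each.
std::vector<std::pair<std::size_t, std::size_t>> splitIntoChunks(std::size_t total_rows, std::size_t records_per_chunk)
{
    assert(records_per_chunk > 0); // a zero chunk size would never make progress
    std::vector<std::pair<std::size_t, std::size_t>> ranges;
    for (std::size_t start = 0; start < total_rows; start += records_per_chunk)
        ranges.emplace_back(start, std::min(start + records_per_chunk, total_rows));
    return ranges;
}
```

With a cap of 1024, a 2500-row result would be split into three ranges, where the old cap of 64 would have produced forty.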
131 changes: 65 additions & 66 deletions dbms/src/Debug/dbgFuncCoprocessor.cpp
@@ -3,9 +3,11 @@
#include <DataStreams/BlocksListBlockInputStream.h>
#include <Debug/MockTiDB.h>
#include <Debug/dbgFuncCoprocessor.h>
+#include <Flash/Coprocessor/ArrowChunkCodec.h>
#include <Flash/Coprocessor/DAGCodec.h>
#include <Flash/Coprocessor/DAGDriver.h>
#include <Flash/Coprocessor/DAGUtils.h>
+#include <Flash/Coprocessor/DefaultChunkCodec.h>
#include <Parsers/ASTAsterisk.h>
#include <Parsers/ASTFunction.h>
#include <Parsers/ASTIdentifier.h>
@@ -37,29 +39,30 @@ using TiDB::TableInfo;
using DAGColumnInfo = std::pair<String, ColumnInfo>;
using DAGSchema = std::vector<DAGColumnInfo>;
using SchemaFetcher = std::function<TableInfo(const String &, const String &)>;
-std::tuple<TableID, DAGSchema, tipb::DAGRequest> compileQuery(
-Context & context, const String & query, SchemaFetcher schema_fetcher, Timestamp start_ts,
-Int64 tz_offset, const String & tz_name);
-tipb::SelectResponse executeDAGRequest(
-Context & context, const tipb::DAGRequest & dag_request, RegionID region_id, UInt64 region_version,
+std::tuple<TableID, DAGSchema, tipb::DAGRequest> compileQuery(Context & context, const String & query, SchemaFetcher schema_fetcher,
+Timestamp start_ts, Int64 tz_offset, const String & tz_name, const String & encode_type);
+tipb::SelectResponse executeDAGRequest(Context & context, const tipb::DAGRequest & dag_request, RegionID region_id, UInt64 region_version,
UInt64 region_conf_version, std::vector<std::pair<DecodedTiKVKey, DecodedTiKVKey>> & key_ranges);
BlockInputStreamPtr outputDAGResponse(Context & context, const DAGSchema & schema, const tipb::SelectResponse & dag_response);

BlockInputStreamPtr dbgFuncDAG(Context & context, const ASTs & args)
{
-if (args.size() < 1 || args.size() > 4)
-throw Exception("Args not matched, should be: query[, region-id, tz_offset, tz_name]", ErrorCodes::BAD_ARGUMENTS);
+if (args.size() < 1 || args.size() > 5)
+throw Exception("Args not matched, should be: query[, region-id, encode_type, tz_offset, tz_name]", ErrorCodes::BAD_ARGUMENTS);

String query = safeGet<String>(typeid_cast<const ASTLiteral &>(*args[0]).value);
RegionID region_id = InvalidRegionID;
if (args.size() >= 2)
region_id = safeGet<RegionID>(typeid_cast<const ASTLiteral &>(*args[1]).value);
+String encode_type = "";
+if (args.size() >= 3)
+encode_type = safeGet<String>(typeid_cast<const ASTLiteral &>(*args[2]).value);
Int64 tz_offset = 0;
String tz_name = "";
-if (args.size() >= 3)
-tz_offset = get<Int64>(typeid_cast<const ASTLiteral &>(*args[2]).value);
if (args.size() >= 4)
-tz_name = safeGet<String>(typeid_cast<const ASTLiteral &>(*args[3]).value);
+tz_offset = get<Int64>(typeid_cast<const ASTLiteral &>(*args[3]).value);
+if (args.size() >= 5)
+tz_name = safeGet<String>(typeid_cast<const ASTLiteral &>(*args[4]).value);
Timestamp start_ts = context.getTMTContext().getPDClient()->getTS();

auto [table_id, schema, dag_request] = compileQuery(
@@ -71,7 +74,7 @@ BlockInputStreamPtr dbgFuncDAG(Context & context, const ASTs & args)
throw Exception("Not TMT", ErrorCodes::BAD_ARGUMENTS);
return mmt->getTableInfo();
},
-start_ts, tz_offset, tz_name);
+start_ts, tz_offset, tz_name, encode_type);

RegionPtr region;
if (region_id == InvalidRegionID)
@@ -93,16 +96,17 @@ BlockInputStreamPtr dbgFuncDAG(Context & context, const ASTs & args)
DecodedTiKVKey start_key = RecordKVFormat::genRawKey(table_id, handle_range.first.handle_id);
DecodedTiKVKey end_key = RecordKVFormat::genRawKey(table_id, handle_range.second.handle_id);
key_ranges.emplace_back(std::make_pair(std::move(start_key), std::move(end_key)));
-tipb::SelectResponse dag_response = executeDAGRequest(context, dag_request, region->id(), region->version(),
-region->confVer(), key_ranges);
+tipb::SelectResponse dag_response
+= executeDAGRequest(context, dag_request, region->id(), region->version(), region->confVer(), key_ranges);

return outputDAGResponse(context, schema, dag_response);
}

BlockInputStreamPtr dbgFuncMockDAG(Context & context, const ASTs & args)
{
-if (args.size() < 2 || args.size() > 5)
-throw Exception("Args not matched, should be: query, region-id[, start-ts, tz_offset, tz_name]", ErrorCodes::BAD_ARGUMENTS);
+if (args.size() < 2 || args.size() > 6)
+throw Exception(
+"Args not matched, should be: query, region-id[, start-ts, encode_type, tz_offset, tz_name]", ErrorCodes::BAD_ARGUMENTS);

String query = safeGet<String>(typeid_cast<const ASTLiteral &>(*args[0]).value);
RegionID region_id = safeGet<RegionID>(typeid_cast<const ASTLiteral &>(*args[1]).value);
@@ -111,19 +115,22 @@ BlockInputStreamPtr dbgFuncMockDAG(Context & context, const ASTs & args)
start_ts = safeGet<Timestamp>(typeid_cast<const ASTLiteral &>(*args[2]).value);
if (start_ts == 0)
start_ts = context.getTMTContext().getPDClient()->getTS();
+String encode_type = "";
+if (args.size() >= 4)
+encode_type = safeGet<String>(typeid_cast<const ASTLiteral &>(*args[3]).value);
Int64 tz_offset = 0;
String tz_name = "";
-if (args.size() >= 3)
-tz_offset = safeGet<Int64>(typeid_cast<const ASTLiteral &>(*args[2]).value);
-if (args.size() >= 4)
-tz_name = safeGet<String>(typeid_cast<const ASTLiteral &>(*args[3]).value);
+if (args.size() >= 5)
+tz_offset = safeGet<Int64>(typeid_cast<const ASTLiteral &>(*args[4]).value);
+if (args.size() >= 6)
+tz_name = safeGet<String>(typeid_cast<const ASTLiteral &>(*args[5]).value);

auto [table_id, schema, dag_request] = compileQuery(
context, query,
[&](const String & database_name, const String & table_name) {
return MockTiDB::instance().getTableByName(database_name, table_name)->table_info;
},
-start_ts, tz_offset, tz_name);
+start_ts, tz_offset, tz_name, encode_type);
std::ignore = table_id;

RegionPtr region = context.getTMTContext().getKVStore()->getRegion(region_id);
@@ -132,8 +139,8 @@ BlockInputStreamPtr dbgFuncMockDAG(Context & context, const ASTs & args)
DecodedTiKVKey start_key = RecordKVFormat::genRawKey(table_id, handle_range.first.handle_id);
DecodedTiKVKey end_key = RecordKVFormat::genRawKey(table_id, handle_range.second.handle_id);
key_ranges.emplace_back(std::make_pair(std::move(start_key), std::move(end_key)));
-tipb::SelectResponse dag_response = executeDAGRequest(context, dag_request, region_id, region->version(),
-region->confVer(), key_ranges);
+tipb::SelectResponse dag_response
+= executeDAGRequest(context, dag_request, region_id, region->version(), region->confVer(), key_ranges);

return outputDAGResponse(context, schema, dag_response);
}
@@ -206,7 +213,7 @@ void compileExpr(const DAGSchema & input, ASTPtr ast, tipb::Expr * expr, std::un
else if (func_name_lowercase == "greaterorequals")
{
expr->set_sig(tipb::ScalarFuncSig::GEInt);
-auto *ft = expr->mutable_field_type();
+auto * ft = expr->mutable_field_type();
ft->set_tp(TiDB::TypeLongLong);
ft->set_flag(TiDB::ColumnFlagUnsigned);
}
@@ -292,16 +299,19 @@ void compileFilter(const DAGSchema & input, ASTPtr ast, tipb::Selection * filter
compileExpr(input, ast, cond, referred_columns, col_ref_map);
}

-std::tuple<TableID, DAGSchema, tipb::DAGRequest> compileQuery(
-Context & context, const String & query, SchemaFetcher schema_fetcher,
-Timestamp start_ts, Int64 tz_offset, const String & tz_name)
+std::tuple<TableID, DAGSchema, tipb::DAGRequest> compileQuery(Context & context, const String & query, SchemaFetcher schema_fetcher,
+Timestamp start_ts, Int64 tz_offset, const String & tz_name, const String & encode_type)
{
DAGSchema schema;
tipb::DAGRequest dag_request;
dag_request.set_time_zone_name(tz_name);
dag_request.set_time_zone_offset(tz_offset);

dag_request.set_start_ts(start_ts);
+if (encode_type == "arrow")
+dag_request.set_encode_type(tipb::EncodeType::TypeArrow);
+else
+dag_request.set_encode_type(tipb::EncodeType::TypeDefault);

ParserSelectQuery parser;
ASTPtr ast = parseQuery(parser, query.data(), query.data() + query.size(), "from DAG compiler", 0);
@@ -355,7 +365,8 @@ std::tuple<TableID, DAGSchema, tipb::DAGRequest> compileQuery(
ci.tp = TiDB::TypeTimestamp;
ts_output.emplace_back(std::make_pair(column_info.name, std::move(ci)));
}
-executor_ctx_map.emplace(ts_exec, ExecutorCtx{nullptr, std::move(ts_output), std::unordered_map<String, std::vector<tipb::Expr *>>{}});
+executor_ctx_map.emplace(
+ts_exec, ExecutorCtx{nullptr, std::move(ts_output), std::unordered_map<String, std::vector<tipb::Expr *>>{}});
last_executor = ts_exec;
}

@@ -400,8 +411,8 @@ std::tuple<TableID, DAGSchema, tipb::DAGRequest> compileQuery(
tipb::Limit * limit = limit_exec->mutable_limit();
auto limit_length = safeGet<UInt64>(typeid_cast<ASTLiteral &>(*ast_query.limit_length).value);
limit->set_limit(limit_length);
-executor_ctx_map.emplace(
-limit_exec, ExecutorCtx{last_executor, executor_ctx_map[last_executor].output, std::unordered_map<String, std::vector<tipb::Expr *>>{}});
+executor_ctx_map.emplace(limit_exec,
+ExecutorCtx{last_executor, executor_ctx_map[last_executor].output, std::unordered_map<String, std::vector<tipb::Expr *>>{}});
last_executor = limit_exec;
}

@@ -593,8 +604,7 @@ std::tuple<TableID, DAGSchema, tipb::DAGRequest> compileQuery(
return std::make_tuple(table_info.id, std::move(schema), std::move(dag_request));
}

-tipb::SelectResponse executeDAGRequest(
-Context & context, const tipb::DAGRequest & dag_request, RegionID region_id, UInt64 region_version,
+tipb::SelectResponse executeDAGRequest(Context & context, const tipb::DAGRequest & dag_request, RegionID region_id, UInt64 region_version,
UInt64 region_conf_version, std::vector<std::pair<DecodedTiKVKey, DecodedTiKVKey>> & key_ranges)
{
static Logger * log = &Logger::get("MockDAG");
@@ -607,49 +617,38 @@ tipb::SelectResponse executeDAGRequest(
return dag_response;
}

+void arrowChunkToBlocks(const DAGSchema & schema, const tipb::SelectResponse & dag_response, BlocksList & blocks)
+{
+ArrowChunkCodec codec;
+for (const auto & chunk : dag_response.chunks())
+{
+blocks.emplace_back(codec.decode(chunk, schema));
+}
+}
+
+void defaultChunkToBlocks(const DAGSchema & schema, const tipb::SelectResponse & dag_response, BlocksList & blocks)
+{
+DefaultChunkCodec codec;
+for (const auto & chunk : dag_response.chunks())
+{
+blocks.emplace_back(codec.decode(chunk, schema));
+}
+}
+
BlockInputStreamPtr outputDAGResponse(Context &, const DAGSchema & schema, const tipb::SelectResponse & dag_response)
{
if (dag_response.has_error())
throw Exception(dag_response.error().msg(), dag_response.error().code());

BlocksList blocks;
-for (const auto & chunk : dag_response.chunks())
+if (dag_response.encode_type() == tipb::EncodeType::TypeArrow)
{
-std::vector<std::vector<Field>> rows;
-std::vector<Field> curr_row;
-const std::string & data = chunk.rows_data();
-size_t cursor = 0;
-while (cursor < data.size())
-{
-curr_row.push_back(DecodeDatum(cursor, data));
-if (curr_row.size() == schema.size())
-{
-rows.emplace_back(std::move(curr_row));
-curr_row.clear();
-}
-}
-
-ColumnsWithTypeAndName columns;
-for (auto & field : schema)
-{
-const auto & name = field.first;
-auto data_type = getDataTypeByColumnInfo(field.second);
-ColumnWithTypeAndName col(data_type, name);
-col.column->assumeMutable()->reserve(rows.size());
-columns.emplace_back(std::move(col));
-}
-for (const auto & row : rows)
-{
-for (size_t i = 0; i < row.size(); i++)
-{
-const Field & field = row[i];
-columns[i].column->assumeMutable()->insert(DatumFlat(field, schema[i].second.tp).field());
-}
-}
-
-blocks.emplace_back(Block(columns));
+arrowChunkToBlocks(schema, dag_response, blocks);
}
+else
+{
+defaultChunkToBlocks(schema, dag_response, blocks);
+}

return std::make_shared<BlocksListBlockInputStream>(std::move(blocks));
}
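
The debug functions in this file take the encoding as a plain string and `outputDAGResponse` later picks a decoder from the response's encode type. A minimal, self-contained sketch of that two-step selection follows. The `EncodeType` enum is a stand-in for `tipb::EncodeType`, and `parseEncodeType` and `decoderFor` are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <string>

// Stand-in for tipb::EncodeType, limited to the two values used in this diff.
enum class EncodeType { TypeDefault, TypeArrow };

// Mirrors the branch added to compileQuery: only the exact string "arrow"
// selects the Arrow encoding; anything else falls back to the default.
EncodeType parseEncodeType(const std::string & encode_type)
{
    return encode_type == "arrow" ? EncodeType::TypeArrow : EncodeType::TypeDefault;
}

// Mirrors the dispatch in outputDAGResponse. It returns the codec's class name
// instead of decoding anything, to keep the sketch free of TiFlash types.
const char * decoderFor(EncodeType type)
{
    return type == EncodeType::TypeArrow ? "ArrowChunkCodec" : "DefaultChunkCodec";
}
```

One consequence of the string comparison is that the match is case-sensitive: an argument such as `"Arrow"` silently selects the default encoding rather than raising an error.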

84 changes: 84 additions & 0 deletions dbms/src/Flash/Coprocessor/ArrowChunkCodec.cpp
@@ -0,0 +1,84 @@
#include <Flash/Coprocessor/ArrowChunkCodec.h>

#include <Flash/Coprocessor/ArrowColCodec.h>
#include <IO/Endian.h>

namespace DB
{

class ArrowChunkCodecStream : public ChunkCodecStream
{
public:
explicit ArrowChunkCodecStream(const std::vector<tipb::FieldType> & field_types) : ChunkCodecStream(field_types)
{
ti_chunk = std::make_unique<TiDBChunk>(field_types);
}

String getString() override
{
std::stringstream ss;
ti_chunk->encodeChunk(ss);
return ss.str();
}
void clear() override { ti_chunk->clear(); }
void encode(const Block & block, size_t start, size_t end) override;
std::unique_ptr<TiDBChunk> ti_chunk;
};

void ArrowChunkCodecStream::encode(const Block & block, size_t start, size_t end)
{
// Encode data in chunk by arrow encode
ti_chunk->buildDAGChunkFromBlock(block, field_types, start, end);
}

Block ArrowChunkCodec::decode(const tipb::Chunk & chunk, const DAGSchema & schema)
{
const String & row_data = chunk.rows_data();
const char * start = row_data.c_str();
const char * pos = start;
int column_index = 0;
ColumnsWithTypeAndName columns;
while (pos < start + row_data.size())
{
UInt32 length = toLittleEndian(*(reinterpret_cast<const UInt32 *>(pos)));
pos += 4;
UInt32 null_count = toLittleEndian(*(reinterpret_cast<const UInt32 *>(pos)));
pos += 4;
std::vector<UInt8> null_bitmap;
const auto & field = schema[column_index];
const auto & name = field.first;
auto data_type = getDataTypeByColumnInfo(field.second);
if (null_count > 0)
{
auto bit_map_length = (length + 7) / 8;
for (UInt32 i = 0; i < bit_map_length; i++)
{
null_bitmap.push_back(*pos);
pos++;
}
}
Int8 field_length = getFieldLength(field.second.tp);
std::vector<UInt64> offsets;
if (field_length == VAR_SIZE)
{
for (UInt32 i = 0; i <= length; i++)
{
offsets.push_back(toLittleEndian(*(reinterpret_cast<const UInt64 *>(pos))));
pos += 8;
}
}
ColumnWithTypeAndName col(data_type, name);
col.column->assumeMutable()->reserve(length);
pos = arrowColToFlashCol(pos, field_length, null_count, null_bitmap, offsets, col, field.second, length);
columns.emplace_back(std::move(col));
column_index++;
}
return Block(columns);
}

std::unique_ptr<ChunkCodecStream> ArrowChunkCodec::newCodecStream(const std::vector<tipb::FieldType> & field_types)
{
return std::make_unique<ArrowChunkCodecStream>(field_types);
}

} // namespace DB
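
Reading `ArrowChunkCodec::decode` above, each column appears to start with a 4-byte little-endian row count, then a 4-byte little-endian null count, then a null bitmap of `(length + 7) / 8` bytes that is present only when the null count is non-zero, and, for variable-size types, `length + 1` 8-byte offsets. The sketch below parses just that per-column header under those assumptions. `ColumnHeader`, `readLE` and `parseColumnHeader` are hypothetical names. Unlike the diff, which reads through `reinterpret_cast`, the sketch assembles integers byte by byte and bounds-checks every read. It does not interpret the bitmap bits, because `arrowColToFlashCol` is not shown in this view.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical container for the per-column header fields read by decode().
struct ColumnHeader
{
    std::uint32_t length = 0;              // number of rows in the column
    std::uint32_t null_count = 0;          // number of NULL rows
    std::vector<std::uint8_t> null_bitmap; // empty when null_count == 0
    std::vector<std::uint64_t> offsets;    // length + 1 entries, variable-size columns only
    std::size_t consumed = 0;              // bytes read; column data would start here
};

// Read a little-endian unsigned integer one byte at a time, so the result does
// not depend on pointer alignment or on the host byte order.
template <typename T>
T readLE(const std::uint8_t * data, std::size_t size, std::size_t & pos)
{
    if (pos > size || size - pos < sizeof(T))
        throw std::out_of_range("truncated column header");
    T value = 0;
    for (std::size_t i = 0; i < sizeof(T); ++i)
        value |= static_cast<T>(data[pos + i]) << (8 * i);
    pos += sizeof(T);
    return value;
}

ColumnHeader parseColumnHeader(const std::uint8_t * data, std::size_t size, bool var_size)
{
    assert(data != nullptr || size == 0);
    ColumnHeader header;
    std::size_t pos = 0;
    header.length = readLE<std::uint32_t>(data, size, pos);
    header.null_count = readLE<std::uint32_t>(data, size, pos);
    if (header.null_count > 0)
    {
        // One bit per row, rounded up to whole bytes.
        const std::size_t bitmap_bytes = (static_cast<std::size_t>(header.length) + 7) / 8;
        if (size - pos < bitmap_bytes)
            throw std::out_of_range("truncated null bitmap");
        header.null_bitmap.assign(data + pos, data + pos + bitmap_bytes);
        pos += bitmap_bytes;
    }
    if (var_size)
    {
        // length + 1 offsets delimit the value of every row.
        for (std::size_t i = 0; i <= header.length; ++i)
            header.offsets.push_back(readLE<std::uint64_t>(data, size, pos));
    }
    header.consumed = pos;
    return header;
}
```

For a three-row string column with one NULL, the header would occupy 4 + 4 + 1 + 4 × 8 = 41 bytes before the value bytes begin.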
18 changes: 18 additions & 0 deletions dbms/src/Flash/Coprocessor/ArrowChunkCodec.h
@@ -0,0 +1,18 @@
#pragma once

#include <Flash/Coprocessor/ChunkCodec.h>

#include <Flash/Coprocessor/TiDBChunk.h>

namespace DB
{

class ArrowChunkCodec : public ChunkCodec
{
public:
ArrowChunkCodec() = default;
Block decode(const tipb::Chunk & chunk, const DAGSchema & schema) override;
std::unique_ptr<ChunkCodecStream> newCodecStream(const std::vector<tipb::FieldType> & field_types) override;
};

} // namespace DB
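
`ChunkCodec.h` is not visible in this view, so the exact base-class interface is unknown. Going only by the overrides in `ArrowChunkCodecStream` (`encode` over a row range, `getString`, `clear`) and the `newCodecStream` factory, the pattern seems to be an accumulate, flush, reset stream. The toy sketch below illustrates that pattern. Every name in it (`ToyBlock`, `ToyCodecStream`, `LengthPrefixedStream`, `newToyCodecStream`) is hypothetical, and the length-prefixed text encoding is invented for the example and unrelated to the Arrow layout.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a Block: a single column of string values.
using ToyBlock = std::vector<std::string>;

// Assumed shape of a codec stream, inferred from the overrides shown in the diff.
class ToyCodecStream
{
public:
    virtual ~ToyCodecStream() = default;
    // Append rows [start, end) of the block to the pending chunk.
    virtual void encode(const ToyBlock & block, std::size_t start, std::size_t end) = 0;
    // Serialize everything accumulated so far.
    virtual std::string getString() = 0;
    // Drop accumulated state so the stream can be reused for the next chunk.
    virtual void clear() = 0;
};

// Toy encoding: each value is written as "<size>:<bytes>".
class LengthPrefixedStream : public ToyCodecStream
{
public:
    void encode(const ToyBlock & block, std::size_t start, std::size_t end) override
    {
        assert(start <= end && end <= block.size());
        for (std::size_t i = start; i < end; ++i)
        {
            buffer += std::to_string(block[i].size());
            buffer += ':';
            buffer += block[i];
        }
    }
    std::string getString() override { return buffer; }
    void clear() override { buffer.clear(); }

private:
    std::string buffer;
};

// Factory in the spirit of newCodecStream: callers only see the abstract stream.
std::unique_ptr<ToyCodecStream> newToyCodecStream()
{
    return std::make_unique<LengthPrefixedStream>();
}
```

A caller would encode a row range, take the string as one chunk, call `clear`, and continue with the next range, so a single stream object can be reused for every chunk of a response.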