meta: refactor graph building and scheduling #7420
Comments
mergify bot pushed a commit that referenced this issue on Jan 18, 2023
Explained in detail in #7420. This PR mainly focuses on the internal table handling.

- Extract the visiting logic into a single function, `visit_internal_table`.
- Move the logic of filling IDs to before the construction of the **Fragment** Graph, so that the graph is complete once it is built.
- Remove a number of (mutable) fields in the `Context`.

Approved-By: yezizp2012
Approved-By: chenzl25
Approved-By: xx01cyx
Co-Authored-By: Bugen Zhao <i@bugenzhao.com>
Co-Authored-By: August <pin@singularity-data.com>
mergify bot pushed a commit that referenced this issue on Feb 1, 2023
(as explained in #7420)

This PR introduces a brand new streaming scheduler that works by automatic iteration over rules written in a Datalog syntax. In short, by simply pre-defining some rules, the scheduler can directly derive the distribution (or vnode mapping) for the given fragment graph and fill the exchange with correct dispatchers, without having to think about the topological order or handle edge cases anymore.

https://github.com/risingwavelabs/risingwave/blob/b5b70668cf275a83b169b0088e3403686173697d/src/meta/src/stream/stream_graph/schedule.rs#L83-L98

In this PR, we only use the new scheduler for parallelism assignment, while the derived distribution itself is ignored. This will be addressed by removing the old scheduler in follow-up PRs.

By moving the scheduling phase ahead of actor building, we can also simplify logic such as actor ID assignment, since actors are built only after their total count is known.

Approved-By: chenzl25
Approved-By: yezizp2012
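The idea of deriving distributions by iterating rules to a fixpoint can be illustrated with a small sketch. This is not the code in `schedule.rs`: the `Distribution` type, the `schedule` function, and the single "no-shuffle edges share a distribution" rule are all assumptions made for illustration, and the real scheduler defines its own rules.

```rust
use std::collections::HashMap;

/// Hypothetical distribution of a fragment (not RisingWave's real type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Distribution {
    Singleton,
    /// Hash-distributed, identified by an opaque vnode-mapping id.
    Hash(u32),
}

/// Derive a distribution for every fragment by applying rules until nothing changes.
///
/// * `facts`: fragments whose distribution is required up front.
/// * `no_shuffle_edges`: `(upstream, downstream)` pairs that must share a distribution.
/// * `default`: assigned to fragments that no rule constrains.
pub fn schedule(
    fragments: &[u32],
    facts: &HashMap<u32, Distribution>,
    no_shuffle_edges: &[(u32, u32)],
    default: Distribution,
) -> Result<HashMap<u32, Distribution>, String> {
    let mut derived = facts.clone();
    loop {
        let mut changed = false;
        for &(up, down) in no_shuffle_edges {
            // Assumed rule: a no-shuffle edge propagates the distribution both ways.
            for (from, to) in [(up, down), (down, up)] {
                let Some(dist) = derived.get(&from).copied() else {
                    continue;
                };
                match derived.get(&to).copied() {
                    None => {
                        derived.insert(to, dist);
                        changed = true;
                    }
                    Some(existing) if existing != dist => {
                        return Err(format!(
                            "conflicting distributions for fragments {from} and {to}"
                        ));
                    }
                    Some(_) => {}
                }
            }
        }
        // Fixpoint reached: a full pass derived no new fact.
        if !changed {
            break;
        }
    }
    // Fragments left unconstrained fall back to the default distribution.
    for &f in fragments {
        derived.entry(f).or_insert(default);
    }
    Ok(derived)
}
```

Because the loop simply re-applies the rule until no new fact appears, the order in which edges are listed does not matter, which is the property the PR description highlights (no need to reason about topological order).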
Little-Wallace added a commit to Little-Wallace/risingwave that referenced this issue on Feb 6, 2023
commit a9af7f95b34e29905bcf5e3d0468b2a03da1b7a4 Author: Yuanxin Cao <60498509+xx01cyx@users.noreply.github.com> Date: Mon Feb 6 16:06:45 2023 +0800 fix(ci): fix main-cron by increasing total compute node memory in compaction test (#7703) As title. Approved-By: kwannoel commit 77b46077e816cbc5a426fb1c3330cabd3579712e Author: August <pin@singularity-data.com> Date: Mon Feb 6 15:09:44 2023 +0800 feat(frontend): introduce system catalog table pg_enum (#7706) Introduce system catalog table `pg_enum`. Approved-By: neverchanje commit 7e565a713a182e832bb37db1c288caab4de897f5 Author: Yuanxin Cao <60498509+xx01cyx@users.noreply.github.com> Date: Mon Feb 6 14:44:30 2023 +0800 feat(optimizer): watermark derivation for various plan nodes (#7655) Add watermark derivation for various plan nodes. Watermark derivation for `TableScan` hasn't been implemented because it may need to modify table catalog and will be done in future PR. Approved-By: st1page commit c3bb0275cfc59770730114eada575599f8cd150e Author: ZENOTME <43447882+ZENOTME@users.noreply.github.com> Date: Mon Feb 6 14:16:39 2023 +0800 feat(frontend): seperate plan_fragmenter into two phase (#7581) To solve #7439, we need to do async operation in plan_fragmentor. To do this, I seperate the plan_fragmentor into two phase **so that we can do async operation in phase 2**: phase 1 : BatchPlanFragmenter.split(batch_node) -> PreStageGraph phase 2 : PreStageGraph.complete() -> StageGraph The difference between PreStageGraph and StageGraph is that StageGraph contains the exchange_info and parallism. These information will be filled in phase 2. Approved-By: liurenjie1024 commit 26780672c936c216bba1200431280456bc960877 Author: Eric Fu <eric@singularity-data.com> Date: Mon Feb 6 13:06:04 2023 +0800 feat: display git version in `version()` and server logs (#7690) Show Git version in `version()` and server logs. Would be helpful when looking into problems, especially for nightly versions. 
``` dev=> select version(); version -------------------------------------------------- PostgreSQL 13.9-RisingWave-0.2.0-alpha (b720f19) (1 row) ``` ``` 2023-02-03T10:35:25.446834Z INFO risingwave_compute::server: Starting compute node 2023-02-03T10:35:25.446858Z INFO risingwave_compute::server: > config: RwConfig { server: ServerConfig { heartbeat_interval_ms: 1000, max_heartbeat_interval_secs: 600, connection_pool_size: 16, metrics_level: 0 }, meta: MetaConfig { min_sst_retention_time_sec: 604800, collect_gc_watermark_spin_interval_sec: 5, periodic_compaction_interval_sec: 60, vacuum_interval_sec: 30, max_heartbeat_interval_secs: 600, disable_recovery: true, meta_leader_lease_secs: 10, dangerous_max_idle_secs: Some(1800), enable_compaction_deterministic: false, enable_committed_sst_sanity_check: false, node_num_monitor_interval_sec: 10, backend: Mem }, batch: BatchConfig { worker_threads_num: None, developer: DeveloperConfig { batch_output_channel_size: 64, batch_chunk_size: 1024, stream_enable_executor_row_count: false, stream_connector_message_buffer_size: 16, unsafe_stream_extreme_cache_size: 1024, stream_chunk_size: 1024, stream_exchange_initial_permits: 8192, stream_exchange_batched_permits: 1024 } }, streaming: StreamingConfig { barrier_interval_ms: 1000, in_flight_barrier_nums: 10000, checkpoint_frequency: 10, actor_runtime_worker_threads_num: None, enable_jaeger_tracing: false, async_stack_trace: On, developer: DeveloperConfig { batch_output_channel_size: 64, batch_chunk_size: 1024, stream_enable_executor_row_count: false, stream_connector_message_buffer_size: 16, unsafe_stream_extreme_cache_size: 1024, stream_chunk_size: 1024, stream_exchange_initial_permits: 8192, stream_exchange_batched_permits: 1024 } }, storage: StorageConfig { sstable_size_mb: 256, block_size_kb: 64, bloom_false_positive: 0.001, share_buffers_sync_parallelism: 1, share_buffer_compaction_worker_threads_number: 4, shared_buffer_capacity_mb: 1024, state_store: "hummock+memory", 
data_directory: "hummock_001", write_conflict_detection_enabled: true, block_cache_capacity_mb: 512, meta_cache_capacity_mb: 128, disable_remote_compactor: false, enable_local_spill: true, local_object_store: "tempdisk", share_buffer_upload_concurrency: 8, compactor_memory_limit_mb: 512, sstable_id_remote_fetch_number: 10, file_cache: FileCacheConfig { dir: "", capacity_mb: 1024, total_buffer_capacity_mb: 128, cache_file_fallocate_unit_mb: 512, cache_meta_fallocate_unit_mb: 16, cache_file_max_write_size_mb: 4 }, min_sst_size_for_streaming_upload: 33554432, max_sub_compaction: 4, object_store_use_batch_delete: true, max_concurrent_compaction_task_number: 16, enable_state_store_v1: false }, backup: BackupConfig { storage_url: "memory", storage_directory: "backup" } } 2023-02-03T10:35:25.447014Z INFO risingwave_compute::server: > debug assertions: on 2023-02-03T10:35:25.447043Z INFO risingwave_compute::server: > version: 0.2.0-alpha (75de7ee) ``` Approved-By: liurenjie1024 commit 2dfa704001daa95bfa807512195d5783638477c5 Author: lmatz <lmatz823@gmail.com> Date: Mon Feb 6 12:23:39 2023 +0800 fix(meta): temporarily does not require advertise_addr when using etcd commit 167afb38e111ae87e220b476ac821d39c3961758 Author: xiangjinwu <17769960+xiangjinwu@users.noreply.github.com> Date: Mon Feb 6 12:11:40 2023 +0800 refactor(DataType): cleanup outdated helpers (#7685) The following 2 helpers are no longer used in any place: `is_type_encodable`, `mem_cmp_eq_value_enc`. Note: right now all DataTypes are encodable in memcomparable format. We will reintroduce the difference and disallow certain types from being used as memcomparable later. 
Approved-By: liurenjie1024

commit b996ba1ee1b4de45d9ccfdff581f8b2c479dee70
Author: ZENOTME <43447882+ZENOTME@users.noreply.github.com>
Date: Mon Feb 6 11:13:19 2023 +0800

chore(test):add e2e test for producing timestamp in kafka source (#7699)

Approved-By: liurenjie1024

commit 50fc4495806f3e69b9bf8eb5def8bc7cf0894e0c
Author: Eric Fu <eric@singularity-data.com>
Date: Mon Feb 6 10:44:18 2023 +0800

fix: alias of argument `advertise_addr` (#7702)

The alias of `--advertise-addr` should be `--client-address` to stay compatible with previous versions.

Approved-By: lmatz
Approved-By: huangjw806

commit 4e9756d04853248c0a7fe95da82b2588820cda5e
Author: xxchan <xxchan22f@gmail.com>
Date: Sat Feb 4 08:17:40 2023 +0100

ci: add github_token for buf-setup-action (#7697)

Approved-By: tabVersion

commit 05eb37f33a8275237bb268e4cb2932f8f31eb234
Author: xxchan <xxchan22f@gmail.com>
Date: Fri Feb 3 23:55:18 2023 +0100

feat: implement append-only group TopN (#7522)

close #7376

Approved-By: BugenZhao

commit dffc2f145ddf56f359d6484fcaefeefc8e91bf66
Author: Yuanxin Cao <60498509+xx01cyx@users.noreply.github.com>
Date: Sat Feb 4 00:37:50 2023 +0800

feat: ensure reserved memory for computing tasks on compute node starting (#7670)

The total memory of a CN consists of:

1. computing memory (both stream & batch)
2. storage memory (block cache, meta cache, etc.)
3. memory for system usage

That is to say, we have **_CN total memory_ = _computing memory_ + _storage memory_ + _system memory_**, and both _CN total memory_ and _storage memory_ are currently configured by the user.

This PR ensures that _computing memory_ and _system memory_ are correctly reserved, i.e. **_computing memory_ + _system memory_ = _CN total memory_ - _storage memory_ > a given amount of memory**. We set this "given amount of memory" to 1G for now (512M for computing and 512M for system). The check is performed when the CN starts.
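The startup check described in #7670 can be sketched roughly as follows. The function and constant names are hypothetical, "512M" is assumed to mean 512 MiB, and the strict `>` comparison is taken literally from the description above; the actual implementation may differ on all three points.

```rust
/// Assumed reservation for computing tasks (512 MiB).
pub const MIN_COMPUTE_MEMORY_BYTES: u64 = 512 << 20;
/// Assumed reservation for system usage (512 MiB).
pub const MIN_SYSTEM_MEMORY_BYTES: u64 = 512 << 20;

/// Check that `total - storage` leaves more than the reserved amount for
/// computing and system memory. Returns the non-storage memory on success.
/// (Hypothetical helper; names and error type are illustrative only.)
pub fn validate_compute_node_memory(
    total_memory_bytes: u64,
    storage_memory_bytes: u64,
) -> Result<u64, String> {
    let reserved = MIN_COMPUTE_MEMORY_BYTES + MIN_SYSTEM_MEMORY_BYTES;
    // `checked_sub` also rejects the case where storage memory exceeds the total.
    let non_storage = total_memory_bytes
        .checked_sub(storage_memory_bytes)
        .ok_or_else(|| "storage memory exceeds total CN memory".to_string())?;
    if non_storage > reserved {
        Ok(non_storage)
    } else {
        Err(format!(
            "only {non_storage} bytes left for computing and system; need more than {reserved}"
        ))
    }
}
```

Running such a check once at startup turns a misconfiguration (for example, a storage cache sized close to the whole node) into an immediate, explicit error instead of a later out-of-memory failure.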
Approved-By: fuyufjh Approved-By: hzxa21 commit 20bdb72d5ff015d736fb582918cd551563139273 Author: Bowen <36908971+BowenXiao1999@users.noreply.github.com> Date: Fri Feb 3 19:20:31 2023 +0800 fix: report local execution mode error (#7454) 1. Enable local mode error propagation. Now when local mode task (in CN) happens error, it can report to users. 2. Store sender in TaskExecution, avoid early drop (Otherwise it's possible that the task execution error will become hash shuffle error) This pr revert some previous workaround: TODO in sqlsmith, store the sender in task execution Approved-By: liurenjie1024 Co-Authored-By: BowenXiao1999 <931759898@qq.com> Co-Authored-By: Bowen <36908971+bowenxiao1999@users.noreply.github.com> commit a3306ea51bbb1d390bd49f24edf0ea165388cb0a Author: xxchan <xxchan22f@gmail.com> Date: Fri Feb 3 11:40:52 2023 +0100 feat: don't report error when cancelling `risedev configure` (#7673) Also tweak the prompts Approved-By: BugenZhao Approved-By: TennyZhuang commit 930b185d1a9cadb06b74a5604bb7a17985786a8a Author: Shanicky Chen <peng@singularity-data.com> Date: Fri Feb 3 17:34:29 2023 +0800 feat: refine meta election logic (#7669) This PR refine a series of meta election related codes 1. meta's addr supports multiple addresses 2. 
meta client adds a meta address mode parameter to distinguish the behavior of election members found in loadbalance (kubernetes environment) and list (normal environment) Approved-By: yezizp2012 commit 496f7a97477bdd6f8882e2fd66c40cb4cec0ee6c Author: Noel Kwan <47273164+kwannoel@users.noreply.github.com> Date: Fri Feb 3 15:23:56 2023 +0800 fix(sqlsmith): generate typed null (#7679) Approved-By: fuyufjh commit 50bd4871cbc6e399b8ddf4c2c2f6e96bf1646c99 Author: Eric Fu <eric@singularity-data.com> Date: Fri Feb 3 14:59:58 2023 +0800 fix: bug of MetaNodeOpts (#7681) It seems to be a mistake introduced in #7658 Approved-By: Gun9niR commit f1c4558ce5c3593f5b77cf9a7c74b1f5df87422b Author: zwang28 <70626450+zwang28@users.noreply.github.com> Date: Fri Feb 3 14:35:40 2023 +0800 chore(log): suppress unnecessary warning (#7676) Change this warning message into a debug message, because it's the expected behavior during backfill. Otherwise it can overwhelm the log stream during a large MV creation. Approved-By: wenym1 commit a97e94f327f6cb8e7c6fabb6b346220cc85e0933 Author: Yi Zhang <72374626+eemario@users.noreply.github.com> Date: Fri Feb 3 14:12:16 2023 +0800 feat(ci): add e2e test for iceberg sink (#7631) This PR adds e2e tests for iceberg sink in PR & main branch workflows, which includes: - setting up the environment (creating a new bucket in hummock-minio with `mcli`, creating an iceberg table with `spark-sql`) - running `e2e_test/sink/iceberg_sink.slt` - checking the test results (reading from the iceberg table with `spark-sql` and matching the output) Approved-By: wenym1 Approved-By: StrikeW Approved-By: tabVersion commit b720f195aee54287e1be2a5f46b94f60db694d4f Author: Noel Kwan <47273164+kwannoel@users.noreply.github.com> Date: Fri Feb 3 12:03:46 2023 +0800 feat(sqlsmith): gen implicit cast (#7629) - [x] Gen for fixed func - [x] Gen for concat (note that this is implicit cast but in explicit context...) 
Approved-By: lmatz Co-Authored-By: Noel Kwan <noelkwan1998@gmail.com> Co-Authored-By: Noel Kwan <47273164+kwannoel@users.noreply.github.com> commit 1f6b0630e547a7af46f73326e79f9555514b2544 Author: ioperations <65484906+ioperations@users.noreply.github.com> Date: Fri Feb 3 11:35:37 2023 +0800 fix(sqlparser): fix operator precedence between '>=' and 'IN' (#7665) Approved-By: kwannoel Co-Authored-By: Noel Kwan <47273164+kwannoel@users.noreply.github.com> Co-Authored-By: aodong.qin <ioperations.c@gmail.com> Co-Authored-By: ioperations <65484906+ioperations@users.noreply.github.com> commit 704cc4fc8c00a9b37da4edd3a56bc750f12a2f78 Author: jon-chuang <9093549+jon-chuang@users.noreply.github.com> Date: Fri Feb 3 11:12:45 2023 +0800 feat(service-params): Rename `host -> listen_addr`, `client_addr -> advertise_addr` (#7530) Rename `host -> listen_addr`, `client_addr -> advertise_addr` for clearer meaning of cmdline params. Approved-By: CAJan93 Co-Authored-By: jon-chuang <jon-chuang@users.noreply.github.com> Co-Authored-By: jon-chuang <9093549+jon-chuang@users.noreply.github.com> commit bb673d4de584dfb97afb5da8f6c1ddf28b157e7d Author: jon-chuang <9093549+jon-chuang@users.noreply.github.com> Date: Fri Feb 3 10:15:13 2023 +0800 feat(frontend): define `ConstantEvalRewriter`, impl `ExprRewritable` for batch (#7541) Next steps: - impl for stream nodes Depends on: https://github.com/risingwavelabs/risingwave/pull/7542 Let's remember to enable `now.slt.part` after we enable `ConstantEvalRewriter`. Approved-By: chenzl25 commit 99ffb71a49c5973fb3a5edf00b86973380fdb082 Author: Noel Kwan <47273164+kwannoel@users.noreply.github.com> Date: Thu Feb 2 23:59:02 2023 +0800 fix(sqlsmith): generation of ambiguous `IN` `list` expression (#7672) Generate `InList` expression with parenthesis in the prefix argument to avoid ambiguity. 
Approved-By: lmatz commit 00b0e361eccd57c06d46ae0863130c2a60ea1650 Author: August <pin@singularity-data.com> Date: Thu Feb 2 22:18:53 2023 +0800 chore(test): minor code refactoring in simulation test (#7666) Approved-By: BugenZhao Approved-By: wangrunji0408 commit 9fd77215f7d4f9b96dc2ac55473c192521e4f118 Author: ZENOTME <43447882+ZENOTME@users.noreply.github.com> Date: Thu Feb 2 20:46:16 2023 +0800 feat(pgwire):support mix format in extended query mode (#7622) as title. Solve the issue #7605 and #7599 Approved-By: BowenXiao1999 Approved-By: xiangjinwu commit 477e30de470a2066d20efb57fe8e3f49a31d51d9 Author: Dylan <chenzl25@mail2.sysu.edu.cn> Date: Thu Feb 2 17:44:31 2023 +0800 feat(streaming): support delta join on primary table (#7662) - Support delta join on primary table, because primary table is also an index as well. Approved-By: st1page commit e9871063fa90fda46fd18dcd560f10b4f629c2fb Author: Zhidong Guo <52783948+Gun9niR@users.noreply.github.com> Date: Thu Feb 2 16:42:58 2023 +0800 feat(cli): fallback to env if CLI arg is absent (#7658) As title. All envs will have the prefix of `RW`. Clap does not support prefixing env at the moment, so we have to specify the name manually :hot_face:. Serfig provides this functionality, but it still has the limitation of being unable to override with default value. As for backward compatibility, although the env names have changed, the CLI args remain the same. Approved-By: fuyufjh commit 83e7f00172ea828086dfd4101248336b7d755e4f Author: August <pin@singularity-data.com> Date: Thu Feb 2 16:06:43 2023 +0800 fix(meta): fix meta_endpoint format when host not provided in playground mode (#7661) Fix `meta_endpoint` format when `host` is not provided in playground mode, the endpoint will be wrongly generated as `127.0.0.1:5690:5690`. 
Approved-By: lmatz Approved-By: tabVersion Approved-By: BugenZhao commit e7fe72b0dda9379cd42e6e412bbf5a37c69ca9c4 Author: Wallace <bupt2013211450@gmail.com> Date: Thu Feb 2 13:04:58 2023 +0800 feat(storage): use finer granularity to monitor request latency (#7586) We used to thought the iterator latency of state-store could always exceed several hundred microsecond but it is a mistake caused by metrics. In fact, in most case, it is only several micro-seconds Approved-By: Li0k commit df85930b94b385a402905317148746d7a432bf7f Author: CAJan93 <jan.mensch@gmx.net> Date: Thu Feb 2 04:54:27 2023 +0100 feat(docs): Add documentation on how to run RW with debugger (#7652) Simple docs on how to run RW locally using a debugger. Hope you find it useful. For me it sometimes helps to step through the code line by line Approved-By: lmatz commit 9b9e0921f8d5de2c0cd504985200b476c5a101cf Author: waruto <wmc314@outlook.com> Date: Thu Feb 2 09:49:38 2023 +0800 refactor(source): refine some code of split reader (#7644) - move the implementation of `SplitReaderV2` to the readers themselves. Approved-By: xx01cyx Approved-By: tabVersion commit 3d9077b07a8318cc0d64b25b904f0e66e1bf5c86 Author: Zhidong Guo <52783948+Gun9niR@users.noreply.github.com> Date: Wed Feb 1 22:10:12 2023 +0800 feat: override config with cli opts (#7613) Define a proc macro `OverrideConfig` which will override fields in `RwConfig`. It supports overriding any field in the config file. Its usage is as bellow: ```rust pub struct ComputeNodeOpts { // Other items ... #[clap(flatten)] override_config: OverrideConfigOpts, } /// Command-line arguments for compute-node that overrides the config file. #[derive(Parser, Clone, Debug, OverrideConfig)] struct OverrideConfigOpts { #[clap(long)] #[override(path = storage.state_store)] pub state_store: Option<String>, /// Enable reporting tracing information to jaeger. 
#[clap(parse(from_flag), long)] #[override(path = streaming.enable_jaeger_tracing)] pub enable_jaeger_tracing: Flag, } fn compute_node_serve(opts: ComputeNodeOpts) { // `override_config` should not be usable afterwards let config = load_config(&opts.config_path, Some(opts.override_config)); } ``` I plan to keep only the addresses, hardware-related items and credential-related items in CLI. Thus more options are available in the config file. Please check the release notes. This PR keeps backward compatibility, as the newly added fields in the config file can still be overridden from CLI. I did not use serfig because it [does not support overriding with default values](https://github.com/Xuanwo/serfig/issues/23#issuecomment-1409809442). The user might not do that, but as risedev users, we might accidentally have modified risingwave.toml and forgot to change it back, which conflicts with the values generated by risedev and causing confusing behaviours 🥵 Approved-By: fuyufjh Approved-By: xxchan Co-Authored-By: Gun9niR <gun9nir.guo@gmail.com> Co-Authored-By: Zhidong Guo <52783948+Gun9niR@users.noreply.github.com> commit 17cdbd76b8c06c316f5a0d4d48cd7d875d2fcd8d Author: Bugen Zhao <i@bugenzhao.com> Date: Wed Feb 1 20:08:04 2023 +0800 feat(meta): iterative streaming scheduler (part 1) (#7490) (as explained in #7420) This PR introduces a brand new streaming scheduler which is done by automatic iteration based on a Datalog syntax. In short, by simply pre-defining some rules, the scheduler can directly derive the distribution (or vnode mapping) for the given fragment graph and fill the exchange with correct dispatchers, without thinking about the topological order or handling edge cases anymore. https://github.com/risingwavelabs/risingwave/blob/b5b70668cf275a83b169b0088e3403686173697d/src/meta/src/stream/stream_graph/schedule.rs#L83-L98 In this PR, we only utilize this new scheduler in parallelism assignment, while the derived distribution itself is ignored. 
This can be fixed by removing the old scheduler in the next PRs. By making the scheduling phase ahead of time, we can also simplify the logic like actor ID assignments, as we're building actors after we know the total count of them. Approved-By: chenzl25 Approved-By: yezizp2012 commit 2eea46208210d95c1975b7ea61f37674b6d2e832 Author: Huangjw <1223644280@qq.com> Date: Wed Feb 1 19:37:56 2023 +0800 fix: fix release workflow scripts && update release version in readme (#7649) As the title. Approved-By: lmatz Approved-By: BugenZhao commit 3b5419f6c6cdaa9b7324211fff71dc1f55907e46 Author: lmatz <lmatz823@gmail.com> Date: Wed Feb 1 19:14:06 2023 +0800 chore: remind contributors to ensure backward compatibility when there are breaking changes (#7607) As per the title. Have no strong opinion on this. Slightly better than nothing....... Approved-By: neverchanje commit 81678aae414e85149f8d6e9bb86315336afb4ba7 Author: CAJan93 <jan.mensch@gmx.net> Date: Wed Feb 1 11:26:30 2023 +0100 fix(chore): Change comments in test to ci-3cn-2fe-3meta (#7651) Approved-By: yezizp2012 Approved-By: liurenjie1024 commit 42d2e11c579fd3ec00f28b95170d8a456efb619d Author: Bohan Zhang <tabvision@bupt.icu> Date: Wed Feb 1 17:37:17 2023 +0800 feat: add ssl/sasl support for kafka sink (#7540) as title Approved-By: waruto210 commit c42860f959ea8e7f3a15e695cd1642e184b11b69 Author: Liang <44948473+soundOfDestiny@users.noreply.github.com> Date: Wed Feb 1 17:13:56 2023 +0800 feat(lru cache): add interface `lookup_with_request_dedup` which … (#7456) Add interface `lookup_with_request_dedup` which leverages `Tokio::spawn` to avoid await in `LruCache` for storage prefetch purpose. 
Approved-By: Little-Wallace
Approved-By: wenym1

commit 8e43885db6fadafa058ab8f52ec1481335c57fe0
Author: William Wen <44139337+wenym1@users.noreply.github.com>
Date: Wed Feb 1 16:49:10 2023 +0800

refactor(storage): remove storage core from local hummock (#7642)

Currently, in the `LocalHummockStorage`, most of the useful fields are inside an inner `HummockStorageCore`, which is wrapped in `Arc` and was intended to be a struct shared across multiple tokio tasks. However, inside the `HummockStorageCore` there is an `Arc<RwLock<ReadVersion>>`, and this is the struct that is actually shared. Therefore, there is no need to wrap the `HummockStorageCore` in `Arc`, and furthermore we can move the fields in `HummockStorageCore` to `LocalHummockStorage` and remove the `HummockStorageCore`.

We also remove `Clone` from `LocalHummockStorage` in unit tests, and some unit test code is refactored accordingly.

Approved-By: Li0k

commit be2300d50a976b49da95d99ce2cb0e22e3eb7379
Author: Bowen <36908971+BowenXiao1999@users.noreply.github.com>
Date: Wed Feb 1 16:25:57 2023 +0800

feat: collect stream mem usage (#7180)

Unlike the jemalloc-based memory stats, this PR adds memory stats that cover only memory allocated by streaming jobs.

Goal: Collect streaming memory usage and record it as Prometheus metrics.

Code Design:

1. Given that each actor task is wrapped by `allocation_stat`, each task will call `store_mem_usage` and report the actor-level memory usage to `ActorContext`. `ActorContext` needs to record the last mem val and the current mem val, so it can calculate the diff and apply it to a global variable, `total_mem_val`, which collects all streaming memory usage.
2. `GlobalMemoryManager` is responsible for writing `total_mem_val` as a streaming metric in Prometheus.
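The diff-based accounting in step 1 of #7180 can be sketched as below. The names `ActorContext`, `store_mem_usage`, and `total_mem_val` come from the description above, but the field layout, signatures, and use of atomics are assumptions for illustration rather than the actual RisingWave code.

```rust
use std::sync::atomic::{AtomicI64, AtomicU64, Ordering};
use std::sync::Arc;

/// Per-actor context that remembers the last reported memory usage
/// (hypothetical layout).
pub struct ActorContext {
    last_mem_val: AtomicU64,
    /// Shared across all actors; holds the total streaming memory usage.
    total_mem_val: Arc<AtomicI64>,
}

impl ActorContext {
    pub fn new(total_mem_val: Arc<AtomicI64>) -> Self {
        Self {
            last_mem_val: AtomicU64::new(0),
            total_mem_val,
        }
    }

    /// Record the actor's current usage and apply only the difference from
    /// the previous report to the global total.
    pub fn store_mem_usage(&self, current: u64) {
        let last = self.last_mem_val.swap(current, Ordering::Relaxed);
        let diff = current as i64 - last as i64;
        self.total_mem_val.fetch_add(diff, Ordering::Relaxed);
    }
}
```

Applying only the delta means the global counter never needs to iterate over all actors: a metrics reporter (the `GlobalMemoryManager` role in the description) can read `total_mem_val` at any time and export it.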
Approved-By: liurenjie1024 commit fcc2ec872635379f8f577a55cd6def4e05bd98e7 Author: congyi wang <58715567+wcy-fdu@users.noreply.github.com> Date: Wed Feb 1 15:44:36 2023 +0800 feat(storage): bloom filter hash value xor with table id (#7502) To avoid different tables in one sst have same bloom filter key, we can xor hash value with table_id, and thus we don't need to change `SstableMeta`. Approved-By: Little-Wallace Co-Authored-By: congyi <15605187270@163.com> Co-Authored-By: congyi wang <58715567+wcy-fdu@users.noreply.github.com> commit 31757b685dde7fd859bfb3cd0cffdc98b823857e Author: Dylan <chenzl25@mail2.sysu.edu.cn> Date: Wed Feb 1 15:21:16 2023 +0800 fix(meta): Handle internal tables properly when creating and dropping sink (#7638) - Handle internal tables properly when creating and dropping sink. Approved-By: yezizp2012 commit cf2306bb8fbb91dbbd3c1e350d93a0c4aa37c4c2 Author: Dylan <chenzl25@mail2.sysu.edu.cn> Date: Wed Feb 1 14:57:34 2023 +0800 feat(stream): support and refactor 2 way delta join (#7417) We will support delta join in a similar way as distributed lookup join. - Schedule stream lookup operator together with the index scan and don't rely on the arrange operator. - Use consistent hash shuffle to guide the upstream row to the lookup operator. - Change `StreamActor`'s `same_work_node_upstream` to `ColocatedActorId`. - Use a storage table instead of a state table to lookup the current and previous epoch data in uncommitted read(NoWait) manner. - Change `ChainExecutor` into a pure dispatcher without consuming a snapshot. Delta join only needs one data flow to consume the old snapshot and other data flows should be pure dispatcher. - Support a session variable `rw_streaming_enable_delta_join` to control whether enable delta join, default value false. 
Approved-By: fuyufjh Approved-By: BugenZhao Co-Authored-By: Dylan Chen <zilin@singularity-data.com> Co-Authored-By: Dylan <chenzl25@mail2.sysu.edu.cn> commit a1a56d7d98f947caa811e8610a82fb65355a1ced Author: Kiv Chen <34561254+KivenChen@users.noreply.github.com> Date: Tue Jan 31 22:32:45 2023 -0800 feat(ci): poll-and-wait for connector node boot-up (#7452) As the Connector Node becomes heavier, e2e tests may start before the Connector finishes boot-up. From this PR on, the e2e sink and source scripts will wait for Connector to become available until the script comes into timeout. Approved-By: tabVersion Co-Authored-By: Kiv Chen <sdckivenchen@gmail.com> Co-Authored-By: Bohan Zhang <tabvision@bupt.icu> commit f309b246571a9ec68201d696fbcd5cfd4181519f Author: Liang <44948473+soundOfDestiny@users.noreply.github.com> Date: Wed Feb 1 13:29:35 2023 +0800 fix(state store): calculate delete range even if bloom filter negativ… (#7619) calculate delete range of an SST even if bloom filter negative in `iter()` in state store Approved-By: Little-Wallace commit 729272474a851ede454fbdfa281efa3dd843dd40 Author: Noel Kwan <47273164+kwannoel@users.noreply.github.com> Date: Wed Feb 1 13:01:35 2023 +0800 fix(expr): use checked operations for substr (#7634) As per title. 
Approved-By: xiangjinwu Approved-By: Gun9niR commit 5938e35c02c2091d165b9fb70107d9399f86a0e0 Author: Noel Kwan <47273164+kwannoel@users.noreply.github.com> Date: Tue Jan 31 21:35:18 2023 +0800 chore(sqlsmith): disable `FILTER` with correlated input reference (#7624) Workaround: https://github.com/risingwavelabs/risingwave/issues/4762 Approved-By: lmatz commit 0c15c43f239d6b01813014fa46e8dd554d23dd92 Author: August <pin@singularity-data.com> Date: Tue Jan 31 19:44:12 2023 +0800 chore: decrease retry interval for recovery, try to speed up recovery (#7627) Approved-By: wangrunji0408 commit 1ac6da8c7e0e0da93917f7a1da7dab3cb7ebee33 Author: lmatz <lmatz823@gmail.com> Date: Tue Jan 31 19:08:57 2023 +0800 fix(meta): backward compatible with host when running in mem meta backend (#7628) See #7616, the previous pr only changes the etcd backend. It fails when running mem backend. Now both etcd backend and mem backend run with the same `meta-endpoint` when `meta-endpoint` is not provided, i.e. `format!("{}:{}", meta_addr, listen_addr.port())` Approved-By: arkbriar commit db4f27e56a09d234de074a3acbb43bfea70825af Author: Bowen <36908971+BowenXiao1999@users.noreply.github.com> Date: Tue Jan 31 17:20:50 2023 +0800 fix: do not panic when there is None in dml ret stream (#7606) It's possible that the chunk stream do not return any data. For example, insert fail. We should not panic here. From #7454 Approved-By: BugenZhao Approved-By: liurenjie1024 commit 77a9c3733ae318eef03878bd4973c6865eb190f5 Author: Runji Wang <wangrunji0408@163.com> Date: Tue Jan 31 16:54:34 2023 +0800 feat(test): load balancing using DNS and IPVS in simulation (#7580) This PR utilizes the new DNS and IPVS feature in madsim to support load balancing in deterministic test. As an example, we added a DNS record `frontend -> 192.168.2.0` to the cluster, and a virtual service `192.168.2.0:4566 -> {192.168.2.1:4566, 192.168.2.2:4566, 192.168.2.3:4566}` to each node. 
We could then connect a frontend using the address `frontend:4566` which will be load balanced to 3 frontend servers. This can also be used to simulate k8s services when we have multiple meta services after a few days. - [x] The patch for madsim in Cargo.toml will be replaced by new versions once all tests pass. Approved-By: BugenZhao commit 082a4681270bb0621114e0f46ae7d323345bc99c Author: Runji Wang <wangrunji0408@163.com> Date: Tue Jan 31 16:30:15 2023 +0800 chore(test): move nexmark queries to external file (#7614) This PR moves inlined nexmark queries to external file. Approved-By: BugenZhao commit b97dd9f5f069c640eeb76369387b63639f6da5f8 Author: lmatz <lmatz823@gmail.com> Date: Tue Jan 31 16:06:17 2023 +0800 fix(meta): add back host, a deprecated cmd line argument (#7616) risingwave-operator made a [temporary fix](https://github.com/risingwavelabs/risingwave-operator/commit/f011c368e0ffc72e615d4717fefbef4650851d2e) with regards to the breaking change that uses `meta-endpoint` instead of `host`, please refer to #7527. But the cloud is unable to adopt this change right now, so we add `host` back. By looking at the temporary fix, it used to use `IP` as `host` and now uses `IP:PORT` as `meta-endpoint`. So this hot fix uses `IP` from `host` and `PORT` from `listening-addr` to compose `meta-endpoint`, please correct me if I am wrong. Approved-By: arkbriar Approved-By: jon-chuang commit 9a4ee148a6083cc3cf85e75d04ec029b1f45067f Author: Bugen Zhao <i@bugenzhao.com> Date: Tue Jan 31 14:43:59 2023 +0800 chore: check breaking change in protobuf (#7609) This PR adds the workflow of detecting breaking changes in protobuf to the CI workflow. Here is an example of error reporting: - https://github.com/risingwavelabs/risingwave/pull/7609/commits/c1dd48e4f4291270ec515886d999dbf8b946b733 We've set the rule category to `WIRE_JSON` which is explained [here](https://docs.buf.build/breaking/rules). 
Actually `WIRE` is enough for our major usages, while `WIRE_JSON` ensures that the names of the fields are not modified, which brings less confusion to the codebase and also makes the JSON-based dashboard easier to maintain (e.g., avoid #7262 from happening).

We may not need to add this workflow to GitHub's branch protection rule until the next release.

Approved-By: TennyZhuang

commit c52e5505e1d871fbe6646176a27921c7cbdd6435
Author: zwang28 <70626450+zwang28@users.noreply.github.com>
Date: Tue Jan 31 12:51:37 2023 +0800

chore(log): modify read current epoch error message (#7603)

The `ReadCurrentEpoch error` only occurs when the cluster is under recovery. This PR emphasizes this fact in the error message, because the error message will be read by frontend users. When `ReadCurrentEpoch error` is seen constantly, the root cause of the unsuccessful recovery should be investigated, rather than `ReadCurrentEpoch error` itself.

Approved-By: lmatz

commit 5b66ffc634c09fbec70450a1725d0f7c4f09fecb
Author: Bowen <36908971+BowenXiao1999@users.noreply.github.com>
Date: Tue Jan 31 10:35:23 2023 +0800

fix: do not panic when there is ssl error (#7588)

Close #7331

Do not panic when there is an SSL error. In this case, there will be an error log and a retry without SSL.

Approved-By: xxchan

commit bd86d166d0e61d7b0090d11bbc84146a02bbe320
Author: Renjie Liu <liurenjie2008@gmail.com>
Date: Tue Jan 31 09:00:33 2023 +0800

fix(batch): HopWindow executor should work without window_start or window_end (#7595)

As title.
Approved-By: lmatz commit 702345b5d0292d799451f9b77e0e433df4fa7c5b Author: Zhidong Guo <52783948+Gun9niR@users.noreply.github.com> Date: Tue Jan 31 00:43:40 2023 +0800 feat(storage): support hummock read plan in java binding (#7253) - Add `MetaClient` on java side: use blocking rpc stub for simplicity, use `ScheduledThreadPoolExecutor` to implement heartbeat (not sure if it's the best way though, may need advice from Java expert 🥺) - Support querying and pinning hummock version, querying table catalog by table name and db name on java side, and pass it to rust side's hummock API - Refine demo - fix maven dependency and classpath issues - start rw cluster with a new profile in risedev: `java_binding_demo` write data to a demo table, and read it from java demo Approved-By: ice1000 Approved-By: wenym1 Co-Authored-By: William Wen <william123.wen@gmail.com> Co-Authored-By: William Wen <44139337+wenym1@users.noreply.github.com> Co-Authored-By: Gun9niR <gun9nir.guo@gmail.com> Co-Authored-By: Zhidong Guo <52783948+Gun9niR@users.noreply.github.com> commit 719f55e4ee2a4d18e75f8f8adfc63603a7ab6748 Author: Dylan <chenzl25@mail2.sysu.edu.cn> Date: Mon Jan 30 21:28:39 2023 +0800 feat(frontend): Describe statement displays the primary key (#7590) - `describe` statement displays the primary key. Approved-By: yezizp2012 commit 198ca46c3b492e04cdbdef32d8d1a7c55dedfb74 Author: Bugen Zhao <i@bugenzhao.com> Date: Mon Jan 30 19:44:59 2023 +0800 fix(streaming): remove actor-level exchange metrics (#7584) As explained in #7362. Given the parallelism `p`, we'll have `p * p` actor-level metrics for each exchange, whose data is unacceptably large. Let's disable it temporarily. 
Approved-By: StrikeW commit d1c5d6aeb4d66539f16c3351dd284fe6d20b06e6 Author: Shanicky Chen <peng@singularity-data.com> Date: Mon Jan 30 19:16:55 2023 +0800 fix(meta): start leader service directly in meta election scenario (#7594) If the election client resumes from the leader state, it will start directly from the leader service Approved-By: yezizp2012 commit c7adda2bddfe1743685eb6e1a7dbbb15f8716b17 Author: August <pin@singularity-data.com> Date: Mon Jan 30 18:52:26 2023 +0800 fix: fix retry logic of drop statement in deterministic recovery test (#7592) Approved-By: BugenZhao commit f03b3da66eeae82bc0c1eb7641fa56e6f5bac62c Author: lmatz <lmatz823@gmail.com> Date: Mon Jan 30 18:27:06 2023 +0800 fix(batch): sum aggregator should use checked_add instead of raw addition to avoid panic (#7587) fix #7552, overflow should not panic, we used `checked_add` now to replace the raw `+` op. Approved-By: TennyZhuang commit 9f64e93bace7516f5dcf99fe0ae9f6e3541cce12 Author: jon-chuang <9093549+jon-chuang@users.noreply.github.com> Date: Mon Jan 30 17:42:57 2023 +0800 fix(frontend): `display_for_explain` for `ListValue` (#7571) I believe this bug was not caught before: https://github.com/risingwavelabs/risingwave/pull/7541 because Array is treated as a function. But const_eval would force the Array to be evaluated into the List DataType Anw, let's fix it as it's clearly a bug. After https://github.com/risingwavelabs/risingwave/pull/7541, the new planner test will be evaluated as an List literal, rather than as FunctionCall as per now. Approved-By: st1page Co-Authored-By: jon-chuang <jon-chuang@users.noreply.github.com> Co-Authored-By: jon-chuang <9093549+jon-chuang@users.noreply.github.com> commit 1d2bb32982e3b53489422199604051839e991035 Author: Shanicky Chen <peng@singularity-data.com> Date: Mon Jan 30 16:55:34 2023 +0800 feat: meta-client with new election mechanism (#7389) **This section will be used as the commit message. 
Please do not leave this empty!** This PR is a follow-up to PR #7179 This PR provides a meta client with an election client, and implements a sidecar rpc client update through the api of `members` and `leader` from LeaderService But this also introduces a mutex, all meta client operations need to hold this lock Approved-By: yezizp2012 commit 5e3070d06ee115911d4739bc146a91c72f0ecee0 Author: Bowen <36908971+BowenXiao1999@users.noreply.github.com> Date: Mon Jan 30 16:28:35 2023 +0800 chore: remove unnecessary println (#7585) Brought by previous pr, should remove it. Approved-By: BugenZhao commit da8fda33ce8b3f6ac4744152fe97f9b8ad3a170b Author: waruto <wmc314@outlook.com> Date: Mon Jan 30 16:02:58 2023 +0800 refactor(source): migrate all parsers and readers to the new stream based trait (#7508) - migrate all parsers/readers to the new stream based trait Approved-By: tabVersion commit c72c8880c628706fd0062f934200d93dedb78eac Author: Bowen <36908971+BowenXiao1999@users.noreply.github.com> Date: Mon Jan 30 14:11:07 2023 +0800 feat: support multiple stmts in one run (simple query) (#6849) Test locally. Approved-By: ZENOTME commit 4c31cc71a510f6dd7808d0ee81080a64e3301e17 Author: Bugen Zhao <i@bugenzhao.com> Date: Mon Jan 30 12:26:20 2023 +0800 fix(expr): handle time operation out of range (#7568) Handle the potential "out of range" panics in time operations as careful as possible. Approved-By: wangrunji0408 commit 39aaa75c70c8992ba2553a533c674d4f33c97453 Author: TennyZhuang <zty0826@gmail.com> Date: Mon Jan 30 11:52:48 2023 +0800 fix(expr): overlay panic on overflow (#7575) Use `checked_sub`. 
Approved-By: kwannoel commit e680ae70bd6740bdd3413d375e47b88bc1c2f80a Author: jon-chuang <9093549+jon-chuang@users.noreply.github.com> Date: Mon Jan 30 10:53:41 2023 +0800 fix(meta): Temporarily fix breakage of playground due to missing --host arg (#7572) We may rename the args again, but let's stop the breakage due to missing playground arg changes due to https://github.com/risingwavelabs/risingwave/pull/7527... Approved-By: huangjw806 commit 7b0f2d21d278013faca232d399d0366352b240e0 Author: TennyZhuang <zty0826@gmail.com> Date: Sun Jan 29 19:49:20 2023 +0800 chore: update copyright-owner to RisingWave Labs (#7570) update copyright-owner to RisingWave Labs Approved-By: yuhao-su commit 75a5caca3608cee232e9682aabc99c215de3de8b Author: jon-chuang <9093549+jon-chuang@users.noreply.github.com> Date: Sun Jan 29 17:33:20 2023 +0800 feat(frontend): Impl `ExprRewritable` for `Logical` and `generic` Nodes (#7542) Impl `ExprRewritable` for `Logical` and `generic` Nodes Approved-By: chenzl25 Co-Authored-By: jon-chuang <jon-chuang@users.noreply.github.com> Co-Authored-By: jon-chuang <9093549+jon-chuang@users.noreply.github.com> commit dae43cc8c689c9cfb5f0ada83c51a12e964f72d2 Author: zwang28 <70626450+zwang28@users.noreply.github.com> Date: Sun Jan 29 16:46:50 2023 +0800 refactor(meta): refactor CREATE_COMPACTION_GROUP_FOR_MV (#7567) When CREATE_COMPACTION_GROUP_FOR_MV is true: - Before this PR, all tables of a stream job will be assigned their own new compaction groups. This is unnecessary, especially for small tables. - With this PR, all tables of a stream job will be assigned one same new compaction group. 
Approved-By: soundOfDestiny Co-Authored-By: zwang28 <84491488@qq.com> Co-Authored-By: zwang28 <70626450+zwang28@users.noreply.github.com> commit 808bc59fe9fdba6fe8c84ba7f5bb2a2cd9883501 Author: waruto <wmc314@outlook.com> Date: Sun Jan 29 16:03:33 2023 +0800 feat(source): support match_pattern for s3 enumerator (#7565) Approved-By: tabVersion commit fb09aa9003f26dd0d7897597bb03b1f2fc651fcf Author: ZENOTME <43447882+ZENOTME@users.noreply.github.com> Date: Sun Jan 29 13:42:46 2023 +0800 fix(frontend): remove alias check (#7562) In the old version, we guarantee every column with a alias name, these behaviour caused the bug #7549. ### And why we have this guarantee: At that time, function didn't have a alias, to guarantee every function column have a column name, we enforce the user must specify column alias.(A rude solution) https://github.com/risingwavelabs/risingwave/pull/6798 But now, we can derive function alias so that I think we can remove the alias check. After remove it, our behaviour will be compatiable with postgres: ``` test_db=> create table t1 as select 1; SELECT 1 test_db=> select * from t1; ?column? ---------- 1 (1 row) test_db=> create table t as select 1,2,3; ERROR: column "?column?" specified more than once test_db=> create table t as select 1 as a , 2 as b; SELECT 1 test_db=> select * from t; a | b ---+--- 1 | 2 (1 row) ``` Approved-By: st1page commit 9896f9a800ba9a5e68345d6c21654c854180727d Author: Li0k <yuli@singularity-data.com> Date: Sat Jan 28 22:32:36 2023 +0800 refactor(storage): refactor compactor context to make code more readable (#7494) as title **This section will be used as the commit message. Please do not leave this empty!** Please explain **IN DETAIL** what the changes are in this PR and why they are needed: - Summarize your change (**mandatory**) - How does this PR work? 
Need a brief introduction for the changed logic (optional) - Describe clearly one logical change and avoid lazy messages (optional) - Describe any limitations of the current code (optional) Approved-By: Little-Wallace commit 3a9f418e813705a8a64ac884c72d861dc39d8a9b Author: Bugen Zhao <i@bugenzhao.com> Date: Sat Jan 28 17:23:18 2023 +0800 fix(test): handle `RETURNING` correctly in recovery test (#7483) We also need to match `Record::Query` for DMLs. In this PR, we also - skip `FLUSH` as we always enable `IMPLICIT_FLUSH`, - only forward logs to the file and print the test case file name in the console. After #7470 gets merged, we should be able to enable `dml_returning.slt.part`. Approved-By: wangrunji0408 commit 3767daf647bdfb9272dc784c032003bb9abada76 Author: TennyZhuang <zty0826@gmail.com> Date: Sat Jan 28 16:57:32 2023 +0800 feat(binder): hint similar candidates when function is unknown (#7521) When a known function is called, we'll try to find a similar builtin scalar function and hint it in the error message. <img width="1286" alt="image" src="https://user-images.githubusercontent.com/9161438/214260809-26b3a321-393b-45ef-a42c-a14d8a0a080b.png"> Approved-By: xxchan Approved-By: BowenXiao1999 commit 6a1be40ccc3d5cf203080395c66b9364fd9312a7 Author: TennyZhuang <zty0826@gmail.com> Date: Sat Jan 28 16:17:27 2023 +0800 refactor(stream): introduce `Successor` to refine `SortBuffer` (#7550) 1. introduce `Successor` trait 2. refactor `SortBuffer` to use `Successor` Approved-By: st1page Approved-By: BugenZhao commit 5e860b0264c1e157dbfdf0432e9cc8080d53b33d Author: jon-chuang <9093549+jon-chuang@users.noreply.github.com> Date: Sat Jan 28 15:50:04 2023 +0800 fix(frontend): Output indices for general unnesting (#7507) We did not reorder join output columns based on join type and which side the apply is translated to, so we fix this. We also fix some previous bugs. 
Approved-By: chenzl25 Co-Authored-By: jon-chuang <jon-chuang@users.noreply.github.com> Co-Authored-By: jon-chuang <9093549+jon-chuang@users.noreply.github.com> commit 6c34da7dc9c6e2c1af2ab534c857c3109de6c829 Author: Bugen Zhao <i@bugenzhao.com> Date: Sat Jan 28 15:19:25 2023 +0800 refactor: encapsulated vnode mapping struct (#7505) This PR introduces an encapsulated structure for `VnodeMapping`, which uses the compressed format which can be efficiently converted into/from the protobuf representation, and can be indexed with logarithmic complexity. See `src/common/src/hash/consistent_hash/mapping.rs` for more details. This PR also distinguishes `pb::ParallelUnitMapping` from `pb::FragmentParallelUnitMapping`. The notification requires a single message which contains the key and value, so we previously put the fragment ID into the `pb::ParallelUnitMapping`, while the `pb::Fragment.vnode_mapping` also contains the fragment ID which makes no sense. Approved-By: xx01cyx Approved-By: yezizp2012 commit 60ace2420e606de0755774e4f1824e9095b77254 Author: Yuhao Su <31772373+yuhao-su@users.noreply.github.com> Date: Sat Jan 28 12:49:40 2023 +0800 feat: dedup input pk in join key (#7557) The input pk may contains the same key from the join key. In this case we can remove those key from input pk. Approved-By: st1page commit 0a809289c7d74a622d64697d75fcbfcfac750fb9 Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Date: Sat Jan 28 09:55:37 2023 +0800 chore(deps): bump ua-parser-js from 0.7.31 to 0.7.33 in /dashboard (#7553) Bumps [ua-parser-js](https://github.com/faisalman/ua-parser-js) from 0.7.31 to 0.7.33. - [Release notes](https://github.com/faisalman/ua-parser-js/releases) - [Changelog](https://github.com/faisalman/ua-parser-js/blob/master/changelog.md) - [Commits](https://github.com/faisalman/ua-parser-js/compare/0.7.31...0.7.33) --- updated-dependencies: - dependency-name: ua-parser-js dependency-type: indirect ... 
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com> commit 7c2e0891029a8379dab79bdecd43359f367dc640 Author: xxchan <xxchan22f@gmail.com> Date: Sat Jan 28 02:52:26 2023 +0100 chore: use cargo-binstall to install tools (#7554) to make it faster Approved-By: TennyZhuang commit b410eb2d91881cbaa6260a20ee6400bee979dfd1 Author: Shuxian Wang <wsx@berkeley.edu> Date: Fri Jan 27 02:42:51 2023 -0800 feat(frontend): Support for shared views. (#7432) Non materialized views are now labeled as `LogicalShare` nodes in the `Binder`. Support the "Share view" task in https://github.com/risingwavelabs/risingwave/issues/6955#issue-1502542081. Approved-By: chenzl25 Co-Authored-By: Shuxian Wang <wsx@berkeley.edu> Co-Authored-By: Dylan <chenzl25@mail2.sysu.edu.cn> commit 849ffb803a1977db1b54534ada1b256ed9d4b392 Author: jon-chuang <9093549+jon-chuang@users.noreply.github.com> Date: Fri Jan 27 13:09:08 2023 +0800 feat(e2e-testing): More tests for `TemporalFilter` (#7547) More tests Approved-By: soundOfDestiny commit 22159ec50787e8f8c531ef09366538e080f8f0ba Author: Noel Kwan <47273164+kwannoel@users.noreply.github.com> Date: Fri Jan 27 11:01:27 2023 +0800 feat(sqlsmith): generate dynamic boolean expressions (#7529) - Generate `InList` - Generate `InSubquery` - Generate `FILTER`, `ORDER BY` clauses for aggregate functions. - Enable generating negative numbers. - Improve error logs for round trip parsing. - Cleanup some `TODOs`. Approved-By: lmatz Co-Authored-By: Noel Kwan <noelkwan1998@gmail.com> Co-Authored-By: Noel Kwan <47273164+kwannoel@users.noreply.github.com> commit b7ebdd17e18d99f0aafe0bd39d25c4050c6d5752 Author: Kexiang Wang <kx.wang@hotmail.com> Date: Thu Jan 26 11:52:03 2023 -0500 fix(metrics): fix cache miss when insert (#7536) fix the `cache miss when insert`'s legend of grafana panel. 
Approved-By: yufansong commit 2bdab894e4f3fcecd3287b703926c6f21146a7e0 Author: jon-chuang <9093549+jon-chuang@users.noreply.github.com> Date: Thu Jan 26 23:58:36 2023 +0800 fix(frontend): `now` fragment should be singleton (#7544) `now` fragment was not singleton Approved-By: soundOfDestiny Co-Authored-By: jon-chuang <jon-chuang@users.noreply.github.com> Co-Authored-By: TennyZhuang <zty0826@gmail.com> commit ee9a41c0d70f02af0dd741aecf002fd9e7a0c805 Author: xxchan <xxchan22f@gmail.com> Date: Thu Jan 26 16:32:53 2023 +0100 ci: remove mergify priority (#7545) commit e88604f34b60e39a383ca2fdaed71a5c3cd8a582 Author: jon-chuang <9093549+jon-chuang@users.noreply.github.com> Date: Thu Jan 26 17:49:48 2023 +0800 feat(service-params): add comments for `--state-store` params for `compute-node`, `compactor-node` (#7537) Make it known to the users what are valid params Approved-By: CAJan93 Co-Authored-By: jon-chuang <jon-chuang@users.noreply.github.com> Co-Authored-By: jon-chuang <9093549+jon-chuang@users.noreply.github.com> commit 9a96b27fd8b58a4183a3170507fee293f4df412d Author: Bohan Zhang <tabvision@bupt.icu> Date: Thu Jan 26 15:46:40 2023 +0800 refactor: extract common abstractions for each connector (#7528) Approved-By: waruto210 commit d4193889af95cfbceeb5232f60ef96134fd567da Author: Bohan Zhang <tabvision@bupt.icu> Date: Thu Jan 26 15:15:41 2023 +0800 fix: move install mysql-client to ci image (#7539) fix main aca70301a066b29f133a0ca205bb85e89a237a1c Approved-By: jon-chuang commit aca70301a066b29f133a0ca205bb85e89a237a1c Author: Kexiang Wang <kx.wang@hotmail.com> Date: Wed Jan 25 22:44:12 2023 -0500 fix(metrics): fix bloom filter count (#7535) 1. The current `read_req_bloom_filter_positive_counts` metric is actually recording the counts of `read_req_bloom_filter_true_positive_counts`. 2. `state_store_read_req_positive_but_non_exist_counts` is named as "true positive" This PR fixed them. 
Approved-By: TennyZhuang commit 71f7ce71e643b05b934d123b70fbcf3501c059e4 Author: CAJan93 <jan.mensch@gmx.net> Date: Wed Jan 25 15:10:55 2023 +0100 fix(chore): Echo correct setup during sim tests (#7532) We are using 2 frontend notes by default and not 1 (see [simulation tests args](https://github.com/risingwavelabs/risingwave/blob/8d99f7aedf86d7a3a43eed075c163afee6d1298e/src/tests/simulation/src/main.rs#L43)). Updating echo to reflect that. Approved-By: wangrunji0408 commit 8d99f7aedf86d7a3a43eed075c163afee6d1298e Author: lmatz <lmatz823@gmail.com> Date: Wed Jan 25 19:14:06 2023 +0800 feat(parser): allow one extra comma at the end of With clause (#7513) As per the title, #7267, not in the standard, but RW's column definition in the `create` statement also allows it. https://github.com/risingwavelabs/risingwave/blob/main/src/sqlparser/src/parser.rs#L1891 It works more smoothly for commenting out the last line sometimes. Approved-By: tabVersion commit 511022c3e5b4ea9da4122036ea102f016ee37b34 Author: Bohan Zhang <tabvision@bupt.icu> Date: Wed Jan 25 18:49:22 2023 +0800 feat: allow non txn kafka sink (#7500) as title, add an option to allow non-txn when writing kafka Approved-By: lmatz commit e38983574792aa9924a17e71f5eca0ff430eab26 Author: jon-chuang <9093549+jon-chuang@users.noreply.github.com> Date: Wed Jan 25 15:00:51 2023 +0800 fix(meta): Meta node should be identified by its `meta_endpoint` parameter for cluster membership (#7527) Meta node should be identified by its `meta_endpoint` parameter. Previously, this defaults to the `listen_addr`. The purpose of `meta_endpoint` is to serve as a unique identifier for cluster membership and for leadership election and connecting to leader. 
Approved-By: shanicky commit d6182f8970245f09e444395482273910770a54ab Author: Bohan Zhang <tabvision@bupt.icu> Date: Wed Jan 25 14:00:52 2023 +0800 chore: remove explain create source (#7525) as title Approved-By: jon-chuang commit d59148bc4bae9405621238f015887451ce334d41 Author: Yuanxin Cao <60498509+xx01cyx@users.noreply.github.com> Date: Wed Jan 25 11:50:30 2023 +0800 fix(parser): fix error message for create table with connector (#7524) As title. Approved-By: tabVersion commit 3f75c4994f24009e1b14bf3eb21559771a53d540 Author: Bugen Zhao <i@bugenzhao.com> Date: Tue Jan 24 18:15:51 2023 +0800 fix(ci): check whether the playground can start in docker build (#7520) Correctly check whether the playground can start with `playground` target. Will sleep for 10 secs then check the status. * Previous failure: https://buildkite.com/risingwavelabs/docker/builds/13298 * Current PR: https://buildkite.com/risingwavelabs/docker/builds/13301 This PR also - remove unused SIMD-related stuff as we do not support non-SIMD after https://github.com/risingwavelabs/risingwave/pull/7259, - change `println` to `tracing` in `resource_util`. Approved-By: xxchan commit 96888514c365a29529fbe35dde37ab19dbdb9703 Author: xxchan <xxchan22f@gmail.com> Date: Tue Jan 24 09:11:39 2023 +0100 fix: remove outdated comment (#7517) Approved-By: xx01cyx commit 2c061ea397bc5a30a68ac8911aefb99f7a44435d Author: Tesla Zhang <ice1000kotlin@foxmail.com> Date: Sun Jan 22 19:26:15 2023 -0500 chore(memory_management): move to local imports (#7509) * fmt * Remove commit f3ec73dcf13d9345e15bcbc37001654e6b591392 Author: TennyZhuang <zty0826@gmail.com> Date: Sun Jan 22 23:46:12 2023 +0800 ci: upgrade actions/checkout to v3 (#7511) upgrade actions/checkout to v3 > ANNOTATIONS > ! Node.js 12 actions are deprecated. Please update the following actions to use Node.js 16: actions/checkout@v2. 
For more information see: https://github.blog/changelog/2022-09-22-github-actions-all-actions-will-begin-running-on-node16-instead-of-node12/. Approved-By: yuhao-su commit 53a9940410eda92efb399aedb900ac91c3070a60 Author: stonepage <40830455+st1page@users.noreply.github.com> Date: Fri Jan 20 17:45:44 2023 +0800 refactor(frontend): refine handle create table (#7503) - remove `DMLFlag` which is just the workaround for materialzied source - return error when user want create a table with pk constrain and append_only property at the same time Approved-By: BugenZhao commit da83a65a22cc25cdc44f81c318fddf211d3bbb9c Author: jon-chuang <9093549+jon-chuang@users.noreply.github.com> Date: Fri Jan 20 15:32:32 2023 +0800 feat(frontend): Reorder cmp expressions and require `now()` lower bound for `TemporalFilter` (#7497) Reorder cmp expressions and require now lower bound for `TemporalFilter` Approved-By: soundOfDestiny Co-Authored-By: jon-chuang <jon-chuang@users.noreply.github.com> Co-Authored-By: Liang Zhao <liang@singularity-data.com> commit 2e28fd39462ad56ff4676224db4b51246228fe0b Author: Yuanxin Cao <60498509+xx01cyx@users.noreply.github.com> Date: Fri Jan 20 13:52:19 2023 +0800 chore: rename `TableSource` to `TableDmlHandle` (#7496) As title. Approved-By: TennyZhuang Approved-By: st1page commit 915b3932741417de1540d218f5700b0ec8d93f23 Author: jon-chuang <9093549+jon-chuang@users.noreply.github.com> Date: Fri Jan 20 10:37:45 2023 +0800 feat(frontend): Refactor `TemporalFilter` derivation (#7374) Refactor temporal filter derivation into logical-phase transformation of filter with now into a left semi-join, later to be derived into dynamicfilter, introduce `LogicalNow` We also fix an uncaught bug I think - previously, `StreamNow` would not derive broadcast in the plan. 
Approved-By: st1page Co-Authored-By: jon-chuang <jon-chuang@users.noreply.github.com> Co-Authored-By: Liang Zhao <liang@singularity-data.com> commit e1ec750e34e8b09142607ed84ef8d84bbe064f98 Author: jon-chuang <9093549+jon-chuang@users.noreply.github.com> Date: Fri Jan 20 10:02:22 2023 +0800 feat(frontend): Remove duplicated expressions in `PushCalculationOfJoinRule` (#7400) Previously, unnecessary expressions are duplicated in `PushCalculationOfJoinRule`, in a way that is not possible to prune. (DynamicFilter case was actually not affected as the RHS col would be pruned away regardless, but we manage to fix some other cases). Approved-By: st1page commit 74e5d151c30327ab200b827f40cd038d3f4de7f8 Author: waruto <wmc314@outlook.com> Date: Fri Jan 20 00:14:54 2023 +0800 refactor(source): refactor some code of source (#7493) **This section will be used as the commit message. Please do not leave this empty!** - refactor `FsSourceExecutor`, make it more unified with `SourceExecutor`. - add `SourceMetric` statistics for s3 reader. - rename some V2 structs - refine some code Approved-By: TennyZhuang Approved-By: tabVersion commit 4f8e7cf74b9218f7eef737270b660217248dd685 Author: TennyZhuang <zty0826@gmail.com> Date: Thu Jan 19 23:42:11 2023 +0800 chore(pr-template): use html comments in main section (#7495) We'll extract the content of the main section as the commit message, so if the PR author doesn't remove the hints, the commit message will be very verbose and confusing. [Mergify's doc](https://docs.mergify.com/configuration/#template) said that they would strip the HTML comments in the section, so it's better to use comments here. > By default, the HTML comments are stripped from body. To get the full body, you can use the body_raw attribute. 
Approved-By: lmatz Approved-By: BugenZhao commit c6ee30d7e2ce5878ee1d65f8bf9b96c30168e46e Author: congyi wang <58715567+wcy-fdu@users.noreply.github.com> Date: Thu Jan 19 17:58:16 2023 +0800 chore(test): remove some outdated tests in relational table (#7488) **This section will be used as the commit message. Please do not leave this empty!** These tests are duplicated and outdated, their test logic is already covered by others. It's hard to move `test_storage_table.rs` outside streaming crate as it need to write by StateTable. Approved-By: st1page Approved-By: hzxa21 commit de6076696f05d0a8ab386b08013848c33188d10b Author: Bohan Zhang <tabvision@bupt.icu> Date: Thu Jan 19 17:23:41 2023 +0800 chore: add netcat dep in docker (#7485) **This section will be used as the commit message. Please do not leave this empty!** as title Approved-By: BugenZhao Approved-By: TennyZhuang Co-Authored-By: tabVersion <tabvision@bupt.icu> Co-Authored-By: TennyZhuang <zty0826@gmail.com> commit f2eb3172e0d9bdb65eb41bcde6874b90244173c6 Author: TennyZhuang <zty0826@gmail.com> Date: Thu Jan 19 16:54:05 2023 +0800 chore(deps): fix hakari generated toml (#7487) Signed-off-by: TennyZhuang <zty0826@gmail.com> Signed-off-by: TennyZhuang <zty0826@gmail.com> commit 93dc5f89d3017f903a59a721b8c251a0705efd24 Author: TennyZhuang <zty0826@gmail.com> Date: Thu Jan 19 16:16:38 2023 +0800 chore(deps): fix lock conflicts (#7486) Signed-off-by: TennyZhuang <zty0826@gmail.com> Signed-off-by: TennyZhuang <zty0826@gmail.com> commit 51a1e7479c220a68167d3271df1e942c6f604d96 Author: August <pin@singularity-data.com> Date: Thu Jan 19 16:10:27 2023 +0800 chore: decrease test num of recovery test in main workflow (#7484) chore: decrease test num of recovery test for main workflow Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com> commit d7d7bbe18845f81ba86a4399bb22345ea5ee2acd Author: TennyZhuang <zty0826@gmail.com> Date: Thu Jan 19 16:09:26 2023 +0800 chore(deps): bump deps to fix several 
security issues (#7475) * build(deps): bump deps to fix several security issues Signed-off-by: TennyZhuang <zty0826@gmail.com> * haruki Signed-off-by: TennyZhuang <zty0826@gmail.com> * cargo sort Signed-off-by: TennyZhuang <zty0826@gmail.com> * fix hakari Signed-off-by: TennyZhuang <zty0826@gmail.com> Signed-off-by: TennyZhuang <zty0826@gmail.com> commit e0a1cb5443a9afc6e6aff6026a5505162efd75ca Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Date: Thu Jan 19 16:08:41 2023 +0800 chore(deps): bump checkstyle from 8.14 to 8.29 in /src/java_binding/java (#7476) Bumps [checkstyle](https://github.com/checkstyle/checkstyle) from 8.14 to 8.29. - [Release notes](https://github.com/checkstyle/checkstyle/releases) - [Commits](https://github.com/checkstyle/checkstyle/compare/checkstyle-8.14...checkstyle-8.29) --- updated-dependencies: - dependency-name: com.puppycrawl.tools:checkstyle dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com> commit 1194333b9ddf657301a9eebb851ce2ef03aab15a Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Date: Thu Jan 19 16:08:23 2023 +0800 chore(deps): bump d3-color and recharts in /dashboard (#7477) Bumps [d3-color](https://github.com/d3/d3-color) to 3.1.0 and updates ancestor dependency [recharts](https://github.com/recharts/recharts). These dependencies need to be updated together. 
Updates `d3-color` from 2.0.0 to 3.1.0 - [Release notes](https://github.com/d3/d3-color/releases) - [Commits](https://github.com/d3/d3-color/compare/v2.0.0...v3.1.0) Updates `recharts` from 2.1.16 to 2.3.2 - [Release notes](https://github.com/recharts/recharts/releases) - [Changelog](https://github.com/recharts/recharts/blob/master/CHANGELOG.md) - [Commits](https://github.com/recharts/recharts/compare/v2.1.16...v2.3.2) --- updated-dependencies: - dependency-name: d3-color dependency-type: indirect - dependency-name: recharts dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com> commit f96fd80bb7f691ec4f607e828dd908964eeb1505 Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Date: Thu Jan 19 16:08:08 2023 +0800 chore(deps-dev): bump junit from 4.11 to 4.13.1 in /src/java_binding/java/java-binding (#7478) chore(deps-dev): bump junit in /src/java_binding/java/java-binding Bumps [junit](https://github.com/junit-team/junit4) from 4.11 to 4.13.1. - [Release notes](https://github.com/junit-team/junit4/releases) - [Changelog](https://github.com/junit-team/junit4/blob/main/doc/ReleaseNotes4.11.md) - [Commits](https://github.com/junit-team/junit4/compare/r4.11...r4.13.1) --- updated-dependencies: - dependency-name: junit:junit dependency-type: direct:development ... 
Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com> commit 49a98331b7dc0ade31af02e67c86752d08e329bf Author: Runji Wang <wangrunji0408@163.com> Date: Thu Jan 19 16:07:53 2023 +0800 chore(deps): switch arrow to crates.io version (#7480) * fix loading `RUST_LOG` in simulation Signed-off-by: Runji Wang <wangrunji0408@163.com> * switch arrow to crates.io version Signed-off-by: Runji Wang <wangrunji0408@163.com> Signed-off-by: Runji Wang <wangrunji0408@163.com> commit 1a1522f6a13de7f2176b1bdc879a1734798702d9 Author: Bugen Zhao <i@bugenzhao.com> Date: Thu Jan 19 14:55:11 2023 +0800 fix(pgwire): run callback before responsing complete (#7470) This PR adds a new field of `callback` in the pgwire, which will be called after the `values_stream` is finished and before sending `ResponseComplete`. This is used for... - do `FLUSH` if implicit flush is set - record metrics for total query latency I've not found a better way to achieve this than manually calling the callback, some other attempts: - Put the execution of the callback in the stream. However, we don't even poll the stream, which is assumed to be empty, for some statement types, so we may fail to execute the callback. - `spawn` and execute it in `Drop` of `PgResponse`. This is asynchronous and it's possible that a new query comes before the callback is executed, so `flush` will be problematic. 
~~This PR also enables the `dml_returning.slt` test.~~ Approved-By: TennyZhuang Approved-By: BowenXiao1999 Approved-By: ZENOTME Approved-By: Gun9niR commit 6ab02a3394d9d6ccecd16871ecf50e421bbd88ac Author: TennyZhuang <zty0826@gmail.com> Date: Thu Jan 19 14:29:42 2023 +0800 build(*): bump toolchain to 2023-01-18 (#7482) * build(toolchain): bump toolchain to 2023-01-18 Signed-off-by: TennyZhuang <zty0826@gmail.com> * revert deps Signed-off-by: TennyZhuang <zty0826@gmail.com> * update futures-async-stream Signed-off-by: TennyZhuang <zty0826@gmail.com> Signed-off-by: TennyZhuang <zty0826@gmail.com> commit 8de8c1d22b23b8a0f059154d845f920c761a4014 Author: Yuanxin Cao <60498509+xx01cyx@users.noreply.github.com> Date: Thu Jan 19 12:13:05 2023 +0800 chore: rename v2 structs and methods for source (#7464) - Rename `SourceExecutorV2` to `SourceExecutor`. - Remove the old `TableSource::stream_reader` and `TableStreamReader::into_stream` methods because they are no longer used. - **Make the new `TableStreamReader::into_stream` return `StreamChunkWithState` instead of `StreamChunk`.** Approved-By: lmatz Approved-By: st1page Approved-By: tabVersion Approved-By: BugenZhao Approved-By: waruto210 Co-Authored-By: xx01cyx <caoyuanxin0531@outlook.com> Co-Authored-By: st1page <1245835950@qq.com> commit 252f72ee56ec1a215ee62e68834540747fa9c44e Author: TennyZhuang <zty0826@gmail.com> Date: Thu Jan 19 10:37:09 2023 +0800 chore(*): clear rest "singularity" in codebase (#7471) As title Approved-By: lmatz Approved-By: yezizp2012 comm…
As we add more and more functionality, the logic of creating a streaming job has become tightly coupled. As one of the most obvious examples, we pass a mutable reference to `Context` everywhere and do everything through it.

To be honest, this makes some patches really simple, as the developer only needs to add a new field to the `Context`, set it somewhere, and use it directly later without any other glue to pass this argument. However, this makes the overall readability worse, since one can't ensure that "some field is correctly set at some point in time", which in turn makes it more difficult for developers to "add new fields and set them at the right time", creating a vicious circle.

Roughly, the procedure for creating a streaming job can be separated into:
`Exchange` node into `Dispatch` and `Merge` in the node body of actors, and finally get persisted Table Fragments.

However, there are some abuses of this procedure, according to reviews of the current implementation:
Some of these problems exist for historical reasons. It's time to review them to better support future work that requires stricter planning and scheduling on the meta service, like:
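To make the `Context` coupling described above more concrete, here is a minimal sketch of the two styles. All type and function names below (`build_graph_coupled`, `FragmentGraph`, `ActorGraph`, and so on) are hypothetical and heavily simplified; they are not taken from the actual meta service code, and the real structures carry far more state.

```rust
// Hypothetical, simplified types for illustration only; these are NOT the
// actual structures used in the meta service.

/// Coupled style: one mutable bag of state, where each field is only valid
/// after some particular step has run.
#[derive(Default)]
pub struct Context {
    pub fragment_count: Option<u32>,
    pub actor_ids: Option<Vec<u32>>,
}

pub fn build_graph_coupled(ctx: &mut Context, fragments: u32) {
    ctx.fragment_count = Some(fragments);
}

pub fn assign_actors_coupled(ctx: &mut Context, parallelism: u32) -> Result<(), String> {
    // Only convention guarantees that `build_graph_coupled` ran before this;
    // the compiler cannot help, so the check has to happen at runtime.
    let fragments = ctx.fragment_count.ok_or("fragment_count is not set yet")?;
    ctx.actor_ids = Some((0..fragments * parallelism).collect());
    Ok(())
}

/// Staged style: each phase consumes the complete output of the previous one,
/// so "is this field set yet?" is answered by the type itself.
pub struct FragmentGraph {
    pub fragment_count: u32,
}

pub struct ActorGraph {
    pub actor_ids: Vec<u32>,
}

pub fn build_graph(fragments: u32) -> FragmentGraph {
    FragmentGraph { fragment_count: fragments }
}

pub fn assign_actors(graph: FragmentGraph, parallelism: u32) -> ActorGraph {
    // Taking `FragmentGraph` by value means this step cannot be called
    // before the graph has been built.
    ActorGraph {
        actor_ids: (0..graph.fragment_count * parallelism).collect(),
    }
}
```

In the coupled version, calling `assign_actors_coupled` on a fresh `Context` only fails at runtime, while in the staged version the same mistake cannot be written at all, since `assign_actors` needs a `FragmentGraph` to exist first. This is just one possible way to express the ordering; the point is that each phase receives a complete value rather than a partially filled one.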