IndexError occurred when executing python run.py --run Balsa_JOBRandSplit --local #1

Open
tyc20 opened this issue May 3, 2022 · 2 comments


tyc20 commented May 3, 2022

I tried to run this project and ran into some problems. Following the instructions in README.md, I installed the requirements and ran this command:

python run.py --run Balsa_JOBRandSplit --local

The only difference is that I'm using PostgreSQL 13.5, and I modified the default connection settings in pg_executor/pg_executor/pg_executor.py to fit my environment.

LOCAL_DSN = "dbname=imdb user=postgres"

An error occurred the first time I ran it, with the following traceback:

Traceback (most recent call last):
  File "run.py", line 2155, in <module>
    app.run(Main)
  File "/xxx/.conda/envs/balsa/lib/python3.7/site-packages/absl/app.py", line 299, in run
    _run_main(main, args)
  File "/xxx/.conda/envs/balsa/lib/python3.7/site-packages/absl/app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "run.py", line 2151, in Main
    agent.Run()
  File "run.py", line 2100, in Run
    has_timeouts = self.RunOneIter()
  File "run.py", line 1831, in RunOneIter
    model, dataset = self.Train()
  File "run.py", line 1221, in Train
    log=not train_from_scratch)
  File "run.py", line 883, in _MakeDatasetAndLoader
    skip_training_on_timeouts=p.skip_training_on_timeouts)
  File "/xxx/balsa/balsa/experience.py", line 560, in featurize
    skip_training_on_timeouts=skip_training_on_timeouts)
  File "/xxx/balsa/balsa/experience.py", line 393, in _featurize_dedup
    self.featurizer, all_subtrees)
  File "/xxx/balsa/balsa/experience.py", line 38, in TreeConvFeaturize
    plan_featurizer)
  File "/xxx/balsa/balsa/models/treeconv.py", line 268, in make_and_featurize_trees
    indexes = torch.from_numpy(_batch([_make_indexes(x) for x in trees])).long()
  File "/xxx/balsa/balsa/models/treeconv.py", line 268, in <listcomp>
    indexes = torch.from_numpy(_batch([_make_indexes(x) for x in trees])).long()
  File "/xxx/balsa/balsa/models/treeconv.py", line 218, in _make_indexes
    preorder_ids, _ = _make_preorder_ids_tree(root)
  File "/xxx/balsa/balsa/models/treeconv.py", line 197, in _make_preorder_ids_tree
    root_index=root_index + 1)
  File "/xxx/balsa/balsa/models/treeconv.py", line 197, in _make_preorder_ids_tree
    root_index=root_index + 1)
  File "/xxx/balsa/balsa/models/treeconv.py", line 197, in _make_preorder_ids_tree
    root_index=root_index + 1)
  [Previous line repeated 1 more time]
  File "/xxx/balsa/balsa/models/treeconv.py", line 199, in _make_preorder_ids_tree
    root_index=lhs_max_id + 1)
  File "/xxx/balsa/balsa/models/treeconv.py", line 197, in _make_preorder_ids_tree
    root_index=root_index + 1)
  File "/xxx/balsa/balsa/models/treeconv.py", line 199, in _make_preorder_ids_tree
    root_index=lhs_max_id + 1)
  File "/xxx/balsa/balsa/models/treeconv.py", line 198, in _make_preorder_ids_tree
    rhs, rhs_max_id = _make_preorder_ids_tree(curr.children[1],
IndexError: list index out of range

I used print() to inspect curr and found that the loaded experience contains a Bitmap Heap Scan node, which has only one child.

print(curr, curr.children, root_index)
# Bitmap Heap Scan [movie_keyword AS mk] cost=1131.96
#  Bitmap Index Scan [movie_keyword AS mk] cost=6.74
# [Bitmap Index Scan [movie_keyword AS mk] cost=6.74
# ] 10
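The first traceback fails at curr.children[1], which is consistent with a recursion that expects every non-leaf node to have exactly two children. A minimal sketch of a preorder numbering that treats any node with fewer than two children as a leaf follows. The Node class and the names is_leaf_like and preorder_ids are hypothetical and written only for illustration; this is not Balsa's code.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for a plan node; not Balsa's class."""
    node_type: str
    children: List["Node"] = field(default_factory=list)


def is_leaf_like(node: Node) -> bool:
    # A scan with a single child (e.g. Bitmap Heap Scan over a
    # Bitmap Index Scan) is treated as a leaf, like a childless node.
    return len(node.children) < 2


def preorder_ids(node: Node, root_index: int = 1):
    """Return ((id, lhs, rhs), max_id) using preorder numbering.

    Leaves are encoded as (id, 0, 0).
    """
    if is_leaf_like(node):
        return (root_index, 0, 0), root_index
    lhs, lhs_max_id = preorder_ids(node.children[0], root_index + 1)
    rhs, rhs_max_id = preorder_ids(node.children[1], lhs_max_id + 1)
    return (root_index, lhs, rhs), rhs_max_id


bitmap = Node("Bitmap Heap Scan", [Node("Bitmap Index Scan")])
plan = Node("Hash Join", [bitmap, Node("Seq Scan")])
ids, max_id = preorder_ids(plan)
print(ids, max_id)  # (1, (2, 0, 0), (3, 0, 0)) 3
```

A guard like this only avoids the out-of-range index. The node's type still has to be one that the featurizer knows about.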

I worked around the issue by treating the Bitmap Heap Scan node as a leaf node, but I hit another error when I restarted run.py.

Traceback (most recent call last):
  File "run.py", line 2155, in <module>
    app.run(Main)
  File "/xxx/.conda/envs/balsa/lib/python3.7/site-packages/absl/app.py", line 299, in run
    _run_main(main, args)
  File "/xxx/.conda/envs/balsa/lib/python3.7/site-packages/absl/app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "run.py", line 2151, in Main
    agent.Run()
  File "run.py", line 2100, in Run
    has_timeouts = self.RunOneIter()
  File "run.py", line 1831, in RunOneIter
    model, dataset = self.Train()
  File "run.py", line 1221, in Train
    log=not train_from_scratch)
  File "run.py", line 883, in _MakeDatasetAndLoader
    skip_training_on_timeouts=p.skip_training_on_timeouts)
  File "/xxx/balsa/balsa/experience.py", line 560, in featurize
    skip_training_on_timeouts=skip_training_on_timeouts)
  File "/xxx/balsa/balsa/experience.py", line 393, in _featurize_dedup
    self.featurizer, all_subtrees)
  File "/xxx/balsa/balsa/experience.py", line 38, in TreeConvFeaturize
    plan_featurizer)
  File "/xxx/balsa/balsa/models/treeconv.py", line 269, in make_and_featurize_trees
    _batch([_featurize_tree(x, node_featurizer) for x in trees
  File "/xxx/balsa/balsa/models/treeconv.py", line 269, in <listcomp>
    _batch([_featurize_tree(x, node_featurizer) for x in trees
  File "/xxx/balsa/balsa/models/treeconv.py", line 255, in _featurize_tree
    _bottom_up(curr_node)
  File "/xxx/balsa/balsa/models/treeconv.py", line 249, in _bottom_up
    left_vec = _bottom_up(curr.children[0])
  File "/xxx/balsa/balsa/models/treeconv.py", line 249, in _bottom_up
    left_vec = _bottom_up(curr.children[0])
  File "/xxx/balsa/balsa/models/treeconv.py", line 249, in _bottom_up
    left_vec = _bottom_up(curr.children[0])
  [Previous line repeated 1 more time]
  File "/xxx/balsa/balsa/models/treeconv.py", line 250, in _bottom_up
    right_vec = _bottom_up(curr.children[1])
  File "/xxx/balsa/balsa/models/treeconv.py", line 249, in _bottom_up
    left_vec = _bottom_up(curr.children[0])
  File "/xxx/balsa/balsa/models/treeconv.py", line 250, in _bottom_up
    right_vec = _bottom_up(curr.children[1])
  File "/xxx/balsa/balsa/models/treeconv.py", line 249, in _bottom_up
    left_vec = _bottom_up(curr.children[0])
  File "/xxx/balsa/balsa/models/treeconv.py", line 246, in _bottom_up
    vec = node_featurizer.FeaturizeLeaf(curr)
  File "/xxx/balsa/balsa/util/plans_lib.py", line 726, in FeaturizeLeaf
    scan_operator_idx = np.where(self.scan_ops == node.node_type)[0][0]
IndexError: index 0 is out of bounds for axis 0 with size 0

This error appears to be caused by scan methods in the loaded plans that are not included in the search_space_scan_ops parameter. In BalsaAgent._MakeWorkload() in run.py, the JOB queries are loaded and their PostgreSQL plans are obtained through explain (costs, format json). The loaded train_nodes are then used to initialize the experience set (run.py, line 826).
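The second traceback ends in np.where(self.scan_ops == node.node_type)[0][0]. When the node type is absent from the array, np.where returns an empty index array and the trailing [0] raises the IndexError shown. Below is a minimal sketch of the same lookup with an explicit check. The contents of scan_ops and the helper name scan_op_index are assumptions made for illustration, not values or functions taken from Balsa.

```python
import numpy as np

# Assumed search space, for illustration only.
scan_ops = np.array(["Index Scan", "Seq Scan"])


def scan_op_index(scan_ops: np.ndarray, node_type: str) -> int:
    """Return the position of node_type in scan_ops.

    Raises ValueError with a readable message instead of the bare
    IndexError produced by indexing an empty match array.
    """
    matches = np.where(scan_ops == node_type)[0]
    if matches.size == 0:
        raise ValueError(
            f"scan operator {node_type!r} is not in the search space "
            f"{scan_ops.tolist()}")
    return int(matches[0])


print(scan_op_index(scan_ops, "Seq Scan"))  # 1

try:
    scan_op_index(scan_ops, "Bitmap Heap Scan")
except ValueError as e:
    print(e)
```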

Besides, I'm also confused about the procedure. As is introduced in the paper, Balsa bootstraps from a simulator and never uses an expert optimizer. So is it OK to just replace train_nodes with an empty list? And if I'm going to use a different split of train/test dataset, how can I train the model with the simulator to get a checkpoint?

@concretevitamin
Contributor

Overall, we recommend using the tested Postgres version to get Balsa running first.

The only difference is I'm using PostgreSQL 13.5

Please first try Balsa out on our recommended and tested version, v12.5. One reason is that the PG version may also interact with the pg_hint_plan extension.

On the Bitmap Heap/Index error, it's because

  • Your server is not using our balsa-postgresql.conf, which disables bitmap scans.
  • There are a few places in code where we configure the default scan and join ops, e.g., here and here.

Can you use the above conf file and see if the problem goes away?
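To confirm what a running server actually uses, the standard PostgreSQL statements SHOW config_file; and SHOW enable_bitmapscan; can be run in psql. For checking a conf file offline, here is a rough sketch of a parser. It assumes that bitmap scans are disabled through the standard enable_bitmapscan setting. The sample text is an illustrative excerpt, not the contents of balsa-postgresql.conf, and the helper names are hypothetical.

```python
import re
from typing import Optional


def read_setting(conf_text: str, name: str) -> Optional[str]:
    """Return the last value assigned to `name`, or None if unset.

    Simplification: everything after '#' is treated as a comment, so
    quoted values that contain '#' are not handled.
    """
    value = None
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        # postgresql.conf allows "name = value" and "name value".
        m = re.match(r"^([A-Za-z_][\w.]*)\s*=?\s*(.*)$", line)
        if m and m.group(1).lower() == name.lower():
            value = m.group(2).strip().strip("'")
    return value


def bitmap_scans_disabled(conf_text: str) -> bool:
    value = read_setting(conf_text, "enable_bitmapscan")
    return value is not None and value.lower() in {"off", "false", "no", "0"}


# Illustrative excerpt only.
conf = """
# planner settings
enable_bitmapscan = off   # disable bitmap scans
shared_buffers = '4GB'
"""

print(read_setting(conf, "enable_bitmapscan"))  # off
print(bitmap_scans_disabled(conf))              # True
```

A conf file only takes effect if it is the one reported by SHOW config_file; and the server has been restarted or reloaded since it was changed.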

Besides, I'm also confused about the procedure. As is introduced in the paper, Balsa bootstraps from a simulator and never uses an expert optimizer. So is it OK to just replace train_nodes with an empty list?

As the paper states, Balsa does not learn from (train on) those experience nodes. This is done in code here and with p.skip_training_on_expert defaulting to True.

Those nodes are included in the buffer only for implementation convenience, such as logging the expert's performance (here), gathering query filters and selectivities (here), extracting all possible join/scan types and relation names (here -> here), etc. It should be possible to change the code to make it an empty list -- albeit involved -- after finding alternative ways to specify the above.

And if I'm going to use a different split of train/test dataset, how can I train the model with the simulator to get a checkpoint?

You can set p.sim_checkpoint to None. You can register a new Params subclass to do so or see here for quickly trying it out.
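The subclass approach can be sketched generically as below. Everything in the block is a hypothetical stand-in written for illustration: REGISTRY, register, BaseExperiment, MySplitExperiment and the placeholder checkpoint path are not Balsa's API, so check experiments.py for the real registration decorator and base classes. Only the attribute name sim_checkpoint comes from the comment above.

```python
from types import SimpleNamespace

# Hypothetical registry mapping experiment names to classes.
REGISTRY = {}


def register(cls):
    """Hypothetical decorator: make a config class selectable by name."""
    REGISTRY[cls.__name__] = cls
    return cls


@register
class BaseExperiment:
    """Stand-in for an existing experiment config."""

    def Params(self):
        # Placeholder path, not a real checkpoint location.
        return SimpleNamespace(sim_checkpoint="path/to/sim.ckpt")


@register
class MySplitExperiment(BaseExperiment):
    """Same config, but without a pre-trained simulation checkpoint."""

    def Params(self):
        p = super().Params()
        p.sim_checkpoint = None  # do not load a saved checkpoint
        return p


params = REGISTRY["MySplitExperiment"]().Params()
print(params.sim_checkpoint)  # None
```

Overriding a single field in a subclass leaves the base configuration untouched, so the original experiment can still be run for comparison.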

@yqin43

yqin43 commented Oct 10, 2023

Hi, I am also interested in your code and got the same error when running python run.py --run Balsa_JOBRandSplit --local.

I have checked that my PostgreSQL version is 12.5,

psql --version
psql (PostgreSQL) 12.5

and that the config file in use is balsa-postgresql.conf, which I copied with cp ~/balsa/conf/balsa-postgresql.conf ~/imdb/postgresql.conf

and that balsa/experiments.py and balsa/balsa/envs/envs.py are imported. (through import experiments and from balsa.envs import envs in run.py)

I wonder if this problem has been resolved or how it was resolved in the end. I would be very grateful for any assistance or replies. Thanks and regards!
