Removing quotes from udf_metadata_key #1026

hershd23 · 2023-09-01T16:25:39Z

Replacing udf_metadata_key from string_literal to ID_LITERAL
modified: evadb/parser/evadb.lark
Removed .value from key_value_pair[0] post the change in type
modified: evadb/parser/lark_visitor/_functions.py
Replaced string key to ID_LITERAL in test query
modified: test/unit_tests/parser/test_parser.py

Solves #1010
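
For readers skimming the change, here is a minimal sketch of what it means for a parser, assuming a toy regex tokenizer rather than the real Lark grammar in evadb/parser/evadb.lark. The names `TOKEN_RE`, `tokenize`, and `parse_metadata` are hypothetical and exist only for this illustration. The sketch also assumes every value is a quoted string. It shows the key accepted only as an unquoted identifier, while the value keeps its quotes.

```python
import re

# Hypothetical token patterns, loosely modelled on the ID_LITERAL and
# STRING_LITERAL terminals named in this PR; not the real evadb.lark rules.
TOKEN_RE = re.compile(
    r"\s*(?:(?P<ID_LITERAL>[A-Za-z_][A-Za-z0-9_]*)|(?P<STRING_LITERAL>'[^']*'))"
)


def tokenize(text):
    """Split a metadata clause into (token_type, token_text) pairs."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"Unexpected character at column {pos}: {text[pos]!r}")
        tokens.append((match.lastgroup, match.group(match.lastgroup)))
        pos = match.end()
    return tokens


def parse_metadata(text):
    """Parse `KEY 'value' KEY 'value' ...` into a list of (key, value) pairs."""
    tokens = tokenize(text)
    if len(tokens) % 2 != 0:
        raise ValueError("Expected an even number of tokens (key/value pairs)")
    pairs = []
    for (key_type, key), (value_type, value) in zip(tokens[0::2], tokens[1::2]):
        if key_type != "ID_LITERAL":
            # Mirrors the new rule in spirit: a quoted key is rejected.
            raise ValueError(f"Unexpected token {key_type} {key}: key must be unquoted")
        if value_type != "STRING_LITERAL":
            raise ValueError(f"Unexpected token {value_type} {value}: value must be quoted")
        pairs.append((key, value[1:-1]))  # strip the quotes from the value only
    return pairs
```

With this sketch, `parse_metadata("CACHE 'TRUE' BATCH 'FALSE'")` yields the two key/value pairs, while the old quoted form `'CACHE' 'TRUE'` raises a `ValueError`. In the real parser the key is already a plain identifier after the grammar change, which is presumably why `.value` is no longer needed on `key_value_pair[0]`.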

github-actions

👋 Hello @hershd23, thanks for submitting an EVA DB PR 🙏 To allow your work to be integrated as seamlessly as possible, we advise you to:

✅ Verify that your PR is up-to-date with georgia-tech-db/eva master branch. If your PR is behind you can update your code by clicking the 'Update branch' button or by running git pull and git merge master locally.
✅ Verify that all EVA DB Continuous Integration (CI) checks are passing.
✅ Reduce changes to the absolute minimum required for your bug fix or feature addition.

modified: evadb/parser/evadb.lark

modified: test/unit_tests/parser/test_parser.py

hershd23 · 2023-09-01T16:38:37Z

Once the tests all pass, I will go ahead and update the docs as well.

modified: README.md modified: docs/source/reference/evaql/create.rst modified: docs/source/reference/udfs/model-train.rst Replacing udf_metadata_key in Ludwig test modified: test/integration_tests/long/test_model_train.py

hershd23 · 2023-09-01T16:58:06Z

I ran the integration test for training the model. It parsed the query correctly and invoked training, but then failed with the following error. I am creating an issue for this.

ERROR    evadb.utils.logging_manager:plan_executor.py:182 Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument weight in method wrapper_CUDA__native_batch_norm)
Traceback (most recent call last):
  File "/home/hershd23/Desktop/evadb/evadb/executor/plan_executor.py", line 178, in execute_plan
    yield from output
  File "/home/hershd23/Desktop/evadb/evadb/executor/project_executor.py", line 34, in exec
    batch = apply_project(batch, self.target_list, self.catalog())
  File "/home/hershd23/Desktop/evadb/evadb/executor/executor_utils.py", line 42, in apply_project
    batches = [expr.evaluate(batch) for expr in project_list]
  File "/home/hershd23/Desktop/evadb/evadb/executor/executor_utils.py", line 42, in <listcomp>
    batches = [expr.evaluate(batch) for expr in project_list]
  File "/home/hershd23/Desktop/evadb/evadb/expression/function_expression.py", line 129, in evaluate
    outcomes = self._apply_function_expression(func, batch, **kwargs)
  File "/home/hershd23/Desktop/evadb/evadb/expression/function_expression.py", line 188, in _apply_function_expression
    return func_args.apply_function_expression(func)
  File "/home/hershd23/Desktop/evadb/evadb/models/storage/batch.py", line 173, in apply_function_expression
    return Batch(expr(self._frames))
  File "/home/hershd23/Desktop/evadb/evadb/udfs/abstract/abstract_udf.py", line 36, in __call__
    return self.forward(args[0])
  File "/home/hershd23/Desktop/evadb/evadb/udfs/ludwig.py", line 33, in forward
    predictions, _ = self.model.predict(frames, return_type=pd.DataFrame)
  File "/home/hershd23/Desktop/evadb/env/lib/python3.10/site-packages/ludwig/api.py", line 895, in predict
    predictions = predictor.batch_predict(
  File "/home/hershd23/Desktop/evadb/env/lib/python3.10/site-packages/ludwig/models/predictor.py", line 142, in batch_predict
    preds = self._predict(batch)
  File "/home/hershd23/Desktop/evadb/env/lib/python3.10/site-packages/ludwig/models/predictor.py", line 188, in _predict
    outputs = self._predict_on_inputs(inputs)
  File "/home/hershd23/Desktop/evadb/env/lib/python3.10/site-packages/ludwig/models/predictor.py", line 324, in _predict_on_inputs
    return self.dist_model(inputs)
  File "/home/hershd23/Desktop/evadb/env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/hershd23/Desktop/evadb/env/lib/python3.10/site-packages/ludwig/models/ecd.py", line 136, in forward
    combiner_outputs = self.combine(encoder_outputs)
  File "/home/hershd23/Desktop/evadb/env/lib/python3.10/site-packages/ludwig/models/ecd.py", line 81, in combine
    return self.combiner(encoder_outputs)
  File "/home/hershd23/Desktop/evadb/env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/hershd23/Desktop/evadb/env/lib/python3.10/site-packages/ludwig/combiners/combiners.py", line 451, in forward
    hidden, aggregated_mask, masks = self.tabnet(hidden)
  File "/home/hershd23/Desktop/evadb/env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/hershd23/Desktop/evadb/env/lib/python3.10/site-packages/ludwig/modules/tabnet_modules.py", line 113, in forward
    features = self.batch_norm(features)  # [b_s, i_s]
  File "/home/hershd23/Desktop/evadb/env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/hershd23/Desktop/evadb/env/lib/python3.10/site-packages/torch/nn/modules/batchnorm.py", line 171, in forward
    return F.batch_norm(
  File "/home/hershd23/Desktop/evadb/env/lib/python3.10/site-packages/torch/nn/functional.py", line 2450, in batch_norm
    return torch.batch_norm(
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument weight in method wrapper_CUDA__native_batch_norm)

xzdandy · 2023-09-01T17:15:24Z

Thanks for creating an issue! Did you run the experiments on ada-00 or ada-01?

xzdandy

PR itself looks good. Thanks!

hershd23 · 2023-09-02T00:13:30Z

Actually, I ran it on my personal machine, which has a GPU card.

xzdandy · 2023-09-02T00:26:00Z

We can merge this PR. We can investigate the GPU issue in #1028. Thanks!

xzdandy · 2023-09-02T05:33:08Z

It just occurred to me that you may also need to update the Hugging Face functions.

hershd23 · 2023-09-02T19:02:54Z

> It just occurred to me that you may also need to update the Hugging Face functions.

Oh okay @xzdandy. Will check for that today

modified: benchmark/text_summarization/text_summarization_with_evadb.py
modified: docs/source/benchmarks/text_summarization.rst
modified: docs/source/overview/concepts.rst
modified: docs/source/reference/ai/hf.rst
modified: docs/source/reference/ai/openai.rst
modified: docs/source/reference/ai/yolo.rst
modified: docs/source/usecases/object-detection.rst
modified: docs/source/usecases/question-answering.rst
modified: docs/source/usecases/text-summarization.rst
modified: evadb/parser/utils.py
modified: evadb/udfs/udf_bootstrap_queries.py
modified: test/benchmark_tests/test_benchmark_pytorch.py
modified: test/integration_tests/long/interfaces/relational/test_relational_api.py
modified: test/integration_tests/long/test_error_handling_with_ray.py
modified: test/integration_tests/long/test_huggingface_udfs.py
modified: test/integration_tests/long/test_reuse.py

modified: docs/source/benchmarks/text_summarization.rst

hershd23 · 2023-09-04T20:39:01Z

@xzdandy @gaurav274 can you take a look at the changes? I have incorporated the changes for the HuggingFace UDFs as well.

hershd23 · 2023-09-04T20:40:37Z

I also see some changes in the docs causing the docs CI to fail. Should I try to fix them (best effort) in this PR?

xzdandy

Thanks Hersh for the contribution! Doc failures will be fixed in #1035.

Verified on Python 3.10

All notebook testcases passed
test/integration_tests/long/test_model_train.py passed when running as an independent test
4 test cases failed when running bash script/test/test.sh -m "LONG INTEGRATION"

FAILED test/integration_tests/long/test_model_train.py::ModelTrainTests::test_ludwig_automl - evadb.executor.executor_utils.ExecutorError: No best trial found. Please check if you specified the correct default metric (metric_...
FAILED test/integration_tests/long/test_udf_executor.py::UDFExecutorTest::test_should_create_udf_with_metadata - lark.exceptions.UnexpectedToken: Unexpected token Token('STRING_LITERAL', "'CACHE'") at line 6, column 19.
FAILED test/integration_tests/long/test_udf_executor.py::UDFExecutorTest::test_should_return_empty_metadata_list_if_udf_is_removed - lark.exceptions.UnexpectedToken: Unexpected token Token('STRING_LITERAL', "'CACHE'") at line 6, column 19.
FAILED test/integration_tests/long/interfaces/relational/test_relational_api.py::RelationalAPI::test_create_udf_with_relational_api - AssertionError: "CREA[50 chars]Face 'task' 'automatic-speech-recognition' 'mo[22 chars]ase'" != "CREA[50 chars]Face TASK 'automatic...

A quick check of the error message indicates that test/integration_tests/long/test_model_train.py::ModelTrainTests::test_ludwig_automl failed because GPU memory from other test cases was not freed. If the error is not relevant to this PR's change, you are welcome to create an issue, and we will investigate it separately!

Minor:

Remove .lock_preprocessing from the PR.

Update:

Issue #1036, "test/integration_tests/long/test_model_train.py failed when running with other long integration test cases", has been created for test/integration_tests/long/test_model_train.py.

deleted: .lock_preprocessing Converted CACHE and BATCH from string literals to UIDs modified: test/integration_tests/long/test_udf_executor.py

hershd23 · 2023-09-05T14:39:02Z

Hi @xzdandy, I have made the following changes:

Removed .lock_preprocessing, which was created unknowingly
Converted CACHE and BATCH to the UID structure in tests

For the other two issues:

I think the Ludwig issue is the one you already reported separately in #1036 (test/integration_tests/long/test_model_train.py failed when running with other long integration test cases). It is not specific to my PR; it is an independent issue.
test_create_udf_with_relational_api is a little trickier. Let me try to explain the issue.

We have made all udf_metadata_keys into UIDs. Once parsed, these are processed and ultimately stored as lowercase strings in the metadata_map. What this test does is convert the query and args from the API back into the query string itself. To fix it, I need to find the query builder that converts the API constructs into an SQL-style query. I believe @gaurav274 mentioned that there are some use cases where Python API data is converted to a query and then run, to leverage the optimizations built on the query.

Can you point me to the query builder code? I'll try to correct this there.

UPDATE : Figured it out
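
To make the mismatch concrete, here is a minimal sketch of rendering stored metadata back into a query string, assuming a hypothetical pair of helpers (`render_metadata`, `render_create_udf`) and a simplified `CREATE UDF ... TYPE ...` layout. It is not the actual code in evadb/parser/create_udf_statement.py. It only illustrates why lowercase stored keys need to be upper-cased and left unquoted for a direct string comparison to match.

```python
def render_metadata(metadata):
    """Render (key, value) pairs as `KEY 'value'` fragments for a query string."""
    fragments = []
    for key, value in metadata:
        # Keys are emitted unquoted and upper-cased; values stay quoted.
        fragments.append(f"{key.upper()} '{value}'")
    return " ".join(fragments)


def render_create_udf(name, udf_type, metadata):
    """Build a CREATE UDF query string (hypothetical, simplified layout)."""
    query = f"CREATE UDF {name} TYPE {udf_type}"
    rendered = render_metadata(metadata)
    return f"{query} {rendered}" if rendered else query


# Keys are stored in lowercase after parsing, so the renderer upper-cases them.
# "SpeechRecognizer" is a made-up UDF name for this example.
print(render_create_udf(
    "SpeechRecognizer", "HuggingFace", [("task", "automatic-speech-recognition")]
))
```

Rendering the key as `'task'` instead would reproduce the kind of assertion failure reported for test_create_udf_with_relational_api, where the expected string contains `TASK 'automatic-speech-recognition'`.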

direct string matching test cases modified: evadb/parser/create_udf_statement.py

hershd23 · 2023-09-05T15:11:50Z

@xzdandy I made the changes. Can you take a look?

hershd23 · 2023-09-05T15:38:06Z

Are we good to go with this? Can we merge, or are we waiting for @gaurav274's review?

xzdandy · 2023-09-05T15:42:22Z

> Are we good to go with this? Can we merge, or are we waiting for @gaurav274's review?

I am fixing all merge conflicts now.

hershd23 · 2023-09-05T15:43:49Z

Done, I have fixed the merge conflicts.

xzdandy · 2023-09-05T15:46:32Z

> Done, I have fixed the merge conflicts.

One minute to update the forecasting feature to use the new UDF syntax

hershd23 · 2023-09-05T15:57:21Z

Thanks @xzdandy for all your help!

Replacing udf_metadata_key from string_literal to ID_LITERAL
modified: evadb/parser/evadb.lark
Removed .value from key_value_pair[0] post the change in type
modified: evadb/parser/lark_visitor/_functions.py
Replaced string key to ID_LITERAL in test query
modified: test/unit_tests/parser/test_parser.py

Solves #1010

Co-authored-by: xzdandy <xzdandy@gmail.com>

github-actions bot reviewed Sep 1, 2023

hershd23 requested a review from xzdandy September 1, 2023 16:26

hershd23 added 2 commits September 1, 2023 12:27

Fixing typos

0bdeb71

modified: evadb/parser/evadb.lark

Replaced metadata key back to KEY in test

a4c3f01

modified: test/unit_tests/parser/test_parser.py

hershd23 requested a review from gaurav274 September 1, 2023 16:36

hershd23 added 2 commits September 1, 2023 12:41

Replacing udf_metadata_key in docs

fcb8e89

modified: README.md modified: docs/source/reference/evaql/create.rst modified: docs/source/reference/udfs/model-train.rst Replacing udf_metadata_key in Ludwig test modified: test/integration_tests/long/test_model_train.py

Fixing failed unit test

0011642

xzdandy approved these changes Sep 1, 2023

xzdandy assigned hershd23 Sep 2, 2023

xzdandy added the Feature Request ✨ New feature or request label Sep 2, 2023

xzdandy added this to the v0.3.3 milestone Sep 2, 2023

hershd23 force-pushed the remove_quotes_udf_metadata_key branch from 18bd15c to 0011642 Compare September 4, 2023 20:17

hershd23 added 3 commits September 4, 2023 16:18

Merge branch 'staging' into remove_quotes_udf_metadata_key

82c66d8

Increased underline length in at line 75 in text_summarization.rst

801ddb6

modified: docs/source/benchmarks/text_summarization.rst

xzdandy requested changes Sep 5, 2023

hershd23 added 2 commits September 5, 2023 10:16

Merging staging into branch

af6c447

Deleted .lock_preprocessing file

ec050b5

deleted: .lock_preprocessing Converted CACHE and BATCH from string literals to UIDs modified: test/integration_tests/long/test_udf_executor.py

Removed quotes from udf_metadata_key and converted to upper case for direct string matching test cases

9968033

modified: evadb/parser/create_udf_statement.py

xzdandy approved these changes Sep 5, 2023

Merge branch 'staging' into remove_quotes_udf_metadata_key

1889c38

xzdandy added 2 commits September 5, 2023 11:49

Merge branch 'staging' into remove_quotes_udf_metadata_key

8d3668a

Sync with staging

49a2df8

xzdandy merged commit e535896 into georgia-tech-db:staging Sep 5, 2023

hershd23 mentioned this pull request Sep 5, 2023

Drop quotes for UDF metadata #1010

Closed

2 tasks
