Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate plugins #2378

Merged
merged 8 commits into from
Aug 13, 2024
Merged

Conversation

jieguangzhou
Copy link
Collaborator

Description

Related Issues

Checklist

  • Is this code covered by new or existing unit tests or integration tests?
  • Did you run make unit_testing and make integration-testing successfully?
  • Do new classes, functions, methods and parameters all have docstrings?
  • Were existing docstrings updated, if necessary?
  • Was external documentation updated, if necessary?

Additional Notes or Comments

@jieguangzhou jieguangzhou force-pushed the feat/migrate-plugins branch from d5a1234 to 29c4ae3 Compare August 8, 2024 13:38
@jieguangzhou jieguangzhou mentioned this pull request Aug 8, 2024
@jieguangzhou jieguangzhou changed the title Feat/migrate plugins Migrate plugins Aug 8, 2024
@jieguangzhou jieguangzhou force-pushed the feat/migrate-plugins branch from e4999f3 to c5e9119 Compare August 8, 2024 14:04
Comment on lines +95 to +105
- name: Optionally run the base testing
run: |
SUPERDUPER_CONFIG="plugins/${{ matrix.plugin }}/plugin_test/config.yaml"
if [ -f "$SUPERDUPER_CONFIG" ]; then
echo "Running the base testing..."
make unit_testing SUPERDUPER_CONFIG=$SUPERDUPER_CONFIG
make usecase_testing SUPERDUPER_CONFIG=$SUPERDUPER_CONFIG
else
echo "No config file found, skipping..."
fi
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The plugin located under the plugin_test folder, which includes a config.yaml file, not only runs its own tests but also uses this configuration file to run test cases for superduper, including both unittest and usecase tests.

@jieguangzhou jieguangzhou marked this pull request as ready for review August 12, 2024 09:33
@jieguangzhou jieguangzhou self-assigned this Aug 12, 2024

class FieldType(Leaf):
Copy link
Collaborator Author

@jieguangzhou jieguangzhou Aug 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A unified data type identifier used to record native datatypes, replacing the original dtype in Ibis

Comment on lines 81 to 88
- name: Optionally run custom CI script
run: |
if [ -f "plugins/${{ matrix.plugin }}/.ci_extend.sh" ]; then
echo "Running custom CI script..."
bash ./plugins/${{ matrix.plugin }}/.ci_extend.sh
else
echo "No custom CI script found, skipping..."
fi
Copy link
Collaborator Author

@jieguangzhou jieguangzhou Aug 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use the officially released package to prepare the dependency plugins.
Do this in the next release pr

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just extract the dependent plugins from the pyproject.toml scripts?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[project.optional-dependencies]
test = [
    "vcrpy>=5.1.0",
    # Annotation plugin dependencies will be installed in CI
    # :CI: plugins/mongodb
]

Added this for installing local dependencies instead of those on PyPI, which is useful when modifying multiple interdependent plugins at the same time. Using the names directly from PyPI will lead to dependencies on outdated versions for running CI.


- name: Install DevKit (docs, testing, etc)
run: |
make install_devkit
Copy link
Collaborator

@blythed blythed Aug 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should combine these 2 things into .[test]?

Copy link
Collaborator Author

@jieguangzhou jieguangzhou Aug 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Collaborator

@blythed blythed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great, I especially like how quickly the CI is running now.

What shall we do with "core" plugins/ extensions, for example ext/numpy? Shouldn't we move this somehow? Also, what is the plan for moving away from the superduper.ext.* sub-packages? Shall we raise a warning which can become an error later?

@jieguangzhou jieguangzhou force-pushed the feat/migrate-plugins branch 3 times, most recently from e850ae8 to 5d1ab10 Compare August 13, 2024 08:21
@jieguangzhou jieguangzhou force-pushed the feat/migrate-plugins branch 5 times, most recently from 14e6095 to de31d8f Compare August 13, 2024 08:34
@jieguangzhou
Copy link
Collaborator Author

What shall we do with "core" plugins/ extensions, for example ext/numpy? Shouldn't we move this somehow? Also, what is the plan for moving away from the superduper.ext.* sub-packages? Shall we raise a warning which can become an error later?

We need to retain some core plugins, which are pre-set, and can continue to use this ext until a better name is found.

I added the warning, and I think we can directly remove those plugins in the next one or two versions, at which point an import error will be thrown.

'pymongo': MongoClient,
'ibis': BaseBackend,
"lance": LanceVectorSearcher,
"in_memory": InMemoryVectorSearcher,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

have we thought about making vector index implementations as plugin?

Copy link
Collaborator Author

@jieguangzhou jieguangzhou Aug 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we should do this later. The VectorSearcher backend should be the plugin, and keep the InMemoryVectorSearcher as pre-set

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed.

@@ -169,57 +137,46 @@ def test_refer_to_applied_item(db):


def test_column_encoding(db):
import PIL

img = PIL.Image.open('test/material/data/test.png')
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

was this intended?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we have removed the Pillow dependency and replaced it with another non-native datatype.


data = {
"img": PIL.Image.open("test/material/data/test.png"),
"df": pd.DataFrame(np.random.randn(10, 10)),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not keep both?

Copy link
Collaborator Author

@jieguangzhou jieguangzhou Aug 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have removed the Pillow dependency and replaced it with another non-native datatype.

@@ -185,14 +160,14 @@ def test_insert_with_auto_schema(db):
datas_from_db = list(table_or_collection.select().execute())

for d, d_db in zip(datas, datas_from_db):
assert d["img"].size == d_db["img"].size
assert d["df"].sum().sum() == d_db["df"].sum().sum()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

np.sum(d['df'])
might reduce it

just a small pick

Copy link
Collaborator Author

@jieguangzhou jieguangzhou Aug 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not work with using np.sum(d['df']), I changed to d["df"].values.sum()

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Release the specified plugin with the following commit message
[PLUGINS] Bump Version [plugin_name_1 | plugin_name_2]

Release all plugins with the following commit message

[PLUGINS] Bump Version [all]

@jieguangzhou jieguangzhou force-pushed the feat/migrate-plugins branch 2 times, most recently from bb8c05a to 54ccb3e Compare August 13, 2024 13:15
@blythed blythed force-pushed the feat/migrate-plugins branch from 54ccb3e to 26eccc8 Compare August 13, 2024 14:21
Copy link
Collaborator

@blythed blythed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well done!

@blythed blythed merged commit cc5a04d into superduper-io:main Aug 13, 2024
18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants