Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

optimum openvino #467

Open
wants to merge 6 commits into
base: main
Choose a base branch
from
Open

optimum openvino #467

wants to merge 6 commits into from

Conversation

michaelfeil
Copy link
Owner

@michaelfeil michaelfeil commented Nov 16, 2024

@tjtanaa FYI, continued by merging your branch into this and main.

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Summary

This PR adds OpenVINO backend support to optimize model inference on Intel hardware, with changes spanning Docker configuration, embedder implementation, and utility functions.

  • Added OpenVINO execution provider in /libs/infinity_emb/infinity_emb/transformer/utils_optimum.py with bf16 precision support
  • Implemented OpenVINO model file handling with new get_openvino_files() function
  • Set default INFINITY_ENGINE="optimum" in CPU Docker configuration with OpenVINO extras
  • Added CHECK_OPTIMUM_INTEL optional import for Intel optimization capabilities
  • Duplicate optimization config code in optimize_model() needs to be cleaned up

5 file(s) reviewed, 9 comment(s)
Edit PR Review Bot Settings | Greptile

@@ -20,6 +20,8 @@ cpu:
extra_env_variables: |
# Sets default to onnx
ENV INFINITY_ENGINE="optimum"
RUN poetry run python -m pip install --upgrade --upgrade-strategy eager "optimum[openvino]"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style: Installing optimum[openvino] in extra_env_variables section is unconventional. Should be moved to main_install or extra_installs_main.

Comment on lines 45 to 46
RUN ./requirements_install_from_poetry.sh --no-root --without lint,test "https://download.pytorch.org/whl/cpu"
RUN poetry run python -m pip install --upgrade --upgrade-strategy eager "optimum[openvino]"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logic: Redundant installation of optimum[openvino] - this package is already included via EXTRAS='all openvino' in the environment variables

Comment on lines 52 to 53
RUN ./requirements_install_from_poetry.sh --without lint,test "https://download.pytorch.org/whl/cpu"
RUN poetry run python -m pip install --upgrade --upgrade-strategy eager "optimum[openvino]"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style: Installing optimum[openvino] multiple times in different build stages may cause version conflicts or increase build time unnecessarily

Comment on lines +65 to +66
except Exception as e: # show error then let the optimum intel compress on the fly
print(str(e))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logic: Silently printing errors and continuing is dangerous. Consider logging the error and/or raising a more specific exception if OpenVINO file loading fails.

)
if provider == "OpenVINOExecutionProvider":
CHECK_OPTIMUM_INTEL.mark_required()
filename = ""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style: Empty filename could cause issues if exception occurs. Initialize with None instead to make the failure case more explicit.

else: # Optimum onnx cpu path
optimizer = ORTOptimizer.from_pretrained(unoptimized_model)

is_gpu = "cpu" not in execution_provider.lower()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logic: Duplicate optimization config block. Remove lines 231-239 as they are identical to 222-230.

openvino_files = [p for p in repo_files if p.match(pattern)]

if len(openvino_files) > 1:
logger.info(f"Found {len(openvino_files)} onnx files: {openvino_files}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

syntax: Log message incorrectly refers to 'onnx files' when listing OpenVINO files

Suggested change
logger.info(f"Found {len(openvino_files)} onnx files: {openvino_files}")
logger.info(f"Found {len(openvino_files)} OpenVINO files: {openvino_files}")

if files_optimized:
file_optimized = files_optimized[-1]
if file_name:
file_optimized = file_name
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logic: Overwriting file_optimized with file_name could bypass optimization caching if file_name is set

Comment on lines 214 to 216
ov_config={
"INFERENCE_PRECISION_HINT": "bf16"
}, # fp16 for now as it has better precision than bf16
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style: Using bf16 precision by default may reduce accuracy compared to fp16/fp32 on some hardware

@codecov-commenter
Copy link

codecov-commenter commented Nov 16, 2024

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

Attention: Patch coverage is 44.00000% with 42 lines in your changes missing coverage. Please review.

Project coverage is 78.74%. Comparing base (8ac0b3c) to head (5961be9).

Files with missing lines Patch % Lines
...nity_emb/infinity_emb/transformer/utils_optimum.py 44.23% 29 Missing ⚠️
...y_emb/infinity_emb/transformer/embedder/optimum.py 38.09% 13 Missing ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #467      +/-   ##
==========================================
- Coverage   79.51%   78.74%   -0.77%     
==========================================
  Files          41       41              
  Lines        3417     3468      +51     
==========================================
+ Hits         2717     2731      +14     
- Misses        700      737      +37     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants