Page ordering/sorting fix, fix to use unique temp dir if userinput not provided also with better cleanup, added async shutil (aioshutil) #34

Merged
merged 12 commits into from
Sep 12, 2024

@pradhyumna85 (Contributor) commented Sep 12, 2024

Changes

Note: PR #26 is no longer required, as this PR incorporates its ideas along with other minor fixes.

Pradyumna Singh Rathore added 8 commits September 12, 2024 13:25
…ely function to do an expected alphanumeric sort.
… Replaces zerox output for python sdk with updated output containing token counts.

- Fixes the page order issue by adding a human-like alphanumeric sorting utility (doesn't require padding). Fixes Issue getomni-ai#25
- Handles tmp_dir better: uses the user-supplied directory, or a uniquely named temporary directory if None
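
A minimal sketch of the human-like alphanumeric sort described above. The name `sorted_nicely` comes from the commit messages in this PR; the body below is an assumption about how such a utility is commonly written, not the code that was merged, and the `page_N.png` file names are hypothetical.

```python
import re


def sorted_nicely(names):
    """Sort strings so that embedded numbers compare numerically.

    Illustrative only: the merged implementation may differ.
    """
    def natural_key(name):
        # Split into alternating text and digit runs,
        # e.g. "page_10.png" -> ["page_", "10", ".png"],
        # then turn the digit runs into integers.
        return [int(part) if part.isdigit() else part.lower()
                for part in re.split(r"(\d+)", name)]

    return sorted(names, key=natural_key)


# Hypothetical page image names, without any padding.
pages = ["page_10.png", "page_2.png", "page_1.png"]
plain_order = sorted(pages)           # lexicographic: page_10 sorts before page_2
natural_order = sorted_nicely(pages)  # numeric: page_2 sorts before page_10
```

Because the digit runs are compared as integers, file names do not need padded page numbers to come out in page order.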
@pradhyumna85 changed the title on Sep 12, 2024, removing the stray space in "( aioshutil)" so that it reads "(aioshutil)".
@pradhyumna85 (Contributor, Author) commented Sep 12, 2024

@xdotli, @annapo23 Could you review this PR?

Pradyumna Singh Rathore added 4 commits September 13, 2024 02:25
@pradhyumna85 (Contributor, Author) left a comment

@tylermaran, all review threads are incorporated. Verified pip install and tested imports locally. Ready to merge!

@tylermaran (Contributor) left a comment

looks good to me! 🧹

@tylermaran tylermaran merged commit 0e7025c into getomni-ai:main Sep 12, 2024
@pradhyumna85 pradhyumna85 deleted the general-sorting-tmpdir-fixes branch September 12, 2024 21:38
@pradhyumna85 (Contributor, Author) commented Sep 13, 2024

@tylermaran, I think you'll need to create a release to trigger the PyPI publishing action so that the newer Python package gets published to PyPI?

namtho7078 pushed a commit to namtho7078/zerox that referenced this pull request Oct 27, 2024
…t provided also with better cleanup, added async shutil (aioshutil) (getomni-ai#34)

* Fixed page order issue, fixed tempdir cleanup issue, added sorted nicely function to do an expected alphanumeric sort.

* fix token usage counting, which was not aggregated correctly and always returned 0

* minor fix in token count aggregation

* make more concise

* - update missing minor documentation changes for Issue getomni-ai#31. Replaces zerox output for python sdk with updated output containing token counts.
- Fixing page order issue by adding alphanumeric human like sorting utility (doesn't require padding). Fixes Issue getomni-ai#25
- Better way of handling tmp_dir - user input or unique named temporary directory if None

* added asynchronous shutil (aioshutil) and replaced older synchronous shutil.rmtree calls with async ones

* bump pysdk version

* fix import in python sdk documentation

* update importable name to pyzerox. Update documentation (manually wrapped python example output)

* removed padding logic (not required anymore after sorted_nicely function implementation).

* minor typo fix for token count

---------

Co-authored-by: Pradyumna Singh Rathore <pradyumna.singhrathore@halliburton.com>
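
The temp-directory handling and async cleanup described in the commits above could look roughly like the sketch below. This is an illustration under stated assumptions, not the merged code: `process_with_temp_dir` and the `zerox_` prefix are hypothetical, the sketch assumes a user-supplied directory is left in place, and it uses the standard library's `asyncio.to_thread(shutil.rmtree, ...)` as a stand-in for the `aioshutil` calls the PR introduces, so that it runs without extra dependencies.

```python
import asyncio
import os
import shutil
import tempfile


async def process_with_temp_dir(temp_dir=None):
    """Hypothetical helper: work in temp_dir, or in a unique directory if None."""
    owns_dir = temp_dir is None
    if owns_dir:
        # Unique name per call, so concurrent runs do not collide.
        work_dir = tempfile.mkdtemp(prefix="zerox_")
    else:
        work_dir = temp_dir
        os.makedirs(work_dir, exist_ok=True)

    try:
        # Placeholder for the real work (e.g. writing page images).
        page_path = os.path.join(work_dir, "page_1.txt")
        with open(page_path, "w") as handle:
            handle.write("placeholder")
        return work_dir, os.path.exists(page_path)
    finally:
        if owns_dir:
            # Stand-in for an async rmtree: run the blocking call in a thread
            # so the event loop is not blocked. Requires Python 3.9+.
            await asyncio.to_thread(shutil.rmtree, work_dir, ignore_errors=True)


# With no directory supplied, a unique one is created and removed afterwards.
auto_dir, file_was_written = asyncio.run(process_with_temp_dir())
```

Doing the cleanup in `finally` means the directory is removed even if processing raises partway through.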

Successfully merging this pull request may close these issues.

- Python SDK documentation update required
- Python: Extracted PDF not in the right order