Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrade pyxform to v2.1.1 #5126

Merged
merged 10 commits into from
Nov 6, 2024
Merged

Upgrade pyxform to v2.1.1 #5126

merged 10 commits into from
Nov 6, 2024

Conversation

RuthShryock
Copy link
Member

@RuthShryock RuthShryock commented Sep 24, 2024

Description

The previous version was 1.9.0. Most workflows are not affected, but if you interact directly with XForm XML, you should refer to the changes listed at https://github.com/XLSForm/pyxform/blob/master/CHANGES.txt, particularly the major version v2.0.0. These do affect the generated XForm XML.

Checklist

  1. If you've added code that should be tested, add tests
  2. If you've changed APIs, update (or create!) the documentation
  3. Ensure the tests pass
  4. Make sure that your code lints and that you've followed our coding style
  5. Write a title and, if necessary, a description of your work suitable for publishing in our release notes
  6. Mention any related issues in this repository (as #ISSUE) and in other repositories (as kobotoolbox/other#ISSUE)
  7. Open an issue in the docs if there are UI/UX changes

Notes

Update pyxform to version 2.1.1. This also modifies the format of the xls file that we send to the pyxform method create_survey_from_xls. Before the xls was a Django FieldFile object but we now covert it into a binary stream using io.BytesIO to create a file-like object in order to be compatible with the pyxform updates.

This update is deployed to kf.du.kbtdev.org for testing.

Copy link

@RuthShryock RuthShryock marked this pull request as ready for review September 25, 2024 13:11
@Akuukis Akuukis deleted the branch main October 1, 2024 17:16
@Akuukis Akuukis closed this Oct 1, 2024
@Akuukis Akuukis reopened this Oct 1, 2024
@Akuukis Akuukis changed the base branch from beta to main October 1, 2024 17:29
@RuthShryock RuthShryock changed the title Upgrade pyxform version to 2.0.3 Upgrade pyxform version to 2.1.1 Oct 22, 2024
@jnm
Copy link
Member

jnm commented Oct 24, 2024

Of note: take a look at Enketo image choices. They don't seem to be appearing on kf.du, although the form logo does load.

Edit: this is definitely a problem, but seems unrelated to this PR. See internal discussion.

@kobotoolbox kobotoolbox deleted a comment from notion-workspace bot Nov 5, 2024
Comment on lines 159 to 165
survey = create_survey_from_xls(self.xls)
file_obj = self.xls.read()
# Wrap the file data in a BytesIO object to simulate a file
xls_bytes_io = io.BytesIO(file_obj)
xls_bytes_io.name = self.xls.name

survey = create_survey_from_xls(xls_bytes_io)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm surprised by this, and I've tried to do some testing locally but couldn't notice a difference between the old and new versions of pyform. Do you know a way to reproduce the issue that prompted this workaround?

Here's the testing I tried:

kpi$ pip freeze | grep pyxform
pyxform==1.9.0
kpi$ ./manage.py shell_plus
<snip>
IPython 8.12.3 -- An enhanced Interactive Python. Type '?' for help.

In [1]: from pyxform.builder import create_survey_from_xls

In [2]: dd = DataDictionary.objects.last()

In [3]: s = create_survey_from_xls(dd.xls)

In [4]:
kpi$ pip install pyxform==2.0.3
<snip>
Successfully installed openpyxl-3.1.2 pyxform-2.0.3
kpi$ ./manage.py shell_plus
<snip>
IPython 8.12.3 -- An enhanced Interactive Python. Type '?' for help.

In [1]: from pyxform.builder import create_survey_from_xls

In [2]: dd = DataDictionary.objects.last()

In [3]: s = create_survey_from_xls(dd.xls)

In [4]:

I also noticed this, which says it will allow us "to call pyxform without needing to use files (accepts bytes/bufferedreader (from open())/strings)", but it's not included until pyxform v2.1.0: XLSForm/pyxform#712

Copy link
Member

@jnm jnm Nov 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Surprisingly, it's XLSForm/pyxform#712 that actually stops us from passing a file-like object to create_survey_from_xls(). Here, with pyxform v2.1.1:

>>> from pyxform.builder import create_survey_from_xls
>>> dd = DataDictionary.objects.last()
>>> s = create_survey_from_xls(dd.xls)
Traceback (most recent call last):
  File "/opt/venv/lib/python3.10/site-packages/pyxform/xls2json_backends.py", line 284, in xlsx_to_dict
    reader = ExcelReader(wb_file.data, read_only=True, data_only=True)
  File "/opt/venv/lib/python3.10/site-packages/openpyxl/reader/excel.py", line 123, in __init__
    self.archive = _validate_archive(fn)
  File "/opt/venv/lib/python3.10/site-packages/openpyxl/reader/excel.py", line 77, in _validate_archive
    file_format = os.path.splitext(filename)[-1].lower()
  File "/usr/local/lib/python3.10/posixpath.py", line 118, in splitext
    p = os.fspath(p)
TypeError: expected str, bytes or os.PathLike object, not NoneType

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/opt/venv/lib/python3.10/site-packages/pyxform/builder.py", line 381, in create_survey_from_xls
    excel_reader = SurveyReader(path_or_file, default_name=default_name)
  File "/opt/venv/lib/python3.10/site-packages/pyxform/xls2json.py", line 1678, in __init__
    self._dict = parse_file_to_json(
  File "/opt/venv/lib/python3.10/site-packages/pyxform/xls2json.py", line 1609, in parse_file_to_json
    workbook_dict = parse_file_to_workbook_dict(path, file_object)
  File "/opt/venv/lib/python3.10/site-packages/pyxform/xls2json.py", line 1583, in parse_file_to_workbook_dict
    return xlsx_to_dict(file_object if file_object is not None else path)
  File "/opt/venv/lib/python3.10/site-packages/pyxform/xls2json_backends.py", line 292, in xlsx_to_dict
    raise PyXFormReadError(f"Error reading .xlsx file: {read_err}") from read_err
pyxform.errors.PyXFormReadError: Error reading .xlsx file: expected str, bytes or os.PathLike object, not NoneType

The problem is here, https://github.com/XLSForm/pyxform/pull/712/files#diff-13b76560ce57b07de404f4026336e9fc8fd9dd9fbedef0b761ab1b327a873ea8R781-R782, which now requires what's passed to be an instance of bytes | BytesIO | IOBase.

I'll compare this to how it was working before XLSForm/pyxform#712.

tl;dr: It's not really a big deal to wrap self.xls inside a BytesIO (although self.xls.file.file already is one), but I'm concerned about XLSForm/pyxform#712 "set[ting] the scene for making parts of pyxform non-public" (ref), especially how it presents pyxform.survey.Survey as non-public: https://github.com/XLSForm/pyxform/pull/712/files#diff-4eb61487be8c901fb71364cbe00f3b124caf5224f7bd06822c6c53d39241f9baR43-R58.

Copy link
Member

@jnm jnm Nov 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll compare this to how it was working before XLSForm/pyxform#712.

The old way was to consider it a file unless it was an instance of str, bytes, os.PathLike, in which case it was considered a path:
https://github.com/XLSForm/pyxform/blob/e25612efe5b47241bddf0bb95304c61a76079bd5/pyxform/xls2json_backends.py#L187-L192

The new way:

  • considers it a path if it's an instance of str | PathLike,
  • normalizes any instance of bytes | BytesIO | IOBase to a BytesIO
  • silently falls back to None if neither of the above criteria is met

It'd be worth asking the pyxform folks behind the change if it's doing what they intended, but for now, let's proceed with your workaround.

@jnm jnm requested a review from noliveleger as a code owner November 6, 2024 16:17
@jnm jnm changed the title Upgrade pyxform version to 2.1.1 Upgrade pyxform to v2.1.1 Nov 6, 2024
@jnm jnm removed the request for review from noliveleger November 6, 2024 16:19
@jnm jnm merged commit 1574ad2 into main Nov 6, 2024
6 of 7 checks passed
@jnm jnm deleted the update-pyxform-version branch November 6, 2024 17:30
RuthShryock added a commit that referenced this pull request Jan 13, 2025
The previous version was 1.9.0. Most workflows are not affected, but if
you interact directly with XForm XML, you should refer to the changes
listed at https://github.com/XLSForm/pyxform/blob/master/CHANGES.txt,
particularly the major version v2.0.0. These do affect the generated
XForm XML.

1. [ ] If you've added code that should be tested, add tests
2. [ ] If you've changed APIs, update (or create!) the documentation
3. [x] Ensure the tests pass
4. [x] Make sure that your code lints and that you've followed [our
coding
style](https://github.com/kobotoolbox/kpi/blob/master/CONTRIBUTING.md)
5. [x] Write a title and, if necessary, a description of your work
suitable for publishing in our [release
notes](https://community.kobotoolbox.org/tag/release-notes)
6. [ ] Mention any related issues in this repository (as #ISSUE) and in
other repositories (as kobotoolbox/other#ISSUE)
7. [ ] Open an issue in the
[docs](https://github.com/kobotoolbox/docs/issues/new) if there are
UI/UX changes

Update pyxform to version
[2.1.1](https://github.com/XLSForm/pyxform/releases/tag/v2.1.1). This
also modifies the format of the xls file that we send to the pyxform
method `create_survey_from_xls`. Before the xls was a Django FieldFile
object but we now covert it into a binary stream using `io.BytesIO` to
create a file-like object in order to be compatible with the pyxform
updates.

This update is deployed to [kf.du.kbtdev.org](https://kf.du.kbtdev.org/)
for testing.

---------

Co-authored-by: John N. Milner <john@tmoj.net>
jnm added a commit that referenced this pull request Jan 14, 2025
The previous version was 1.9.0. Most workflows are not affected, but if
you interact directly with XForm XML, you should refer to the changes
listed at https://github.com/XLSForm/pyxform/blob/master/CHANGES.txt,
particularly the major version v2.0.0. These do affect the generated
XForm XML.

1. [ ] If you've added code that should be tested, add tests
2. [ ] If you've changed APIs, update (or create!) the documentation
3. [x] Ensure the tests pass
4. [x] Make sure that your code lints and that you've followed [our
coding
style](https://github.com/kobotoolbox/kpi/blob/master/CONTRIBUTING.md)
5. [x] Write a title and, if necessary, a description of your work
suitable for publishing in our [release
notes](https://community.kobotoolbox.org/tag/release-notes)
6. [ ] Mention any related issues in this repository (as #ISSUE) and in
other repositories (as kobotoolbox/other#ISSUE)
7. [ ] Open an issue in the
[docs](https://github.com/kobotoolbox/docs/issues/new) if there are
UI/UX changes

Update pyxform to version
[2.1.1](https://github.com/XLSForm/pyxform/releases/tag/v2.1.1). This
also modifies the format of the xls file that we send to the pyxform
method `create_survey_from_xls`. Before the xls was a Django FieldFile
object but we now covert it into a binary stream using `io.BytesIO` to
create a file-like object in order to be compatible with the pyxform
updates.

This update is deployed to [kf.du.kbtdev.org](https://kf.du.kbtdev.org/)
for testing.

---------

Co-authored-by: John N. Milner <john@tmoj.net>
jnm added a commit that referenced this pull request Jan 14, 2025
The previous version was 1.9.0. Most workflows are not affected, but if
you interact directly with XForm XML, you should refer to the changes
listed at https://github.com/XLSForm/pyxform/blob/master/CHANGES.txt,
particularly the major version v2.0.0. These do affect the generated
XForm XML.

1. [ ] If you've added code that should be tested, add tests
2. [ ] If you've changed APIs, update (or create!) the documentation
3. [x] Ensure the tests pass
4. [x] Make sure that your code lints and that you've followed [our
coding
style](https://github.com/kobotoolbox/kpi/blob/master/CONTRIBUTING.md)
5. [x] Write a title and, if necessary, a description of your work
suitable for publishing in our [release
notes](https://community.kobotoolbox.org/tag/release-notes)
6. [ ] Mention any related issues in this repository (as #ISSUE) and in
other repositories (as kobotoolbox/other#ISSUE)
7. [ ] Open an issue in the
[docs](https://github.com/kobotoolbox/docs/issues/new) if there are
UI/UX changes

Update pyxform to version
[2.1.1](https://github.com/XLSForm/pyxform/releases/tag/v2.1.1). This
also modifies the format of the xls file that we send to the pyxform
method `create_survey_from_xls`. Before the xls was a Django FieldFile
object but we now covert it into a binary stream using `io.BytesIO` to
create a file-like object in order to be compatible with the pyxform
updates.

This update is deployed to [kf.du.kbtdev.org](https://kf.du.kbtdev.org/)
for testing.

---------

Co-authored-by: John N. Milner <john@tmoj.net>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants