-
Notifications
You must be signed in to change notification settings - Fork 283
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CP-47389 Porting mail-alarm to python3 #5400
CP-47389 Porting mail-alarm to python3 #5400
Conversation
Regrading python2 unittest error, it caused by this commit 51ca753 In python2, I am struggling with the decision of whether to write an 'if' statement as a workaround for the python2 unittest, however, the code will never be executed when using the python3 interpreter. Now, I have two options:
@psafont may I have your suggestion for this? |
While I'd like to say to not bother running the unit-test for python2 because the code is now only run under python3, that may not be quite true. In case there's an unforeseen issue, we may want to revert the script to be run under python 2. In that case it makes sense to have the unit-test to be run under python2, even if it means adding more code to support it. |
9c54308
to
e228c7b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, lets go to feature branch for nightly test.
scripts/mail-alarm
Outdated
@@ -880,7 +886,7 @@ class XapiMessage: | |||
self.mail_language, | |||
self.session, | |||
) | |||
elif re.match("sr_io_throughput_total_[0-9a-f]{8}$", name): | |||
elif re.match("sr_io_throughput_total_[0-9a-f]{8}$", str(name)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why converting this name
to str? In python2 passing a byte to str
is a identity, in python3 one would get a string representation of a byte. But I guess that is not relevant here. Are we expecting name
to be a byte object?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why converting this
name
to str? In python2 passing a byte tostr
is a identity, in python3 one would get a string representation of a byte. But I guess that is not relevant here. Are we expectingname
to be a byte object?
What you said isn't entirely correct, especially regarding python 3. this knowledge of bytes and str can indeed be a bit confusing, but you need to keep in mind that str.encode() is bytes type in python3.
As for why I made this change, talk is cheap show you the code.
$python3
Python 3.6.8 (default, Jan 5 2024, 10:43:44)
[GCC 8.5.0 20210514 (Red Hat 8.5.0-21)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> a=None
>>> b=str(a)
>>> c=b.encode()
>>> type(b)
<class 'str'>
>>> type(c)
<class 'bytes'>
>>> import re
>>> re.match('\w+', a)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib64/python3.6/re.py", line 172, in match
return _compile(pattern, flags).match(string)
TypeError: expected string or bytes-like object
>>> re.match('\w+', b)
<_sre.SRE_Match object; span=(0, 4), match='None'>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@acefei: You did not answer the question, and your proposed diff does not explain it either.
The question was justified, and you dismissed it with schooling @Vincent-lau on things that are not at all related to this diff and that he knows for sure. You also said that what he said was not correct, without saying what you refer to. In other words, you completely spoke sideways to his question. He also did not say anything that was wrong. He was just pointing out the obvious.
As for explaining why you suggest this diff, you do not explain but give a hint that you say that name
could be None
. It can, and None would result in a raise TypeError
, but that would have been an error on Python2 as well, and as such you would have had to raise a CA ticket instead.
Additionally, you spread misinformation as to what str()
does in Python3:
In your commit message in one of the commits of the PR, you wrote:
commit d91e2dc2ad00e1734b1dfb0859ae4269f167c78f
Author: Fei Su <fei.su@cloud.com>
Date: Thu Jan 25 17:01:24 2024 +0800
Fix str.encode issue in python3
In python2, string is represented as bytes type, and encode() function means
encoding conversion, its type remains unchanged. However, in Python 3,
string is unicode type (<class 'str'>), and encoding them converts them into bytes type.
Additionally, the str() function in Python 3 is used to decode bytes into unicode.
While the start of your explanation is correct, the last line is wrong:
Additionally, the str() function in Python 3 is used to decode bytes into unicode
No, the str()
function does not decode() bytes to unicode
in Python3:
It gives you a string representation of the bytes, similar to what a __repr__
would do:
python3 -c 'print(str(b"\xe9\x94\x99"))'
b'\xe9\x94\x99'
python2 -c 'print(str(b"\xe9\x94\x99"))'
错
Python2 str()
just outputs the bytes as they are, and when sent to a terminal in UTF-8 mode, the terminal can display the glyph that is represented by the UTF-8 bytes which is example shows.
So as this shows, on Python2, as @Vincent-lau correctly said, bytes
and str
are an identity.
Not so on Python3 as this shows too, opposite of what you wrote in your commit message.
In addition, str()
encloses the bytes to show that they are bytes: b'<bytes>'
(like shown above)"
Also, str()
by its very nature of not having a charset-encoding to work with, cannot do decode.
Besides, there is no such thing as "decode bytes into unicode" that you mention in Python3.
unicode
, as a data type only exists in Python2. You might have wanted to say str
, but as shown, getting a repr-like string is not at all something like decoding to a Unicode data type.
At the end, you get a string, but it's not like the Unicode of 错 but you get: b'\xe9\x94\x99'
So, the change you make is only needed with None
and has nothing to do with encoding, like you imply in your dismissal of @Vincent-lau'd review question that did not answer what he asked.
@Vincent-lau: BTW to answer your question, name is set here
name = get_alarm_config("name", str)
Where get_alarm_config is defined as:
def get_alarm_config(tag, cast):
try:
return cast(
config_xmldoc.getElementsByTagName(tag)[0].getAttribute("value")
)
except:
return None
Thus, it already does str(config_xmldoc.getElementsByTagName(tag)[0].getAttribute("value")
with a catch around it that sets to None
on any Exception
.
Therefore, the obvious fix, not related to this Python3 migration, would be to check name
to be set (not None) which would have been:
- elif re.match("sr_io_throughput_total_[0-9a-f]{8}$", str(name)):
vs
+ elif name and re.match("sr_io_throughput_total_[0-9a-f]{8}$", name):
Code should be obvious to read, and improvements should make the code more readable.
They should not obfuscate what you really need to do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, str() by its very nature of not having a charset-encoding to work with, cannot do decode.
@bernhardkaindl Thank you for explaining it, and I will remove the commit message that would cause confusion for native speakers. I realized the expression "the str() function in Python 3 is used to decode bytes into unicode." is incorrect, especially the word decode
.
Meanwhile, I 'd like to express my understanding as accurately as possible here.
From the official document https://docs.python.org/release/3.0.1/whatsnew/3.0.html#text-vs-data-instead-of-unicode-vs-8-bit, we can see the following.
All text is Unicode; however encoded Unicode is represented as binary data. The type used to hold text is [str](https://docs.python.org/release/3.0.1/library/functions.html#str),
According to All text is Unicode
, the 'unicode' I mentioned is actually what you referred to as 'string' in your statement.
It gives you a string representation of the bytes, similar to what a repr would do:
For example, a
is bytes, str(a)
convert (I shouldn't say decode
before) a
to the text which can be encoded further.
>>> a=b'\xe9\x94\x99'
>>> type(a)
<class 'bytes'>
>>> b=str(a)
>>> type(b)
<class 'str'>
>>> b
"b'\\xe9\\x94\\x99'"
>>> b.encode()
b"b'\\xe9\\x94\\x99'"
Also, str() by its very nature of not having a charset-encoding to work with, cannot do decode.
Besides, there is no such thing as "decode bytes into unicode" that you mention in Python3.
In fact, the str()
has two additional parameters.
If passing a bytes object to str() without the encoding or errors arguments falls under the first case of returning the informal string representation.
If at least one of encoding or errors is given, object should be a bytes-like object (e.g. bytes or bytearray). In this case, if object is a bytes object, then str(bytes, encoding, errors)
is equivalent to bytes.decode(encoding, errors).
Python 3.6.8 (default, Jan 5 2024, 10:43:44)
[GCC 8.5.0 20210514 (Red Hat 8.5.0-21)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> a=b'\xe9\x94\x99'
>>> str(a, encoding='utf-8')
'错'
Code should be obvious to read, and improvements should make the code more readable.
They should not obfuscate what you really need to do.
you're right, I will update this commit as your suggestion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although I would personally put this check at the top, something like
if name is None:
etg=None
if name == ....
but I am happy for you to leave it as it is now.
Thanks @bernhardkaindl for the thorough explanation!
@psafont base on above changes and testing, may I have your further review and approval? |
Tests on the newest commits in the master branch now run static analysis on python modules. Please rebase the branch to check whether something was missed in the porting |
2eb4e6a
to
60fc7e8
Compare
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## feature/py3 #5400 +/- ##
============================================
Coverage 49.07% 49.07%
============================================
Files 18 18
Lines 2319 2319
============================================
Hits 1138 1138
Misses 1181 1181
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
Noted, since we create a feature branch, I need to wait @liulinC sync it with master first. |
549c2c3
to
c7a00ed
Compare
48b046d
to
3c4ceb8
Compare
@acefei The RP LGTM, do not forgot to squash your commits before merge. |
Rebase the feature branch and rebuild the package. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This review question by @Vincent-lau has not been resolved: #5400 (comment)
This legitimate question by @Vincent-lau need to be resolved before the merge because we should not commit changes that are not understood correctly. In this case, there appears to be some misunderstanding that should be resolved first.
The rules on reviews from the Xapi Toolstack Community of Practice and the System Quality Community of Practice say:
Never be tempted by time pressure to merge a PR before has had enough review. Quality first!
8fc455a
to
ed25899
Compare
The code has been updated and tested again as commit changed.
@bernhardkaindl Could you please take a look the pytype issue?
In my PR, there have been no changes on scripts/hfx_filename, scripts/perfmon. |
I think the last commit: ed25899 is not needed, when a script is changed, it will automatically be removed from the |
Signed-off-by: Fei Su <fei.su@cloud.com>
Signed-off-by: Fei Su <fei.su@cloud.com>
Signed-off-by: Fei Su <fei.su@cloud.com>
Signed-off-by: Fei Su <fei.su@cloud.com>
Signed-off-by: Fei Su <fei.su@cloud.com>
e22ef26
to
f42f23f
Compare
@acefei , sorry, I miss understood the logic in CI, the changed script: "scripts/mail-alarm" is expected to be removed from "expected_to_fail" in file "pyproject.toml". |
…l in pyproject.toml and make sure it passes pytype checks Signed-off-by: Fei Su <fei.su@cloud.com>
Hi @bernhardkaindl, just a friendly reminder that I had already made the changes as you requested two weeks ago, which should be enough time to conduct a thorough review. Could you please share any further feedback? |
Hi @bernhardkaindl, |
The changes are missing tests, this means that they have no coverage. We can merge this as is, but for merging to master they will be required. |
I fully agreed to enhance the Python 3 code coverage in this project. Regarding the present PR, I am ensuring that the following meet the merging requirements: |
The coverage report seems broken, it doesn't detect any changes in |
Looking at the testrun in the CI there are no coverage reports for that file even locally, so it is not an upload/codecov.io bug. Looks like the test does some |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on clarifications on planning the work with @psafont and Diego,
I'm approving this PR.
Python3-only can be moved to the new xen-api/python3/ directory structure
- This will cleanup the directory structure
- The new python3 directory structure allows to separate legacy code and new reviewed code.
Auto Test Passed with ring3hostops.seq 3918348, ring3 bvt+bst 194002
Manual Test with three languages.