Initial refactoring #119

Open
wants to merge 19 commits into
base: master
from
Open

Conversation

ispielma
Collaborator

@ispielma ispielma commented Feb 24, 2024

This is a real pull request that adds no functionality. Instead it is a first step for any large-scale refactoring of the lyse code base. All that was really done was pulling out the functional code from __main__.py into main.py. Even this is inherently useful because other files / modules can import from 'main.py', but owing to shamefully poor language design __main__.py cannot be imported from. For my larger aggregate pull request, this is why I refactored.

Doing this was slightly nontrivial because __main__.py made liberal use of global variables that had to be transformed to method / function arguments and class variables.

Note that this is not intended to be a complete refactor, but any next steps should be pretty easy with the global scope variables banished.

This was tested both on a Mac and a live laboratory deployment of labscript.

@philipstarkey and @dihm : Is this about the scope you are looking for in a more-easy-to-audit pull request?

@dihm
Contributor

dihm commented Feb 27, 2024

Thanks for getting this broken down! This is definitely the right size and scope. Before doing a full review, I have a few higher level comments.

While I'd normally say to each their own, I don't think the amount of shade thrown on python is (fully) warranted. Now I'm probably equally biased the other way here, but I think some of your judgements fly in the face of standard python conventions and will make this code harder to work on longer term.

  1. Since python has made the choice that __main__.py is the only entry point for a stand-alone module, it actually is excellent design to prevent imports from it because it prevents all but inevitable circular imports. After all, why would something import from __main__.py and not ultimately need to be reimported back for actual use? Only separate API code would satisfy that assumption, but it should be in a separate file anyway.
    • I think wholesale moving everything from __main__.py to main.py is not as much of an improvement as it could be. Really, I'd like to see the GUI code broken up into descriptive modules (ie Analysis Worker stuff in one, dataframe stuff in another, the webserver on its own, etc).
    • Surely some of these items don't actually need to be moved from __main__.py, right? Do you really intend to import Lyse or LyseMainWindow somewhere else? I'd say anything that is only going to be used in __main__.py (long term) should stay there.
  2. This is more of a personal opinion, but I think "avoid globals at all costs" is scaremongering propaganda from the functional programming gang, especially in python. Because everything is an object, a "global" is just a module level attribute and in many cases is functionally equivalent to a class attribute. I personally don't feel a ton of motivation to manually track and pass around a handle to a singleton instance that is only used in a single module that is going to be modified in place via side effects anyway.
    • Ultimately moot since splitting up functionality between files breaks an assumption making this change necessary. Mostly just wanted to caution against casually going against python conventions since they are often conventions for a valid reason. And at the end of the day, PRs are easier to get reviewed and merged safely if we can agree on conventions and instead focus attention on the hard stuff.

@ispielma
Collaborator Author

@dihm In terms of the actionable part of this review I can put the MainWindow back in __main__.py since it is unlikely to be imported. I am more than happy to break the new main.py into smaller files; I didn't do that before because I was trying to keep the commit modest in size.

@ispielma
Collaborator Author

Regarding the other topics I don't want to start a philosophical war.

  1. The inability to import from __main__.py : perhaps this is a Python documentation or error reporting issue. The core problem is that the line from lyse.__main__ import ... from within a .py file in the lyse module doesn't behave as expected, and no error or warning is raised.
  2. My position is that code should be as easy as possible to understand by a 3rd party.
    a. So globals such as I_LOVE_GLOBALS = False are great. The common standard of all caps denotes them as some sort of global and they by convention are defined at the top of a file, making them easy to find.
    b. The problem is hidden globals. For example in __main__.py the global qapplication was defined at the bottom of the script and inside the if __name__ == "__main__": block. This both makes it hard to understand the operation of the individual classes that reference qapplication in isolation without a holistic understanding of the program, and it is a bug because qapplication is not defined in the global scope if __name__ != "__main__".
  3. Thinking of amusing bugs. Run python and type import lyse.__main__ at the prompt. You will get the lyse splash window, but nothing else (because __name__ != "__main__").
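
To make point 2b concrete, here is a minimal sketch of the hidden-global problem. It uses a throwaway module built in memory rather than the lyse code; the names `Widget`, `hidden_global_demo` and the string standing in for the QApplication are all illustrative assumptions.

```python
import types

SOURCE = '''
class Widget:
    def app(self):
        # Relies on a module-level name that is never passed in.
        return qapplication

if __name__ == "__main__":
    qapplication = "QApplication instance"
'''

def load(name):
    """Execute SOURCE as a module called `name` and return it."""
    module = types.ModuleType(name)
    exec(SOURCE, module.__dict__)
    return module

# Run "as a script": the guard fires, so the global exists.
as_script = load("__main__")
script_result = as_script.Widget().app()

# Run "as an import": the guard is skipped, so the global is missing.
as_import = load("hidden_global_demo")
try:
    as_import.Widget().app()
    import_error = None
except NameError as exc:
    import_error = exc
```

The class definition itself succeeds in both cases; the failure only shows up when a method that touches the name is called, which is what makes this kind of global hard to spot.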

@ispielma
Collaborator Author

ispielma commented Feb 28, 2024

Update: I put Lyse and LyseMainWindow back into __main__.py and also fixed the amusing bug (3.) above. In doing so I realized that this combination actually defeats the splash window (because modules are loaded before the splash window opens as I had to move that to the if __name__ == "__main__" code block), so I removed them again. By having __main__.py nearly empty there are no required imports outside of if __name__ == "__main__" so we can be assured that splash does what we expect it to.

@ispielma
Collaborator Author

And split "mostly gui" code into more files. This should pretty much complete the refactoring of what was __main__.py.

Contributor

@dihm dihm left a comment

Thanks for clarifying some things for me. I hadn't looked closely enough to realize how qapplication was defined. I agree that isn't great. It's a good thing to fix here while we are moving things around anyway.

As for import lyse.__main__, while I have little doubt documentation and behavior could be improved here at a language level, I still stipulate that it should never be done. So I wouldn't really classify unusual things happening when one does it as a bug and I don't think we should structure our code to protect against it. It shouldn't be done in the first place. In fact, I see the whole purpose of this PR is to ensure that action never has to happen.

This is relevant to the only real concern I have with the PR, namely that all these imports in __main__.py are decoupled from their usage. It means tooling can't ensure imports are used/missing and therefore ensuring that import list is accurate is a purely manual process (lyse will still work fine even if we add extra imports or take needed ones off). I really don't like it.

I'd much rather move current main.py stuff back into __main__.py and continue to rely on the import side effect to handle the splash. I don't see us losing any functionality, and it conforms to standard conventions so we are less likely to surprise future developers. Unless I'm missing something else, the only thing the current implementation is solving is allowing import lyse.__main__, but I don't think we should support that anyway.

Now, perhaps I am missing something. I am assuming that Lyse and LyseMainWindow do not need to be imported elsewhere for any reason in the future. My imagination may be limited, but I just don't see a valid reason for needing to do that that doesn't also entail a major structural change to how lyse works. And if that is the case, we should discuss it (ideally as another PR/Issue/etc, so this can move along).

Pretty sure my only other meaningful change request is some minor adjustments to the file structure. I think analysis.py is kind of vague given analysis_subprocess.py exists. It also is only used in widgets.py for use within RoutineBox (via an undeclared import, which is incidentally leading to a circular dependency since analysis.py has to import RoutineBox). I'd advocate for having a routines.py file that has RoutineBox and AnalysisRoutine in it. I'd also advocate moving more of the QT widgets from filebox.py into widgets.py (basically any class that only has QT dependencies and no specific lyse logic baked in should move to widgets.py).

Finally, I've noticed there are a few lingering unused imports in the files. Given that I'm asking for some changes here I won't list them all out, but an example is lyse.analysis and qtutils.icons in filebox.py. Annoyingly, my tooling is not catching the lyse.analysis unused import, but maybe that means something subtle is going on. Tread with caution I suppose.

spielman added 2 commits March 3, 2024 06:52
…ere are some other places with interprocess / thread communication is defined and getting that together seems wise. This commit gets a place for it ready.
@ispielma
Collaborator Author

ispielma commented Mar 4, 2024

I am going to decouple some of these comments. First regarding the imports and SplashWindow. The SplashWindow is pure eyecandy that gives the user something to see while Lyse starts. Prior to this pull request pretty much everything that lyse was going to import was also being imported in __main__.py; as a result the SplashWindow could give some indication about the progress of importing all of these imports. By pulling most/all of the code out of __main__.py there were very few imports and the SplashWindow therefore had few updates. My solution was to recover the old behavior by manually importing everything that was imported in __main__.py prior to the refactoring. Personally I don't care about this behavior, but perhaps somebody likes it.

@ispielma
Collaborator Author

ispielma commented Mar 4, 2024

Unfortunately import qtutils.icons needs to be present as it performs some sort of magic that allows the .ui files to reference the fugue icon set without defining the exact path in QtDesigner. I would like to add an explicit option to the ui loader such as

self.ui = loader.load(os.path.join(LYSE_DIR, 'user_interface/main.ui'), LyseMainWindow(self), fugue_icons=True)

to make this behavior explicit and avoid the magical import.

@ispielma
Collaborator Author

ispielma commented Mar 5, 2024

The last set of changes should fully resolve @dihm's points.

Contributor

@dihm dihm left a comment

@ispielma This is really good. Have two very small comments then I suspect it is ready to go.

I would like to stress test it a little more than the dummy shots on my home rig, but I won't have time until later this week. Hopefully that will give @philipstarkey sufficient time to look things over if he wants to weigh in. Otherwise, I think we'll be good to merge by Friday.

lyse/communication.py Outdated (resolved)
lyse/routines.py Outdated (resolved)
…ame_utilities as rangeindex_to_multiindex.
@dihm
Contributor

dihm commented Mar 11, 2024

I also forgot, there is probably a little work that needs to be done on the documentation build to track the new changes. If you wanted to sort those out, I would appreciate it. If not, I'll get them sorted this week and make a PR to your branch with the changes.

@ispielma
Collaborator Author

ispielma commented Mar 11, 2024

I am not that familiar with the documentation system (in terms of what is automatic and what is manual), but I will have a look.... OK so I have it generating the API in a way that is no worse than before, meaning that many functions / classes are not documented, but I think that this should be its own pull request.

Contributor

@dihm dihm left a comment

Found another small issue testing the refactored code in the lab.

lyse/dataframe_utilities.py Outdated (resolved)
@dihm
Contributor

dihm commented Mar 12, 2024

I am not that familiar with the documentation system (in terms of what is automatic and what is manual), but I will have a look.... OK so I have it generating the API in a way that is no worse than before, meaning that many functions / classes are not documented, but I think that this should be its own pull request.

Thanks for sorting that out. Docs are definitely spotty, but at some level most people don't need to see the GUI docstrings anyway. In fact, we should probably consider splitting the lyse autosummary from all the others (ie divide up API and GUI docstrings) just to make it clearer what is actually important.

spielman and others added 3 commits March 13, 2024 09:47
…here and in dataframe_utilities.py to avoid circular dependencies, I moved these into utils.py and renamed them LYSE_PORT and LABCONFIG respectively to denote their role as system wide constants. I also moved LYSE_PATH there as well for consistency, but re-exported it in __init__.py so it will still be accessible when lyse is imported.
@ispielma
Collaborator Author

In working with __init__.py I realized that its exports are somewhat uncontrolled, meaning that everything in the namespace will be imported with import lyse or from lyse import *. To be clear, an example of this is

from lyse.dataframe_utilities import get_series_from_shot as _get_singleshot
from labscript_utils.dict_diff import dict_diff

In this case, when one does import lyse the function dict_diff will be entered into the namespace and lyse.dict_diff will be defined. Clearly the authors knew about this, which is why we see ... as _get_singleshot.

I strongly suggest that I amend this pull request to also create a file lyse_api.py and move almost the whole content of __init__.py there and have __init__.py instead be more like

from lyse.lyse_api import ...
__all__ = [...]

This will make explicit what is being exported and provide better control over the namespace, and the use of __all__ = [...] would define what we want from a * import.
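
The effect of `__all__` on a `*` import can be sketched with two in-memory modules. This is only an illustration: the module names `api_with_all` and `api_without_all` and the function `data` are made up for the example and are not part of lyse.

```python
import sys
import types

SOURCE = '''
import os                          # incidental import
from json import dumps             # incidental import
from json import loads as _loads   # underscore prefix hides it from * imports

def data():
    return {}

__all__ = ["data"]
'''

def make_module(name, source):
    """Build a module from source text and register it so it can be imported."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    sys.modules[name] = module
    return module

make_module("api_with_all", SOURCE)
make_module("api_without_all", SOURCE.replace('__all__ = ["data"]', ""))

with_all = {}
exec("from api_with_all import *", with_all)
without_all = {}
exec("from api_without_all import *", without_all)

# Ignore dunder entries such as __builtins__ that exec adds to the namespace.
exported_with_all = sorted(k for k in with_all if not k.startswith("__"))
exported_without_all = sorted(k for k in without_all if not k.startswith("__"))

print(exported_with_all)      # ['data']
print(exported_without_all)   # ['data', 'dumps', 'os']
```

Note that `__all__` only governs `*` imports. Attribute access on the module object, such as `sys.modules["api_with_all"].os`, still works either way.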

Contributor

@dihm dihm left a comment

As for moving the API to another file, I'm not totally sold on the benefit. We should definitely define a __all__, but we can just do that in place. The incidental imports still show up under import lyse (like sys, os, etc), but I'm less concerned about that in that situation.

In any case, given the docs have recommended from lyse import * for forever, this would be a breaking change, so we should save it for another PR.

lyse/analysis_subprocess.py (resolved)
lyse/analysis_subprocess.py (resolved)
dihm previously approved these changes Mar 18, 2024
Contributor

@dihm dihm left a comment

I'm happy with where this is. I'll give @philipstarkey a little more time to look it over before merging, now that the bugs have been sorted out.

@dihm dihm requested a review from philipstarkey March 27, 2024 06:58
Member

@philipstarkey philipstarkey left a comment

I've just reviewed the changes for now. I'm also hoping to find time to check out the new code and give it a look over to see if there is anything that stands out about the structure (which can be hard to see with all of the noise from the diff).

@@ -10,7 +10,8 @@
 # the project for the full license. #
 # #
 #####################################################################
-"""Lyse analysis API
+"""
+Lyse analysis API
 """

 from lyse.dataframe_utilities import get_series_from_shot as _get_singleshot
Member

We also import lyse.dataframe_utilities explicitly below. Can we simplify some of these lyse import statements so we aren't importing from the same modules in different places?

Comment on lines +13 to +14
"""
Lyse analysis API
Member

I think this newline makes it inconsistent with the other docstring formatting? If any changes were to be made here, I'd suggest a single line """Lyse analysis API""" docstring given how short it is.


# lyse imports
import lyse.dataframe_utilities
import lyse.utils
Member

I am a little bit hesitant about this lyse.utils import. lyse.utils is importing Qt packages. But the lyse package (e.g. this file) can be imported in either the GUI, or the worker process. We are probably tied to a Qt matplotlib backend in the worker process anyway? But it would be nice if we were not, or at least minimally so.

One solution would be to make a utils dir, with three files (__init__.py, gui.py, worker.py - or whatever names you like) so that the utils and imports can be split between only for gui, only for worker, or for both. That allows more specific imports to be made (note that the contents of the init file will be imported if either of the others are so that one should try to stay clean of Qt stuff as well).

lyse.figure_manager.install()

from matplotlib.backends.backend_qt5agg import NavigationToolbar2QT as NavigationToolbar
Member

I'm a little bit nervous about relocating this import to before the figure manager is installed. Any idea if it has consequences?

There is a brief line in figure_manager.py about needing to patch matplotlib before importing pylab. I think maybe this import order dependency needs investigating a little bit more, and then documenting (or we revert the change of import location and log an issue to investigate it later).

"""
import os
import labscript_utils.excepthook
Member

The relocation of this import means that exceptions raised from some imports and the splash screen won't be raised graphically. I think excepthook installation should be as high as it can be.
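
The ordering concern can be sketched with a stand-in hook. This does not use or describe the actual labscript_utils.excepthook module; `graphical_hook`, `startup` and the non-existent module name are hypothetical, and the hook only records messages instead of opening a dialog.

```python
import sys

reported = []

def graphical_hook(exc_type, exc_value, tb):
    # Stand-in for a hook that would open a dialog; here it only records.
    reported.append(f"{exc_type.__name__}: {exc_value}")

def startup(install_hook_first):
    """Simulate start-up with the hook installed before or after a bad import."""
    previous = sys.excepthook
    try:
        if install_hook_first:
            sys.excepthook = graphical_hook
        try:
            import module_that_does_not_exist_for_demo  # fails at import time
        except ImportError:
            # An uncaught error would make the interpreter call sys.excepthook;
            # it is called by hand here so the sketch can keep running.
            sys.excepthook(*sys.exc_info())
        sys.excepthook = graphical_hook  # installing here is too late
    finally:
        sys.excepthook = previous

startup(install_hook_first=False)   # whatever hook was active handles it
seen_when_late = list(reported)
reported.clear()
startup(install_hook_first=True)    # the stand-in hook handles it
seen_when_early = list(reported)
```

Only the second run routes the import failure through the custom hook, which is the argument for installing it as early as possible.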

Comment on lines -30 to -32
splash.update_text('importing h5_lock and h5py')
import labscript_utils.h5_lock
import h5py
Member

Where is the h5_lock import happening now? Until it's imported, anything that uses h5py could access h5 files without locking them, which could lead to corrupt files. Worse, if something imports h5py.File explicitly (e.g. from h5py import File) before h5_lock is installed, then it won't ever get the patched version of h5py. To be honest I'm wondering if it should happen even before numpy.
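
The "won't ever get the patched version" hazard is a general property of rebinding names on a module. A minimal sketch, assuming the patch works by replacing an attribute on the module as the comment implies; `fake_h5`, `install_lock` and `LockedFile` are hypothetical stand-ins, not h5py or h5_lock code:

```python
import types

# Hypothetical stand-in for a library such as h5py; not the real package.
fake_h5 = types.ModuleType("fake_h5")

class File:
    locked = False

fake_h5.File = File

def install_lock(module):
    """Stand-in for a patch-on-import helper: swap in a locking subclass."""
    class LockedFile(module.File):
        locked = True
    module.File = LockedFile

# Equivalent to `from fake_h5 import File` executed BEFORE the patch:
early_File = fake_h5.File

install_lock(fake_h5)

# Equivalent to `from fake_h5 import File` executed AFTER the patch:
late_File = fake_h5.File

print(early_File.locked)  # False: still bound to the original class
print(late_File.locked)   # True
```

Code that keeps writing `fake_h5.File` picks up the patch whenever it is applied, but a name bound earlier with a `from ... import` never does.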

Comment on lines -2340 to -2343
# Start the web server:
splash.update_text('starting analysis server')
server = WebServer(app.port)
splash.update_text('done')
Member

The relocation of the WebServer into the Lyse class, and instantiating it before the GUI is set up, means that any messages received immediately on start could crash lyse when the WebServer.handler method tries to access something about the UI that isn't instantiated yet.

The previous implementation worked such that messages could be received, and placed in appropriate event queues before the Qt loop was even started.

I suspect, given the auto-retry behaviour of BLACS, that there would be scenarios where the crash will occur (it's a bit of a race condition though so may not be easily replicable)
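
One way to picture the queue-first behaviour described above is a handler that never touches the UI directly. This is a rough sketch under that assumption; the `Handler` class and its methods are hypothetical and do not mirror the lyse WebServer API.

```python
import queue

class Handler:
    """Hypothetical message handler; not the lyse WebServer API."""

    def __init__(self):
        self.pending = queue.Queue()   # safe to fill from the server thread
        self.ui = None                 # set once the GUI exists
        self.handled = []

    def handle(self, message):
        # Never touch self.ui here: the GUI may not be built yet.
        self.pending.put(message)
        return "ok"

    def attach_ui(self, ui):
        self.ui = ui

    def drain(self):
        """Called from the GUI side once it is running."""
        while True:
            try:
                message = self.pending.get_nowait()
            except queue.Empty:
                break
            self.handled.append((self.ui, message))

handler = Handler()
handler.handle("shot_0001.h5")     # arrives before the GUI is ready
handler.attach_ui("main window")
handler.handle("shot_0002.h5")
handler.drain()
```

Because `handle` only enqueues, a message that arrives before `attach_ui` cannot fail on a missing UI object; it simply waits until `drain` runs.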

@@ -0,0 +1,125 @@
#####################################################################
# #
# /main.py #
Member

Incorrect filename (there could be other instances of this - please check all the new files)

lyse/utils.py Outdated
Comment on lines 27 to 32
try:
    LABCONFIG = LabConfig(required_params={"ports": ["lyse"]})
    LYSE_PORT = int(LABCONFIG.get('ports', 'lyse'))
except Exception:
    LABCONFIG = None
    LYSE_PORT = 42519
Member

I'm not super comfortable with this. Instantiating a LabConfig just because you import lyse.utils has a bad code smell to it. It also isn't used by anything in this file. Seems like it was just moved to fix some sort of circular dependency issue?

I'm sure there are a few possible solutions. One that may be the simplest is just to move the import of the labconfig inside the function that uses it, so that it can import it from lyse. That's perfectly acceptable to do, and a good way to break circular dependencies if one part only needs something at run time, not load time.
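
A function-level import along those lines might look like the sketch below. It is a variant rather than the exact suggestion: it builds the config at call time instead of importing it from lyse, the function name `get_port` is hypothetical, and the import path `labscript_utils.labconfig` is an assumption. The fallback value comes from the snippet under review.

```python
import functools

DEFAULT_PORT = 42519  # fallback value taken from the snippet above

@functools.lru_cache(maxsize=None)
def get_port():
    """Look up the port on first use instead of at import time."""
    try:
        # Imported here, at call time, so importing this module has no side
        # effects and cannot take part in an import cycle at load time.
        # The import path is an assumption made for this sketch.
        from labscript_utils.labconfig import LabConfig
        config = LabConfig(required_params={"ports": ["lyse"]})
        return int(config.get("ports", "lyse"))
    except Exception:
        return DEFAULT_PORT

port = get_port()
```

The `lru_cache` decorator means the config is read at most once, so callers get the same cost profile as a module-level constant after the first call.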
