
refactor to separate getting info and creating config files #264


Merged
merged 3 commits into from
Dec 22, 2024
106 changes: 48 additions & 58 deletions doc/source/examples/tools_example.rst
Expand Up @@ -13,11 +13,11 @@ Automatically Capture User Info
One task we would like to do is to capture and propagate useful metadata that describes the diffraction data.
Some is essential such as wavelength and radiation type. Other metadata is useful such as information about the
sample, co-workers and so on. However, one of the most important bits of information is the name of the data owner.
For example, in ``DiffractionObjects`` this is stored in the ``metadata`` dictionary as ``username``, ``user_email``,
and ``user_orcid``.
For example, in ``DiffractionObjects`` this is stored in the ``metadata`` dictionary as ``owner_name``, ``owner_email``,
and ``owner_orcid``.

To reduce experimenter overhead when collecting this information, we have developed an infrastructure that helps
to capture this information automatically when you are using `DiffractionObjects` and other diffpy tools.
to capture this information automatically when you are using ``DiffractionObjects`` and other diffpy tools.
You may also reuse this infrastructure for your own projects using tools in this tutorial.

This example will demonstrate how ``diffpy.utils`` allows us to conveniently load and manage user and package information.
Expand All @@ -28,8 +28,9 @@ Load user info into your program

To use this functionality in your own code make use of the ``get_user_info`` function in
``diffpy.utils.tools`` which will search for information about the user, parse it, and return
it in a dictionary object e.g. if the user is "Jane Doe" with email "janedoe@gmail.com" and the
function can find the information, if you type this
it in a dictionary object. For example, if the user is "Jane Doe" with email "janedoe@gmail.com" and ORCID
"0000-0000-0000-0000", and the function can find the information (more on this below),
then if you type this

.. code-block:: python

Expand All @@ -40,16 +41,17 @@ The function will return

.. code-block:: python

{"email": "janedoe@email.com", "username": "Jane Doe"}
{"owner_email": "janedoe@gmail.com", "owner_name": "Jane Doe", "owner_orcid": "0000-0000-0000-0000"}
Contributor

i like the term "owner" - and orcid since that's a gateway for more info



Where does ``get_user_info()`` get the user information from?
-------------------------------------------------------------

The function will first attempt to load the information from configuration files with the name ``diffpyconfig.json``
on your hard-drive.
It looks first for the file in the current working directory. If it cannot find it there it will look
user's home, i.e., login, directory. To find this directory, open a terminal and a unix or mac system type ::
It looks for files in the current working directory and in the computer-user's home (i.e., login) directory.
For example, it might be ``C:/Users/yourname`` or something similar, but to find this directory, open
Contributor

@bobleesj bobleesj Dec 22, 2024

much clearer since there is an example provided since i wasn't sure what "login" meant here.

a terminal and, on a Unix or Mac system, type ::

cd ~
pwd
Expand All @@ -58,67 +60,55 @@ Or type ``Echo $HOME``. On a Windows computer ::

echo %USERPROFILE%

It is also possible to override the values in the config files at run-time by passing values directly into
``get_user_info``, for example,
``get_user_info(owner_name="Janet Doe", owner_email="janetdoe@email.com", owner_orcid="1111-1111-1111-1111")``.
The information to pass into ``get_user_info`` could be entered by a user through a command-line interface
or a GUI.
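
The precedence described above (home-directory config, then local config, then run-time values) can be illustrated with plain dictionaries. This is a minimal sketch, not the ``diffpy.utils`` implementation: ``merge_user_info`` is a hypothetical helper that takes already-loaded config dictionaries instead of reading ``diffpyconfig.json`` files, and the sample values are made up.

```python
def merge_user_info(global_config, local_config, owner_name=None, owner_email=None, owner_orcid=None):
    """Hypothetical helper: combine configs so that run-time values win over local, and local over global."""
    runtime_info = {"owner_name": owner_name, "owner_email": owner_email, "owner_orcid": owner_orcid}
    # Drop run-time values that were not supplied (None or empty string)
    runtime_info = {key: value for key, value in runtime_info.items() if value not in (None, "")}
    user_info = dict(global_config)  # copy so the caller's dictionary is not modified
    user_info.update(local_config)   # local (working-directory) config overrides global
    user_info.update(runtime_info)   # run-time values override both
    return user_info


global_config = {
    "owner_name": "Jane Doe",
    "owner_email": "janedoe@email.com",
    "owner_orcid": "0000-0000-0000-0000",
}
local_config = {"owner_email": "jane@project.org"}  # assumed project-specific override

result = merge_user_info(global_config, local_config, owner_name="Janet Doe")
print(result)
```

Here the name comes from the run-time argument, the email from the local config, and the ORCID from the global config.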

What if no config files exist yet?
-----------------------------------

If no configuration files can be found, the function attempts to create one in the user's home
directory. The function will pause execution and ask for a user-response to enter the information.
It will then write the config file in the user's home directory.

In this way, the next, and subsequent times the program is run, it will no longer have to prompt the user
as it will successfully find the new config file.

Getting user data with no config files and with no interruption of execution
----------------------------------------------------------------------------
If no configuration files can be found, they can be created using a text editor, or by using a diffpy tool
called ``check_and_build_global_config()`` which, if no global config file can be found, prompts the user for the
Contributor

clear - when no config is found, this function is called.

information and then writes the config file in the user's home directory.

If you would like get run ``get_user_data()`` but without execution interruption even if it cannot find
an input file, type
When building an application where you want to capture data-owner information, we recommend you execute
``check_and_build_global_config()`` first followed by ``get_user_info`` in your app workflow. E.g.,

.. code-block:: python

user_data = get_user_data(skip_config_creation=True)

Passing user information directly to ``get_user_data()``
--------------------------------------------------------

It can be passed user information which fully or partially overrides looking in config files
For example, in this way it would be possible to pass in information
that is entered through a gui or command line interface. E.g.,

.. code-block:: python

new_user_info = get_user_info({"username": "new_username", "email": "new@example.com"})

This returns ``{"username": "new_username", "email": "new@example.com"}`` (and so, effectively, does nothing)
However, You can update only the username or email individually, for example

.. code-block:: python

new_user_info = get_user_info({"username": new_username})

will return ``{"username": "new_username", "email": "janedoe@gmail.com"}``
if it found ``janedoe@gmail.com`` as the email in the config file.
Similarly, you can update only the email in the returned dictionary,

.. code-block:: python

new_user_info = get_user_info({"email": new@email.com})

which will return ``{"username": "Jane Doe", "email": "new@email.com"}``
if it found ``Jane Doe`` as the user in the config file.

I entered the wrong information in my config file so it always loads incorrect information
------------------------------------------------------------------------------------------

You can use of the above methods to temporarily override the incorrect information in your
global config file. However, it is easy to fix this simply by editing that file using a text
    from diffpy.utils.tools import check_and_build_global_config, get_user_info
    from datetime import datetime
    import json

    def my_cool_data_enhancer_app_main(data, filepath):
        check_and_build_global_config()
        metadata_enhanced_data = get_user_info()
        # isoformat() gives a string, since json cannot serialize datetime objects
        metadata_enhanced_data.update({"creation_time": datetime.now().isoformat(),
                                       "data": data})
        with open(filepath, "w") as f:
            json.dump(metadata_enhanced_data, f)

``check_and_build_global_config()`` only
interrupts execution if it can't find a valid config file, and so if the user enters valid information
it will only run once. However, if you want to bypass this behavior,
``check_and_build_global_config()`` takes an optional boolean ``skip_config_creation`` parameter that
could be set to ``True`` at runtime to override the config creation.
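
To make the "prompt once, then stay silent" behavior concrete, here is a rough stand-in written from the description above. It is not the ``diffpy.utils`` code: ``build_config_if_missing``, its ``home_dir`` and ``ask`` parameters, and the prompt wording are all assumptions made for illustration, and it only checks that the file exists rather than whether it is valid. Only the file name ``diffpyconfig.json``, the three field names, and the ``skip_config_creation`` flag come from the text.

```python
import json
import tempfile
from pathlib import Path

CONFIG_NAME = "diffpyconfig.json"
FIELDS = ("owner_name", "owner_email", "owner_orcid")


def build_config_if_missing(home_dir, ask=input, skip_config_creation=False):
    """Hypothetical stand-in: create the config only if it is missing and creation is not skipped.

    Returns True if a file was written, False otherwise.
    """
    config_path = Path(home_dir) / CONFIG_NAME
    if skip_config_creation or config_path.is_file():
        return False
    answers = {field: ask(f"Please enter {field}: ") for field in FIELDS}
    with open(config_path, "w") as f:
        json.dump(answers, f)
    return True


# Canned answers stand in for a user typing at the prompt
canned = {"owner_name": "Jane Doe", "owner_email": "janedoe@email.com", "owner_orcid": "0000-0000-0000-0000"}


def fake_ask(prompt):
    return next(value for key, value in canned.items() if key in prompt)


with tempfile.TemporaryDirectory() as tmp:
    skipped = build_config_if_missing(tmp, ask=fake_ask, skip_config_creation=True)
    exists_after_skip = (Path(tmp) / CONFIG_NAME).is_file()
    first_run = build_config_if_missing(tmp, ask=fake_ask)   # prompts and writes the file
    second_run = build_config_if_missing(tmp, ask=fake_ask)  # file found, no prompt
    saved = json.loads((Path(tmp) / CONFIG_NAME).read_text())
```

Passing the prompt function as a parameter is a design choice of this sketch; it lets a non-interactive job (or a test) avoid blocking on ``input``.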

I entered the wrong information in my config file so it always loads incorrect information, how do I fix that?
Contributor

Clear on how to reset config

--------------------------------------------------------------------------------------------------------------

It is easy to fix this simply by deleting the global and/or local config files, which will allow
you to re-enter the information during the ``check_and_build_global_config()`` initialization
workflow. You can also simply editi the ``diffpyconfig.json`` file directly using a text
Contributor

typo - edit

editor.

Locate the file ``diffpyconfig.json`` in your home directory and open it in an editor ::

{
"username": "John Doe",
"email": "john.doe@example.com"
"owner_name": "John Doe",
"owner_email": "john.doe@example.com",
"owner_orcid": "0000-0000-4321-1234"
}

Then edit the values as needed, and make sure to save your edits.
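
If you would rather correct the file from Python than in a text editor, the standard ``json`` module is enough. The sketch below is an illustration only: ``update_config`` is a hypothetical helper, not part of ``diffpy.utils``, and the demonstration runs against a temporary directory instead of your real home directory.

```python
import json
import tempfile
from pathlib import Path


def update_config(config_path, **changes):
    """Hypothetical helper: load a JSON config, apply changes, and write it back."""
    config_path = Path(config_path)
    config = json.loads(config_path.read_text()) if config_path.is_file() else {}
    config.update(changes)
    config_path.write_text(json.dumps(config, indent=2))
    return config


with tempfile.TemporaryDirectory() as tmp:
    config_path = Path(tmp) / "diffpyconfig.json"
    original = {
        "owner_name": "John Doe",
        "owner_email": "john.doe@example.com",
        "owner_orcid": "0000-0000-4321-1234",
    }
    config_path.write_text(json.dumps(original))

    updated = update_config(config_path, owner_email="j.doe@example.com")
    reloaded = json.loads(config_path.read_text())
```

A hand-edited file with a syntax slip, such as a missing comma between entries, makes ``json.loads`` raise ``json.JSONDecodeError``, so reloading the file after editing is a quick way to check it.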
Expand Down
24 changes: 24 additions & 0 deletions news/userinfo.rst
@@ -0,0 +1,24 @@
**Added:**

* <news item>

**Changed:**

* Refactor get_user_info to separate the tasks of getting the info from config files
and creating config files when they are missing.

**Deprecated:**

* <news item>

**Removed:**

* <news item>

**Fixed:**

* <news item>

**Security:**

* <news item>
80 changes: 47 additions & 33 deletions src/diffpy/utils/tools.py
@@ -1,6 +1,5 @@
import importlib.metadata
import json
import os
from copy import copy
from pathlib import Path

Expand Down Expand Up @@ -56,7 +55,7 @@ def load_config(file_path):
Returns
-------
dict:
The configuration dictionary or None if file does not exist.
The configuration dictionary or {} if the config file does not exist.

"""
config_file = Path(file_path).resolve()
Expand All @@ -65,7 +64,7 @@ def load_config(file_path):
            config = json.load(f)
        return config
    else:
        return None
        return {}
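
The switch from ``None`` to ``{}`` for a missing file matters to callers that merge configs: an empty dictionary can be passed straight to ``dict.update`` with no ``None`` check. A small sketch of that pattern follows; ``load_config_sketch`` is a simplified stand-in written for this illustration, not the function in the diff.

```python
import json
import tempfile
from pathlib import Path


def load_config_sketch(file_path):
    """Simplified stand-in: return the parsed JSON, or {} if the file does not exist."""
    config_file = Path(file_path).resolve()
    if config_file.is_file():
        with open(config_file, "r") as f:
            return json.load(f)
    return {}


with tempfile.TemporaryDirectory() as tmp:
    missing = load_config_sketch(Path(tmp) / "diffpyconfig.json")  # no such file yet

    merged = {"owner_name": "Jane Doe"}
    merged.update(missing)  # works because missing is {}; update(None) would raise TypeError
```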


def _sorted_merge(*dicts):
Contributor

I wonder do we want to make some of the functions private (like load_config)?

Contributor Author

yes, this should probably be private, I agree.

Can we also check? I don't think we need _sorted_dict() any more so that could be deleted on a new PR.

Contributor

Yes, I can fix these on a new PR, once that updater workflow PR is merged

Expand All @@ -91,47 +90,62 @@ def _create_global_config(args):
return return_bool


def get_user_info(args=None):
def get_user_info(owner_name=None, owner_email=None, owner_orcid=None):
Contributor

much better - instead of args, this line explicitly communicates what info can be optionally provided

"""
Get username and email configuration.

First attempts to load config file from global and local paths.
If neither exists, creates a global config file.
It prioritizes values from args, then local, then global.
Removes invalid global config file if creation is needed, replacing it with empty username and email.
Get name, email and orcid of the owner/user from various sources and return it as a metadata dictionary.

The function looks for the information in json format configuration files with the name 'diffpyconfig.json'.
These can be in the user's home directory and in the current working directory. The information in the
config files is combined, with the local config overriding the home-directory one. Values for
owner_name, owner_email, and owner_orcid may be passed in to the function and these override the values
in the config files.

A template for the config file is below. Create a text file called 'diffpyconfig.json' in your home directory
and copy-paste the template into it, editing it with your real information.
{
"owner_name": "<your name as you would like it stored with your data>",
"owner_email": "<your_associated_email>@email.com",
"owner_orcid": "<your_associated_orcid if you would like this stored with your data>"
}
You may also store any other global-level information that you would like associated with your
diffraction data in this file.

Parameters
----------
args argparse.Namespace
The arguments from the parser, default is None.
owner_name: string, optional, default is the value stored in the global or local config file.
Contributor

Ah.. this gives me joy

The name of the user who will show as owner in the metadata that is stored with the data
owner_email: string, optional, default is the value stored in the global or local config file.
The email of the user/owner
owner_name: string, optional, default is the value stored in the global or local config file.
Contributor

typo - owner_orcid

Contributor Author

thanks, good catch.

The ORCID id of the user/owner

Returns
-------
dict or None:
The dictionary containing username and email with corresponding values.
dict:
The dictionary containing username, email and orcid of the user/owner, and any other information
stored in the global or local config files.

"""
    config_bool = True
    runtime_info = {"owner_name": owner_name, "owner_email": owner_email, "owner_orcid": owner_orcid}
    for key, value in copy(runtime_info).items():
        if value is None or value == "":
            del runtime_info[key]
    global_config = load_config(Path().home() / "diffpyconfig.json")
    local_config = load_config(Path().cwd() / "diffpyconfig.json")
    if global_config is None and local_config is None:
        print(
            "No global configuration file was found containing "
            "information about the user to associate with the data.\n"
            "By following the prompts below you can add your name and email to this file on the current "
            "computer and your name will be automatically associated with subsequent diffpy data by default.\n"
            "This is not recommended on a shared or public computer. "
            "You will only have to do that once.\n"
            "For more information, please refer to www.diffpy.org/diffpy.utils/examples/toolsexample.html"
        )
        config_bool = _create_global_config(args)
        global_config = load_config(Path().home() / "diffpyconfig.json")
    config = _sorted_merge(clean_dict(global_config), clean_dict(local_config), clean_dict(args))
    if config_bool is False:
        os.remove(Path().home() / "diffpyconfig.json")
        config = {"username": "", "email": ""}

    return config
    # if global_config is None and local_config is None:
    # print(
    # "No global configuration file was found containing "
    # "information about the user to associate with the data.\n"
    # "By following the prompts below you can add your name and email to this file on the current "
    # "computer and your name will be automatically associated with subsequent diffpy data by default.\n"
    # "This is not recommended on a shared or public computer. "
    # "You will only have to do that once.\n"
    # "For more information, please refer to www.diffpy.org/diffpy.utils/examples/toolsexample.html"
    # )
    user_info = global_config
    user_info.update(local_config)
    user_info.update(runtime_info)
    return user_info


def get_package_info(package_names, metadata=None):
Expand Down
12 changes: 7 additions & 5 deletions tests/conftest.py
Expand Up @@ -12,14 +12,16 @@ def user_filesystem(tmp_path):
    base_dir = Path(tmp_path)
    home_dir = base_dir / "home_dir"
    home_dir.mkdir(parents=True, exist_ok=True)
    cwd_dir = base_dir / "cwd_dir"
    cwd_dir = home_dir / "cwd_dir"
    cwd_dir.mkdir(parents=True, exist_ok=True)

    home_config_data = {"username": "home_username", "email": "home@email.com"}
    home_config_data = {
        "owner_name": "home_ownername",
        "owner_email": "home@email.com",
        "owner_orcid": "home_orcid",
    }
    with open(home_dir / "diffpyconfig.json", "w") as f:
        json.dump(home_config_data, f)

    yield tmp_path
    yield home_dir, cwd_dir
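
Because the fixture now yields both directories, a test can point the home and working directories at them and check which config wins. The sketch below shows one way that could look using only the standard library. It is an assumption about how such a test might be written, not the project's actual test: ``read_owner_info`` is a hypothetical stand-in for ``get_user_info``, the ``cwd_ownername`` value is made up, and ``unittest.mock`` is used here instead of any particular pytest patching helper.

```python
import json
import tempfile
from pathlib import Path
from unittest import mock


def read_owner_info():
    """Hypothetical stand-in: read the home config first, then let the cwd config override it."""
    info = {}
    for directory in (Path.home(), Path.cwd()):
        config_file = directory / "diffpyconfig.json"
        if config_file.is_file():
            info.update(json.loads(config_file.read_text()))
    return info


with tempfile.TemporaryDirectory() as tmp:
    # Same layout as the fixture: the working directory lives inside the fake home directory
    home_dir = Path(tmp) / "home_dir"
    cwd_dir = home_dir / "cwd_dir"
    cwd_dir.mkdir(parents=True, exist_ok=True)

    home_config_data = {
        "owner_name": "home_ownername",
        "owner_email": "home@email.com",
        "owner_orcid": "home_orcid",
    }
    (home_dir / "diffpyconfig.json").write_text(json.dumps(home_config_data))
    (cwd_dir / "diffpyconfig.json").write_text(json.dumps({"owner_name": "cwd_ownername"}))

    # Redirect Path.home() and Path.cwd() to the temporary directories for the duration of the call
    with mock.patch.object(Path, "home", return_value=home_dir), \
            mock.patch.object(Path, "cwd", return_value=cwd_dir):
        info = read_owner_info()
```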


@pytest.fixture
Expand Down