
[WIP] Price-taker framework for DISPATCHES #1201

Closed
wants to merge 19 commits

Conversation

adam-a-a
Contributor

@adam-a-a adam-a-a commented Jun 8, 2023

Fixes

Summary/Motivation:

Changes proposed in this PR:

Legal Acknowledgement

By contributing to this software project, I agree to the following terms and conditions for my contribution:

  1. I agree my contributions are submitted under the license terms described in the LICENSE.txt file at the top level of this directory.
  2. I represent I am authorized to make the contributions and grant the license. If my employer has rights to intellectual property that includes these contributions, I represent that I have received permission to make contributions and grant the required license on behalf of that employer.

@adam-a-a adam-a-a added the WIP label Jun 8, 2023
@MarcusHolly
Contributor

@lbianchi-lbl I have two questions:

1.) Are you aware of why the Check Spelling test is failing in some PRs (like this one) but not others?

2.) We'll probably need to add some dependencies for this PR (namely scikit-learn). While this PR is still a WIP and probably won't be merged for a while, we were curious if you had any thoughts on the matter or if there are any potential complications with doing so.

@lbianchi-lbl
Contributor

lbianchi-lbl commented Jun 9, 2023

> @lbianchi-lbl I have two questions:
>
> 1.) Are you aware of why the Check Spelling test is failing in some PRs (like this one) but not others?

Thanks for bringing this to my attention. This seems to be due to a change in the version of the spellchecking tool, and I've created #1203 to track it. It's very likely that CI checks will continue to fail until that is resolved. Update: the spellchecking issues have since been resolved by #1204.

> 2.) We'll probably need to add some dependencies for this PR (namely scikit-learn). While this PR is still a WIP and probably won't be merged for a while, we were curious if you had any thoughts on the matter or if there are any potential complications with doing so.

Starting with #1133, optional dependencies are being managed within extras_require targets in setup.py. For the codebase currently in idaes.apps.grid_integration, the target is called grid. We can discuss more as the PR gets closer to being merged, but you can use gridx-prescient as a reference for how to add e.g. scikit-learn.
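
For context, a minimal sketch of what adding scikit-learn to the grid extras target in setup.py might look like (the entries shown here are illustrative; the actual extras_require block in the repo contains more targets and pins):

    # setup.py (excerpt) -- illustrative sketch only, not the repo's actual file
    from setuptools import setup

    setup(
        # ... name, version, install_requires, etc. ...
        extras_require={
            "grid": [
                "gridx-prescient",  # existing optional dependency for idaes.apps.grid_integration
                "scikit-learn",     # new dependency needed by the price-taker clustering code
            ],
        },
    )

Users would then pull in the optional dependencies with something like pip install "idaes-pse[grid]".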

@adam-a-a
Contributor Author

@lbianchi-lbl it seems the spell checker is identifying FOM as a typo, but it is a relevant acronym. How do we sidestep such issues, other than getting rid of the acronym?

@MarcusHolly
Contributor

> @lbianchi-lbl it seems the spell checker is identifying FOM as a typo, but it is a relevant acronym. How do we sidestep such issues, other than getting rid of the acronym?

Is it fine to just add a few more entries to typos.toml as was done in #1204?

@lbianchi-lbl
Contributor

@adam-a-a @MarcusHolly that's correct. You should be able to add an entry for FOM in typos.toml, using any of the existing domain-specific acronyms and terms as a reference.
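
For reference, a hypothetical entry of that kind might look like the sketch below (the table name follows the usual configuration format of the typos tool; the actual typos.toml in the repo may organize its entries differently):

    # typos.toml (excerpt) -- hypothetical sketch
    [default.extend-words]
    # Accept "FOM" (fixed operating and maintenance cost) as a valid word
    # instead of flagging it as a typo.
    FOM = "FOM"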

@ksbeattie
Member

@adam-a-a this doesn't look like it will make the final DISPATCHES release. Is that ok?

@lbianchi-lbl lbianchi-lbl added the Priority:Normal Normal Priority Issue or PR label Jun 15, 2023
@lbianchi-lbl
Contributor

@radhakrishnatg does this already have a planned completion date? Otherwise, I'll leave it off the August release board while it's still in draft.

# Slice the raw price data into consecutive 24-hour blocks (one per day)
while j <= len(raw_data):
    daily_data[day] = raw_data[i:j].reset_index(drop=True)
    i = j
    j = j + 24
Contributor

@radhakrishnatg radhakrishnatg Jun 16, 2023

Use horizon_length instead of hardcoding 24. If the data vector has excess elements either truncate the data or throw an exception asking the user to check the length of the data vector.
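
A sketch of one way to implement this suggestion (the function name, the raising-vs-truncating policy, and the use of a horizon_length parameter defaulting to 24 are illustrative assumptions, not the PR's actual code):

    def generate_daily_data(raw_data, horizon_length=24):
        """Split a pandas Series/DataFrame of prices into horizon_length-sized blocks."""
        if len(raw_data) % horizon_length != 0:
            # One option: reject the data and ask the user to check its length.
            raise ValueError(
                f"Data length {len(raw_data)} is not an integer multiple of the "
                f"horizon length {horizon_length}; please check the input data."
            )
            # Alternative: truncate the excess elements instead of raising:
            # raw_data = raw_data[: len(raw_data) - len(raw_data) % horizon_length]

        daily_data = {}
        day, i, j = 1, 0, horizon_length
        while j <= len(raw_data):
            daily_data[day] = raw_data[i:j].reset_index(drop=True)
            i, j = j, j + horizon_length
            day += 1
        return daily_data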

return daily_data

@staticmethod
def get_optimal_n_clusters(daily_data, kmin=None, kmax=None, sample_weight=None):
Contributor

Combine get_optimal_n_clusters and get_elbow_plot methods.

Contributor

@radhakrishnatg radhakrishnatg Jun 16, 2023

Add an argument for seed and store it as a hidden attribute. Ensure that all methods which use the KMeans clustering code use the same seed value. Define seed and horizon_length as properties so you can access them through getters and setters.
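
A minimal sketch of the property-based getters and setters being suggested (the class name and default values are illustrative; the actual model class may differ):

    class PriceTakerModel:
        def __init__(self, seed=42, horizon_length=24):
            # Stored as "hidden" attributes; every method that runs KMeans should
            # read self.seed so clustering results are reproducible.
            self._seed = seed
            self._horizon_length = horizon_length

        @property
        def seed(self):
            return self._seed

        @seed.setter
        def seed(self, value):
            self._seed = value

        @property
        def horizon_length(self):
            return self._horizon_length

        @horizon_length.setter
        def horizon_length(self, value):
            if value <= 0:
                raise ValueError("horizon_length must be a positive integer.")
            self._horizon_length = value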

Contributor

@radhakrishnatg radhakrishnatg Jun 16, 2023

Update kmin and kmax to 4 and 30. Add a warning if the optimal number of clusters is close to kmax. Re-run the code with a higher kmax value in that case.
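
A sketch of how those defaults and the proximity warning might look (the silhouette-score criterion is used here only to make the example self-contained; the PR itself uses an elbow-based approach, and the "within two of kmax" threshold is an assumption):

    import logging

    from sklearn.cluster import KMeans
    from sklearn.metrics import silhouette_score

    _logger = logging.getLogger(__name__)

    def get_optimal_n_clusters(daily_data, kmin=None, kmax=None, seed=42):
        """Pick a cluster count for an (n_days, horizon_length) array of price profiles."""
        if kmin is None:
            kmin = 4
        if kmax is None:
            kmax = 30
            _logger.warning("kmax was not set - using a default value of 30.")

        # Score each candidate k and keep the best one.
        scores = {
            k: silhouette_score(
                daily_data, KMeans(n_clusters=k, random_state=seed).fit_predict(daily_data)
            )
            for k in range(kmin, kmax + 1)
        }
        n_clusters = max(scores, key=scores.get)

        # Warn when the optimum sits near the top of the search range, since the
        # true optimum may lie beyond kmax.
        if n_clusters + 2 >= kmax:
            _logger.warning(
                f"Optimal number of clusters ({n_clusters}) is close to kmax ({kmax}); "
                "consider re-running with a larger kmax."
            )
        return n_clusters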

@adam-a-a
Contributor Author

> @adam-a-a this doesn't look like it will make the final DISPATCHES release. Is that ok?

@ksbeattie - Actually, @radhakrishnatg coordinated with us and asked us (including @MarcusHolly and Tyler Jaffe, who should've passed the quiz and needs to be added to the repo) to deploy this. I would think we would want this in the final release, but I don't know what that release date is and have only recently gotten involved with DISPATCHES. I suppose this would be a better question for @radhakrishnatg.

@adowling2
Contributor

This will get merged into the Spring 2024 IDAES release.

@radhakrishnatg
Contributor

@ksbeattie @adam-a-a It's okay if this PR does not make it to the final DISPATCHES release. @lbianchi-lbl We do not have a planned completion date yet. For now, let's not include this PR in the August board.

@codecov

codecov bot commented Jun 21, 2023

Codecov Report

Patch coverage: 36.20% and project coverage change: -0.16% ⚠️

Comparison is base (623c168) 76.83% compared to head (9b5edd4) 76.67%.

❗ Current head 9b5edd4 differs from pull request most recent head 948f40c. Consider uploading reports for the commit 948f40c to get more accurate results.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1201      +/-   ##
==========================================
- Coverage   76.83%   76.67%   -0.16%     
==========================================
  Files         390      392       +2     
  Lines       61852    62089     +237     
  Branches    11386    11429      +43     
==========================================
+ Hits        47523    47609      +86     
- Misses      11867    12013     +146     
- Partials     2462     2467       +5     
Impacted Files                                           Coverage Δ
.../grid_integration/multiperiod/price_taker_model.py   33.33% <33.33%> (ø)
...gration/multiperiod/design_and_operation_models.py   46.34% <46.34%> (ø)
idaes/apps/grid_integration/__init__.py                  100.00% <100.00%> (ø)

... and 6 files with indirect coverage changes


raw_data=raw_data[i], day_list=day_list
)

return daily_data, scenarios
Contributor

I'm sure there's a better way to call scenarios in cluster_lmp_data without having to return it here, but if this is fine as is, we can keep it. It just makes the code a bit clunkier imo.


return daily_data, scenarios

def get_optimal_n_clusters(self, daily_data, kmin=None, kmax=None, plot=False):
Contributor

Adam mentioned that we may want to reconsider having a separate function for plotting the elbow plot rather than just an argument in get_optimal_n_clusters. I don't have a strong opinion either way.


Returns:
    lmp_data = {1: {1: 2, 2: 3, 3: 5}, 2: {1: 2, 2: 3, 3: 5}}
    weights = {1: 45, 2: 56}
Contributor

@MarcusHolly MarcusHolly Jun 22, 2023

Currently the output for weights is in the format {0: {0: 45, 1: 56, ...}}. I assume we want to get rid of the leading 0 so that it matches the format Radha initially put in the doc description, but I'm not sure how to go about doing that. I think it's just the header for the column, similar to how the output for lmp_data is in the format {0: {1: 2, 2: 3, 3: 5}, 1: {1: 2, 2: 3, 3: 5}}, where 0 is the header of the first column representing the first cluster.

On that last note, is it okay if lmp_data and weights start at 0 rather than 1?
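
If the nested {0: {...}} output is coming from something like pandas' DataFrame.to_dict(), the outer key is indeed just the column label, and it can be dropped (and the index shifted to start at 1) along these lines; the variable names below are illustrative:

    import pandas as pd

    # Hypothetical single-column DataFrame of cluster weights
    weights_df = pd.DataFrame({0: [45, 56]})

    # DataFrame.to_dict() keeps the column label as the outer key: {0: {0: 45, 1: 56}}
    nested = weights_df.to_dict()

    # Selecting the single column drops the outer key; shifting the index by one
    # makes the clusters start at 1 instead of 0.
    weights = {idx + 1: val for idx, val in weights_df[0].items()}
    # -> {1: 45, 2: 56}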


assert f"kmax was not set - using a default value of 30." in caplog.text

# TODO: The below test is not working because our data doesn't ever seem to arrive at an n_clusters close to kmax
Contributor

I couldn't get this test working, for the reason stated in the TODO note; I've played around with it a bit without success. It might be the case that we just need to generate a new dataset that actually arrives at an n_clusters close to kmax, so that we can test this logger warning. Or maybe we should just leave this warning untested for now...
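
For what it's worth, a test along those lines could build a synthetic dataset whose optimum is forced next to a small kmax, roughly as sketched below (this reuses the illustrative get_optimal_n_clusters sketch from the earlier review comment, so it is not the PR's actual test):

    import logging

    import numpy as np

    def test_warning_when_optimum_is_near_kmax(caplog):
        # Six well-separated groups of daily profiles, so the selected n_clusters
        # should land at (or near) kmax when kmax is only 6.
        rng = np.random.default_rng(0)
        centers = rng.uniform(0.0, 100.0, size=(6, 24))
        daily_data = np.vstack([c + rng.normal(0.0, 0.1, size=(20, 24)) for c in centers])

        with caplog.at_level(logging.WARNING):
            get_optimal_n_clusters(daily_data, kmin=2, kmax=6)

        assert "close to kmax" in caplog.text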

Comment on lines 1 to +2

 [pytest]
-addopts = --pyargs idaes
-    --durations=100
+addopts = --durations=100
Contributor

Is it fine to push this change? I think the test file will fail without it, since the Excel file could not be imported.

@ksbeattie
Member

@adam-a-a is this still something that will be done (given that DISPATCHES is done)?

@ksbeattie
Member

@djlaky is #1358 replacing this PR? As in, should we close this one?

@djlaky
Contributor

djlaky commented Jun 13, 2024

> @djlaky is #1358 replacing this PR? As in, should we close this one?

Yes, that is correct. This can be closed as #1358 replaces this PR.

@ksbeattie ksbeattie closed this Jun 13, 2024
Labels
DISPATCHES · Priority:Normal (Normal Priority Issue or PR) · WIP