Specification of the current data structure and functionalities provided by WMStats UI #11411

Closed
amaltaro opened this issue Dec 19, 2022 · 13 comments

Comments

@amaltaro
Contributor

Impact of the new feature
WMStats (CouchApp)

Is your feature request related to a problem? Please describe.
Before a new generation of the WMStats CouchApp service is discussed, there is a lot of information that needs to be collected and considered for a future service, such as:

  • current data stored in the database (including required vs likely non-required data)
  • how data gets published to the wmstats database
  • functionalities provided by WMStats (e.g., ACDC creation)
  • APIs used to load data into WMStats
  • data structure of the data loaded into WMStats
  • weak and/or missing functionalities

Describe the solution you'd like
The outcome should be a document (wiki, gdoc, etc) where all the information above is provided.

With this new document, we can then resume discussing the next generation for WMStats, potentially coming back to the initial prototype provided by Valentin in Golang.

Describe alternatives you've considered
None

Additional context
None

@vkuznet
Contributor

vkuznet commented Jan 5, 2023

Here is an initial draft of the notes I have taken to understand WMStats behavior. I'll expand it further and organize it according to this ticket's requirements.

@vkuznet vkuznet moved this from Todo to In Progress in WMCore quarterly developments Jan 5, 2023
@amaltaro
Contributor Author

amaltaro commented Jan 5, 2023

I had a quick look at it and your documentation is looking good, Valentin. Note though that this issue is supposed to focus on the CouchApp layer (Web UI), such that we can decouple it from CouchDB itself.

@vkuznet
Contributor

vkuznet commented Jan 5, 2023

I understand that. The WMStats server provides an API to CouchDB and the data required by the WMStats UI. This data can also be fetched directly from CouchDB, though. So far I'm collecting information for both, as I think it will help us decide how to move forward: either fetch data from the WMStats server, or get it directly from CouchDB (in which case we do not strictly need the WMStats server to implement the WMStats UI).

@amaltaro
Contributor Author

amaltaro commented Jan 5, 2023

I see. However, a service like wmstatsserver is of paramount importance in this design, given that it:

  • provides caching
  • reduces (drastically) the number of calls to CouchDB itself
  • provides proper logging
  • provides proper access and can be used for authz
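To make the caching point concrete, here is a minimal sketch of a time-based cache sitting between the UI and the database. It is not the wmstatsserver implementation: `TTLCache`, `fake_couch_view`, and the 60-second lifetime used in the example are hypothetical names and values chosen only for illustration.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Tiny time-based cache (hypothetical, for illustration only)."""

    def __init__(self, fetch: Callable[[str], Any], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch          # function that really queries the backend
        self._ttl = ttl_seconds      # how long a cached value stays valid
        self._clock = clock          # injectable clock, handy for testing
        self._store: Dict[str, Tuple[float, Any]] = {}
        self.backend_calls = 0       # counts how often the backend was hit

    def get(self, key: str) -> Any:
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]          # still fresh: no backend call
        self.backend_calls += 1
        value = self._fetch(key)
        self._store[key] = (now, value)
        return value


def fake_couch_view(key: str) -> dict:
    """Stand-in for an expensive CouchDB view query (not a real API)."""
    return {"view": key, "rows": []}
```

With `TTLCache(fake_couch_view, ttl_seconds=60.0)`, a thousand `get("requests")` calls within one minute reach the backend once. The trade-off is staleness: clients may see data up to `ttl_seconds` old.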

@vkuznet
Contributor

vkuznet commented Jan 5, 2023

@amaltaro, even though you make fair points, I can't say much about them since I have never seen any benchmarks. In other words:

  • Does CouchDB degrade if we place 1 or 1K calls?
  • Does fetching from CouchDB into a cache bring any benefit for data which has to be fetched anyway, and why can't we use nginx for that and rely on ETag?
  • CouchDB has logging too; what's wrong with it?
  • Are WMStats and CouchDB behind our frontends, which provide auth and groups?

Please do not get me wrong, I'm not against the WMStats server, but simple claims should be supported by some studies, benchmarks, etc., and both approaches should be considered from an operational and maintenance point of view. That is why it is desirable to obtain as much information as we need, and maybe do benchmarks too, in order to make proper design choice(s).

@amaltaro
Contributor Author

amaltaro commented Jan 5, 2023

> Does CouchDB degrade if we place 1 or 1K calls?

Yes, we ran some benchmarks when upgrading CouchDB from 1.6 to 3.1. You can see some of those results in:
https://docs.google.com/spreadsheets/d/1lEoMWfjmwLmT59g10EgwrH_8VW010k9ojGELTUWty_k/edit?usp=sharing

but results are really dependent on many other factors, functionality used, amount of data, average size of documents, etc.

> Does fetching from CouchDB into a cache bring any benefit for data which has to be fetched anyway, and why can't we use nginx for that and rely on ETag?

I understand that the server backend still has to verify whether the requested resource has changed or not, based on the ETag provided by the client. So it's not clear to me that nginx would be enough to decrease the load on the backend. I might be missing something here though.
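The ETag concern can be illustrated with a small sketch of a conditional GET handler. It assumes the validator is a hash of the document content; `make_etag` and `conditional_get` are hypothetical helpers, not part of CouchDB, nginx, or WMCore. The point it shows is that a matching `If-None-Match` saves the response body, but the origin still has to determine the current validator for every request.

```python
import hashlib
import json
from typing import Optional, Tuple


def make_etag(document: dict) -> str:
    """Derive a validator from the document content (an assumption;
    a real server may use a revision identifier instead)."""
    body = json.dumps(document, sort_keys=True).encode("utf-8")
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def conditional_get(document: dict,
                    if_none_match: Optional[str]) -> Tuple[int, Optional[dict], str]:
    """Return (status, body, etag) for a GET with an optional If-None-Match."""
    # The backend still inspects the current state of the resource here.
    etag = make_etag(document)
    if if_none_match == etag:
        return 304, None, etag       # unchanged: no body is sent
    return 200, document, etag       # new or changed: full body is sent
```

A first call with `if_none_match=None` returns status 200 and the ETag; repeating the call with that ETag returns 304 and no body as long as the document is unchanged.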

> CouchDB has logging too; what's wrong with it?

I don't think it is as rich as the ones we have in our web-services at the moment (including DN/user, data in, data out, etc).

> Are WMStats and CouchDB behind our frontends, which provide auth and groups?

Yes, but CouchDB has its own auth/authz model. That is why we used to keep the CMS-specific code on top of the old CouchDB.

@vkuznet
Contributor

vkuznet commented Jan 6, 2023

@amaltaro, I put up more information about WMStats functionality and I think my doc is almost complete. It would be useful if you could re-read it and provide your feedback. Then we can move it into the WMCore wiki (since I used markdown format, it will be a cut-and-paste step).

@amaltaro
Contributor Author

amaltaro commented Jan 6, 2023

It's looking good in general, but before we converge and persist it in a wiki, I think it would be useful to go through this together via Zoom and make some required corrections on the fly (an offline review of that content would be complicated).

@vkuznet
Contributor

vkuznet commented Jan 9, 2023

Sure, I'm fine with a Zoom chat, just let me know when you have time.

@vkuznet
Contributor

vkuznet commented Jan 9, 2023

@amaltaro, I refactored the documentation and put it in the WMCore wiki. When you get time, feel free to explore it here

@amaltaro
Contributor Author

@vkuznet I had a look at the current documentation and it's looking pretty good. I made a few corrections here and there, both on typos and in the content itself. Please have another look at the "details" wiki.

Given that you are on it, I'd suggest merging those wmstats wikis into a single one. For the RESTful APIs, we might have a section that points to a separate wiki (as is already the case).

@vkuznet
Contributor

vkuznet commented Jan 18, 2023

I fixed the wiki, please review it again and hopefully we can converge on this issue and close it.

@amaltaro
Contributor Author

I like how this documentation has been organized, thanks Valentin.
I have added further content, especially to the details page. Please have another look whenever it's convenient to you.

I also collected fresh statistics with the current running conditions, written to the beginning of this section:
https://github.com/dmwm/WMCore/wiki/WMStats-details#weak-andor-missing-functionalities

Other than a deep look into the document types and possible structure, I think we have a good coverage of the service now and I'd be in favor of closing this out. Please reopen it in case you want to further expand on things. Thanks!

@github-project-automation github-project-automation bot moved this from In Progress to Done in WMCore quarterly developments Jan 19, 2023