R: Create Python wrapper for windrose method used by R-Instat #4

Open
5 tasks
isedwards opened this issue Jul 26, 2020 · 4 comments

@isedwards
Collaborator

isedwards commented Jul 26, 2020

R-Instat uses the Clifro package - see #R-Instat/1908

The windrose method has the following parameterisation:


windrose(speed, direction, facet, n_directions = 12, n_speeds = 5,
         speed_cuts = NA, col_pal = "GnBu",
         ggtheme = c("grey", "gray", "bw", "linedraw", "light", "minimal", "classic"),
         legend_title = "Wind Speed", calm_wind = 0, variable_wind = 990,
         n_col = 1, ...)

Arguments
Data parameters

  • speed: numeric vector of wind speeds.
  • direction: numeric vector of wind directions.
  • facet: character or factor vector of the facets used to plot the various windroses.

Other parameters

  • n_directions: the number of direction bins to plot (petals on the rose). The number of directions
    defaults to 12.
  • n_speeds: the number of equally spaced wind speed bins to plot. This is used if speed_cuts
    is NA (default 5).
  • speed_cuts: numeric vector containing the cut points for the wind speed intervals, or NA
    (default).
  • col_pal: character string indicating the name of the RColorBrewer colour palette to be
    used for plotting, see ’Theme Selection’ below.
  • ggtheme: character string (partially) matching the ggtheme to be used for plotting, see
    ’Theme Selection’ below.
  • legend_title: character string to be used for the legend title.
  • calm_wind: the upper limit for wind speed that is considered calm (default 0).
  • variable_wind: numeric code for variable winds (if applicable).
  • n_col: the number of columns of plots (default 1).
  • ...: further arguments passed to theme.
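To make the binning parameters concrete, here is a minimal Python sketch of one plausible reading of n_directions, n_speeds, calm_wind and variable_wind. It is not the Clifro implementation: the helper name bin_wind, the choice to drop calm and variable observations, and the choice to centre the first petal on north are all assumptions made for illustration.

```python
def bin_wind(speeds, directions, n_directions=12, n_speeds=5,
             calm_wind=0, variable_wind=990):
    """Count observations per (direction bin, speed bin).

    Hypothetical helper for illustration; not part of Clifro or R-Instat.
    """
    # Assumption: calm observations and variable-direction codes are excluded.
    obs = [(s, d) for s, d in zip(speeds, directions)
           if s > calm_wind and d != variable_wind]

    width = 360 / n_directions                       # degrees per petal
    max_speed = max((s for s, _ in obs), default=0)
    speed_step = max_speed / n_speeds if max_speed else 1

    counts = [[0] * n_speeds for _ in range(n_directions)]
    for s, d in obs:
        # Assumption: the first petal is centred on north (0 degrees).
        d_bin = int(((d + width / 2) % 360) // width)
        # Clamp so the maximum speed falls in the last bin.
        s_bin = min(int(s // speed_step), n_speeds - 1)
        counts[d_bin][s_bin] += 1
    return counts
```

Calling bin_wind([0, 3, 10, 5], [0, 90, 350, 990]) keeps two observations: the calm reading and the one coded 990 are dropped, and the remaining two land in the east and north petals.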

Tasks:

  • Create the initial Python process library including an empty function/method for the windrose from r-instat
  • Add an initial Python docstring
  • Create automated test suite with a (failing) image test for r-instat windrose
  • If possible, check characteristics of the plot numerically with an initially failing test
  • Update data parameters and other parameters in line with grammar of graphics: data, processing and presentation

r-instat-windrose

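# Intended import path once the process library exists:
#     from opencdms.process.r_instat import windrose


class ProcessNotImplemented(NotImplementedError):
    """Assumed placeholder exception; the original stub did not define it."""


def windrose(data, other_para):
    """
    What the windrose does

    Arguments
    data:
    other_para:
    """
    raise ProcessNotImplemented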
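The "initially failing test" task could be sketched with the standard library's unittest as below. This is only one possible arrangement: the exception class and the stand-in windrose are repeated so the block runs on its own, the example data layout is hypothetical, and marking the test with expectedFailure (so the suite stays green until the stub is implemented) is a design choice rather than something decided in this issue.

```python
import unittest


class ProcessNotImplemented(NotImplementedError):
    """Assumed placeholder exception; the name is taken from the stub above."""


def windrose(data, other_para):
    """Stand-in for the empty stub above."""
    raise ProcessNotImplemented


class TestWindrose(unittest.TestCase):
    @unittest.expectedFailure
    def test_returns_a_result(self):
        # Errors until windrose is implemented; remove the decorator at that point.
        result = windrose(data={"speed": [3.0], "direction": [90.0]},
                          other_para={})
        self.assertIsNotNone(result)


def run_suite():
    """Run the test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestWindrose)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Once the function returns something, the test reports an unexpected success, which is the prompt to drop the decorator and add real assertions.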
@volloholic

volloholic commented Aug 4, 2020

@isedwards We have now discussed this internally in more depth.
Our main observation is that we found it easier to think through aspects of the grammar when we had a broader set of examples to reflect on.
When we started to consider the sort of PICSA graphs that are needed regularly by African Met offices, we recognised that the API grammar needs more structure, in particular a 'definition' style object.
If, instead of thinking about Data, Process and Presentation, we think about Data, Transformation, Definition and Presentation, a consistency emerges, although our thinking is still raw:

  1. Data is a structure which is provided by the Database.
  2. Transformation is a definition of how that data is summarised or processed to be in the right shape.
  3. Definition is the specification of the graph.
  4. Presentation is like a theme for the visualisation of the graph.

The nice thing about this is that, in the PICSA context, each of the four points might be the responsibility of someone different.
The data would be specific to a given place, the presentation would depend on delivery (through an app, printed on A4, as a poster), the transformation might be per region for something like start of the rains, and the definition might be the same for all PICSA products.
If we get the API right and in place I could easily imagine a process which would enable the pieces of this to be 'assembled' by different actors in such a way as to enable a seamless user experience.
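As a rough illustration of the four-part split described above, the pieces could be modelled as separate objects that are combined only at the last step. All class names, fields and the assemble function here are hypothetical; nothing in this thread fixes an actual API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Data: rows as they might come from the database (assumed shape).
Rows = List[Dict[str, float]]


@dataclass
class Transformation:
    """How raw rows are summarised or reshaped for the graph."""
    apply: Callable[[Rows], Rows]


@dataclass
class Definition:
    """Specification of the graph (hypothetical fields)."""
    kind: str
    x: str
    y: str


@dataclass
class Presentation:
    """Theme-like options that depend on the delivery channel."""
    theme: str = "grey"
    page_size: str = "A4"


def assemble(data: Rows, transformation: Transformation,
             definition: Definition, presentation: Presentation) -> dict:
    """Combine four independently supplied pieces into one plot request."""
    return {
        "rows": transformation.apply(data),
        "definition": definition,
        "presentation": presentation,
    }
```

Because each argument is a separate object, a regional office could supply the Transformation while the Definition is shared across products and the Presentation is chosen per delivery channel.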

The key aspect of this reflection is that, while I think this Windrose example is worthwhile, it is only when we started looking beyond it to a practical use case that the value of creating these grammatical API structures really made sense. In this new way of thinking, I believe the data needed in this example is the same as the data needed in #5; however, the transformation, definition and presentation are different.

@volloholic volloholic reopened this Aug 4, 2020
@lloyddewit
Contributor

@volloholic Thank you for the excellent summary above of the discussion we had with @dannyparsons.

@isedwards I have some extra questions:

  • There are currently several issues open for Windroses (e.g. R: Create Python wrapper for windrose method in Climatol #5). These issues currently specify different APIs for different back-ends. I understand that the longer-term intention is to have a single API that can be implemented by different back-ends. For the Windrose API, will there be a single set of mandatory parameters? Or will some parameters be optional (i.e. backend may ignore them)? Should the backend return a warning informing the calling function about parameters it ignores?
  • For the test cases, we can't compare the returned image against an expected image because different backends will produce different types of image. Do you have any ideas how we should test whether the returned image is correct?
  • For the testing, you mentioned numerical testing of the returned result. Again, it will be difficult to test objects, returned from different backends, against a specific expected result. Is the intention to have different expected results for different backends? Or does this undermine the concept of having a set of tests to ensure that a backend satisfies the API?
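To make the first question concrete, one option (a backend that accepts the full parameter set but warns about what it ignores) could look like the sketch below. The SUPPORTED set and call_backend function are hypothetical; this is not an agreed design, only an illustration using Python's standard warnings module.

```python
import warnings

# Hypothetical: the subset of the common parameters this backend honours.
SUPPORTED = {"speed", "direction", "n_directions"}


def call_backend(**params):
    """Return the parameters the backend will use, warning about the rest."""
    ignored = sorted(set(params) - SUPPORTED)
    if ignored:
        warnings.warn(
            "backend ignored parameters: " + ", ".join(ignored),
            UserWarning,
            stacklevel=2,
        )
    return {name: value for name, value in params.items() if name in SUPPORTED}
```

A caller can then choose to surface, log or escalate the warning (for example with warnings.simplefilter("error")) without the backend having to change.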

If the above was already discussed before then please accept my apologies.

@isedwards
Collaborator Author

Thank you @lloyddewit - thoughts on each of your questions are below...

  1. We decided in the last meeting to move API work into a separate sub-team (there will be a proposal for this shortly). The current technical group will focus on the identified deliverables for the process library. "Crucially, this will not require any changes to R-Instat, Climatol or other external libraries". The focus is simply on exposing existing processes.

  2. Each wind rose process returns an image that is specific to that process. There would then be an automated test for each process. However, we decided a couple of meetings ago not to pursue image tests at this stage. The general aim is to get the team used to implementing automated tests. The fact that the first example chosen returned an image was unfortunate because of the added complexity (although I expect we'll have examples implemented soon).

  3. The automated tests in the Python process library were always intended to test the processes in the process library.

We're burning through our time quite quickly, so the intention is to focus on the basics to ensure deliverables are met. This approach will apply to the web API design as well (building on existing API solutions rather than doing something completely new). However, I'm very happy for breakout groups to push further on new and related topics.

So, what's next...?

@lloyddewit
Copy link
Contributor

@isedwards Thank you for the response, things are much clearer for me now.
