You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I noticed inconsistencies between the pvlib.clearsky.detect_clearsky implementation and the reference it's based on (M. J. Reno and C. W. Hansen, “Identification of periods of clear sky irradiance in time series of GHI measurements,” Renew. Energy, vol. 90, pp. 520–531, 2016.)
The first issue is in the way slopes are calculated. When calculating the slopes, the function implicitly assumes a data frequency of 1 minute (e.g., meas_slope = np.diff(measured[H], n=1, axis=0)). Including the time differences would provide expected behavior and would reflect the equations from the original reference. Furthermore, it may improve the performance of using the default parameters on lower frequency data.
I also believe that the calculation of the final features ("maximum difference between changes in GHI and clear sky time series" from the paper) is incorrectly calculated. The paper shows this feature as the maximum of the absolute element-wise difference between GHI and GHI(cs) slopes for each window (eqn 14). The implementation appears to calculate the difference between the absolute maximum GHI slope and absolute maximum GHI(cs) slope for a given window.
This isn't really a "bug", but tagging it as such seemed like the best option. I'd be happy to resubmit the issue report under a different tag if preferred.
The text was updated successfully, but these errors were encountered:
Concur with the first issue. The equivalent function in PVLib Matlab handles different time intervals, and unequally spaced data, but it's not clever, just a brute force loop. Handling different time intervals and data spacing would be a good enhancement to the pvlib-python function.
On the second point, you are correct that the code doesn't implement the equations in the paper. However, niether does our Matlab function, which calculates the difference in maxima, not the maximum of differences. I'll talk with Matt Reno (my co-author) and sort this out.
Thanks for bringing these issues to our attention.
Looks like we have an error in our PVLib MATLAB function which made it into pvlib-python. The intent was as described in the paper and we should correct the code.
I'm going to close this issue and open two issues to address the points you raise separately; that helps keep the pull requests simple. Your contribution to either is welcomed.
I noticed inconsistencies between the pvlib.clearsky.detect_clearsky implementation and the reference it's based on (M. J. Reno and C. W. Hansen, “Identification of periods of clear sky irradiance in time series of GHI measurements,” Renew. Energy, vol. 90, pp. 520–531, 2016.)
The first issue is in the way slopes are calculated. When calculating the slopes, the function implicitly assumes a data frequency of 1 minute (e.g.,
meas_slope = np.diff(measured[H], n=1, axis=0)
). Including the time differences would provide expected behavior and would reflect the equations from the original reference. Furthermore, it may improve the performance of using the default parameters on lower frequency data.I also believe that the calculation of the final features ("maximum difference between changes in GHI and clear sky time series" from the paper) is incorrectly calculated. The paper shows this feature as the maximum of the absolute element-wise difference between GHI and GHI(cs) slopes for each window (eqn 14). The implementation appears to calculate the difference between the absolute maximum GHI slope and absolute maximum GHI(cs) slope for a given window.
This isn't really a "bug", but tagging it as such seemed like the best option. I'd be happy to resubmit the issue report under a different tag if preferred.
The text was updated successfully, but these errors were encountered: