Replace LAMA with Dask: grouping methods to migrate #295
Comments
Hi Sadie - this is great, just what we need. Thanks. (Is there a line 8 missing from the Python code?)
Thanks @davidhassell, after our conversation I thought it was needed to allow us to organise the work. I have started to add in some basic groupings that may be helpful when thinking about methods with similar requirements and ones to tackle as a group, etc. I will add the properties in as a group too, though I don't think there are that many sadly (since as you mentioned they count for three in the table). Please edit the table as and when you wish to, there is plenty of info and groupings that could be added but I was going to chip away at it as useful rather than doing it all at once. I will ensure to keep it up to date in terms of what methods are in a PR that is open or merged.
Ah yes, good spot, I must have missed it when copying lines from the interactive session. I will add that back in now.
I should add, I am aware you aren't a fan of emojis but the green and red colouring here makes it much easier to process at a glance whether a method is done or not than Booleans or Y/N...
colon smiley colon
Hi @davidhassell, here's the code I have quickly written to grab the total count of completed methods:

    # TODO: copy table as string here, heeding warning below.
    # Make sure table is copied such that the first line of table starts as the
    # first character, with no newline, and the string ends at end of final line.
    # Basically as the line below indicates. (Code relies on that format.)
    table = """<insert the current table here!>"""


    def get_count_of_completed_methods(table_string):
        """Get count (and total) of (un)daskified methods from table of PR 295."""
        lines = table_string.split("\n")
        total_number_methods = len(lines) - 2  # subtract two heading lines
        count = 0
        validate = 0
        for line in lines[2:]:  # skip 2 header lines
            if ":heavy_check_mark:" in line:
                count += 1
            elif ":heavy_multiplication_x:" in line:
                validate += 1
            else:
                print(f"WARNING: POSSIBLY DODGY LINE? CHECK:'{line}'")

        # Check that nothing dodgy has happened, that all lines/methods got covered
        missing_methods = total_number_methods - count - validate
        print("NUMBER OF METHODS UNACCOUNTED FOR IS", missing_methods)

        return count, total_number_methods  # include total for extra info


    results = get_count_of_completed_methods(table)
    print(
        "COUNT OF DASKIFIED METHODS IS {} FROM TOTAL {}.".format(*results)
    )
    print(f"THAT'S A COMPLETION PERCENTAGE OF {results[0]/results[1]*100:.1f}")

The result right now (see comment timestamp later if needed as a reference) is:

    $ python daskification-table.py
    NUMBER OF METHODS UNACCOUNTED FOR IS 0
    COUNT OF DASKIFIED METHODS IS 228 FROM TOTAL 315.
    THAT'S A COMPLETION PERCENTAGE OF 72.4

As you predicted, once I put my
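The script above relies on the table being pasted with no leading or trailing newline. As a rough sketch only (not the script actually used here), a variant that ignores blank lines might look like the following. It assumes the same two header lines and the same `:heavy_check_mark:` / `:heavy_multiplication_x:` markers; the function name `count_statuses` and the three-row example table are made up for illustration.

```python
def count_statuses(table_string, header_lines=2):
    """Return (done, not_done, unrecognised_rows) for a Markdown status table."""
    # Drop blank lines so stray newlines around the pasted table
    # do not inflate the total.
    rows = [line for line in table_string.splitlines() if line.strip()]
    body = rows[header_lines:]

    done = 0
    not_done = 0
    unrecognised = []
    for row in body:
        if ":heavy_check_mark:" in row:
            done += 1
        elif ":heavy_multiplication_x:" in row:
            not_done += 1
        else:
            # Keep the row itself so it can be inspected by hand.
            unrecognised.append(row)
    return done, not_done, unrecognised


# Hypothetical table, deliberately pasted with blank lines around it.
example = """
| Method | Daskified? |
| --- | --- |
| `method_one` | :heavy_check_mark: |
| `method_two` | :heavy_check_mark: |
| `method_three` | :heavy_multiplication_x: |

"""

done, not_done, unrecognised = count_statuses(example)
total = done + not_done + len(unrecognised)
print(f"{done} of {total} daskified ({done / total * 100:.1f}%)")
```

Returning the unrecognised rows, rather than only printing a warning, makes it easier to check by hand which table lines need fixing.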
@davidhassell I've just updated the table to account for #409 and the scores on the doors are now:

    $ python daskification-table.py
    NUMBER OF METHODS UNACCOUNTED FOR IS 0
    COUNT OF DASKIFIED METHODS IS 306 FROM TOTAL 315.
    THAT'S A COMPLETION PERCENTAGE OF 97.1

So we're very close now, as the green to red ratio indicates just from scrolling down the table! Just getting up PRs for
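As a quick sanity check on the two percentages quoted in this thread, the arithmetic can be reproduced from the (completed, total) pairs printed by the script. The helper name `completion_percentage` is just for illustration.

```python
# (completed, total) pairs copied from the two script outputs quoted in this thread.
snapshots = [(228, 315), (306, 315)]


def completion_percentage(completed, total):
    """Percentage of methods completed, rounded to one decimal place."""
    return round(completed / total * 100, 1)


summary = []
for completed, total in snapshots:
    remaining = total - completed
    summary.append((completion_percentage(completed, total), remaining))
    print(f"{completed}/{total}: {summary[-1][0]}% done, {remaining} methods left")
```

On these figures, 87 methods were outstanding at the first count and 9 at the second.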
Data test suite now passes!
Table for #182
(See comment update datetime for author and datestamp of last table update)
Note:
306 methods in total; now 315 methods in total in the table.
Data method name
HDF_chunks
Units
_HDF_chunks
_Units
_YMDhms
__abs__
__add__
__and__
__array__
__bool__
__contains__
__data__
__deepcopy__
__div__
__doc_template__
__docstring_package_depth__
__docstring_substitutions__
__eq__
__float__
__floordiv__
__ge__
__getitem__
__gt__
__hash__
__iadd__
__iand__
__idiv__
__ifloordiv__
__ilshift__
__imod__
__imul__
__init__
__int__
__invert__
__ior__
__ipow__
__irshift__
__isub__
__iter__
__itruediv__
__ixor__
__le__
__len__
__lshift__
__lt__
__mod__
__module__
__mul__
__ne__
__neg__
__new__
__or__
__pos__
__pow__
__query_set__
__query_wi__
__query_wo__
__radd__
__rand__
__rdiv__
__reduce__
__reduce_ex__
__rfloordiv__
__rlshift__
__rmod__
__rmul__
__ror__
__round__
__rpow__
__rrshift__
__rshift__
__rsub__
__rtruediv__
__rxor__
__setitem__
__sub__
__truediv__
__xor__
_all_axes
_all_axis_names
_asdatetime
_asreftime
_atol
_auxiliary_mask
_auxiliary_mask_add_component
_auxiliary_mask_from_1d_indices
_auxiliary_mask_return
_auxiliary_mask_subspace
_auxiliary_mask_tidy
_axes
_binary_operation
_change_axis_names
_chunk_add_partitions
_collapse (dask >= 2022.03.0)
_collapse_create_weights
_collapse_finalise
_collapse_mask
_collapse_optimize_weights
_collapse_subspace
_combined_units
_create_auxiliary_mask_component
_custom
_cyclic
_default
_del_Array
_del_component
_dtype
_equals
_equals_preprocess
_flag_partitions_for_processing
_flip
_get_Array
_get_component
_has_component
_initialise_netcdf
_is_abstract_Array_subclass
_isdatetime
_item
_move_flip_to_partitions
_ndim
_new_axis_identifier
_package
_parse_axes
_parse_indices
_pmaxes
_pmndim
_pmshape
_pmsize
_rtol
_set_Array
_set_CompressedArray
_set_component
_set_partition_matrix
_set_subspace
_shape
_share_lock_files
_share_partitions
_size
_unary_operation
add_partitions
all (dask >= 2022.6.0)
allclose
any (dask >= 2022.6.0)
apply_masking
arccos
arccosh
arcsin
arcsinh
arctan
arctanh
argmax
array (cftime >= 1.6.0)
asdata
binary_mask
ceil
change_calendar
chunks
clip
close
compressed
compressed_array
compute
concatenate
concatenate_data
convolution_filter
copy
cos
cosh
count (dask >= 2022.03.0)
count_masked (dask >= 2022.03.0)
creation_commands
cumsum
cyclic
data
datetime_array (cftime 1.6.0)
datetime_as_string
datum
day
del_calendar
del_fill_value
del_units
diff
digitize
dtarray
dtype
dump
dumpd
dumps
empty
equals
exp
files
fill_value
filled
first_element
fits_in_memory
fits_in_one_chunk_in_memory
flat
flatten
flip
floor
full
func
get_calendar
get_compressed_axes
get_compressed_dimension
get_compression_type
get_count
get_data
get_filenames
get_fill_value
get_index
get_list
get_units
halo (convolution_filter)
hardmask
has_calendar
has_fill_value
has_units
hour
in_memory
insert_dimension
inspect
integral
isclose
is_masked
ispartitioned
isscalar
last_element
loadd
loads
log
mask
mask_fpe
mask_invalid
masked_all
masked_invalid
max
maximum
maximum_absolute_value
mean
mean_absolute_value
mean_of_upper_decile
median
mid_range
min
minimum
minimum_absolute_value
minute
month
nbytes
nc_clear_hdf5_chunksizes
nc_hdf5_chunksizes
nc_set_hdf5_chunksizes
ndim
ndindex
ones
outerproduct
override_calendar
override_units
partition_boundaries
partition_configuration
partitions
percentile
persist
range
rechunk
reconstruct_sectioned_data
reshape
rint
roll
root_mean_square
round
sample_size
save_to_disk
sd
second
second_element
section
set_calendar
set_fill_value
set_units
seterr
shape
sin
sinh
size
source
square
sqrt
squeeze
standard_deviation
std
stats
sum
sum_of_squares
sum_of_weights
sum_of_weights2
swapaxes
tan
tanh
to_dask_array
to_disk
to_memory
tolist
transpose
trunc
uncompress
unique (dask >= 2022.6.0)
var
variance
varray
where
year
zeros
Code to re-generate table
(In case a different form proves useful. Uses the library python-tabulate for ease.)
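The regeneration code itself is not shown in this text. As a rough stand-in, the sketch below builds a GitHub-style pipe table by hand using only the standard library, so it does not use python-tabulate as the original did. The column headers other than 'Group with (if at all)', the method names and the statuses are all placeholders, not values from the real table.

```python
DONE = ":heavy_check_mark:"
NOT_DONE = ":heavy_multiplication_x:"


def make_markdown_table(headers, rows):
    """Build a GitHub-flavoured Markdown pipe table as a single string."""
    lines = [
        "| " + " | ".join(headers) + " |",
        "| " + " | ".join("---" for _ in headers) + " |",
    ]
    for row in rows:
        lines.append("| " + " | ".join(str(cell) for cell in row) + " |")
    # No leading or trailing newline: a header line, a separator line,
    # then one line per method.
    return "\n".join(lines)


# Placeholder rows for illustration only.
headers = ["Data method name", "Daskified?", "Group with (if at all)"]
rows = [
    ("method_one", DONE, "A"),
    ("method_two", DONE, "A"),
    ("method_three", NOT_DONE, ""),
]

table = make_markdown_table(headers, rows)
print(table)
```

The resulting string starts at the first character of the header line and ends at the end of the final row, which is the layout the counting script in the comments expects.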
List of groupings of methods referenced in above table
See the 'Group with (if at all)' column in the table, which is designed to denote methods that are linked in some useful way (say, being similar, so they can be migrated in a set of related PRs). Any groups referenced there should be added here.
Example: name group A for all of the trig. methods, and list 'A' in the 'Group with' column for all such methods, as well as noting A with all methods in it here.
arctan2 is special (as takes two args) and not yet implemented, but now can be.
__add__, __sub__, __mul__, __div__, __floordiv__, __mod__, etc.
__abs__, __pos__, __neg__, __invert__, etc.
__eq__, __ne__, __gt__, __lt__, __ge__, __le__, etc.
__and__, __or__, __xor__, etc.
_rtol, _atol
__lshift__, __ilshift__, __rlshift__, __rshift__, __irshift__, __rrshift__
cfdm
max, mean, _collapse
to_memory
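To illustrate how the groupings could feed the 'Group with' column, here is a minimal sketch that stores groups as a mapping and inverts it to look up the label(s) for a given method. Only the label "A" for the trig. methods comes from the example above. The labels "B" and "C", the exact membership of "A", and the helper name `invert_groups` are assumptions made for illustration.

```python
from collections import defaultdict

# Group label -> method names. Labels "B" and "C" are hypothetical.
groups = {
    "A": ["sin", "cos", "tan", "arcsin", "arccos", "arctan"],
    "B": [
        "__lshift__", "__ilshift__", "__rlshift__",
        "__rshift__", "__irshift__", "__rrshift__",
    ],
    "C": ["_rtol", "_atol"],
}


def invert_groups(groups):
    """Map each method name to the sorted list of group labels containing it."""
    method_to_groups = defaultdict(list)
    for label, methods in groups.items():
        for method in methods:
            method_to_groups[method].append(label)
    return {method: sorted(labels) for method, labels in method_to_groups.items()}


group_with = invert_groups(groups)
print(group_with["sin"])           # labels for a grouped method
print(group_with.get("halo", []))  # a method in no group gives an empty list
```

Keeping the label-to-methods mapping as the single source and deriving the per-method view from it avoids the two listings drifting apart as groups are edited.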