forked from BurntSushi/nfldb
-
Notifications
You must be signed in to change notification settings - Fork 4
/
Copy pathCHANGELOG
293 lines (228 loc) · 8.78 KB
/
CHANGELOG
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
0.2.17
======
- Fixed nfldb.team.standard_team to recognize LA Rams.
0.2.16
======
- Updates for the 2016 season.
- Includes database migration to add new LA team.
0.2.15
======
- Revert PR #142.
0.2.14
======
- Fix #134 by bumping nflgame dependency.
0.2.11
======
- Add custom override for start time for 2015100400
0.2.10
======
- Fix #80.
0.2.9
=====
- Update nflgame dependency to get playoff games.
0.2.8
=====
- Fix time handling for games starting at 12pm/12:30pm.
- Depend on nflgame >= 1.2.9.
0.2.7
=====
- Depend on nflgame >= 1.2.8 (see nflgame#95).
0.2.6
=====
- Fixed a bug that was preventing `nfldb.aggregate` from working.
0.2.5
=====
- Added a new "simulation" mode to `nfldb-update`. Try it with
`nfldb-update --simulate 2014081501`.
- Added a new `game_date` field to plays that corresponds to the
first 8 characters of the play's game gsis_id. This ends up
translating to the date that the game played, which just so happens
to be how play identifiers are handed out on NFL.com. This makes
sorting plays with respect to *when* they happened simple.
- Refactored some of nfldb-update into its own module so it can
be invoked easier from other programs.
- Better error messages when sorting with invalid field.
0.2.4
=====
Minor refactoring and performance features.
The refactoring is mostly confined to the SQL 'ORDER BY' generator. I
had intended to make it more powerful, but that turned out to be more
work than I expected. Instead, I'll settle for clearer code.
The performance features aren't likely to be useful unless you're doing
heavy analysis or are using nfldb to power a web site. Briefly, several
`fill` functions have been added to nfldb's entity types that will, for
example, fill in a list of plays with drive and game data using as few
queries as possible. Currently, this will fall over with >5000 items
because of PostgreSQL limits, but future work will fix this using some
sort of batching mechanism.
0.2.3
=====
- Fixes #41 (Python 2.6 compatibility).
0.2.2
=====
- Fixes #39 and #42.
- Fixes another bug where LIMIT clauses weren't being inserted by
`nfldb.Query` if there were no sorting criteria.
0.2.1
=====
- Fixes #34.
0.2.0
=====
Materialized view for `play`. `Query` demolition.
ATTN: This introduces a breaking change. The `team` field can no longer
be used in the `play` method. Instead, you should use the new
`play_player` method to select individual player statistics belonging to
a specific team.
Otherwise, there are very few public facing changes, but the entire
guts of `nfldb.Query` have been ripped out and replaced with more
robust SQL generation code. Moreover, several idiosyncracies have been
fixed and some unit tests have finally been added.
1. Previously, the `Query` class was doing some very clever things to do
parts of a JOIN in Python code. The general flow was that filtering
was applied to find primary keys---never using any JOINs---and once
all criteria had been applied, those ids were used in a simple SELECT
to fetch the actual rows.
Now all of that cruft has been removed and replaced with intelligent
SQL generation that constructs one query with all the proper JOINs.
For whatever reason, I thought this was slower when experimenting
with it when I first started nfldb. Perhaps my indexes weren't
configured properly then. In any case, I can't really see much
performance difference.
2. The SQL generation code is very smart. Although it is not part of
nfldb's public API, I imagine it would be very useful if you had some
special needs. See the unexported but documented `nfldb.sql` module.
3. Many idiosyncracies resulting from doing a join in Python are now
completely gone. For example, if you tried to apply a `sort` with a
`limit` with complex search criteria, you were bound to get wrong
answers. For example, if you tried sorting by both a column on the
`week` table (like `down`) and a column on `play_player` (like
`passing_tds`) and applied a limit to it, the results would be
completely wonky because the pure Python join can't cope with it
performantly. A regular SQL join? Piece of cake.
4. I have added a materialized view `agg_play`. This is a fancy word for
"a table that automatically updates itself." In essence, whenever a
new row is added to `play_player`, aggregate statistics for that play
are re-computed. This makes adding data slower (which doesn't happen
very frequently), but it makes querying data much faster and easier.
For example, plays can be queried for `passing_yds` without ever
joining with `play_player`. (Which is wonky because of the
one-to-many relationship.)
To reflect this clearer separation of concerns, the `Query.play`
method will no longer add criteria that hits the `play_player` table.
Instead, if you really want the `play_player` table, then you can use
the new `play_player` method. The only field that was accepted in the
`play` that is no longer allowed is the `team` and `player_id`
fields. This is because there is no sensible way to aggregate these
values into a single play.
To the best of my knowledge, that is the only possible breaking
change here.
0.1.6
=====
- Add better error message when config file can't be found.
0.1.5
=====
- Support hall-of-fame games by allowing week to be `0`.
0.1.4
=====
- Some light code refactoring and doco updates.
0.1.3
=====
- Fix a bug where the start time of a game has the wrong
year if it started the year after the season started.
(e.g., postseason games.)
0.1.2
=====
- nflgame now correctly stores the season of postseason
games, so nfldb no longer needs to fix it.
0.1.1
=====
- Added a fix so that nfldb can work with psycopg2 2.4.x.
0.1.0
=====
- Supports nflgame's new automatic schedule updating.
(nfldb now automatically updates schedules.)
0.0.16
======
- Added functions for guessing the position of a player based on
his statistical categories.
0.0.15
======
- Merged PRs #17 and #18. (Methods for determining score of game
at any time.)
- Clocks can now be manipulated by adding or subtracting seconds.
- Fix bug where the `winner` and `loser` properties of a Game
object weren't being set.
- Added the Soundex algorithm to the player_search function.
0.0.14
======
- The "weight" and "height" columns are now numeric.
This is consistent with the change to nflgame to sanitize height data
so that it is all reported in inches.
- In light of the aforementioned change, nfldb now requires nflgame 1.1.23
or newer.
0.0.13
======
- Dummy release because I messed up. No substantial changes.
0.0.12
======
- Don't depend on a specific version of pytz.
- Change `nfldb-update` batch size so that it has a higher rate of
success in lower memory environments.
0.0.11
======
- Fixed city and team name for New York teams.
- Set the cursor factory at cursor init instead of connection init.
(This lets us use an older version of psycopg2.)
- Added an `is_playing` property to games.
- Expose `orelse` in the query interface, even though its semantics
are confusing and unclear.
- Added a `points` convenience method to compute the number of points
scored in a single player statistic.
- Provided a way to update only game schedules with the `nfldb-update`
script. This is useful when game schedules change midway through
the season. Currently, it must be run manually.
- Don't update the DB with drives/plays if none are reported in the
JSON. (In particular, this indicates corrupt JSON while the game
is ongoing, so ignore it.)
- `standard_team` cannot reasonably resolve "new york". It will now
fail with a helpful assertion error.
0.0.10
======
- Don't import errant drives (e.g., a drive without plays).
0.0.9
=====
- Windows can't handle symlinks in setup.py.
0.0.8
=====
- Invoke nflgame-update-players from nfldb-update in a cross platform
way. e.g., `python -m nflgame.update_players` instead of
`nflgame-update-players`.
0.0.7
=====
- Fixed a bug where nfldb crashes if neither `XDG_CONFIG_HOME` or
`HOME` are set.
- Changed `nfldb-dump` to omit owner information in the SQL dump.
- Fixed embarrassing team misspellings.
- Made pytz version dependency consistent with nflgame.
0.0.6
=====
- Fix a bug in nfldb-update where the database wasn't being closed
when being run with the `--interval` flag.
- Added `player_search` which does fuzzy name matching.
0.0.5
=====
- Fixed a broken link in the module/PyPI description.
0.0.4
=====
- Numerous fixes for bugs I found while writing the nfldb wiki.
0.0.3
=====
- Tons of documentation.
- Added pytz as an explicit dependency.
- Added a `show_where` method to the Query class for debugging search logic.
0.0.2
=====
- Be a bit smarter with configuration files.
0.0.1
=====
Initial release.