FRED-like API interface #8

Open
epogrebnyak opened this issue Sep 7, 2017 · 19 comments

@epogrebnyak
Collaborator

epogrebnyak commented Sep 7, 2017

After reading https://research.stlouisfed.org/docs/api/fred/ here are some thoughts on what this app's URLs might look like and what they would do.

This is an imagined URL system for this app:

# dataset varname domain
# can also be 'oil', 'china', 'us'
# the idea is to keep variable names like INDPRO short, but we need a reference to the country somewhere
/ru   # Russian macro stats

/all - global namespace to get many variables from different sources

/domains # list all domains

# similar to FRED
# list available categories in this domain
# note in FRED categories are a graph (each id has parent id)
/ru/categories

# list all variables
/ru/names

# indicator homepage, returns an HTML page with a graph and description
/ru/series/INDPRO


# note fred has two functions 
# one for the series info
# https://research.stlouisfed.org/docs/api/fred/series.html
# other for data
# https://research.stlouisfed.org/docs/api/fred/series_observations.html

# the data endpoint must return something that can be fed to pd.read_json()

# actual data is retrieved by <frequency>/<mod>, where <mod> is either a unit or a transformation
/ru/series/INDPRO/a/yoy 
/ru/series/INDPRO/q/rog 
/ru/series/INDPRO/m/rog

# some other data 
/oil/series/BRENT/m/eop
/oil/series/BRENT/a/avg

#can we just keep it without unit sometimes? adding usd is not symmetric to eop/avg
/oil/series/BRENT/d/

# to experiment with notation - domain with underscore
/cbr_fx/series/USDRUR/d/
/cbr_fx/series/USDRUR/m/eop
/cbr_fx/series/USDRUR/y/avg

# as in https://fedstat.ru/indicator/37393
/ru_export/series/COPPER/m/vol
/ru_export/series/COPPER/m/price


#available frequencies and modifications
/cbr_fx/series/USDRUR/freq #will list a subset of h,d,w,m,q,y
/cbr_fx/series/USDRUR/mod  #will list eop, avg

# truncating data
# reason: e.g. daily USDRUR from 1991 is a lot of datapoints, and we may want just the latest few
# how about truncating by year
/cbr_fx/series/USDRUR/m/eop/2017

# finalising data format (probably not a good idea)
csv/cbr_fx/series/USDRUR/m/eop
xlsx/cbr_fx/series/USDRUR/m/eop
xls/cbr_fx/series/USDRUR/m/eop
@epogrebnyak
Collaborator Author

In Flask we will need some kind of catch-all URL for this.

@epogrebnyak
Collaborator Author

Possibly we can formalise the above as incoming JSON:

# per-query
format='csv' 
# good for several datapoints
domain='cbr_fx' 
# let's just accept string date stamp for start
start = '2017'

# one datapoint 
varname='USDRUR'
frequency='m' #when omitted go to highest frequency possible, usually day or month
modifier='eop'

Here the modifier rog is effectively a unit of measurement, and there is no default when the modifier is omitted:

varname='INDPRO'
frequency='m' #when omitted go to highest frequency possible, usually day or month
modifier='rog'

The domain definition looks fragile: a single domain is better for the user, and we may use categories instead.
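
As a rough illustration of how the query fields above relate to the URL scheme, here is a minimal sketch that turns such a query dict into a path. The function name `query_to_path` is hypothetical, the frequency is treated as required (the "highest frequency possible" fallback is not modelled), and placing the format as the last path segment is an assumption borrowed from the later comments in this thread.

```python
def query_to_path(query):
    """Build a URL path from a query dict (hypothetical helper, not project code)."""
    parts = [query["domain"], "series", query["varname"], query["frequency"]]
    # Optional fields are appended only when present and non-empty.
    for key in ("modifier", "start"):
        if query.get(key):
            parts.append(str(query[key]))
    # Assumption: the format goes last, as a trailing path segment.
    if query.get("format"):
        parts.append(query["format"])
    return "/" + "/".join(parts)


query = dict(format="csv", domain="cbr_fx", start="2017",
             varname="USDRUR", frequency="m", modifier="eop")
path = query_to_path(query)
print(path)
```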

@epogrebnyak
Collaborator Author

epogrebnyak commented Sep 7, 2017

Proposed unit of storage:
<VARIABLE_NAME>-<unit>-<date>-<value>-<frequency>-<domain>

domain = 'ru'
source = 'mini-kep'
varname ='INDPRO'
unit ='rog'
frequency ='m'
date = '2017-03-31'
value = 107.1
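
One way to picture this unit of storage is as a small record type. The sketch below is only an illustration: the class name `Datapoint` is hypothetical, and serialising to a JSON list of records is an assumption, chosen because that shape should be readable with `pd.read_json(..., orient='records')`.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Datapoint:
    """One stored observation (hypothetical structure mirroring the fields above)."""
    domain: str
    source: str
    varname: str
    unit: str
    frequency: str
    date: str      # ISO date string, e.g. '2017-03-31'
    value: float


point = Datapoint(domain="ru", source="mini-kep", varname="INDPRO",
                  unit="rog", frequency="m", date="2017-03-31", value=107.1)

# A JSON list of records, one dict per datapoint.
payload = json.dumps([asdict(point)])
print(payload)
```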

@epogrebnyak
Collaborator Author

epogrebnyak commented Sep 13, 2017

@relique - does Flask have the capability to catch nested URLs like /ru/series/INDPRO/q/rog without specifying what comes after /ru/series/ or how many slashes there will be?

@relique
Contributor

relique commented Sep 13, 2017

Hi @epogrebnyak yes, Flask does have this capability. Sample route: @app.route('/ru/series/<path:inner_path>')

@epogrebnyak
Collaborator Author

epogrebnyak commented Sep 14, 2017

Reference paths, to be converted to dict/json:

/fin/series/LIBOR_3M_USD/d/2017 
/ru/series/INDPRO/a/yoy/2010
/ru/series/INDPRO/a/rog/2010 #points to the same data, 
                             #yoy and rog are the same on annual level
/ru:bank/series/PROVISION/m/eop/
/ru:bank/series/PROVISION/m/ #points to the same data, eop is default modifier
/oil/series/BRENT/a/avg/2010/2015/csv 
/ru/series/USDRUR/m/avg/2017 

Domains are good for getting the same indicator for different countries:

#CIS
ru/CPI/rog/m/2017
ua/CPI/rog/m/2017
by/CPI/rog/m/2017

#BRICS
kz/CPI/rog/m/2017
br/CPI/rog/m/2017
cn/CPI/rog/m/2017
sa/CPI/rog/m/2017

Other than that, they help organise the thinking about the sources and the queries about what is available:

/domains #list all domains
/<domain>/series #list all variables

@epogrebnyak
Collaborator Author

epogrebnyak commented Sep 14, 2017

Mandatory:

  • (string) domain, optionally with :subdomain
  • series string
  • (string) VARNAME
  • (string) frequency, any letter from d, w, m, q or a.

Optional:

  • (string) a postfix: a unit of measurement (yoy, rog, base) or a modifier (eop, avg). Defaults to the main unit of measurement.
  • (int) start year, defaults to observations start year
  • (int) end year, defaults to current year
  • (string) formatter - json assumed by default, may be csv, possibly xlsx

@epogrebnyak
Collaborator Author

After catching the URL, the following result is needed:

domain = 'ru:bank'
varname ='INDPRO'
frequency ='m'
postfix ='rog'
start_year = None
end_year = None
formatter = None
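
A minimal sketch of how a caught path could be turned into that result is given below. The function name `parse_path` and the token sets are assumptions drawn from the mandatory/optional lists earlier in this thread; the sketch identifies optional tokens by their type rather than enforcing their order, which is a simplification.

```python
FREQUENCIES = {"d", "w", "m", "q", "a"}
POSTFIXES = {"yoy", "rog", "base", "eop", "avg"}
FORMATTERS = {"json", "csv", "xlsx"}


def parse_path(path):
    """Split '<domain>/series/<VARNAME>/<freq>/...' into a dict (illustrative only)."""
    tokens = [t for t in path.strip("/").split("/") if t]
    if len(tokens) < 4 or tokens[1] != "series":
        raise ValueError("expected <domain>/series/<VARNAME>/<freq>: %r" % path)
    domain, _, varname, frequency = tokens[:4]
    if frequency not in FREQUENCIES:
        raise ValueError("unknown frequency: %r" % frequency)
    result = dict(domain=domain, varname=varname, frequency=frequency,
                  postfix=None, start_year=None, end_year=None, formatter=None)
    for token in tokens[4:]:
        if token in POSTFIXES and result["postfix"] is None:
            result["postfix"] = token
        elif token.isdigit() and len(token) == 4 and result["start_year"] is None:
            result["start_year"] = int(token)
        elif token.isdigit() and len(token) == 4 and result["end_year"] is None:
            result["end_year"] = int(token)
        elif token in FORMATTERS and result["formatter"] is None:
            result["formatter"] = token
        else:
            raise ValueError("unexpected path segment: %r" % token)
    return result


print(parse_path("/ru:bank/series/INDPRO/m/rog"))
print(parse_path("/oil/series/BRENT/a/avg/2010/2015/csv"))
```

In a Flask view, the same function could be fed the full request path or rebuilt from the route arguments; that wiring is left out here.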

@epogrebnyak
Collaborator Author

epogrebnyak commented Sep 14, 2017

I think we want a URL catcher like:

@app.route('/<string:domain>/series/<string:varname>/<path:inner_path>')

  • when the URL ends at domain/series/VARNAME, the output is the variable's landing page in HTML
  • when an inner path with more slashes is specified, the output is CSV/JSON

@epogrebnyak
Collaborator Author

epogrebnyak commented Sep 14, 2017

A reading on API design, to note:

  • error handling
  • subdomain
  • doubtful about versioning

@epogrebnyak
Collaborator Author

epogrebnyak commented Sep 14, 2017

# nominal/real + rate + info finaliser
/ru/series/GDP/a
/ru/series/GDP/a/rub/bln
/ru/series/GDP/q/rog
/ru/series/GDP/q/yoy
/ru/series/GDP/q/base

# info finaliser
/ru/series/GDP/a/info
/ru/series/GDP/q/base/info 

# seasonal adjustment
/ru/series/GDP/q/real/base/sa:<method>
/ru/series/GDP/q/real/base/nsa/info

epogrebnyak added a commit that referenced this issue Sep 14, 2017
@epogrebnyak
Collaborator Author

epogrebnyak commented Sep 14, 2017

A. VARIABLE NAMESPACE RULES

# Rule 0. Base URL for getting data is <domain>/series/<VARNAME>/<freq>
/ru/series/GDP/a

# Rule 1. When no postfix - fall back to default nominal time series
/ru/series/GDP/a    # will return bln_rub

# Rule 2. Rate of change modifiers always indicate real growth rates 
#         rog - change relative to the previous period
#         yoy - change relative to the same period a year earlier
#         base - base index (some date in the past = 1)
/ru/series/GDP/q/rog     
/ru/series/GDP/q/yoy
/ru/series/GDP/q/base 

# Rule 3. Aggregation postfix applies to daily observations 
#         transformed to lower frequencies like month or year.
/ru/series/USDRUR/m/eop
/ru/series/USDRUR/m/avg

# Note. We are not adding every other transformation to the API, just the bare minimum. 
#       The user is expected to know how to work with a pandas/R dataframe. 


B. TIME PERIOD

# We can optionally specify a start and end year to truncate the time series.
# We do not allow truncation other than by year to keep this feature simple.
/ru/series/USDRUR/m/eop/1998/2000  # 1998 to 2000, total 3 years of data 
/ru/series/USDRUR/m/eop/2017       # from start of 2017 to present


C. FINALISERS  

<expression>/csv     # get csv instead of json  
<expression>/info    # get no data, just the interpretation of the expression

D. TO IMPLEMENT: seasonal adjustment

/ru/series/GDP/q/rog/sa:<method>
/ru/series/GDP/q/rog/nsa # same as below, all time series default to nsa
/ru/series/GDP/q/rog 
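
To make the year-based truncation from section B concrete, here is a small sketch under stated assumptions: observations are taken to be `(iso_date, value)` pairs, the function name `truncate_by_year` is hypothetical, and the sample values are made up for illustration rather than real exchange rates.

```python
def truncate_by_year(observations, start_year=None, end_year=None):
    """Keep (iso_date, value) pairs whose year lies in [start_year, end_year]."""
    def in_range(iso_date):
        year = int(iso_date[:4])
        if start_year is not None and year < start_year:
            return False
        if end_year is not None and year > end_year:
            return False
        return True
    return [(d, v) for d, v in observations if in_range(d)]


# Made-up values, one observation per year, for illustration only.
sample = [("1997-12-31", 1.0), ("1998-12-31", 2.0), ("1999-12-31", 3.0),
          ("2000-12-31", 4.0), ("2001-12-31", 5.0)]

# Equivalent of .../1998/2000: both boundary years are included.
print(truncate_by_year(sample, 1998, 2000))
# Equivalent of .../2000: from the start of 2000 to the latest observation.
print(truncate_by_year(sample, 2000))
```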

@epogrebnyak
Collaborator Author

RFC:

  1. collected time series are nominal stocks/flows + prices + price indices
  2. any variable except prices we can present as nominal variable = real variable * price index
  3. prices are quoted at points in time (daily in our database)

Changes:

  • eop, avg apply to prices only, when aggregating from daily to monthly or lower frequencies.
  • eop, avg cannot be applied to anything other than prices
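
For illustration, the two aggregations could look like the sketch below when going from daily prices to monthly values. The function name `aggregate_monthly` and the `(iso_date, value)` input shape are assumptions, and the prices are made up.

```python
from collections import defaultdict
from statistics import mean


def aggregate_monthly(daily, how):
    """Aggregate daily (iso_date, value) prices to months using 'eop' or 'avg'."""
    if how not in ("eop", "avg"):
        raise ValueError("how must be 'eop' or 'avg', got %r" % how)
    by_month = defaultdict(list)
    # ISO dates sort chronologically, so the last value per month is end-of-period.
    for iso_date, value in sorted(daily):
        by_month[iso_date[:7]].append(value)
    if how == "eop":
        return {month: values[-1] for month, values in by_month.items()}
    return {month: mean(values) for month, values in by_month.items()}


# Made-up daily prices for illustration only.
prices = [("2017-01-31", 62.0), ("2017-01-30", 60.0), ("2017-02-01", 58.0)]
print(aggregate_monthly(prices, "eop"))
print(aggregate_monthly(prices, "avg"))
```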

@epogrebnyak
Collaborator Author

We now have an experimental catcher working:
http://mini-kep.herokuapp.com/ru/series/BRENT/m/eop/

@epogrebnyak
Collaborator Author

Also discussed in mini-kep/intro#12

@azie-ginanjar

With Apiary we can do the following tasks:
- create documentation
- create a mock server
With the following URL, {domain}/series/{varname}/{freq}/{?rate}/{?agg}/{?start}/{?end}/{?finaliser}, all parameters after freq would be optional.

@epogrebnyak
Collaborator Author

{?rate} and {?agg} are mutually exclusive: we can have either {?rate} or {?agg}, so it is better to call them {?suffix}.
The URL will be {domain}/series/{varname}/{freq}/{?suffix}/{?start}/{?end}/{?finaliser}

later in the program:

  • if {?suffix} is in (eop, avg) then {agg} is defined
  • if {?suffix} is in (yoy, rog) then {rate} is defined
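
The two rules above could be sketched as follows. The function name `split_suffix` is hypothetical, and raising an error for an unknown suffix is an assumption, since the thread does not say how such input should be handled.

```python
AGG_SUFFIXES = ("eop", "avg")
RATE_SUFFIXES = ("yoy", "rog")


def split_suffix(suffix):
    """Return a (rate, agg) pair for an optional URL suffix; at most one is set."""
    if suffix is None:
        return (None, None)
    if suffix in AGG_SUFFIXES:
        return (None, suffix)
    if suffix in RATE_SUFFIXES:
        return (suffix, None)
    # Assumption: unknown suffixes are rejected rather than silently ignored.
    raise ValueError("unknown suffix: %r" % suffix)


print(split_suffix("eop"))
print(split_suffix("yoy"))
print(split_suffix(None))
```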

3 participants