Merge pull request #82 from UDST/dev
Finalizing v0.2.2 release
smmaurer authored Nov 14, 2020
2 parents 42a8f35 + 605fe9f commit 85681d5
Showing 22 changed files with 1,595 additions and 839 deletions.
10 changes: 10 additions & 0 deletions CHANGELOG.rst
@@ -1,3 +1,13 @@
v0.2.2
======

2020/11/09

* allows passing matplotlib axes to urbanaccess.plot.plot_net()
* adds flexibility to calendar/date handling (calendar_dates.txt now supported)
* improves GTFS downloading (solves issue where requests were rejected due to missing user agent header)
* improves text encoding support

v0.2.1
======

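The first bullet above, in a minimal sketch: passing caller-owned matplotlib axes to urbanaccess.plot.plot_net(). The toy node/edge tables, their column names, and the exact ax keyword are assumptions to verify against the v0.2.2 API docs.

    import matplotlib.pyplot as plt
    import pandas as pd
    import urbanaccess as ua

    # toy tables standing in for a real network built by the
    # urbanaccess GTFS/OSM loaders (column names are assumptions)
    nodes = pd.DataFrame({'x': [0.0, 1.0, 1.0], 'y': [0.0, 0.0, 1.0]})
    edges = pd.DataFrame({'from': [0, 1], 'to': [1, 2]})

    # new in v0.2.2: draw onto axes you own, e.g. two panels side by side
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 5))
    ua.plot.plot_net(nodes=nodes, edges=edges, ax=ax1)
    ua.plot.plot_net(nodes=nodes, edges=edges, ax=ax2)
    plt.show()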
7 changes: 3 additions & 4 deletions README.rst
@@ -51,7 +51,7 @@ Citation and academic literature

To cite this tool and for a complete description of the UrbanAccess methodology see the paper below:

`Samuel D. Blanchard and Paul Waddell. 2017. "UrbanAccess: Generalized Methodology for Measuring Regional Accessibility with an Integrated Pedestrian and Transit Network." Transportation Research Record: Journal of the Transportation Research Board. No. 2653. pp. 35–44. <http://trrjournalonline.trb.org/doi/pdf/10.3141/2653-05>`__
`Samuel D. Blanchard and Paul Waddell. 2017. "UrbanAccess: Generalized Methodology for Measuring Regional Accessibility with an Integrated Pedestrian and Transit Network." Transportation Research Record: Journal of the Transportation Research Board. No. 2653. pp. 35–44. <https://journals.sagepub.com/doi/pdf/10.3141/2653-05>`__

For other related literature see `here <https://udst.github.io/urbanaccess/introduction.html#citation-and-academic-literature>`__.

@@ -113,9 +113,8 @@ Minimum GTFS data requirements

The minimum `GTFS data
types <https://developers.google.com/transit/gtfs/>`__ required to use
UrbanAccess are: ``stop_times``, ``stops``, ``routes``, ``calendar``,
and ``trips`` however if there is no ``calendar``, ``calendar_dates``
can be used as a replacement.
UrbanAccess are: ``stop_times``, ``stops``, ``routes`` and ``trips`` and
one of either ``calendar`` or ``calendar_dates``.
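A minimal loading sketch under these requirements, using the gtfsfeed_to_df() signature that appears later in this diff; the folder path is hypothetical and should hold one sub-folder per unzipped feed.

    import urbanaccess as ua

    # load every feed found under the (hypothetical) root folder into
    # a set of pandas DataFrames, one per GTFS text file
    loaded_feeds = ua.gtfs.load.gtfsfeed_to_df(
        gtfsfeed_path='data/gtfsfeed_text',
        validation=False,
        verbose=True)

    print(loaded_feeds.stops.head())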

Related UDST libraries
----------------------
4 changes: 2 additions & 2 deletions docs/source/conf.py
@@ -30,8 +30,8 @@
project = u'UrbanAccess'
author = u'UrbanSim Inc.'
copyright = u'{}, {}'.format(datetime.now().year, author)
version = u'0.2.1'
release = u'0.2.1'
version = u'0.2.2'
release = u'0.2.2'
language = None

# List of patterns to ignore when looking for source files.
2 changes: 1 addition & 1 deletion docs/source/index.rst
@@ -3,7 +3,7 @@ UrbanAccess

A tool for computing GTFS transit and OSM pedestrian networks for accessibility analysis.

v0.2.1, released August 28, 2020.
v0.2.2, released November 9, 2020.

Contents
--------
6 changes: 3 additions & 3 deletions docs/source/introduction.rst
@@ -39,7 +39,7 @@ A `demo <https://github.com/UDST/urbanaccess/tree/master/demo>`__ is available a
Minimum GTFS data requirements
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The minimum `GTFS data types <https://developers.google.com/transit/gtfs/>`__ required to use UrbanAccess are: ``stop_times``, ``stops``, ``routes``, ``calendar``, and ``trips``. If you are using a feed that does not have or utilize a calendar you may use the ``calendar_dates`` file instead of ``calendar`` with the ``calendar_dates_lookup`` parameter :ref:`here <transit-network>`.
The minimum `GTFS data types <https://developers.google.com/transit/gtfs/>`__ required to use UrbanAccess are: ``stop_times``, ``stops``, ``routes``, and ``trips`` and either ``calendar`` or ``calendar_dates``. If you are using a feed that does not have or utilize a calendar you may use the ``calendar_dates`` file instead of ``calendar`` with the ``calendar_dates_lookup`` parameter :ref:`here <transit-network>`.
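A hedged sketch of that calendar_dates path: the create_transit_net() parameter names follow the documentation referenced above, and the schedule_type column/value pair is a made-up coding for illustration.

    import urbanaccess as ua

    loaded_feeds = ua.gtfs.load.gtfsfeed_to_df(
        gtfsfeed_path='data/gtfsfeed_text')  # hypothetical path

    # select service ids from calendar_dates.txt via column/value pairs
    transit_net = ua.gtfs.network.create_transit_net(
        gtfsfeeds_dfs=loaded_feeds,
        day='monday',
        timerange=['07:00:00', '10:00:00'],
        calendar_dates_lookup={'schedule_type': 'WD'})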

License
~~~~~~~~
@@ -51,11 +51,11 @@ Citation and academic literature

To cite this tool and for a complete description of the UrbanAccess methodology see the paper below:

`Samuel D. Blanchard and Paul Waddell. 2017. "UrbanAccess: Generalized Methodology for Measuring Regional Accessibility with an Integrated Pedestrian and Transit Network." Transportation Research Record: Journal of the Transportation Research Board. No. 2653. pp. 35–44. <http://trrjournalonline.trb.org/doi/pdf/10.3141/2653-05>`__
`Samuel D. Blanchard and Paul Waddell. 2017. "UrbanAccess: Generalized Methodology for Measuring Regional Accessibility with an Integrated Pedestrian and Transit Network." Transportation Research Record: Journal of the Transportation Research Board. No. 2653. pp. 35–44. <https://journals.sagepub.com/doi/pdf/10.3141/2653-05>`__

For a detailed use case of the tool see the following paper:

`Samuel D. Blanchard and Paul Waddell. 2017. "Assessment of Regional Transit Accessibility in the San Francisco Bay Area of California with UrbanAccess." Transportation Research Record: Journal of the Transportation Research Board. No. 2654. pp. 45–54. <http://trrjournalonline.trb.org/doi/abs/10.3141/2654-06>`__
`Samuel D. Blanchard and Paul Waddell. 2017. "Assessment of Regional Transit Accessibility in the San Francisco Bay Area of California with UrbanAccess." Transportation Research Record: Journal of the Transportation Research Board. No. 2654. pp. 45–54. <https://journals.sagepub.com/doi/pdf/10.3141/2654-06>`__

Reporting bugs
~~~~~~~~~~~~~~~~~~~~~~~~
1 change: 1 addition & 0 deletions requirements-dev.txt
@@ -7,6 +7,7 @@ pycodestyle
# testing demo notebook
jupyter
cartopy # requires conda
pyepsg

# building documentation
numpydoc
2 changes: 1 addition & 1 deletion setup.py
@@ -9,7 +9,7 @@

setup(
name='urbanaccess',
version='0.2.1',
version='0.2.2',
license='AGPL',
description=description,
long_description=long_description,
2 changes: 1 addition & 1 deletion urbanaccess/__init__.py
@@ -9,6 +9,6 @@
from .gtfsfeeds import *
from .plot import *

__version__ = "0.2.1"
__version__ = "0.2.2"

version = __version__
17 changes: 13 additions & 4 deletions urbanaccess/config.py
@@ -16,11 +16,12 @@ def _format_check(settings):
"""

valid_keys = ['data_folder', 'logs_folder', 'log_file',
'log_console', 'log_name', 'log_filename', 'gtfs_api']
'log_console', 'log_name', 'log_filename',
'txt_encoding', 'gtfs_api']

for key in settings.keys():
if key not in valid_keys:
raise ValueError('{} not found in list of valid configuation '
raise ValueError('{} not found in list of valid configuration '
'keys'.format(key))
if not isinstance(key, str):
raise ValueError('{} must be a string'.format(key))
@@ -42,13 +42,17 @@ class urbanaccess_config(object):
logs_folder : str
location to write log files
log_file : bool
if true, save log output to a log file in logs_folder
if True, save log output to a log file in logs_folder
log_console : bool
if true, print log output to the console
if True, print log output to the console
log_name : str
name of the logger
log_filename : str
name of the log file
txt_encoding : str
default text encoding used by the GTFS files, to be passed to
Python's open() function. Must be a valid encoding recognized by
Python codecs.
gtfs_api : dict
dictionary of the name of the GTFS API service as the key and
the GTFS API server root URL as the value to pass to the GTFS loader
@@ -61,6 +66,7 @@ def __init__(self,
log_console=False,
log_name='urbanaccess',
log_filename='urbanaccess',
txt_encoding='utf-8',
gtfs_api={'gtfsdataexch': (
'http://www.gtfs-data-exchange.com/'
'api/agencies?format=csv')}):
@@ -71,6 +77,7 @@
self.log_console = log_console
self.log_name = log_name
self.log_filename = log_filename
self.txt_encoding = txt_encoding
self.gtfs_api = gtfs_api

@classmethod
@@ -110,6 +117,7 @@ def from_yaml(cls, configdir='configs',
log_name=yaml_config.get('log_name', 'urbanaccess'),
log_filename=yaml_config.get('log_filename',
'urbanaccess'),
txt_encoding=yaml_config.get('txt_encoding', 'utf-8'),
gtfs_api=yaml_config.get('gtfs_api', {
'gtfsdataexch':
('http://www.gtfs-data-exchange.com/'
@@ -128,6 +136,7 @@ def to_dict(self):
'log_console': self.log_console,
'log_name': self.log_name,
'log_filename': self.log_filename,
'txt_encoding': self.txt_encoding,
'gtfs_api': self.gtfs_api,
}

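The new setting can be overridden before loading, as in this sketch; 'cp1252' is only an illustrative value, and any codec Python recognizes should work.

    import urbanaccess as ua
    from urbanaccess import config

    # GTFS text files will now be opened with this encoding (Python 3)
    config.settings.txt_encoding = 'cp1252'
    loaded_feeds = ua.gtfs.load.gtfsfeed_to_df(
        gtfsfeed_path='data/gtfsfeed_text')  # hypothetical path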
118 changes: 85 additions & 33 deletions urbanaccess/gtfs/load.py
@@ -4,6 +4,7 @@
import time
import pandas as pd
import six
import logging as lg

from urbanaccess import config
from urbanaccess.utils import log
@@ -20,7 +21,7 @@ def _standardize_txt(csv_rootpath=os.path.join(config.settings.data_folder,
Parameters
----------
csv_rootpath : str, optional
root path where all gtfs feeds that make up a contiguous metropolitan
root path where all GTFS feeds that make up a contiguous metropolitan
area are stored
Returns
@@ -59,6 +60,7 @@ def _txt_encoder_check(gtfsfiles_to_use,
"""
# UnicodeDecodeError
start_time = time.time()
log('Checking GTFS text file for encoding issues...')

folderlist = [foldername for foldername in os.listdir(csv_rootpath) if
os.path.isdir(os.path.join(csv_rootpath, foldername))]
@@ -74,14 +76,16 @@
for textfile in textfilelist:
if textfile in gtfsfiles_to_use:
# Read from file
file_open = open(os.path.join(csv_rootpath, folder, textfile))
file_path = os.path.join(csv_rootpath, folder, textfile)
file_open = open(file_path)
raw = file_open.read()
file_open.close()
if raw.startswith(codecs.BOM_UTF8):
msg = 'Correcting encoding issue in: {}...'
log(msg.format(file_path))
raw = raw.replace(codecs.BOM_UTF8, '', 1)
# Write to file
file_open = open(
os.path.join(csv_rootpath, folder, textfile), 'w')
file_open = open(file_path, 'w')
file_open.write(raw)
file_open.close()
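For reference, a standalone sketch of the same BOM check, reading bytes rather than text so no decode step is involved.

    import codecs

    def strip_bom(path):
        # remove a leading UTF-8 byte-order mark, rewriting only if found
        with open(path, 'rb') as f:
            raw = f.read()
        if raw.startswith(codecs.BOM_UTF8):
            with open(path, 'wb') as f:
                f.write(raw[len(codecs.BOM_UTF8):])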

@@ -100,9 +104,9 @@ def _txt_header_whitespace_check(gtfsfiles_to_use,
Parameters
----------
gtfsfiles_to_use : list
list of gtfs feed txt files to utilize
list of GTFS feed txt files to utilize
csv_rootpath : str, optional
root path where all gtfs feeds that make up a contiguous metropolitan
root path where all GTFS feeds that make up a contiguous metropolitan
area are stored
Returns
Expand All @@ -111,6 +115,11 @@ def _txt_header_whitespace_check(gtfsfiles_to_use,
"""
start_time = time.time()

txt_encoding = config.settings.txt_encoding
msg = ('Checking GTFS text file header whitespace... '
'Reading files using encoding: {} set in configuration.')
log(msg.format(txt_encoding))

folderlist = [foldername for foldername in os.listdir(csv_rootpath) if
os.path.isdir(os.path.join(csv_rootpath, foldername))]

@@ -124,25 +133,41 @@

for textfile in textfilelist:
if textfile in gtfsfiles_to_use:
file_path = os.path.join(csv_rootpath, folder, textfile)
# Read from file
with open(os.path.join(csv_rootpath, folder, textfile)) as f:
lines = f.readlines()
lines[0] = re.sub(r'\s+', '', lines[0]) + '\n'
# Write to file
try:
with open(os.path.join(csv_rootpath, folder, textfile),
'w') as f:
f.writelines(lines)
except Exception:
log('Unable to read {}. Check that file is not currently'
'being read or is not already in memory as this is '
'likely the cause of the error.'
''.format(os.path.join(csv_rootpath,
folder, textfile)))
log(
'GTFS text file header whitespace check completed. Took {:,'
'.2f} seconds'.format(
time.time() - start_time))
if six.PY2:
with open(file_path) as f:
lines = f.readlines()
else:
# read with default 'utf-8' encoding
with open(
file_path,
encoding=txt_encoding) as f:
lines = f.readlines()
line_wo_whitespace = re.sub(r'\s+', '', lines[0]) + '\n'
# only write the file if there are changes to be made
if lines[0] != line_wo_whitespace:
msg = 'Removing whitespace from header(s) in: {}...'
log(msg.format(file_path))
lines[0] = line_wo_whitespace
# Write to file
if six.PY2:
with open(
file_path, 'w') as f:
f.writelines(lines)
else:
# write with default 'utf-8' encoding
with open(
file_path, 'w',
encoding=txt_encoding) as f:
f.writelines(lines)
except Exception as e:
msg = 'Unable to process: {}. Exception: {}'
raise Exception(log(msg.format(file_path, e),
level=lg.ERROR))
log('GTFS text file header whitespace check completed. '
'Took {:,.2f} seconds'.format(time.time() - start_time))
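A companion one-liner showing the normalization this function applies: every whitespace character in the header row is removed before the newline is restored.

    import re

    header = ' stop_id, stop_name, stop_lat, stop_lon \n'
    clean = re.sub(r'\s+', '', header) + '\n'
    print(repr(clean))  # 'stop_id,stop_name,stop_lat,stop_lon\n'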


def gtfsfeed_to_df(gtfsfeed_path=None, validation=False, verbose=True,
@@ -156,7 +181,7 @@ def gtfsfeed_to_df(gtfsfeed_path=None, validation=False, verbose=True,
Parameters
----------
gtfsfeed_path : str, optional
root path where all gtfs feeds that make up a contiguous metropolitan
root path where all GTFS feeds that make up a contiguous metropolitan
area are stored
validation : bool
if true, the validation check on stops checking for stops outside
@@ -236,8 +261,20 @@ def gtfsfeed_to_df(gtfsfeed_path=None, validation=False, verbose=True,
os.listdir(os.path.join(gtfsfeed_path, folder)) if
textfilename.endswith(".txt")]
required_gtfsfiles = ['stops.txt', 'routes.txt', 'trips.txt',
'stop_times.txt', 'calendar.txt']
optional_gtfsfiles = ['agency.txt', 'calendar_dates.txt']
'stop_times.txt']
optional_gtfsfiles = ['agency.txt']
# either calendar or calendar_dates is required
calendar_gtfsfiles = ['calendar.txt', 'calendar_dates.txt']

calendar_files = [i for i in calendar_gtfsfiles if i in textfilelist]
if len(calendar_files) == 0:
error_msg = (
'at least one of `calendar.txt` or `calendar_dates.txt` is '
'required to complete a GTFS dataset but neither was found in '
'folder {}')
raise ValueError(error_msg.format(os.path.join(
gtfsfeed_path, folder)))

for required_file in required_gtfsfiles:
if required_file not in textfilelist:
raise ValueError(
Expand All @@ -263,10 +300,32 @@ def gtfsfeed_to_df(gtfsfeed_path=None, validation=False, verbose=True,
stop_times_df = utils_format._read_gtfs_stop_times(
textfile_path=os.path.join(gtfsfeed_path, folder),
textfile=textfile)

for textfile in calendar_files:
# use both calendar and calendar_dates if they exist, otherwise
# if only one of them exists use the one that exists and set the
# other one that does not exist to a blank df
if textfile == 'calendar.txt':
calendar_df = utils_format._read_gtfs_calendar(
textfile_path=os.path.join(gtfsfeed_path, folder),
textfile=textfile)
# if only calendar, set calendar_dates as blank
# with default required columns
if len(calendar_files) == 1:
calendar_dates_df = pd.DataFrame(
columns=['service_id', 'dates', 'exception_type'])
else:
calendar_dates_df = utils_format._read_gtfs_calendar_dates(
textfile_path=os.path.join(gtfsfeed_path, folder),
textfile=textfile)
# if only calendar_dates, set calendar as blank
# with default required columns
if len(calendar_files) == 1:
calendar_df = pd.DataFrame(
columns=['service_id', 'monday',
'tuesday', 'wednesday', 'thursday',
'friday', 'saturday', 'sunday',
'start_date', 'end_date'])

for textfile in optional_gtfsfiles:
if textfile == 'agency.txt':
Expand All @@ -276,13 +335,6 @@ def gtfsfeed_to_df(gtfsfeed_path=None, validation=False, verbose=True,
textfile=textfile)
else:
agency_df = pd.DataFrame()
if textfile == 'calendar_dates.txt':
if textfile in textfilelist:
calendar_dates_df = utils_format._read_gtfs_calendar_dates(
textfile_path=os.path.join(gtfsfeed_path, folder),
textfile=textfile)
else:
calendar_dates_df = pd.DataFrame()

stops_df, routes_df, trips_df, stop_times_df, calendar_df, \
calendar_dates_df = (utils_format
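From the user's side, the new calendar handling might look like the sketch below; the folder path is hypothetical and the attribute names on the returned object are assumptions.

    import urbanaccess as ua

    # a feed that ships calendar_dates.txt but no calendar.txt now
    # loads instead of raising a ValueError
    loaded_feeds = ua.gtfs.load.gtfsfeed_to_df(
        gtfsfeed_path='data/feed_with_calendar_dates_only')

    # calendar is assumed to come back empty, carrying only the default
    # columns ('service_id', 'monday', ..., 'end_date')
    print(len(loaded_feeds.calendar))          # 0
    print(loaded_feeds.calendar_dates.head())  # the actual service data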
