CoP: Data Science: Analyze correlations between metro locations and 311-data requests #107
Comments
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in X days. |
@priyakalyan please document the following update to this issue in the comments here Progress: "What is the current status of your project? What have you completed and what is left to do?" |
Progress: On 3-31-2022 I added this file, Progress summary 311 Data Project, along with data dictionaries for the 311 data (Value Column 311 data) and for the Metro rail and bus lines (Value column Metro- Bus and Rail line). So far I have downloaded the 311 data (not cleaned yet) and looked at the request type count/relative frequency over the years 2015-2022 (through March 27th). I also looked at the request type count for different APCs. Since then I have been out of town until today, so I have no further update for the last week. Plan for the upcoming week:
Availability: 6 hours this week. ETA: I am totally new to geospatial data analysis, so maybe 1 to 2 weeks. |
Progress: I installed Docker successfully but could not set up a local 311 data server (I tried many times; the last step in "Step 3: Build and seed your local database" failed). Any suggestions/pointers? For now, I have stopped working on it. Downloaded data from this website. Loaded the metro rail line shapefile, the metro bus line shapefile and the neighborhood council shapefile. Currently working on spatially joining the 311 data and the NC data (looking at one region at a time, 12 in all). The next step is to overlay the metro rail and bus lines, plot the different request types, and do a qualitative study exploring the request type count geographically. Availability: 6 hours this week. ETA: 1 to 2 weeks. |
Progress: Finally figured out how to use paginated APIs with Python to fetch all rows of data from the 311 server for the year 2021. I have saved it as a CSV file, clean_311_data_2021. I will fetch the clean data for the rest of the years (2015-2020, 2022). Have spatially joined the 311 data + NC data + metro bus + metro rail lines, displaying the specific request types over the 12 NC regions. Adding sample pics here; this is for region 4, South East Valley, with NCs: 'SHERMAN OAKS NC', 'NORTH HOLLYWOOD NORTH EAST NC', 'VAN NUYS NC', 'GREATER VALLEY GLEN', 'NOHO NC', 'NOHO WEST NC', 'STUDIO CITY NC', 'NC VALLEY VILLAGE', 'GREATER TOLUCA LAKE NC'. Availability: 6 hours this week. |
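For the record, the pagination loop described above might look roughly like the sketch below. The thread does not show the 311 server's actual endpoint or parameters, so this assumes an offset/limit style of paging, and `fetch_page` is a hypothetical callable standing in for whatever HTTP request the real script makes.

```python
from typing import Callable, Dict, Iterator, List

Row = Dict[str, object]


def fetch_all_rows(
    fetch_page: Callable[[int, int], List[Row]],
    page_size: int = 1000,
) -> Iterator[Row]:
    """Yield every row from an offset/limit paginated source.

    fetch_page(offset, limit) is a hypothetical stand-in for the real
    HTTP call; it must return at most `limit` rows starting at `offset`.
    """
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        yield from page
        # A short (or empty) page means the server has no more rows.
        if len(page) < page_size:
            break
        offset += len(page)


if __name__ == "__main__":
    # In-memory stand-in for the server, for illustration only.
    fake_table = [{"requestId": i} for i in range(2500)]
    rows = list(fetch_all_rows(lambda off, lim: fake_table[off:off + lim]))
    print(len(rows))
```

Passing the page fetcher in as a function keeps the loop independent of the HTTP library, so the same loop can be reused for each year by swapping the fetcher.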
Progress:
Plan for the upcoming week:
Availability: 6 hours this week. ETA- 1 week |
The team discussed this last Thursday, so I'll leave some notes for the record: I think it would be useful to have a histogram where the x axis is "distance from nearest bus stop/metro rail marker/etc." and the y axis is "number of requests". This will allow us to very clearly see whether there is some correlation between nearness to bus stops and 311 requests. |
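A minimal sketch of the binning behind such a histogram, assuming the distance from each request to its nearest stop has already been computed in meters. The function name and the 250 m default bin width are illustrative choices, not something the team agreed on.

```python
from collections import Counter
from typing import Dict, Iterable


def distance_histogram(
    distances_m: Iterable[float],
    bin_width_m: float = 250.0,
) -> Dict[float, int]:
    """Count requests per distance bin.

    Keys are the lower edge of each bin in meters (x axis);
    values are the number of requests in that bin (y axis).
    Empty bins are omitted.
    """
    counts = Counter(int(d // bin_width_m) for d in distances_m)
    return {b * bin_width_m: counts[b] for b in sorted(counts)}


if __name__ == "__main__":
    # Made-up distances, for illustration only.
    print(distance_histogram([10, 240, 250, 600, 990]))
```

If nearness to stops drives requests, the counts should fall off as the bin edge grows; a flat or humped shape would point the other way.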
Used the haversine formula (great-circle distance) to calculate the distance between each request's latitude/longitude and each metro rail stop. For each request, found the distance to the nearest metro rail marker. All this was done for reg6, year 2021, and request type Single Streetlight Issue. As discussed in the last 311 team meeting, here is the histogram plot: |
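The distance step can be sketched as follows. This is a plain-Python illustration, not the notebook's actual code; the function names and the mean Earth radius of 6,371 km are assumptions.

```python
import math
from typing import Iterable, Tuple

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius (assumed value)

LatLon = Tuple[float, float]


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two lat/lon points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot.
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


def nearest_stop_distance_m(request: LatLon, stops: Iterable[LatLon]) -> float:
    """Distance in meters from one request to its nearest stop (brute force)."""
    return min(haversine_m(request[0], request[1], s[0], s[1]) for s in stops)
```

The brute-force minimum is fine for one region and one request type; for all years and types, a spatial index would likely be worth the extra setup.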
Thanks Anupriya! Sorry for the delay. What do you make of this graph? To me, it seems to suggest that there is not a strong association between distance to nearest metro stop and request frequency. If there were one, I'd expect to see a (basically) monotonically decreasing histogram, implying that there are a lot of requests close to metro stops but just a few far from metro stops. But maybe a request type like graffiti would be more illuminating. Another bit that might help us understand this better: what is the density of metro stops? If the density of metro stops is very low, e.g., they are 10km apart from each other, then a median distance to the nearest metro stop of ~500m would be quite close. But if metro stops are 1km apart from each other, then ~500m is pretty far. With this foundation, I think we can start controlling for factors like population density, bus ridership density, and metro stop density. Does that sound feasible? |
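One rough way to put a number on the stop-spacing question is sketched below. It assumes the stops have already been projected to planar x/y coordinates in meters, and both function names are hypothetical. Dividing by half the spacing is just one possible normalization, not an established metric.

```python
import math
from statistics import median
from typing import Sequence, Tuple

XY = Tuple[float, float]


def median_stop_spacing_m(stops_xy: Sequence[XY]) -> float:
    """Median distance from each stop to its nearest other stop.

    Assumes planar coordinates in meters (an assumption; lat/lon input
    would need a great-circle distance instead).
    """
    if len(stops_xy) < 2:
        raise ValueError("need at least two stops")
    nearest = [
        min(math.dist(p, q) for j, q in enumerate(stops_xy) if j != i)
        for i, p in enumerate(stops_xy)
    ]
    return median(nearest)


def relative_nearness(median_request_distance_m: float, spacing_m: float) -> float:
    """Median request distance as a fraction of half the stop spacing.

    Values well below 1 suggest requests cluster near stops; values
    around 1 mean a request is about as far from a stop as it can be.
    """
    return median_request_distance_m / (spacing_m / 2)
```

With the numbers from the comment above, a ~500m median distance gives 0.1 when stops are 10km apart and 1.0 when they are 1km apart, matching the "quite close" versus "pretty far" reading.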
Have been trying to figure out how to get the population of each neighborhood council so that we can figure out the population density and so on. As @piotrsan mentioned in another issue
I also found this: Demographics of Neighborhood Councils. Both of these files contain only 97 records (97 NCs). The NC boundaries were updated in 2018 with 2 new NCs added; here is the link. I found the missing council names: NORTH WESTWOOD NC and ARTS DISTRICT LITTLE TOKYO NC. The next step is to figure out how to go from census block/tract data and aggregate it to the NC level. This link gives the mapping process to start from block data and reconcile at the NC boundary level. After today's meeting, it looks like starting at the census tract level will be the easiest way to go: take the NC shapefile, merge it with the census tracts, get the geocodes, and move on to demographics from there. |
Have calculated the population of each neighborhood council using the 2020 census tracts (TIGER/Line shapefile 2020), the updated NC shapefile (99 councils) and the ACS 2020 demographics data at the tract level. No approximation was made in the geometry this time. For tracts intersecting multiple NCs, found the percentage of area/population falling in each NC and then calculated the actual population. |
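The allocation step can be illustrated with a small sketch. It assumes population is spread evenly within each tract, so that a tract's population is split in proportion to the share of its area inside each NC. The area fractions are taken as given here; in practice they would come from a geometric intersection of the tract and NC shapefiles (for example with `geopandas.overlay`). The function name and input layout are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# (tract_id, nc_name, fraction of the tract's area inside that NC)
Overlap = Tuple[str, str, float]


def apportion_population(
    tract_population: Dict[str, float],
    overlaps: Iterable[Overlap],
) -> Dict[str, float]:
    """Area-weighted population per NC.

    Assumes uniform population density within each tract, so a tract
    contributes population * area_fraction to each NC it intersects.
    """
    nc_population: Dict[str, float] = defaultdict(float)
    for tract_id, nc_name, fraction in overlaps:
        nc_population[nc_name] += tract_population[tract_id] * fraction
    return dict(nc_population)


if __name__ == "__main__":
    # Made-up tracts and fractions, for illustration only.
    tracts = {"T1": 4000, "T2": 1000}
    pieces = [("T1", "NC A", 0.25), ("T1", "NC B", 0.75), ("T2", "NC A", 1.0)]
    print(apportion_population(tracts, pieces))
```

A useful sanity check is that the NC totals sum to the tract total whenever each tract's fractions sum to 1, which is the kind of comparison against the Census Bureau figure mentioned later in the thread.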
Worked on this notebook to find the updated population of the LA city neighborhood councils using geospatial analysis. Next: add a notebook comparing the updated NC population obtained by geospatial analysis with the one from the ArcGIS analysis. |
Have updated the notebook. The total population of LA city NCs is very close to the 2021 Census Bureau value. Have also been working on this PR, "API pagination using python", to fetch all rows of data from the 311 data pipeline for a given year. |
Hi @priyakalyan, are there any recent updates to this issue? |
A summary of this should be added to the wiki |
Overview
Investigate whether there are meaningful trends associated with metro stops and metro lines with regard to requests tracked by 311-data in LA County.
Action Items
Resources
Information about 311 Data here
Access 311 data here
http://geohub.lacity.org/datasets/metro-rail-lines-stops
https://developer.metro.net/docs/gis-data/overview/
District types issue: #118
use 2019 data for 311
streetlights
crime
metrostops
tools
google colab, sklearn, pandas
Work in progress