This repository contains all the code required to reproduce the results presented in the following paper:
- M. De Nadai, B. Lepri. The economic value of neighborhoods: Predicting real estate prices from the urban environment, 2018.
Input, intermediary and source data can be downloaded from figshare.
Dependencies are listed in the requirements.txt file at the root of the repository. With Python 3.6 and pip, all the required dependencies can be installed automatically:
pip3 install -r requirements.txt
- PostgreSQL 10.0
- PostGIS 2.4.1 extension
Due to storage constraints, input data are not included in this repository. However, the input and intermediary files required to run the analysis can be downloaded from figshare. To run the code below, the input and/or intermediary files must be downloaded and placed in the folder. Then run:
createdb dsaa
gunzip < intermediate_db_backup.sql.gz | psql dsaa
tar -xf data.tar -C data/
Then place the content of dsaa_census_areas.zip into data/generated_files/.
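The archive extraction step above can also be scripted. The following is a minimal sketch, assuming dsaa_census_areas.zip has already been downloaded; the helper name `extract_census_areas` is hypothetical and not part of this repository.

```python
import zipfile
from pathlib import Path


def extract_census_areas(archive_path, target_dir="data/generated_files"):
    """Unpack a zip archive into target_dir and return the archived names."""
    target = Path(target_dir)
    # Create the destination folder if it does not exist yet.
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(str(archive_path)) as archive:
        archive.extractall(str(target))
        return archive.namelist()


# Example call (assumes the archive is in the current directory):
# extract_census_areas("dsaa_census_areas.zip")
```

The function returns the list of extracted entries, which makes it easy to confirm that the expected files ended up in data/generated_files/.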
To produce the intermediary files, go to the section "DIY Instructions".
The code of the analysis is divided into two parts: the Python scripts and modules used to support the analysis, and the notebooks where the outputs of the analysis have been produced.
- data_processing_houses.ipynb: script used for the pre-processing of Immobiliare.it data.
- compute_walkability.py: script used to generate the walkability scores for each census area.
- data_processing_neighborhood.py: script used to create the whole dataset.
- predict.py: script used to predict the housing value from the intermediary files.
- plots.ipynb: script used to produce the images of the manuscript.
This code is licensed under the MIT license.
Here we generate the entire database from scratch. To do so, first create the minimal setup with these commands:
psql dsaa < data/SQL/minimal.sql
psql dsaa < data/SQL/minimal_materialize.sql
Census data have to comply with the format of the census_areas_onfocus table. Proceed with the remaining steps only after importing data into this table. Once the data are imported, generate the spatial matrix with:
psql dsaa < data/SQL/first-DIY-step.sql
An OpenStreetMap file has to be downloaded (preferably from here) and placed in data/OSM. It is then imported into PostGIS with:
osm2pgsql -c -d dsaa --create --style "config/osm2pgsql.style" --multi-geometry --number-processes 5 --latlong -C 30000 [FILENAME].osm.pbf
The same OSM file can then be used to produce the OSRM database:
osrm-extract -p config/profiles/foot.lua [FILENAME].osm.pbf
osrm-contract [FILENAME].osrm
To run the server, use the command:
osrm-routed [FILENAME].osrm
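Once the server is running, a quick request can confirm that it answers walking-route queries. The sketch below is not taken from compute_walkability.py; it assumes the server listens on the osrm-routed default address (localhost:5000) and uses the OSRM v1 route service. The function names are hypothetical.

```python
import json
from urllib.request import urlopen

# Assumption: osrm-routed listens on its default address, localhost:5000.
OSRM_URL = "http://localhost:5000"


def route_url(origin, destination, base_url=OSRM_URL):
    """Build a route request URL; points are (longitude, latitude) pairs."""
    coords = "{},{};{},{}".format(
        origin[0], origin[1], destination[0], destination[1]
    )
    # The profile segment is set to "foot" to mirror the extraction profile.
    return "{}/route/v1/foot/{}?overview=false".format(base_url, coords)


def walking_distance(response):
    """Return the distance in meters of the first route in a parsed reply."""
    if response.get("code") != "Ok" or not response.get("routes"):
        raise ValueError("OSRM returned no route: {}".format(response.get("code")))
    return response["routes"][0]["distance"]


def query_distance(origin, destination, base_url=OSRM_URL):
    """Ask a running OSRM server for the walking distance between two points."""
    with urlopen(route_url(origin, destination, base_url)) as reply:
        return walking_distance(json.loads(reply.read().decode("utf-8")))
```

Note that OSRM expects coordinates in longitude,latitude order; swapping them is a common source of "no route" errors.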
After this, everything is set up to create the intermediate data in the database. Import all the materialized views, then run the script. Before running it, customize lines 13 and 35 of compute_walkability.py.
psql dsaa < data/SQL/walkability.sql
python3 compute_walkability.py
To create the security perception scores, we use the code and weights of the following paper:
- De Nadai, M., Vieriu, R. L., Zen, G., Dragicevic, S., Naik, N., Caraviello, M., ... & Lepri, B. Are safer looking neighborhoods more lively?: A multimodal investigation into urban life. In ACM MM 2016.
Everything is available here. All the predictions should be placed inside the placepulse table in PostgreSQL. Then, you can import/refresh the materialized view with:
psql dsaa < data/SQL/security.sql
You can insert a dataset with the census areas (geoid) and a proxy of company earnings (fatturato) in data/companies.csv. Note that this is included only in the non-open model version.
You can insert a dataset with the census areas (geoid) and a proxy of land value (assessed_land_value) in data/land_value.csv. Note that this is included only in the non-open model version.
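Before running the pipeline, it may help to check that the two optional CSV files carry the expected columns. This is a minimal sketch under stated assumptions: the files are comma-delimited with a header row, and the column names match the identifiers given above. The helper `check_csv_columns` is hypothetical and not part of this repository.

```python
import csv


def check_csv_columns(path, required_columns):
    """Return the number of data rows, raising if a required column is missing."""
    with open(path, newline="") as handle:
        # Assumption: the first row is a header and the delimiter is a comma.
        reader = csv.DictReader(handle)
        header = reader.fieldnames or []
        missing = [name for name in required_columns if name not in header]
        if missing:
            raise ValueError(
                "{} lacks columns: {}".format(path, ", ".join(missing))
            )
        return sum(1 for _ in reader)


# Example calls for the two files described above:
# check_csv_columns("data/companies.csv", ["geoid", "fatturato"])
# check_csv_columns("data/land_value.csv", ["geoid", "assessed_land_value"])
```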
Census data have to be inserted in the same format as the files placed in data/census and data/census/industry. To use a different format, change the corresponding code in data_processing_housing.py.
Download satellite shapefiles from https://land.copernicus.eu/local/urban-atlas/urban-atlas-2012/view. Import them into the urban_atlas PostgreSQL table. Then run:
psql dsaa < data/SQL/urban_atlas.sql
- XGBoost 0.72 is, for some reason, no longer available. I changed the required version to 0.71 after many users contacted me about this issue.