Assessment of the CO2 emission and economic impact of the electrification of private means of transport for the construction of a new transport mode. Prediction of popularity

Acosta, O., Bellardini, J., Castro, R., Ljungberg, M. and Prestupa, R., 2023

Summary

The following project will address the issue of private transport services and their environmental impact taking into consideration the profitability of the service. It will be demonstrated that the adoption of electric vehicles in private transport services will considerably reduce CO2 emissions without profoundly impacting profitability.

Keywords: data science, CO2, private cars, private vehicles

Hypothesis

The adoption of electric vehicles for a private transport fleet will significantly reduce CO2 emissions, with no significant negative impact on the company's profitability.

Objectives

General Objective

To assess air and noise pollution from private transport vehicles by comparing emissions of internal combustion vehicles vs electric vehicles during the period 2015-2022 in New York City, expecting their reduction within 12 years (the cost equalization time between both types of vehicles; the exact figure is 12.65 years), and verifying user acceptance of electric vehicles through a popularity prediction model.

Specific Objectives

Environmental: Obtain the KPI CO2 emission rate, defined as the sum of CO2 emissions from combustion vehicles over electric vehicles, proving a 30% decrease over 12 years.
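As a minimal sketch of how this KPI could be computed, assuming per-trip CO2 figures in kilograms are already available for both fleets. The function names and sample values are hypothetical and do not come from the project data.

```python
def co2_emission_rate(ice_emissions_kg, ev_emissions_kg):
    """Ratio of total combustion-vehicle CO2 to total electric-vehicle CO2."""
    total_ev = sum(ev_emissions_kg)
    if total_ev == 0:
        raise ValueError("EV emissions sum to zero; the ratio is undefined")
    return sum(ice_emissions_kg) / total_ev


def percent_change(baseline, current):
    """Relative change of `current` against `baseline`, in percent."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (current - baseline) / baseline * 100.0


# Hypothetical per-trip emissions in kg of CO2 (illustrative values only).
ice_trips = [2.0, 3.0, 5.0]
ev_trips = [1.0, 1.5, 2.5]

rate = co2_emission_rate(ice_trips, ev_trips)
print(f"CO2 emission rate (ICEV over EV): {rate:.2f}")
```

The `percent_change` helper can then compare the KPI at two points in time against the 30% target, for example `percent_change(rate_start, rate_end) <= -30`.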

Economic: Determine the KPI Return on investment per trip. The return on investment will be calculated from the fuel expenditure. The KPI will then be the net gain (revenue - fuel cost) on electric vehicles over combustion vehicles, expecting values of no less than 200%, assessed semi-annually.
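One possible reading of this KPI, sketched under the assumption that it is the EV net gain divided by the combustion-vehicle net gain, expressed as a percentage. The function names and the fare and cost figures are hypothetical.

```python
def net_gain(revenue, energy_cost):
    """Net gain of a trip: revenue minus fuel (or electricity) cost."""
    return revenue - energy_cost


def roi_per_trip_kpi(ev_revenue, ev_energy_cost, ice_revenue, ice_fuel_cost):
    """EV net gain over combustion-vehicle net gain, in percent."""
    ice_gain = net_gain(ice_revenue, ice_fuel_cost)
    if ice_gain <= 0:
        raise ValueError("combustion-vehicle net gain must be positive")
    return net_gain(ev_revenue, ev_energy_cost) / ice_gain * 100.0


# Hypothetical trip: same fare for both vehicles, different energy cost (USD).
kpi = roi_per_trip_kpi(ev_revenue=20.0, ev_energy_cost=2.0,
                       ice_revenue=20.0, ice_fuel_cost=11.0)
meets_target = kpi >= 200.0  # target stated in the objective above
print(f"ROI per trip KPI: {kpi:.0f}% (target met: {meets_target})")
```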

Economic: Evaluate the KPI Recovery Period, defined as the time required in months to recover the initial investment. The initial investment, in USD, is the average price of 100 EV units plus the installation of 20 EV charging points, without taking into account the city's own rates, labor cost according to city standards, cost of land, lawyers, etc. The estimated income, in units of USD/month, is calculated as income from the trips (fare paid by the passengers) plus the profits from the sale of electricity at the charging points, less the cost of electricity for the trips made and the salaries of the team of 200 drivers at $2,500 per month each. The KPI is calculated as initial investment / estimated monthly income. A recovery period of less than 5 years (60 months) will be considered a success.
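The recovery period formula can be sketched as follows. The fleet size, number of charging points, number of drivers and salary come from the definition above; every price and income figure is a hypothetical placeholder, and the 60-month threshold assumes the success criterion of 5 years.

```python
EV_UNITS = 100                 # from the KPI definition
CHARGING_POINTS = 20           # from the KPI definition
DRIVERS = 200                  # from the KPI definition
DRIVER_MONTHLY_SALARY = 2_500  # USD, from the KPI definition


def initial_investment(avg_ev_price, charger_install_cost):
    """Purchase of the EV fleet plus installation of the charging points (USD)."""
    return EV_UNITS * avg_ev_price + CHARGING_POINTS * charger_install_cost


def estimated_monthly_income(trip_income, charging_profit, electricity_cost):
    """Monthly income net of electricity and driver salaries (USD/month)."""
    salaries = DRIVERS * DRIVER_MONTHLY_SALARY
    return trip_income + charging_profit - electricity_cost - salaries


def recovery_period_months(investment, monthly_income):
    """Months needed to recover the investment at a constant monthly income."""
    if monthly_income <= 0:
        raise ValueError("the investment is never recovered with non-positive income")
    return investment / monthly_income


# Hypothetical inputs in USD (illustrative values only).
investment = initial_investment(avg_ev_price=40_000, charger_install_cost=50_000)
income = estimated_monthly_income(trip_income=650_000, charging_profit=20_000,
                                  electricity_cost=45_000)
months = recovery_period_months(investment, income)
print(f"Recovery period: {months:.1f} months (success: {months < 60})")
```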

Sociocultural: Generation of a map of Mexico City with the predicted popular locations for charging stations and the KPI Population rate per station, defined as the total amount of population of the Metropolitan Area of Mexico City divided by the number of charging stations in the city. After the implementation of the project in a period of one year, a reduction of at least 5% in the rate is expected.
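A short sketch of the Population rate per station and of the check against the 5% reduction target. The population and station counts are hypothetical placeholders, not figures for Mexico City.

```python
def population_per_station(population, station_count):
    """Inhabitants per charging station."""
    if station_count <= 0:
        raise ValueError("station_count must be positive")
    return population / station_count


def rate_reduction_percent(rate_before, rate_after):
    """Reduction of the rate, in percent of its initial value."""
    return (rate_before - rate_after) / rate_before * 100.0


# Hypothetical figures (illustrative values only).
population = 20_000_000
rate_before = population_per_station(population, station_count=1_000)
rate_after = population_per_station(population, station_count=1_250)

reduction = rate_reduction_percent(rate_before, rate_after)
meets_target = reduction >= 5.0  # target stated in the objective above
print(f"Reduction in population per station: {reduction:.1f}%")
```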

Justification

The current project seeks to generate conclusions on the economic performance and environmental impact that the use of internal combustion vehicles generates compared to electric motor vehicles in order to inform the investment decisions of a private transport company in Mexico City.

Introduction

In recent decades, growing emphasis has been placed on the environmental impact of our technologies and on how to make a positive impact on the ecosystem (Luiz Adriana Pita-Morales, 2016). Conventional means of transport, such as land, air and sea vehicles that release greenhouse gases through combustion, are one of the main factors that have a negative impact on the environment (Loreto Inés Roas Valera, 2011). This project analyzes the impact of land vehicle transport services on the environment and how to counteract it. Specifically, it looks at environmental trend patterns in New York City over the last decade.

An electric vehicle (EV) is defined as a vehicle that uses one or more electric motors for propulsion and uses energy stored in its rechargeable battery, which can be charged by home connection or at public charging points (Weldon, et al., 2018). There are 2 main categories: hybrid cars (HEVs) and all-electric cars (AEVs). The latter are equipped with motors powered by electric sources and are subclassified into battery-powered vehicles (BEVs), which need external charging, and fuel cell vehicles (FCEVs), which do not. Plug-in hybrids (PHEVs) are a specific type of HEV with the option of recharging the battery (Das, et al., 2020).

One of the main barriers to EVs representing a considerable portion of the market is the perception of increased costs compared to internal combustion vehicles (ICEVs) (Weldon, et al., 2018). A study of the lifetime cost of ownership of EVs compared to hybrid and traditional vehicles, conducted by Roth in 2015, found that EVs are often the least expensive to own, although this varies with government subsidies, length of ownership, and fueling costs in each city (Weldon, et al., 2018). In a study of the perceptions of over 2000 individuals conducted in the 21 largest US cities, 95% of respondents said that they were not aware of incentives for EV purchase, such as direct subsidies, free parking, carbon dioxide emission tax and increased charging infrastructure, which are budget-limited public strategies (United States Congress, 2009; Weldon, et al., 2018).

Total cost of ownership (TCO) is a widely used method for comparing the economic position of vehicles to inform consumer, manufacturing and policy-making decisions (Liu et al., 2021). Wu et al. presented a probabilistic method incorporating the natural stochasticity of conventional TCO parameters applied to vehicles, further segmented into consumer-oriented TCO and society-oriented TCO, which incorporated the effect of greenhouse gases (Liu et al., 2021). Conventional TCO behavior between EVs and ICEVs indicates that EVs are generally more expensive in the absence of federal policies promoting their use (Liu et al., 2021). A study conducted by Liu et al. in 2021 determined the time in years needed to recoup the investment of buying an EV versus an ICEV, ranking them by taking into account curb weight, miles per gallon, power, price and component costs of gearing, exhaust, and other correlates of price and available data (Liu et al., 2021). From the analysis and development of formulas, it was determined that the time needed to equalize costs between ICEVs and BEVs is 6.8 and 7.7 years for vehicles with a curb weight of 1 and 2.5 t respectively, and 11.2 and 14.1 years for the same weights if the installation of the home charging system is taken into account (Liu et al., 2021).

Recent studies report that the number of EVs exceeded 3 million in 2017, whereas in 2007 there were only hundreds of them. Countries such as Norway, Iceland, Sweden and Denmark lead the list, with the USA in 8th place by 2018 (Das, et al., 2020). It is estimated that by 2035, 100 million EVs will be circulating worldwide according to Energy Outlook, while the International Energy Agency (IEA) stated that the target is 548 million by 2040 (Das, et al., 2020).
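The General Objective quotes a cost equalization time of 12.65 years. That figure matches the simple mean of the two break-even times reported by Liu et al. (2021) when the home charging installation is included, although the text does not state how it was derived. The sketch below makes that assumption explicit; the variable names are illustrative.

```python
# Break-even times in years (Liu et al., 2021) including the home charging
# installation, for curb weights of 1 t and 2.5 t respectively.
break_even_with_charger_years = [11.2, 14.1]

# Assumption: the quoted 12.65 years is the arithmetic mean of both values.
mean_break_even_years = (
    sum(break_even_with_charger_years) / len(break_even_with_charger_years)
)
print(f"Mean cost equalization time: {mean_break_even_years:.2f} years")
```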

Scope

Temporal scope:

The study will start by examining the taxi industry in New York, using available databases and data from 2016 onwards. The period considered for data collection and analysis spans from the entry into force of the Paris Agreement in November 2016 to the current date (Rogelio Fernández-Reyes, 2016). This will capture changes and developments in the adoption of electric cars in the transport industry, including the evolution of the technology and its economic and environmental impacts over time. This broad time range will support the feasibility analysis of investment in electric cars in the transport industry. For the evaluation of the investment recovery time, a proportional ratio is assumed between the implementation of the strategic business plan in the US and in Mexico.

Geographical scope and applicability in other cities:

The study will begin by examining the transport industry in New York and will consider extrapolating the results obtained to Mexico City and other cities. The inclusion of Mexico City in the project is based on its relevance as one of the largest and most populated cities in Latin America. The implementation of electric vehicles in the transport sector in a city with a significant population density such as Mexico City could have a substantial impact on reducing pollution and greenhouse gas emissions. By extrapolating the results obtained in New York to Mexico City, we seek to evaluate the potential environmental and economic benefit of the adoption of electric cars in a similar metropolis, which could serve as a reference for other cities in the region with similar demographic and geographic characteristics.

Economic evaluation:

The following economic aspects will be analyzed:

  • Electric vehicle procurement costs in New York.
  • Electric vehicle maintenance costs in New York.
  • Fuel savings by using electric cars in New York.

The criteria used to assess the viability of investment in electric cars in the taxi industry will be:

  • Return on investment in New York.
  • Return on investment period in New York.
  • Environmental benefits in New York.

Product maintenance and upgrades:

Once the project is completed, no ongoing maintenance will be provided for the developed product.

However, the design of the analysis will be such as to allow for the incorporation of new data sources in the future, which will facilitate its updating.

A small test will be conducted by adding a limited set of new data to the product, in order to demonstrate its adaptability and automation.

Limits of the project:

An analysis of the legal consequences that could be caused by the implementation of the measures detailed in this project in Mexico City will not be included. This is because we do not have the knowledge or the time to generate the research that such an analysis would require.

The demographic and cultural implications that would be caused by the implementation of the new transport paradigm in Mexico City detailed in this project will not be considered as these areas are not aligned with the objectives of this project.

Methodology

Definition of the team and responsibilities:

Octavio Acosta Monti, Data Science (ML) & Data Analytics: Research papers related to Machine Learning systems. Review of the production of Machine Learning systems. Assistance in the production of graphics and interactive dashboard following the guidelines and idiosyncrasies of the consultancy. Research and creation of a speech based on storytelling to present to the stakeholder.

Jeremias Belardini, Data Analytics: Production of the analysis thanks to the identification of patterns and trends in the data and interpretation of the results. Collaboration in the creation of the interactive dashboard, to facilitate the visualization and understanding of the data by the project stakeholders.

Ricardo Castro Peraza, Data Science (ML): Research of previous literature on the development of ML models focused on the comparison of electric vehicles versus internal combustion vehicles. Development of ML model for the prediction of sociological factors associated with EV charging infrastructure, its evaluation methods and parameter adjustment.

Mauricio Ljungberg, Data Engineer: Determination of the database structure to be used, design and administration of the database, process of exploratory analysis of the data obtained, design of interactions with APIs to obtain quality data, writing the documentation of the entire data pipeline process.

Romina Prestupa, Data Engineer: Data extraction from two types of sources, databases and APIs, primary analysis of data quality, transformation (outliers exploration, null treatment, discretisation). Column selection. Creation of the data warehouse and deployment in the cloud.

Juan Gerardo, Scrum Master

Working strategy

Two meetings will be held daily as part of the team's working methodology, one in the morning and one in the afternoon. During the morning meeting, the team will plan and discuss the day's work, generate new tasks for inclusion in the plan, and define the slide to be presented at the afternoon meeting. The afternoon meeting, led by the Scrum Master, will provide a space to raise doubts and share the progress achieved. The Scrum Master will ask the team members brief questions about the previous day's activities, the current day's plans, and any issues or challenges that are arising. The slides created in the morning meeting will also be presented, and the Scrum Master will give feedback so that the work stays in line with stakeholder requirements. To monitor the team's progress on its tasks, milestones will be set on the Friday of each of the four weeks of the project. A Gantt chart has been made available to the team for monitoring and marking the objectives.

Data collection

The database of the New York City Taxi & Limousine Commission (NYC TLC) will be used. The NYC TLC is responsible for collecting and providing data, covering both service and volume, on yellow taxis, green taxis and private for-hire vehicles. This will be the main basis for extracting information on the main preferences regarding the movement of ground vehicle transport services between different areas of New York.

The Historical Weather API will also be used. It collects and provides historical weather data for each area of the world. This source will provide weather trends in relation to transport services in New York City.

Exploratory data analysis

In the first instance, an exploration of the quality of the available data detailed above will be carried out, aiming to assess completeness and reliability, prior to any analysis or modeling.

In the columns of each dataset, we will assess which variables are of interest for our analysis. The completeness of the data will be assessed, in order to make decisions on the treatment of null or empty data, and the quality and consistency of the available data will also be evaluated.

Data transformation and loading (TL)

The transformation and loading process will consist of the following steps:

Initial analysis and exploration: a preliminary analysis of the data will be performed to understand its structure, variable types and quality issues.

Data cleaning: identification and handling of missing values, duplicates, outliers, formatting errors, etc. Imputation or elimination techniques will be used, when necessary.

Normalization and standardization: data will be verified to have a consistent format, using normalization and standardization methods on selected variables.

Creation of new variables: new fields, such as calculated columns, will be created when the analysis requires them.

Subsequently, the final structure of the data will be determined and loaded into a database and made available in the cloud.
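The cleaning, normalization and variable-creation steps above could look roughly like the following pandas sketch. The column names `trip_distance` and `fare_amount`, the IQR rule for outliers, the min-max scaling and the sample rows are illustrative assumptions, not the project's actual pipeline.

```python
import pandas as pd


def clean_trips(raw: pd.DataFrame) -> pd.DataFrame:
    """Apply the cleaning, normalization and new-variable steps to trip data."""
    # Data cleaning: drop exact duplicates and rows with missing key values.
    df = raw.drop_duplicates().dropna(subset=["trip_distance", "fare_amount"])
    # Keep only physically plausible trips.
    df = df[(df["trip_distance"] > 0) & (df["fare_amount"] > 0)]
    # Outliers: keep fares within 1.5 * IQR of the quartiles (assumed rule).
    q1, q3 = df["fare_amount"].quantile([0.25, 0.75])
    iqr = q3 - q1
    df = df[df["fare_amount"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)].copy()
    # New variable: a calculated column.
    df["fare_per_mile"] = df["fare_amount"] / df["trip_distance"]
    # Normalization: min-max scaling of the distance to the range [0, 1].
    low, high = df["trip_distance"].min(), df["trip_distance"].max()
    df["trip_distance_norm"] = (
        (df["trip_distance"] - low) / (high - low) if high > low else 0.0
    )
    return df.reset_index(drop=True)


# Hypothetical raw sample with a duplicate, a missing value and an outlier.
raw_trips = pd.DataFrame({
    "trip_distance": [1.0, 2.0, 2.0, None, 3.0, 4.0, 2.5],
    "fare_amount": [5.0, 10.0, 10.0, 8.0, 15.0, 500.0, 12.0],
})
clean = clean_trips(raw_trips)
print(clean)
```

Whether to impute or drop missing values is a decision the exploratory analysis would inform; dropping is used here only to keep the sketch short.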

Data analysis

Identification of patterns and trends in data distribution: An exploratory analysis of the data will be carried out to identify patterns and trends in data distribution. Visualization techniques and statistical analysis will be used to understand the structure and characteristics of the data.

Relationship between variables in the same tables: An analysis of the relationship between variables within the same tables will be carried out. Correlations and dependencies between variables will be examined to identify possible significant relationships.
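As an illustration of the within-table analysis, a Pearson correlation matrix can be computed with pandas. The column names and values below are hypothetical; the fare is built as a linear function of the distance so that the expected relationship is obvious.

```python
import pandas as pd

# Hypothetical trip table (illustrative values only).
trips = pd.DataFrame({
    "trip_distance": [1.0, 2.0, 3.0, 4.0],
    "fare_amount": [5.0, 9.0, 13.0, 17.0],  # 1 + 4 * distance
    "passenger_count": [1, 3, 2, 1],
})

# Pairwise Pearson correlation between all numeric columns.
corr_matrix = trips.select_dtypes(include="number").corr(method="pearson")
distance_fare_corr = corr_matrix.loc["trip_distance", "fare_amount"]
print(corr_matrix.round(2))
```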

Relationship between patterns in different tables: The relationship between patterns in different tables will be explored. Meaningful connections and common patterns between different data sets will be sought to get a more complete picture of the analyzed data.

Interpretation of results and generation of preliminary conclusions: The results obtained in the previous tasks will be interpreted and preliminary conclusions generated. Patterns identified and relationships found will be examined to extract relevant information according to the objectives of the analysis.

Interactive dashboard generation

Power BI and Azure will be integrated to establish the connection between the two tools, enabling efficient and up-to-date extraction of data from the database in Azure.

The dashboard will be designed in a way that is consistent with the identity of the report, using colors, fonts and styles that maintain a professional and consistent appearance with the rest of the document.

The pages and sections of the dashboard will be structured logically, facilitating the structured presentation of information. Each metric or area of analysis will have its own page for easy navigation.

Interactive visualizations will be created that will allow users to adjust and customize the visualization of data through the use of filters, segmentations and interactive controls.

The most relevant KPIs will be highlighted on the dashboard, using specific visualizations such as cards or indicators to highlight crucial information and facilitate immediate understanding.

Commentary and explanatory annotations will be added to the dashboard to provide context and guidance to users in interpreting the data. These comments will be based on trends, important fundamentals and recommendations drawn from the results of the analysis.

Prediction Model

The Urban Gis and Charging Station Data database available at IEEE DataPort will be used. This database was chosen because it has previously been used to develop prediction models for the location of electric vehicle charging stations in large cities in the Netherlands. The strategy is to train a gradient boosted regression tree to predict the location of EV charging infrastructure, due to its usefulness when external conditions are taken into account in the calculation. A model with an F-Score greater than 0.6 and an accuracy greater than 0.7 will be considered successful. These thresholds are similar to the results obtained by Straka in 2020 when evaluating similar variables with the same regression model.
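The evaluation criterion can be sketched with scikit-learn. Because F-Score and accuracy are classification metrics, the sketch treats popularity as a binary label and uses `GradientBoostingClassifier`; that framing, the synthetic data from `make_classification` and all hyperparameters are assumptions, and the project's own model (for example one built with XGBoost on the real dataset) may differ.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split

F_SCORE_TARGET = 0.6   # success threshold stated above
ACCURACY_TARGET = 0.7  # success threshold stated above

# Synthetic stand-in for station features and a popular / not popular label.
X, y = make_classification(n_samples=500, n_features=8, n_informative=4,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Gradient boosted trees with default hyperparameters.
model = GradientBoostingClassifier(random_state=0)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

f_score = f1_score(y_test, y_pred)
accuracy = accuracy_score(y_test, y_pred)
meets_targets = bool(f_score > F_SCORE_TARGET and accuracy > ACCURACY_TARGET)
print(f"F-Score: {f_score:.2f}, accuracy: {accuracy:.2f}, success: {meets_targets}")
```

Scores on synthetic data say nothing about the real problem; the sketch only shows how the two thresholds would be checked on a held-out test set.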

Finally, an overview of the steps to be taken for the success of the project is incorporated (Fig. 1).

Metodology.jpg

Fig 1. General steps of the current project.

Results

Gantt Chart

The Gantt chart is available for evaluation as a spreadsheet at this location: Gantt Chart

Technology Stack

  • Cloud service: Azure
  • Data Warehouse: MySQL database
  • Python exploratory data analysis libraries: Pandas, Matplotlib, NumPy, Seaborn
  • Automation: Docker and Apache Airflow
  • ETL: Python
  • Documentation collection service: GitHub
  • Dashboard: Power BI and Streamlit
  • Machine Learning modeling tool: Scikit-Learn and XGBoost

Entity Relationship Diagram and Dictionary of Columns

ER-Model-Dic-of-columns.png

Brand identity

For the development of the current project, an identity of the consultancy was generated by developing its logo (Fig 2), mission, vision and values.

veloxia-logo.png

Fig 2. Veloxia logo.

  • Name: Veloxia
  • Slogan: Data-driven solutions
  • Vision: "To be the one everyone comes to for the best data-driven decisions".
  • Mission: "Our mission is to grow your business by taking the best data-driven, environmentally friendly approach to business success".
  • Values: sincerity, humility, smart work, quality, togetherness, commitment, inclusiveness.

Bibliography

  • Carpenter, T., Curtis, A. and Keshav, S. (2013) The return on investment for taxi companies transitioning to electric vehicles. A case study in San Francisco.
  • Das, H., Rahman, M., Li, S. and Tan, C. (2019) Electric vehicles standards, charging infrastructure, and impact on the grid integration: A technological review. Renewable and Sustainable Energy Reviews.
  • Focas, C. (2016) Travel behaviour and CO2 emissions in urban and exurban London and New York. Transport Policy.
  • Isik, M., Dodder, R. and Ozge, P. (2021) Transportation emissions scenarios for New York City under different carbon intensities of electricity and electric vehicle adoption rates. Nature Energy.
  • Liu, H. (2011) Model to Forecast the Popularity of Electric Vehicles. School of Information and Communication Engineering of Beijing University of Posts and Telecommunications.
  • Liu, Z., Song, J., Kubal, J., Susarla, N., Knehr, K., Islam, E., Nelson, P. and Ahmed, S. (2021) Comparing total cost of ownership of battery electric vehicles and internal combustion engine vehicles. Energy Policy.
  • Pita, L. (2016) Línea de tiempo: educación ambiental en Colombia. Revista Praxis.
  • Roas, L. (2011) Los vehículos eléctricos. Universidad Antonio de Nebrija.
  • Straka, M., De Falco, P., Ferruzzi, G., Proto, D., Van der Poel, G., Khormali, S. and Buzna, L. (2020) Predicting Popularity of Electric Vehicle Charging Infrastructure in Urban Context. IEEE.
  • Weldon, P., Morrissey, P. and O'Mahony, M. (2017) Long-term cost of ownership comparative analysis between electric vehicles and internal combustion engine vehicles. Sustainable Cities and Society.

Annexes

Abbreviations

  • AEV: All-electric vehicles
  • BEV: Battery electric vehicles
  • CO2: Carbon dioxide
  • EDA: Exploratory Data Analysis
  • EV: Electric vehicles
  • HEV: Hybrid electric vehicles
  • ICEV: Internal combustion vehicles
  • IEA: International Energy Agency
  • KPIs: Key Performance Indicators
  • ML: Machine Learning
  • NYC TLC: New York City Taxi & Limousine Commission
  • SQL: Structured Query Language, a programming language
  • TCO: Total Cost of Ownership
  • USA: United States
  • USD: American dollars