ML Guided Network Science-Based Vulnerability Model

Introduction

Since the novel SARS-CoV2 virus emerged in Wuhan, China, it has affected 219 countries and territories, with around 55 million confirmed cases and 1.3 million deaths as of 17 November 2020^[1]. The United States of America (USA) has been hit hardest, accounting for around 20.7% of all confirmed cases. Social interaction plays a crucial role in spreading disease, and transportation is the infrastructure that enables these interactions. Here we propose a framework to identify low-risk interstate transport corridors for road and air travel in the USA using machine learning-based, network science-driven vulnerability analysis.

Problem Description

The easing of restrictions and partial lockdowns on travel within the USA led to more interaction and travel, which in turn spread SARS-CoV2 and increased case counts in less-affected regions. Control measures such as symptom-based screening are in place at various airports and checkpoints, but they have limited effect because most infected travellers arrive during the incubation period. Such controls are even less effective for road transport, where multiple entry and exit points are available. There is therefore a need to identify safe (low-risk) routes for travel, whether by air or by road.

Broad Approach

The broad approach of the framework is explained in the given figure, adopted from ^[1]. The framework integrates disparate domains of science, namely machine learning, network science, geospatial engineering, and optimization.

Technical Details of Approach

To identify safe passage between two regions where potential low-risk travel corridors can be established, we calculate social and health vulnerability using the social and health indicators shown in the above figure. One of the analysis parameters, the patient count, is predicted using Long Short-Term Memory (LSTM) deep learning time series models.

Case Count Predictions

Since SARS-CoV2 case counts (and death counts) are sequences of discrete-time data, they fit the description of a time series. We used LSTM and Bi-LSTM models, as these show the best results among the disparate time series models ^[2]. We used a two-layer stacked LSTM architecture with 100 neurons in each layer, followed by a dense layer (with a different number of neurons for each state, since each state's demographic data varies significantly) to obtain a fully connected network. Finally, we added a dropout layer to prevent overfitting; the dropout rate was tuned on a state-by-state basis. We trained each model for 50 epochs and obtained a low mean squared error (MSE), between 0.0015 and 0.0030.

Given below is the code for one of the models that were trained.

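import tensorflow as tf
from tensorflow.keras import regularizers
from tensorflow.keras.layers import LSTM, Dense, Dropout
from tensorflow.keras.models import Sequential

# Method of the model wrapper class (class definition not shown)
def model_compile(self):
    ''' Build and compile the model '''
    self.model = Sequential()
    # Recurrent layer over input windows of shape (n_steps, n_features)
    self.model.add(LSTM(128, activation='relu', input_shape=(self.n_steps, self.n_features)))
    # self.model.add(LSTM(100, activation='relu'))
    # Wide dense layer with kernel, bias, and activity regularization
    self.model.add(Dense(2048, kernel_regularizer=regularizers.l1_l2(l1=1e-5, l2=1e-6),
                                bias_regularizer=regularizers.l2(1e-4),
                                activity_regularizer=regularizers.l2(1e-5)))
    self.model.add(Dropout(0.2))  # dropout to reduce overfitting
    self.model.add(Dense(1))  # single output: the next value in the series

    # Adam with a small learning rate; Huber loss, with MSE tracked as a metric
    self.model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.0001), loss=tf.keras.losses.Huber(), metrics=['mse'])
    return self.model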
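
The model above expects input windows of shape (n_steps, n_features). The following is a minimal sketch of how a case-count series could be split into such windows, assuming a univariate series and one-step-ahead targets. The helper name `make_windows` and the sample numbers are hypothetical and are not taken from the project code.

```python
def make_windows(series, n_steps):
    """Split a univariate series into (input window, next value) pairs."""
    X, y = [], []
    for start in range(len(series) - n_steps):
        window = series[start:start + n_steps]
        # One feature per time step, matching input_shape=(n_steps, n_features)
        X.append([[value] for value in window])
        y.append(series[start + n_steps])
    return X, y


# Made-up daily case counts, for illustration only
daily_cases = [10, 12, 15, 19, 24, 30]
X, y = make_windows(daily_cases, n_steps=3)
```

Each entry of `X` is a window of three consecutive counts and the matching entry of `y` is the count on the following day. Converting `X` to an array gives the (samples, n_steps, n_features) layout that the LSTM layer takes.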

Vulnerability Calculation

After the confirmed patient count is obtained from the machine learning model, we select the other variables using a deductive approach ^[3]. The dataset is then normalized using min-max rescaling.
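
As an illustration of the rescaling step, here is a minimal sketch of min-max rescaling for a single indicator, mapping its values to the range [0, 1]. The function name `min_max_rescale` and the handling of a constant column (all zeros) are assumptions made for this example.

```python
def min_max_rescale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant indicator carries no information, so map it to 0
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Made-up indicator values for three states
rescaled = min_max_rescale([2, 4, 6])
```

Rescaling each indicator separately puts variables with different units on a common scale before they are combined.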

Data Envelopment Analysis (DEA), a nonparametric optimization technique based on linear programming, is used to compute the health and social vulnerability (Equation 1).

The total vulnerability is computed as the geometric mean of the social and health vulnerabilities. To calculate the en-route risk for interstate travel, the shortest path between two states is first computed using Google Maps. The resulting road network consists of state centroids (nodes) and shortest routes (links). The en-route travel risk for the road network is calculated using Equation 2.
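
The two steps above can be sketched as follows. The geometric mean follows directly from the text. The route search is an assumption added for illustration: it applies Dijkstra's algorithm to a small network in which each link already carries a risk value. The link risks are placeholders rather than outputs of Equation 2, and the names `total_vulnerability` and `lowest_risk_route` are hypothetical.

```python
import heapq
import math


def total_vulnerability(social, health):
    """Geometric mean of the social and health vulnerability scores."""
    return math.sqrt(social * health)


def lowest_risk_route(links, origin, destination):
    """Dijkstra's algorithm over non-negative link risks.

    links maps a state to a list of (neighbour, link_risk) pairs.
    Returns (total_risk, route), or (math.inf, []) if no route exists.
    """
    queue = [(0.0, origin, [origin])]
    settled = set()
    while queue:
        risk, state, route = heapq.heappop(queue)
        if state == destination:
            return risk, route
        if state in settled:
            continue
        settled.add(state)
        for neighbour, link_risk in links.get(state, []):
            if neighbour not in settled:
                heapq.heappush(queue, (risk + link_risk, neighbour, route + [neighbour]))
    return math.inf, []


# Placeholder link risks; in the framework these would come from Equation 2
links = {
    "A": [("B", 0.5), ("C", 0.25)],
    "C": [("B", 0.125)],
    "B": [],
}
risk, route = lowest_risk_route(links, "A", "B")
```

In this toy network the indirect route through C has a lower cumulative risk than the direct link, so a lowest-risk corridor need not be the most direct one.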

For air transport, the duration of the flight is multiplied by the health vulnerability. The resulting air network represents the risk associated with travel from origin to destination.
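
A minimal sketch of the air-transport risk described above, assuming the flight duration is measured in hours. The text does not say which state's health vulnerability enters the product (origin, destination, or a combination), so the score is left as a plain argument here. The function name and the sample flights are hypothetical.

```python
def air_travel_risk(flight_hours, health_vulnerability):
    """Risk of one flight: duration multiplied by a health vulnerability score."""
    return flight_hours * health_vulnerability


# Made-up flights: (origin, destination) -> (duration in hours, vulnerability score)
flights = {
    ("A", "B"): (2.0, 0.25),
    ("A", "C"): (4.0, 0.5),
}

# Air network as a mapping from each origin-destination pair to its risk
air_network = {
    pair: air_travel_risk(hours, score) for pair, (hours, score) in flights.items()
}
```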

Impact

The current study shows how machine learning models can be integrated with networks. The framework helps establish low-risk travel corridors, which can support the understanding and planning of interstate travel. Authorities or users can choose the most suitable route from the multiple alternative routing options the model provides.

Contributors

Udit Bhatia, IIT Gandhinagar
Raviraj Dave, IIT Gandhinagar
Dhruv Menon, IIT Gandhinagar
Rwik Rana, IIT Gandhinagar
Pranshu Kumar Gond, IIT Gandhinagar
