2021-02-05.md


< 2021-02-05 >

2,780,215 events, 1,397,451 push events, 2,201,297 commit messages, 168,092,848 characters

Friday 2021-02-05 00:35:53 by shadow01148

arsenal esp and gui thing that i havent tested yet

fuck you


Friday 2021-02-05 01:22:29 by Anthony

Fixed main

fuck you Jingy and your commits on main breaking everything >:(


Friday 2021-02-05 02:09:09 by DrakePork

Damn I need to start pushing more often so I actually remember what Ive changed.

  • Finished /bow and /sword commands
  • Fixed some NPE errors in Referral & ReferralList and also started work on giving people discord roles if they get X referrals
  • Updated RewardGUI (Should probably rename it Secrets but meh) to the new and improved version that I made on RoyalAsylumCore
  • Added SecretFound to add a secret to your stats
  • Forgot to initiate placeholders onEnable so now they do that
  • Added placeholders regarding bus and trains in SkyCity so that the price changes according to the placeholder so I dont have to make a GUI for every bus/train stop
  • Removed unused stuff from ConfigCreator
  • Deleted a bunch of commands like the Opme ones
  • Added an OP bypass on guard/build duty move blocker
  • Did so you cant break birch saplings in prison world
  • Put the correct main: in plugin.yml
  • Holy fuck I havnt pushed this in a long time
  • Generally fixed up in plugin.yml
  • Probably more stuff I have forgotten about

Friday 2021-02-05 02:19:37 by Rory Dale

2021-02-04

Thursday, February 4th, 2021 - the super, super, funky show! All credit for the playlist for today's show (apart from the last three songs) goes to Phil "Dr. Philphy" Edwards and his longtime friend and bandmate Steve Chandra Savale - currently of Asian Dub Foundation. Phil and Steve's history has been told a little bit on earlier shows, but Steve's love for funk, and deep cuts from the funk canon (nice job on the homonym English language!) is ready to take you on a musical voyage. There are some SERIOUSLY funky tracks on here so strap in, screw your wig on tight, and turn it up!


Friday 2021-02-05 02:56:03 by Jean-Paul R. Soucy

New data: 2021-02-04: SEE ONTARIO DATA NOTES.

Recent changes:

2021-01-27: Due to the limit on file sizes in GitHub, we implemented some changes to the datasets today, mostly impacting individual-level data (cases and mortality). Changes below:

  1. Individual-level data (cases.csv and mortality.csv) have been moved to a new directory in the root directory entitled “individual_level”. These files have been split by calendar year and named as follows: cases_2020.csv, cases_2021.csv, mortality_2020.csv, mortality_2021.csv. The directories “other/cases_extra” and “other/mortality_extra” have been moved into the “individual_level” directory.
  2. Redundant datasets have been removed from the root directory. These files include: recovered_cumulative.csv, testing_cumulative.csv, vaccine_administration_cumulative.csv, vaccine_distribution_cumulative.csv, vaccine_completion_cumulative.csv. All of these datasets are currently available as time series in the directory “timeseries_prov”.
  3. The file codebook.csv has been moved to the directory “other”.

We appreciate your patience and hope these changes cause minimal disruption. We do not anticipate making any other breaking changes to the datasets in the near future. If you have any further questions, please open an issue on GitHub or reach out to us by email at ccodwg [at] gmail [dot] com. Thank you for using the COVID-19 Canada Open Data Working Group datasets.

  • 2021-01-24: The columns "additional_info" and "additional_source" in cases.csv and mortality.csv have been abbreviated similar to "case_source" and "death_source". See note in README.md from 2020-11-27 and 2021-01-08.

Vaccine datasets:

  • 2021-01-19: Fully vaccinated data have been added (vaccine_completion_cumulative.csv, timeseries_prov/vaccine_completion_timeseries_prov.csv, timeseries_canada/vaccine_completion_timeseries_canada.csv). Note that this value is not currently reported by all provinces (some provinces have all 0s).
  • 2021-01-11: Our Ontario vaccine dataset has changed. Previously, we used two datasets: the MoH Daily Situation Report (https://www.oha.com/news/updates-on-the-novel-coronavirus), which is released weekdays in the evenings, and the “COVID-19 Vaccine Data in Ontario” dataset (https://data.ontario.ca/dataset/covid-19-vaccine-data-in-ontario), which is released every day in the mornings. Because the Daily Situation Report is released later in the day, it has more up-to-date numbers. However, since it is not available on weekends, this leads to an artificial “dip” in numbers on Saturday and “jump” on Monday due to the transition between data sources. We will now exclusively use the daily “COVID-19 Vaccine Data in Ontario” dataset. Although our numbers will be slightly less timely, the daily values will be consistent. We have replaced our historical dataset with “COVID-19 Vaccine Data in Ontario” as far back as they are available.
  • 2020-12-17: Vaccination data have been added as time series in timeseries_prov and timeseries_hr.
  • 2020-12-15: We have added two vaccine datasets to the repository, vaccine_administration_cumulative.csv and vaccine_distribution_cumulative.csv. These data should be considered preliminary and are subject to change and revision. The format of these new datasets may also change at any time as the data situation evolves.

Revise historical data: cases (AB, BC, MB, ON, PE, QC, SK); mortality (ON).

2021-02-02: ONTARIO DATA REPORTING MAY HAVE UNUSUAL NUMBERS FOR THE NEXT SEVERAL DAYS. SEE BELOW. https://www.ontario.ca/page/how-ontario-is-responding-covid-19

“Toronto Public Health has now migrated all of their data to the provincial data system, CCM. This migration has impacted today’s daily counts. Most notably, TPH’s case count is negative following the identification of duplicate cases as well as data corrections to some fields (e.g., long-term care home residents and health care workers), resulting in an underestimation of today's cases. In addition, case counts for other PHUs may have been affected by system outages related to the migration. As a result, we anticipate fluctuations in case numbers over the next few days.”

Note regarding deaths added in QC today: “The data also report 42 new deaths, for a total of 9,941. Among these 42 deaths, 15 have occurred in the last 24 hours, 16 have occurred between January 28 and February 2, 10 have occurred before January 28 and 1 has occurred at an unknown date.” We report deaths such that our cumulative regional totals match today’s values. This sometimes results in extra deaths with today’s date when older deaths are removed.

https://www.quebec.ca/en/health/health-issues/a-z/2019-coronavirus/situation-coronavirus-in-quebec/#c47900

Note about SK data: As of 2020-12-14, we are providing a daily version of the official SK dataset that is compatible with the rest of our dataset in the folder official_datasets/sk. See below for information about our regular updates.

SK transitioned to reporting according to a new, expanded set of health regions on 2020-09-14. Unfortunately, the new health regions do not correspond exactly to the old health regions. Additionally, the provided case time series using the new boundaries do not exist for dates earlier than August 4, making providing a time series using the new boundaries impossible.

For now, we are adding new cases according to the list of new cases given in the “highlights” section of the SK government website (https://dashboard.saskatchewan.ca/health-wellness/covid-19/cases). These new cases are roughly grouped according to the old boundaries. However, health region totals were redistributed when the new boundaries were instituted on 2020-09-14, so while our daily case numbers match the numbers given in this section, our cumulative totals do not. We have reached out to the SK government to determine how this issue can be resolved. We will rectify our SK health region time series as soon as it becomes possible to do so.


Friday 2021-02-05 07:46:01 by Sam Xu

Added some silly responses

oh no -> on yeah hell yeah - oh hell no !ravioli


Friday 2021-02-05 07:47:47 by NewsTools

Created Text For URL [nation.africa/kenya/blogs-opinion/opinion/don-t-die-before-you-fall-in-love-3280326]


Friday 2021-02-05 08:48:02 by bindhu520

Add files via upload

SAFE DRIVING CHALLENGE

ML Project Report

BACHELOR OF TECHNOLOGY IN COMPUTER SCIENCE & ENGINEERING

Submitted by Ms. T BINDHU BHARGAVI

Department of Computer Science and Engineering
BVRIT HYDERABAD College of Engineering for Women
(Approved by AICTE, New Delhi and Affiliated to JNTUH, Hyderabad)
Bachupally, Hyderabad – 500090

Acknowledgement

Firstly, I would like to express my immense gratitude to BVRIT HYDERABAD College of Engineering for Women, which created a great platform for attaining profound technical skills in the field of Computer Science through the industry-enabled learning programme WISE.

I would like to extend my sincere thanks and gratitude to Dr. K V N Sunitha, Principal, BVRIT HYDERABAD College of Engineering for Women, and to the college's WISE team for their meticulous planning and conduct of this learning programme.

I would also like to extend my sincere thanks to WISE and the Talent Sprint team for providing us with this unique learning platform.

T Bindhu Bhargavi

INDEX

  1. Abstract
  2. Introduction
  3. Problem statement
  4. Approach and Statistics of code
  5. Data Sets
  6. First Model
  7. Feature Engineering
  8. PCA
  9. Neural Network
  10. Second Model
  11. Random forest and Naive Bayes
  12. Comparisons of Models
  13. Result
  14. Reference Link and Project Link

LIST OF FIGURES

  1. Scree plot of 30 features
  2. Histogram of mean alertness per trial
  3. ROC curve of two models

ABSTRACT

In this project we introduce a classifier that takes in multidimensional data consisting of real-world measurements of continuous physiological, environmental and vehicular features obtained from a number of driving sessions. We show that a Naive Bayes classifier, which assumes the data follow a Gaussian distribution, can predict whether the driver is alert while driving and achieve a reasonably low misclassification rate on the given data. We also describe how insight into the relevant features was obtained using Principal Component Analysis (PCA) and a simple correlation matrix. We obtained misclassification rates as low as 12.03% and 27.07% for the test and training data respectively.

INTRODUCTION

Given a training set and a test set, each consisting of 33 features from real-time measurements, we want to predict whether a given driver is alert or not alert while driving. Our goal is to construct a binary classifier that predicts a binary target value using all or a subset of the 33 features and gives the prediction

$$\text{Prediction} = \begin{cases} 1 & \text{if the driver is alert} \\ 0 & \text{if the driver is not alert} \end{cases} \tag{1}$$

A. Datasets

The datasets were obtained from www.kaggle.com and consist of one training set and one test set. They include measurements from a total of 510 real-time driving sessions, each lasting about 2 minutes, with a new measurement of each of the 33 features every 100 ms. The headers in the datasets are listed in Table I. The training set covers 510 driving sessions by 100 people, which gives a training set of size 604330×33.

Problem Statement

Driving while distracted, fatigued or drowsy may lead to accidents. Activities that divert the driver's attention from the road ahead, such as engaging in a conversation with other passengers in the car, making or receiving phone calls, sending or receiving text messages, eating while driving, or events outside the car, may cause driver distraction. Fatigue and drowsiness can result from driving long hours or from lack of sleep. The data for this Kaggle challenge shows the results of a number of "trials", each one representing about 2 minutes of sequential data recorded every 100 ms during a driving session on the road or in a driving simulator. The trials are samples from some 100 drivers of both genders, and of different ages and ethnic backgrounds. The files are structured as follows:

  • The first column is the Trial ID. Each period of around 2 minutes of sequential data has a unique trial ID. For instance, the first 1210 observations are sequential observations taken every 100 ms and therefore all have the same trial ID.
  • The second column is the observation number, a sequentially increasing number within one trial ID.
  • The third column has a value X for each row, where X = 1 if the driver is alert and X = 0 if the driver is not alert.
  • The next 8 columns, with headers P1, P2, ..., P8, represent physiological data.
  • The next 11 columns, with headers E1, E2, ..., E11, represent environmental data.
  • The next 11 columns, with headers V1, V2, ..., V11, represent vehicular data.

APPROACH

  • Initially, we analysed the train and test datasets.
  • We imported the required libraries.
  • We predicted the output using data pre-processing, logistic regression, feature engineering, PCA, support vector regression and a neural network.

STATISTICS OF THE CODE

  • We used Google Colaboratory to predict the output.

SAFE DRIVING CHALLENGE

1 INTRODUCTION

The objective is to design a classifier that detects whether the driver is alert or not alert, using data acquired while driving. This report illustrates the process of building a predictive machine learning model.

2 DATASETS

There are 604,329 instances of data in the training dataset and 120,840 instances of data in the test dataset. The data for this challenge shows the results of a number of "trials", each one representing about 2 minutes of sequential data recorded every 100 ms during a driving session on the road or in a driving simulator. The trials are samples from some 100 drivers of both genders, and of different ages and ethnic backgrounds. The files are structured as follows: the training data was broken into 500 trials, and each trial consisted of a sequence of approximately 1200 measurements spaced by 0.1 seconds. Each measurement consisted of 30 features, presented in three sets: physiological (P1...P8), environmental (E1...E11) and vehicular (V1...V11). Each feature was presented as a real number. For each measurement we were also told whether the driver was alert or not at that time (a Boolean label called Is Alert). No more information on the features was available.

3 EXISTING MODELS

This section summarizes existing work in order to formulate a plan for building a better-performing predictive model. Participants applied similar machine learning techniques to this dataset, mostly limited to Naive Bayes, Logistic Regression, Support Vector Machine, Neural Network, and Random Forest. The performance of their models nevertheless differs widely, because they pre-processed the original data in different ways, especially in their feature engineering. Thus, I will mainly focus on the feature engineering methods applied by the participants, rather than on how they chose the parameters of their algorithms.

The highest score (AUC = 0.861151) was reached by a logistic regression model. As the dataset consists of sequential data recorded every 100 ms for 2 minutes in each trial, the data were partitioned by trial (Trial ID) rather than randomly. The means and standard deviations of each trial were computed as new features (including for the target feature Is Alert). Afterwards, feature selection based on diagnostics of the logistic regression was conducted and three strong features were chosen for modelling (sdE5, V11, and E9). However, this model uses future observations (the mean and standard deviation can only be calculated when a trial is finished) and is therefore inapplicable to real-life situations. When a running mean and standard deviation were used for training instead, the AUC dropped slightly, from 0.861151 to 0.849245.

Another approach focuses on the instances at the initial moment the driver lost alertness. The dataset is reduced significantly in this way, and the factors that change significantly when the status changes are highlighted for feature selection. E4, E5, E6, E7, E8, E9, E10, P6, V4, V6, V10, and V11 are selected for building a Neural Network. This model reaches an AUC of 0.84953.

A further approach also aggregates the data from each trial and calculates means and standard deviations as additional features. After tossing out correlated features and applying other feature engineering, a logistic regression model trained on the selected features reaches an AUC of 0.80779.

Fourier generates around 600 new features for the dataset (the inverse, the square, and the cube of each feature, all the combinations of 2 columns, and time interval variables). Forward search is then applied to select the features that give the highest AUC. A Naive Bayes model trained on these selected features reaches an AUC of 0.844.

An epsilon-SVR model with an RBF kernel and parameters c = 2, g = 1/30, and p = 0.1 reaches an AUC of 0.839. Another approach applies a random forest with 199 trees and a minimum node size of 25, with the correlated features tossed out beforehand. This predictive model reaches an AUC of 0.81410.

3.1 SUMMARY OF EXISTING MODELS

An important property of this dataset is that it contains sequential data: for each trial, the dataset records data every 100 ms. Thus, all the participants shuffle the dataset by trials in order to preserve this sequential property. Aggregating data within a trial to generate means and standard deviations as new features has proved to be a useful method of data pre-processing. Another useful method is to choose the instances close to the moment the driver lost alertness, which reduces the time needed to train the models significantly. Multiple methods of feature selection are applied: the mean and standard deviation of existing features, the inverse, the square, the cube, and combinations of 2 columns are viewed as potentially useful new features. Correlated features and features that remain constant are always tossed out. As for the choice of predictive machine learning algorithm, there is no valid proof that one algorithm outperforms all the others in this specific situation. Generally, Naive Bayes, Logistic Regression, Random Forest, Support Vector Machine, and Neural Network all reach a good performance in this case.

4 MODEL BUILDING PLANS

Even though many existing models already perform decently, it is still possible to improve on them. A plan for building a new predictive model is outlined in this section.

4.1 GAP IDENTIFICATION

The predictive model with the highest AUC value is trained on 20% of the training dataset. What is more, the means and standard deviations of each trial are future-observation features. Together these make the model inapplicable to a real-life situation. An AUC value of 0.861151 also means there is still room for improvement.

Another noticeable point is that most of the existing models are evaluated by either AUC score or classification accuracy. For this specific situation, it is more important to identify the 'not alert' instances, as driving while not alert can be deadly. Failing to identify 'not alert' can lead to worse consequences than failing to identify 'alert'. Thus, the true negative rate (TN / (TN + FP)) can also be a valuable evaluation measure, as it shows the percentage of 'not alert' instances successfully identified. Furthermore, all the models' classification accuracies are above 50%, which makes it possible to build a better-performing ensemble model, provided that the recall and the specificity of every member model are both above 50%.

4.2 MODEL BUILDING PLAN

Firstly, the existing models with good performance will be reproduced, including the way they pre-process the data and the parameters they choose for their predictive models. Secondly, a local evaluation will be conducted on these models. Recall and specificity will be used for evaluation, in addition to classification accuracy and AUC score. The models with recall and specificity both higher than 50% will then be grouped into an ensemble model, with the aim of reaching a better performance than all the existing models.

5 SOLUTION DEVELOPMENT

Python is used as the development environment for this project, and Scikit-learn is the machine learning tool applied. Missing data are recorded as 0 throughout the dataset.

5.1 FIRST MODEL

The first predictive model is built using the following data pre-processing method. We were concerned that using the entire dataset would create too much noise and lead to inaccuracies in the model. The final goal of the system is to detect the change in the driver from alert to not alert so that the car can self-correct or alert the driver. So, we decided to focus only on the data at the initial moment when the driver lost alertness, and I subset the dataset accordingly. The first rows with 'Is Alert' == 0 and the last rows with 'Is Alert' == 0 are chosen, along with the 5 rows before and after each (with 100 ms between observations, 5 rows cover the data recorded 0.5 s before and after the driver lost alertness). After subsetting, 37421 instances without duplication are chosen to build the predictive model.
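The subsetting rule above can be read in more than one way. The sketch below shows one possible reading in pandas: it keeps only the rows within 5 observations of each point where the label changes from 1 to 0 inside a trial, and leaves out any handling of the last not-alert rows. The column names `TrialID` and `IsAlert` and the helper name `rows_around_alertness_loss` are assumptions made for illustration, not taken from the report's code.

```python
import numpy as np
import pandas as pd


def rows_around_alertness_loss(df, window=5):
    """Keep rows within `window` observations of each 1 -> 0 change in IsAlert.

    Assumes (hypothetical names) that df has 'TrialID' and 'IsAlert' columns
    and that rows are already ordered by observation number within each trial.
    """
    alert_all = df["IsAlert"].to_numpy()
    keep = np.zeros(len(df), dtype=bool)
    # Work trial by trial so a window never spills into a neighbouring trial.
    for idx in df.groupby("TrialID").indices.values():
        alert = alert_all[idx]
        # Positions (within the trial) where the driver has just lost alertness.
        onsets = np.flatnonzero((alert[1:] == 0) & (alert[:-1] == 1)) + 1
        for pos in onsets:
            lo = max(pos - window, 0)
            hi = min(pos + window, len(idx) - 1)
            keep[idx[lo:hi + 1]] = True
    return df[keep]
```

Because overlapping windows simply set the same mask entries again, the result contains no duplicated rows, which matches the "without duplication" remark in the text.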

5.1.1 FEATURE ENGINEERING

The dataset includes 30 features, so keeping only those with the highest impact could not only save computational resources but also potentially improve the performance of the predictive model. Principal Component Analysis (PCA) is applied as the feature engineering technique in this case. For PCA, the dataset is first standardized, and then the fraction of the variance explained by each component is calculated to identify those with the highest impact on the result.

Figure 1: Scree plot of 30 features

We can see that the first 14 components contribute 80.95% of the total variance, so the number of features selected for modelling is reduced from 30 to 14.
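A minimal sketch of the standardize-then-PCA step described above, using scikit-learn. The helper name `components_for_variance` and the 0.80 threshold are illustrative assumptions; the report only states that 14 components reached 80.95% of the variance on its own data.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler


def components_for_variance(X, target=0.80):
    """Return how many principal components reach `target` cumulative variance."""
    # Standardize first so features on large scales do not dominate the PCA.
    X_std = StandardScaler().fit_transform(X)
    pca = PCA().fit(X_std)
    # Cumulative share of variance explained by the first k components.
    cumulative = np.cumsum(pca.explained_variance_ratio_)
    # First index whose cumulative share is at least `target`, as a count.
    n_components = int(np.searchsorted(cumulative, target) + 1)
    return n_components, cumulative
```

The returned `cumulative` array is what a scree plot such as Figure 1 summarizes; `PCA(n_components=n_components)` could then be refitted to produce the reduced feature matrix.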

5.1.2 MODELLING

As the subset is relatively small, stratified 10-fold cross-validation is applied as the evaluation method to make full use of the dataset. Naive Bayes, Logistic Regression, Random Forest, Support Vector Machine, and Neural Network models are built from this dataset. The Gaussian Naive Bayes model achieves a validation accuracy of 61.74%. Logistic Regression with the 'liblinear' optimization algorithm reaches a validation accuracy of 64.6%. Multiple models in the Support Vector Machine family, including Linear-SVC, Nu-SVC, and C-SVC, are applied as well; their validation accuracies vary from about 65% to 78%. A Neural Network with 5 neurons in the first hidden layer and 2 neurons in the second hidden layer reaches a validation accuracy of 65.01%. I also tried a different architecture: a neural network with 5 hidden layers of 14, 14, 12, 10, and 5 neurons, with the activation function changed from ReLU to logistic (sigmoid). Unfortunately, the performance of the new neural network does not change much.

| Model | Accuracy | Recall | Specificity | AUC score |
|---|---|---|---|---|
| Logistic Regression | 64.60% | 94.32% | 25.25% | 0.5978 |
| Naive Bayes | 61.74% | 89.47% | 25.02% | 0.5724 |
| Random Forest | 93.54% | 97% | 90.08% | 0.9354 |
| Linear-SVC | 64.55% | 94.85% | 24.44% | 0.5958 |
| Nu-SVC | 78.63% | 92.36% | 60.45% | 0.7640 |
| C-SVC | 67.55% | 98.13% | 27.06% | 0.626 |
| Neural Network 1 | 65.01% | 94.31% | 26.21% | 0.6026 |
| Neural Network 2 | 65.96% | 97.02% | 24.83% | 0.6092 |

Performance of the algorithms on the PCA dataset

The performance figures show that all the models do well at predicting 'alert' drivers. However, most models do not reach a decent result when it comes to identifying 'not alert' drivers, which is more important in this situation. The Random Forest model, on the other hand, outperforms all the other models, especially in specificity, which makes it part of our final ensemble model. The Nu-SVC model also reaches a specificity of more than 50%, which means it can be part of an ensemble model as well.

5.2 SECOND MODEL

Unfortunately, the first data pre-processing method did not yield a good predictive machine learning model (apart from the Random Forest with 50 trees). I decided to conduct an exploratory analysis of the dataset to guide the data pre-processing.

5.2.1 EXPLORATORY ANALYSIS AND FEATURE ENGINEERING ON THE DATASET

We calculate the average Is Alert value per trial and plot the result as a histogram.

Figure 2: Histogram of mean alertness per trial

It is found that most drivers stay either alert or not alert throughout the roughly 2-minute trial (about 1200 measurements). Thus, the characteristics of each driver, captured in the mean and standard deviation of each attribute, can be helpful for predictive analysis. On the other hand, the mean and standard deviation of a trial cannot be known at the beginning of that trial, which makes whole-trial means and standard deviations impractical in a real-life situation. Moreover, whole-trial means and standard deviations cannot capture changes in the driver's behaviour within a trial, which may vary constantly over time. For these reasons, we decided to use rolling means and standard deviations of each feature as new features instead, in order to make full use of the sequential nature of the data. The rolling window is set to 5: for every 5 consecutive instances (500 ms), the mean and standard deviation are calculated, then the window drops its first instance and adds a new one, and so on.
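The rolling features described above can be sketched with pandas as follows. This is an illustration under assumptions, not the report's code: the column name `TrialID`, the helper name `add_rolling_features`, and the `_rmean` / `_rstd` suffixes are all hypothetical. The window is computed within each trial so that it never mixes observations from two different trials.

```python
import pandas as pd


def add_rolling_features(df, feature_cols, window=5):
    """Add a rolling mean and rolling standard deviation for each feature.

    Assumes a 'TrialID' column (hypothetical name) and rows ordered by
    observation number within each trial.
    """
    out = df.copy()
    for col in feature_cols:
        by_trial = df.groupby("TrialID")[col]
        # The first (window - 1) rows of every trial have no full window
        # behind them, so their rolling values are NaN.
        out[f"{col}_rmean"] = by_trial.transform(
            lambda s: s.rolling(window).mean())
        out[f"{col}_rstd"] = by_trial.transform(
            lambda s: s.rolling(window).std())
    return out
```

Each rolling value uses only the current and previous observations, so, unlike whole-trial statistics, it is available at prediction time. The leading NaN rows of each trial would need to be dropped or filled before training.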

5.2.2 MODELLING

Similarly, we applied the algorithms mentioned above to this pre-processed dataset. As the dataset is big enough, we use an 80%-20% train-test split instead of cross-validation. First, we tried the Random Forest algorithm, which performed best on the previous feature-selected dataset, to see if there is any improvement compared to the other feature selection method. The Random Forest has 50 trees, with the same parameters as before. It reaches a decent performance on the validation dataset, with a validation accuracy of 98.91%. The algorithms of the Support Vector Machine family all fail to converge within the allotted time. A neural network with four hidden layers of 90, 70, 50, and 30 neurons respectively was also applied and reaches a validation accuracy of 80.76%. Naive Bayes and Logistic Regression have not improved much compared to the previous models. Generally, Neural Network and Random Forest perform better than the other models in this situation, and Random Forest performs far better than the Neural Network.

| Model | Accuracy | Recall | Specificity | AUC score |
|---|---|---|---|---|
| Logistic Regression | 61.21% | 75.21% | 41.96% | 0.5858 |
| Naive Bayes | 62.86% | 45.28% | 87.02% | 0.6615 |
| Random Forest | 98.91% | 98.58% | 97.55% | 0.9873 |
| Neural Network | 80.76% | 96.40% | 59.26% | 0.7783 |

Performance of the algorithms on the rolling mean and standard deviation dataset
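As an illustration of the 80%-20% split and the 50-tree Random Forest described in this section, here is a minimal scikit-learn sketch. The helper name `fit_and_score_forest`, the stratified split and the fixed random seed are assumptions; the report does not list its exact parameters.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split


def fit_and_score_forest(X, y, seed=0):
    """Train a 50-tree random forest on 80% of the rows, score on the rest."""
    X_train, X_val, y_train, y_val = train_test_split(
        X, y, test_size=0.20, stratify=y, random_state=seed)
    model = RandomForestClassifier(n_estimators=50, random_state=seed)
    model.fit(X_train, y_train)
    # With labels=[0, 1] the flattened matrix is ordered tn, fp, fn, tp.
    tn, fp, fn, tp = confusion_matrix(
        y_val, model.predict(X_val), labels=[0, 1]).ravel()
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "recall": tp / (tp + fn),        # share of alert rows found
        "specificity": tn / (tn + fp),   # share of not-alert rows found
    }
```

A row-level random split puts neighbouring observations from the same trial on both sides of the split. Splitting by Trial ID instead (for example with scikit-learn's `GroupShuffleSplit`) would keep each trial on one side, in line with the trial-wise partitioning noted in Section 3.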

5.3 COMPARISON OF MODELS

Comparing the performance of models trained on data pre-processed by the different methods, it is found that Logistic Regression, Support Vector Machine, and Naive Bayes are not suitable for this problem. While the Neural Network can reach a good performance on the dataset pre-processed by generating time-sequential features, it is not the model that fits the dataset best. The Random Forest algorithm generates the best predictive results, whether trained on data pre-processed by PCA or on data pre-processed by the other feature engineering techniques. Another interesting finding is that most models perform better at predicting 'alert' drivers than at predicting 'not alert' drivers, apart from the Naive Bayes model. Considering that the two labels are fairly evenly distributed (alert: 349785, not alert: 254544), it is hard to say that one label is over-represented, which makes the unbalanced prediction results hard to explain. As a result, I choose three models for local evaluation: two Random Forest models and a Neural Network model.

6 LOCAL EVALUATION

We use the data 'solution.csv' to evaluate the final models.

6.1 MODEL 1

The first model is the Random Forest trained on the data with features selected by PCA.

| | Predict = 0 | Predict = 1 |
|---|---|---|
| Actual = 0 | 22571 | 7343 |
| Actual = 1 | 63616 | 27310 |

AUC = 0.52744

Though this model only reaches an accuracy of 41.28% on the test dataset, it identifies many not alert drivers correctly. Overall, this model is not good enough by any of the evaluation measures.

6.2 MODEL 2

The second model is the Random Forest trained on the data with the added rolling mean and standard deviation features.

| | Predict = 0 | Predict = 1 |
|---|---|---|
| Actual = 0 | 16671 | 13243 |
| Actual = 1 | 8679 | 82247 |

AUC = 0.7309

This model reaches a good performance, with a classification accuracy of 81.86% on the test data. It performs well at predicting alert drivers, with recall = 90.45%, precision = 86.13% and F1-score = 88.24%. However, in this situation the model is expected to predict 'not alert' drivers precisely, so specificity (the percentage of Actual = 0 instances predicted correctly) should be the evaluation measure we focus on. The specificity of this model only reaches 55.73%, which leaves a lot of room for improvement. Overall, the AUC value of this model is 0.7309, not as good as the work I referenced, but still an improvement.

6.3 MODEL 3

The third model is the Neural Network trained on the data with the added rolling mean and standard deviation features. It has four hidden layers with 90, 70, 50, and 30 neurons. The intention was to use the first layer to take in all the original features and the following layers to process them and predict the output, so the first layer has as many neurons as there are features.

| | Predict = 0 | Predict = 1 |
|---|---|---|
| Actual = 0 | 12886 | 17028 |
| Actual = 1 | 361 | 90565 |

AUC = 0.71340

This model reaches a good performance as well, compared to the first model. It successfully predicts most of the alert drivers (recall = 99.6%, precision = 84.17%, F1-score = 91.24%). However, the model fails to predict many not alert drivers correctly (specificity = 43.08%), which is the more important evaluation measure for this predictive model.

6.4 COMPARISON OF MODEL 2 AND MODEL 3

Both models perform better at predicting alert drivers than at identifying not alert drivers, as their true positive rates are both higher than their true negative rates, even though the main goal of this predictive model is to predict not alert drivers. In the ROC plot, the curve that reaches a 100% true positive rate first is the neural network; the other curve is the random forest. It can also be seen that the random forest model performs better than the neural network model overall. However, the neural network predicts most alert drivers correctly, and when it predicts a driver as not alert, it is correct most of the time.

The random forest model identifies a significantly higher share of the not alert drivers than the neural network model, though when it predicts a driver as not alert, it has a 34.24% chance of being wrong (8679 of its 25350 'not alert' predictions). The Random Forest model would be the better choice in this situation, but the architecture of the neural network model could be optimized to reach a higher performance.
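The figures quoted for Model 2 in Section 6.2 can be re-derived from its confusion matrix. The short sketch below does this in plain Python, treating "alert" (label 1) as the positive class as the report does; the helper name `binary_metrics` is an assumption made for illustration.

```python
def binary_metrics(tn, fp, fn, tp):
    """Standard binary-classification metrics from confusion-matrix counts."""
    recall = tp / (tp + fn)          # true positive rate
    precision = tp / (tp + fp)
    return {
        "accuracy": (tp + tn) / (tn + fp + fn + tp),
        "recall": recall,
        "precision": precision,
        "f1": 2 * precision * recall / (precision + recall),
        "specificity": tn / (tn + fp),   # true negative rate, TN / (TN + FP)
    }


# Model 2 (Random Forest with rolling features), counts from Section 6.2:
# Actual = 0 row gives tn and fp, Actual = 1 row gives fn and tp.
model2 = binary_metrics(tn=16671, fp=13243, fn=8679, tp=82247)
for name, value in model2.items():
    print(f"{name}: {100 * value:.2f}%")
```

Calling the same function with the Model 3 counts (tn=12886, fp=17028, fn=361, tp=90565) gives a like-for-like comparison of the two models on specificity, the measure the report treats as most important.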

7 RESULT REFLECTION AND COMPARISON

7.1 RESULT CONCLUSION

This project aimed to build a supervised learning model to predict not alert drivers. The best-performing model is a random forest with 50 trees. It correctly predicts 16671 of the 29914 not alert drivers in the test data, and reaches a classification accuracy of 81.86% and an AUC value of 0.7309.

7.2 RESULT COMPARISON

Compared with the results on the leaderboard, many participants' models achieve higher performance. The best model reaches an AUC value of 0.86115, though it uses the means and standard deviations of each trial as new features. Almost 20 participants' models reach AUC scores over 0.8, which is significantly higher than mine. This indicates that there is still a lot of room to improve my model.

7.3 DISCUSSION AND FUTURE WORK

The existing predictive model for this problem is far from perfect. There are a few directions that could improve its performance.

7.3.1 DATA PREPROCESSING METHOD

Rolling means and standard deviations proved to be a good way to preserve the sequential nature of the data. However, a rolling window of 0.5s could be too short to capture a driver's driving pattern. Expanding the rolling window to produce rolling means and standard deviations over a longer period could be a useful way to introduce drivers' long-term driving patterns. Better performance can likely be achieved by a model that learns not only from drivers' short-term behaviour (0.5s) but from their long-term behaviour as well.
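A minimal sketch of the idea, using only the standard library: compute trailing rolling means and standard deviations for a short window and for a longer one, then use both as features. The function name `rolling_features`, the toy signal and the window sizes are illustrative assumptions, not the project's actual code. A 5-sample window corresponds to 0.5s only if samples are 100 ms apart, which is also an assumption here.

```python
from statistics import mean, pstdev


def rolling_features(values, window):
    """Return (means, stds) over a trailing window of `window` samples.

    Positions that do not yet have a full window are filled with None.
    Uses the population standard deviation; a sample standard deviation
    would be an equally reasonable choice.
    """
    means, stds = [], []
    for i in range(len(values)):
        if i + 1 < window:
            means.append(None)
            stds.append(None)
            continue
        chunk = values[i + 1 - window : i + 1]
        means.append(mean(chunk))
        stds.append(pstdev(chunk))
    return means, stds


# Toy sensor readings for a single trial (hypothetical values).
signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

short_means, short_stds = rolling_features(signal, window=3)  # short-term pattern
long_means, long_stds = rolling_features(signal, window=5)    # longer-term pattern
print(short_means)
print(long_means)
```

In practice the window should be applied separately within each trial, so that the statistics of one driver's recording do not leak into the next.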

7.3.2 MODEL OPTIMIZATION AND SELECTION

Though the neural network model fails to outperform the random forest model, this does not show that random forest is always the best option for this problem. The neural network still shows great potential to produce good results. Further work could optimize the architecture of the neural network.

REFERENCE LINK

https://www.kaggle.com/c/stayalert/data

PROJECT LINK

https://github.com/bindhu520/Safe-driving-Challenge-ML-PROECT-


Friday 2021-02-05 10:31:17 by Vishal Barvaliya

Create NumberGuessingGameOf27Numbers.py

Hello! First of all, this is not a very simple number guessing game; it's a real magic game. I promise you will be amazed. I created this game from my real-life experiences.


Friday 2021-02-05 10:59:17 by ILoveJesus-CareyHsie

Update stickers.yml

5f8a49138d91a73c30de0dbeedbb9867:
  key: 8c247c35a053b6c488707065c4e49adcfd3c0fcb3dea00b7cd1c1c588f2c6254
  source: Carey Hsie – www.careyhk.com
  tags:
    - JESUS
    - God
    - Gospel
    - Christian
    - I Love Jesus
    - Carey Hsie
    - Bless
    - Chinese New Year
  original: true


Friday 2021-02-05 12:52:36 by SaemonMoki

  • made seek and stay sheltered checks a little better, but some work still needs doing
  • added spawn weight configs per animal
  • added vault gene for chickens
  • added health genes for llamas
  • added a new short nose gene
  • animals are now spawned in at around a year old, babies happen 5% of the time and occasionally there are wild pregnancies
  • added initialize health, animals can now have dynamic base health
  • fixed gold leather bridles missing their leather overlay
  • re-ordered some methods so more initialization stuff is together
  • fixed my error with leather drops
  • added stay and seek shelter to cows
  • corrected coat length to always be 0 for newborns. Coat length should be limited correctly by age now
  • when omnigender is true animals now display a pregnancy icon instead of male/female symbol
  • added a few variations of not gendered icon for omnigender
  • when the pregnancy icon is moused over it will randomly display either the male or female version of text
  • raw dark chicken meat now shows the correct image
  • kunekune now spawn smaller with shorter noses
  • some dwarf pigs now have shorter noses
  • removed delay for rabbit coat length
  • corrected lower horn placement for poly horn sheep
  • added a renderer for llama spit, still doesn't work though
  • trader llamas are now guaranteed to have maximum strength
  • shovels can turn sparsegrass to grass path now
  • you can now set autosomal genes one after another


Friday 2021-02-05 12:53:00 by Terra

god motherfuckin damnn

Signed-off-by: Terra terra@mcterra.id.au


Friday 2021-02-05 12:58:26 by Eduard Tolosa

Revert "Fuck you here too"

This reverts commit 67c66dc5c8101bc49c999f4fe825f1f02c4a0b20.


Friday 2021-02-05 14:19:48 by Fabian Greffrath

add support for colored blood and gibs (#182)

  • add support for colored blood and gibs

Due to popular demand. 😉

  • use P_SetTarget() instead of editing the target value directly

  • introduce a new mobjflag instead of misusing the mobj->target pointer

  • set flags more consistently

If MF_COLOREDBLOOD is set and none of (MF_TRANSLATION1|MF_TRANSLATION2) this means green blood (Baron of Hell / Hell Knight) and if MF_TRANSLATION1 is additionally set this means blue blood (Cacodaemon).

This allows for up to 4 different blood colors once MF_TRANSLATION2 is also taken into account. Let's see what we come up with...

  • add comments
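The flag combinations described in this commit message can be sketched as a small lookup. This is an illustration only, not the project's code: the numeric bit values below are hypothetical placeholders, and the `blood_color` helper and the "default" and "unassigned" labels are made up for the example.

```python
# Hypothetical bit values, chosen only so the three flags do not overlap.
MF_TRANSLATION1 = 0x04000000
MF_TRANSLATION2 = 0x08000000
MF_COLOREDBLOOD = 0x10000000


def blood_color(flags):
    """Map a mobj's flag word to a blood color label (illustrative)."""
    if not flags & MF_COLOREDBLOOD:
        return "default"  # flag not set: regular blood
    translation = flags & (MF_TRANSLATION1 | MF_TRANSLATION2)
    if translation == 0:
        return "green"    # Baron of Hell / Hell Knight
    if translation == MF_TRANSLATION1:
        return "blue"     # Cacodaemon
    # The two remaining combinations involve MF_TRANSLATION2 and are
    # left open by the commit message.
    return "unassigned"


print(blood_color(MF_COLOREDBLOOD))
print(blood_color(MF_COLOREDBLOOD | MF_TRANSLATION1))
```

With two translation bits there are four combinations in total, which is where the "up to 4 different blood colors" comes from.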

Friday 2021-02-05 17:04:25 by Luca Wehrstedt

Update on "Silence harmless error logs of TensorPipe agent during shutdown"

The TensorPipe pipes do not really support a "graceful" shutdown: if one side is expecting data (i.e., it has scheduled a readDescriptor call) and the other side closes, the former will receive an error. Such an error will not even be predictable, as it depends on the backend: some may detect this and report it "well" (through an EOFError), others may not be able to tell this apart from a failure and report it as such.

This meant that during shutdown some of these errors would fire and thus the agent would log them as warnings. We did add a note that these were expected under some conditions, so that users wouldn't be alarmed, but it was still a far-from-ideal experience.

In principle we could build a "protocol" on top of these pipes to "agree" on a graceful shutdown, and this was the plan to solve this. However, it was rather complicated to implement.

Here I am proposing a quicker, but perhaps hackier, solution, which re-uses the already existing graceful shutdown "protocol" of the agent (i.e., the join method) to put the agent in a special state in which it will silence all errors due to a remote shutting down.

Such a check cannot happen in the shutdown method, because that's also used in case of ungraceful shutdown (in which case I believe we'd still want to display errors). Since it needs to make sure that all participants have transitioned to this new state before any of them can continue (as otherwise one of them may close its pipes before another one has realized that this is now expected), we need to perform a barrier. Hence the ideal place for it is the join method, where we're already doing a lot of gang-wide synchronization. Since the join method isn't only called during shutdown, we need to make sure we only switch the agent to this state when it's the last call to join, and we do so by adding a new optional argument to it (which will be ignored by all agents except the TensorPipe one).
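The ordering argument (switch state, then barrier, then close) can be illustrated with a toy simulation. This is a sketch under stated assumptions, not the TensorPipe agent: `ToyAgent`, `on_pipe_error`, `expect_remote_shutdown` and the `shutdown` keyword are hypothetical names, and Python threads stand in for separate processes.

```python
import threading


class ToyAgent:
    """Toy stand-in for an RPC agent; not the TensorPipe agent's real API."""

    def __init__(self, name, barrier):
        self.name = name
        self.barrier = barrier
        self.expect_remote_shutdown = False  # the "special state"
        self.warnings = []
        self.peers = []

    def on_pipe_error(self, message):
        # Errors caused by a remote closing are dropped once shutdown is expected.
        if not self.expect_remote_shutdown:
            self.warnings.append(f"{self.name}: {message}")

    def join(self, shutdown=False):
        if shutdown:
            self.expect_remote_shutdown = True
        # Barrier: nobody proceeds until every agent has reached this point,
        # so every agent is already in the silent state before any pipe closes.
        self.barrier.wait()
        if shutdown:
            # Closing our pipes surfaces as an error on every peer.
            for peer in self.peers:
                peer.on_pipe_error(f"pipe to {self.name} closed")


def run_final_join(num_agents=3):
    barrier = threading.Barrier(num_agents)
    agents = [ToyAgent(f"worker{i}", barrier) for i in range(num_agents)]
    for agent in agents:
        agent.peers = [a for a in agents if a is not agent]
    threads = [
        threading.Thread(target=a.join, kwargs={"shutdown": True}) for a in agents
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return agents


agents = run_final_join()
print(sum(len(a.warnings) for a in agents))  # prints 0
```

Without the barrier, a fast agent could close its pipes before a slower one had flipped its flag, and the slower one would still log the error.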

I realize this isn't the prettiest solution, and since it changes the agent's API it's worth discussing it carefully. Let me know what you think!

Differential Revision: D26276137

NOTE FOR REVIEWERS: This PR has internal Facebook specific changes or comments, please review them on Phabricator!

[ghstack-poisoned]


Friday 2021-02-05 17:08:57 by DillerOFire

lavender: Build StitchImage

Very gore, pls no I hate anus Fuck ur ass wen

Signed-off-by: DillerOFire niktofe1@gmail.com


Friday 2021-02-05 17:36:13 by rebot333

Action rewrite

500 lines to <150. 2 years progress makes a difference. This rewrite feels super basic, and that's cause it is. It's super easy (now) to do this compared to 2 years ago. I'm still writing 500 line things, but they're not boring-to-edit super repetitive things.

2 years goes from >500 lines for a stupid basic sentence generator to >500 lines for a feature-rich Game of Life web app with saving, loading, downloading, uploading, and more

Still, maybe I shouldn't have written it that badly 2 years ago? It was literally just a bunch of if/else statements and the tiniest bit of string manipulation. I remember thinking there was probably a way to do what this commit did 2 years ago, but I didn't know what that way would be. Ahh, the time before I knew what objects or arrays were. Not a very good time actually. Now is much better.


Friday 2021-02-05 18:00:32 by Adam W. Willis

oneplus: Silence gratuitous OEM logging

For the love of god, I do not require my every input logged to dmesg.

Signed-off-by: Adam W. Willis return.of.octobot@gmail.com

Conflicts:

drivers/oneplus/power/supply/qcom/bq27541_fuelgauger.c

drivers/oneplus/power/supply/wlchg/bq2597x_charger.c


Friday 2021-02-05 20:45:31 by Jayden Lefebvre

Typescriptified

it fucking works holy shit i'm amazed


Friday 2021-02-05 21:42:20 by Mick Vermeulen

fix issue where laser rampage items wouldn't be removed (boy don't you just love boolean logic)

