Learn.md/PROPOSAL.md (+29 -30)
@@ -2,27 +2,27 @@

 ## Finding Insights from Stack Overflow Developer Survey

-Stack Overflow is a professional community for developers, conducting an annual survey. The collected data from 2011 onwards has been available for opensource on the web, with the latest dataset released in 2020. Analyzing this dataset professionally using modern tools would enable us to answer real-world questions effectively. The dataset includes responses to 275 questions.
+Stack Overflow is a professional community for developers that conducts an annual survey. The data collected from 2011 onwards is available as open-source and the latest dataset was released in 2020. Analyzing this dataset professionally using modern tools enables us to answer real-world questions effectively. The dataset includes responses to 275 questions.

-### Project Goal:
+### Project Goal

-1. **Perform Analysis on 3 years of Stack Overflow Dataset:** Extract insights from the data.
-2. **Data Analysis Goals:** Answer the following questions:
+1. **Perform Analysis on 3 Years of Stack Overflow Dataset:** Extract valuable insights from the data.
+2. **Data Analysis Goals:** Address the following questions:
 - What is the impact of higher education on the salary of surveyed developers?
 - How do education, experience, and responsibilities affect gender inequalities?
 - How does ethnicity impact participation rates?
 - Is there a difference in income between men and women?
 - How does the previous year's interest in a language affect its popularity in the current year?
 3. **Data Visualization Goals:**
-- Identify the most commonly used language.
-- Analyze the distribution of surveyors based on their developer roles.
+- Identify the most commonly used programming languages.
+- Analyze the distribution of survey respondents based on their developer roles.
 - Explore factors affecting job satisfaction.
-- Predict the growth of languages for upcoming years based on survey answers.
+- Predict the growth of programming languages for upcoming years based on survey answers.
 - Provide insights for IT environment, hiring employees, job seekers, and building a solid résumé.

 ### Data Source and Background

-The dataset is sourced from the annual Stack Overflow developer survey, covering responses from developers in 180 countries. The data range from 2011 to 2020, with the focus being on the last 3 years. Respondents primarily come from the US, India, and EMEA regions, with a background in developer/coding experience. The dataset includes survey data gathered from 180 countries, with responses ranging from "Not at all important" to "Very important" and "Not at all satisfied" to "Very satisfied."
+The dataset is sourced from the annual Stack Overflow developer survey, covering responses from developers in 180 countries. The data spans from 2011 to 2020, with the focus being on the last 3 years. Respondents primarily come from the US, India, and EMEA regions, with a background in developer/coding experience. The dataset includes survey data gathered from 180 countries, with responses ranging from "Not at all important" to "Very important" and "Not at all satisfied" to "Very satisfied."

 ### Data Format

@@ -37,40 +37,39 @@ The data is in CSV format, consisting of 252,199 observations and 62 variables.

 #### Techniques Expected to Use in the Project

-- ML Algorithms: Utilize algorithms like Random Forest, KNN, AUC for classification problems, logistic regression, and linear regression.
-- Data Visualization: Employ data visualization techniques for better understanding and presentation of insights.
-- Parameter Analysis: Analyze parameters to fine-tune models and improve accuracy.
+- **ML Algorithms:** Utilize algorithms like Random Forest, KNN, AUC for classification problems, logistic regression, and linear regression.
+- **Data Visualization:** Employ data visualization techniques for better understanding and presentation of insights.
+- **Parameter Analysis:** Analyze parameters to fine-tune models and improve accuracy.
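A note on the techniques list above: AUC (area under the ROC curve) is normally an evaluation metric for classifiers such as Random Forest, KNN, or logistic regression, not a learning algorithm itself. As a rough sketch of what it measures, the function below computes AUC by pairwise comparison of scores. The function name `roc_auc` and the labels and scores are illustrative assumptions, not output from any model in this project.

```python
def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up labels (1 = positive class) and classifier scores.
labels = [1, 0, 1, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.8]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

The pairwise loop is quadratic in the number of examples, so for a dataset of this size a library routine based on sorting would be the practical choice; the loop is kept here only because it makes the definition visible.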

 #### Project Plan

-**Week 8:** Project Base Setup
+**Week 8: Project Base Setup**
 - Source control setup on [GitHub](https://github.com/Recode-Hive/Stackoverflow-Analysis)
-- Project Management using tools like MS Project
-- Complete Data Wrangling & Basic Analysis
+- Project management using tools like MS Project
+- Complete data wrangling and basic analysis

-**Week 10:** Baseline Model Building
+**Week 10: Baseline Model Building**
 - Implement algorithms and build baseline models

-**Week 11:** Model Evaluation
+**Week 11: Model Evaluation**
 - Run tests and evaluate the performance of models

-**Week 12:** Finalization
-- Prepare video presentation summarizing the analysis and insights
+**Week 12: Finalization**
+- Prepare a video presentation summarizing the analysis and insights