Analyse Emailing Sending and Receiving Behavior and Sentiment towards… #594

Merged
merged 4 commits into from
Jun 1, 2023

Conversation

priyankaiitg
Collaborator

… Male and Female Genders.

@codecov-commenter

codecov-commenter commented May 5, 2023

Codecov Report

Patch and project coverage have no change.

Comparison is base (7dc30d2) 73.20% compared to head (e87b79a) 73.20%.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #594   +/-   ##
=======================================
  Coverage   73.20%   73.20%           
=======================================
  Files          31       31           
  Lines        3702     3702           
=======================================
  Hits         2710     2710           
  Misses        992      992           
Flag Coverage Δ
unittests 73.20% <ø> (ø)


@sbenthall
Collaborator

This is great progress!

A few comments:

  • Some qualitative text framing the research problem being addressed, and the interpretation of the results, would be helpful. Maybe break up the big blocks of print statements into sections, and explain them? As is, it's not clear how the counts connect to the research questions that we've discussed.

  • Similarly, printed numbers are not as nice as plots, and Jupyter notebooks make plotting very easy! See the other notebooks in the examples/ directory for how we've done this elsewhere.

  • Once the code for the analysis of a single mailing list has been worked out, it would be good to encapsulate it in a function. That way it can be applied to many mailing lists to compare them.

  • I believe for political correctness reasons, it is best to change the terms "male/female" to "men/women". I think in one of the other notebooks we did this change programmatically. It would also be good to include some disclaimer text like the following:

"BigBang uses a library that guesses the gender of a person based on their first name and census records. We understand that this method is prone to error. Only names with very high correlation with a particular gender are so identified. Because of these and other errors, we consider gender in statistical aggregates only. Please do not take these results as attributing gender to any particular individual on the mailing list."
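
A minimal sketch of two of the suggestions above: relabeling "male/female" to "men/women" programmatically, and wrapping the per-list analysis in a function so it can be applied to several mailing lists. The column names `gender` and `sentiment`, the `GENDER_LABELS` mapping, and the function name `summarize_by_gender` are hypothetical; they are not taken from the notebook or from BigBang's API.

```python
import pandas as pd

# Hypothetical mapping from the labels a name-based gender guesser might emit
# to the terms suggested in this review.
GENDER_LABELS = {"male": "men", "female": "women"}


def summarize_by_gender(messages: pd.DataFrame) -> pd.DataFrame:
    """Return message count and mean sentiment per gender label for one list.

    Assumes hypothetical columns `gender` and `sentiment`.
    """
    # Labels not present in the mapping (e.g. "unknown") are left unchanged.
    relabeled = messages.assign(gender=messages["gender"].replace(GENDER_LABELS))
    return relabeled.groupby("gender")["sentiment"].agg(["count", "mean"])


# Toy data standing in for two mailing lists.
list_a = pd.DataFrame(
    {"gender": ["male", "female", "female"], "sentiment": [0.1, 0.4, 0.2]}
)
list_b = pd.DataFrame(
    {"gender": ["male", "male", "unknown"], "sentiment": [0.3, 0.5, 0.0]}
)

# Apply the same function to each list and stack the results for comparison.
comparison = pd.concat(
    {"list_a": summarize_by_gender(list_a), "list_b": summarize_by_gender(list_b)},
    names=["mailing_list"],
)
print(comparison)
```

Because the relabeling happens inside the function, every mailing list passed through it uses the same terms, and the stacked result can be plotted directly rather than printed.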

priyankaiitg and others added 2 commits May 29, 2023 20:36
… Bloxplots and Grouped Barcharts to analyze the Sentiment across Genders estimated from First Name on Mailing Lists
@sbenthall
Collaborator

Huge progress! In general, I love the plots. This will be by far one of our best notebooks.

One technical issue:

In cell [9], I'm getting this error: https://gist.github.com/sbenthall/15af4d7edb5774303d71f56b42dfcd04
Looks connected to this: https://stackoverflow.com/questions/76158147/pandas-groupby-valueerror-cannot-subset-columns-with-a-tuple-with-more-than-o
Which suggests that you may have been using an older version of Pandas.
Can you update Pandas and figure out how to correct this?
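
For reference, a sketch of the usual fix for this class of error, assuming the failing cell selects several columns after a `groupby` using bare comma-separated names (this has not been confirmed against the notebook). The DataFrame and its column names are made up for illustration.

```python
import pandas as pd

# Toy data; the column names are hypothetical.
df = pd.DataFrame(
    {
        "gender": ["men", "women", "men", "women"],
        "sent": [3, 5, 2, 4],
        "received": [1, 2, 0, 3],
    }
)

# Older pandas accepted this tuple-style selection (with a deprecation warning);
# pandas 2.x raises "ValueError: Cannot subset columns with a tuple with more
# than one element. Use a list instead."
# totals = df.groupby("gender")["sent", "received"].sum()

# Passing a list of column names works in both old and new versions.
totals = df.groupby("gender")[["sent", "received"]].sum()
print(totals)
```

The only change is the extra pair of brackets, which turns the column selection from a tuple into a list.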

Two nitpicks on presentation -- not necessary to fix...

In cell [10] (the first bar plot), I am a little confused by the plot since only one column has the darker blue bar.
Is that because the number of unique senders is negligible in those categories?
I assume this is a stacked bar plot, but it is hard to tell.

Could you add text explaining what you mean by "Response or interaction ratio"?

@sbenthall
Collaborator

Great work. Thanks @priyankaiitg !

@sbenthall sbenthall merged commit 62be195 into datactive:main Jun 1, 2023
@sbenthall sbenthall added this to the 0.5 milestone Jun 20, 2023