Visualizing data with R and ggplot2 #354

Closed
ZoeLeBlanc opened this issue Mar 21, 2021 · 26 comments

@ZoeLeBlanc
Member

ZoeLeBlanc commented Mar 21, 2021

The Programming Historian has received the following tutorial on Visualizing data with R and ggplot2 by @nabsiddiqui & @rogorido. This lesson is now under review and can be read at:

http://programminghistorian.github.io/ph-submissions/en/drafts/originals/visualizing-data-with-r-and-ggplot2

Please feel free to use the line numbers provided on the preview if that helps with anchoring your comments, although you can structure your review as you see fit.

I will act as editor for the review process. My role is to solicit two reviews from the community and to manage the discussions, which should be held here on this forum. I have already read through the lesson and provided feedback, to which the author has responded.

Members of the wider community are also invited to offer constructive feedback, which they should post to this message thread, but they are asked to first read our Reviewer Guidelines (http://programminghistorian.org/reviewer-guidelines) and to adhere to our anti-harassment policy (below). We ask that all reviews stop after the second formal review has been submitted so that the author can focus on any revisions. I will make an announcement on this thread when that has occurred.

I will endeavor to keep the conversation open here on Github. If anyone feels the need to discuss anything privately, you are welcome to email me.

Our dedicated Ombudsperson is (Ian Milligan - http://programminghistorian.org/en/project-team). Please feel free to contact him at any time if you have concerns that you would like addressed by an impartial observer. Contacting the ombudsperson will have no impact on the outcome of any peer review.

Anti-Harassment Policy

This is a statement of the Programming Historian's principles and sets expectations for the tone and style of all correspondence between reviewers, authors, editors, and contributors to our public forums.

The Programming Historian is dedicated to providing an open scholarly environment that offers community participants the freedom to thoroughly scrutinize ideas, to ask questions, to make suggestions, or to request clarification, but also provides a harassment-free space for all contributors to the project, regardless of gender, gender identity and expression, sexual orientation, disability, physical appearance, body size, race, age or religion, or technical experience. We do not tolerate harassment or ad hominem attacks of community participants in any form. Participants violating these rules may be expelled from the community at the discretion of the editorial board. Thank you for helping us to create a safe space.


[Permission to Publish]

The editor must also ensure that the author or translator post the following statement to the Submission ticket.

I the author|translator hereby grant a non-exclusive license to ProgHist Ltd to allow The Programming Historian English|en français|en español to publish the tutorial in this ticket (including abstract, tables, figures, data, and supplemental material) under a CC-BY license.

@ZoeLeBlanc
Member Author

Very belated but excited to get this ticket set up and the lesson formatted correctly for Visualizing Data with R and ggplot2 🎉

@nabsiddiqui would you mind giving me Igor Sosa Mayor's GitHub handle? I couldn't find it myself. Thanks!

Also, since this was submitted some time ago, could you both take a quick look over the lesson and let me know if there's anything you want to update before I do my initial edit pass? I did notice already that there's a typo at the end of paragraph 18 where a word is cut in half, so things like that might have been missed on the initial submission.

Once you let me know it's good to go, I'll do my first pass and send you my edits. Then once those are done, I'll solicit reviewers, and hopefully we can get this lesson peer reviewed and published by the summer! Let me know if you have any questions and thank you for your patience and submission!!

@rivaquiroga
Member

@ZoeLeBlanc, @nabsiddiqui, I was taking a look at this lesson because it's about a topic I'm interested in, and I realized that ¶90 (Additional Resource) points to an old link for the ggplot extensions website. I would suggest changing it as soon as possible to https://exts.ggplot2.tidyverse.org/gallery/, as the old one is NSFW.

@nabsiddiqui
Collaborator

nabsiddiqui commented Mar 21, 2021

@rivaquiroga Well that was unexpected...I have just removed the additional resources. They aren't really necessary for the tutorial and the other links that are posted should be fine.

@ZoeLeBlanc I'll reread it tomorrow and let you know. Igor's GitHub handle is @rogorido. Also, the link up top is not working.

@nabsiddiqui
Collaborator

@ZoeLeBlanc Went ahead and just read it and fixed up some small errors. Should be good to go. I also removed all links to the offending page.

@ZoeLeBlanc
Member Author

Thanks @nabsiddiqui! Also thanks for catching the link above and helping me find @rogorido's handle! I'll try to have my comments posted here no later than a week from today (so by Monday March 29). Let me know if anything comes up before then or if you have any questions. Looking forward to digging into this lesson 😊

@ZoeLeBlanc
Member Author

Alrighty! Mea culpa for the big delay, but here are my initial editorial notes below @nabsiddiqui & @rogorido. I realize it's always tough to get feedback, so please feel free to take as much time as you need to decide what you want to incorporate and what you don't from my feedback. All of it is in essence optional, but if you aren't going to address an issue, I would appreciate it if you offered some rationale for your choice.

I'm also happy to chat about the feedback and help you brainstorm how to incorporate it into the lesson. The big thing I'll need to know is when you plan to get through this feedback (again, take as much time as you need), so that I can start soliciting reviewers.

P1

  • "Gathering and analyzing data are important tasks that historians now face," Feel like historians have always done this to varying degrees and not sure this really does much for setting up the value of this lesson. Might want to reframe as helping historians leverage new ways of analyzing and visualizing their data. Also I would replace "see inside our data" with "beautiful plots for exploratory data analysis, and eventually even create publication-ready figures."

P6

  • "Creating graphics is a complicated issue." change to "Creating graphics is complicated."

P7

  • in 1 remove half parentheses from "which is available for free online)"
  • in 2 change "the package" to "the ggplot2 package" for specificity

P8

  • I feel like the use of "material" here is a bit confusing "The material we will analyze for our visualization." Maybe instead use dataset or information.
  • "The mapping of visual properties to geoms" Not sure most readers will know what 'geoms' are by this point in the tutorial.

General comment. I'm not sure defining all these terms here is the best way to introduce these concepts. In particular, I feel like you jumped a bit ahead without discussing why someone would even want to create a grammar of graphics. I wonder if a simpler explanation about the concept of creating a 'statistical graph' that mentions these terms might help readers understand what the terms mean in context.

P10

  • Change "As historians, we can ask new questions about economic, religious, and cultural relationships by examining this data. For instance, are German cities related to French or Polish cities, maybe as an attempt to overcome deep historical tensions?" to "As historians, we might ask are German cities related to French or Polish cities, maybe as an attempt to overcome deep historical tensions?"

P12

  • Delete this sentence since it's redundant and worded a bit confusingly: "readr can read most of the common formats you will encounter, but we will be reading in a csv file."

P15

  • Reverse the examples of as_tibble and as_dataframe, since we currently have a tibble and would change it to a dataframe and then back

P16

  • Replace "As you can see, we have a tibble called “eudata” with six countries. There are 13081 rows with 12 variables." with "You can inspect the data for all six of our countries in our eudata tibble, which contains 13081 rows with 12 variables."

P17

  • Put variables and column names in between ticks

P18

  • Replace "An interesting question we can ask our data is whether european cities have more profound relationships with cities in the same country, in other EU countries, or other countries in the world." with "We would assume that cities in the same country would have a closer relationship than those in other countries (whether in Europe or globally). We can test our assumption through analyzing and comparing whether European cities have more profound relationships with cities in the same country, in other EU countries, or other countries in the world."

P19

  • Need more descriptive alt-text for graph

P20

  • Replace "The first parameter of the ggplot() function is the tibble or data frame containing the information we are exploring." with "The first parameter of the ggplot() function is our tibble or dataframe variable."
  • Confusing wording in this sentence "Aesthetics, as you may recall from earlier, defines the variables in our data and how to map them to visual properties." What do you mean here by 'defines the variables in our data'? Needs to be clarified and I would recommend emphasizing the mapping of variables to visual properties.
  • Generally I feel like there's a bit of redundancy/overlap between P20 and P22 in your discussion of aesthetics, so you might consider condensing them or moving the order around.

P25

  • Need more descriptive alt-text for graph

P26

  • Feel like this is a bit confusing "In other words, it aggregates the data for you." You might instead expand this sentence "This means that it will count how many times a value appears." to explain how counting diverges from sum or other ways of aggregating data.

Generally I'm very confused about what typecountry actually contains and how it's useful for exploring relationships between cities. I think spending some time earlier on outlining the hypothesis more, and what you would expect to find, would help clarify the tweaks to the graph later on. Also you might explain why a bar chart is the best choice for exploring this question.
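
To make the counting-versus-summing distinction from P26 concrete, here is a minimal sketch in R. It is only an illustration under stated assumptions: the toy data frame and its category values are invented and are not the lesson's eudata; only the column names typecountry and dist are borrowed from the draft. The plotting part runs only if ggplot2 is installed.

```r
# Toy data invented for illustration; NOT the lesson's eudata.
toy <- data.frame(
  typecountry = c("same", "same", "EU", "EU", "EU", "non-EU"),
  dist        = c(120, 340, 800, 950, 1100, 6200)
)

# Counting: how many rows fall into each category.
counts <- table(toy$typecountry)

# Summing: the total of a numeric column within each category.
sums <- tapply(toy$dist, toy$typecountry, sum)

print(counts)
print(sums)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)

  # geom_bar() counts the rows for each x value by default,
  # so bar heights correspond to `counts` above.
  p_count <- ggplot(toy, aes(x = typecountry)) + geom_bar()

  # geom_col() draws the y values it is given; rows sharing an x value
  # are stacked, so the total bar heights correspond to `sums` above.
  p_sum <- ggplot(toy, aes(x = typecountry, y = dist)) + geom_col()
}
```

Something along these lines might help readers see that the bar chart in the lesson shows how often each value of typecountry appears, rather than a total of some other variable.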

P28

  • Need more descriptive alt-text for graph

P31

  • I think it would be helpful to include some sort of diagram to describe the layers in ggplot (something like this, maybe) so that readers understand that geom means geometries:
    https://lh3.googleusercontent.com/proxy/GUvXk1y9QwMnvsxlQ_rxA0bB2hSO65XmvHSWYXqOaHmgqoIXQcWEJMWpvzD5oHVgmxcl1OBgOg7ZaPiUCtmmvOXCbkSONoqGAxo

P33

  • I appreciate the attention to skewness, but I'm confused by the mention of log10(dist) since it's not in the following code snippet.

P34

  • Need more descriptive alt-text for graph

P37

  • Need more descriptive alt-text for graph

P40

  • Need more descriptive alt-text for graph

P41

  • I'm very confused by this sentence "It is up to you as a historian to explore explanations for this fact." You as the author are making a case for why these methods and data help us understand something about this question or problem, so why wouldn't you include something along the lines of this supports what historians have hypothesized around sister cities? Overall the goal is to help readers learn how to interpret results with these graphs, so if we don't help them do that in the lesson with the case study you've established then we leave them confused on the value of the lesson or these methods.

P43

  • Need more descriptive alt-text for graph

P44

  • Might be helpful to define overplotting here

P49

  • Why do we want to add axes? Instead of telling people that we do, explain the rationale for these design choices. (Same could be true for size and color).

P50

  • Need more descriptive alt-text for graph

P54

  • Need more descriptive alt-text for graph

P62

  • Need more descriptive alt-text for graph

P63

  • Why is it better to use pre-defined color scales? Also change "colors scalas" to "color scales"

P64

  • Need more descriptive alt-text for graph

P65

  • use of qualitative as opposed to continuous here doesn't make a lot of sense. I would say categorical instead of qualitative

P66

  • Need more descriptive alt-text for graph

P67

  • change "scala" to scale
  • change "gradiented" to "gradient"

P68

  • Need more descriptive alt-text for graph

P71

  • It would be helpful to earlier on state this as a hypothesis and then build up the graphs to answering this question

P73

  • Need more descriptive alt-text for graph

P75

  • Replace "Previously, we created a plot which compared cities and their relationships with cities in EU countries, non-EU countries using different colors for each country. ggplot2 also provides an effective way to create plots that include information splitted by categories (space, time, and so). We can represent the same data, but in graphs we separate per country." with "Previously, we created a plot which compared cities and their relationships with cities in EU countries, non-EU countries using different colors for each country. However, what if we want to explore these relationships country by country? ggplot2 also provides an effective way to create plots that are separated by categories (space, time, and in our case country)."

P77

  • Need more descriptive alt-text for graph

P81

  • Need more descriptive alt-text for graph

P85

  • Why would one want to make a ridgeline plot? What does a ridgeline plot visualize that a different type of graph doesn't?

P86

  • Need more descriptive alt-text for graph

It would be helpful to conclude the lesson with some assessment of how this approach helped readers understand something about sister cities. Otherwise the lesson ends very abruptly.

I realize this is a lot of points, but most of them are small tweaks. Again, feel free to reach out if you have additional questions or if anything is unclear, and thank you again for your patience. Excited to be moving towards getting this out to reviewers and then published 🎉

@ZoeLeBlanc
Member Author

Greetings @nabsiddiqui & @rogorido! Just wanted to ping you to see if you had any questions for me about my feedback, or a sense of your timeline for these revisions? NO pressure! Seriously, it took me forever to get to this, so I don't want to rush you at all. Just wanted to make sure that you knew I was still committed to helping get your lesson published, and that I'm happy to help clarify any of the above comments 😊

@nabsiddiqui
Collaborator

Hey @ZoeLeBlanc. I am still working on some of the revisions. It shouldn't be too much longer, but it is a busy summer and Fall. I will let you know if I need any additional help.

@anisa-hawes
Contributor

Hello all,

Please note that this lesson's .md file has been moved to a new location within our Submissions Repository. It is now found here: https://github.com/programminghistorian/ph-submissions/tree/gh-pages/en/drafts/originals

A consequence is that this lesson's preview link has changed. It is now: http://programminghistorian.github.io/ph-submissions/en/drafts/originals/visualizing-data-with-r-and-ggplot2

Please let me know if you encounter any difficulties or have any questions.

Very best,
Anisa

@nabsiddiqui
Collaborator

@ZoeLeBlanc Took a long time to get back to this, but we have completed the revisions based on initial feedback. Let me know next steps and timeline, etc.

@anisa-hawes I uploaded new images for the tutorial. Right now, they are linked with a full source URL to the new images. I don't know enough about how the backend works to do it in Markdown. Can you look at the links to the images and let me know how to fix them?

@anisa-hawes
Contributor

Hello @nabsiddiqui. Yes, I can help. The format required is:

{% include figure.html filename="file-name.png" alt="Visual description of figure image" caption="Figure number: Caption text to display" %}

I can see that you've used the title tag title= instead of caption=, but I can change this for you (for consistency across our lessons).

I also note that, at the moment, your alt text and titles/captions match. Ideally the caption is a very concise line of text to number and identify the image, so that it can be understood within the context of the lesson content. Meanwhile, the alt text would explain visual information in the image for screen reader users. Do you have time to adjust these?

@nabsiddiqui
Collaborator

@anisa-hawes Yes, I will adjust these soon.

@nabsiddiqui
Collaborator

@anisa-hawes Should be good to go now.

@anisa-hawes
Contributor

Thank you, @nabsiddiqui. Alt text looks good, and the images are displaying correctly in the preview.

@nabsiddiqui
Collaborator

Thank you. @ZoeLeBlanc I think it should be good for sending to reviewers now.

@anisa-hawes
Contributor

Hello @nabsiddiqui. Zoe is feeling unwell, so please be patient as you wait to hear back about the revisions you've submitted.

@nabsiddiqui
Collaborator

nabsiddiqui commented Jul 22, 2022

No worries. We aren’t in any rush. Just wanted to make sure we got everything requested on our end.

@hawc2
Collaborator

hawc2 commented Oct 13, 2022

@nabsiddiqui just to update you, @ZoeLeBlanc should be getting to this in the coming month

@nabsiddiqui
Collaborator

Sounds good. Thanks.

@hawc2
Collaborator

hawc2 commented Mar 20, 2023

@ZoeLeBlanc what's the status on editing this lesson?

@ZoeLeBlanc
Member Author

Sorry for the slow reply @hawc2 ! Unfortunately, realistically I won't be able to get to this until May, so feel free to reassign it to a different editor if you would prefer the lesson move faster. I'm just a bit overcommitted at the moment so apologies for needing more time!

@hawc2
Collaborator

hawc2 commented Apr 5, 2023

Hey @ZoeLeBlanc, if you can pick up this lesson in May, then let's keep you on as editor. Hopefully we can move this lesson forward and publish before end of the year. If you don't think it's feasible for you to handle other PH responsibilities and continue to work as an Editor on a lesson like this, though, please let me know and I'll look into alternatives.

@hawc2
Collaborator

hawc2 commented Jul 24, 2023

@ZoeLeBlanc will you be able to edit this lesson? I am thinking we should ask @nabsiddiqui to revise and resubmit this lesson in the fall for our new submission period. Perhaps you can give him guidelines now for that revision before he submits in the fall?

@ZoeLeBlanc
Member Author

@hawc2 sorry for being slow and that sounds perfect! I'll take a look and try to get any notes back in the next two weeks @nabsiddiqui and then come Fall we can work on finalizing reviewers. Thanks for being so understanding and patient!

@acrymble

I think you should reject this on time elapsed @hawc2

@hawc2
Collaborator

hawc2 commented Oct 10, 2023

Thanks @acrymble. I've discussed this with @anisa-hawes and @nabsiddiqui and we are planning to invite the authors of this submission @nabsiddiqui and @rogorido to revise and resubmit this lesson. We shouldn't 'reject' this lesson outright, as the delay in its review is our fault, for which I apologize. I agree that we should close this ticket though, and I'm doing that with this comment.

The English edition of Programming Historian is currently starting a new submission process, where we will post a CFP this fall, and accept a number of lessons to be edited and reviewed next year. I would highly encourage the authors to make some revisions to this lesson this fall, resubmit, and hopefully this will be brought into the publishing pipeline more efficiently next year.

@nabsiddiqui and @rogorido if you have any concerns or questions about this plan, please email me, the Managing Editor for the English journal, at english@programminghistorian.org. I would also be happy to review a draft of your lesson and give feedback via email this fall if that is helpful.
