graphs #6

Open
acangros opened this issue Jun 14, 2019 · 18 comments

Comments

@acangros

Great article! Short and very concise.
I think that one of the strong points of R is its great capacity to generate amazing graphs. Is it the same for Python?

@matloff
Owner

matloff commented Jun 14, 2019

Do you mean graphs as in graphics, or graphs as in vertices and edges?

@jasonjiang8866

jasonjiang8866 commented Jun 14, 2019

For Python you have matplotlib, seaborn, bokeh, plotly, etc.

@acangros
Author

Do you mean graphs as in graphics, or graphs as in vertices and edges?
Graphs as plots. "ggplot" is an amazing library for data science, newspapers, researchers... anything you can imagine, you can draw. Python has nice tools for plotting, but they are not as cool at the moment.

Other two things:

  • Integration with third parties, for example Neo4j for graph databases. I have no idea how good R or Python are in general.

  • Community: R has an amazing and proactive community that is really active on Twitter with #rstats or #tidyTuesday. I have no idea about Python.

For me, the strong points of R are dplyr, ggplot and the community.

@matloff
Owner

matloff commented Jun 14, 2019

Yes, both R and Python have great add-on packages for graphics. My point about graphics being built into R merely meant that, e.g., one can draw a histogram immediately in R, whereas for Python one has to learn an add-on.
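
To make the "no add-on" point concrete, here is a rough sketch of what a histogram looks like using only the Python standard library, which has no plotting functions of its own. The helper name `bin_counts` and the sample numbers are made up for illustration; this is not a substitute for a real plotting package.

```python
from collections import Counter


def bin_counts(values, width):
    """Group values into bins of the given width.

    Returns a sorted list of (bin_start, count) pairs.
    Bins that contain no values are simply absent from the result.
    """
    counts = Counter((v // width) * width for v in values)
    return sorted(counts.items())


# Hypothetical sample data, invented for this sketch.
flows = [1120, 1160, 963, 1210, 1230, 813, 1140]

# Print one row of '#' marks per non-empty bin: a text-only "histogram".
for start, count in bin_counts(flows, 100):
    print(f"{start}-{start + 100}: {'#' * count}")
```

Anything beyond this kind of text output requires an add-on such as matplotlib.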

The sense of community in R is wonderful, but is threatened by the language unity issue I brought up. There is tension among some leaders of both groups, quite alarming to me.

@smartgamer

smartgamer commented Jun 14, 2019 via email

@matloff
Owner

matloff commented Jun 14, 2019

I must say I hate regex, and try not to use it much beyond the basics. I'm told that Python's regex is much better than R's, and was considering including this in my comparison, but due to lack of much knowledge of "advanced regex," chose not to do so.

@stevekm

stevekm commented Jun 14, 2019

In general you should always seek to use ggplot2 as your plotting library in R. It's nice that there are base graphics included in R but I've found that in almost every case they become a hindrance rather than a benefit. Base R plotting, along with grid based plotting, is antiquated and extremely difficult to work with as soon as you need to interact with plots that someone else's code or library produces. The fact that you can assign your ggplot to an object and interact with it is a huge benefit; by contrast most other plotting packages involve writing directly to the graphics device, leaving you with massive headaches trying to do downstream manipulation of it. Further, ggplot2 has the great benefit of a single syntax for all types of plots, and it can act as a path to interactive JavaScript based plots via Plotly. If you are writing R code that produces plots then you need to be using ggplot2.

Note that this negates some of the benefit of having built-in graphics vs. Python.

@smartgamer

smartgamer commented Jun 14, 2019 via email

@matloff
Owner

matloff commented Jun 14, 2019

I'm a big user of ggplot2, have been since it first came out. But please note: (a) I didn't say one should not use R add-ons, quite the contrary. (b) For the beginner, ggplot2 is very abstract, difficult to pick up and poorly documented, with mystifying, frustrating error messages; the lattice package is just as powerful, and is more intuitive. (c) One can use Plotly without ggplot2 (see my cdparcoord pkg). (d) Once again: My comments are mainly regarding beginners; a new R user can type 'hist(Nile)' right away, without add-ons.

@chasbecker

Advantage to R for vanilla plotting in the base language, which base Python lacks. However, Pandas provides plotting routines similar to base R, so a single library brings Python much closer to base R, and from there everything is extra libraries on both sides.

@matloff
Owner

matloff commented Jun 15, 2019

Re Pandas: Don't you need NumPy as well? And NumPy is pretty complicated. And a large number of functions for both? Saying "just one single extra library" seems unfair.

@chasbecker

Pandas needs NumPy, but to do base-R-type plotting the programmer doesn't need to know much about NumPy. The base-R-type plot functionality consists of methods of Pandas dataframes, e.g. someDf.plot.line(). Not as obvious as R but not too bad, either.
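
A minimal sketch of that dataframe-method style, assuming pandas and matplotlib are both installed. The dataframe `df`, its column names and its values are invented for illustration. Note that NumPy is never imported directly.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display

import pandas as pd

# Hypothetical data, invented for this sketch.
df = pd.DataFrame(
    {
        "year": [2015, 2016, 2017, 2018],
        "flow": [1120, 1160, 963, 1210],
    }
)

# Plotting is a method on the dataframe itself; it returns a matplotlib Axes.
ax = df.plot.line(x="year", y="flow")
ax.set_title("flow by year")
```

The returned `ax` is an ordinary matplotlib object, so further tweaks mean learning at least a little matplotlib after all.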

@Zylatis

Zylatis commented Jun 15, 2019

Yeah you can definitely get quite far with just pandas and ignoring other niggles, e.g. df.hist(). Might not infer as much cool stuff as the R equiv, but gets you out of the starting blocks all the same (and no direct numpy needed).

That being said, one of my great python gripes is matplotlib, especially after ggplot2. The sooner I get stuck into plotnine the easier my life will be I think. Actually, df.hist() illustrates one of my problems as it's matplotlib underneath: can only specify # bins, not bin width as a single number, which is nuts.
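
One possible workaround, sketched under assumptions: the `bins` argument of matplotlib's `hist` (and of pandas' `df.hist()`) also accepts a sequence of bin edges, so a bin width can be turned into explicit edges first. The helper `edges_from_width` and the sample values below are hypothetical, not part of either library.

```python
import math


def edges_from_width(values, width):
    """Return equally spaced bin edges of the given width covering all values."""
    if width <= 0:
        raise ValueError("width must be positive")
    # Snap the range outward to multiples of the bin width.
    lo = math.floor(min(values) / width) * width
    hi = math.ceil(max(values) / width) * width
    n_bins = max(1, round((hi - lo) / width))
    return [lo + i * width for i in range(n_bins + 1)]


# Hypothetical sample data, invented for this sketch.
flows = [1120, 1160, 963, 1210, 1230, 813]
edges = edges_from_width(flows, 100)
print(edges)

# With a dataframe that has a "flow" column, the edges could then be passed as:
#   df.hist(column="flow", bins=edges)
```

This is still more typing than giving a single bin-width number, which is the complaint above.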

@smartgamer

Thank you for your effort. I’m convinced by you and will start to learn base R again.

@jaapwalhout

@smartgamer:

Even ggplot graphs are not great for academic publications. That’s why people use those commercial softwares. It’s a shame actually.

That's just not true. I've seen lots of academic publications with ggplot / base R graphics (including my own).

R is good for learning statistics and doing some quick analysis on small datasets. That’s my impression.

Also not true. R is being used in large corporations and (academic) research institutes on small and large datasets. For example, I'm using R on datasets with millions of rows without a problem.

@matloff
Owner

matloff commented Oct 18, 2019

Yes, R is definitely used in large corporations. Ever heard of Google? :-) Actually, you might want to look at the large corps. in the R Consortium.

R packages such as ggplot2 and lattice are used all the time in academic publications, including mine.

@smartgamer

smartgamer commented Oct 18, 2019 via email

@matloff
Owner

matloff commented Oct 18, 2019

Well, Charles, most of my books don't have much graphics, but the one that does won a major award in 2017. Presumably that means the graphics were publication quality.
