ENH: Add ohlc / candlestick plotting option for DataFrame #783

Closed
wesm opened this issue Feb 15, 2012 · 24 comments

Comments

@wesm
Member

wesm commented Feb 15, 2012

Not sure if it's possible to do this in a generic way, but it may make sense once we have an OHLC aggregator implemented.
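For reference, later pandas releases do provide an OHLC aggregator on resampled data. A minimal sketch, using made-up intraday prices, of producing the open/high/low/close columns that a candlestick plot would consume:

```python
import pandas as pd

# Hypothetical intraday prices, made up for illustration.
ticks = pd.Series(
    [7.92, 7.98, 7.95, 7.97, 7.96, 7.84, 7.97, 7.90],
    index=pd.to_datetime([
        "2016-09-29 09:00", "2016-09-29 11:00",
        "2016-09-29 13:00", "2016-09-29 16:00",
        "2016-09-30 09:00", "2016-09-30 11:00",
        "2016-09-30 13:00", "2016-09-30 16:00",
    ]),
)

# Resampler.ohlc() returns one row per bin with the columns
# open, high, low and close.
daily = ticks.resample("D").ohlc()
print(daily)
```

The resulting frame is indexed by day, so it has the shape a candlestick function would need.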

@esthers

esthers commented Feb 17, 2013

Yes, please implement candlestick plotting for DataFrame. It is so nice to make instant plots of financial data from Yahoo with pandas, but for many, like me, candlestick charts really are a must for analyzing the data.

@wesm
Member Author

wesm commented Feb 17, 2013

I'd be happy to accept a pull request for this

@dalejung
Contributor

I have this here:
http://nbviewer.ipython.org/4982660/

In order to support financial data I send mpl an ordinal index, essentially np.arange(len(df.index)), and draw the ticks with a Locator/Formatter. Doing this with a DatetimeIndex actually speeds up plotting (#1579) and gets rid of weekend gaps. However, a side effect is that you have to keep the previous data around for future plots on the same axes. For that I have a Figure class that handles translating any future plots into an ordinal index. I chose to keep all plotted data around for plot-specific fillnas, but that's probably unnecessary.

I'd like to merge this stuff in but I'm not sure how compatible it is. trtools.charting is fairly heavy handed in assuming you're not touching matplotlib directly. Mixing ordinal and datetime index plots on the same axes wouldn't work.

The module itself is self-contained outside of a small column_grep utility, so it can be made to be cleanly import-able as an add-on.
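The ordinal-index idea described above can be sketched roughly as follows. This is not the trtools.charting code: the names `make_tick_formatter` and `plot_ordinal` and the sample prices are made up for illustration. The series is plotted against `np.arange(len(index))`, and a `FuncFormatter` maps integer tick positions back to dates, so non-trading days take up no space on the x-axis.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter, MaxNLocator


def make_tick_formatter(index, fmt="%Y-%m-%d"):
    """Build a function mapping an integer tick position to a date label."""
    def format_tick(x, pos=None):
        i = int(round(x))
        if 0 <= i < len(index):
            return index[i].strftime(fmt)
        return ""  # tick falls outside the plotted data
    return format_tick


def plot_ordinal(series, ax=None, fmt="%Y-%m-%d"):
    """Plot a date-indexed Series against integer positions (hypothetical helper)."""
    if ax is None:
        _, ax = plt.subplots()
    positions = np.arange(len(series.index))  # 0, 1, 2, ... instead of dates
    ax.plot(positions, series.values)
    ax.xaxis.set_major_locator(MaxNLocator(integer=True))
    ax.xaxis.set_major_formatter(FuncFormatter(make_tick_formatter(series.index, fmt)))
    return ax


# 2016-09-30 is a Friday and 2016-10-03 the following Monday:
# the two points sit at adjacent positions, with no weekend gap.
dates = pd.to_datetime(["2016-09-29", "2016-09-30", "2016-10-03", "2016-10-04"])
closes = pd.Series([7.97, 7.90, 7.93, 7.93], index=dates)
ax = plot_ordinal(closes)
```

As the comment notes, the cost of this approach is that anything drawn later on the same axes has to go through the same date-to-position translation.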

@esthers

esthers commented Feb 20, 2013

@dalejung, if you could find a way to merge your code it would be greatly appreciated.

@woodb

woodb commented Jul 9, 2013

@dalejung, have you had any luck implementing this?

I agree that importing a big library for something like a candlestick plot seems heavy handed; maybe there's a lighter-weight way we can get this going.

@woodb

woodb commented Jul 9, 2013

@wesm, how do you think this should work syntactically?

Rolling this functionality into plot might be neat.

For example, if the DataFrame has a datetime index and columns "open", "high", "low", "close", and a DataFrame.plot() is called, would it make sense for the result to be a candlestick plot by default?

Would this be better as a kwarg for DataFrame.plot() or as a separate function (e.g. ohlc_plot)?
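To make the two options concrete, here is a sketch only. `is_ohlc_frame` and `choose_plot_kind` are hypothetical helpers that do not exist in pandas; the column names follow the example in the question, and the case-insensitive matching is an assumption.

```python
import pandas as pd

OHLC_COLUMNS = ("open", "high", "low", "close")


def is_ohlc_frame(df):
    """Return True if df has a datetime index and open/high/low/close columns.

    Hypothetical helper, not part of pandas; matching is case-insensitive.
    """
    if not isinstance(df.index, pd.DatetimeIndex):
        return False
    names = {str(col).lower() for col in df.columns}
    return all(name in names for name in OHLC_COLUMNS)


def choose_plot_kind(df, kind=None):
    """Pick a plot kind: an explicit kind wins, otherwise auto-detect."""
    if kind is not None:
        return kind
    return "ohlc" if is_ohlc_frame(df) else "line"


prices = pd.DataFrame(
    {"Open": [7.92], "High": [7.98], "Low": [7.92], "Close": [7.97]},
    index=pd.to_datetime(["2016-09-29"]),
)
print(choose_plot_kind(prices))               # auto-detected
print(choose_plot_kind(prices, kind="line"))  # explicit kwarg overrides
```

Auto-detection changes what an existing plot call draws as soon as the columns happen to match, whereas an explicit kwarg or a separate function keeps the default behaviour unchanged.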

@dalejung
Contributor

dalejung commented Jul 9, 2013

@woodb Sort of. I split out the charting work into its own project https://github.com/dalejung/ts-charting. The heaviness was due to me keeping the plotted data around in a DataFrame. I was doing this to use the DataFrame.__setitem__'s auto-reindexing to keep the plots aligned. I'm a bit smarter now and keep only the index around.

I think ts-charting should be straight-forward to merge into pandas, at least a portion of it. When I get the time I'll open a PR.

@woodb

woodb commented Jul 9, 2013

@dalejung Prima, let me know if you need any help, I've got some time here and there.

@dalejung
Contributor

dalejung commented Jul 9, 2013

@woodb Actually, a smaller first pass could be done with http://nbviewer.ipython.org/5864433. That uses the matplotlib datetime handling.

ts-charting converts the DatetimeIndex to ints and translates the labels back with a custom Formatter. I'm pretty sure this could be made pandas compatible by always checking the current ax for our TimestampFormatter and reindexing to the current x-axis. I suppose the Figure and Grapher classes would need their methods refactored into flat functions or hidden away onto the Formatter.

Converting to an int-index is better, but it means that every plot to that ax will need the translation. It's convenient to call fig.hl_span("2012-01-02", "2012-01-05"), but the base matplotlib method, ax.axvspan("2012-01-02", "2012-01-05"), would not work unless I monkey patched it.
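The translation step can be sketched as below. The `hl_span` here is a made-up stand-in, not the ts-charting implementation: it looks up the integer positions of two dates in the plotted index and passes those to `ax.axvspan`. Snapping a missing date to the next available one is an assumption made for the sketch.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


def date_to_position(index, date):
    """Integer position of the first index entry on or after `date`."""
    return int(index.searchsorted(pd.Timestamp(date), side="left"))


def hl_span(ax, index, start, end, **kwargs):
    """Shade start..end on an axes whose x-axis is np.arange(len(index))."""
    lo = date_to_position(index, start)
    hi = date_to_position(index, end)
    return ax.axvspan(lo, hi, **kwargs)


index = pd.to_datetime(
    ["2011-12-30", "2012-01-03", "2012-01-04", "2012-01-05", "2012-01-06"]
)
fig, ax = plt.subplots()
ax.plot(np.arange(len(index)), [10.0, 10.2, 10.1, 10.4, 10.3])

# "2012-01-02" is not in the index, so the span starts at the next
# available date, 2012-01-03, which sits at position 1.
span = hl_span(ax, index, "2012-01-02", "2012-01-05", alpha=0.3)
```

Calling `ax.axvspan` directly with the date strings would bypass this lookup, which is the incompatibility the comment describes.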

Actually, thinking about it more, I dunno if that's acceptable for inclusion into pandas. Part of the reason for the Figure abstraction is to create a namespace where all methods are aware of the Datetime -> int translation. hm.

Go Gators??!

@phobson

phobson commented Dec 13, 2013

Thought y'all might be interested in this:
matplotlib/matplotlib#2643

If merged, you'll be able to feed a matplotlib axes object a list of "stats" dictionaries describing the boxplots via a bxp method.
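For context, a `bxp` method of this kind is available on matplotlib axes today. A rough sketch, assuming the stats keys `med`, `q1`, `q3`, `whislo`, `whishi` and `fliers`, of using it to draw OHLC rows as boxes: the box spans open to close and the whiskers reach the low and high. The helper `ohlc_to_bxp_stats` and the choice of the close as the "median" line are illustrative assumptions, not something proposed in the thread.

```python
import pandas as pd
import matplotlib.pyplot as plt


def ohlc_to_bxp_stats(df):
    """Turn OHLC rows into the list of dicts that Axes.bxp expects."""
    stats = []
    for date, row in df.iterrows():
        stats.append({
            "label": date.strftime("%Y-%m-%d"),
            "q1": min(row["Open"], row["Close"]),  # bottom of the body
            "q3": max(row["Open"], row["Close"]),  # top of the body
            "med": row["Close"],                   # line drawn inside the body
            "whislo": row["Low"],                  # lower wick
            "whishi": row["High"],                 # upper wick
            "fliers": [],                          # no outlier points
        })
    return stats


df = pd.DataFrame(
    {"Open": [7.92, 7.96], "High": [7.98, 7.97],
     "Low": [7.92, 7.84], "Close": [7.97, 7.90]},
    index=pd.to_datetime(["2016-09-29", "2016-09-30"]),
)
stats = ohlc_to_bxp_stats(df)
fig, ax = plt.subplots()
artists = ax.bxp(stats, showfliers=False)
```

This does not colour up and down days differently, so it is only an approximation of a candlestick chart.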

@ghost ghost mentioned this issue Dec 14, 2013
@ghost

ghost commented Dec 16, 2013

Thanks @phobson. Worth noting that mpl already has candlestick support pretty much baked in,
but it's planned for removal in a future release?

@phobson

phobson commented Dec 16, 2013

@y-p -- yeah. What I've heard is that matplotlib.finance will be deprecated in mpl 1.4 and removed by 1.5.

@ghost

ghost commented Dec 16, 2013

@phobson , is it getting a new home or will the code just be thrown away?
The fragmented state of python viz has been coming up in issues here lately.
The situation is eerily similar to wes's old post on the fragmented state
of data libraries in python, hoping for a similar turn for the better.

@jreback
Contributor

jreback commented Dec 16, 2013

@phobson new champion for a combined viz library? (built on pandas of course) :)

@jtratner
Contributor

I was discussing this out of band with @olgabot. @mwaskom 's seaborn could be a good candidate (and it already appears to use pandas).

@phobson

phobson commented Dec 16, 2013

@y-p I can't say for certain what the final fate of it is. My impression, however, is that it'll be thrown out.

Personally, I think pandas/pydata people/we should be equally supporting/endorsing/contributing to seaborn and python-ggplot.

I got in early on matplotlib straight from using matlab during my stint in academia -- which is to say I'm comfortable using it directly and I probably won't ever get my head around ggplot-esque APIs. However, I think having both styles of API is crucial to growing the python-science/data/viz community.

@ghost

ghost commented Dec 16, 2013

poignant comment made a few days ago re two valid approaches: mpl recipes vs. ggplot.
I don't know which library will come out on top (hopefully both), but we should encourage viz PRs
to join forces with existing efforts rather than pandas sapping work away IMO.

Note: @phobson just beat me to it :)

@jtratner
Contributor

@phobson - yeah, definitely think both are good efforts. I find ggplot
really confusing personally (but that's probably because I started with
mpl).

Not totally clear where that leaves us in terms of the visualization PRs we
currently have. If we were to change the plot functions to look up kind
with a dictionary, it'd be relatively easy for others to hook into the
pandas plot method (not sure if that's actually desirable though)
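The dictionary lookup suggested above could look roughly like this. It is a sketch only: `PLOT_KINDS`, `register_plot_kind` and the stub plotting functions are hypothetical names, not pandas internals, and the stubs return strings where real code would draw on an axes.

```python
# Registry mapping a kind name to the function that draws it.
PLOT_KINDS = {}


def register_plot_kind(name):
    """Decorator that registers a plotting function under `name`."""
    def decorator(func):
        PLOT_KINDS[name] = func
        return func
    return decorator


@register_plot_kind("line")
def plot_line(data, **kwargs):
    # Stub: a real implementation would draw on a matplotlib axes.
    return "line plot of {} points".format(len(data))


@register_plot_kind("ohlc")
def plot_ohlc(data, **kwargs):
    # Stub: a third-party package could register this from outside.
    return "ohlc plot of {} points".format(len(data))


def plot(data, kind="line", **kwargs):
    """Dispatch to the function registered for `kind`."""
    try:
        func = PLOT_KINDS[kind]
    except KeyError:
        raise ValueError("unknown plot kind: {!r}".format(kind))
    return func(data, **kwargs)


print(plot([1, 2, 3]))
print(plot([1, 2], kind="ohlc"))
```

With a registry like this, another library could add a kind without patching the plot method itself, which is the hook described above.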

@olgabot

olgabot commented Dec 16, 2013

So should I move heatmap to seaborn (or at least suggest it with a PR)
to keep with this theme?


Olga Botvinnik
PhD Program in Bioinformatics and Systems Biology
Gene Yeo Laboratory | Sanford Consortium for Regenerative Medicine
University of California, San Diego
olgabotvinnik.com
blog.olgabotvinnik.com
github.com/olgabot

On Mon, Dec 16, 2013 at 9:56 AM, y-p notifications@github.com wrote:

poignant comment made a few days ago re two approaches: mpl recipes vs.
ggplot.
I don't know which library will come out on top, but we should encourage
viz PRs
to join forces with existing forces rather then pandas sapping effort away
IMO.



@ghost

ghost commented Dec 16, 2013

You might. I've been thinking of doing the same thing, except targeting ggplot.

@wesm
Member Author

wesm commented Sep 29, 2016

Won't fix

@wesm wesm closed this as completed Sep 29, 2016
@jorisvandenbossche jorisvandenbossche modified the milestones: Next Major Release, No action Sep 30, 2016
@femtotrader

Some basic code to display daily OHLCV data

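import pandas as pd
from io import StringIO  # pandas.compat.StringIO was removed in later pandas releases

import matplotlib.pyplot as plt
# Note: matplotlib.finance was later removed from matplotlib; in newer
# environments candlestick_ohlc is provided by the separate mpl_finance package.
from matplotlib.finance import candlestick_ohlc
import matplotlib.dates as mdates

def plot_candlestick(df, ax=None, fmt="%Y-%m-%d"):
    """Draw a candlestick chart from a date-indexed OHLC DataFrame."""
    if ax is None:
        fig, ax = plt.subplots()
    idx_name = df.index.name
    # candlestick_ohlc expects rows of (time, open, high, low, close)
    dat = df.reset_index()[[idx_name, "Open", "High", "Low", "Close"]]
    # convert timestamps to matplotlib float dates
    dat[idx_name] = dat[idx_name].map(mdates.date2num)
    ax.xaxis_date()
    ax.xaxis.set_major_formatter(mdates.DateFormatter(fmt))
    plt.xticks(rotation=45)
    _ = candlestick_ohlc(ax, dat.values, width=.6, colorup='g', alpha=1)
    ax.set_xlabel(idx_name)
    ax.set_ylabel("OHLC")
    return ax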


data="""Date,Stock,Open,High,Low,Close,Volume
2016-09-29,KESM,7.92,7.98,7.92,7.97,149400
2016-09-30,KESM,7.96,7.97,7.84,7.9,29900
2016-10-04,KESM,7.8,7.94,7.8,7.93,99900
2016-10-05,KESM,7.93,7.95,7.89,7.93,77500
2016-10-06,KESM,7.93,7.93,7.89,7.92,130600
2016-10-07,KESM,7.91,7.94,7.91,7.92,103000"""
df = pd.read_csv(StringIO(data), index_col='Date', parse_dates=True)

ax = plot_candlestick(df)

print(ax)

plt.tight_layout()
#plt.savefig("candle.png")
plt.show()

[Image: candle]

Related SO questions:

@jreback
Contributor

jreback commented Oct 12, 2016

@femtotrader you are just plotting a box-plot.

http://pandas.pydata.org/pandas-docs/stable/visualization.html#box-plots

@femtotrader

An example of a box-plot with such a DataFrame would be nice.

I tried

df[["Open", "High", "Low", "Close"]].transpose().plot.box()

and get

[Image: boxplot]

It doesn't look very good.
