# Python and Pandas internationalisation

Data visualisation issues for languages that need bidirectional support or complex font rendering.

Matplotlib is a commonly used tool for basic data visualisation in Python, and is the default plotting backend for pandas.DataFrame.plot. It is also used by seaborn and wordcloud, along with other libraries and tools.

The default backends for Matplotlib have a number of limitations:

  1. No support for the Unicode bidirectional algorithm
  2. No support for complex font rendering

This places severe limits on which natural languages can be used in titles, labels, legends, and other text elements in plots.

The package mplcairo provides an alternative backend for matplotlib that uses Raqm and GNU FriBidi for bidirectional text layout and complex rendering of OpenType features. This allows most languages to be used in plots.
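As a minimal sketch of switching backends, the snippet below selects mplcairo when it is installed and falls back to the built-in Agg backend otherwise. The backend name `module://mplcairo.base` (the headless variant) is taken from the mplcairo documentation as an assumption; GUI variants such as `module://mplcairo.qt` are selected the same way. The helper `pick_backend` is a hypothetical name used only for illustration.

```python
import importlib.util

import matplotlib


def pick_backend():
    """Return the mplcairo headless backend if available, else Agg.

    Assumption: mplcairo exposes its headless backend as
    "module://mplcairo.base".
    """
    if importlib.util.find_spec("mplcairo") is not None:
        return "module://mplcairo.base"
    return "Agg"


backend = pick_backend()
matplotlib.use(backend)  # select the backend before creating any figures

import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.set_title("کوردی")  # sample right-to-left title text
```

With the Agg fallback the title is not shaped or reordered correctly, which is the limitation described above.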

The key limitations of mplcairo are bugs when used with IPython and the lack of support for Jupyter notebooks.

Using the mplcairo backend for matplotlib, we can display plot titles, axis labels and categorical tick labels in any language we need to support.

There are two missing pieces at this point:

  1. Display of numeric tick labels in a numeral system appropriate for the UI language.
  2. Choice of a plot layout that suits the bidirectional requirements of the data visualisation.

Numeral systems

Regarding the first issue, it is possible to use matplotlib.ticker.FuncFormatter() to apply a function to convert to the target numeral system, and apply necessary grouping and decimal separators.
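The approach can be sketched as follows, assuming Arabic-Indic digits as the target numeral system, with U+066C and U+066B as the grouping and decimal separators. The function name `to_arabic_indic` and the sample data are illustrative only, and the Agg backend is used here just so the sketch runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend, only to keep the sketch runnable

import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter

# Map ASCII digits and separators to their Arabic-Indic counterparts.
ARABIC_INDIC = str.maketrans({
    "0": "\u0660", "1": "\u0661", "2": "\u0662", "3": "\u0663",
    "4": "\u0664", "5": "\u0665", "6": "\u0666", "7": "\u0667",
    "8": "\u0668", "9": "\u0669",
    ",": "\u066C",  # ARABIC THOUSANDS SEPARATOR
    ".": "\u066B",  # ARABIC DECIMAL SEPARATOR
})


def to_arabic_indic(value, pos=None):
    """Format a tick value with grouping, then swap digits and separators."""
    if float(value).is_integer():
        text = f"{value:,.0f}"
    else:
        text = f"{value:,.2f}"
    return text.translate(ARABIC_INDIC)


fig, ax = plt.subplots()
ax.bar(["a", "b", "c"], [1200, 3400, 2500])  # made-up sample values
ax.yaxis.set_major_formatter(FuncFormatter(to_arabic_indic))
```

FuncFormatter calls the function with the tick value and its position, so the `pos` argument is accepted even though it is unused here.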

RTL layout and data visualisation

It isn't always necessary to change the layout of the plot. If the plot uses a Cartesian coordinate system, it is best to keep the default layout. The layout, combined with user expectations, affects how trends in a data visualisation are interpreted. User interpretation of the visualisation and user experience are both critical inputs into a data visualisation design.

If a RTL layout is required:

  1. Use yaxis.tick_right() and yaxis.set_label_position("right") to move the y-axis ticks and label to the right side of the plot.
  2. Use plt.gca().invert_xaxis() to invert the x-axis. This step may not be necessary; UX is an important consideration.

Examples

The following Python scripts use Sorani Kurdish data:

Kurdish plot using Seaborn

Fig.1 - Kurdish bar charts in both LTR and RTL layouts.

Kurdish wordcloud

Fig.2 - Kurdish wordcloud.