This guide introduces an open-source toolkit for academic research and writing. The main features of this toolkit, which is centred around Emacs and Org-mode, are:
- embedded R code in the document that allows for statistical results to be revised and reproduced,
- bibliographic citations from a personal bibliographic database,
- formatting using well defined styles with minimal markup,
- support for production of final output as pdf, odt, docx, html and many other formats.
This guide will be useful for you if you are writing a research paper, a dissertation or an academic book. It will be particularly useful if your writing involves one or more of the following:
- Citing existing literature in your area
- Presenting results of statistical analyses (in tabular form and/or graphically)
- Using mathematical equations
Following this guide requires some investment of time, but the benefits far outweigh the investment you make.
What are the most interesting features of the writing platform that you will set up using this guide?
- With easy style specifications that you provide, the document will be formatted almost entirely automatically by the software.
- Complicated LaTeX-style markup is a pain, and OpenOffice/MS-Word documents require too much manual formatting. Basic Org-mode markup is extremely simple, and can be mastered in very little time.
- Org-mode can produce well-formatted output in LaTeX, pdf, odt, docx, html and many other formats.
- Instead of including statistical results (tables, graphs, etc.) directly, we embed the appropriate R programs in the document, so that when the formatted output is produced, all programs are run to generate the results. Advantages of doing this are:
  - Any changes in the data being used can be accommodated just by publishing the document again.
  - Any modifications in statistical analysis are easily made by modifying the programs that are embedded in the file itself.
  - Anyone who has the org file can reproduce your results. You can also extract all R programs from the org file and distribute those for reproduction of your results.
- The document will be integrated with a citation manager, so that bibliographic information will be pulled automatically from a central database to create a fully formatted bibliography.
  - You will maintain a bibliographic database in BibTeX format that you can build over time, adding bibliographic information for works that you cite.
  - Many websites (including Google Scholar) provide bibliographic information directly in BibTeX format, and we will have integrated tools that will allow us to pull this information directly into our local database.
In my adaptation of org, I have benefited immensely from the great community of org-mode developers and users. The Org-mode manual, Worg, and archives of the Org-mode mailing list have been the most important resources. In addition, I have greatly benefited from solutions provided by various people to my specific queries on the Org-mode mailing list. What I present in this document is essentially a synthesis of solutions provided by various people. The community has been extremely generous in providing these.
I would particularly like to thank
- Carsten Dominik, the author of Org-mode.
- Bastien Guerry, who has been a great maintainer of Org-mode, after Carsten passed on the baton to him.
- Nicolas Goaziou, who wrote the brilliant new exporter framework. The amount of code Nicolas has contributed to Org over the last two years or so is incredible. Nicolas very kindly responded to several of my queries.
- Eric Schulte, the main author of Babel, which gave Org mode the ability to execute code. I used to use Org-mode as a task manager and for taking notes. I discovered org-babel in the summer of 2010, when I was doing fieldwork in villages in eastern India. This discovery completely changed my work flow, and Org-mode became central to all my academic work.
- John Kitchin, the author of scimax, org-ref and org-ref-cite, with whose use of org-mode I have the greatest overlap. His documentation and videos are a great resource.
- In addition to the above, Suvayu Ali, for responses to several of my queries on the mailing list.
This setup will work with any operating system. I have tested it on GNU/Linux and Mac OS-X, but it should work on Windows as well. For this setup, you need to install Emacs (version 24 or later, along with a few additional Emacs packages), TeX Live, R (along with whatever additional R packages you want to use) and Pandoc.
- GNU/Linux
Emacs can be installed using package managers of all GNU/Linux distributions. Latest versions of most common distributions provide version 24. I strongly recommend using the latest version of Emacs.
- Mac OS-X
The built-in Emacs on OS-X is an older version, and it would be a good idea to install the latest version instead.
The best option is to install it via homebrew. I like the version available from railwaycat/emacsmacport tap (https://github.com/railwaycat/emacs-mac-port).
After installing homebrew, or if you already have it installed, run the following from the terminal:
$ brew tap railwaycat/emacsmacport
$ brew install emacs-mac
- Microsoft Windows
Download the latest version of Emacs from http://ftp.gnu.org/gnu/emacs/windows/, and install.
- GNU/Linux
TeX Live can also be installed from the package managers of most GNU/Linux distributions.
- Mac OS-X
For OS-X, install MacTeX from http://www.tug.org/mactex/
- Microsoft Windows
For Windows, download Texlive and follow instructions from https://www.tug.org/texlive/doc.html
In this guide, I assume that you are familiar with R (http://www.r-project.org). I will not cover R programming in this guide.
For GNU/Linux, R can be installed from native package managers (look for r-base in debian and debian-based distributions). For Mac OS-X and Windows, download and see installation instructions at http://www.r-project.org
Pandoc (http://johnmacfarlane.net/pandoc/) is an extremely powerful converter, which can translate one markup into another. It supports conversion between many file formats, and supports “syntax for footnotes, tables, flexible ordered lists, definition lists, fenced code blocks, superscript, subscript, strikeout, title blocks, automatic tables of contents, embedded LaTeX math, citations, and markdown inside HTML block elements.” That is pretty much everything I use.
We shall use pandoc to convert our file from LaTeX to odt/docx/html formats.
GNU Emacs is an extensible platform. Although its primary function is as an editor, it can be extended to do almost anything that you would want your computer to do. Now, that really is not an overstatement. It is a worthwhile aim to slowly shift an increasing number of tasks you do on your computer to emacs-based solutions. For each major task you do on your computer, ask if it can be done using emacs. For almost everything, the answer is yes, and in most cases, emacs does it better than other software you are used to. Many emacs users have learnt emacs by shifting, one-by-one, all major tasks that they do on the computer to emacs.
I am not going to give a detailed guide to the use of emacs. A few tasks for which I use Emacs include
- File management (copying files, moving files, creating directories)
- Reading and writing e-mails
- Reading RSS feeds
- Calendar, scheduler, planner
- Calculator
- Statistical work (by hooking Emacs to R)
- And, of course, as an editor (including for writing research papers)
In this guide, I will just provide a minimal set of basic commands in emacs to get you started. This is a minimal but sufficient set to be able to work. I expect that you would learn more commands as you start using emacs.
In emacs, a buffer is equivalent to a tab in a web browser. It is normal to have several buffers open at the same time. Each file opens in emacs as a buffer. Buffers could also have processes like R running in them. Emacs displays any messages for you in a separate buffer.
Most commands in emacs are given using the Control (ctrl) or the Meta (often mapped to alt) keys.[fn:3] Control key is usually referred to as C- and the Meta key as M-. So a command C-c means pressing Control and c together. Command M-x means pressing Meta and x together. Everything is case-sensitive. So M-X would mean pressing Meta, Shift and x together. C-c M-x l would mean pressing C-c, release, then M-x, release, and then l.
Table essential-emacs-commands gives the commands that are the most important. This is a minimal set, commands that you should aim to learn as soon as possible. There are many more, which you will learn as you start using emacs.
All commands have a verbose version that can be used by pressing M-x and writing the command. For example, M-x find-file to open a file.
All major commands are also mapped to a shortcut. For example, instead of typing M-x find-file to open a file, you can say C-x C-f. I remember shortcuts for commands that I use most frequently. For others, I use the verbose versions. Over time, one learns more shortcuts and starts using them instead of the verbose versions.
Description | Verbose command (M-x followed by) | Shortcut |
---|---|---|
*Opening files, saving and closing* | ||
Open a file | find-file | C-x C-f |
Save the buffer/file | save-buffer | C-x C-s |
Save as: prompts for a new filename and saves the buffer into it | write-file | C-x C-w |
Save all buffers and quit emacs | save-buffers-kill-emacs | C-x C-c |
*Copy, Cut and Delete Commands* | ||
Delete the rest of the current line | kill-line | C-k |
To select text, press this at the beginning of the region and then take the cursor to the end | set-mark-command | C-SPC (Control and spacebar) |
Cut the selected region | kill-region | C-w |
Copy the selected region | copy-region-as-kill | M-w |
Paste or insert at current cursor location | yank | C-y |
*Search Commands* | ||
Prompts for a text string and then searches forwards in the buffer from the current cursor position | isearch-forward | C-s |
Find-and-replace: replaces one string with another, one by one, asking for each occurrence of search string | query-replace | M-% |
Find-and-replace: replaces all occurrences of one string with another | replace-string | |
*Other commands* | ||
Divide a long sentence into multiple lines, each smaller than the maximum width specified | fill-paragraph | M-q |
*Window and Buffer Commands* | ||
Switch to another buffer | switch-to-buffer | C-x b |
List all buffers | list-buffers | C-x C-b |
Split current window into two windows; each window can show same or different buffers | split-window-below | C-x 2 |
Remove the split by closing the current window | delete-window | C-x 0 |
When you have two or more windows, move the cursor to the next window | other-window | C-x o |
*Canceling and undoing* | ||
Abort the command in progress | keyboard-quit | C-g |
Undo | undo | C-_ |
Emacs is highly customisable. We will need a well customised emacs installation with a number of additional emacs packages. There are many configuration frameworks available for emacs (including spacemacs, doom, prelude, and scimax). You may try these and choose whichever you like. In most cases, you would need to do some additional customisation to suit your needs.
My current emacs configuration does not use any of these frameworks. Instead, I have a rather modular setup primarily based on use-package for loading the additional packages I need. Packages are installed mainly using package.el, the most popular package manager for emacs. A few packages, including org-mode, are directly installed from their git repositories to take advantage of the latest features available.
My configuration directory can be cloned from github using the following commands:
$ mv ~/.emacs.d ~/.emacs.d.old
$ git clone --recurse-submodules -j8 https://github.com/ep624/emacs.d.2021.git ~/.emacs.d
$ cd ~/.emacs.d/org-mode && make
The first command will move any existing emacs configuration to .emacs.d.old, and the second command will install my configuration instead. The third command compiles org-mode.
Starting with version 24, Emacs includes a package manager. You can install/update add-on packages using the package manager. To use the package manager, press M-x in emacs, and then type package-list-packages and press return. This would bring up a list of packages.
To mark a package for installation, take the cursor to the item and press i. Once you have marked the packages you want to install, press x to execute the installation.
The following emacs commands will install all the packages that I currently have in my emacs installation. You may not need some of them. You may also want some others depending upon your use. You can always delete the ones you do not need, and add any other packages that you need.
(setq package-selected-packages '(ag anzu async auto-complete avy
bibretrieve auctex cdlatex
citeproc coffee-mode consult
counsel-projectile
counsel-tramp counsel
dired-subtree dired-hacks-utils
dirtree ebib edit-server
elfeed-goodies ace-jump-mode
elfeed-score elfeed emacsql
embark erc-hl-nicks erc-image
eshell-git-prompt
ess-R-data-view ctable
ess-r-insert-obj
ess-smart-equals
ess-smart-underscore ess-view
ess-view-data csv-mode ess
flx-ido flx flycheck
git-gutter+ git-gutter
highlight-indentation htmlize
ido-completing-read+
ido-vertical-mode iedit
ivy-bibtex bibtex-completion
biblio biblio-core
ivy-prescient key-chord keycast
kurecolor magit git-commit
magit-section memoize move-text
multi-web-mode multiple-cursors
nameless noflet notmuch-labeler
notmuch org-gcal alert log4e
gntp org-sidebar org-ql f
dash-functional
org-sticky-header
org-super-agenda ht ov parsebib
pdf-tools peg persist
persistent-scratch pinentry
popup popwin powerline
prescient pretty-hydra hydra lv
projectile pkg-info epl
quelpa-use-package quelpa queue
rainbow-delimiters rainbow-mode
remember-last-theme
request-deferred request
deferred simple-httpd
smartparens string-inflection
super-save swiper ivy tblui
tablist magit-popup transient
tree-mode ts s dash use-package
bind-key visual-fill-column
which-key windata with-editor
yasnippet-snippets yasnippet
zenburn-theme))
(package-install-selected-packages)
You can copy the above lines, paste them in an empty buffer (e.g., *scratch*), select them, and do M-x eval-region.
You can considerably speed up your work in emacs by using yasnippets. Yasnippets are chunks of text (forms, templates, lines of code) that can be inserted in a buffer by typing a short keyword, which is handy for text that needs to be used repeatedly. If you have cloned my emacs configuration, a whole bunch of snippets would be in
An Org file has a few special lines at the top that set up the environment. Following lines are an example of the minimal set of lines that we shall use.
#+title: Using Emacs, Org-mode and R for Research Writing in Social Sciences
#+subtitle: A Toolkit for Writing Reproducible Research Papers and Monographs
#+author: Vikas Rawal
#+date: May 4, 2014
#+options: toc:2 H:3 num:2
As you can see, each line starts with a keyword, and the values for this keyword are specified after the colon.
Table special-lines gives details of a few major special lines that we shall use. The table also gives snippet keywords that can be used to create the keyword if you have got the yasnippets from my emacs configuration.
Keyword | Snippet | Purpose |
---|---|---|
#+title | <ti | To declare title of the paper |
#+author | <au | To declare author/s of the paper |
#+date | <da | Sets the date. If blank, no date is used. If this keyword is omitted, current date is used. |
#+options | <op | There are many options you can give. Multiple options can be separated by a space and specified on the same line. These are what I find the most important: toc:nil (do not include a table of contents); toc:n (include n levels of sections and sub-sections in the table of contents); H:2 (treat the top two levels of headlines as section levels, and anything below that as an item list; modify the number as appropriate); num:2 (number the top two levels of headlines; modify the number as appropriate). |
In addition to these, we can use LaTeX specific options for formatting the pdf output, odt specific options for formatting the odt/docx output, and R specific options for setting up the R environment. These would also be specified using special lines at the top of the file. I shall provide details of some of these in the sections where these topics are discussed.
In research monographs prepared using org, you may need quite a few lines in the preamble to set everything up. These would tend to make the top of the org file look daunting. It may be helpful to store these lines in a set of separate org files, and include them in your main document using a line such as this:
#+INCLUDE: path-to-file/papersetup.org
After the special lines at the top comes the main body of the Org file.
The content in any Org file is organised in a hierarchy of headlines. Think of these headlines as sections of your paper.
A headline in Org starts with one or more stars (*) followed by a space. A single star denotes a main section, two stars denote a subsection, three stars denote a sub-subsection, and so on. We shall use this to create the section structure of our document. You can create as many levels of sections as you need.
See the following example. Note that headlines are not numbered. We leave section numbering for org-mode to handle automatically.
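A minimal sketch of such a headline structure is given here. The section titles are hypothetical and only illustrate how the number of stars sets the level of each section.

```org
* Introduction
Text of the introduction goes here.
* Data and methods
** Survey design
** Sample
*** Sampling frame
* Results
* Conclusions
```

With H:3 in the #+options line, all three levels shown here would be exported as sections, sub-sections and sub-subsections.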
Org handles these headlines beautifully. With your cursor on the headline, pressing tab folds-in the contents of a headline. If you press tab on a folded headline, it opens to display the contents. If there are multiple levels of headlines, these open in stages as you repeat pressing the tab key.
When you are on a headline, pressing M-return creates a new headline at the same level (that is, with the same number of stars). Once you are on the new headline, a tab moves it to a lower level (that is, a star is added), and shift-tab moves it to a higher level (that is, a star is removed).
When I start writing a paper, I start with a tentative headline/section structure, and then start filling in the content under each headline, and modify the section structure, if needed, as the paper develops.
(For further reading, see Headlines in the Org-mode manual)
Following syntax produces unordered (bulleted) lists:
+ bullet
+ bullet
  - bullet2 1
  - bullet2 2
+ bullet
+ bullet
This is how this list shows up in the final document
- bullet
- bullet
  - bullet2 1
  - bullet2 2
- bullet
- bullet
Note that, in unordered lists, + and - signs are interchangeable.
Following syntax produces ordered/numbered lists:
1. Item 1
2. Item 2
   1) Item 2.1
   2) Item 2.2
      1) Item 2.2.1
3) Item 3
This is how the ordered list shows up in the final document.
1. Item 1
2. Item 2
   1. Item 2.1
   2. Item 2.2
      1. Item 2.2.1
3. Item 3
Note that, in ordered lists, 1. and 1) are interchangeable.
Description lists are used for providing definitions/explanations for words/phrases like in a dictionary. Following syntax produces a description list:
+ item1 :: description of item1
+ item2 :: description of item2
  - bullet1 under item2
  - bullet2 under item2
This is how this list shows up in the final document
- item1
  - description of item1
- item2
  - description of item2
    - bullet1 under item2
    - bullet2 under item2
Note that:
- In lists, levels of bullets and numbering are determined by indentation.
- Different types of lists can be mixed using numbers and bullets for different levels.
- If the cursor is on a line that is part of an itemised list, M-return inserts a new line with a bullet/number below the present line with the same level of indentation.
- To insert a footnote at any point, use C-c C-x f
- To reorder and renumber footnotes after inserting a footnote in a text that already has some footnotes after the point where a new footnote is being inserted, use C-u C-c C-x f S
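For reference, this is roughly what footnotes look like in the Org file itself. This is a minimal sketch with placeholder text: a numbered footnote has a marker in the text and a definition that starts at the beginning of a line, while an inline footnote carries its definition within the marker.

```org
A sentence that needs a note.[fn:1] Another sentence with an inline
note.[fn:: The text of an inline footnote is written where it is used.]

[fn:1] The text of the first footnote goes here.
```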
The following sample code produces a reasonably formatted table, with a numbered title above the table and a name for cross-referencing the table from the text anywhere in the document.
See Table table-yield for an illustration of how this table shows up in the final document.
#+name: table-yield
#+caption: Simple table created using LaTeX tabular environment
#+attr_latex: :environment tabular :width \textwidth :align lrr
| State | Average yield | Average income |
|----------------+---------------+----------------|
| Madhya Pradesh | 672 | 13000 |
| Haryana | 300 | 25000 |
| Punjab | 260 | 35000 |
State | Average yield | Average income |
---|---|---|
Madhya Pradesh | 672 | 13000 |
Haryana | 300 | 25000 |
Punjab | 260 | 35000 |
Org-mode has an in-built table editor, which is very simple to use.
- Tables in Org have columns separated using |.
- Once you create the first row by separating columns using |, pressing tabs takes you from the first column to the next. Org automatically aligns the columns.
- At the end of the row, pressing tab again, creates a new blank row. You can also create a new blank row by pressing return anywhere in the last row.
- For creating a horizontal line anywhere, type |- at the starting of the line, and press tab.
- Contents of each cell are aligned automatically by Org.
- To delete a row, use C-k.
Org provides various commands for manipulating the design of tables. Table org-table-commands provides the most important ones. Note that Table org-table-commands is created using Org mode. It also gives you an idea of how the table would look eventually.
Command | Description |
---|---|
M-<left> | Move the column left |
M-<right> | Move the column right |
M-S-<left> | Delete the current column |
M-S-<right> | Insert a new column to the left of the cursor position |
M-<up> | Move row up |
M-<down> | Move row down |
M-S-<up> | Delete the current row or horizontal line |
M-S-<down> | Insert a new row above the current row |
For more commands for manipulating tables, see this section of the Org manual. In particular, you may want to look at spreadsheet-like functions of the table editor.
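As a small illustration of the spreadsheet-like functions, the following sketch uses a column formula to fill the fourth column with the product of the second and third columns. The table contents are hypothetical.

```org
| Item  | Quantity | Price | Total |
|-------+----------+-------+-------|
| Paper |        2 |     3 |       |
| Pens  |        5 |     4 |       |
#+TBLFM: $4=$2*$3
```

Pressing C-c C-c with the cursor on the #+TBLFM: line recalculates the table and fills the Total column (6 and 20 here).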
If you have snippets included with my emacs configuration, you can type nct1, nct2, nct3 or nct4 and press the tab key to create blank tables with 1-4 columns.
Note that we will directly create only those tables in org that are not produced as a result of some statistical analysis. For tables that are a result of some statistical analysis, we will embed R programs rather than the tables themselves. This is discussed in Section #orgmodeandr of this guide.
You can insert images in documents as follows
[[file:path-to-file/a.jpg]]
You should do this for images that you already have, and you just want to insert them in the document. If you have snippets included in my emacs configuration, you can use yasnippet ncf to help in inserting a named figure. For graphs produced by R, we will embed the code instead, so that the graph is generated and inserted automatically.
We would like to give a title to our tables and images. And we would like to be able to refer to them from the text. These are achieved by adding two lines above every table and image.
- A line starting with #+caption: placed just above a table or a figure adds a title to it. All Table and Figure titles are automatically numbered.
- For referring to these Tables and Figures in the text, we shall name each table and figure in a line starting with #+name: as below.
To illustrate, for inserting an image, with a caption and a name, this is what we shall do.
#+name: literacy-rate
#+caption: Percentage of literate men and women, by country (per cent)
[[file:a.jpg]]
Similarly, a table will be inserted as follows.
#+name: literacy-rate-table
#+caption: Percentage of literate men and women, by country (per cent)
| Country | Men | Women |
|------------+-----+-------|
| India | 75 | 43 |
| Bangladesh | 83 | 63 |
| Rwanda | 77 | 60 |
To refer to the Table above in the text, write Table [[literacy-rate-table]]. As an illustration, see the following sentence.
Tables [[literacy-rate-table]] and [[health-table]], and Figure
[[literacy-figure]], show the level of underdevelopment.
By default, all objects with captions are numbered, and names are used to anchor cross-references. When the formatted output is produced, all the references would be automatically converted to appropriate numbers. If new objects are inserted in the paper, numbering will be adjusted automatically when you create the formatted output.
Following code in .emacs.d/_configs/use-org.el enables Org to run different types of code. If you are using my emacs configuration, these are already enabled.
I have included here the languages that I commonly use. See Org manual, if you would like to add any more.
;; Languages marked t can be evaluated in source blocks;
;; those marked nil are left disabled.
(org-babel-do-load-languages
 'org-babel-load-languages
 '((R . t)
   (org . t)
   (ditaa . t)
   (latex . t)
   (dot . t)
   (emacs-lisp . t)
   (gnuplot . t)
   (screen . nil)
   (shell . t)
   (sql . nil)
   (sqlite . t)))
Org uses ESS (emacs-speaks-statistics) to provide a fully functional, syntax-aware, development environment to write R code. R code is embedded into Org as a source block. The basic syntax is
#+name: name_of_code_block
#+BEGIN_SRC R <switches> <header-arguments>
<Your R code goes here.>
#+END_SRC
This is how source blocks are created.
- First write the lines starting with #+NAME, #+BEGIN_SRC and #+END_SRC. We will use different snippets to quickly insert these lines for different types of code blocks.
- Then with your cursor in between the BEGIN_SRC and the END_SRC lines, give the command C-c ' (that is, press Ctrl-c, release, and press ').
  - This would open a new buffer using ESS mode. If you type your code in this buffer, you will see that ESS is syntax-aware and nicely highlights R code.
  - ESS also allows you to run (evaluate) the code that you write, to test what your code is doing. C-return or C-c C-n can be used to evaluate a line/region of code and move to the next line. Or you can use C-c C-j for evaluating a single line of code, C-c C-b for evaluating the entire ess buffer, or C-c C-r for a marked region within the ess buffer.
- Once you have finished writing a code block and tested it, press C-c ' again to come back to your Org buffer.
- In your Org buffer, with your cursor in a source-block, press C-c C-c to evaluate the whole code block and have the results included in your document.
- You can always edit your source code by opening a temporary ESS buffer using C-c '
Code blocks that read data and load functions for later use in the document without any immediate output
I normally have one or two code blocks that read the data I am going to use, call the libraries that I use, and define a few functions of my own that I plan to use. I want this code block to be evaluated, so that these data, libraries and functions become available in my R environment. But no output from such code blocks is expected to be included into the document.[fn:4]
Code block readdata-code is an example of such a code block. Note the :results silent and :exports none header arguments used in the #+begin_src line.
#+name: readdata-code
#+BEGIN_SRC R :results silent :exports none
# Read the CSV file into a data frame for use in later code blocks
read.csv("datafile1.csv", header = TRUE) -> mydata1
#+END_SRC
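For the objects created by such a block to be visible to later blocks, the blocks need to be evaluated in the same R process. By default, Org starts a separate R process for each block unless a session is specified. A minimal sketch, assuming you want all R blocks in the file to share one session, is to add a line like the following among the special lines at the top of the file. The session name *R* is an arbitrary choice for illustration and not something prescribed by the setup described above.

```org
#+property: header-args:R :session *R*
```

The same effect can be had for an individual block by adding :session *R* to its #+BEGIN_SRC line.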
The output of a lot of R code I write is presented in some kind of table.
There are two approaches for producing formatted, publication-ready tables.
- We get R to produce bare tables, add Org-mode markup to those, and get Org-mode to export these into nice looking LaTeX tables.
- Alternatively, we can get R to produce nicely formatted LaTeX tables, and let Org just export the LaTeX code as it is.
I will discuss the first approach in this section.
The code block may use data and functions made available by previous code blocks, read some new data and may load some new functions. The code block does some statistical processing. The last command of the code block produces an object (for example, a data frame) that is included in the document as a Table.
For example, the code block bmi-table-code below uses mydata1 read in the previous code block, reads a new dataset, and processes them to create a table that shows average BMI by country.
#+name: bmi-table-code
#+BEGIN_SRC R :results value :exports results :colnames yes :hline
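# Mean height by country, from the data read in the previous code block
aggregate(height ~ Country, data = mydata1, mean) -> a1
# Read the second dataset and compute mean weight by country
read.csv("datafile2.csv", header = TRUE) -> mydata2
aggregate(weight ~ Country, data = mydata2, mean) -> a2
merge(a1, a2, by = "Country") -> a1
# BMI = weight / height^2 (assuming weight in kg and height in metres)
a1$weight / a1$height^2 -> a1$BMI
# The value of the last expression becomes the table
subset(a1, select = c("Country", "BMI"))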
#+END_SRC
If you are using my emacs configuration, you can use the snippet srct to insert a blank code block of this kind. You can then use C-c ' to go into a temporary ess buffer and write the code.
You can evaluate the code block using C-c C-c. When you do that, it produces the output, and places it immediately below the code block. The results display the output of the code under a line that looks like below
#+RESULTS: bmi-table-code
Note that the results are tied to the code block using the name of the code block. Every time you go to the source code block and press C-c C-c, the code is evaluated again and the results are updated.
On top of the line starting with #+RESULTS:, we shall add two more lines, to give the table a caption and a name. Note that the code block and the result of the code block have separate names.
#+name: bmi-table-output
#+caption: Average BMI, by country
#+RESULTS: bmi-table-code
Like any Org table, you can cross-refer to this table using [[bmi-table-output]].
Section #tableformatting discusses how to format tables for LaTeX export. All of that can also be done on tables created by source code blocks.
An alternative approach would be to use various R packages designed for producing formatted tables. R has excellent libraries for producing tables formatted for LaTeX, html, RTF and docx exports. Particularly noteworthy are the xtable, gt/gtExtras, kableExtra, and tabularray packages. It is beyond the scope of the present document to describe each of these. Of these, xtable is the most versatile but also complex, and it does not work out of the box with threeparttable for producing table notes. gt/gtExtras are excellent but are mainly aimed at producing html tables. kableExtra produces tables for the tabular, longtable and tabu LaTeX environments and works well with threeparttable, while tabularray produces output for the tabularray LaTeX environment.
The code block code1 shows code that uses kableExtra to produce a formatted LaTeX table. Similarly, code block code2 shows code that produces a table formatted for the tabularray LaTeX package.
A disadvantage of this approach would be that the results of the code block will not contain an Org-mode table, which is a visually appealing and functionally useful representation while working in Org-mode.
One could, of course, use a combination of two approaches by creating two separate code blocks: the first producing an Org-mode table that is not meant for export (:exports none), and another that reads the Org-mode table, and produces LaTeX code meant for export.
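A minimal sketch of this two-block combination is given here. It is an illustration under stated assumptions rather than a tested recipe from my own papers: the block names mpg-org-table and mpg-latex-table and the variable name tbl are hypothetical, the built-in mtcars dataset stands in for real data, and the second block assumes that the kableExtra package is installed.

```org
#+name: mpg-org-table
#+BEGIN_SRC R :results value :colnames yes :exports none
  # Produces an Org-mode table for viewing in the Org buffer only
  aggregate(mpg ~ cyl, data = mtcars, mean)
#+END_SRC

#+name: mpg-latex-table
#+BEGIN_SRC R :var tbl=mpg-org-table :colnames yes :results output latex :exports results
  # tbl holds the result of the block above, passed in as a data frame
  library(kableExtra)
  kbl(tbl, format = "latex", booktabs = TRUE,
      caption = "Average miles per gallon, by number of cylinders")
#+END_SRC
```

The :var header argument is what links the two blocks: it passes the result of the first block to the second block as an R object.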
#+NAME: code1
#+BEGIN_SRC R :results value verbatim latex :exports results
library(kableExtra)
dt <- mtcars[1:5, 1:4]
kbl(dt, format="latex", booktabs = T, caption = "Demo Table") |>
kable_styling(latex_options = c("striped", "hold_position"),
full_width = F) |>
add_header_above(c(" ", "Group 1" = 2, "Group 2[note]" = 2)) |>
add_header_above(c(" ", "Group 3" = 4)) |>
footnote(c("1. table footnote","2. another footnote"))->t
#+end_src
#+NAME: code2
#+BEGIN_SRC R :results value verbatim latex :exports results
library(tabularray)
library(dplyr)
df <- starwars |>
  filter(homeworld == "Tatooine") |>
  select(name, height, mass, sex, birth_year) |>
  arrange(desc(birth_year))
df |>
  mutate(sex = stringr::str_to_title(sex)) |>
  group_by(sex) |>
  tblr(type = "float", caption = "Starwars Creatures from Tatooine") |>
  set_source_notes(
    Note = "Entry C3PO altered to test characters that have a special meaning in LaTeX.",
    Source = "R package \\texttt{dplyr}"
  ) |>
  set_alignment(height:birth_year ~ "X[r]") |>
  set_column_labels(
    name = "",
    height = "Height",
    mass = "Mass",
    birth_year = "Birth Year"
  ) |>
  set_theme(row_group_style = "panel") |>
  set_interface(width = "0.7\\linewidth") |>
  set_column_spanner(
    c(height, mass) ~ "Group 1",
    birth_year ~ "Group 2"
  ) |>
  set_column_spanner(!name ~ "All my vars") -> t
t
#+end_src
Code blocks that produce graphs can contain a series of commands. The last command should produce the graph that we would like to include in the document.
The following code shows an example of a code block that produces a graph.
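A minimal sketch of such a block is given here, assuming made-up BMI figures for three unnamed countries. The block name mygraph-code and the file name bmi2.png are the ones that appear in the results later in this section; the header arguments are one common combination for recent Org versions and may need adjusting for yours.

```org
#+name: mygraph-code
#+begin_src R :results output graphics file :file bmi2.png :exports results
# Hypothetical data: average BMI for three made-up countries
bmi <- data.frame(country = c("A", "B", "C"),
                  bmi = c(22.1, 24.3, 26.8))
# The last command draws the graph that Org writes to bmi2.png
barplot(bmi$bmi, names.arg = bmi$country, ylab = "Average BMI")
#+end_src
```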
You can use the snippet <srcg and press tab to insert an empty code block, and then go into a temporary ESS buffer by using C-c '.
- Once in this temporary ESS buffer, you can write the R commands for making your graph.
- As you write, you can evaluate the commands using C-c C-n, C-c C-r and C-c C-b and see what your output looks like.
- The output is displayed on your screen using the default graphic device used by R (X11, quartz or windows graphic device depending upon your operating system).
- Once you have finalised your graph, you press C-c ' and come back to the Org buffer.
Note that creation of the image file is left to appropriate switches
in the #+BEGIN_SRC line. Org automatically chooses an appropriate
graphic device to produce the file. When you evaluate this code using
C-c C-c, the results are displayed below the code block as follows.
#+RESULTS: mygraph-code
[[bmi2.png]]
Note that, taking the file name from our #+BEGIN_SRC line, a file
called bmi2.png was automatically created and linked, so that the
graph would be inserted in the document when you produce the formatted
output.[fn:5] Every time you evaluate the code using C-c C-c, the
underlying image file containing the graph is overwritten by a new
file.
As with the tables, we shall add a caption and a name to it. We can
also use #+attr_latex to adjust the width (and correspondingly the
height, maintaining the aspect ratio).
#+name: my-bmi-graph
#+caption: Average BMI, by Country
#+attr_latex: :width 5in
#+RESULTS: mygraph-code
[[bmi2.png]]
You can now refer to this graph in the text using [[my-bmi-graph]].
You can run R code blocks in two ways.
One possibility is to send the block as a batch of commands to R and receive the results in the org-mode file. In this case, R runs each block independent of the others. The code is sent to R, evaluated independently, and results inserted in the org file. This is the default behaviour.
Another possibility is to start a named R session within emacs, which is persistent, and execute multiple code blocks in the same R session. In this case, output of one code block (say, some data objects that have been created or some libraries that have been called) is available to the next code block that is executed in the same session.
In fact, org allows you to run multiple R sessions simultaneously. If you are working on two documents side by side, and would like to keep the statistical work for the two separate, you can run them in two separate sessions.
If you want to have one R session for a particular org file, you can specify a named session (in this case, with the name `my-r-session`) for the whole file in the preamble as follows:
#+property: header-args:R :session my-r-session
You could also use `:session my-r-session` as a header argument for a particular code block to evaluate that code block in `my-r-session`. This means that you could also evaluate different code blocks from the same document in separate but persistent R sessions. Say, if your paper draws upon two different sets of data and analyses, you could process them in two separate R sessions (with different working directories, etc).
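As a small illustration of a persistent session, the sketch below uses the block-level `:session my-r-session` header argument described above. The object name x and its values are made up for the example.

```org
#+begin_src R :session my-r-session :results silent
# Objects created here stay alive in the session
x <- c(2, 4, 6)
#+end_src

#+begin_src R :session my-r-session :results value
# x is still available because this block runs in the same session
mean(x)
#+end_src
```

Without the `:session` argument, the second block would run in a fresh R process and would not find x.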
Starting with version 9.5, Org-mode has introduced a new system for embedding citations and creating bibliographies in org-mode documents. You must upgrade your org-mode to version 9.5 or later to be able to use this new citation syntax.
Users are advised to read the discussion on citation handling in the org-mode manual and a very useful summary article by Timothy.
If you are using my emacs configuration, you should have the latest
version of org-mode. My configuration of functions related to
citations and bibliography resides in _configs/use-oc.el. It is
loaded if sym-linked to _activated/use-oc.el.
The tasks related to citations and bibliography can be divided into two parts: (i) maintaining a bibliographic database and (ii) inserting citations in org-mode documents and processing them to create formatted citations and bibliography at the time of export.
The new syntax of citations divides the process of inserting and processing citations in org-mode into four components: insert, follow, activate and export. The core functionality is designed to allow development and use of different tools for each of these. John Kitchin and Bruce D’Arcus have built useful extensions on top of the core functionality.
The citation processors available for org-mode allow the use of bibliographic databases maintained in bibtex, biblatex or citeproc/json formats.
I use a biblatex database because biblatex provides highly customisable tools for formatting bibliographies in any style. Citeproc/json has become increasingly popular in recent years because this is the native format used by pandoc and other applications in the markdown ecosystem.
We shall use a master bibliographic database to contain bibliographic records for the literature that we cite. The database, in BibLaTeX or BibTeX format, is stored in a text file with a .bib extension.
In a BibTeX/BibLaTeX database, each bibliographic record is given a unique key, which is used to cite it. Each record is classified as one among various categories of publications (journal article, book, chapter, etc.), and for the given publication type, the record specifies values for various fields (author, title, volume, publisher, etc). BibLaTeX extends the BibTeX specification to cover a wider variety of publication types and fields. Given that, it is more versatile than BibTeX.
Bibliographic information in BibTeX/BibLaTeX format is available from many online sources, including journal/publisher websites, and Google Scholar. Many applications/tools allow you to search and download bibliographic records from these sources directly. Please note that you would often need to clean/correct the downloaded entries. And, when the bibliographic information in BibTeX/BibLaTeX format is not available from any existing database, you may have to enter the information yourself.
To start with, you may wish to use a GUI application like JabRef (cross-platform, http://jabref.sourceforge.net/) or BibDesk (OS-X only, http://bibdesk.sourceforge.net/) to build and maintain your database. In my experience, most of these tend to add extra junk to the entries to record that a particular application was used to create the database.
It is much cleaner, and more efficient, to use the excellent tools available for maintaining the database in emacs itself. Ebib gives a nice interface to manually enter the records. Bibretrieve and org-ref give you commands that can download BibTeX records directly from online databases.
Eventually, you should use bibretrieve from within Emacs to add entries to your database. org-ref.el provided by John Kitchin (https://github.com/jkitchin/jmax) also has some useful functions.
As a sample, my own bibliographic database is available from https://github.com/indianstatistics/bibliobase/blob/master/bibliobase.bib.
Using biblatex with Org requires some customisation of variables. These
are done in the _configs/use-oc.el and _configs/use-org-contrib.el
files. Both files are sym-linked to files with the same names in the
_activated directory.
Please note the configuration of ivy-bibtex in _configs/use-oc.el. You may want to modify the values of variables bibtex-completion-bibliography (default bibliographic database), bibtex-completion-notes-path (directory where you may optionally keep your notes for each source), bibtex-completion-library-path (the directory where you optionally keep pdf files for each source) and bibtex-completion-pdf-open-function (the application that should be used to open the pdfs).
(use-package ivy-bibtex
  :ensure t
  :init
  (setq bibtex-completion-bibliography '("~/bibliobase/bibliobase.bib")
        bibtex-completion-notes-path "~/bibliobase/notes/"
        bibtex-completion-notes-template-multiple-files "#+TITLE: Notes on: ${author-or-editor} (${year}): ${title}\n\nSee [cite/t:@${=key=}]\n"
        bibtex-completion-library-path '("~/pdfbibliobase/")
        bibtex-completion-additional-search-fields '(keywords)
        bibtex-completion-display-formats
        '((article . "${=has-pdf=:1}${=has-note=:1} ${author:36} ${year:4} ${title:*} ${journal:40}")
          (inbook . "${=has-pdf=:1}${=has-note=:1} ${author:36} ${year:4} ${title:*} Chapter ${chapter:32}")
          (book . "${=has-pdf=:1}${=has-note=:1} ${author:36} ${year:4} ${title:*}")
          (incollection . "${=has-pdf=:1}${=has-note=:1} ${author:36} ${year:4} ${title:*} ${booktitle:40}")
          (inproceedings . "${=has-pdf=:1}${=has-note=:1} ${author:36} ${year:4} ${title:*} ${booktitle:40}")
          (t . "${=has-pdf=:1}${=has-note=:1} ${author:36} ${year:4} ${title:*}")))
  (setq bibtex-completion-pdf-open-function
        (lambda (fpath)
          (call-process "evince" nil 0 nil fpath))))
The operative part in the _configs/use-org-contrib.el file is the following:
(setq org-latex-pdf-process
      '("xelatex -interaction nonstopmode -output-directory %o %f"
        "biber %b"
        "xelatex -interaction nonstopmode -output-directory %o %f"
        "xelatex -interaction nonstopmode -output-directory %o %f"))
Every time you export the document to pdf via latex, it runs xelatex, then runs Biber and then runs xelatex twice again. This is necessary to get the citations in the pdf file.
Three sets of things have to be specified for each document.
First, the following line in the header of the org file calls biblatex. You may want to modify the options used here.
#+LATEX_HEADER: \usepackage[hyperref=true,maxcitenames=3,doi=false,url=true, backend=biber,natbib=true, maxbibnames=99,uniquename=false,uniquelist=false, indexing=cite,sorting=nyt,mergedate=compact,innamebeforetitle=true,articlein=false]{biblatex}
Second, the following line in the header of the org file specifies the biblatex citation and bibliographic styles to be used.
#+cite_export: biblatex authoryear/authoryear-comp
Third, the following line specifies the bibliographic database to be used for this document.
#+bibliography: bibliobase/bibliobase.bib
Where I have the choice, I also customise the bibliography in the way I like. These customisations are in the file vikas-bibstyle.org in indianstatistics/bibliobase repository. This file can be included in the document as follows:
#+INCLUDE: PATH-TO-FILE/vikas-bibstyle.org
Let us first understand the general syntax of citations in org. To insert a citation at any point in the document, you need to use the following syntax in square brackets.
[cite/style/variant: common_prefix; prefix@key1 pageno suffix; prefix@key2 pageno suffix; common_suffix]
Let us understand this syntax. There are four parts in the syntax.
- cite/style/variant: :: First, you specify the citation style. Table citation-commands lists citation styles and variants available with the biblatex citation processor in org. You can use abbreviations provided in the parentheses rather than the full notation for the citation styles and variants. This part ends in a colon.
- common_prefix: :: You may optionally use some arbitrary text as prefix. This would be "see" in a citation that you want to export as (see Seema, 2010 and Nancy, 2011 as examples of this). This part ends in a semicolon.
- citations: :: Then you specify the sources that you want to cite. Each source that you want to cite at this point is separated by a semicolon. The source is cited using the citation key from your bibliographic database. So, if the citation key for Seema, 2010 in the database is seema2010, you would cite it as @seema2010. The citation key may be preceded by a source-specific prefix, and followed by a location identifier such as page number and citation-specific suffix. Prefix, pageno and suffix are optional.
- Finally, you may optionally use some arbitrary text as a common suffix. This would be "as examples of this" in the citation to be exported as (see Seema, 2010 and Nancy, 2011 as examples of this).
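A few citations written in this syntax may make the four parts easier to see. The key seema2010 is the one used in the explanation above, while nancy2011 is a hypothetical key assumed for the second source. The lines starting with # are Org comments.

```org
# Default style, a single source
[cite:@seema2010]
# Text (t) style with a page number as the location identifier
[cite/t:@seema2010 p. 56]
# Two sources with a common prefix and a common suffix
[cite:see;@seema2010;@nancy2011;as examples of this]
```

The last line corresponds to the (see Seema, 2010 and Nancy, 2011 as examples of this) example discussed above.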
Style (abbreviation) | Variant (abbreviation) | Biblatex command | Example output |
---|---|---|---|
author(a) | caps(c) | Citeauthor* | Seema et al. |
author(a) | full(f) | citeauthor | Seema, K and David, P |
author(a) | caps-full(cf) | Citeauthor | Seema, K and David, P |
author(a) | nil | citeauthor* | Seema et al. |
locators(l) | bare(b) | notecite | p. 56 |
locators(l) | caps(c) | Pnotecite | (P. 56) |
locators(l) | bare-caps(bc) | Notecite | P. 56 |
locators(l) | nil | pnotecite | (p. 56) |
noauthor(na) | bare(b) | cite* | 2010 |
noauthor(na) | nil | autocite* | (2010) |
nocite(n) | nil | nocite | No citation but included in bibliography |
text(t) | caps(c) | Textcite | Seema and David (2010) |
text(t) | nil | textcite | Seema and David (2010) |
nil | bare(b) | cite | (Seema and David, 2010) |
nil | caps(c) | Autocite | (Seema and David, 2010) |
nil | bare-caps(bc) | Cite | Seema and David, 2010 |
nil | nil | autocite | (Seema and David, 2010) |
Insert processors provide convenient shortcuts for inserting the
citations and this does not need to be done manually. I use the insert
processor from the org-ref-cite repository created by John
Kitchin. Although this insert processor is created for use with
bibtex/natbib, it works reasonably well for biblatex as well. C-c \
is bound to org-cite-insert. It would open your database in a
convenient form using Ivy completion framework, where you can type
keywords to select the source you want to cite, and press enter to
insert it. Please look at org-ref-cite for additional details on how
to modify citation styles, insert multiple citations, and insert
prefixes and suffixes.
org-ref-cite also provides other useful facilities. If you click on a citation, a menu opens which gives you several options, including going to the entry in the biblatex database (in case you want to edit it), looking at your notes about the source or opening the pdf file of the source document. org-ref-cite also colours the citations, and uses a different colour (red) if the citation key does not match any valid entry in the database. Please note that since the fontification is designed for bibtex/natbib, it may give some false alerts for a biblatex database.
To insert the bibliography, add the following line where you want it to appear (usually at the end of your paper, but before the Footnotes):
#+print_bibliography:
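Putting the document-level pieces together, a bare-bones Org file might look like the sketch below. The title, headings and sentence are invented for illustration, the key seema2010 is assumed to exist in the database, and the biblatex LaTeX header line shown earlier is left out to keep the sketch short.

```org
#+title: A Sample Paper
#+cite_export: biblatex authoryear/authoryear-comp
#+bibliography: bibliobase/bibliobase.bib

* Introduction
Yields differ widely across states [cite:@seema2010].

* References
#+print_bibliography:
```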
From Org, we can get a well-formatted document as a LaTeX, PDF, odt, docx or html file. To produce a formatted output, we shall use the built-in exporters provided with Org, and for some file types, use Pandoc for further conversion.
Built-in exporters can be called in Org using C-c C-e or M-x
org-export-dispatch.
There are many LaTeX packages that can be used to format tables. They provide different options for formatting tables, and using them involves different degrees of complexity. We consider four different libraries for creating tables.
The choice of LaTeX environment and other associated options are
specified in orgmode using lines starting with #+attr_latex. These
lines generate the required LaTeX commands on export. We will also use org
special blocks for advanced table layouts. #+attr_latex lines affect
only the object (table, image or a special block like begin_table or
begin_tablenotes) that follows immediately after. These must,
therefore, be placed immediately before the object they are supposed
to be applied to. Note that there should not be even a blank line
between #+attr_latex lines and the object.
Some key attributes that need to be specified using #+attr_latex
lines are as follows:
- :environment :: specifies the LaTeX environment, and hence the package, to be used for a particular table. The default, tabular, is the simplest of all but allows limited possibilities for formatting columns. We will use the tabulary, tabularx and tabularray packages for most of our tables. Other packages that you might want to look at are tabu, longtable and sidewaystable. However, the packages used in this guide will suffice for most needs.
- :align :: specifies how to render each column by using one letter (l, L, r, R, c, C or J) for each column. For example, in most packages, c stands for center aligned columns, l for left aligned columns and r for right aligned columns. The number of letters should exactly match the number of columns in your table. For example, for a four column table, you would specify something like :align lccr, which means that the first column is left aligned, the next two columns are center aligned and the last column is right aligned. A | anywhere in this string implies a vertical line. So in a table formatted with :align l|ccr, there would be a vertical line after the first column. When you have many columns, you can use expressions like *{3}{c} in place of ccc. This says, use column c three times. Normally, tables are slightly indented on the left and the right. To remove this indentation, you can add @{} before and after the alignment expression (for example, :align @{}l*{3}{c}r@{}).
- :width :: specifies the width that the table can take. It may be specified as \textwidth, implying full text width, as 0.8\textwidth, implying a fraction (0.8 here) of the text width, in centimeters (like, 10cm) or in inches (like, 5in). Note that different environments use the information on width differently. In particular, if the contents of your table columns do not require the width specified by you, some packages would only use the minimum width required while others would widen the table to the total width specified by you. The total width is also divided between different columns differently by various packages.
- :center :: This option followed by t or nil would center align or left align the table horizontally.
- :float :: This option specifies whether this object (table, figure or any other special block) is a floating object which can be optimally placed by LaTeX. In most cases, you would want these objects to float so that LaTeX can optimally place them.
- :booktabs :: This option followed by t gives us nice horizontal lines. In most cases, when using booktabs, one should not use vertical lines in the table (that is, :align should not have any |). Booktabs automatically inserts a top line (\toprule in LaTeX), a middle line just below the header row (\midrule) and a line at the bottom of the table (\bottomrule).
One limitation of Org is the lack of support for merging cells in a
table. However, Eric Schulte has created useful functions that can be
used to overcome this limitation while exporting to LaTeX or HTML. I
have extended those functions to also provide facilities for creating
horizontal lines (\midrule and \cmidrule). These functions
(org-export-midrule-filter-latex and org-export-cmidrule-filter-latex)
are already included in my emacs configuration (in
_configs/use-org-contrib.el).
The method of creating multicolumn (and multirow) cells in tabularray
is different from other tabular environments and will be discussed
separately when I discuss the tabularray environment. For all other
tabular environments discussed here, the content of any cell can be
preceded by a string such as <3colc> to make those contents span, and
be centred across, 3 columns to the right.
In addition, <mid> placed in a cell draws a \midrule at that
position across the entire width of the table. Other cells in that row
will be ignored.
A string such as <2cid4> can be used to say that a horizontal line be drawn from the second to the fourth column.
The table below illustrates the use of <2colc> to write text that spans two columns, <mid> to draw a horizontal line across the width of the table, and <2cid3> to draw a line spanning second and third columns.
A particular advantage of these functions is that they can be inserted in a table by the code block that is generating the tables.
#+name: tabulary-yield-out
#+caption: Table formatted using tabulary package
| State | <2colc>Two Column Text | | Variable4 |
| <2cid3> | | | |
| | Yield | Income | |
| <mid> | | | |
| Madhya Pradesh | 669 | 13121.25 | 123 |
| Haryana | 300 | 2532.30 | 22 |
| Punjab | 260 | 35232.45 | 324 |
Package tabulary extends the tabular environment by providing
four additional column types (L, R, C and J). It is relatively easy to
integrate this package into orgmode.
Table tabulary-column-types shows the different types of columns available
in the tabulary package.
Type | Description |
---|---|
l | Left aligned, no wrapping |
L | Left aligned with wrapping |
r | Right aligned, no wrapping |
R | Right aligned with wrapping |
c | Centre aligned, no wrapping |
C | Centre aligned with wrapping |
J | Justified and wrapped |
Load tabulary and booktabs using the following line in the preamble.
#+LATEX_HEADER: \usepackage{tabulary,booktabs}
The following #+attr_latex lines illustrate the way of declaring that
the table should be constructed using tabulary and passing various
options to it:
#+attr_latex: :environment tabulary :width 0.8\textwidth :align @{}L|llR@{}
#+attr_latex: :center :booktabs t
The following code uses tabulary to create a table. The output of the code is shown in Table tabulary-yield-out.
#+name: tabulary-yield-out
#+caption: Table formatted using tabulary package
#+ATTR_LATEX: :environment tabulary :width \textwidth :align @{}lRR@{}
#+ATTR_LATEX: :center t :booktabs t :float t
| State | Average yield | Average income |
|----------------+---------------+----------------|
| Madhya Pradesh | 669 | 13121.25 |
| Haryana | 300 | 2532.30 |
| Punjab | 260 | 35232.45 |
State | Average yield | Average income |
---|---|---|
Madhya Pradesh | 669 | 13121.25 |
Haryana | 300 | 2532.30 |
Punjab | 260 | 35232.45 |
Package tabularx provides a flexible tabular environment. In addition, siunitx provides an additional column type S which can be used to align numbers correctly on the decimal point. Packages tabularx, siunitx and booktabs are included in the Memoir class, and do not need to be called if you use the vmemoir or varticle LaTeX classes defined in my emacs configuration. If you are using article (which may be the default) or any other class, you may need to add the following line in the preamble of your org file.
#+LATEX_HEADER: \usepackage{tabularx,booktabs}
Following lines need to be added to the header of the org file to set the alignment of numeric columns correctly (using siunitx LaTeX package) and to properly align headings of table columns.
#+LATEX_HEADER: \usepackage[add-decimal-zero = true,add-integer-zero = true, round-integer-to-decimal,round-mode = places, round-precision=1]{siunitx}
#+LATEX_HEADER: \newcolumntype{C}{>{\centering\arraybackslash}X}
#+MACRO: M @@latex:\multicolumn{1}{C}{$1}@@
With this, you are set to create nicely formatted tables in LaTeX/PDF files.
The following code uses tabularx and siunitx to create a table. The output of the code is shown in Table tabularx-yield-out.
In the :align specification for tabularx tables (with siunitx),
you can use the following types of columns:
- l specifies a left aligned column with no wrapping.
- r specifies a right aligned column with no wrapping.
- X specifies a left aligned column with wrapping.
- S specifies a numeric column aligned to the decimal point.
  - Use option table-format=n.m to specify the maximum n digits and m decimals allowed in the column. Numbers will be rounded off to m decimals.
  - Use option round-mode=off to turn the rounding-mode off (since we turned it on while loading siunitx). This is needed in columns that consist only of integers.
You are advised to look at the tabularx and siunitx manuals for the many options that they provide.[fn:1]
For the S columns, the column headings should be wrapped in macro M
as shown in the example below to ensure that these are properly
aligned in the center of the columns.
#+name: tabularx-yield-out
#+caption: Table formatted using tabularx and siunitx packages
#+ATTR_LATEX: :environment tabularx :width 0.8\textwidth
#+ATTR_LATEX: :align @{}l{S[table-format=3.0, round-mode=off]}{S[table-format=5.2]}@{}
#+ATTR_LATEX: :center t :booktabs t :float t
| State | {{{M(Average yield)}}} | {{{M(Average income)}}} |
|----------------+------------------------+-------------------------|
| Madhya Pradesh | 669 | 13121.25 |
| Haryana | 300 | 2532.30 |
| Punjab | 260 | 35232.45 |
State | {{{M(Average yield)}}} | {{{M(Average income)}}} |
---|---|---|
Madhya Pradesh | 669 | 13121.25 |
Haryana | 300 | 2532.30 |
Punjab | 260 | 35232.45 |
The LaTeX package threeparttable is used for including notes below the
table. To use threeparttable, you need to load the package. In
addition, it is a good idea to include the following special line for
better formatting of notes below the table:
#+LATEX_HEADER: \renewcommand{\TPTminimum}{\linewidth}
The following code produces a table (Table threeparttable-table-yield) with notes below.
#+name: threeparttable-table-yield
#+caption: Table created using tabularx, siunitx and threeparttable
#+begin_table
#+ATTR_LATEX: :float t :options [hb]
#+begin_threeparttable
#+ATTR_LATEX: :environment tabularx :width 0.8\textwidth
#+ATTR_LATEX: :align @{}l{S[table-format=3.0]}{S[table-format=5.2]}@{}
#+ATTR_LATEX: :booktabs t
| State | {{{M(Average yield)}}} | {{{M(Average income)}}} |
|----------------+------------------------+-------------------------|
| Madhya Pradesh | 669 | 13121.25 |
| Haryana^{a} | 300 | 2532.30 |
| Punjab | 260 | 35232.45 |
#+attr_latex: :options [flushleft]
#+begin_tablenotes
#+begin_footnotesize
+ Notes: :: \mbox{}
1. This table is very nice. This note is very long. But the long
note wraps nicely under the table.
2. This is the second note. But this is not very wide.
+ We can use bullets.
+ a :: Or use description lists to refer to footnote markers in
the table
+ Source: :: https://www.indianstatistics.org
#+end_footnotesize
#+end_tablenotes
#+end_threeparttable
#+end_table
Org lists can be used to format the notes properly. \mbox{} is used
to leave white space in the given example since the description list
items Notes: and Source: do not have any description on the same
line.
I have used footnotesize to render the notes in a slightly smaller font. Option [flushleft] is used to align the notes to the left.
tabularray is a new package that is very versatile and provides
flexible ways of creating multi-row and multi-column cells, adding
table notes and using colours in the tables. You can also create
multi-page tables with tabularray.
You are strongly advised to go through the manual of tabularray package to understand various options. The focus here is to illustrate how to use the package in org documents.
There are three environments provided by tabularray:
- tblr :: Basic tabularray environment with excellent support for merging cells across columns and rows.
- talltblr :: In addition to the features of tblr, this also allows for notes below the tables.
- longtblr :: For tables running across multiple pages. All the features of tblr and talltblr are available here.
To use siunitx, booktabs and diagbox along with tabularray,
you need to add these lines to the preamble of the orgmode file.
#+LATEX_HEADER: \usepackage{comment,multirow,tabularray}
#+latex_header: \UseTblrLibrary{booktabs}
#+latex_header: \UseTblrLibrary{siunitx}
#+latex_header: \UseTblrLibrary{diagbox}
A few additional hacks are required for using these environments in orgmode.
- :align :: Both the table width and column specifications will be provided as value to the :align keyword. Note that a separate :width keyword does not work with tabularray.
- Macro to center align column headings for siunitx columns :: For siunitx (S) columns, we would like to align column headings differently from alignment of cells containing numbers. For this, column headings must be wrapped in triple curly braces. However, org uses triple curly braces to trigger macros. To deal with this, we define a macro (mc below) which just replaces itself with the triple curly braces. So, column headings will be wrapped in the mc macro as {{{mc(colheading)}}}, and on export, this would create {{{colheading}}}. We will add this macro specification to the header.
#+MACRO: mc @@latex:{{{$1}}}@@
You can also optionally create macros such as this to conveniently add colour/highlights to specific cells:
#+MACRO: HG @@latex:\SetCell{bg=gray9} $1@@
- Some of the options, such as specification of footnotes, are provided in tabularray as an optional argument in square braces when invoking the tabularray environments. Juan Manuel Macias recently provided an elisp function that allows for the use of an :options keyword to specify such arguments. This is included in _configs/use-org-contrib.el in my emacs configuration.
- \cmidrules :: The LaTeX syntax for \cmidrules in tabularray is slightly different from other tabular environments. Just as we use <2cid3> to declare that we want a line going from second to third column, we can use <2cd3> with the tabularray environment. This feature is provided by the function org-export-tabularray-cmidrule-filter-latex defined in _configs/use-org-contrib.el.
#+name: tabularrray-yield-out
#+caption: Table formatted using tabularray and siunitx packages
#+ATTR_LATEX: :environment talltblr
#+ATTR_LATEX: :align width=0.8\textwidth,colspec={lS[table-format=3.0,round-precision=0]S[table-format=5.2,round-mode=places,round-precision=true,alignment-mode=marker]S[table-format=3.0,round-precision=0]}
#+ATTR_LaTeX: :options remark{Notes} = {This note goes below the table},
#+ATTR_LaTeX: :options remark{Source} = {Second note goes below the first note}
#+ATTR_LATEX: :center :booktabs t :font \small
| State | <2colc>Two Column Text | | {{{mc(Variable4)}}} |
| <2cd3> | | | |
| | {{{mc(Average yield)}}} | {{{mc(Average income)}}} | |
| <mid> | | | |
| Madhya Pradesh | 669 | 13121.25 | 123 |
| Haryana | 300 | 2532.30 | 22 |
| Punjab | 260 | 35232.45 | 324 |
State | <2colc>Two Column Text | | {{{M(Variable4)}}} |
<2cd3> | | | |
 | {{{M(Average yield)}}} | {{{M(Average income)}}} | |
<mid> | | | |
Madhya Pradesh | 669 | 13121.25 | 123 |
Haryana | 300 | 2532.30 | 22 |
Punjab | 260 | 35232.45 | 324 |
The file default_packages.org lists a set of LaTeX packages that I normally use. Please modify this as you please, save it in a convenient location, and call it using a line of the following kind.
#+INCLUDE: "path/to/default_packages.org"
For producing LaTeX and/or PDF files, use C-c C-e to call the Org export dispatcher.
- Press l to select LaTeX, and then choose one of the following options.
- Press l again, if you just want to create a LaTeX file.
- Press p, if you want to create a pdf file. This will first create a LaTeX file, and then run the commands set in org-latex-pdf-process (xelatex and Biber in the configuration shown earlier) to create a pdf file.
- Press o, if you want to create pdf and have it opened in the default pdf viewing application.
There is a built-in odt exporter in Org. While it works well for most situations, there are two components of the setup proposed here that it does not support. It does not support biblatex, and it does not support the LaTeX-specific solution we have for Notes under Tables and Images.[fn:6]
Fortunately, Pandoc provides an excellent solution for converting LaTeX output to odt or docx documents. Pandoc supports all the LaTeX syntax that Org produces from our files, and you can get a very well formatted output.
Use C-c C-e l l to create a LaTeX file. Then, from the terminal, use
Pandoc as follows to create an odt or a docx file. With Pandoc 2.11 or later, use --citeproc in place of --filter pandoc-citeproc.
pandoc --bibliography=biblidatabase.bib --filter pandoc-citeproc \
  latexfile.tex -o outputfile.odt
pandoc --bibliography=biblidatabase.bib --filter pandoc-citeproc \
  latexfile.tex -o outputfile.docx
If you want, you can use --reference-doc (--reference-odt or --reference-docx in Pandoc versions before 2.0) to specify a reference odt or docx file, so that the fonts and other formatting attributes are to your liking.
For html as well, there is a built-in exporter in Org. The built-in exporter is very good, and the way to go if you are planning to maintain a website using Org (as I do for http://www.indianstatistics.org).
The built-in exporter can support BibTeX citations using ox-bibtex.el, which is included in Org, and will be loaded if you have installed research-toolkit.org. You may need to install bibtex2html separately to make it work.
However, ox-bibtex.el uses bibtex2html for converting citations and bibliography to html, and bibtex2html provides only limited support for citation and bibliography styles.
If you want full support for bibliography and citation styles, as well as for other LaTeX components like Table notes explained in this document, you can use Pandoc for converting LaTeX to html.
This section points to some additional solutions that you may like to use. Some of these may come in handy when you start using Org for documenting your research.
If you are using my emacs configuration, you will have access to three custom classes: vreport, vmemoir and varticle. Of these, vmemoir and varticle are based on Memoir, a very versatile LaTeX class that allows for many customisations.
While the vmemoir and varticle classes should work for books and articles, you will benefit by looking at ~/.emacs.d/_configs/use-org-contrib.el and customising them for your needs.
The LaTeX class is specified in the org header using a line as follows.
#+LATEX_CLASS: vmemoir
Additional options can be provided by using the keyword latex_class_options:
#+latex_class_options: [11pt,twoside,openany,strict,extrafontsizes,article]
Those using my emacs configuration can call the snippet latex-class, taken from Scimax, using keyword lc to insert this line. The snippet shows LaTeX classes that are defined in your installation of org. Additional options for any latex class can be specified by calling another snippet using lco.
By default, Org evaluates source code at the time of exporting. If your code involves a lot of computation, this can slow down exporting.
You can block evaluation of a source code block at the time of export by using :eval never-export in the header arguments of the block. Such code blocks will have to be evaluated manually using C-c C-c. To prevent all blocks from being evaluated, set it buffer-wide using:
#+PROPERTY: header-args :eval never-export
If your buffer has this line, the source code is not evaluated at the time of export, and whatever already exists in the #+RESULTS block is exported.
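As a minimal sketch of the per-block form, the block below carries :eval never-export in its own header, so exporting leaves it untouched and reuses whatever is stored under #+RESULTS. The block name yield-summary and the numbers are hypothetical, chosen only for illustration.

```org
#+NAME: yield-summary
#+BEGIN_SRC R :results output :exports results :eval never-export
  # Hypothetical stand-in for a slow computation.
  # Run it by hand with C-c C-c; export will not re-run it.
  yield <- c(669, 300, 260)
  mean(yield)
#+END_SRC
```

After adding or changing a #+PROPERTY line, press C-c C-c on that line (or reopen the file) so that Org picks up the new setting.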
I like to use the Garamond font. If you do too, add this special line at the top:
#+LaTeX_CLASS_OPTIONS: [garamond]
In LaTeX, the geometry package allows you to modify page margins. The following line in research-toolkit.org sets the margins. You can tweak this to define the margins as you like.
("innermargin=1.5in,outermargin=1.25in,vmargin=1.25in" "geometry" t)
If you would like to do it for each document separately, remove the above line, and add the following special line at the top in your documents.
#+LaTeX_HEADER: \usepackage[innermargin=1.5in,outermargin=1.25in,vmargin=3cm]{geometry}
To change the line spacing, use the following line at the top. Modify the number to whatever suits you.
#+LATEX_HEADER: \linespread{1.3}
When writing a research paper, it is common to put acknowledgements in a special footnote to names of authors. It is conventional to use * as the symbol for this footnote, and to keep this footnote out of the list of numbered footnotes that the paper may have.
This is achieved as follows.
- As illustrated in the example below, add acknowledgements in the special line that specifies authors of the paper.
#+AUTHOR: Vikas Rawal\footnote{Write your acknowledgements here...}
- Then, before your first headline, add the following text.
{% begin group
\renewcommand{\thefootnote}{\fnsymbol{footnote}}% set symbols
\setcounter{footnote}{0}% set footnote counter back to 0
}% end group
LaTeX has a very sophisticated algorithm for determining the location of Tables and Images in a document. If, however, you want to add a restriction that the Tables and Images should not cross section boundaries, or a particular boundary, this can be done using the command \FloatBarrier provided by the placeins package in LaTeX.
You can put any number of \FloatBarrier commands, each in a line by itself, in the document. Tables and Images before such a barrier will be placed before the barrier.
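The placement of a barrier can be sketched as follows. This is only an illustration: the headlines, caption and table are hypothetical, and the #+LATEX_HEADER line assumes that placeins is not already loaded by your configuration.

```org
#+LATEX_HEADER: \usepackage{placeins}

* Results
#+CAPTION: Average yield by state (hypothetical data)
| State          | Yield |
|----------------+-------|
| Madhya Pradesh |   669 |
| Haryana        |   300 |

\FloatBarrier

* Discussion
Text here starts only after the table above has been placed.
```

If you want the command to be passed through only when exporting to LaTeX, writing it as #+LATEX: \FloatBarrier should have the same effect.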
You can use the following special line at the top to restrict all Tables and Images within their own sections.
#+LATEX_HEADER: \usepackage[section]{placeins}
An extension to the placeins package, extraplaceins, can be used if you want to restrict the Tables and Images within subsections.[fn:7]
I like to use authoryear bibliography style. However, I need some customisations. The file vikas-bibstyle.org contains all my customisations.
Download the file, adjust the path to the file in the line below and add it to your org file.
#+INCLUDE: /path-to-the-file/vikas-bibstyle.org
- Org-mode manual
- Worg
- Org-mode mailing list
- Emacs manual
- R website
- Pandoc
- E. Schulte, D. Davison, T. Dye, and C. Dominik. A multi-language computing environment for literate programming and reproducible research. Journal of Statistical Software, 46(3):1–24, 2012. http://www.jstatsoft.org/v46/i03
- Tutorial: Writing scientific papers for ACPD using emacs org-mode, http://draketo.de/english/emacs/writing-papers-in-org-mode-acpd
- Writing papers Using org-mode, http://nakkaya.com/2010/09/07/writing-papers-using-org-mode
[fn:1] These are available at http://mirrors.ctan.org/macros/latex/required/tools/tabularx.pdf and http://mirrors.ctan.org/macros/latex/contrib/siunitx/siunitx.pdf.
[fn:3] Depending on the keyboard and the default configuration of the flavour of emacs you have installed, Meta may instead be mapped to a different key (for example, the Windows key, or the Option or Command key on Apple computers).
[fn:4] For libraries and functions that you need to call, it is even better to include them in a .Rprofile file in your working directory. These libraries and functions would then be called when R is started, and not each time you evaluate code blocks in your document.
[fn:5] Of various image formats, I find that png files are most versatile. png files support transparency, and are rendered well both on the web and in print. You can also specify jpeg or pdf files. pdf files for images work very well if you are only going to produce a pdf document.
[fn:6] The author of the odt exporter has chosen to develop the exporter outside Org-mode. He has developed a JabRef exporter to integrate citations into odt exports, but that is not a part of Org-mode and needs to be installed separately. In any case, since our toolkit primarily uses LaTeX, using Pandoc to create odt or docx files from LaTeX export works better.