---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
library(doconv)
```
# doconv
<img src="man/figures/logo.png" alt="doconv logo" align="right" /> The package offers a set of functions for converting 'Microsoft Word' or
'Microsoft PowerPoint' documents to 'PDF' format, and for converting
them to thumbnail images.

The package relies on 'Microsoft Word' and 'Microsoft PowerPoint';
if they are not available, 'LibreOffice' can be used instead. A function is also
provided to update all fields and tables of contents of a Word document using
'Microsoft Word'.

These features are also used to provide functions for visual testing of documents;
the formats 'doc', 'docx', 'ppt', 'pptx', 'html', 'pdf' and 'png' are supported. The
functions can be used with the packages 'testthat' and 'tinytest'.
<!-- badges: start -->
[![R build status](https://github.com/ardata-fr/doconv/workflows/R-CMD-check/badge.svg)](https://github.com/ardata-fr/doconv/actions)
<!-- badges: end -->
## Installation
You can install the latest version from GitHub with:
``` r
# install.packages("devtools")
devtools::install_github("ardata-fr/doconv")
```
## Example
```{r}
library(doconv)
```
## Generate thumbnails from a file
You can generate thumbnails as an image by using `to_miniature()`:
```{r}
docx_file <- system.file(package = "doconv", "doc-examples/example.docx")
to_miniature(
  filename = docx_file,
  row = c(1, 1, 2, 2))
```
It uses 'Microsoft Word' or 'Microsoft PowerPoint' to convert Word or
PowerPoint documents to PDF. If they are not available, it uses 'LibreOffice'
instead.
This package can therefore only be used when 'Microsoft Word' and
'Microsoft PowerPoint', or failing that 'LibreOffice', are installed.
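
The fallback order described above can be sketched in plain R. This is only an illustration: `pick_converter()` is a hypothetical helper, not a doconv function, and the two logical flags are assumed to come from your own checks of which software is installed.

``` r
# Hypothetical helper (not part of doconv): returns the name of the
# converter that would be preferred, following the order described above.
pick_converter <- function(has_msoffice, has_libreoffice) {
  if (isTRUE(has_msoffice)) {
    "msoffice"      # preferred: rendering is done by Word or PowerPoint
  } else if (isTRUE(has_libreoffice)) {
    "libreoffice"   # fallback when Microsoft Office is not available
  } else {
    stop("Neither 'Microsoft Office' nor 'LibreOffice' is available.")
  }
}

pick_converter(has_msoffice = FALSE, has_libreoffice = TRUE)
#> [1] "libreoffice"
```

When neither program is found, the sketch stops with an error, which mirrors the requirement that at least one of them be installed.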
## Convert a PowerPoint file to PDF
```{r}
# Use a separate variable so that `docx_file` keeps pointing to the Word example
pptx_file <- system.file(package = "doconv", "doc-examples/example.pptx")
to_pdf(pptx_file, output = "pptx_example.pdf")
to_miniature("pptx_example.pdf", width = 1000)
```
## Convert a Word file to PDF
```{r}
to_pdf(docx_file, output = "docx_example.pdf")
```
```{r include=FALSE}
unlink(c("docx_example.pdf", "pptx_example.pdf") )
```
## Update Word fields and TOC
```{r eval=FALSE}
library(officer)
library(doconv)
read_docx() |>
  body_add_fpar(
    value = fpar(
      run_word_field("DOCPROPERTY \"coco\" \\* MERGEFORMAT"))) |>
  set_doc_properties(coco = "test") |>
  print(target = "output.docx") |>
  docx_update()
```
## Setup
If 'Microsoft Word' and 'Microsoft PowerPoint' are not available on your
machine, install them if possible.

If 'Microsoft Word' and 'Microsoft PowerPoint' cannot be installed,
install 'LibreOffice' on your machine; please visit
https://www.libreoffice.org/ and follow the installation instructions.

Use the function `check_libreoffice_export()` to check that the software
is installed and can export to PDF:
```{r}
check_libreoffice_export()
```
If 'Microsoft Word' or 'Microsoft PowerPoint' is available on your machine,
you can get images or PDF files that look exactly the same as the original document.
If not, 'LibreOffice' is used to convert documents
to PDF or to images; in this case, be aware that 'LibreOffice' does not
always render the document as 'Microsoft Word' would (sections can be
misinterpreted, for example).
### Authorization on macOS
If you are running R on macOS, you have to authorize a few things before
starting.

PDF processing takes place in a working
directory managed with the function `working_directory()`.

Manual intervention is necessary to authorize the 'Word' and
'PowerPoint' applications to write to a single directory: the working directory.
These permissions must be set manually, as required by the macOS security
policy. We think that this is not a problem because it is unlikely that you will
use a Mac as a server.

The user must manually click a few buttons to:

1. allow R to run 'AppleScript' scripts that will control Word,
2. allow Word to write to the working directory.
![](man/figures/authorizations.png)
Don't worry, these are one-time operations.
## Related work
* The package [docxtractr](https://CRAN.R-project.org/package=docxtractr) provides
`convert_to_pdf()`, which works very well. The functionality integrated in Bob Rudis'
package depends only on 'LibreOffice'.
```{r include=FALSE}
# minimage::compress_images("man/figures", "man/figures", overwrite = TRUE)
```