In a development branch, the 'mods-dir' directory contains a file for
each git branch in the pqR repository, documenting what that branch is
for. These files will ultimately be combined in a release branch to
produce the MODS file in the main source directory.
Development branches start with branch 00, which is identical to the
version of R released by the R Core Team on which pqR is based
(currently R-2.15.0), except for the 00 and README files in mods-dir.
Later development branches have the form DD-desc, where DD is a
two-digit number, and desc is a descriptive string. All branches
starting with DD are merged into the branch with number DD+1, which
does not contain any modifications of its own, except perhaps for
updates to the version number/date in VERSION/NEWS.Rd, or to this
README file, but which forms the basis for the development branches
starting with DD+1.
Release branches have names of the form Release-YYYY-MM-DD (and are
also given tag pqR-YYYY-MM-DD). They are based on some merged
development branch (ie, with a name of the form DD). Note that it may
be a good idea to have the development branch include the effects on
the source directory of a run of "make update-po" in the "po"
subdirectory of a build directory (which updates source references for
translations).
After the new release branch is created, the following should be done:
1) Remove the file ".gitignore" using "git rm".
2) Update the release date, which is stored in the (now mis-named)
file "SVN-REVISION". This should match the date in "NEWS.Rd",
which would have been set in a development (or merge) branch.
(The date in the "SVN-REVISION" file of a development branch
should have approximately the correct year, but have month and day
set to "00".)
3) Create the file "configure" by running create-configure. Add it
to the git repository with "git add".
4) Run configure and make in some build directory, with "C" locale.
5) Create the files "NEWS", "NEWS.pdf", "doc/html/NEWS.html", and
"doc/html/R-admin.html" from source files, by copying them from
the build directory to the source directory (R-admin.html from
doc/manual). Add them to the git repository.
6) Concatenate all files in the "mods-dir" directory into the file
"MODS", as follows:
(cd mods-dir; cat README; for i in [0-9]*; \
do echo " "; echo " "; echo ${i}:; echo " "; cat $i; done) >MODS
Add it to the git repository, then delete the "mods-dir" directory
with "git rm -r".
7) Commit the above changes to create the final version for a release.
Give this commit the pqR-YYYY-MM-DD tag.
The development branches that go into a release may be changed after
the release, with the new versions forming the basis for a later
release, but without the commits that produce these new versions being
merged into the earlier release. Once a release branch is created, it
is modified only in exceptional circumstances (such as discovery that
a file is missing).
The "master" branch is used only for adding versions of R released by
the R Core Team, starting with R-2.11.1, for each of which there is a
tag.
00:
The R release pqR is based on, unchanged except for adding the README
in mods, and the 00 file within it, containing this text.
00-admin:
Updates R-admin.texi to minimally say where pqR can be obtained.
00-copyright:
This mod updates the copyright notices at the start of many
source files.
00-future:
Replaced the section of "future directions" in R-ints.texi with a stub
for pqR, with no content at the moment.
00-news:
Move NEWS to ONEWS, ONEWS to OONEWS, OONEWS to OOONEWS. Create new
NEWS.Rd with initial content for pqR.
00-version:
Changes to display pqR version information, including separating the
pqR version from the version of R that pqR is based on. Also updates
some copyright notices, and changes bug reporting address.
01-Rprofmem:
New version of function Rprofmem (and related Rprofmemt) implemented.
See NEWS entry.
01-cleanup:
Miscellaneous code cleanups, including the following:
o Cleaned up inconsistencies in checks for arity of primitives in eval.c
Removed checks from functions implementing language features "repeat",
"while", and "function", consistent with other language features (lack
of checks does not cause a crash --- missing arguments just appear
to be NULL). Changed the check in do_set to the standard form
using checkArity, with names.c changed to make the arity be 2 rather
than -1.
o The "spare" bit in sxpinfo is renamed to "misc", and the documentation
in the code and in R-ints.texi is changed to reflect this, and to
document that this bit is actually used.
o The documentation before do_seq in seq.c is changed to be correct
(seq.int is no longer SPECIAL), and the incorrect reference to
seq.int in R-ints.texi is removed.
o The fixup_NaRm function defined in summary.c is moved to match.c,
where it belongs. It is now properly declared in Rinternals.h,
rather than the definition in summary.c being surreptitiously
referenced as an extern from logic.c.
o Defined an isRaw macro globally for consistency with other such macros,
deleting several local definitions of this.
o Fixed problem with recompilation with byte compilation not enabled.
o Fixed (by a kludge, not a proper fix) a bug in the "tre" package that
shows up when WCHAR_MAX doesn't fit in an "int". The kludge reduces
WCHAR_MAX to fit, but really the "int" variables ought to be bigger.
(This problem shows up on a Raspberry Pi running Raspbian.) Also fixed
a going-past-end-of array bug (that probably never happened).
o Code in R-2.15.0 exists for maintaining a cache of primitive objects,
but this code forgets to ever actually enter a primitive into the cache.
This is now done (in mkPRIMSXP).
o Updated R-admin to discuss derived but distributed files (eg configure).
See also entries in NEWS for other mods that merit description there.
01-inspect:
The "inspect" .Internal function was changed to show some details of
pairlist nodes, if SHOW_PAIRLIST_NODES is defined as 1 in inspect.c.
It also shows length and truelength (the hash) for CHARSXP nodes.
Optionally displays details of promises, and now handles R_UnboundSymbol
correctly.
Finally, it no longer produces output with tabs (spaces instead).
01-omp:
Rinternals.h now has a #define for R_OMP_FIRSTPRIVATE_VARS, which
contains a comma-separated list of variables that should usually be
included in the firstprivate part of an OMP parallel construction,
since they are used in macros such as NA_REAL.
The only use of this in the interpreter itself gets deleted by a later
mod, but this mod is retained anyway...
01-pnamedcnt:
New pnamedcnt primitive function for printing named field of object.
See NEWS entry for details.
01-testing:
Eliminated non-deterministic aspect of testing of random number
generators. Also prints more details on failure.
See NEWS entry for details.
02-cons-with-tag:
This mod defines a cons_with_tag function that creates a CONS cell
given its CAR, CDR, and TAG fields. This makes for clearer, more
concise, and faster code when CONS cells with TAG set need to be created.
02-copy-elements:
Defines several functions for use within the R interpreter.
New functions copy_string_elements and copy_vector_elements are now
defined. Compared to using SET_STRING_ELT and SET_VECTOR_ELT, these
allow copying of multiple elements without error checks on every
element, and sometimes without old-to-new checks on every element.
A new function copy_elements is also defined, for copying elements in
any sort of vector (duplicating non-atomic elements, and using
copy_string_elements).
A new function set_elements_to_NA_or_NULL is defined for doing that.
See also the NEWS item on a bug fix.
02-epmatch:
New functions are defined for finding exact or partial matches, to
replace the existing pmatch, psmatch, and other matching functions
(but pmatch and psmatch are retained, in case anyone uses them).
These functions return 0, -1, or +1 for no match, partial match, and
exact match, allowing any subset of these conditions to be easily
checked for in one comparison. There is therefore no need for an
"exact" argument as in psmatch. The new functions should usually be
faster as well (the old psmatch uses strcmp for exact matches, which
might be able to use special machine instructions if they exist, but
since most calls will be for short strings, and early failure to match
is likely the most common result, this is unlikely to provide a
benefit once extra procedure call overhead is accounted for.)
Versions are provided for matching a string to a string, an SEXP to an
SEXP, or a string to an SEXP, called ep_match_strings, ep_match_exprs,
and ep_match_string_expr.
Calls of pmatch are replaced by calls of ep_match_exprs in various
places as part of this mod, and these functions are also used in later
mods (eg, in matchArgs and in do_subset3).
Speedup from using these functions is mentioned in a NEWS entry.
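The three-way return convention can be illustrated with a minimal sketch
of a string-to-string matcher. This is not the pqR source: the name
ep_match_sketch is hypothetical, and treating an empty tag as a partial
match of any non-empty name is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch, not the pqR source: compare a tag against a
   formal name.  Returns +1 for an exact match, -1 if the tag is a
   proper prefix of the formal name (partial match), 0 for no match. */
int ep_match_sketch(const char *formal, const char *tag)
{
    assert(formal != NULL && tag != NULL);
    while (*tag != '\0') {
        if (*formal != *tag)
            return 0;           /* early failure, the common case */
        formal++;
        tag++;
    }
    return *formal == '\0' ? 1 : -1;
}
```

With this convention, a caller that accepts exact or partial matches
tests for a result != 0, one that needs an exact match tests for > 0,
and one that wants only partial matches tests for < 0.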
02-evalv-prim:
Two changes to how expressions are evaluated.
First, a facility has been introduced for an expression to be
evaluated in a context in which a "variant result" is allowed - eg, if
the result will be ignored anyway (expression is evaluated only for
side effects), a null result might be allowed. This is done by
introducing an "evalv" function that is like "eval" but with an extra
parameter saying what variant results are permissible. This facility
is used for later modifications, with some symbols defined here in
anticipation of these modifications.
Second, calling of primitive functions has been speeded up by copying
relevant information (eg, arity) from the table defining primitives
(in names.c) to fields in the SEXP for the primitive. This saves
table access computations and also division and remainder operations
to get at the information in the "eval" field in names.c, which is
encoded as decimal digits.
A procedure SET_PRIMFUN in memory.c was surreptitiously changing the
function pointer for a primitive via the function pointer access
macro, PRIMFUN. A SET_PRIMFUN macro now does this properly.
The types used for pointers for C functions implementing primitives
have been made safer, taking account of C99's specification that one
can convert between all types that are pointers to functions without
loss of information, but not necessarily between a pointer to a
function and a pointer to void.
Code in saveload.c (for loading old workspaces?) creates a primitive
directly, bypassing the mkPRIMSXP procedure. This seems unwise, since
creation via mkPRIMSXP is apparently needed to ensure protection of
primitives. Whatever is going on there should not be affected by this
modification, however.
R-ints.texi has been updated to document evalv and variants.
02-promiseWith:
Two related changes.
Created a promiseArgsWithValues function that calls promiseArgs and
then sets the values of the promises created, and a promiseArgsWith1Value
function that does the same except setting only the value for the
first promise. Code to do these things appears in several places, so
creating these functions cleans things up (and is needed for later
mods).
The promiseArgsWithValues and promiseArgsWith1Value functions are not
entirely equivalent to the previous code, which set the values of what
it took to be promises without checking that they actually were
promises. Since promiseArgs doesn't always create a promise for every
argument (it doesn't when the argument is R_MissingArg), this doesn't
seem safe, though there seem to be no examples where a bug actually
arises. The promiseArgsWithValues and promiseArgsWith1Value silently
skip setting the value for arguments that aren't promises, as will be
necessary when missing arguments do arise.
Also, a problem is fixed with the DispatchOrEval function in eval.c.
Without this fix, some subtle things go wrong with existing features
in 2.15.0, and more serious things go wrong with some later pqR mods.
The issue is that if DispatchOrEval is called with argsevald set to 1
(which indicates that arguments have already been evaluated), if
DispatchOrEval dispatches to a method for an object, it passes on
these argument values without putting them in promises along with the
unevaluated arguments. Because of this, a method that attempts to
deparse an argument will not work correctly. It seems possible that
there might also be some other bad effects of not having these
promises.
Here is an illustration:
> a <- 0
> class(a) <- "fred"
> seq.fred <- function (x, y) deparse(substitute(y))
> seq(a,1+2)
[1] "1 + 2"
> seq.int(a,1+2)
[1] "3"
Both "seq" and "seq.int" dispatch to seq.fred, but seq.int calls
DispatchOrEval, which doesn't pass on a promise with the unevaluated
argument. After the fix, seq.int does the same as seq. This example
is now tested in tests/eval-etc.R.
Also fixed some formatting in DispatchOrEval, and improved the
documentation for R_possible_dispatch to explain its features used in
this fix.
03-ISNAN:
If ENABLE_ISNAN_TRICK is defined when pqR is configured (by including
-DENABLE_ISNAN_TRICK in CFLAGS), the ISNAN macro is changed to be
faster for many common cases. This change relies on the same result
being produced when casting NaN, -NaN, NA, and -NA to integer, which
is true on Intel systems, but not on SPARC systems. A fatal error is
produced if this is seen to not be true (in which case the define of
ENABLE_ISNAN_TRICK should of course be removed).
03-contexts:
Reorder RCNTXT structure and code in begincontext to maybe make saving
a context faster.
03-def-COPY:
Introduced a facility for making a local copy of R_NilValue, and
potentially other globals. This is done with LOCAL_COPY(R_NilValue).
Access to the local copy may be faster, partly because the compiler
will know that it isn't being modified.
03-fast-base:
Lookup of symbols defined in the base environment has been sped up by
flagging symbols that have a base environment definition recorded in
the global cache. This allows the definition to be retrieved quickly
without looking in the hash table. In particular, this speeds up
basic operations such as "+", "<-", "if", and "length".
03-fast-const-eval:
An "eval" of an expression that evaluates to itself (usually a
constant in an expression) has been made faster by quickly checking
for a self-evaluating value by a shift and mask operation, and if the
expression is self-evaluating, returning it immediately, without the
overhead of things like checking for stack overflow. Depending on the
machine and the compiler, it's possible that the subsequent switch on
expression type for non-self-evaluating expressions will also be
faster, due to it having many fewer cases.
03-fast-spec:
Lookup of some builtin/special function symbols (eg, '+' and 'if') has
been sped up by allowing fast bypass of non-global environments that
do not contain (and have never contained) one of these symbols. The
symbols that are special for this purpose are specified in InitNames
in names.c.
03-getAttrib:
Defines versions of getAttrib that allow faster attribute search when
it is known that no special processing is needed. Plus other minor
speedups.
03-inlined:
Various inlined procedures were changed to be more efficient.
Several sets of types, represented by 32-bit words with "1" bits
corresponding to included types, are now defined in Rinternals.h. These
allow fast testing for set membership with if ((set >> type) & 1) ...
This is used heavily in the inlined functions, but may also be used
elsewhere.
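A minimal sketch of such a type set follows. The macro and function
names are hypothetical, as is the particular set chosen; the type codes
follow R's SEXPTYPE numbering.

```c
#include <assert.h>
#include <stdint.h>

/* Type codes, following R's SEXPTYPE numbering. */
enum { NILSXP_T = 0, LGLSXP_T = 10, INTSXP_T = 13, REALSXP_T = 14,
       CPLXSXP_T = 15, STRSXP_T = 16 };

#define TYPE_BIT(t) ((uint32_t) 1 << (t))

/* A hypothetical set: logical, integer, and real vectors. */
#define NUMERIC_TYPE_SET \
    (TYPE_BIT(LGLSXP_T) | TYPE_BIT(INTSXP_T) | TYPE_BIT(REALSXP_T))

/* Membership is one shift and one mask, not a chain of comparisons. */
int in_type_set(uint32_t set, int type)
{
    assert(type >= 0 && type < 32);
    return (int) ((set >> type) & 1);
}
```

A test such as in_type_set(NUMERIC_TYPE_SET, type) then replaces code
of the form type == LGLSXP || type == INTSXP || type == REALSXP.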
The "length" function was un-inlined, since it's fairly long.
Numerous occurrences of code like for (i = 0; i < length(...); i++)
... were replaced by code that doesn't call length many times,
sometimes by saving the result of one call of length, sometimes by
replacing length with LENGTH. (Though note that LENGTH doesn't work
for R_NilValue!)
03-install:
Detailed speed-up of the "install" function for installing a new (or
old) symbol. Also put in (currently disabled) code for seeing how
many symbols there are, for tuning purposes.
03-matchArgs:
The matchArgs function, used in the interpreter to match formal and
actual arguments when calling functions has been sped up, and given a
new interface.
One interface change allows the formal arguments to either be given as
a list SEXP (as before), or as an array of C strings, along with a
count of how many strings are in the array. (If formals are given by
C strings, the SEXP for the formals list parameter should be NULL,
whereas if the formals are given by a list, the pointer for the C
strings should be NULL and their count should be 0.)
Numerous calls of matchArgs are changed to use the interface with an
array of C strings (for example, in the code implementing rep and
seq.int). These calls were previously preceded by creation of a list
with calls to "install" for all the formal argument names. Using the
new interface is cleaner and considerably faster.
A second interface change is that if the formals are given by a list
SEXP, tags for the arguments are attached to the actuals list by
matchArgs. Places where matchArgs is called are changed to no longer
do this themselves. (Doing this in matchArgs is both cleaner and
faster.)
The new code is also faster in ways unrelated to these interface
changes.
Finally, 38 calls of check1arg(args,call,"x") were replaced with calls
of a new macro check1arg_x(args,call) that should be faster.
03-parens:
Parentheses are made faster by making them SPECIAL. Also, curly
brackets pass on the eval variant to the last expression, and pass
VARIANT_NULL for earlier expressions.
03-promise-named:
Values of forced promises no longer have NAMED always set to 2.
Instead NAMED for an object is incremented when it becomes the value
of a promise.
03-promiseArgs:
The creation of argument lists for closures is sped up by avoiding an
unnecessary allocation of a CONS cell, in the same way as was done in
my Sep 2010 patch for evalList, which was incorporated into 2.12.0 and
later versions of R. Also, now uses cons_with_tag in all these routines.
03-protect:
PROTECT, UNPROTECT, etc. have been made mostly macros in most of the
files in src/main. This applies only to files that include Defn.h
after defining the symbol USE_FAST_PROTECT_MACROS. If this is
defined, macros PROTECT2 and PROTECT3 for protecting two or three
objects at once are also defined.
This change speeds up numerous operations.
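The general shape of a macro-based protect stack might look like the
sketch below. It is an illustration only: the names, the fixed stack
size, and the overflow handling are all assumptions, not the
definitions that Defn.h provides.

```c
#include <assert.h>
#include <stdlib.h>

#define PP_STACK_SIZE 64            /* tiny, for illustration only */

static void *pp_stack[PP_STACK_SIZE];
static int pp_top = 0;

/* Slow path, kept as a real function: reached only on overflow. */
static void *protect_overflow(void *x)
{
    (void) x;
    abort();
}

/* Fast path expands inline: one comparison and one store. */
#define PROTECT_SK(x) \
    (pp_top < PP_STACK_SIZE ? (pp_stack[pp_top++] = (x)) \
                            : protect_overflow(x))

/* Protect two objects at once. */
#define PROTECT2_SK(x, y) ((void) PROTECT_SK(x), (void) PROTECT_SK(y))

#define UNPROTECT_SK(n) \
    (assert((n) >= 0 && (n) <= pp_top), (void) (pp_top -= (n)))

int protected_count(void) { return pp_top; }
```

The common case costs no procedure call, which is where a saving of
this kind would come from.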
03-save-alloc:
Some binary and unary arithmetic operations have been sped up by, when
possible, using the space holding one of the operands to hold the
result, rather than allocating new space. Though primarily a speed
improvement, for very long vectors avoiding this allocation could
avoid running out of space.
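The sketch below shows the idea with a toy vector type. The struct,
its "named" field (standing in for NAMED), and the function names are
hypothetical, and only the equal-length case is handled.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy vector standing in for an R object and its NAMED field. */
typedef struct {
    int named;          /* 0 means nothing else refers to this vector */
    size_t n;
    double *data;
} vec;

vec *vec_alloc(size_t n)
{
    vec *v = malloc(sizeof *v);
    if (v == NULL) abort();
    v->named = 0;
    v->n = n;
    v->data = malloc((n > 0 ? n : 1) * sizeof *v->data);
    if (v->data == NULL) abort();
    return v;
}

/* Element-wise a + b for equal lengths.  If an operand is an unshared
   temporary, its storage holds the result; otherwise a new vector is
   allocated. */
vec *vec_add(vec *a, vec *b)
{
    assert(a != NULL && b != NULL && a->n == b->n);
    vec *r = a->named == 0 ? a : b->named == 0 ? b : vec_alloc(a->n);
    for (size_t i = 0; i < a->n; i++)
        r->data[i] = a->data[i] + b->data[i];
    return r;
}
```

In an expression such as (x + y) + z, the intermediate result of x + y
is unshared, so the second addition can write into it directly.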
03-scalars:
Global constants R_ScalarLogicalNA, R_ScalarLogicalTRUE, and
R_ScalarLogicalFALSE have been created, and the interpreter's
ScalarLogical function now returns one of these rather than allocate
new space for every logical value.
To avoid problems with an external C or Fortran routine changing one
of these values (with an incorrect specification of DUP=FALSE even
though it modifies the argument), the values of these constants are
checked after the return of an external function called with .C or
.Fortran, and if they have changed, their values are reset and an
error is signalled.
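A sketch of both parts, shared constants plus the check after foreign
code returns, is given below. The type and function names are
hypothetical, and the sketch only reports damage, leaving the
signalling of an error to the caller.

```c
#include <assert.h>
#include <limits.h>

#define NA_LOGICAL_SK INT_MIN       /* R stores logical NA as INT_MIN */

typedef struct { int value; } lgl_scalar;

static lgl_scalar scalar_true  = { 1 };
static lgl_scalar scalar_false = { 0 };
static lgl_scalar scalar_na    = { NA_LOGICAL_SK };

/* Return one of three shared objects instead of allocating. */
lgl_scalar *scalar_logical_sk(int x)
{
    if (x == NA_LOGICAL_SK) return &scalar_na;
    return x != 0 ? &scalar_true : &scalar_false;
}

/* To be called after foreign code returns: restore the shared
   constants, and report (nonzero) if any had been overwritten. */
int check_scalar_constants(void)
{
    int damaged = scalar_true.value != 1 || scalar_false.value != 0
                  || scalar_na.value != NA_LOGICAL_SK;
    scalar_true.value = 1;
    scalar_false.value = 0;
    scalar_na.value = NA_LOGICAL_SK;
    return damaged;
}
```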
The bytecode interpreter sets up a similar set of logical constants.
That facility should be merged with this one (perhaps by just calling
the ScalarLogical function in the bytecode interpreter).
Various places in coerce.c were changed to use ScalarLogical rather
than allocate logical values themselves. This is both cleaner and now
more efficient given the change above.
03-seq-varop:
Several primitive functions that can generate integer sequences (":",
seq.int, seq_len, and seq_along) will now sometimes not generate an
actual sequence, but rather just a description of its start and end
points. This is not visible to users (except in time and space
savings), but allows for speed up (with other mods) of primitive
operations such as "for" loops and indexing of vectors.
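A sketch of what such a description might look like, and how a loop or
a subsetting operation could consume it without the sequence ever being
stored. The struct and function names are hypothetical, and only
increasing sequences are handled.

```c
#include <assert.h>

/* Hypothetical compact form of from:to with from <= to; no element
   array is allocated. */
typedef struct { int from, to; } int_range;

long long range_length(int_range r)
{
    return (long long) r.to - r.from + 1;
}

/* A "for" loop can step through the range directly ... */
long long sum_over_range(int_range r)
{
    long long s = 0;
    assert(r.from <= r.to);
    for (long long i = r.from; i <= r.to; i++)
        s += i;
    return s;
}

/* ... and x[from:to] becomes a contiguous copy (1-based, as in R). */
void subset_by_range(const double *x, int_range r, double *out)
{
    assert(r.from >= 1 && r.from <= r.to);
    for (long long i = r.from; i <= r.to; i++)
        out[i - r.from] = x[i - 1];
}
```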
03-sexprec:
The basic sexprec structure for objects is modified here, to allow for
a number of future modifications. The new scheme is documented in
R-ints.texi.
03-stringops:
Procedures copy_1_string, copy_2_strings, and copy_3_strings are now
defined in utils.c, and used in many places in the interpreter. These
procedures concatenate 1, 2, or 3 strings, checking for overflow of
the destination space. They are faster and less error-prone than the
various code sequences they replace (often involving strlen and sprintf).
This gives significant speed-ups for some operations such as calling
S3 methods.
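A minimal sketch of a two-string version follows. The name, the
argument order, and the behaviour on overflow (return 0 and leave an
empty string) are assumptions; what the procedures in utils.c do when
the destination is too small is not stated here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch: concatenate a and b into dest, which has room
   for size bytes including the terminating NUL.  Returns 1 on success,
   or 0 (with dest set to "") if the result would not fit. */
int copy_2_strings_sk(char *dest, size_t size, const char *a, const char *b)
{
    assert(dest != NULL && size > 0 && a != NULL && b != NULL);
    char *p = dest;
    char *last = dest + size - 1;       /* slot reserved for the NUL */
    const char *src[2] = { a, b };
    for (int k = 0; k < 2; k++) {
        for (const char *s = src[k]; *s != '\0'; s++) {
            if (p == last) {            /* next byte would overflow */
                dest[0] = '\0';
                return 0;
            }
            *p++ = *s;
        }
    }
    *p = '\0';
    return 1;
}
```

Each source string is scanned once, with the bound checked as bytes are
copied, so no separate strlen pass or format-string parsing is needed.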
03-translate:
Speed up character translation a bit sometimes, by doing operations
only when they are actually needed.
03-vstack:
Many calls of vmaxget and vmaxset are replaced by macros VMAXGET and
VMAXSET that do the same thing faster.
03-zap-isMissing:
Removes a call of R_isMissing in the interpreter's evalList function.
This check, done for every argument to a builtin primitive that is a
symbol, is slow, and appears to serve only to produce an error message
that is slightly different (and sometimes less informative) than
simply letting the symbol be evaluated.
04-coerce-bind:
Extensive cleanup in bind.c and coerce.c.
Simple cases of "c" with no names (or names ignored), no conversion,
and no recursion are done more quickly.
The copy_elements procedure is now used where appropriate.
A complete set of XFromY functions is now present in coerce.c (some
were missing). A copy_numeric_or_string_elements procedure is now
defined in coerce.c, which uses these functions.
Also, fixed a bug where excess warning messages may be produced on
conversion to RAW. See NEWS entry.
04-dollar:
Access via the $ operator to lists, pairlists, and environments has
been sped up. The speedup comes mainly from (a) avoiding the overhead
of calling DispatchOrEval if there are no complexities, (b) passing on
the field to extract as a symbol, or a name, or both, as available,
and then converting only as necessary, (c) using the new ep_match
functions instead of the previous local pstrmatch procedure, and (d)
not translating a string multiple times.
An error reporting bug in $ was also fixed. See NEWS entry.
04-ifloop-debug:
Fixes the "debug" facility. See NEWS item.
Also cleans up code, and propagates evalv variant to branches of "if".
04-relop-logic:
Logical operations and relational operators have been sped up in
simple cases, and use the new facility for producing a scalar logical
result without allocating new storage. Relational operators have also
been substantially speeded up for long vectors. Relational operators
are reduced to either EQOP or LTOP to avoid repetitive code, which
then makes it reasonable to specially treat equal length operands and
operands of length 1.
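The reduction can be sketched as follows: each of the six operators
becomes "equal" or "less than", possibly with the operands swapped or
the result negated, so only two inner loops are needed. The names are
hypothetical, NA handling is omitted, and integer operands are used so
that NaN does not arise.

```c
#include <assert.h>

typedef enum { EQ_SK, NE_SK, LT_SK, LE_SK, GT_SK, GE_SK } relop_sk;

/* Apply op element-wise to equal-length integer vectors a and b. */
void relop_vec(relop_sk op, const int *a, const int *b, int n, int *out)
{
    int use_eq = (op == EQ_SK || op == NE_SK);
    int swap   = (op == GT_SK || op == LE_SK);
    int negate = (op == NE_SK || op == LE_SK || op == GE_SK);

    assert(n >= 0);
    if (swap) {                 /* a > b is b < a; a <= b is !(b < a) */
        const int *t = a;
        a = b;
        b = t;
    }
    if (use_eq)
        for (int i = 0; i < n; i++)
            out[i] = (a[i] == b[i]) != negate;
    else
        for (int i = 0; i < n; i++)
            out[i] = (a[i] < b[i]) != negate;
}
```

With only two loops to maintain, adding separate versions of each for
equal-length operands and for a length-1 operand stays manageable.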
04-subscript:
Speeds up extraction and replacement of subsets of vectors or
matrices, by speeding up the creation of the vector of indexes used.
Often avoids a duplication and eliminates a second scan of the
subscript vector for zero subscripts, folding it into a previous scan
at no additional cost. String subscripts are handled more efficiently
by not creating a vector of indexnames when it is not needed, and by
other detailed code improvements.
The previous code duplicated a vector of indexes when it seems
unnecessary. Duplication was for two reasons: first, to handle the
situation where the index vector is itself being modified in a replace
operation, and second, so that any attributes can be removed, which is
helpful only for string subscripts, given how the routine to handle
them returns information via an attribute. Duplication for the second
reason can easily be avoided. The first reason for duplication is
sometimes valid, but can usually be avoided by, first, only doing it
if the subscript is to be used for replacement rather than extraction,
and second, only doing it if the NAMED field for the subscript isn't
zero.
Also removes two layers of procedure call overhead (passing seven
arguments, so not trivial) that seemed to be doing nothing.
04-vec-enlarge:
Extending lists and character vectors by assigning to an index past
the end, deleting list items by assigning NULL, and concatenation of
character vectors with "c" have all been sped up. This is partially
from use of copy_string_elements and copy_vector_elements. Another
gain comes from handling deletion of a contiguous block specially.
04-vec-subset:
Speeds up extraction of subsets with "[", as detailed in NEWS entries.
05-BLAS:
The BLAS routines supplied with R were modified to improve the
performance of the routines DGEMM (matrix-matrix multiply) and DGEMV
(matrix-vector multiply). Also, proper propagation of NaN, Inf,
etc. is done now.
These routines are probably still not as fast as those in a more
sophisticated BLAS, but will be of benefit to users who do not
install a different BLAS.
05-RNG:
Improves the performance of the uniform random number generation
routines (which are also used as the base for other generators), and
removes an unnecessary limitation. See NEWS entry for details.
The previous code was also rather messy - global references were mixed
with references by argument to the same variables, sometimes concealed
by macro definitions, and the seed was often referenced by pointers
which actually always pointed to the same location, which was also in
some places referenced directly.
A bit of previous code that assumed R integers are exactly 32 bits was
changed to assume only that an R integer is at least 32 bits in size.
05-anyall:
Speeds up "any" and "all" by detailed code improvements.
05-data-frame:
The R code for as.data.frame.matrix has been sped up a bit. (Other
mods also have the effect of speeding up this function.)
05-matprod:
Includes the matprod library from github.com/radfordneal/matprod,
and uses it for the %*% operator when doing so is specified by the
mat_mult_with_BLAS option. See help("%*%") and help(options) for
details.
The NEWS item for this is a stub, since it will be combined with
that for a later mod.
05-pow:
I previously proposed a simple patch speeding up squaring, which the R
core team did not adopt. Instead, in R-2.12.0 they introduced an
inline function R_POW, that checks specially for a power of 2, and
does it as a multiply, otherwise calling the R_pow function. R_pow
proceeds to check again for a power of 2 at the beginning, and then
check again for a power of 2 just before calling the C pow function.
The R_pow function also contains a check for a power of 0.5, but it is
disabled for recent versions of gcc, to bypass a bug that according to
a comment existed at one time. Note that the inline R_POW function
will not necessarily be actually inlined by the compiler, and that in
any case the check for a power of 2 is done over again for every
element of a vector being raised to that power, even if the power is a
scalar.
In this new patch, if the power is a scalar, I check for it being 2,
1, 0, or -1, and if so handle it specially. Otherwise, I call R_pow,
which I changed to not bother checking for powers of 2, and to
actually check for a power of 0.5. (Since the relevant code has
changed, any buggy compilers still extant probably will compile the
new code OK; if not, they probably compile lots of stuff incorrectly,
since there is nothing unusual in the new code.)
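The scalar-power case can be sketched as below. This is an
illustration, not the code in the patch: the names are hypothetical,
the general case is passed in as a function pointer standing in for
R_pow, and the NA/NaN subtleties of R's "^" are not modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the general power routine (R_pow in the text). */
typedef double (*pow_fn)(double, double);

/* x[i]^p for a vector x and scalar power p.  The test on p is made
   once, outside the loops, not once per element. */
void pow_scalar_power(const double *x, size_t n, double p, double *out,
                      pow_fn general)
{
    size_t i;
    assert(x != NULL && out != NULL);
    if (p == 2.0)
        for (i = 0; i < n; i++) out[i] = x[i] * x[i];
    else if (p == 1.0)
        for (i = 0; i < n; i++) out[i] = x[i];
    else if (p == 0.0)
        for (i = 0; i < n; i++) out[i] = 1.0;
    else if (p == -1.0)
        for (i = 0; i < n; i++) out[i] = 1.0 / x[i];
    else {
        assert(general != NULL);
        for (i = 0; i < n; i++) out[i] = general(x[i], p);
    }
}
```

A caller would pass the C library's pow, or an equivalent, as the last
argument; it is used only when p is none of 2, 1, 0, or -1.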
For non-scalar powers, I use R_POW, but change it to a macro, so that
it will definitely be inlined, and have it check for powers of 2 and
1. (If one is going to do this check when the power is a vector, it
makes sense to tailor it to something other than a vector of powers
that are all the same, since this doesn't seem like a common case.
Some powers of 1 and some of 2 seems plausible in some statistical
applications.) The macro also allows for the power to be an integer,
slightly speeding up some integer^integer operations.
The speed improvement from this patch depends a lot on the machine
architecture and the compiler. On machines where memory is much
slower than the processor, checking for a power of 2 every time may
mostly overlap with a memory fetch or store operation, but one
would not expect this to always be the case.
See also the NEWS item on this.
05-rowcolSums:
Rewrote the internal rowSums and colSums functions to be faster. Also
changed the R rowSums and colSums functions that call the internal
functions so that they treat the common case where the array is a
matrix specially, with less overhead.
05-sum-prod:
Code for the sum and prod functions has been changed to move some
checks outside the inner summation loops. The effect depends on the
extent to which the unnecessary checks overlap memory fetch
operations, but one would expect significant speed-ups with some
machines/compilers.
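As a rough illustration of moving a check outside the summation
loop, consider the na.rm option. The function demo_sum below is a
hypothetical sketch, not the actual code for sum, and it ignores
details such as the precision of the accumulator.

```c
#include <stddef.h>

/* Sum the n elements of x.  If narm is nonzero, NA/NaN elements are
   skipped.  The narm option is tested once, so the loop used when
   narm is zero contains no per-element checks. */
double demo_sum(const double *x, size_t n, int narm)
{
    double s = 0.0;
    size_t i;
    if (narm) {
        for (i = 0; i < n; i++)
            if (x[i] == x[i])      /* false only for NaN (and NA) */
                s += x[i];
    } else {
        for (i = 0; i < n; i++)
            s += x[i];
    }
    return s;
}
```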
I had previously proposed this modification to sum and prod before
R-2.12.0. That patch was not adopted by the R core team, though they
did swap the order of checking for NA/NaN and checking the na.rm
option, which avoids the worst inefficiency of the previous code in
R-2.11.1.
05-transpose:
The speed of the transpose (t) function has been improved, when
applied to real, integer, and logical matrices. This is done by
moving pairs of elements, which improves memory access behaviour.
Note that I had previously speeded up transpose with a patch that was
incorporated into R 2.12.0. This improvement gives an additional
speedup, by up to a factor of about 1.4.
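One possible reading of "moving pairs of elements" is sketched
below: two adjacent elements of a source column are read together
and stored into two adjacent destination columns. This is an
assumption about the technique, not the actual code, and
demo_transpose is a hypothetical name. Matrices are stored in
column-major order, as in R.

```c
#include <stddef.h>

/* Transpose the nrow x ncol matrix a into the ncol x nrow matrix t,
   both stored by column.  Source rows are handled two at a time. */
void demo_transpose(const double *a, size_t nrow, size_t ncol, double *t)
{
    size_t i, j;
    for (i = 0; i + 1 < nrow; i += 2) {
        for (j = 0; j < ncol; j++) {
            double e0 = a[i + j * nrow];        /* adjacent in memory */
            double e1 = a[i + 1 + j * nrow];
            t[j + i * ncol] = e0;
            t[j + (i + 1) * ncol] = e1;
        }
    }
    if (i < nrow) {                             /* odd row left over */
        for (j = 0; j < ncol; j++)
            t[j + i * ncol] = a[i + j * nrow];
    }
}
```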
06-applydefine:
Speed up assignment to a subset of a variable by various detailed
improvements to "applydefine" and associated functions in eval.c.
Includes handling "[", "[[", and "$" as special cases, with pre-setup
symbols for "[<-", etc. rather than using "install".
06-attrib:
Speed improvements in attrib.c from detailed code improvements.
06-contexts2:
Defines a revisecontext function, that is used to avoid the deletion
and recreation of a context during the setup for applying a function.
06-def-nmcnt-macros:
Defines macros for using a three-bit nmcnt in the sxpinfo structure in
the header for every object rather than the named field of R-2.15.0.
These macros support a true reference counting scheme, which is
documented in R-ints.texi.
New versions of NAMED and SET_NAMED are defined for compatibility.
The new macros are actually used only in later mods, not in this one.
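To give a feel for what macros on a three-bit count might look
like, here is a hypothetical sketch. The names (DEMO_NMCNT and so
on), the header layout, and the rule that a saturated count is
never decremented are all assumptions made for illustration; the
authoritative description is the one in R-ints.texi.

```c
#define DEMO_MAX_NMCNT 7   /* largest value a three-bit field holds */

/* Hypothetical object header; the real sxpinfo structure has many
   more fields. */
struct demo_header {
    unsigned int nmcnt : 3;
};

#define DEMO_NMCNT(h) ((h)->nmcnt)

/* Increment, sticking at the maximum rather than wrapping to zero. */
#define DEMO_INC_NMCNT(h) \
    do { if ((h)->nmcnt < DEMO_MAX_NMCNT) (h)->nmcnt += 1; } while (0)

/* Decrement, unless the count is zero or has saturated (in which
   case the true count is no longer known). */
#define DEMO_DEC_NMCNT(h) \
    do { if ((h)->nmcnt > 0 && (h)->nmcnt < DEMO_MAX_NMCNT) \
             (h)->nmcnt -= 1; } while (0)

/* Compatibility view in the style of NAMED: 0, 1, or 2, where 2
   stands for "2 or more". */
#define DEMO_NAMED(h) ((h)->nmcnt > 2 ? 2 : (int) (h)->nmcnt)
```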
The "inspect" and "pnamedcnt" functions are updated to display nmcnt
properly.
06-fast-prim:
This mod combines several changes, loosely related by their all
reducing interpretive overhead or speeding up primitive functions.
The largest change is that a new scheme is used for quick dispatch of
some unary and binary primitive functions, as described in the
documentation in R-ints.texi. In particular, this mod includes "fast"
versions of the functions implementing the following primitives
(called do_fast_XXX):
arith (+, -, *, /, ^, %%, %/%)
math1 (exp, sin, etc.)
trunc
abs
length
dim
is (is.null, is.integer, etc.)
isna
isnan
isfinite
isinfinite
cmathfuns (Re, Im, etc.)
logic3 (any, all)
colon
seq_len
sum
prod
But note that not all calls of such primitives will use the fast
dispatch mechanism.
The evalListKeepMissing function in eval.c now just calls evalList,
with an indicative argument, eliminating code duplication.
Calls of the R_CheckStack procedure in evalv and other places were
replaced with a macro R_CHECKSTACK, which is faster.
The LOCAL_COPY macro is used in several places to (hopefully) speed up
references to R_NilValue.
Evaluation of .Internal calls now passes on the "variant" desired.
Expressions such as \code{all(v>0)} and \code{any(is.na(v))} where
\code{v} is a real vector now avoid computing and storing a logical
vector, instead computing the result of \code{any} or \code{all}
without this intermediate, looking at only as much of \code{v} as is
needed to determine the result. This is done using the "variant
result" framework.
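The effect for an expression like all(v>0) can be pictured with the
sketch below, which never builds the logical vector and stops at
the first element that decides the answer. The function name, the
result codes, and the examined counter are hypothetical, and the
sketch does not show the variant result framework itself.

```c
#include <stddef.h>

/* Three-valued result, mirroring R's TRUE, FALSE and NA. */
enum { DEMO_FALSE = 0, DEMO_TRUE = 1, DEMO_NA = -1 };

/* Compute all(v > 0) directly from v.  The number of elements that
   had to be looked at is stored in *examined. */
int demo_all_positive(const double *v, size_t n, size_t *examined)
{
    int saw_na = 0;
    size_t i;
    for (i = 0; i < n; i++) {
        if (v[i] != v[i])            /* NA or NaN: comparison is NA */
            saw_na = 1;
        else if (!(v[i] > 0)) {      /* a definite FALSE decides it */
            *examined = i + 1;
            return DEMO_FALSE;
        }
    }
    *examined = n;
    return saw_na ? DEMO_NA : DEMO_TRUE;
}
```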
Similarly, when \code{sum} is applied to many mathematical functions
of one vector argument, for example \code{sum(log(v))}, the sum is
performed as the function is computed, without a vector being
allocated to hold the function values.
Several bugs were fixed, as described in NEWS items. In particular,
int_fast64_t variables are now used to accumulate integer sums for
"sum" and "mean" in order to avoid overflow.
06-gc-mods:
Many changes were made to the garbage collector and associated
routines in memory.c. These do not change the general scheme, but
improve it in many ways, some of which are noted below.
Small node classes are now distinguished only by size, with CONS
cells being allocated in the smallest class that can contain a CONS
cell, not in a special class. (Depending on the machine, it is
possible for some vector objects to be smaller than CONS cells.) See
comments in memory.c for details, also the new version of the
documentation obtained with help(Memory).
Linking of nodes in pages is now done in forward order, not backwards
as before, which should improve cache performance.
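Forward linking can be illustrated as below, where the nodes of a
newly obtained page are chained so that following the free list
visits them in increasing address order. The node type and the
function name are hypothetical; the real code in memory.c deals
with node classes and sizes that are omitted here.

```c
#include <stddef.h>

/* Hypothetical free node, holding only its link. */
typedef struct demo_node {
    struct demo_node *next;
} demo_node;

/* Link the count nodes of a page in forward order, placing them in
   front of the existing free list "rest".  Returns the new head. */
demo_node *demo_link_page_forward(demo_node *page, size_t count,
                                  demo_node *rest)
{
    size_t i;
    if (count == 0)
        return rest;
    for (i = 0; i + 1 < count; i++)
        page[i].next = &page[i + 1];
    page[count - 1].next = rest;
    return &page[0];
}
```

With this order, successive allocations from the page touch
successively higher addresses, which is the behaviour expected to
be friendlier to the cache.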
The source file size.c has been merged into memory.c, where it
logically belongs.
The mkChar function and related routines were moved from envir.c to
memory.c, where they belong, because of the special treatment of the
global string cache by the garbage collector.
HASHPRI was renamed to HASHSLOTSUSED.
The scheme for using valgrind was revised, and a newer version of
valgrind is now used.
The gc.time function was fixed (see NEWS item).
06-gram:
The parser was changed to improve locality of nodes representing
language objects, which may improve cache performance when
interpreting R code.
The parser is now generated with bison version 2.5. This is not
expected to have any consequences.
06-objects:
Speeds up routines in objects.c, such as GetObject. Also fixes the
bug mentioned in the NEWS item here.
06-rm-named:
A new function get_rm is now defined, which removes a variable and
returns the value it had had. See the NEWS item and help(get_rm) for
details.
Many functions in base have been changed to use get_rm to reduce
duplication of objects. Some other code improvements have been
made at the same time.
Two bugs have been fixed, as reported in NEWS.
The lapack.Rout.save file has been changed to match the new behaviour
of svd regarding return of NULL elements.
06-varop-for:
For loops in which the index variable goes through an integer sequence
are now done without actually allocating a vector to hold the sequence.
This saves both space and time. This is implemented using the variant
result facility, which is used by ":" and other sequence primitives.
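The idea can be sketched as a loop driver that accepts either an
explicit vector or just the two endpoints of a range, in which case
no vector is ever allocated. The demo_seq structure and the
function names are hypothetical and only stand in for what the
variant result facility provides.

```c
#include <stddef.h>

/* Hypothetical description of a sequence: either a range given by
   its endpoints, or an explicit vector of elements. */
typedef struct {
    int is_range;        /* nonzero if from/to describe the sequence */
    int from, to;        /* endpoints, used when is_range is nonzero */
    const int *elems;    /* explicit elements, used otherwise */
    size_t n;            /* number of explicit elements */
} demo_seq;

typedef void (*demo_body_fn)(int index, void *state);

/* Run the loop body once for each element of the sequence. */
void demo_for(const demo_seq *s, demo_body_fn body, void *state)
{
    if (s->is_range) {
        /* Count up or down between the endpoints, as ":" does. */
        int step = s->from <= s->to ? 1 : -1;
        int i = s->from;
        for (;;) {
            body(i, state);
            if (i == s->to) break;
            i += step;
        }
    } else {
        size_t k;
        for (k = 0; k < s->n; k++)
            body(s->elems[k], state);
    }
}

/* Example loop body: add the index to a running total. */
void demo_add_index(int index, void *state)
{
    *(long *) state += index;
}
```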
06-varop-sub:
The variant sequence operations that possibly return a range are used
here to speed up subscripting of vectors and matrices. For matrices,
only the first index (for rows) is currently handled as a range (if
possible). (Handling the column index as a range would be possible,
but would provide a lesser speedup, due to data ordering, and to the
loop over columns being the outer loop.)
A missing row subscript for a matrix is also converted to a range from
1 to the number of rows.
Handling ranges without creating index vectors saves both time and
space.
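For a matrix stored by column, a row subscript given as a range
selects a contiguous block within each column, so each block can
be copied in one step. The sketch below shows this under stated
assumptions: demo_subset_row_range is a hypothetical name,
subscripts are 1-based and assumed to be in bounds, and the column
subscript is an ordinary index vector.

```c
#include <stddef.h>
#include <string.h>

/* Extract rows first..last (1-based, inclusive) and the columns
   listed in cols (1-based) from the matrix a, which has nrow rows
   and is stored by column.  The result is stored by column in out.
   The loop over columns is the outer (here the only explicit)
   loop; the rows of each column are copied as one block. */
void demo_subset_row_range(const double *a, size_t nrow,
                           size_t first, size_t last,
                           const size_t *cols, size_t ncols_out,
                           double *out)
{
    size_t len = last - first + 1;
    size_t k;
    for (k = 0; k < ncols_out; k++)
        memcpy(out + k * len,
               a + (cols[k] - 1) * nrow + (first - 1),
               len * sizeof(double));
}
```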
Note that this change has no effect for accesses via compiled code,
though the compiler could probably be modified to exploit this
feature.
07-apply:
Sped up "apply" in the common case where it is applied to a matrix.
Also clarified the documentation and fixed a bug, as described in
NEWS.
07-coerce-bind2:
The "rbind", "cbind", "c", and "unlist" primitives have been speeded
up for many cases. A bug in "rbind" has been fixed, as documented in
a NEWS item.
A new copy_elements_coerced procedure copies numeric and string
vectors with possible coercion, and has been used to replace much
repetitive code that does this. This replaces the
copy_numeric_or_string_elements procedure from the 04-coerce-bind mod.
07-math-cleanup:
Cleans up the code for mathematical functions in arith.c, speeding
some things up in the process, and preparing for later mods.
07-scalar-arith:
Real arithmetic and integer plus/minus on scalars with no attributes
are now done more quickly as special cases rather than by the
general-purpose code.