Mon Jul 3 21:05:03 IDT 2017
============================
There were too many files tracking different thoughts and ideas for
things to do, or consider doing. This file merges them into one. As
tasks are completed, they should be removed.
This file should exist only in the master branch or branches based off
of it for development, but not in the stable branch. This may require some
careful work with Git.
TODO
====
Minor Cleanups and Code Improvements
------------------------------------
API:
??? #if !defined(GAWK) && !defined(GAWK_OMIT_CONVENIENCE_MACROS)
?? Add debugger commands to reference card
Look at function order within files.
Consider removing use of and/or need for the protos.h file.
Recheck if gnulib regex can be dropped in
Fully synchronize whitespace tests (for \s, \S in Unicode
environment) with those of GNU grep.
See if something like b = a "" can be optimized to not do
a concatenation, but instead just set STRCUR on a.
Minor New Features
------------------
Enable command line source text in the debugger.
Enhance extension/fork.c waitpid to allow the caller to specify
the options. And add an optional array argument to wait and
waitpid in which to return exit status information.
Consider relaxing the strictness of --posix.
? Add an optional base to strtonum, allowing 2-36.
? Optional third argument for index indicating where to start the
search.
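The two items marked "?" above could behave roughly as sketched below. This is a model of possible semantics written in Python, not gawk code and not existing gawk behavior. The names strtonum_base and index_from are hypothetical, and rejecting a whole string on the first invalid digit is an assumption made only to keep the sketch short.

```python
import string

DIGITS = string.digits + string.ascii_lowercase  # digit symbols for bases up to 36


def strtonum_base(text, base=10):
    """Hypothetical strtonum(text, base): parse an integer in base 2-36.

    Assumptions: integers only, and a string containing any invalid
    digit yields 0 instead of parsing a leading prefix.
    """
    if not 2 <= base <= 36:
        raise ValueError("base must be in 2..36")
    s = text.strip().lower()
    sign = 1
    if s[:1] in ("+", "-"):
        sign = -1 if s[0] == "-" else 1
        s = s[1:]
    if not s:
        return 0
    value = 0
    for ch in s:
        digit = DIGITS.find(ch)
        if digit < 0 or digit >= base:
            return 0  # assumption: reject the whole string
        value = value * base + digit
    return sign * value


def index_from(haystack, needle, start=1):
    """Hypothetical index(haystack, needle, start) using 1-based positions.

    Returns the 1-based position of the first match at or after start,
    or 0 when there is no match, like awk's two-argument index.
    """
    if start < 1:
        start = 1
    return haystack.find(needle, start - 1) + 1
```

With these definitions, strtonum_base("ff", 16) gives 255, and index_from("banana", "an", 3) skips the match at position 2 and reports 4.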
Major New Features
------------------
Think about how to generalize indirect access. Manuel Collado
suggests things like
	foo = 5
	@"foo" += 4
Also needed:
	Indirect through array elements, not just scalar variables
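One way to picture the proposal is a table of global variables looked up by name at run time. The Python sketch below models that idea only; the Globals class and its add_indirect method are hypothetical and say nothing about how gawk would implement it. It also assumes one reading of the "Also needed" item, namely that the variable name may be held in an array element and not just in a scalar.

```python
class Globals:
    """Toy model of awk global scalars, keyed by variable name."""

    def __init__(self):
        self.vars = {}

    def add_indirect(self, name, amount):
        """Model of  @name += amount  where name is a string value."""
        # An unset awk variable acts as 0 in numeric context.
        self.vars[name] = self.vars.get(name, 0) + amount
        return self.vars[name]


g = Globals()
g.vars["foo"] = 5             # foo = 5
g.add_indirect("foo", 4)      # @"foo" += 4   -> foo is now 9

# Assumed reading of "indirect through array elements": the name is
# stored in an array element instead of a scalar variable.
names = {1: "foo"}
g.add_indirect(names[1], 1)   # hypothetical  @names[1] += 1
```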
Add ability to do decimal arithmetic.
Rework management of array index storage. (Partially DONE.)
Consider using an atom table for all string array indices.
DBM storage of awk arrays. Try to allow multiple dbm packages.
?? A RECLEN variable for fixed-length record input. PROCINFO["RS"]
would be "RS" or "RECLEN" depending upon what's in use.
*** Could this be done as an extension?
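As a rough picture of what fixed-length record input means, the Python sketch below reads a stream in chunks of a given length. It is a model only: the function name fixed_length_records is hypothetical, and returning a short final record unchanged is an assumption, since the item above does not say how a trailing partial record should be handled.

```python
import io


def fixed_length_records(stream, reclen):
    """Yield successive records of reclen characters from stream.

    Assumption: a short final record is returned as-is.
    """
    if reclen <= 0:
        raise ValueError("reclen must be positive")
    while True:
        record = stream.read(reclen)
        if not record:
            break
        yield record


records = list(fixed_length_records(io.StringIO("abcdefgh"), 3))
```

Here records holds "abc", "def" and the short trailing record "gh".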
?? Use a new or improved dfa and/or regex library.
Rewrite in C++.
Things To Think About That May Never Happen
-------------------------------------------
Consider making shadowed variables a warning and not
a fatal warning when --lint=fatal.
Similar for extra parameters in a function call.
Look at code coverage tools, like S2E: https://s2e.epfl.ch/
Try running with diehard. See http://www.diehard-software.org,
https://github.com/emeryberger/DieHard
Implement namespaces. Arnold suggested the following in an email:
- Extend the definition of an 'identifier' to include "." as a valid
character although an identifier can't start with it.
- Extension libraries install functions and global variables with names
that have a "." in them: XML.parse(), XML.name, whatever.
- Awk code can read/write such variables and call such functions,
but it cannot define such functions, e.g.
	function XML.foo() { .. } # error
or create a variable with such a name if it doesn't exist. This would
be a run-time error, not a parse-time error.
- This last rule may be too restrictive.
I don't want to get into fancy rules a la perl and file-scope visibility
etc, I'd like to keep things simple. But how we design this is going
to be very important.
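The identifier rule in the first bullet can be written down as a pattern. The Python sketch below is one possible reading, not a design decision: the helper names are hypothetical, and questions the email leaves open (consecutive dots, a trailing dot) are deliberately not settled here.

```python
import re

# Letters, digits, underscore and "." are allowed, but the first
# character may not be a digit or a dot.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_.]*")


def is_identifier(name):
    """True if name fits the extended identifier rule."""
    return IDENTIFIER.fullmatch(name) is not None


def may_define_in_awk(name):
    """Awk code may define only names without a dot; dotted names are
    reserved for extension libraries under the proposal."""
    return is_identifier(name) and "." not in name
```

Under this reading XML.parse is a valid identifier that awk code can call but not define, while .parse is not an identifier at all.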
Include a sample rpm spec file in a new packaging subdirectory.
(Really needed?)
Patch lexer for @include and @load to make quotes optional.
(Really needed?)
Add a lint check if the return value of a function is used but
the function did not supply a value.
Consider making gawk output +nan for NaN values so that it
will accept its own output as input.
NOTE: Investigated this. GLIBC formats NaN as '-nan'
and -NaN as 'nan'. Dealing with this is not simple.
Review the bash source script for working with shared libraries in
order to nuke the use of libtool. [ Partially started in the
dead-branches/nolibtool branch. ]
Things That We Decided We Will Never Do
=======================================
Consider moving var_value info into Node_var itself to reduce
memory usage. This would break all uses of get_lhs in the
code. It's too sweeping a change.
Add macros for working with flags instead of using & and |
directly.
FIX regular field splitting to use FPAT algorithm.
Note: Looked at this. Not sure it's worth the trouble:
If it ain't broke...
Scope IDs for IPv6 addresses
Gnulib
Make FIELDWIDTHS be an array?
"Do an optimization pass over parse tree?"
This isn't relevant now that we are using a byte code engine.
"Consider integrating Fred Fish's DBUG library into gawk."
I did this once as an experiment. But I don't see a lot of value
to this at this stage of the development. Stepping through things
in a debugger is generally enough. Also, I would have to try to
track down the latest version of this.
"Make awk '/foo/' files... run at egrep speeds" (How?)
This has been on the list since the early days (gawk 1.x or early
2.x). But I am not sure how to really do this, nor have I done
timings, nor does there seem to be any real demand for this.
Change from dlopen to using the libltdl library (i.e. lt_dlopen).
This may support more platforms. If we move off of libtool
then this is the wrong direction.