Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: introduce Nim impl of hashMessage in reorganized library #41

Merged
merged 1 commit into from
Aug 31, 2020

Conversation

michaelsbradleyjr
Copy link
Contributor

@michaelsbradleyjr michaelsbradleyjr commented Aug 4, 2020

Introduce a Nim implementation of hashMessage situated in a reorganized library with Nim-oriented and C-oriented entry points inspired by the shim strategy suggested by @arnetheduck.

The goal of this approach (per @iurimatias, at least as I understood what was discussed and tasked) is for Nim implementations of equivalent functionality to eventually supersede existing status-go implementations.

These changes will benefit from feedback by @arnetheduck, @zah, @stefantalpalaru, @siphiuel, other members of stimbus/desktop, et al. – my dev experiences with Nim and C are quite limited to date. Some of the changes may be problematic, unnecessary, suboptimal, etc. I may have reified the "shim strategy" badly. Please shred this PR apart and lead me to the 💡 light. Thanks for your help!

N.B. tests/nim/login.nim and tests/c/login.c use loops that never terminate (introduced prior to this PR). Future work will attempt to remedy that shortcoming but it's out of scope for this PR.

The reorganized library can be grouped into two trees of .nim sources, but note that import statements interlink some of them.

Nim-oriented

├── src
│   ├── nim_status
│   │   ├── go
│   │   │   └── shim.nim
│   │   ├── lib
│   │   │   ├── shim.nim
│   │   │   └── util.nim
│   │   ├── lib.nim
│   │   └── types.nim
│   └── nim_status.nim

C-oriented

├── src
│   ├── nim_status
│   │   ├── c
│   │   │   ├── go
│   │   │   │   └── shim.nim
│   │   │   ├── lib
│   │   │   │   └── shim.nim
│   │   │   ├── lib.nim
│   │   │   ├── nim_status.nim
│   │   │   └── sys.nim

The key difference between the Nim sources in one tree and the other is that the Nim-oriented sources are intended to be consumed by other Nim sources (e.g. status-im/nim-status-client via Nim's built-in import), while the C-oriented sources are intended to be compiled to a C library (e.g. nim_status.a) and then linked/called by other C code. To that end, the former use e.g. string in call signatures while the latter use cstring. Along the same lines, the C-oriented procs may return pointers to memory allocated with c_malloc such that it's up to the caller to free the memory occupied by the return values.

Both src/nim_status/go/shim.nim and src/nim_status/c/go/shim.nim are pure shims around status-go, the main difference being their call signatures.

With src/nim_status/lib.nim the intention is to implement functionality in a manner independent of status-go's idioms.

In src/nim_status/[c/]lib/shim.nim the intention is to wrap around src/nim_status/lib.nim in a way that preserves status-go's call signatures and formatting. For example, the hashMessage proc introduced in this PR in src/nim_status/lib.nim returns a hex string like 0xabcd... while lib/shim.nim returns JSON that's a match for status-go: {"result":"0xabcd..."}.

Both src/nim_status.nim and src/nim_status/c/nim_status.nim represent a hybrid of go/shim.nim and lib/shim.nim. Again, the goal is that over time more and more procs implemented in Nim replace the shims around status-go.

Callers that don't need to consume return values formatted according to status-go conventions can use lib.nim directly, e.g. via import nim_status/lib or by compiling src/nim_status/c/lib.nim and linking/calling from C.

c_malloc has been copied (in src/nim_status/c/sys.nim) from the source of Nim's system/ansi_c because ansi_c is an undocumented internal API.

A Nim template or pragma might be useful re: usage of c_malloc in src/nim_status/c/*, i.e. to cut down on repetition.

Use of {.exportc.} is limited to the src/nim_status/c/nim_status.nim hybrid shim to avoid imposing exported symbols on C-oriented library consumers who may wish to compose things in a different manner.

With respect to the Nim implementation of hashMessage, both Nim-oriented and C-oriented tests have been implemented. Whether doubling up the tests is really necessary/desirable for all procs is probably worth discussing.

Adjust .gitignore and refactor Makefile accordingly; apply some lessons learned while working on status-im/nim-status-client.

Closes #34.

@michaelsbradleyjr michaelsbradleyjr requested review from a team August 4, 2020 23:20
@michaelsbradleyjr michaelsbradleyjr force-pushed the feat/nim-hashmessage branch 2 times, most recently from ee5207a to 909da6f Compare August 5, 2020 13:23
src/c/lib.nim Outdated
let hash = lib.hashMessage($message)
result = cast[cstring](c_malloc(csize_t hash.len + 1))
copyMem(result, hash.cstring, hash.len)
result[hash.len] = '\0'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is it necessary to do a malloc + copyMem instead of returning the result of hashMessage directly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because hash (or hash.cstring) will eventually be garbage collected by Nim, whereas memory allocated with c_malloc won't be, thus leaving it up to the caller to free the memory occupied by the return value. That's the same principle as with status-go — even though Go is a gc'd language, the return values of the status-go methods aren't garbage collected (as I understood from chatting with @cammellos).

We don't need to do such a thing in src/lib.nim because it's assumed that the consumer will be a program implemented in Nim. But in the case of src/c/lib.nim we assume that the consumer will be e.g. a program written in C, linking against the nim sources compiled into a library.

Copy link
Contributor Author

@michaelsbradleyjr michaelsbradleyjr Aug 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should note: what I explained above is my understanding of the gc behavior, but I could be mistaken and it's unnecessary.

src/shim.nim Outdated
# status-go, and it will complain of duplication of function names

proc hashMessage*(message: string): string =
$go_shim.hashMessage(message.cstring)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When using this nim-status as a dependency of a nim project, what do you think of instead of returning the string json, return the "result" value directly. This should simplify the usage for the library consumers since they would not have to decode the json themselves, and instead they'd receive the hash message string.

Copy link
Contributor Author

@michaelsbradleyjr michaelsbradleyjr Aug 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should be able to use the not-shimmed hashMessage proc via import nim_status/lib.

The procs in src/shim.nim are shims around the Nim implementations in src/lib.nim intended to preserve the call signatures and formatting of status-go. In turn src/nim_status.nim builds up a hybrid shim.

Down the road, a broader discussion could be had about possibly dropping the status-go call signatures and formatting altogether. But for now it seemed best at the top-level (src/nim_status.nim and src/c/nim_status.nim) to preserve them.

Copy link
Contributor Author

@michaelsbradleyjr michaelsbradleyjr Aug 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I did an experiment just now re: master branch of nim-status-client w.r.t. master branch nim-status: I can't import nim_status/status_go unless status_go is moved into a subdirectory of src named nim_status.

So... for import nim_status/lib to work (which was/is my intention) I'll need to reorganize a bit — nothing major, just everything inside src/ (except src/nim_status.nim) will need to be moved into src/nim_status/, and then I'll need to adjust some relative directories in the sources and tests.

@michaelsbradleyjr michaelsbradleyjr force-pushed the feat/nim-hashmessage branch 9 times, most recently from 1c3e78c to 3f8aef5 Compare August 10, 2020 15:20
@michaelsbradleyjr
Copy link
Contributor Author

michaelsbradleyjr commented Aug 10, 2020

The code has been further reorganized so that it's possible to e.g. import nim_status/lib and so that the organization/naming (hopefully) better reflects the purpose.

const prefix = END_OF_MEDIUM & "Ethereum Signed Message:\n"

proc hashMessage*(message: string): string =
## hashMessage calculates the hash of a message to be safely signed by the keycard

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this get automatically generated as documentation somewhere?

Copy link
Contributor Author

@michaelsbradleyjr michaelsbradleyjr Aug 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure (yet), tbh, but I think it's likely!

Introduce a Nim implementation of `hashMessage` situated in a reorganized
library with *Nim-oriented* and *C-oriented* entry points inspired by the
[shim strategy][shim-strat] suggested by @arnetheduck.

The goal of this approach (per @iurimatias, at least as I understood what was
discussed and tasked) is for Nim implementations of equivalent functionality to
eventually supersede existing [status-go][sgo] implementations.

These changes will benefit from feedback by @arnetheduck, @zah,
@stefantalpalaru, @siphiuel, other members of stimbus/desktop, et al. – my dev
experiences with Nim and C are quite limited to date. Some of the changes may
be problematic, unnecessary, suboptimal, etc. I may have reified the *"shim
strategy"* badly. Please shred this PR apart and lead me to the 💡
light. Thanks for your help!

N.B. `tests/nim/login.nim` and `tests/c/login.c` use loops that never
terminate (introduced prior to this PR). Future work will attempt to remedy
that shortcoming but it's out of scope for this PR.

The reorganized library can be grouped into two trees of `.nim` sources, but
note that `import` statements interlink some of them.

**Nim-oriented**
```
├── src
│   ├── nim_status
│   │   ├── go
│   │   │   └── shim.nim
│   │   ├── lib
│   │   │   ├── shim.nim
│   │   │   └── util.nim
│   │   ├── lib.nim
│   │   └── types.nim
│   └── nim_status.nim
```

**C-oriented**
```
├── src
│   ├── nim_status
│   │   ├── c
│   │   │   ├── go
│   │   │   │   └── shim.nim
│   │   │   ├── lib
│   │   │   │   └── shim.nim
│   │   │   ├── lib.nim
│   │   │   ├── nim_status.nim
│   │   │   └── sys.nim
```

The key difference between the Nim sources in one tree and the other is that
the Nim-oriented sources are intended to be consumed by other Nim
sources (e.g. [status-im/nim-status-client][nsc] via Nim's built-in `import`),
while the C-oriented sources are intended to be compiled to a C
library (e.g. `nim_status.a`) and then linked/called by other C code. To that
end, the former use e.g. `string` in call signatures while the latter use
`cstring`. Along the same lines, the C-oriented `proc`s may return pointers to
memory allocated with `c_malloc` such that it's up to the caller to free the
memory occupied by the return values.

Both `src/nim_status/go/shim.nim` and `src/nim_status/c/go/shim.nim` are pure
shims around status-go, the main difference being their call signatures.

With `src/nim_status/lib.nim` the intention is to implement functionality in a
manner independent of status-go's idioms.

In `src/nim_status/[c/]lib/shim.nim` the intention is to wrap around
`src/nim_status/lib.nim` in a way that preserves status-go's call signatures
and formatting. For example, the `hashMessage` proc introduced in this PR in
`src/nim_status/lib.nim` returns a hex string like `0xabcd...` while
`lib/shim.nim` returns JSON that's a match for status-go:
`{"result":"0xabcd..."}`.

Both `src/nim_status.nim` and `src/nim_status/c/nim_status.nim` represent a
hybrid of `go/shim.nim` and `lib/shim.nim`. Again, the goal is that over time
more and more `proc`s implemented in Nim replace the shims around status-go.

Callers that don't need to consume return values formatted according to
status-go conventions can use `lib.nim` directly, e.g. via `import
nim_status/lib` or by compiling `src/nim_status/c/lib.nim` and linking/calling
from C.

`c_malloc` has been copied (in `src/nim_status/c/sys.nim`) from the source of
Nim's `system/ansi_c` because `ansi_c` is an undocumented internal API.

A Nim template or pragma might be useful with respect to usgae of `c_malloc` in
`src/nim_status/c/*`, i.e. to cut down on repetition.

Use of `{.exportc.}` is limited to the `src/nim_status/c/nim_status.nim` hybrid
shim to avoid imposing exported symbols on C-oriented library consumers who may
wish to compose things in a different manner.

With respect to the Nim implementation of `hashMessage`, both Nim-oriented and
C-oriented tests have been implemented. Whether doubling up the tests is really
necessary/desirable for all `proc`s is probably worth discussing.

Adjust `.gitignore` and refactor `Makefile` accordingly; apply some lessons
learned while working on [status-im/nim-status-client][nsc].

Closes #34.

[shim-strat]: #5 (comment)
[sgo]: https://github.com/status-im/status-go
[nsc]: https://github.com/status-im/nim-status-client
import ../../lib/shim as nim_shim
import ../sys

proc hashMessage*(message: cstring): cstring =
Copy link

@zah zah Aug 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite understand the rationale behind the general strategy here.

The most straigh-forward approach for me for creating such a shim layer would be to define a high-level interface that uses the most natural Nim types for the task at hand. Here for example, you wouldn't return a cstring, but rather a normal string or even a stronger type such as nimcrypto.MDigest.

Once you have this high-level interface defined, you can create a thin translation layer that calls Go or lib_status.a to produce the result with some lower-level pointer-based APIs. The application will be refactored everywhere to make use of the new nicer higher-level API and the long-term prospects are that the lower-level pointer tricks can be removed in favour of directly importing modules from nim-status in the future or by depending on it as a DLL (through the use of nimRtl).

I cannot imagine what is gained by imitating the existing C API of Status-Go. Since it's too foreign for Nim, it will always have to be wrapped in some facade layer.

Copy link
Contributor Author

@michaelsbradleyjr michaelsbradleyjr Aug 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback @zah, much appreciated! I'll try my best to explain, and I'm very open to going a different direction with things.

Here for example, you wouldn't return a cstring, but rather a normal string or even a stronger type such as nimcrypto.MDigest

For the intended beginnings of that see src/nim_status/lib.nim. The idea is to use the most natural Nim types and Nim programming style in lib; users of the library are free to import nim_status/lib if they don't want to use the shimmed procs that imitate status-go. Granted, I just stuck with normal string in hashMessage of src/nim_status/lib.nim, but as you say it can probably be improved with a richer return type like nimcrypto.MDigest.

Once you have this high-level interface defined, you can create a thin translation layer

That's actually what I was/am going for, but being so new to Nim (as in completely new) I may have gone about it in a way that's not as natural or ergonomic as I could have.

I cannot imagine what is gained by imitating the existing C API of Status-Go

My intention was/is to do the imitating only in src/nim_status/go/[c/]/shim.nim, which is a simple wrapper around status-go, and src/nim_status/lib/[c/]/shim.nim, which does the job of imitating status-go for procs in src/nim_status/lib.nim. Those two are combined in src/nim_status.nim and src/nim_status/c/nim_status.nim to provide an imitation that's partly "pure Nim" and partly status-go.

So why have nim_status.nim be a combo of shims that imitate the existing C API of status-go?

To my mind, in the beginning steps, as most of the library relies on wrapping status-go, it seems to enable a more direct swap, i.e. a user like status-react could switch from compiling/linking status-go (libstatus.a) to compiling/linking nim-status (nim_status.a), and it would just work. At the same time, a Nim-based project like nim-status-client can use things implemented in “pure Nim“ via import nim_status/lib (with the more natural types) and get things still only available in status-go via import nim_status/go/shim.

Theoretically, there comes a point where src/nim_status/lib.nim implements all of the functionality and the wrapper around status-go becomes redundant. The wrapper (go/shim.nim) could be deprecated at that point. If/when a project like status-react is able to completely migrate away from the status-go calling conventions to ones in src/nim_status/c/lib.nim then lib/shim.nim could be deprecated.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, I've indeed missed the role of nim_status/lib.

The shims here (in c/lib/shim) make sense as long as you have users currently using Status-Go from pure C that are interested in getting a drop-in replacement implemented in Nim (although, you'll have to cater to other Nim-specific details such as ensuring that setupForeignThreadGc is called from the C code). If you don't have such users though, maintaining that layer can be seen as a currently unnecessary extra work (you can always create it on demand according to the logic behind the YAGNI principle).

If your goal is to develop nim-desktop-client though, I think a different strategy makes more sense. You can try to create the natural Nim interface as soon as possible and integrate it everywhere where Status-Go is currently used. The initial version will delegate the calls to Status-Go by translating them to the native Nim types and gradually more and more calls will be replaced with native Nim code. At no point, you would have to resort to using c_malloc and copyMem on the native Nim side.

@michaelsbradleyjr michaelsbradleyjr merged commit 9fb5c56 into master Aug 31, 2020
@michaelsbradleyjr michaelsbradleyjr deleted the feat/nim-hashmessage branch August 31, 2020 19:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

implement hashMessage in nim
4 participants