Private key does not match the address on AMD GPU #13

Open
nslogx opened this issue Oct 24, 2022 · 7 comments


nslogx commented Oct 24, 2022

Private Key: 0x41f242f2eed3cd137bf92920217729e0388dd162c95e95a42ac5afb83730df6c
Public Key: 0497f968af3e0db0bc498be53824ff504888dd812efa28f8e44106223702640b6bd6104b40388d719bbe1899198a98e76a395478d0a891403f1a80352034eac699
./profanity2.x64 --leading 0 -z 97f968af3e0db0bc498be53824ff504888dd812efa28f8e44106223702640b6bd6104b40388d719bbe1899198a98e76a395478d0a891403f1a80352034eac699
Mode: leading
Target: Address
Devices:
  GPU0: AMD Radeon Pro 580X Compute Engine, 8589934592 bytes available, 36 compute units (precompiled = no)

Initializing OpenCL...
  Creating context...OK
  Compiling kernel...OK
  Building program...OK
  Saving program...OK

Initializing devices...
  This should take less than a minute. The number of objects initialized on each
  device is equal to inverse-size * inverse-multiple. To lower
  initialization time (and memory footprint) I suggest lowering the
  inverse-multiple first. You can do this via the -I switch. Do note that
  this might negatively impact your performance.

  GPU0 initialized

Initialization time: 93 seconds
Running...
  Always verify that a private key generated by this program corresponds to the
  public key printed by importing it to a wallet of your choice. This program
  like any software might contain bugs and it does by design cut corners to
  improve overall performance.

  Time:    94s Score:  5 Private: 0x00006799acac0ea1a6364b298ec855f3f2e2253fdfad1f42dcee84dc3d8b35fa Address: 0x00000cdcc859dadebd04f17f3136e209fb23018a
  Time:    94s Score:  6 Private: 0x00006799acae27d1a6364b298ec855f3f2e2253fdfad1f42dcee84dc3d8b35fe Address: 0x00000085f9e1a2f5f3bfab686d509f488df57396
  Time:    96s Score:  7 Private: 0x00006799acceebcfa6364b298ec855f3f2e2253fdfad1f42dcee84dc3d8b3617 Address: 0x000000090043190e32734d33772b48fa54c64249
^Ctal: 43.868 MH/s - GPU0: 43.868 MH/s
>>> hex((0x41f242f2eed3cd137bf92920217729e0388dd162c95e95a42ac5afb83730df6c + 0x00006799acceebcfa6364b298ec855f3f2e2253fdfad1f42dcee84dc3d8b3617) % 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEFFFFFC2F)
'0x41f2aa8c9ba2b8e3222f7449b03f7fd42b6ff6a2a90bb4e707b4349474bc1583'
>>>

The combined private key does not match the reported address 0x000000090043190e32734d33772b48fa54c64249.
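
The REPL check above can also be written as a small script. This is only a sketch under stated assumptions: private keys on secp256k1 are scalars modulo the group order n, while the constant in the REPL line (ending in `...FFFFFC2F`) looks like the field prime p instead. For these inputs the sum is smaller than n, so no reduction happens and the result is the same either way. The helper names `combine_keys`, `point_add` and `scalar_mult` are made up for illustration and are not part of profanity2. The script also checks the property the `-z` option depends on: adding `offset * G` to the seed public key gives the public key of the combined private key.

```python
# Sketch only: helper names are hypothetical, not profanity2 code.
# secp256k1 domain parameters (standard published constants).
P = 2**256 - 2**32 - 977  # field prime
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141  # group order
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two affine points; None represents the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    if a[0] == b[0] and (a[1] + b[1]) % P == 0:
        return None
    if a == b:
        lam = 3 * a[0] * a[0] * pow(2 * a[1] % P, -1, P) % P
    else:
        lam = (b[1] - a[1]) * pow((b[0] - a[0]) % P, -1, P) % P
    x = (lam * lam - a[0] - b[0]) % P
    return (x, (lam * (a[0] - x) - a[1]) % P)


def scalar_mult(k, point=G):
    """Double-and-add scalar multiplication (not constant time)."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result


def combine_keys(seed_key, offset_key):
    """Combine the seed key with the key printed by the tool, modulo n."""
    return (seed_key + offset_key) % N


# Values copied from the report above.
SEED_PRIVATE = 0x41f242f2eed3cd137bf92920217729e0388dd162c95e95a42ac5afb83730df6c
OFFSET_PRIVATE = 0x00006799acceebcfa6364b298ec855f3f2e2253fdfad1f42dcee84dc3d8b3617

final_key = combine_keys(SEED_PRIVATE, OFFSET_PRIVATE)
seed_pub = scalar_mult(SEED_PRIVATE)
final_pub = scalar_mult(final_key)
shifted_pub = point_add(seed_pub, scalar_mult(OFFSET_PRIVATE))

print("combined private key:", hex(final_key))
print("seed_pub + offset*G == final_pub:", shifted_pub == final_pub)
# Compare this by eye with the string passed to -z.
print("derived seed public key: %064x%064x" % seed_pub)
```

If the derived seed public key differs from the string passed to `-z`, the mismatch comes from the inputs rather than from the GPU.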

nslogx (Author) commented Oct 24, 2022

An Intel UHD Graphics 630 generates correct addresses, while the AMD GPU generates incorrect ones.

nslogx (Author) commented Oct 24, 2022

johguse#27

k06a (Member) commented Oct 24, 2022

This issue seems inherited from the original profanity. Not sure I can resolve it on my own.


Alchemyst0x commented Nov 5, 2022

I seem to be having the same issue - probably should have checked the first results before I went ahead and let it mine away freely on an AWS accelerated compute instance for about a day and a half, but no matter.

I am still figuring out this kind of key generation process conceptually, so there's a chance I did something incorrectly. I am testing my results to see if I can arrive at the expected output that was returned from the run. I did some contract addresses as well as standard addresses; will report back when I have some time to test everything.

I will say, performance was pretty damn impressive though! 405MH/s average on a single V100 GPU.

Thanks for putting the work in to get this up and running again.


Edit: Just tried this on my M1 Max (32-core, 64 GB RAM) and I'm pretty satisfied with the results; they all line up perfectly as well 👍

  Time:     7s Score:  5 Private: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Address: 0x00000xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
  Time:     7s Score:  6 Private: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Address: 0x000000xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
  Time:     7s Score:  8 Private: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Address: 0x00000000xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
  Time:   107s Score:  9 Private: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Address: 0x000000000xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Total: 189.493 MH/s - GPU0: 189.493 MH/s

@gobigobigobigobi

I tried to reproduce the bug on an AMD RX 6950 XT, but it works fine there and generates correct addresses.

$ lspci  | grep  VGA
44:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Device 73a5 (rev c0)
$ cat /etc/issue.net
Ubuntu 20.04.5 LTS
$ md5sum /opt/rocm-5.4.0/include/CL/cl.h
a218f8b7cf7def7c7159216d1718b50f  /opt/rocm-5.4.0/include/CL/cl.h

$ g++ --version
g++ (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.


$ diff  Makefile.orig Makefile
13c13
<       CFLAGS=-c -std=c++11 -Wall -mmmx -O2 -mcmodel=large
---
>       CFLAGS=-c -std=c++11 -Wall -mmmx -O2 -mcmodel=large  -I /opt/rocm-5.4.0/include/

$ make
g++ -c -std=c++11 -Wall -mmmx -O2 -mcmodel=large  -I /opt/rocm-5.4.0/include/  Dispatcher.cpp -o Dispatcher.o
In file included from /opt/rocm-5.4.0/include/CL/cl.h:20,
                 from Dispatcher.hpp:14,
                 from Dispatcher.cpp:1:
/opt/rocm-5.4.0/include/CL/cl_version.h:21:104: note: #pragma message: cl_version.h: CL_TARGET_OPENCL_VERSION is not defined. Defaulting to 220 (OpenCL 2.2)
   21 | #pragma message("cl_version.h: CL_TARGET_OPENCL_VERSION is not defined. Defaulting to 220 (OpenCL 2.2)")
      |                                                                                                        ^
Dispatcher.cpp: In static member function ‘static _cl_command_queue* Dispatcher::Device::createQueue(_cl_context*&, _cl_device_id*&)’:
Dispatcher.cpp:131:34: warning: converting to non-pointer type ‘cl_command_queue_properties’ {aka ‘long unsigned int’} from NULL [-Wconversion-null]
  131 |  cl_command_queue_properties p = NULL;
      |                                  ^~~~
Dispatcher.cpp: In member function ‘void Dispatcher::addDevice(cl_device_id, size_t, size_t)’:
Dispatcher.cpp:225:141: warning: ‘new’ of type ‘Dispatcher::Device’ with extended alignment 32 [-Waligned-new=]
  225 | Context, m_clProgram, clDeviceId, worksizeLocal, m_size, index, m_mode, m_publicKeyX, m_publicKeyY);
      |                                                                                                   ^

Dispatcher.cpp:225:141: note: uses ‘void* operator new(std::size_t)’, which does not have an alignment parameter
Dispatcher.cpp:225:141: note: use ‘-faligned-new’ to enable C++17 over-aligned new support
Dispatcher.cpp: In constructor ‘Dispatcher::Device::Device(Dispatcher&, _cl_context*&, _cl_program*&, cl_device_id, size_t, size_t, size_t, const Mode&, cl_ulong4, cl_ulong4)’:
Dispatcher.cpp:170:1: note: the ABI for passing parameters with 32-byte alignment has changed in GCC 4.6
  170 | Dispatcher::Device::Device(Dispatcher & parent, cl_context & clContext, cl_program & clProgram, cl_device_id clDeviceId, const size_t worksizeLocal, const size_t size, const size_t index, const Mode & mode, cl_ulong4 clSeedX, cl_ulong4 clSeedY) :
      | ^~~~~~~~~~
g++ -c -std=c++11 -Wall -mmmx -O2 -mcmodel=large  -I /opt/rocm-5.4.0/include/  Mode.cpp -o Mode.o
In file included from /opt/rocm-5.4.0/include/CL/cl.h:20,
                 from Mode.hpp:9,
                 from Mode.cpp:1:
/opt/rocm-5.4.0/include/CL/cl_version.h:21:104: note: #pragma message: cl_version.h: CL_TARGET_OPENCL_VERSION is not defined. Defaulting to 220 (OpenCL 2.2)
   21 | #pragma message("cl_version.h: CL_TARGET_OPENCL_VERSION is not defined. Defaulting to 220 (OpenCL 2.2)")
      |                                                                                                        ^
g++ -c -std=c++11 -Wall -mmmx -O2 -mcmodel=large  -I /opt/rocm-5.4.0/include/  precomp.cpp -o precomp.o
In file included from /opt/rocm-5.4.0/include/CL/cl.h:20,
                 from types.hpp:10,
                 from precomp.hpp:4,
                 from precomp.cpp:1:
/opt/rocm-5.4.0/include/CL/cl_version.h:21:104: note: #pragma message: cl_version.h: CL_TARGET_OPENCL_VERSION is not defined. Defaulting to 220 (OpenCL 2.2)
   21 | #pragma message("cl_version.h: CL_TARGET_OPENCL_VERSION is not defined. Defaulting to 220 (OpenCL 2.2)")
      |                                                                                                        ^
g++ -c -std=c++11 -Wall -mmmx -O2 -mcmodel=large  -I /opt/rocm-5.4.0/include/  profanity.cpp -o profanity.o
In file included from /opt/rocm-5.4.0/include/CL/cl.h:20,
                 from profanity.cpp:16:
/opt/rocm-5.4.0/include/CL/cl_version.h:21:104: note: #pragma message: cl_version.h: CL_TARGET_OPENCL_VERSION is not defined. Defaulting to 220 (OpenCL 2.2)
   21 | #pragma message("cl_version.h: CL_TARGET_OPENCL_VERSION is not defined. Defaulting to 220 (OpenCL 2.2)")
      |                                                                                                        ^
g++ -c -std=c++11 -Wall -mmmx -O2 -mcmodel=large  -I /opt/rocm-5.4.0/include/  SpeedSample.cpp -o SpeedSample.o
g++ Dispatcher.o Mode.o precomp.o profanity.o SpeedSample.o -s -lOpenCL -mcmodel=large -o profanity2.x64

$ ./profanity2.x64

usage: ./profanity2 [OPTIONS]

  Mandatory args:
    -z                      Seed public key to start, add it's private key
                            to the "profanity2" resulting private key.

  Basic modes:
    --benchmark             Run without any scoring, a benchmark.
    --zeros                 Score on zeros anywhere in hash.
    --letters               Score on letters anywhere in hash.
    --numbers               Score on numbers anywhere in hash.
    --mirror                Score on mirroring from center.
    --leading-doubles       Score on hashes leading with hexadecimal pairs

  Modes with arguments:
    --leading <single hex>  Score on hashes leading with given hex character.
    --matching <hex string> Score on hashes matching given hex string.

  Advanced modes:
    --contract              Instead of account address, score the contract
                            address created by the account's zeroth transaction.
    --leading-range         Scores on hashes leading with characters within
                            given range.
    --range                 Scores on hashes having characters within given
                            range anywhere.

  Range:
    -m, --min <0-15>        Set range minimum (inclusive), 0 is '0' 15 is 'f'.
    -M, --max <0-15>        Set range maximum (inclusive), 0 is '0' 15 is 'f'.

  Device control:
    -s, --skip <index>      Skip device given by index.
    -n, --no-cache          Don't load cached pre-compiled version of kernel.

  Tweaking:
    -w, --work <size>       Set OpenCL local work size. [default = 64]
    -W, --work-max <size>   Set OpenCL maximum work size. [default = -i * -I]
    -i, --inverse-size      Set size of modular inverses to calculate in one
                            work item. [default = 255]
    -I, --inverse-multiple  Set how many above work items will run in
                            parallell. [default = 16384]

  Examples:
    ./profanity2 --leading f -z HEX_PUBLIC_KEY_128_CHARS_LONG
    ./profanity2 --matching dead -z HEX_PUBLIC_KEY_128_CHARS_LONG
    ./profanity2 --matching badXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXbad -z HEX_PUBLIC_KEY_128_CHARS_LONG
    ./profanity2 --leading-range -m 0 -M 1 -z HEX_PUBLIC_KEY_128_CHARS_LONG
    ./profanity2 --leading-range -m 10 -M 12 -z HEX_PUBLIC_KEY_128_CHARS_LONG
    ./profanity2 --range -m 0 -M 1 -z HEX_PUBLIC_KEY_128_CHARS_LONG
    ./profanity2 --contract --leading 0 -z HEX_PUBLIC_KEY_128_CHARS_LONG

  About:
    profanity2 is a vanity address generator for Ethereum that utilizes
    computing power from GPUs using OpenCL.

  Forked "profanity2":
    Author: 1inch Network <info@1inch.io>
    Disclaimer:
      This project "profanity2" was forked from the original project and
      modified to guarantee "SAFETY BY DESIGN". This means source code of
      this project doesn't require any audits, but still guarantee safe usage.

  From original "profanity":
    Author: Johan Gustafsson <profanity@johgu.se>
    Beer donations: 0x000dead000ae1c8e8ac27103e4ff65f42a4e9203
    Disclaimer:
      Always verify that a private key generated by this program corresponds to
      the public key printed by importing it to a wallet of your choice. This
      program like any software might contain bugs and it does by design cut
      corners to improve overall performance.

$

one of my runs:

$ ./profanity2.x64 --leading 0 -z 97f968af3e0db0bc498be53824ff504888dd812efa28f8e44106223702640b6bd6104b40388d719bbe1899198a98e76a395478d0a891403f1a80352034eac699
Mode: leading
Target: Address
Devices:
  GPU0: gfx1030, 17163091968 bytes available, 40 compute units (precompiled = no)

Initializing OpenCL...
  Creating context...OK
  Compiling kernel...OK
  Building program...OK
  Saving program...OK

Initializing devices...
  This should take less than a minute. The number of objects initialized on each
  device is equal to inverse-size * inverse-multiple. To lower
  initialization time (and memory footprint) I suggest lowering the
  inverse-multiple first. You can do this via the -I switch. Do note that
  this might negatively impact your performance.

  GPU0 initialized

Initialization time: 0 seconds
Running...
  Always verify that a private key generated by this program corresponds to the
  public key printed by importing it to a wallet of your choice. This program
  like any software might contain bugs and it does by design cut corners to
  improve overall performance.

  Time:     0s Score:  5 Private: 0x000069ee915d31ac9c41c019807d54c0f96214091d0599cebf8e88ee081ead8f Address: 0x00000773b6ccf4e9f1b764983fd8fea85e529cf4
  Time:     0s Score:  6 Private: 0x000069ee915fab5f9c41c019807d54c0f96214091d0599cebf8e88ee081ead9b Address: 0x000000ec41782e6dfb2a5e8fc2005ea5a1b6823f
  Time:     2s Score:  7 Private: 0x000069ee9158ee8c9c41c019807d54c0f96214091d0599cebf8e88ee081eae26 Address: 0x0000000eac1d865ea20bf6eff59ae8945531780b
  Time:    23s Score:  8 Private: 0x000069ee917995099c41c019807d54c0f96214091d0599cebf8e88ee081eb79d Address: 0x0000000071483aa7a07460e44005609abb5979bd
^Ctal: 493.227 MH/s - GPU0: 493.227 MH/s

Overall, it generates CORRECT private keys and addresses. It just works.
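
For readers who want to repeat this check without importing keys into a wallet, here is a minimal pure-Python sketch. It assumes that an Ethereum address is the last 20 bytes of the Keccak-256 hash of the uncompressed public key, and that the run above reused the seed key pair posted in the first comment, since the `-z` argument is the same. The function names are hypothetical and none of this is profanity2 code. The script prints whether the derived address matches the reported one; it is not constant time and should not be used for key handling beyond a quick check.

```python
# Sketch only: pure-Python secp256k1 + Keccak-256, helper names are hypothetical.
P = 2**256 - 2**32 - 977  # secp256k1 field prime
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141  # group order
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two affine points; None represents the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    if a[0] == b[0] and (a[1] + b[1]) % P == 0:
        return None
    if a == b:
        lam = 3 * a[0] * a[0] * pow(2 * a[1] % P, -1, P) % P
    else:
        lam = (b[1] - a[1]) * pow((b[0] - a[0]) % P, -1, P) % P
    x = (lam * lam - a[0] - b[0]) % P
    return (x, (lam * (a[0] - x) - a[1]) % P)


def scalar_mult(k, point=G):
    """Double-and-add scalar multiplication."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result


def _rol64(a, n):
    n %= 64
    return ((a << n) | (a >> (64 - n))) & ((1 << 64) - 1)


def _keccak_f1600(lanes):
    """Keccak-f[1600] permutation on a 5x5 array of 64-bit lanes."""
    r = 1
    for _ in range(24):
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4]
             for x in range(5)]
        d = [c[(x + 4) % 5] ^ _rol64(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]  # theta
        x, y = 1, 0
        current = lanes[x][y]
        for t in range(24):  # rho and pi
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x][y] = lanes[x][y], _rol64(current, (t + 1) * (t + 2) // 2)
        for y in range(5):  # chi
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        for j in range(7):  # iota, round constants generated by an LFSR
            r = ((r << 1) ^ ((r >> 7) * 0x71)) % 256
            if r & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    return lanes


def keccak256(data):
    """Original Keccak-256 (padding byte 0x01), not NIST SHA3-256."""
    rate = 136
    padded = bytearray(data) + b"\x01" + b"\x00" * (-(len(data) + 1) % rate)
    padded[-1] |= 0x80
    state = bytearray(200)
    for off in range(0, len(padded), rate):
        for i in range(rate):
            state[i] ^= padded[off + i]
        lanes = [[int.from_bytes(state[8 * (x + 5 * y):8 * (x + 5 * y) + 8], "little")
                  for y in range(5)] for x in range(5)]
        lanes = _keccak_f1600(lanes)
        for x in range(5):
            for y in range(5):
                state[8 * (x + 5 * y):8 * (x + 5 * y) + 8] = lanes[x][y].to_bytes(8, "little")
    return bytes(state[:32])


def address_from_private_key(key):
    """Lowercase hex address for a private key in the range 1..N-1."""
    x, y = scalar_mult(key % N)
    digest = keccak256(x.to_bytes(32, "big") + y.to_bytes(32, "big"))
    return "0x" + digest[-20:].hex()


# Seed key from the first comment; offset and address from the "Score:  8" line.
SEED_PRIVATE = 0x41f242f2eed3cd137bf92920217729e0388dd162c95e95a42ac5afb83730df6c
OFFSET_PRIVATE = 0x000069ee917995099c41c019807d54c0f96214091d0599cebf8e88ee081eb79d
REPORTED = "0x0000000071483aa7a07460e44005609abb5979bd"

derived = address_from_private_key((SEED_PRIVATE + OFFSET_PRIVATE) % N)
print(derived, "matches" if derived == REPORTED else "does NOT match", REPORTED)
```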

Typical VRAM usage:

Every 1.0s: rocm-smi --showmeminfo vram    c11: Wed Feb  1 12:42:20 2023

======================= ROCm System Management Interface =======================
============================= Memory Usage (Bytes) =============================
GPU[0]          : VRAM Total Memory (B): 17163091968
GPU[0]          : VRAM Total Used Memory (B): 746311680
================================================================================
============================= End of ROCm SMI Log ==============================

@gobigobigobigobi

I suspect this is a driver issue, and that using the ROCm 5.4.0 drivers may help.

@gobigobigobigobi

> This issue seems inherited from the original profanity. Not sure I can resolve it on my own.

Indeed, the original profanity had the very same issue:

johguse#39
