Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add p/uuid #2076

Open
wants to merge 31 commits into
base: master
Choose a base branch
from
Open

feat: add p/uuid #2076

wants to merge 31 commits into from

Conversation

DIGIX666
Copy link
Contributor

I suggest adding a UUID package because I think it could be useful to have it available. I think it can further facilitate traceability and compatibility between different systems and apps as they would have a common standard for identifying entities.

To improve UUID generation, I wanted to add a resolver like Snowflake, but from what I've searched, sync/atomic is not supported on Gno. Is there another similar package that I might have missed ?

@DIGIX666 DIGIX666 requested review from a team as code owners May 12, 2024 15:39
@DIGIX666 DIGIX666 requested review from harry-hov and piux2 and removed request for a team May 12, 2024 15:39
@github-actions github-actions bot added the 🧾 package/realm Tag used for new Realms or Packages. label May 12, 2024
Copy link

codecov bot commented May 12, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 63.10%. Comparing base (af05780) to head (3a18918).
Report is 6 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2076      +/-   ##
==========================================
- Coverage   63.32%   63.10%   -0.22%     
==========================================
  Files         548      548              
  Lines       78511    81134    +2623     
==========================================
+ Hits        49719    51203    +1484     
- Misses      25438    26511    +1073     
- Partials     3354     3420      +66     
Flag Coverage Δ
contribs/gnodev 61.11% <ø> (+0.53%) ⬆️
contribs/gnofaucet 14.82% <ø> (ø)
gno.land 67.37% <ø> (+0.18%) ⬆️
gnovm 67.88% <ø> (ø)
misc/genstd 79.72% <ø> (ø)
tm2 62.42% <ø> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@DIGIX666 DIGIX666 changed the title feat: initial p/uuid dapp feat: add p/uuid May 12, 2024
@notJoon
Copy link
Member

notJoon commented May 13, 2024

To improve UUID generation, I wanted to add a resolver like Snowflake, but from what I've searched, sync/atomic is not supported on Gno. Is there another similar package that I might have missed ?

There is no other way to guarantee atomicity besides using time in environments where atomic is not supported. If we consider it further, it seems possible to generate a random value (using crypto/rand or not merged yet, but maybe later) like a machine ID, assign it, and then combine it. Of course, this is slightly different from the implementation of Snowflake.

@DIGIX666
Copy link
Contributor Author

To improve UUID generation, I wanted to add a resolver like Snowflake, but from what I've searched, sync/atomic is not supported on Gno. Is there another similar package that I might have missed ?

There is no other way to guarantee atomicity besides using time in environments where atomic is not supported. If we consider it further, it seems possible to generate a random value (using crypto/rand or not merged yet, but #1933 later) like a machine ID, assign it, and then combine it. Of course, this is slightly different from the implementation of Snowflake.

Thanks for your explanation. I will check the progress of crypto/rand to try to use it.

Copy link
Member

@zivkovicmilos zivkovicmilos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is super cool, I can already see myself using it :)

I've left a few comments, otherwise it looks good 💯

examples/gno.land/p/demo/uuid/uuid.gno Outdated Show resolved Hide resolved
examples/gno.land/p/demo/uuid/uuid.gno Outdated Show resolved Hide resolved
examples/gno.land/p/demo/uuid/uuid.gno Outdated Show resolved Hide resolved

func NewUUID() *UUID {
u := &UUID{
StartTime: time.Date(2008, 11, 10, 23, 0, 0, 0, time.UTC).Unix() * 1000,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is fixed, why not just use the constant unix time for this value?

Also, what is this value?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed you are right it would be better to provide the data directly in Unix which would avoid recalculating them at each function call.
The value used is the one used in snowflake because it is based on the fact that go playground starts with the temp of 2009-11-10 23:00:00 UTC.
So I took the same as that of snowflake because I did not find the time.Now() of gno playground and I believe that in gno time.Now() is based on the return of the block time rather than the system time.
SCR-20240820-pcie

Do you have more information on this subject?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in gno time.Now() is based on the return of the block time rather than the system time.

Yes, it's the block execution context time, and same for all transactions within a block

Can you add a comment, and change this value to unix?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in gno time.Now() is based on the return of the block time rather than the system time.

Yes, it's the block execution context time, and same for all transactions within a block

Can you add a comment, and change this value to unix?

Yep, I had already made this modification in this commit :)
353beb1

examples/gno.land/p/demo/uuid/uuid.gno Outdated Show resolved Hide resolved
examples/gno.land/p/demo/uuid/uuid_test.gno Outdated Show resolved Hide resolved
examples/gno.land/p/demo/uuid/uuid_test.gno Outdated Show resolved Hide resolved
examples/gno.land/p/demo/uuid/uuid_test.gno Outdated Show resolved Hide resolved
examples/gno.land/p/demo/uuid/uuid_test.gno Outdated Show resolved Hide resolved
return buf
}

func bytesToHex(bs []byte) string {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about implementing this as:

return "0x + hex.EncodeToString(bs)`

?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the idea of ​​using hex.EncodeToString() I hadn't thought of it.
However I think it's better to stick to the standard generation of UUIDs because 0x indicates that the value is in hexadecimal but is not used in UUID display standards. For the structure you have to keep the - to delimit the 5 groups and keep a quick visibility.

What do you think of this implementation?

hexStr := hex.EncodeToString(bytes)
	return ufmt.Sprintf("%s-%s-%s-%s-%s",
		hexStr[0:8],
		hexStr[8:12],
		hexStr[12:16],
		hexStr[16:20],
		hexStr[20:],
	)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good

examples/gno.land/p/demo/uuid/uuid.gno Outdated Show resolved Hide resolved
examples/gno.land/p/demo/uuid/uuid.gno Outdated Show resolved Hide resolved
examples/gno.land/p/demo/uuid/uuid.gno Outdated Show resolved Hide resolved
Comment on lines 23 to 25
TimestampLength uint8 = 32
MachineIDLength uint64 = 10
SequenceLength uint64 = 12
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

more than "length" (which I understand immediately as bytes), it's bits (ie. TimestampBits).

also, why typed? they could just be untyped constants.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank for the clarification, I thinked length it is a correct name for length of bits.

I chose to keep these constants typed for several reasons:

  1. Clarity: Explicit typing makes it clear that these constants are used in bit-related operations.

  2. Error prevention: Ensure that operations follow expected constraints and avoid compilation errors, especially for bit shifts.

  3. Performance: Typing optimizes performance and avoids implicit conversions.

  4. Scalability: Code compatibility if the logic needs to evolve.

If untyped constants are more appropriate in this case, I'm open to hearing :)

examples/gno.land/p/demo/uuid/uuid.gno Outdated Show resolved Hide resolved
machineIDMoveLength = SequenceLength
timestampMoveLength = MachineIDLength + SequenceLength

startTime uint32 = 1226354400 // 10 Nov 2008 23:00:00 UTC in seconds (Snowflake epoch)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this for some kind of compatibility, or could it just be 1/1/24?

i see no reason to use epochs from other formats unless it's to ensure some kind of sequencing / compatiblity from existing IDs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand your opinion and yes it's possible to just use 1/1/24 but can you check my comment with @zivkovicmilos on this subject ? 🙏🏼
#2076 (comment)

Comment on lines 94 to 109
// Copy transformed ID into the second half of the array
copy(bytes[8:], uint64ToBytes(id))

// Use bits from the Snowflake ID to generate the first half of the UUID
bytes[0] = byte(id >> 60)
bytes[1] = byte(id >> 52)
bytes[2] = byte(id >> 44)
bytes[3] = byte(id >> 36)
bytes[4] = byte(id >> 28)
bytes[5] = byte(id >> 20)
bytes[6] = byte(id >> 12)
bytes[7] = byte(id >> 4)

// set the version and variant bits according to UUID specification.
bytes[6] = (bytes[6] & 0x0f) | 0x40 // version 4 (random)
bytes[8] = (bytes[8] & 0x3f) | 0x80 // variant 1 (RFC 4122)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note, we're using the same id both in uint64ToBytes, and then copying it over in a slightly modified manner later in this function

I don't particularly like this. We're restricting the number of bits allotted to a timestamp to 32 bits, so we can generate 64-bit integers; when really UUIDs are 122-bits, so we could reasonably have timestamps of at the very least 40 bits (which IMO is more reasonable).

Additionally, this means that there are not 2^122 possible IDs out of this package, but only 2^64.

There needs to be a decision: is this a Snowflake ID generator or is this a UUID generator? Either is fine, but understanding the target is important; and if we use UUIDs we should try to use all 122 bits at our disposal to try to create really unique identifiers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand what you're saying. Originally, I had opted for a 64-bit range, but I ran into a uniqueness problem with the entropy value, which was in uint32.

So, to use the full 64-bit range, I changed TimestampBits to 42 bits. Let me know what you think ;)
859cae9

This is a Snowflake ID generator because I want to keep the Timestamp present.

Copy link
Contributor

@leohhhn leohhhn Nov 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@DIGIX666 - as @thehowl said, why not call it snowflakeID and implement it like that then?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because I thought keeping the name UUID was more explicit, but in light of your feedback and my desire to stick with a Snowflake ID generator system, I will change the names to maintain consistency throughout 👍🏼
3a18918

Co-authored-by: Morgan <git@howl.moe>
Comment on lines 57 to 59
current := u.getEntropy()
elapsedTime := current - u.startTime
u.lastTimestamp = current
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are you making math stuff between random and timestamp?
it looks like just waste CPU.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't "current" be the current time?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because at first I used classic incrementation for current because time.Now() didn't work properly. So I replaced it with entropy once merge to decrease the chances of prediction. But now I'm testing to use time.Now() and entropy

Output with entropy :
SCR-20240917-tava

Copy link
Member

@moul moul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems not "gno-ified" enough :)

can you also provide some examples, like a screenshot or something so that we can see various generated output examples just by reviewing the PR, please?

@DIGIX666
Copy link
Contributor Author

seems not "gno-ified" enough :)

can you also provide some examples, like a screenshot or something so that we can see various generated output examples just by reviewing the PR, please?

Output this my last commit 485ce8d:
SCR-20240917-txql

Copy link
Contributor Author

@DIGIX666 DIGIX666 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finish corrective review

Comment on lines 23 to 25
TimestampLength uint8 = 32
MachineIDLength uint64 = 10
SequenceLength uint64 = 12
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank for the clarification, I thinked length it is a correct name for length of bits.

I chose to keep these constants typed for several reasons:

  1. Clarity: Explicit typing makes it clear that these constants are used in bit-related operations.

  2. Error prevention: Ensure that operations follow expected constraints and avoid compilation errors, especially for bit shifts.

  3. Performance: Typing optimizes performance and avoids implicit conversions.

  4. Scalability: Code compatibility if the logic needs to evolve.

If untyped constants are more appropriate in this case, I'm open to hearing :)

Comment on lines 94 to 109
// Copy transformed ID into the second half of the array
copy(bytes[8:], uint64ToBytes(id))

// Use bits from the Snowflake ID to generate the first half of the UUID
bytes[0] = byte(id >> 60)
bytes[1] = byte(id >> 52)
bytes[2] = byte(id >> 44)
bytes[3] = byte(id >> 36)
bytes[4] = byte(id >> 28)
bytes[5] = byte(id >> 20)
bytes[6] = byte(id >> 12)
bytes[7] = byte(id >> 4)

// set the version and variant bits according to UUID specification.
bytes[6] = (bytes[6] & 0x0f) | 0x40 // version 4 (random)
bytes[8] = (bytes[8] & 0x3f) | 0x80 // variant 1 (RFC 4122)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand what you're saying. Originally, I had opted for a 64-bit range, but I ran into a uniqueness problem with the entropy value, which was in uint32.

So, to use the full 64-bit range, I changed TimestampBits to 42 bits. Let me know what you think ;)
859cae9

This is a Snowflake ID generator because I want to keep the Timestamp present.

examples/gno.land/p/demo/uuid/uuid.gno Outdated Show resolved Hide resolved
examples/gno.land/p/demo/uuid/uuid.gno Outdated Show resolved Hide resolved
Comment on lines 57 to 59
current := u.getEntropy()
elapsedTime := current - u.startTime
u.lastTimestamp = current
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because at first I used classic incrementation for current because time.Now() didn't work properly. So I replaced it with entropy once merge to decrease the chances of prediction. But now I'm testing to use time.Now() and entropy

Output with entropy :
SCR-20240917-tava

machineIDMoveLength = SequenceLength
timestampMoveLength = MachineIDLength + SequenceLength

startTime uint32 = 1226354400 // 10 Nov 2008 23:00:00 UTC in seconds (Snowflake epoch)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand your opinion and yes it's possible to just use 1/1/24 but can you check my comment with @zivkovicmilos on this subject ? 🙏🏼
#2076 (comment)

@zivkovicmilos
Copy link
Member

Hey @DIGIX666, can you please update this branch with master? 🙏

examples/gno.land/p/demo/uuid/uuid.gno Outdated Show resolved Hide resolved
Comment on lines 94 to 109
// Copy transformed ID into the second half of the array
copy(bytes[8:], uint64ToBytes(id))

// Use bits from the Snowflake ID to generate the first half of the UUID
bytes[0] = byte(id >> 60)
bytes[1] = byte(id >> 52)
bytes[2] = byte(id >> 44)
bytes[3] = byte(id >> 36)
bytes[4] = byte(id >> 28)
bytes[5] = byte(id >> 20)
bytes[6] = byte(id >> 12)
bytes[7] = byte(id >> 4)

// set the version and variant bits according to UUID specification.
bytes[6] = (bytes[6] & 0x0f) | 0x40 // version 4 (random)
bytes[8] = (bytes[8] & 0x3f) | 0x80 // variant 1 (RFC 4122)
Copy link
Contributor

@leohhhn leohhhn Nov 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@DIGIX666 - as @thehowl said, why not call it snowflakeID and implement it like that then?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
📦 ⛰️ gno.land Issues or PRs gno.land package related 📦 🤖 gnovm Issues or PRs gnovm related 🧾 package/realm Tag used for new Realms or Packages.
Projects
Status: In Progress
Status: In Review
Development

Successfully merging this pull request may close these issues.

7 participants