Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] politeiad: Import legacy records to tstore #1477

Closed
wants to merge 19 commits into from

Conversation

thi4go
Copy link
Member

@thi4go thi4go commented Jul 30, 2021

This diff adds a legacy tool that walks a git repository folder
for records and parses them to tstore. It was decided to treat the
last version of the records as version 1 on tstore, for simplicity
sake. If one wishes, more information about later versions of the
legacy records will always be available through the git repository.
This tool also bypasses some backend validations by inserting the
records and their respective blobs through a tstore connection
directly.

Some signature verification capabilities have been lost in the import
process due to significant data changes on the signed data structures.

The README for this tool contains some considerations about decisions
that were taken during the development process, as well as example
usages, configuration options and the best way to test this locally. The
issue (#1425) also contains some documented decisions if one wishes
to read further into.

Closes #1425

@thi4go thi4go marked this pull request as ready for review October 25, 2021 18:43
@thi4go thi4go changed the title [wip] d: Import legacy records to tstore backend d: Import legacy records to tstore backend Oct 25, 2021
@thi4go
Copy link
Member Author

thi4go commented Oct 25, 2021

This diff is ready to begin the reviewing process. It was decided to leave the cast vote timestamp parsing (issue comment) for last, since it does not affect other parts of the import and it is what's holding off this PR. Once it's done, the timestamp data will be added to the cast vote blob without touching any other part of legacyimport, so we can tackle this independently.

After talks with the team, there are a few things that still need to be implemented:

  • Handle failures gracefully. The tool needs to pick the import from where it left off if any failure happen.
  • Fetch external data on dcrdata and pi API before opening a connection to tstore.

To accomplish this, the tool will add a command that pre-parses data from the git repository, including the external APIs data, and dump it to a specified directory. The tool can then load this data with the import command from disk, and not depend on any external services being up and running.

For handling failures, the import command will check the state of legacy proposals on tstore before doing any appending to the tlog tree, and ignore the data that has already been inserted.

@thi4go thi4go changed the title d: Import legacy records to tstore backend politeiad: Import legacy records to tstore Oct 25, 2021
@thi4go thi4go changed the title politeiad: Import legacy records to tstore [WIP] politeiad: Import legacy records to tstore Oct 29, 2021
CastVote castVoteV1 `json:"castvote"` // Client side vote
Receipt string `json:"receipt"` // Signature of CastVote.Signature
}
type castVoteV1 struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
type castVoteV1 struct {
type castVoteV1 struct {

Email string `json:"email,omitempty"`
Username string `json:"username"`
}
type usersReply struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
type usersReply struct {
type usersReply struct {


Application flags:

`--path` - Path to git record repository. This flag is mandatory.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's mandatory, it should be an argument. Flags should be for optional settings or settings that are configurable but have defaults.

`--path` - Path to git record repository. This flag is mandatory.
`--comments` - Enable `comments.journal` parsing.
`default: false`
`--ballot` - Enable `ballot.journal` parsing.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By default, the command should parse everything. Flags should be used to prevent certain parsing during testing, not enable it.

)

var (
defaultHomeDir = dcrutil.AppDataDir("politeiad", false)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Home directory refers to the application (legacy) home directory. These politeiad variables should have "politeiad" somewhere in the name to indicate that they are not refering to the legacy application. The term "default" also implies that this value can be configured. If this is not the case then "default" should not be used.

return err
}
// TODO: improve
if path == "data" {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this?

// First, check if record is a RFP Parent. If so, save the start
// runoff blob to be saved later on.
if data.VoteMd.LinkBy != 0 {
l.Lock()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of this locking and then manually unlocking instead of using a defer l.Unlock() is not acceptable.

The caches should have helper methods that handle all locking.

if data.VoteMd.LinkBy != 0 {
l.Lock()
if _, ok := l.rfpParents[data.LegacyToken]; ok {
l.rfpParents[data.LegacyToken].Mask = data.VoteDetailsMd.Params.Mask
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

80 col limit. This code is unreadable.

return err
}

fmt.Printf("legacy: Importing %v records to tstore\n", len(records))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't really matter, but why do you prefix all the log statements with "legacy: "? Isn't the legacy tool the only thing that is going to be doing the printing?

@lukebp lukebp closed this Jan 25, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Import legacy git proposals into tstore.
3 participants