-
Notifications
You must be signed in to change notification settings - Fork 0
Add conversion for GitHub API response to GitHubIssue
#192
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #192 +/- ##
==========================================
+ Coverage 30.64% 30.99% +0.34%
==========================================
Files 15 16 +1
Lines 979 968 -11
==========================================
Hits 300 300
+ Misses 679 668 -11 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
src/github.rs
Outdated
| const GRAPHQL_ISSUE_NUMBER_ASSERTION: &str = r" | ||
| GraphQL field Issue.number is Int! type, thus always exist. | ||
| And it will not exceed 2^32."; | ||
| const GRAPHQL_PULL_REQUEST_NUMBER_ASSERTION: &str = r" | ||
| GraphQL field PullRequest.number is Int! type, thus always exist. | ||
| And it will not exceed 2^32."; | ||
| const GRAPHQL_ISSUE_CONNECTION_TOTAL_COUNT_ASSERTION: &str = r" | ||
| GraphQL field IssueConnection.totalCount is Int! type, thus always exist. | ||
| And it will not exceed 2^32."; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about removing the assertions and using TryFrom instead, as we've already adopted that approach in discussion.rs?
BTW, the maximum value of i32 is 2^31 - 1, considering negative values and zero.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I implemented:
TryFromfor a conversion- From: GitHub response (
GraphQlResponse<issues::ResponseData>) - To: Our own data structure
GitHubIssueResponse - I thought the conversion is fallible due to incorrect request, etc.
- From: GitHub response (
Fromfor few conversions- From: Structs generated by
graphql-client's codegen (ex)IssuesRepositoryIssuesNodesComments - To: Our own data structures. (ex)
GitHubCommentConnection - I thought those conversions are infallible, because the content of codegen structs and our own data structures are identical.
- From: Structs generated by
The reason I implement From for those conversions is that those conversions are infallible. Rust standard document guides to implement TryFrom for fallible conversions and From for infallible conversions.
Note: This trait must not fail. The
Fromtrait is intended for perfect conversions. If the conversion can fail or is not perfect, useTryFrom.
I think replacing From implementation for infallible conversion to TryFrom makes one kind of misunderstanding - that the conversion is fallible while it is actually infallible.
And those assertions are for expect() in infallible conversions, not for exception handling.
If you think the code is too verbose or longer than needed, I will simplify the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing out that max value of i32 is 2^31-1. I should fix it.
By the way, how about changing types of number, total_count from i32 to unsinged integer such as u32?
The numbers are never be negative number.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think replacing
Fromimplementation for infallible conversion toTryFrommakes one kind of misunderstanding - that the conversion is fallible while it is actually infallible.
That said, the documentation also states:
But From cannot be used to convert from u32 to u16, since that cannot succeed in a lossless way.
I believe the concept of fallibility here is from a technical standpoint.
By the way, how about changing types of number, total_count from i32 to unsigned integer such as u32?
These values are used for GraphQL Int type, which is equivalent to i32.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. I'll implement TryFrom and use pattern such as:
let number: i32 = issue.number.try_into()?;inside try_from(), and remove those string literals for assertions.
src/github.rs
Outdated
| } | ||
|
|
||
| #[test] | ||
| fn convert_response_to_issue_() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about removing the trailing underscore?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure.
src/github.rs
Outdated
| } | ||
|
|
||
| #[cfg(test)] | ||
| mod tests { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you verify that your VSCode is configured as described in the Notion page? There are several warnings in the test module.
"rust-analyzer.check.extraArgs": [
"--all-features",
"--tests",
"--",
"-W",
"clippy::pedantic"
],There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not noticed it was set to --features default. Thanks for the check.
src/github.rs
Outdated
| let graphql_response: GraphQlResponse<issues::ResponseData> = | ||
| serde_json::from_str(response_str).expect("Valid JSON"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about removing response_str and implementing Default for GraphQlResponse<issues::ResponseData> instead?
There are several reason for this suggestion:
serde_json::from_stris outside the scope of this test.response_strtakes up too much space in the file.- If the GraphQL fields change,
response_strmust be exhaustively updated, which is quite burdensome.
I believe we can adopt a similar approach to what's used in issue_stat::test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-
I agree that the string
response_strin unit test is too long. How about separate it to a filetests/issue_response_from_github.json?- As you pointed out, if GitHub GraphQL API changes, we should update the sample json data. However, I think core API of GitHub GraphQL API will not change that frequently.
- I think fetching raw data from GitHub GraphQL API is our core/essential feature. If we can not parse response from GitHub, our dashboard server can not serve to users.
-
Implementing a trait/method for
GraphQlResponse<issues::ResponseData>is impossible because it is foreign type. I suggest (1) not implementing them or (2) implementingissues::ResponseData::new()instead, which gets issue numbers as a parameter.issues::ResponseData::new()can avoid duplicatednumberfields, which will be used as composite key (primary key) of database records.- I think creating object with default values might not simplify our test code much in this test case. For example, our default value for
IssueStateisIssueState::OPEN. For testing resolved issue statistics, we need to create object with default values and modify fields of each object. Also, for another example, we need to modify eachclosed_by_pull_requests's state to be merged, because we deteremine an issue to be resolved if and only if all of itsclosed_by_pull_requestsis merged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- As you pointed out, if GitHub GraphQL API changes, we should update the sample json data. However, I think core API of GitHub GraphQL API will not change that frequently.
Sorry for the confusion. I wasn't referring to the GitHub GraphQL API, but to our own issue.graphql query. Whenever we add a new field for a statistics, this test case will fail unless it's updated accordingly. If we can find a way to prevent the test from failing when new fields are added, I believe that would be deal.
If we can not parse response from GitHub, our dashboard server can not serve to users.
Parsing the JSON is handled by the serde_json crate, so I think that falls outside our direct responsibility.
I suggest (1) not implementing them or (2) implementing issues::ResponseData::new() instead
I'd like to suggest option '(1) not implementing them', following the approach used in discussion.rs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overview of implementations in this PR
The figure below shows overview of implementations in this pull request.
response_str * Response JSON from GitHub GraphQL API,
|| which may contain issues, paging informations,
|| graphql_client crate has integration with error messages
|| serde_json crate. The conversion is done by
|| serde_json crate, while result type is
|| graphql_client::Response<Data>, which is of
|| graphql_client crate.
\/
graphql_client::Response<issues::ResponseData> * Data structure defined by graphql_client crate.
||
|| Conversion by using TryFrom. If success,
|| (1) Issue Data and (2) Paging Information
|| are obtained.
||
||
|| if success
|----------------> (IssuesRepositoryIssuesNodes, paging inforamtions)
|| ||
|| || Conversion by using From.
|| ||
|| \/
\/ +-- Vec<GitHubIssue>
GitHubIssueResponse----+ * Data structure defined by us.
+-- Paging Informations
(has_next_page, end_cursor)
Explanation on conversion implementation
Let me explain my implementation.
I decided to define a struct GitHubIssueResponse to store issue data (issues) and paging informations (has_next_page, end_cursor) and implement conversion from GitHub response, whose type is GraphQlResponse<issues::ResponseData>>, to GitHubIssueResponse by implementing TryFrom.
use graphql_client::{GraphQLQuery, QueryBody, Response as GraphQlResponse};
...
struct GitHubIssueResponse {
issues: Vec<GitHubIssue>,
has_next_page: bool,
end_cursor: Option<String>,
}
...
impl TryFrom<GraphQlResponse<issues::ResponseData>> for GitHubIssueResponse {
...
}
...At first, I thought about implementing a function named parse(), such as:
impl GraphQlResponse<issues::ResponseData>> {
fn parse(&self) -> (Vec<GitHubIssue>, bool, Option<String>) {
...
}
}but it was impossible to define function/method to GraphQlResponse<issues::ResponseData> because it was foreign type. So I change my mind to implement TryFrom. To make the conversion lossless, I decided to define struct GitHubIssueResponse in order to make conversion to work without dropping any information from GitHub response. (Rust standard says that From and TryFrom should be lossless)
If conversion is success, we can get issue data as a IssuesRepositoryIssuesNodes object and paging informations.
I also implemented few conversions using From. They convert struct generated by graphql_client crate's codegen, based on GraphQL schema defined by GitHub. All of them is called if and only if the GitHubIssueResponse::try_from() is success. The reason I chose From is that those conversions are almost 1:1 trivial mapping of fields, so they are infallible and lossless.
impl From<IssuesRepositoryIssuesNodes> for GitHubIssue {
...
}Reason for using JSON in unit tests
The reason I gave JSON as a test input is not to test serde_json crate.
I do not intend to test serde_json, but I'm just calling them in the unit tests.
If I separate JSON to a file named tests/issue_response_from_github.json then the test code looks like:
#[test]
fn convert_response_to_issue_() {
let response_str = include_str!("../tests/issue_response_from_github.json");
let graphql_response: GraphQlResponse<issues::ResponseData> =
serde_json::from_str(response_str).expect("Valid JSON");
let resp = GitHubIssueResponse::try_from(graphql_response)
.expect("Correct data, so parsing should success");
assert!(resp.has_next_page);
assert_eq!(
resp.end_cursor,
Some(String::from(
"Y3Vyc29yOnYyOpK5MjAyMi0wNy0xMlQxODozMzo0MiswOTowMM5Nl-UC"
))
);
// ... More assertions on GitHubIssue objects,
// which can be obtained by converting
// GraphQlResponse<issues::ResponseData>
// to GitHubIssueResponseI want to note that:
- I wanted to test that conversion from GitHub response
GraphQlResponse<issues::ResponseData>toGitHubIssueResponseworks correctly andGitHubIssueResponsecontainsGitHubIssues whose fields are correct. - I assume
serde_json::from_str()will always success even when the request is wrong or response is error. So I usedexpect()to pass exception handling.
If I do not use JSON string,
2-3 lines of testing code such as:
let response_str = include_str!("../tests/issue_response_from_github.json");
let graphql_response: GraphQlResponse<issues::ResponseData> =
serde_json::from_str(response_str).expect("Valid JSON");will turn into:
use graphql_client::Response as GraphQlResponse;
...
let graphql_response = GraphQlResponse {
data: Some(issues::ResponseData {
repository: Some(IssuesRepository {
issues: IssuesRepositoryIssues {
page_info: IssuesRepositoryIssuesPageInfo {
has_next_page: false,
end_cursor: Some(String::from(
"Y3Vyc29yOnYyOpK5MjAyMi0wNy0xMlQxODozMzo0MiswOTowMM5Nl-UC",
)),
},
nodes: Some(vec![
...
]),
},
}),
}),
errors: None,
extensions: None,
};I think this is more burdensome than just storing actual JSON response from GitHub.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
About From vs. TryFrom
I don't see any practical benefit of choosing From over TryFrom in this context. With TryFrom, we avoid the need to define verbose error messages, as internal error messages are automatically propagated automatically by the compiler. It's also a widely adopted pattern in Rust. Could you elaborate on the advantages of your implementation?
About using JSON
I also seriously explored JSON fixtures in #153 (you can find the relevant code in the commits). The reason I moved away from that approach is because, whenever we add a new field for a statistics, this test case will fail unless it's updated accordingly.
In my view, test cases should be additive, not retrospective, meaning it's fine to add a new test case when introducing new functionality, but it's problematic if you have to modify existing test case. In this case, adding a field to a query should be covered by a new test, and shouldn't break previous ones. From a practical standpoint, breaking existing tests would slow down the development pace. Could you share your idea on this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for sharing your ideas, experiences, and histories.
I think I understand feedbacks, and I would summarize the feedbacks before modifying the code.
- Use
TryFrominstead ofFrom- For consistency with codebase such as
discussion.rs - Use of
TryFromis general and reasonable
- For consistency with codebase such as
- Remove JSON fixture
- It is burdensome with API changes - Most API changes (ex
issue.graphqledited by us) will be breaking changes to tests - It is too verbose
- It is burdensome with API changes - Most API changes (ex
I also want to share my idea/reasoning.
- I prefer
FromoverTryFromwhere the conversion never fails- Mainly for ergonomics - It avoids misunderstanding that the conversion can fail while it is actually always success.
- It removes need for handling exceptions to simplify code. No need to think about error messages because it will always success.
- As @danbi2990 pointed out, I might overthink the benefit.
- I prefer adding JSON fixture to test conversion from GitHub response to our own data structure
- I think there will be almost no breaking changes to GitHub API and our request to GitHub
- I think we need strong test cases before implementing paging logic mentioned in Pagination 구현: GitHub GraphQL API 에 요청 및 응답을 파싱할 때 pagination 이용 #181
- JSON is human-readable and we can separate JSON to fixture file, while codegened structs are too verbose to create/construct
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the summary. In addition, I'm not sure we even need a test case for the conversion, since the conversion is already guaranteed by strong typing.
In contrast, issue_stat.rs includes actual calculation logic (such as filtering and statistics), which justifies having tests. Or if we are implementing a public API, then adding test cases may be required by corporate policy.
If testing code-generated structs becomes too verbose, I believe it's reasonable to skip those tests.
I agree that strong test cases are essential for the pagination logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It sounds reasonable.
I think it is good to skip writing test for the conversion.
I will write strong tests for pagination logic in other issue, not in this issue.
fc7f1f5 to
9fb25cd
Compare
src/database/issue.rs
Outdated
| let author = String::from( | ||
| issue | ||
| .author | ||
| .ok_or(anyhow!("Author of GitHub issue always exist."))?, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-
How about using
contextinstead ofor_or(anyhow!(...))? I believe that would be more idiomatic. -
I think the error message is misleading. If the value should always exist, we should use
expect. If the goal is to show an error message when it fails, the message should be something like "Author of GitHub issue is missing".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think author of issue should always exist, but GraphQL schema is defined as:
type Issue implements ... {
...
author: Actor
...
}I will use context because GraphQL schema defines author as an optional field.
src/outbound.rs
Outdated
| .data | ||
| .expect("Response data should exist, although when it is empty or error.") | ||
| .repository | ||
| .ok_or(anyhow!("Wrong repository."))?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be simplified by using context.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll replace both expect and ok_or to context to simplify.
src/database/issue.rs
Outdated
| .author | ||
| .ok_or(anyhow!("Author of GitHub issue always exist."))?, | ||
| ); | ||
| let comments = GitHubIssueCommentConnection::try_from(issue.comments)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about using try_into instead of try_from as shown below?
let comments = issue.comments.try_into()?;This makes the code simpler by allowing the compiler to infer the target type, and it's also more idiomatic.
The same change can be applied to all try_from usages below.
src/database/issue.rs
Outdated
| author, | ||
| body: issue.body, | ||
| state: issue.state, | ||
| assignees: Vec::<String>::from(issue.assignees), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to the above comments, this can be simplified to assignees: issues.assignees.into().
src/outbound.rs
Outdated
| fn try_from(value: GraphQlResponse<issues::ResponseData>) -> Result<Self> { | ||
| let repo = value | ||
| .data | ||
| .expect("Response data should exist, although when it is empty or error.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it correct to assume that data should always exist? According to this link, it can be None if an error was raised during the execution.
I suggest using context("error message")? instead of leaving it uncertain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will accept your suggestion.
It seems that the link you gave is official GraphQL spec.
I think GitHub does not obey the rule in the link.
The reason I assumed data should always exist is that GitHub always returns data even if wrong owner and repo are given.
The screenshot below is a response from GitHub GraphQL API:
I think although GitHub does not obey the rule, it is better to obey the official GraphQL spec.
src/outbound.rs
Outdated
| let nodes = repo | ||
| .issues | ||
| .nodes | ||
| .expect("This field will be always returned even if no issue exist"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you explain why this is always true? Otherwise, consider using context("error message")?.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my opinion, using expect on the assumption that the external API is always safe seems risky. This approach means our software could panic depending on the external API's behavior.
We don't know the details of the GitHub API source code, their infrastructure, or internal guarantees. The only thing we can clearly rely on is their public API contract (schema.graphql), which specifies IssueConnection.nodes: [Issue]. Since the type is nullable, it can be null at any time.
I believe gathering evidence to justify treating IssueConnection.nodes as always safe would be time-consuming, because the public schema explicitly declares it as nullable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will remove all unnecessary assumptions and rely exclusively on schema.graphql in making my judgment.
src/outbound.rs
Outdated
| .issues | ||
| .nodes | ||
| .expect("This field will be always returned even if no issue exist"); | ||
| let issues: Vec<_> = nodes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Vec<_> type declaration seems redundant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it is redundant. I'll remove it.
Thanks for your careful review.
|
@danbi2990 Thanks for your feedbacks. Especially, I recognized that I need to read README of |
85f9bb9 to
2f9ac53
Compare
- Define struct `GitHubIssueResponse` and implement `TryFrom<T>` trait to parse/convert GitHub API response - Add few implementations of `From<T>` trait to convert contents of struct `GitHubIssueResponse` to struct `GitHubIssue`
2f9ac53 to
7eaca6b
Compare

OBJECTIVE: Convert GitHub API Response to our own struct
GitHubIssueusingFrom<T>orTryFrom<T>Define
struct GitHubIssueResponseand implementTryFrom<T>to parse GitHub API response.struct GitHubIssueResponsedefined by ustry_fromreturnsErr. If empty response,try_fromreturnsOk(GitHubIssueResponse).Implement
From<T>to convert typesstructs which is generated byasync-graphql's codegen based on GitHub GraphQL schema (ex)crate::github::issues::ResponseData,crate::github::issues::IssuesRepositoryIssuesNodesComments, ...structs (ex)GitHubIssue,GitHubCommentConnection, ...TryFrom<T>trait forGitHubIssueResponse, these conversions have no need to handle exceptionsfromfor conversion, to avoidclippy::too_many_linesReplace
unwrap_or_defaulttoexpectfor issue number, pr number, total count, etc. (forOption<Vector<T>>, still usingunwrap_or_defaultbecause it returns empty vectorvec![])Unit tests use data of this repository, which is public
Close #185