read schema from json file #2
larisau commented on Aug 21, 2018 (edited)
- read schema from json file
- support multiple tables in keyspaces
- added mode option: write/read/mixed
- collecting results enhancement
- support multiple tables in keyspace
@penberg can you take a look at this?
@@ -19,6 +19,7 @@ var (
seed int
dropSchema bool
verbose bool
mode string
Let's add a type for this.
Go doesn't have built-in enumerations, but I think the idiom looks something like this:
type Mode string
const (
ReadMode Mode = "read"
MixedMode Mode = "mixed"
)
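A minimal sketch of how the typed mode could be validated when the flag is parsed. `WriteMode` and the `ParseMode` helper are assumptions for illustration (the snippet above only lists two constants, but the flag help mentions a write mode as well); this is not code from the PR.

```go
package main

import (
	"fmt"
	"os"
)

// Mode selects which kind of operations a run performs.
type Mode string

const (
	WriteMode Mode = "write" // assumed: not part of the snippet in the review comment
	ReadMode  Mode = "read"
	MixedMode Mode = "mixed"
)

// ParseMode is a hypothetical helper that turns the raw flag value
// into a Mode and rejects anything that is not a known constant.
func ParseMode(s string) (Mode, error) {
	switch m := Mode(s); m {
	case WriteMode, ReadMode, MixedMode:
		return m, nil
	default:
		return "", fmt.Errorf("unknown mode %q (want write, read, or mixed)", s)
	}
}

func main() {
	m, err := ParseMode("read")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("mode:", m)
}
```

With a named type, comparisons such as `mode != ReadMode` are checked by the compiler instead of relying on string literals scattered through the code.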
agree, const is better, I'll change it
done
cmd/gemini/root.go
Outdated
return sum
type Results interface {
Merge(*Status) Status
Print()
}
Is this interface used somewhere? If not, let's drop it.
it's used for collecting and printing the results, see the runJob below
cmd/gemini/root.go
Outdated
@@ -79,7 +85,7 @@ func run(cmd *cobra.Command, args []string) {
  },
  })
  schema := schemaBuilder.Build()
- if dropSchema {
+ if dropSchema && mode != "read" {
Why do we override the user's decision to drop the schema if the mode is "read"?
because now it's read only, so we need data to be read
cmd/gemini/root.go
Outdated
@@ -172,7 +181,8 @@ func init() {
rootCmd.MarkFlagRequired("test-cluster")
rootCmd.Flags().StringVarP(&oracleClusterHost, "oracle-cluster", "o", "", "Host name of the oracle cluster that provides correct answers")
rootCmd.MarkFlagRequired("oracle-cluster")
rootCmd.Flags().IntVarP(&maxTests, "max-tests", "m", 100, "Maximum number of test iterations to run")
rootCmd.Flags().StringVarP(&mode, "mode", "m", "mixed", "Mode options: write, read, mixed(default)")
Space before parenthesis in the help text.
Perhaps the help text could be improved with something like:
Mode of query operations. Options: write, read, and mixed (default).
done
cmd/gemini/root.go
Outdated
@@ -22,6 +25,8 @@ var (
mode string
)

const confFile = "schema.json"
Why not turn the name of the schema configuration file into a command line option?
we'll do this in the future, when more schema options and data types are supported
cmd/gemini/root.go
Outdated
type jsonSchema struct {
Keyspace gemini.Keyspace `json:"keyspace"`
Tables []gemini.Table `json:"tables"`
}
Why is this separate type for JSON serialization needed? Can't we just make the main gemini.Schema serializable?
Btw, for generality in future patches, we probably ought to move table definitions inside keyspace definitions, and make schema a collection of keyspaces.
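To make the suggested layout concrete, here is a rough sketch of a schema that is a collection of keyspaces, each owning its tables, with JSON tags on the main types so no separate serialization type is needed. All type and field names below are hypothetical and do not reflect the actual `gemini` package.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Column, Table, Keyspace, and Schema are illustrative stand-ins,
// not the real gemini types.
type Column struct {
	Name string `json:"name"`
	Type string `json:"type"`
}

type Table struct {
	Name    string   `json:"name"`
	Columns []Column `json:"columns"`
}

// Keyspace owns its tables instead of tables living beside it.
type Keyspace struct {
	Name   string  `json:"name"`
	Tables []Table `json:"tables"`
}

// Schema is a collection of keyspaces and is directly (de)serializable.
type Schema struct {
	Keyspaces []Keyspace `json:"keyspaces"`
}

// parseSchema decodes a JSON document into a Schema.
func parseSchema(data []byte) (Schema, error) {
	var s Schema
	err := json.Unmarshal(data, &s)
	return s, err
}

func main() {
	raw := []byte(`{"keyspaces":[{"name":"ks1","tables":[{"name":"t1","columns":[{"name":"pk","type":"int"}]}]}]}`)
	s, err := parseSchema(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(s.Keyspaces[0].Name, len(s.Keyspaces[0].Tables))
}
```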
yes, we've planned it for the future
cmd/gemini/root.go
Outdated
}
defer conf.Close()

byteValue, err := ioutil.ReadAll(conf)
You can use ioutil.ReadFile to simplify this:
conf, err := ioutil.ReadFile(confFile)
// handle err here before using conf
var schema jsonSchema
err = json.Unmarshal(conf, &schema)
thanks, really - ReadFile has open and close inside
done
schema.go
Outdated
case "text", "varchar":
values = append(values, randString(randRange(p.Min, p.Max)))
case "timestamp", "date":
values = append(values, randDate())
We could also add (in future patches) varint (arbitrary precision integers) using https://golang.org/pkg/math/big/
sure
schema.go
Outdated
day := randRange(1, 30)
month := randRange(1, 12)
year := randRange(2000, 2018)
return time.Date(year, time.Month(month), day, rand.Intn(24), rand.Intn(60), rand.Intn(60), 0, time.UTC)
This generates incorrect dates such as "2018-02-30" (February does not have that many days) and doesn't take leap years into account, and so on.
It is easier to generate a random number representing seconds since the epoch and use time.Unix to turn that into a time.Time value.
If needed (is it really?), we can limit the minimum and maximum dates as follows, for example:
done
schema.go
Outdated
  case "int_range":
  start := randRange(p.Min, p.Max)
  end := start + randRange(p.Min, p.Max)
  values = append(values, start)
  values = append(values, end)
- case "blob":
+ case "blob", "uuid":
  r, _ := uuid.NewRandom()
  values = append(values, r.String())
I guess we need to improve blob value generation in a follow-up so that it better represents what blobs are used for. Making blobs significantly larger, for example, will make things less easy for Scylla and perhaps uncover some bugs.
yes, it's in our plans
1a07722 to 5daad67
@penberg can we merge it for now?
I don't understand read-only or write-only modes. Read-only won't read anything because there's nothing there, and write-only won't verify anything.
@avikivity The write mode can be used to populate the same data in both clusters, the read mode compares data between the two using randomly generated queries, and the mixed mode does exactly the same - it just always runs a read after a write.
reads and writes should be run in parallel. read-after-write is too simple.
read-after-write jobs are running in parallel - each one works with its partition range.
Reads should be intermingled with writes, to the same partition and clustering keys. That's what real applications do.