Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

read schema from json file #2

Merged
merged 3 commits into from
Oct 2, 2018
Merged

read schema from json file #2

merged 3 commits into from
Oct 2, 2018

Conversation

larisau
Copy link
Contributor

@larisau larisau commented Aug 21, 2018

  • read schema from json file
  • support multiple tables in keyspaces
  • added mode option: write/read/mixed
  • collecting results enhancement

@larisau larisau requested a review from penberg August 21, 2018 15:24
@larisau larisau requested review from noamha and mmatczuk August 26, 2018 13:17
@larisau larisau changed the title added mode option: write/read/mixed read schema from json file Aug 26, 2018
@larisau
Copy link
Contributor Author

larisau commented Sep 9, 2018

@penberg can you take a look on this?

@@ -19,6 +19,7 @@ var (
seed int
dropSchema bool
verbose bool
mode string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add a type for this.

Go doesn't have built-in enumerations, but I think the idiom looks something like this:

type Mode string

const (
  ReadMode Mode = "read"
  MixedMode Mode = "mixed"
)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree, const is better, I'll change it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

return sum
type Results interface {
Merge(*Status) Status
Print()
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this interface used somewhere? If not, let's drop it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's used for collecting and printing the results, see the runJob below

@@ -79,7 +85,7 @@ func run(cmd *cobra.Command, args []string) {
},
})
schema := schemaBuilder.Build()
if dropSchema {
if dropSchema && mode != "read" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we override user's decision to drop schema if mode is "read"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because now it's read only, so we need data to be read

@@ -172,7 +181,8 @@ func init() {
rootCmd.MarkFlagRequired("test-cluster")
rootCmd.Flags().StringVarP(&oracleClusterHost, "oracle-cluster", "o", "", "Host name of the oracle cluster that provides correct answers")
rootCmd.MarkFlagRequired("oracle-cluster")
rootCmd.Flags().IntVarP(&maxTests, "max-tests", "m", 100, "Maximum number of test iterations to run")
rootCmd.Flags().StringVarP(&mode, "mode", "m", "mixed", "Mode options: write, read, mixed(default)")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Space before parenthesis in the help text.

Perhaps the help text could be improved with something like:

Mode of query operations. Options: write, read, and mixed (default).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -22,6 +25,8 @@ var (
mode string
)

const confFile = "schema.json"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not turn the name of the schema configuration file into a command line option?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we'll do this in the future, when more schema options and data types will be supported

type jsonSchema struct {
Keyspace gemini.Keyspace `json:"keyspace"`
Tables []gemini.Table `json:"tables"`
}
Copy link
Contributor

@penberg penberg Sep 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this separate type for JSON serialization needed? Can't we just make the main gemini.Schema serializable?

Btw, for generality in future patches, we probably ought to move table definitions inside keyspace definitions, and make schema a collection of keyspaces.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, we planned it for future

}
defer conf.Close()

byteValue, err := ioutil.ReadAll(conf)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use ioutil.ReadFile to simplify this:

conf, err := ioutil.ReadFile(confFile)
var schema jsonSchema
err := json.Unmarshal(conf, &schema)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, really - ReadFile has open and close inside

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

schema.go Outdated
case "text", "varchar":
values = append(values, randString(randRange(p.Min, p.Max)))
case "timestamp", "date":
values = append(values, randDate())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could also add (in future patches) varint (arbitrary precision integers) using https://golang.org/pkg/math/big/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure

schema.go Outdated
day := randRange(1, 30)
month := randRange(1, 12)
year := randRange(2000, 2018)
return time.Date(year, time.Month(month), day, rand.Intn(24), rand.Intn(60), rand.Intn(60), 0, time.UTC)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This generates incorrect dates such as "2018-02-30" (there are not that many days in February) and doesn't take account leap years and so on.

It is easier to generate a random number representing seconds since epoch and use time.Unix to turn that into a time.Time type.

If needed (is it really?), we can limit the minimum and maximum dates as follows, for example:

https://stackoverflow.com/a/43497333

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

schema.go Outdated
case "int_range":
start := randRange(p.Min, p.Max)
end := start + randRange(p.Min, p.Max)
values = append(values, start)
values = append(values, end)
case "blob":
case "blob", "uuid":
r, _ := uuid.NewRandom()
values = append(values, r.String())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we need to improve blob value generation in a follow up to be something else then that better represents what blobs are used for. Making blogs significantly larger, for example, will make things less easy for Scylla and perhaps uncover some bugs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, it's in our plans

@larisau
Copy link
Contributor Author

larisau commented Sep 23, 2018

@penberg can we merge it for now?

@avikivity
Copy link
Member

I don't understand read-only or write-only modes. Read-only won't read anything because there's nothing there, and write-only won't verify anything.

@larisau
Copy link
Contributor Author

larisau commented Sep 23, 2018

@avikivity The write mode can be used to populate the same data in both clusters, the read mode compares data between the two using randomly generated queries, the the mixed mode does exactly the same - it just always runs read after write.

@avikivity
Copy link
Member

reads and writes should be run in parallel. read-after-write is too simple.

@larisau
Copy link
Contributor Author

larisau commented Sep 24, 2018

read-after-write jobs are running in parallel - each one works with its partition range.
In our plans:

  • save on the "oracle" pattern only
  • writes protected by lock, so no need for partition range

@avikivity
Copy link
Member

Reads should be intermingled with writes, to the same partition and clustering keys. That's what real applications do.

@larisau larisau merged commit 7309af4 into master Oct 2, 2018
@larisau larisau deleted the mode_options branch October 22, 2018 09:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants