Revise basic hardware recs and add cloud-specific recs #2312

Merged
merged 2 commits into from
Jan 29, 2018

jseldess
Contributor

@jseldess jseldess commented Dec 19, 2017

Fixes #2181
Fixes #1132
Fixes #1356

@jseldess jseldess requested a review from bdarnell December 19, 2017 23:23
@cockroach-teamcity
Member

This change is Reviewable

@bdarnell
Contributor

I'm having second thoughts about this because the AWS t2.medium instance advertises 2 vCPUs but throttles them to 20% utilization, which would cause the same problems that I think we're trying to avoid with this recommendation. Let's keep discussing in #2181.
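
To make the concern concrete, here is a minimal sketch of the arithmetic. It assumes the 20% figure is a sustained per-vCPU baseline (an interpretation of the comment above, not a number confirmed in this thread), and the function names are hypothetical.

```python
def effective_cores(vcpus: int, baseline_fraction: float) -> float:
    """Sustained CPU capacity in whole-core equivalents once bursting is exhausted.

    Assumption: baseline_fraction applies to each vCPU independently.
    """
    return vcpus * baseline_fraction


def meets_minimum(vcpus: int, baseline_fraction: float, required_cores: float = 1.0) -> bool:
    """True if the sustained capacity covers the required number of entire cores."""
    return effective_cores(vcpus, baseline_fraction) >= required_cores


# A burstable instance with 2 vCPUs throttled to 20% each sustains less than
# one entire core, so a "2 CPU" minimum would not rule it out on paper.
print(effective_cores(2, 0.20))
print(meets_minimum(2, 0.20))
```

Under that assumption, the instance advertises two vCPUs yet sustains well under the "one entire core" bare minimum, which is why a vCPU count alone does not capture the recommendation.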

@jseldess jseldess changed the title from "Increase bare minimum rec to 2 CPU" to "Revise basic hardware recs and add cloud-specific recs" Jan 24, 2018
@jseldess jseldess changed the title from "Revise basic hardware recs and add cloud-specific recs" to "[WIP] Revise basic hardware recs and add cloud-specific recs" Jan 24, 2018
@jseldess jseldess changed the title from "[WIP] Revise basic hardware recs and add cloud-specific recs" to "Revise basic hardware recs and add cloud-specific recs" Jan 25, 2018
@jseldess
Contributor Author

@bdarnell, @mberhault, please review the revised basic and new cloud-specific hardware recommendations. Before adding them to the cloud tutorials (and replicating them to the v2.0 docs), I'd like to know what's wrong, what needs expansion, etc. Thanks very much for the help thus far.

@jseldess jseldess requested a review from mberhault January 25, 2018 20:01
@bdarnell
Contributor

:lgtm:


Review status: 0 of 2 files reviewed at latest revision, 3 unresolved discussions, all commit checks successful.


v1.1/recommended-production-settings.md, line 20 at r1 (raw file):

- Run each node on a separate machine. Since CockroachDB replicates across nodes, running more than one node per machine increases the risk of data loss if a machine fails. Likewise, if a machine has multiple disks or SSDs, run one node with multiple `--store` flags and not one node per disk. For more details about stores, see [Start a Node](start-a-node.html).

- Configurations with odd numbers of replicas are more robust than those with even numbers. Clusters of three and four nodes can each tolerate one node failure and still reach a majority (2/3 and 3/4 respectively), so the fourth replica doesn't add any extra fault-tolerance. To survive two simultaneous failures, you must have five replicas.

We should clarify that the number of nodes and the replication factor are two different things. The replication factor (set via zone configs) should be an odd number, while you can use any number of nodes (as long as you have at least as many nodes as your replication factor).
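
The majority arithmetic behind this point can be sketched as follows. This is an illustrative calculation only; the helper names are hypothetical and are not part of CockroachDB.

```python
def quorum(replicas: int) -> int:
    """Smallest number of replicas that forms a strict majority."""
    return replicas // 2 + 1


def tolerated_failures(replicas: int) -> int:
    """How many replicas can fail while a majority is still reachable."""
    return replicas - quorum(replicas)


def can_host(nodes: int, replication_factor: int) -> bool:
    """A cluster needs at least as many nodes as the replication factor."""
    return nodes >= replication_factor


for replicas in (3, 4, 5):
    print(replicas, quorum(replicas), tolerated_failures(replicas))
# 3 2 1
# 4 3 1
# 5 3 2
```

Going from three replicas to four raises the majority from 2 to 3 without raising the number of tolerated failures, which is why the replication factor should be odd. The node count is a separate quantity: any count works as long as `can_host` holds for the chosen replication factor.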


v1.1/recommended-production-settings.md, line 35 at r1 (raw file):

- At a bare minimum, each node should have **2 GB of RAM and one entire core**. More data, complex workloads, higher concurrency, and faster performance require additional resources.
    {{site.data.alerts.callout_danger}}Avoid "burstable" or "shared-core" machines that limit the load on a single core.{{site.data.alerts.end}}

Add "virtual" before "machines" here to clarify that this applies to virtualized cloud scenarios and not bare metal.


v1.1/recommended-production-settings.md, line 62 at r1 (raw file):

    For example, Cockroach Labs has used `Standard_L4s` VMs (4 vCPUs and 32 GiB of RAM per VM) for internal testing.
- If you choose local SSD storage, on reboot, the VM can come back with the `ntfs` filesystem. Be sure your automation monitors for this and reformats the disk to the Linux filesystem you chose initially.
- **Do not** use ["burstable" B-series](https://docs.microsoft.com/en-us/azure/virtual-machines/linux/b-series-burstable) VMs, which limit the load on a single core.

If we're going to list machine types to stay away from on Azure, we also need to list A-series (HDD, data corruption) and D-series (irregular disk performance).
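
For the local SSD point quoted above, the monitoring half could look roughly like the sketch below. It assumes a Linux host whose mounts are readable in `/proc/mounts` format; the mount point `/mnt/data`, the expected filesystem `ext4`, and both function names are hypothetical. It only detects a mismatch and does not reformat anything.

```python
from typing import Dict


def parse_mounts(mounts_text: str) -> Dict[str, str]:
    """Map each mount point to its filesystem type from /proc/mounts-style text."""
    fs_by_mount: Dict[str, str] = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            # Field order is: device, mount point, filesystem type, options, ...
            _device, mount_point, fs_type = fields[:3]
            fs_by_mount[mount_point] = fs_type
    return fs_by_mount


def needs_reformat(mounts_text: str, mount_point: str, expected_fs: str) -> bool:
    """True if the mount point is present but carries an unexpected filesystem."""
    actual = parse_mounts(mounts_text).get(mount_point)
    return actual is not None and actual != expected_fs


# Sample text standing in for the contents of /proc/mounts after a reboot.
SAMPLE = """/dev/sda1 / ext4 rw,relatime 0 0
/dev/sdb1 /mnt/data ntfs rw,relatime 0 0
"""
print(needs_reformat(SAMPLE, "/mnt/data", "ext4"))  # True
```

On a real host the text would come from reading `/proc/mounts`. A disk that comes back unmounted would not appear there at all, so a production check would also need to inspect the block device itself.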



@jseldess
Contributor Author

Review status: 0 of 2 files reviewed at latest revision, 3 unresolved discussions.


v1.1/recommended-production-settings.md, line 20 at r1 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

We should clarify that the number of nodes and the replication factor are two different things. The replication factor (set via zone configs) should be an odd number, while you can use any number of nodes (as long as you have at least as many nodes as your replication factor).

If it's ok with you, I'm going to revise/expand this section in a follow-up PR. Already have plans to provide sample cluster patterns and more details about fault tolerance, so I assume this section will change at least a bit.


v1.1/recommended-production-settings.md, line 35 at r1 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Add "virtual" before "machines" here to clarify that this applies to virtualized cloud scenarios and not bare metal.

Done.


v1.1/recommended-production-settings.md, line 62 at r1 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

If we're going to list machine types to stay away from on Azure, we also need to list A-series (HDD, data corruption) and D-series (irregular disk performance).

Done.



@jseldess jseldess merged commit 588530a into master Jan 29, 2018
@jseldess jseldess deleted the minimum-recs branch January 29, 2018 14:19