Auto merge of #74480 - yoshuawuyts:hardware_threads, r=dtolnay
Add std::thread::available_concurrency

This PR adds a counterpart to [C++'s `std::thread::hardware_concurrency`](https://en.cppreference.com/w/cpp/thread/thread/hardware_concurrency) to Rust, tracking issue #74479.

cc/ `@rust-lang/libs`

## Motivation

Knowing how many hardware threads a platform supports is a core part of building multi-threaded code. C++11 made this available through the [`std::thread::hardware_concurrency`](https://en.cppreference.com/w/cpp/thread/thread/hardware_concurrency) API. In Rust, most of the ecosystem currently depends on the [`num_cpus` crate](https://docs.rs/num_cpus/1.13.0/num_cpus/) ([no. 35 in the top 500 crates](https://docs.google.com/spreadsheets/d/1wwahRMHG3buvnfHjmPQFU4Kyfq15oTwbfsuZpwHUKc4/edit#gid=1253069234)) for this functionality. This PR proposes an API that provides access to the number of hardware threads available on a given platform.

__edit (2020-07-24):__ The purpose of this PR is to provide a hint for how many threads to spawn to saturate the processor. There's value in introducing APIs for NUMA and Windows processor groups, but those are intentionally out of scope for this PR. See: #74480 (comment).
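To illustrate the "hint for how many threads to spawn" use case, here is a minimal sketch that splits a sum across as many workers as the hint suggests. `parallel_sum` is a hypothetical helper, not part of this PR. The function added here is unstable, so the sketch assumes a newer toolchain where the same functionality is stable under the name `std::thread::available_parallelism`.

```rust
use std::thread;

/// Hypothetical helper: sums `data` using roughly `workers` threads.
fn parallel_sum(data: Vec<u64>, workers: usize) -> u64 {
    // Treat a hint of 0 as 1 so the chunk size computation stays valid.
    let workers = workers.max(1);
    // Ceiling division, clamped to at least 1 because `chunks` panics on 0.
    let chunk_len = ((data.len() + workers - 1) / workers).max(1);

    // Spawn one thread per chunk; each thread owns a copy of its chunk.
    let handles: Vec<_> = data
        .chunks(chunk_len)
        .map(|chunk| {
            let owned = chunk.to_vec();
            thread::spawn(move || owned.iter().sum::<u64>())
        })
        .collect();

    // Join every worker and add up the partial sums.
    handles
        .into_iter()
        .map(|h| h.join().expect("worker thread panicked"))
        .sum()
}

fn main() {
    // Assumption: `available_parallelism` is the later, stabilized name of
    // the API proposed here. Fall back to 1 if the count is unknown.
    let hint = thread::available_parallelism().map(|n| n.get()).unwrap_or(1);
    let total = parallel_sum((1..=1000).collect(), hint);
    println!("sum computed with a hint of {} threads: {}", hint, total);
}
```

The value is only a hint: the result is the same for any worker count, and only the degree of parallelism changes.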

## Naming

Discussing the naming of the API on Zulip surfaced two options:

- `std::thread::hardware_concurrency`
- `std::thread::hardware_threads`

Both options seemed acceptable, but overall people seemed to gravitate most towards `hardware_threads`. Additionally, `@jonas-schievink` pointed out that the "hardware threads" terminology is well established and is used in, among others, the [RISC-V specification](https://riscv.org/specifications/isa-spec-pdf/) (page 20):

> A component is termed a core if it contains an independent instruction fetch unit. A RISC-V-compatible core might support multiple RISC-V-compatible __hardware threads__, or harts, through multithreading.

It's also worth noting that [the original paper introducing C++'s `std::thread` submodule](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2320.html) unfortunately doesn't feature any discussion on the naming of `hardware_concurrency`, so we can't use that to help inform our decision here.

## Return type

An important consideration `@joshtriplett` brought up is that we don't want to default to `1` on platforms where the number of available threads cannot be retrieved. Instead, we want to tell users that we don't know and let them handle that case. This is why this PR uses `Option<NonZeroUsize>` as its return type, where `None` is returned on platforms where the number of available hardware threads is unknown.

The reasoning for `NonZeroUsize` over `usize` is that if the number of threads for a platform is known, it will always be at least 1. As the example shows, the `NonZero*` family of APIs may not currently be the most ergonomic to use, but improving their ergonomics is something I think we can address separately.
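A minimal sketch of what this return type means for callers, assuming the `Option<NonZeroUsize>` shape described in this section. `hardware_threads_stub` and `worker_count` are hypothetical names used only for illustration. The stub stands in for a real OS query so that both the known and the unknown case can be shown.

```rust
use std::num::NonZeroUsize;

/// Hypothetical stand-in for the proposed API. A real implementation would
/// query the OS; here a raw count of 0 models "the count is unknown".
fn hardware_threads_stub(raw: usize) -> Option<NonZeroUsize> {
    // `NonZeroUsize::new` returns `None` for 0 and `Some` otherwise.
    NonZeroUsize::new(raw)
}

/// The caller, not the library, decides what to do when the count is unknown.
fn worker_count(reported: Option<NonZeroUsize>, fallback: usize) -> usize {
    reported.map(NonZeroUsize::get).unwrap_or(fallback)
}

fn main() {
    // Known count: the reported value is used as-is.
    println!("{}", worker_count(hardware_threads_stub(8), 1));
    // Unknown count: the caller's chosen fallback is used.
    println!("{}", worker_count(hardware_threads_stub(0), 1));
}
```

Because the fallback is chosen at the call site, a build tool might pick `1` while a server might refuse to start. That decision is what a silent library default of `1` would have hidden.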

## Implementation

`@Mark-Simulacrum` pointed out that most of the code we wanted to expose here was already available under `libtest`. So this PR mostly moves the internal code of libtest into a public API.
bors committed Oct 18, 2020
2 parents cbc42a0 + 3717646 commit c38ddb8
Showing 4 changed files with 166 additions and 101 deletions.
157 changes: 157 additions & 0 deletions library/std/src/thread/available_concurrency.rs
@@ -0,0 +1,157 @@
use crate::io;
use crate::num::NonZeroUsize;

/// Returns the number of hardware threads available to the program.
///
/// This value should be considered only a hint.
///
/// # Platform-specific behavior
///
/// If interpreted as the number of actual hardware threads, it may undercount on
/// Windows systems with more than 64 hardware threads. If interpreted as the
/// available concurrency for that process, it may overcount on Windows systems
/// when limited by a process wide affinity mask or job object limitations, and
/// it may overcount on Linux systems when limited by a process wide affinity
/// mask or affected by cgroups limits.
///
/// # Errors
///
/// This function will return an error in the following situations, but is not
/// limited to just these cases:
///
/// - If the number of hardware threads is not known for the target platform.
/// - The process lacks permissions to view the number of hardware threads
/// available.
///
/// # Examples
///
/// ```
/// # #![allow(dead_code)]
/// #![feature(available_concurrency)]
/// use std::thread;
///
/// let count = thread::available_concurrency().map(|n| n.get()).unwrap_or(1);
/// ```
#[unstable(feature = "available_concurrency", issue = "74479")]
pub fn available_concurrency() -> io::Result<NonZeroUsize> {
available_concurrency_internal()
}

cfg_if::cfg_if! {
if #[cfg(windows)] {
#[allow(nonstandard_style)]
fn available_concurrency_internal() -> io::Result<NonZeroUsize> {
#[repr(C)]
struct SYSTEM_INFO {
wProcessorArchitecture: u16,
wReserved: u16,
dwPageSize: u32,
lpMinimumApplicationAddress: *mut u8,
lpMaximumApplicationAddress: *mut u8,
dwActiveProcessorMask: *mut u8,
dwNumberOfProcessors: u32,
dwProcessorType: u32,
dwAllocationGranularity: u32,
wProcessorLevel: u16,
wProcessorRevision: u16,
}
extern "system" {
fn GetSystemInfo(info: *mut SYSTEM_INFO) -> i32;
}
let res = unsafe {
let mut sysinfo = crate::mem::zeroed();
GetSystemInfo(&mut sysinfo);
sysinfo.dwNumberOfProcessors as usize
};
match res {
0 => Err(io::Error::new(io::ErrorKind::NotFound, "The number of hardware threads is not known for the target platform")),
cpus => Ok(unsafe { NonZeroUsize::new_unchecked(cpus) }),
}
}
} else if #[cfg(any(
target_os = "android",
target_os = "cloudabi",
target_os = "emscripten",
target_os = "fuchsia",
target_os = "ios",
target_os = "linux",
target_os = "macos",
target_os = "solaris",
target_os = "illumos",
))] {
fn available_concurrency_internal() -> io::Result<NonZeroUsize> {
match unsafe { libc::sysconf(libc::_SC_NPROCESSORS_ONLN) } {
-1 => Err(io::Error::last_os_error()),
0 => Err(io::Error::new(io::ErrorKind::NotFound, "The number of hardware threads is not known for the target platform")),
cpus => Ok(unsafe { NonZeroUsize::new_unchecked(cpus as usize) }),
}
}
} else if #[cfg(any(target_os = "freebsd", target_os = "dragonfly", target_os = "netbsd"))] {
fn available_concurrency_internal() -> io::Result<NonZeroUsize> {
use crate::ptr;

let mut cpus: libc::c_uint = 0;
let mut cpus_size = crate::mem::size_of_val(&cpus);

unsafe {
cpus = libc::sysconf(libc::_SC_NPROCESSORS_ONLN) as libc::c_uint;
}

// Fallback approach in case of errors or no hardware threads.
if cpus < 1 {
let mut mib = [libc::CTL_HW, libc::HW_NCPU, 0, 0];
let res = unsafe {
libc::sysctl(
mib.as_mut_ptr(),
2,
&mut cpus as *mut _ as *mut _,
&mut cpus_size as *mut _ as *mut _,
ptr::null_mut(),
0,
)
};

// Handle errors if any.
if res == -1 {
return Err(io::Error::last_os_error());
} else if cpus == 0 {
return Err(io::Error::new(io::ErrorKind::NotFound, "The number of hardware threads is not known for the target platform"));
}
}
Ok(unsafe { NonZeroUsize::new_unchecked(cpus as usize) })
}
} else if #[cfg(target_os = "openbsd")] {
fn available_concurrency_internal() -> io::Result<NonZeroUsize> {
use crate::ptr;

let mut cpus: libc::c_uint = 0;
let mut cpus_size = crate::mem::size_of_val(&cpus);
let mut mib = [libc::CTL_HW, libc::HW_NCPU, 0, 0];

let res = unsafe {
libc::sysctl(
mib.as_mut_ptr(),
2,
&mut cpus as *mut _ as *mut _,
&mut cpus_size as *mut _ as *mut _,
ptr::null_mut(),
0,
)
};

// Handle errors if any.
if res == -1 {
return Err(io::Error::last_os_error());
} else if cpus == 0 {
return Err(io::Error::new(io::ErrorKind::NotFound, "The number of hardware threads is not known for the target platform"));
}

Ok(unsafe { NonZeroUsize::new_unchecked(cpus as usize) })
}
} else {
// FIXME: implement on vxWorks, Redox, HermitCore, Haiku, l4re
fn available_concurrency_internal() -> io::Result<NonZeroUsize> {
Err(io::Error::new(io::ErrorKind::NotFound, "The number of hardware threads is not known for the target platform"))
}
}
}
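Each platform branch above follows the same pattern: a negative result becomes an OS error, zero becomes a `NotFound` error, and any other value is wrapped in `NonZeroUsize`. A minimal sketch of that mapping in isolation, using the safe `NonZeroUsize::new` instead of `new_unchecked`, might look like this. `count_to_result` is a hypothetical helper, and the fixed error for the negative case is an assumption made so the sketch runs without an OS call (the code above uses `io::Error::last_os_error()` there).

```rust
use std::io;
use std::num::NonZeroUsize;

/// Hypothetical helper: maps a raw count, as an OS query might return it,
/// to the `io::Result<NonZeroUsize>` shape used by this API.
fn count_to_result(raw: i64) -> io::Result<NonZeroUsize> {
    if raw < 0 {
        // Assumption: a fixed error stands in for `io::Error::last_os_error()`.
        return Err(io::Error::new(io::ErrorKind::Other, "the OS query failed"));
    }
    // `NonZeroUsize::new` rejects 0 without needing any `unsafe` code.
    NonZeroUsize::new(raw as usize).ok_or_else(|| {
        io::Error::new(
            io::ErrorKind::NotFound,
            "The number of hardware threads is not known for the target platform",
        )
    })
}

fn main() {
    for raw in [4, 0, -1] {
        match count_to_result(raw) {
            Ok(n) => println!("{} -> {} hardware threads", raw, n),
            Err(e) => println!("{} -> error: {}", raw, e),
        }
    }
}
```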
6 changes: 6 additions & 0 deletions library/std/src/thread/mod.rs
@@ -175,9 +175,15 @@ use crate::time::Duration;
#[macro_use]
mod local;

#[unstable(feature = "available_concurrency", issue = "74479")]
mod available_concurrency;

#[stable(feature = "rust1", since = "1.0.0")]
pub use self::local::{AccessError, LocalKey};

#[unstable(feature = "available_concurrency", issue = "74479")]
pub use available_concurrency::available_concurrency;

// The types used by the thread_local! macro to access TLS keys. Note that there
// are two types, the "OS" type and the "fast" type. The OS thread local key
// type is accessed via platform-specific API calls and is slow, while the fast
103 changes: 2 additions & 101 deletions library/test/src/helpers/concurrency.rs
@@ -1,6 +1,7 @@
//! Helper module which helps to determine amount of threads to be used
//! during tests execution.
use std::env;
use std::thread;

#[allow(deprecated)]
pub fn get_concurrency() -> usize {
@@ -12,106 +13,6 @@ pub fn get_concurrency() -> usize {
_ => panic!("RUST_TEST_THREADS is `{}`, should be a positive integer.", s),
}
}
Err(..) => num_cpus(),
}
}

cfg_if::cfg_if! {
if #[cfg(windows)] {
#[allow(nonstandard_style)]
fn num_cpus() -> usize {
#[repr(C)]
struct SYSTEM_INFO {
wProcessorArchitecture: u16,
wReserved: u16,
dwPageSize: u32,
lpMinimumApplicationAddress: *mut u8,
lpMaximumApplicationAddress: *mut u8,
dwActiveProcessorMask: *mut u8,
dwNumberOfProcessors: u32,
dwProcessorType: u32,
dwAllocationGranularity: u32,
wProcessorLevel: u16,
wProcessorRevision: u16,
}
extern "system" {
fn GetSystemInfo(info: *mut SYSTEM_INFO) -> i32;
}
unsafe {
let mut sysinfo = std::mem::zeroed();
GetSystemInfo(&mut sysinfo);
sysinfo.dwNumberOfProcessors as usize
}
}
} else if #[cfg(any(
target_os = "android",
target_os = "cloudabi",
target_os = "emscripten",
target_os = "fuchsia",
target_os = "ios",
target_os = "linux",
target_os = "macos",
target_os = "solaris",
target_os = "illumos",
))] {
fn num_cpus() -> usize {
unsafe { libc::sysconf(libc::_SC_NPROCESSORS_ONLN) as usize }
}
} else if #[cfg(any(target_os = "freebsd", target_os = "dragonfly", target_os = "netbsd"))] {
fn num_cpus() -> usize {
use std::ptr;

let mut cpus: libc::c_uint = 0;
let mut cpus_size = std::mem::size_of_val(&cpus);

unsafe {
cpus = libc::sysconf(libc::_SC_NPROCESSORS_ONLN) as libc::c_uint;
}
if cpus < 1 {
let mut mib = [libc::CTL_HW, libc::HW_NCPU, 0, 0];
unsafe {
libc::sysctl(
mib.as_mut_ptr(),
2,
&mut cpus as *mut _ as *mut _,
&mut cpus_size as *mut _ as *mut _,
ptr::null_mut(),
0,
);
}
if cpus < 1 {
cpus = 1;
}
}
cpus as usize
}
} else if #[cfg(target_os = "openbsd")] {
fn num_cpus() -> usize {
use std::ptr;

let mut cpus: libc::c_uint = 0;
let mut cpus_size = std::mem::size_of_val(&cpus);
let mut mib = [libc::CTL_HW, libc::HW_NCPU, 0, 0];

unsafe {
libc::sysctl(
mib.as_mut_ptr(),
2,
&mut cpus as *mut _ as *mut _,
&mut cpus_size as *mut _ as *mut _,
ptr::null_mut(),
0,
);
}
if cpus < 1 {
cpus = 1;
}
cpus as usize
}
} else {
// FIXME: implement on vxWorks, Redox, HermitCore, Haiku, l4re
fn num_cpus() -> usize {
1
}
Err(..) => thread::available_concurrency().map(|n| n.get()).unwrap_or(1),
}
}
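The shape of `get_concurrency` after this change is: honor an explicit override if one is set, otherwise ask the standard library and fall back to `1`. A self-contained sketch of that shape follows. `resolve_concurrency` is a hypothetical name, and taking the override as a parameter rather than reading the environment inside the function is a choice made here for testability. Since `available_concurrency` is unstable, the sketch assumes a newer toolchain where the same functionality is stable as `std::thread::available_parallelism`.

```rust
use std::env;
use std::num::NonZeroUsize;
use std::thread;

/// Hypothetical helper mirroring the structure of `get_concurrency`:
/// an explicit override wins, otherwise the OS hint is used.
fn resolve_concurrency(override_value: Option<&str>) -> usize {
    match override_value {
        Some(s) => match s.parse::<usize>().ok() {
            Some(n) if n > 0 => n,
            _ => panic!("thread override is `{}`, should be a positive integer.", s),
        },
        // Assumption: `available_parallelism` is the later, stabilized name
        // of the API added in this commit. Unknown counts fall back to 1.
        None => thread::available_parallelism()
            .map(NonZeroUsize::get)
            .unwrap_or(1),
    }
}

fn main() {
    // Read the same variable libtest uses; treat unset or invalid UTF-8 as absent.
    let from_env = env::var("RUST_TEST_THREADS").ok();
    let threads = resolve_concurrency(from_env.as_deref());
    println!("running with {} threads", threads);
}
```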
1 change: 1 addition & 0 deletions library/test/src/lib.rs
@@ -24,6 +24,7 @@
#![feature(rustc_private)]
#![feature(nll)]
#![feature(bool_to_option)]
#![feature(available_concurrency)]
#![feature(set_stdio)]
#![feature(panic_unwind)]
#![feature(staged_api)]