function to choose element at random from an array #3075

Closed
SamChill opened this issue May 11, 2013 · 16 comments

Comments

@SamChill
Contributor

I think a useful utility function would be one that chooses a random element from an array. Python has this as random.choice. Here is what a Julia implementation might look like:

function choice(a::Array)
    n = length(a)
    idx = mod(rand(Uint),n)+1
    return a[idx]
end

An option would be to have an additional argument for drawing some number of samples. This would be sampling with replacement from an array with uniform probability.

Another useful function would be one for sampling without replacement.

@andrioni
Member

Using idx = rand(1:n) would be a better solution, as reducing a random integer with mod doesn't guarantee a uniform distribution over the indices, IIRC.
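
To illustrate the concern about mod, here is a small sketch that enumerates every value of a deliberately tiny 8-bit range and counts which index each one maps to. The helper name mod_index_counts and the choice of 256 raw values are assumptions made for illustration; with a full-width unsigned integer the same skew exists but is far smaller.

```julia
# Hypothetical helper: count how often each index in 1:n is produced when every
# raw value 0..255 (an 8-bit draw) is reduced with mod, as in mod(x, n) + 1.
function mod_index_counts(n::Integer)
    counts = zeros(Int, n)
    for x in 0:255                 # each possible raw draw, exactly once
        counts[mod(x, n) + 1] += 1
    end
    return counts
end

counts = mod_index_counts(100)
# 256 = 2 * 100 + 56, so the first 56 indices are reachable from three raw
# values each, while the remaining 44 are reachable from only two.

# rand(1:n) draws directly from the range, so no index is favored.
idx = rand(1:100)
```

The skew appears whenever the number of raw values is not a multiple of n, which is why drawing from the range directly is the safer choice.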

@johnmyleswhite
Member

Sampling with replacement already exists in the Stats package as the randsample function. Adding sampling without replacement is an ongoing issue.

@JeffBezanson
Member

Yes, I think a good way to do this is just a[rand(1:end)].

@arshak

arshak commented Sep 12, 2014

I think you mean a[1:rand(1:end)]

@StefanKarpinski
Member

I've seen a handful of @JeffBezanson coding slip-ups. This is not one of them ;-)

@pao
Member

pao commented Sep 12, 2014

a[1:rand(1:end)] will produce a random-length prefix of a, rather than a random sample from a.
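
A short sketch of the difference between the two expressions discussed above. The array contents are made up for illustration.

```julia
a = [10, 20, 30, 40, 50]

# Inside a[...], `end` stands for lastindex(a), so rand(1:end) is one random
# index and the result is a single element of a.
element = a[rand(1:end)]

# Here the random number is only the upper bound of the range 1:k, so the
# result is a slice holding the first k elements of a.
prefix = a[1:rand(1:end)]
```

The first form returns a value of the element type; the second always returns a vector, even when it happens to contain a single element.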

@IainNZ
Member

IainNZ commented Sep 12, 2014

Since this issue is very old, I just wanted to update @johnmyleswhite's comment for 2014: sampling is in StatsBase.jl and the sample function does some very clever sampling.

@arshak

arshak commented Sep 12, 2014

How does StatsBase.sample work with a DataFrame? I know I can convert it to a multi-dimensional array first, but I'd like to keep my original structure and headers.

@johnmyleswhite
Member

Just ask for a subset of rows generated by sampling from 1:size(df, 1).
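
A minimal sketch of that idea, using a plain matrix as a stand-in for the table so that it runs without any packages. The data and variable names are made up for illustration. With a DataFrame df, the assumption is that the same index vector would be used as df[rows, :], and that StatsBase's sample(1:nrows, 3; replace=false) could generate the indices when rows must not repeat.

```julia
# Stand-in for a table: 5 rows, 2 columns (second column is 10x the first).
table = [1 10; 2 20; 3 30; 4 40; 5 50]

nrows = size(table, 1)
rows = rand(1:nrows, 3)     # three row indices, drawn with replacement
subset = table[rows, :]     # whole rows are selected, so columns stay aligned
```

Because the sampling happens on row indices rather than on the data, the table's structure is untouched and only the set of rows changes.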

@cossio
Contributor

cossio commented May 5, 2016

@SamChill This does not work for non-indexable collections, like a Set. What's an efficient way to get a random element out of a Set?

@ivarne
Member

ivarne commented May 6, 2016

rand(a::Array) works after #9049.

An efficient implementation for Set (and Dict?) seems non-trivial, and a PR might be accepted. In the meantime you can use rand(collect(set)).
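
The workaround can be sketched as follows. The set contents are made up for illustration.

```julia
s = Set(["a", "b", "c"])

v = collect(s)   # copies the elements into a Vector: O(length(s)) time and memory
x = rand(v)      # uniform pick from the vector
```

The copy is paid on every call to collect, so when many draws are needed from an unchanging set it is probably cheaper to collect once and reuse v.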

@cossio
Contributor

cossio commented May 6, 2016

@undwad

undwad commented Sep 10, 2019

choose(xs, n) = xs[randperm(end)][1:n]

@cossio
Contributor

cossio commented Sep 10, 2019

@undwad That seems inefficient, particularly when n is small compared to length(xs).
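
For the case where n is much smaller than length(xs), one possible alternative is to draw indices one at a time and reject repeats, which avoids permuting the whole collection. This is only a sketch: the name choose_few and the rejection approach are assumptions for illustration, not something proposed in this thread.

```julia
# Hypothetical helper: pick n distinct elements of xs by rejecting repeated indices.
# Expected cost is close to n draws when n is small relative to length(xs).
function choose_few(xs::AbstractVector, n::Integer)
    0 <= n <= length(xs) || throw(ArgumentError("n must be between 0 and length(xs)"))
    seen = Set{Int}()      # indices already taken
    picked = Int[]         # the same indices, in the order they were drawn
    while length(picked) < n
        i = rand(eachindex(xs))
        if !(i in seen)
            push!(seen, i)
            push!(picked, i)
        end
    end
    return xs[picked]
end

sample3 = choose_few(collect(1:1_000_000), 3)
```

The trade-off is that rejections become frequent as n approaches length(xs), at which point a full shuffle is likely the better option.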

@StefanKarpinski
Member

Please don't necropost on old, resolved issues.
