-
Notifications
You must be signed in to change notification settings - Fork 5.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add math function for sampling integers #5945
Conversation
1. uniform distribution 2. log uniform distribution
@@ -0,0 +1,47 @@ | |||
#include "sampler.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A license should be added to this file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thx.
paddle/operators/math/sampler.cc
Outdated
|
||
LogUniformSampler::LogUniformSampler(int64 range) | ||
: Sampler(range), log_range_(log(range + 1)) { | ||
std::random_device r; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is better to provide an option to explicitly set the random seed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thx.
1. Add copyright info 2. Overload structure for customized random seed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Basically LGTM.
There are some nit-picky comments, but do not let them block this PR.
We can merge it first and fire another PR to fix them. We can also have a flexible design for supporting all kinds of sampling methods.
|
||
UniformSampler::UniformSampler(int64 range, unsigned int seed) | ||
: Sampler(range, seed), inv_range_(1.0 / range) { | ||
random_engine_ = std::make_shared<std::mt19937>(seed_); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am a little nit-picky, but why use shared_ptr
here.
Maybe a plan random_engine_ and dist_ is enough.
namespace operators { | ||
namespace math { | ||
|
||
// TODO(wanghaoshuang): Support for GPU |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need these samplers support for GPU?
It might not be needed since these samplers are used to generate training data for nce-loss or other sampling methods.
#pragma once | ||
#include <memory> | ||
#include <random> | ||
typedef long int64; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use int64_t from stdint.h
random_engine_ = std::make_shared<std::mt19937>(seed_); | ||
dist_ = std::make_shared<std::uniform_real_distribution<>>(0, 1); | ||
} | ||
int64 LogUniformSampler::Sample() const { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sample()
method will change the internal random state. So it should not be marked as const
.
@reyoung Thx for your suggestion. I merged this pr firstly and I will fix these issues in another pr. |
fix #5917
Add math function for sampling integers from: