Optimizer library #2190
Changes from 10 commits
26ad14f
7d884c1
c0ca125
48e85f2
043a589
bcbd264
4804376
c9c2ee8
ee4a0d1
de1c756
29c364f
b8f33bf
@@ -1,4 +1,5 @@
 include_directories(${CMAKE_CURRENT_BINARY_DIR})
+add_subdirectory(optimizer)

 go_library(adder SRCS adder.go)
@@ -26,7 +26,8 @@ const (
 type Parameter struct {
 	Name        string
 	ElementType ElementType
-	Content     []byte
+	Size        uint32
+	// Content []byte
 }

 // ParameterWithConfig contains the parameter and the configuration.
@@ -42,15 +43,16 @@ type Gradient Parameter
 type Service struct {
 	initialized chan struct{}

-	mu       sync.Mutex
-	opt      *optimizer
-	paramMap map[string]Parameter
+	mu           sync.Mutex
+	paramMap     map[string]Parameter
+	optimizerMap map[string]*optimizer // per-parameter optimizer
 }

Review comment: Since there is already optimizerMap, and the optimizer owns its parameter, maybe we no longer need paramMap?

Reply: Agreed, fixed the Go part. Done.
 // NewService creates a new service.
 func NewService() *Service {
 	s := &Service{}
 	s.paramMap = make(map[string]Parameter)
+	s.optimizerMap = make(map[string]*optimizer)
 	s.initialized = make(chan struct{})
 	return s
 }
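The suggestion above (drop paramMap and reach each Parameter through its optimizer) can be sketched as follows. This is a hypothetical simplification, not the PR's actual code: Parameter, optimizer, and Service are pared down, and the Param helper is invented for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// Parameter and optimizer are simplified stand-ins for the PR's types.
type Parameter struct {
	Name string
	Size uint32
}

type optimizer struct {
	param Parameter // the optimizer owns its parameter
}

// Service keeps only optimizerMap; paramMap is gone because a
// parameter can always be reached through its optimizer.
type Service struct {
	initialized  chan struct{}
	mu           sync.Mutex
	optimizerMap map[string]*optimizer
}

func NewService() *Service {
	return &Service{
		initialized:  make(chan struct{}),
		optimizerMap: make(map[string]*optimizer),
	}
}

// Param looks a parameter up through its owning optimizer.
func (s *Service) Param(name string) (Parameter, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	o, ok := s.optimizerMap[name]
	if !ok {
		return Parameter{}, false
	}
	return o.param, true
}

func main() {
	s := NewService()
	s.optimizerMap["w"] = &optimizer{param: Parameter{Name: "w", Size: 4}}
	p, ok := s.Param("w")
	fmt.Println(ok, p.Name) // true w
}
```

With this layout a single map is the source of truth, so parameter and optimizer state cannot drift apart.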
@@ -71,8 +73,9 @@ func (s *Service) BeginInitParams(config []byte, dummy *int) error {
 		s.opt.Cleanup()
 	}

-	// TODO(helin): parse learning rate from config
-	s.opt = newOptimizer(sgd, 0.01)
+	// TODO(h
+	// elin): parse learning rate from config
+	s.opt = newOptimizer(config OptimizerConfig)

 	return nil
 }

Review comment: I have not seen the definition of OptimizerConfig, and `newOptimizer(config OptimizerConfig)` is a parameter declaration, not a valid call.

Reply: Fixed it; changed to string.
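The "parse learning rate from config" TODO could look roughly like the sketch below. This is an assumption-laden illustration: the real project serializes the optimizer config with protobuf (OptimizerConfig.proto), while this sketch uses JSON and a hypothetical parseConfig helper purely to show the shape of the step.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// OptimizerConfig is a hypothetical stand-in for the PR's (undefined)
// OptimizerConfig type; the real project defines it as a protobuf message.
type OptimizerConfig struct {
	Method       string  `json:"method"`
	LearningRate float64 `json:"learning_rate"`
}

// parseConfig deserializes the raw config bytes sent by the client,
// which is what the TODO in BeginInitParams asks for.
func parseConfig(raw []byte) (OptimizerConfig, error) {
	var c OptimizerConfig
	if err := json.Unmarshal(raw, &c); err != nil {
		return OptimizerConfig{}, err
	}
	return c, nil
}

func main() {
	c, err := parseConfig([]byte(`{"method":"sgd","learning_rate":0.01}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(c.Method, c.LearningRate) // sgd 0.01
}
```

Whatever the wire format, the point is that the caller-supplied bytes are decoded once in BeginInitParams instead of hard-coding `sgd, 0.01`.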
@@ -135,7 +138,10 @@ func (s *Service) SendGrads(grads []Gradient, dummy *int) error {
 	errCh := make(chan error, count)
 	for _, g := range grads {
 		go func(p Parameter, g Gradient) {
-			err := s.opt.UpdateParameter(p, g)
+			opt, err := s.optimizerMap[p.Name]
+			if err != nil {
+				err := opt.UpdateParameter(p, g)
+			}
 			errCh <- err
 		}(s.paramMap[g.Name], g)
 	}

Review comment: This change introduces a concurrent read of optimizerMap. One way to make it safe:

	go func(name string, g Gradient) {
		s.mu.Lock()
		defer s.mu.Unlock()
		opt, err := s.optimizerMap[p.Name]
		if err != nil {
			err = opt.UpdateParameter(p, g)
		}
	}

The function above holds the mutex until the optimization step finishes, which is safe, since a concurrent update to an optimizer is a race condition. But performance will suffer, since there is only a single mutex per Service. An alternative is to resolve the optimizer while the caller still holds the mutex:

	go func(o *Optimizer, g Gradient) {
		err := o.UpdateParameter(g)
		errCh <- err
	}(s.optimizerMap[g.Name], g) // we are still protected by the mutex when evaluating s.optimizerMap[g.Name]

Reply: Thanks for the kind reminder, agreed! I had thought we did not need to protect against concurrent updates.

Review comment: I noticed the code here does not compile. We should not check in code that does not compile. Maybe I can do the Go code part for now, and you can get more familiar with Go by reviewing the PR?

Reply: Sorry, I had not written the cmake script the right way; I will check in code more carefully. Thanks for your Go lint editor plugin and your commit to the Go cmake script; now I can run this part. Thanks!

Review comment: Do not use …

Reply: Sadly I do my work on a Baidu dev machine, and for the last two days jumbo (a package management system) has been broken across all of Baidu, so this did not pass the Go compile step.
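As the reviewers note, indexing a Go map yields the value and a bool, not an error, so `opt, err := s.optimizerMap[p.Name]` cannot compile. Below is a minimal compiling sketch of the pattern discussed above, with simplified stand-in types and a hypothetical per-optimizer lock so the single Service mutex is not held during the update.

```go
package main

import (
	"fmt"
	"sync"
)

type Gradient struct{ Name string }

// optimizer is a simplified stand-in; UpdateParameter just counts calls.
type optimizer struct {
	mu      sync.Mutex
	updates int
}

func (o *optimizer) UpdateParameter(g Gradient) error {
	// Per-optimizer lock: concurrent updates to *this* optimizer are
	// serialized without blocking updates to other parameters.
	o.mu.Lock()
	defer o.mu.Unlock()
	o.updates++
	return nil
}

type Service struct {
	mu           sync.Mutex
	optimizerMap map[string]*optimizer
}

// SendGrad resolves the optimizer with the comma-ok idiom under the
// Service mutex, then runs the update outside that lock.
func (s *Service) SendGrad(g Gradient) error {
	s.mu.Lock()
	opt, ok := s.optimizerMap[g.Name] // second value is a bool, not an error
	s.mu.Unlock()
	if !ok {
		return fmt.Errorf("no optimizer for parameter %q", g.Name)
	}
	return opt.UpdateParameter(g)
}

func main() {
	s := &Service{optimizerMap: map[string]*optimizer{"w": {}}}
	fmt.Println(s.SendGrad(Gradient{Name: "w"}))       // <nil>
	fmt.Println(s.SendGrad(Gradient{Name: "missing"})) // lookup error
}
```

This mirrors the reviewer's second suggestion: the shared map is only read under the Service mutex, while the potentially slow optimization step contends only on its own parameter's lock.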
@@ -0,0 +1,22 @@
+include_directories(${CMAKE_CURRENT_BINARY_DIR})
+
+set(OPTIMIZER_SRCS
+    optimizer_factory.cc
+    parameter_optimizer.cc
+    regularizer.cc
+)
+
+set(OPTIMIZER_HEADERS
+    optimizer.h
+    Tensor.h
+    optimizer_factory.h
+    parameter_optimizer.h
+    regularizer.h
+)
+
+add_library(optimizer STATIC ${OPTIMIZER_SRCS})
+add_dependencies(optimizer gen_proto_cpp)
+
+add_simple_unittest(optimizer_test)
+add_simple_unittest(optimizer_factory_test)
@@ -0,0 +1,25 @@
+#ifndef PADDLE_FAKE_TENSOR_H_
+#define PADDLE_FAKE_TENSOR_H_
+/**
+ * @brief fake tensor for testing
+ */
+
+#include "paddle/math/BaseMatrix.h"
+#include <string.h>
+
+namespace paddle {
+template <class T>
+using TensorBase = BaseMatrixT<T>;
+
+template <class T>
+class Tensor : public TensorBase<T> {
+public:
+  Tensor(T* data, int size) : TensorBase<T>(size, 1, 0, data, false, false) {}
+  T* get_buffer() { return this->data_; }
+  // TODO: replace with tensorshape
+  size_t height() { return this->height_; }
+};
+}  // namespace paddle
+#endif

Review comment: Maybe "Tensor used by optimizer" is more appropriate, since this tensor actually works; it is not something fake. If you agree, please change the header guard (PADDLE_FAKE_TENSOR_H_) as well.

Reply: Fix done.

Review comment: I thought we agreed to change …

Reply: Sorry for leaving out some commits of this PR.
@@ -0,0 +1,26 @@
+#ifndef PADDLE_ADADELTA_OPTIMIZER_H_
+#define PADDLE_ADADELTA_OPTIMIZER_H_
+
+#include "parameter_optimizer.h"
+
+namespace paddle {
+namespace optimizer {
+
+template <class T>
+class AdadeltaOptimizer : public ParameterOptimizer<T> {
+public:
+  /*! \brief call the applySGD for example */
+  void update(const Tensor<T> &gradient) {}
+
+private:
+  double learning_rate;
+  double rho;
+  double epsilon;
+  double decay;
+};
+
+}  // namespace optimizer
+}  // namespace paddle
+
+#endif

Review comment: Since this is not a working optimizer, maybe let's remove it for now to avoid confusion? (See #2190 (comment).)
@@ -0,0 +1,4 @@
+#include "sgd_optimizer.h"
+
+namespace paddle {
+namespace optimizer {

Review comment: This is not a working source file, probably checked in by mistake. Maybe it's a good idea to review the PR first so that these mistakes can be avoided.
@@ -0,0 +1,24 @@
+#ifndef PADDLE_ADAGRAD_OPTIMIZER_H_
+#define PADDLE_ADAGRAD_OPTIMIZER_H_
+
+#include "parameter_optimizer.h"
+
+namespace paddle {
+namespace optimizer {
+
+template <class T>
+class AdagradOptimizer : public ParameterOptimizer<T> {
+public:
+  void update(const Tensor<T> &gradient) {}
+
+private:
+  double learning_rate;
+  double epsilon;
+  double decay;
+};
+
+}  // namespace optimizer
+}  // namespace paddle
+
+#endif
@@ -0,0 +1,26 @@
+#ifndef PADDLE_ADAM_OPTIMIZER_H_
+#define PADDLE_ADAM_OPTIMIZER_H_
+
+#include "parameter_optimizer.h"
+
+namespace paddle {
+namespace optimizer {
+
+template <class T>
+class AdamOptimizer : public ParameterOptimizer<T> {
+public:
+  /*! \brief call the applySGD for example */
+  void update(const Tensor<T> &gradient) {}
+
+private:
+  double learning_rate;
+  double beta_1;
+  double beta_2;
+  double epsilon;
+};
+
+}  // namespace optimizer
+}  // namespace paddle
+#endif
@@ -0,0 +1,75 @@
+#include "optimizer.h"
+#include <string>
+
+#include "parameter_optimizer.h"
+
+template <class T>
+struct EnumToType {};
+
+template <class T>
+struct TypeToEnum {};
+
+#define MATCH_ENUM_TYPE(TYPE, ENUM)                 \
+  template <>                                       \
+  struct TypeToEnum<ENUM> {                         \
+    static paddle_element_type v() { return ENUM; } \
+    static constexpr TYPE value = ENUM;             \
+  };                                                \
+  template <>                                       \
+  struct EnumToType<ENUM> {                         \
+    typedef TYPE Type;                              \
+  }
+
+MATCH_ENUM_TYPE(int32_t, PADDLE_ELEMENT_TYPE_INT32);
+MATCH_ENUM_TYPE(uint32_t, PADDLE_ELEMENT_TYPE_UINT32);
+MATCH_ENUM_TYPE(int64_t, PADDLE_ELEMENT_TYPE_INT64);
+MATCH_ENUM_TYPE(uint64_t, PADDLE_ELEMENT_TYPE_UINT64);
+MATCH_ENUM_TYPE(float, PADDLE_ELEMENT_TYPE_FLOAT32);
+MATCH_ENUM_TYPE(double, PADDLE_ELEMENT_TYPE_FLOAT64);
+
+struct paddle_optimizer {
+  /*! \brief optimizer on the C++ side */
+  paddle::optimizer::ParameterOptimzier* impl;
+};
+
+paddle_optimizer* paddle_create_optimizer(const unsigned char* config_proto,
+                                          int config_proto_len) {
+  paddle_optimizer* optimizer;
+  std::string config(config_proto, config_proto + config_proto_len);
+  optimizer->impl->create(config_proto);
+  return optimizer;
+}
+
+int paddle_release_optimizer(paddle_optimizer* o) {
+  if (o != nullptr)
+    delete o->impl;
+  return PADDLE_SUCCESS;
+}
+
+int paddle_update_parameter(paddle_optimizer* o,
+                            paddle_element_type data_type,
+                            const void* grad_buffer,
+                            int num_bytes) {
+  auto type = EnumToType<data_type>::Type;
+  paddle::Tensor<type> gradient(reinterpret_cast<type*>(grad_buffer),
+                                num_bytes);
+  o->impl->update(gradient);
+  return PADDLE_SUCCESS;
+}
+
+int paddle_optimizer_set_weights(paddle_optimizer* o,
+                                 paddle_element_type data_type,
+                                 void* param_buffer,
+                                 int num_bytes) {
+  auto type = EnumToType<data_type>::Type;
+  paddle::Tensor<type>* param =
+      new paddle::Tensor<type>(reinterpret_cast<type*>(param_buffer), num_bytes);
+  o->impl->set_weight(param);
+  return PADDLE_SUCCESS;
+}
+
+void* paddle_optimizer_get_weights(paddle_optimizer* o) {
+  void* buffer = (void*)o->impl->get_weight();
+  return buffer;
+}

Review comment: We are following the Google C++ coding style; conditional bodies should be wrapped in braces. See: https://google.github.io/styleguide/cppguide.html#Conditionals

Reply: Fixed the style. Thanks a lot for such a careful review! I will check in my code more carefully and fix the style, including the Go compile problem, by double checking. Thanks!
@@ -0,0 +1,93 @@
+#ifndef PADDLE_LIB_OPTIMIZER_H_
+#define PADDLE_LIB_OPTIMIZER_H_
+#include <stdbool.h>
+#include <stdint.h>
+
+/*! \brief optimizer exported C API, which will be used in:
+
+   Case A: on the trainer (the parameter-server client), optimize the gradient
+
+   Case B: on the parameter-server side, optimize the gradient
+
+   To simplify configuration parsing, the optimizer does *not* parse any
+   config; e.g. the learning rate should be calculated by the caller.
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+/*! \brief datatypes */
+typedef enum {
+  PADDLE_ELEMENT_TYPE_INT32 = 0,
+  PADDLE_ELEMENT_TYPE_UINT32 = 1,
+  PADDLE_ELEMENT_TYPE_INT64 = 2,
+  PADDLE_ELEMENT_TYPE_UINT64 = 3,
+  PADDLE_ELEMENT_TYPE_FLOAT32 = 4,
+  PADDLE_ELEMENT_TYPE_FLOAT64 = 5,
+} paddle_element_type;
+
+/*! \brief execution status codes */
+const int32_t PADDLE_SUCCESS = 0;
+const int32_t PADDLE_ERROR = -1;
+
+typedef struct paddle_optimizer paddle_optimizer;
+/**
+ * This group of interfaces is called in order:
+ * 1. create optimizer with config
+ * 2. set weights
+ * 3. update_parameter
+ * 4. get_weights
+ * 5. release optimizer
+ */
+
+/**
+ * @brief create an optimizer from a protobuf config
+ * @param config_proto optimizer protobuf; see OptimizerConfig.proto for details
+ * @return the optimizer instance
+ */
+paddle_optimizer* paddle_create_optimizer(const unsigned char* config_proto,
+                                          int config_proto_len);
+
+/**
+ * @brief release the optimizer
+ * @param o the optimizer instance
+ * @return execution status
+ */
+int paddle_release_optimizer(paddle_optimizer* o);
+
+/**
+ * @brief apply a gradient to the parameter held by the optimizer
+ * @param data_type datatype of the gradient and parameter
+ * @param gradient calculated by the optimizer's caller.
+ *        TODO(zhihong): just pass the loss to reduce communication overhead;
+ *        see the Project Adam (OSDI '14) paper for details
+ * @param num_bytes gradient size
+ * @return execution status
+ */
+int paddle_update_parameter(paddle_optimizer* o,
+                            paddle_element_type data_type,
+                            const void* gradient,
+                            int num_bytes);
+
+/**
+ * @brief set the initial parameter value
+ * @param data_type datatype of the parameter
+ * @param param_buffer initialized parameter buffer
+ * @param num_bytes parameter size
+ * @return execution status
+ */
+int paddle_optimizer_set_weights(paddle_optimizer* o,
+                                 paddle_element_type data_type,
+                                 void* param_buffer,
+                                 int num_bytes);
+
+/**
+ * @brief get the parameter buffer
+ * @return contents of the parameter buffer held by the optimizer
+ */
+void* paddle_optimizer_get_weights(paddle_optimizer* o);
+
+#ifdef __cplusplus
+}
+#endif
+#endif
Review comment: No commented-out code, please.

Reply: Fix done.