diff --git a/CHANGELOG.md b/CHANGELOG.md
index 5cbdf76b76..2fda7eec7e 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,5 +1,156 @@
 # Changelog
 
+## [v1.1.1](https://github.com/kubeflow/tf-operator/tree/v1.1.1) (2021-08-03)
+
+[Full Changelog](https://github.com/kubeflow/tf-operator/compare/v1.1.0...v1.1.1)
+
+## Features
+
+* Add job namespace to `tf_operator_jobs_*` counters ([#1283](https://github.com/kubeflow/tf-operator/pull/1283), @alembiewski)
+* feat: upgrade kubeflow common and volcano version ([#1276](https://github.com/kubeflow/tf-operator/pull/1276), @shinytang6)
+* Add task type annotation for pods when EnableGangScheduling is true. ([#1268](https://github.com/kubeflow/tf-operator/pull/1268), @jiangkaihua)
+
+## Bug fixes
+
+* Fix invalid pointer when tfjob is deleted ([#1285](https://github.com/kubeflow/tf-operator/pull/1285), @johnugeorge)
+* fix get_logs pod_names type and iteration blocking ([#1280](https://github.com/kubeflow/tf-operator/pull/1280), @Windfarer)
+* fix calling custom_api.delete_namespaced_custom_object args error ([#1281](https://github.com/kubeflow/tf-operator/pull/1281), @Windfarer)
+* fix: Remove the dup comment tag ([#1274](https://github.com/kubeflow/tf-operator/pull/1274), @gaocegege)
+* Fix: Remove Github CD workflow ([#1263](https://github.com/kubeflow/tf-operator/pull/1263), @PatrickXYS)
+* Fix: the "follow" of TFJobClient.get_logs ([#1254](https://github.com/kubeflow/tf-operator/pull/1254), @Windfarer)
+
+## Misc
+
+* Update container image for v1.1.1 ([#1328](https://github.com/kubeflow/tf-operator/pull/1328), @Jeffwan)
+* add a specific version of tensorflow_datasets ([#1305](https://github.com/kubeflow/tf-operator/pull/1305), @jazzsir)
+* Remove vendor folder ([#1288](https://github.com/kubeflow/tf-operator/pull/1288), @Jeffwan)
+* add podgroups rule in cluster-role.yaml ([#1272](https://github.com/kubeflow/tf-operator/pull/1272), @huone1)
+* Use remote Kustomize build option in standalone installation instructions ([#1266](https://github.com/kubeflow/tf-operator/pull/1266), @verult)
+
+
+## [v1.1.0](https://github.com/kubeflow/tf-operator/tree/v1.1.0) (2021-03-24)
+
+[Full Changelog](https://github.com/kubeflow/tf-operator/compare/v0.5.3...v1.1.0)
+
+## Features
+* feat: Remove k8s.io/kubernetes ([#1235](https://github.com/kubeflow/tf-operator/pull/1235), @gaocegege)
+* Migrate to public ECR ([#1256](https://github.com/kubeflow/tf-operator/pull/1256), @PatrickXYS)
+* feat: Add API Documentation WIP ([#1249](https://github.com/kubeflow/tf-operator/pull/1249), @gaocegege)
+* feat: Update developers guide and readme ([#1244](https://github.com/kubeflow/tf-operator/pull/1244), @gaocegege)
+* Move TF Operator e2e tests to AWS Prow ([#1204](https://github.com/kubeflow/tf-operator/pull/1204), @ChanYiLin)
+* crd definition support multiple evaluator ([#1240](https://github.com/kubeflow/tf-operator/pull/1240), @oikomi)
+* support multiple evaluators ([#1239](https://github.com/kubeflow/tf-operator/pull/1239), @oikomi)
+* feat: Change the message for running condition ([#1230](https://github.com/kubeflow/tf-operator/pull/1230), @gaocegege)
+* feat(server): Use apiextension client to check if crd exists ([#1228](https://github.com/kubeflow/tf-operator/pull/1228), @gaocegege)
+* checkCRDExists func return true when k8s cluster is not connected ([#1207](https://github.com/kubeflow/tf-operator/pull/1207), @oikomi)
+* feat: Add CD using GitHub Actions ([#1196](https://github.com/kubeflow/tf-operator/pull/1196), @gaocegege)
+* Migrate controller implementation to kubeflow/common fashion ([#1171](https://github.com/kubeflow/tf-operator/pull/1171), @ChanYiLin)
+* Support success policy for TFJob ([#1165](https://github.com/kubeflow/tf-operator/pull/1165), @terrytangyuan)
+* add distributed training example of using TF 2.1 Strategy API ([#1164](https://github.com/kubeflow/tf-operator/pull/1164), @jazzsir)
+* Set completion time when job exceed specified deadline. ([#1150](https://github.com/kubeflow/tf-operator/pull/1150), @SimonCqk)
+* Support ClusterSpec Propagation Feature in TF 1.14 ([#1149](https://github.com/kubeflow/tf-operator/pull/1149), @zhujl1991)
+* Add watch function for TFJob python Client API ([#1122](https://github.com/kubeflow/tf-operator/pull/1122), @jinchihe)
+* Enhance tfjobs sdk docs ([#1114](https://github.com/kubeflow/tf-operator/pull/1114), @jinchihe)
+* Generate TFJob Python SDK ([#1103](https://github.com/kubeflow/tf-operator/pull/1103), @jinchihe)
+* feat: Support pprof when monitoring is specified ([#1102](https://github.com/kubeflow/tf-operator/pull/1102), @gaocegege)
+* feat: Use kubeflow/common ([#1088](https://github.com/kubeflow/tf-operator/pull/1088), @gaocegege)
+* Add support for aarch64 ([#1098](https://github.com/kubeflow/tf-operator/pull/1098), @MrXinWang)
+* feat: Do not set TF_CONFIG for local training ([#1080](https://github.com/kubeflow/tf-operator/pull/1080), @gaocegege)
+* feat: Replace gometalinter with golangci-lint ([#1081](https://github.com/kubeflow/tf-operator/pull/1081), @gaocegege)
+* Add controller-name label for Pod and service ([#1067](https://github.com/kubeflow/tf-operator/pull/1067), @hougangliu)
+* Add qps and burst options ([#1063](https://github.com/kubeflow/tf-operator/pull/1063), @ScorpioCPH)
+* Avoid unnecessary update when tfjob is complete ([#1051](https://github.com/kubeflow/tf-operator/pull/1051), @cheyang)
+* set annotation automatically when EnableGangScheduling is set to true ([#1032](https://github.com/kubeflow/tf-operator/pull/1032), @ChanYiLin)
+* feat(pod): Support custom gang scheduler via CLI argument ([#1050](https://github.com/kubeflow/tf-operator/pull/1050), @gaocegege)
+
+## Bug fixes
+* Fix kubeflow overlay ([#1260](https://github.com/kubeflow/tf-operator/pull/1260), @PatrickXYS)
+* fix: Do not validate evaluator ([#1238](https://github.com/kubeflow/tf-operator/pull/1238), @gaocegege)
+* fix: Remove default resync period ([#1237](https://github.com/kubeflow/tf-operator/pull/1237), @gaocegege)
+* fix: Observe the creation when failed to create the pod ([#1236](https://github.com/kubeflow/tf-operator/pull/1236), @gaocegege)
+* fix: Remove vendor cp command ([#1232](https://github.com/kubeflow/tf-operator/pull/1232), @gaocegege)
+* Fix completion time setting bug ([#1226](https://github.com/kubeflow/tf-operator/pull/1226), @shaowei-su)
+* feat(deploy): Add standalone deployment yaml ([#1218](https://github.com/kubeflow/tf-operator/pull/1218), @gaocegege)
+* Fix updateStatus no worker Crashoff ([#1215](https://github.com/kubeflow/tf-operator/pull/1215), @kuikuikuizzZ)
+* fix: Fix the log message ([#1203](https://github.com/kubeflow/tf-operator/pull/1203), @gaocegege)
+* Fix the typo ([#1178](https://github.com/kubeflow/tf-operator/pull/1178), @pingsutw)
+* Fix setup cluster issue and Pylint issue in CI tests ([#1179](https://github.com/kubeflow/tf-operator/pull/1179), @jinchihe)
+* Fix the link to run_e2e_workflow.py script ([#1154](https://github.com/kubeflow/tf-operator/pull/1154), @terrytangyuan)
+* Fix evaluator runconfig ([#1146](https://github.com/kubeflow/tf-operator/pull/1146), @richardsliu)
+* Fix sdk test issue that's caused by kubenertes Client bug. ([#1143](https://github.com/kubeflow/tf-operator/pull/1143), @jinchihe)
+* fix(controller): calculate satisfied with && instead of || ([#1120](https://github.com/kubeflow/tf-operator/pull/1120), @GuoHaiqing)
+* fix comment, add +optional flag to comment. ([#1137](https://github.com/kubeflow/tf-operator/pull/1137), @EDGsheryl)
+* fix(ConvertTFJobToUnstructured): ConvertTFJobToUnstructured uses function ToUnstructured to convert TFJob to Unstructured ([#1118](https://github.com/kubeflow/tf-operator/pull/1118), @leileiwan)
+* fix the reconcile flow ([#1111](https://github.com/kubeflow/tf-operator/pull/1111), @ChanYiLin)
+* Fix example Mnist With Summaries ([#1073](https://github.com/kubeflow/tf-operator/pull/1073), @andreyvelich)
+* fix bug: When executing `tf-operator.v1 -version`, GitSHA is always 'not provided' ([#1046](https://github.com/kubeflow/tf-operator/pull/1046), @asdfsx)
+* fix(UI): show correct namespace and name when deleting job through dashboard ([#1044](https://github.com/kubeflow/tf-operator/pull/1044), @gbin10533)
+* Minor fix to add CoreV1 to scheme ([#1037](https://github.com/kubeflow/tf-operator/pull/1037), @johnugeorge)
+* fix(docs): Fix link for simple_TFJob_test ([#1038](https://github.com/kubeflow/tf-operator/pull/1038), @gaocegege)
+* fix: Remove dup code ([#1022](https://github.com/kubeflow/tf-operator/pull/1022), @gaocegege)
+
+## Chores
+* tf-operator: Consolidate manifests ([#1255](https://github.com/kubeflow/tf-operator/pull/1255), @yanniszark)
+* TFJob Operator: Move manifests development upstream ([#1247](https://github.com/kubeflow/tf-operator/pull/1247), @yanniszark)
+* Update vendor as kubeflow/common is updated. ([#1252](https://github.com/kubeflow/tf-operator/pull/1252), @jiangkaihua)
+* docs: Add Ant Group to ADOPTERS.md ([#1243](https://github.com/kubeflow/tf-operator/pull/1243), @terrytangyuan)
+* chore: Add tencent cloud ([#1234](https://github.com/kubeflow/tf-operator/pull/1234), @gaocegege)
+* add vip ([#1233](https://github.com/kubeflow/tf-operator/pull/1233), @oikomi)
+* chore: Update changelog ([#1227](https://github.com/kubeflow/tf-operator/pull/1227), @gaocegege)
+* Update kubeflow common to 0.3.2 ([#1225](https://github.com/kubeflow/tf-operator/pull/1225), @shaowei-su)
+* chore: Remove useless expectation ([#1217](https://github.com/kubeflow/tf-operator/pull/1217), @gaocegege)
+* chore: Update codegen ([#1211](https://github.com/kubeflow/tf-operator/pull/1211), @gaocegege)
+* add Evaluator type for CRD example ([#1209](https://github.com/kubeflow/tf-operator/pull/1209), @oikomi)
+* add err log for create client set failed and code minor optimization ([#1210](https://github.com/kubeflow/tf-operator/pull/1210), @oikomi)
+* chore: Remove the kanban update workflow ([#1201](https://github.com/kubeflow/tf-operator/pull/1201), @gaocegege)
+* chore: Refactor cmd ([#1199](https://github.com/kubeflow/tf-operator/pull/1199), @gaocegege)
+* bugfix for multi_worker_strategy-with-keras.py ([#1198](https://github.com/kubeflow/tf-operator/pull/1198), @jiaqianjing)
+* Fix error when `conditions` is empty. ([#1185](https://github.com/kubeflow/tf-operator/pull/1185), @Corea)
+* b/168938304 - Inclusive Language Fix-It, repo has non-inclusive language ([#1190](https://github.com/kubeflow/tf-operator/pull/1190), @sculd)
+* chore: Update OWNERS ([#1177](https://github.com/kubeflow/tf-operator/pull/1177), @gaocegege)
+* Update developer_guide.md ([#1176](https://github.com/kubeflow/tf-operator/pull/1176), @pingsutw)
+* Update swagger-codegen-cli URL ([#1172](https://github.com/kubeflow/tf-operator/pull/1172), @jinchihe)
+* Use go mod ([#1144](https://github.com/kubeflow/tf-operator/pull/1144), @xychu)
+* Make tf_operator use static compilation in container ([#1160](https://github.com/kubeflow/tf-operator/pull/1160), @MrXinWang)
+* Update tf_job_client.py remove unused variable. ([#1157](https://github.com/kubeflow/tf-operator/pull/1157), @NikeNano)
+* Update e2e_testing.md ([#1155](https://github.com/kubeflow/tf-operator/pull/1155), @NikeNano)
+* Disable istio sidecar injection in simple tfjob test ([#1148](https://github.com/kubeflow/tf-operator/pull/1148), @Bobgy)
+* OWNERS: Add ChanYiLin as approver ([#1147](https://github.com/kubeflow/tf-operator/pull/1147), @ChanYiLin)
+* Remove unused function arg ([#1145](https://github.com/kubeflow/tf-operator/pull/1145), @zhujl1991)
+* docs: Add roadmap ([#1140](https://github.com/kubeflow/tf-operator/pull/1140), @gaocegege)
+* simple_tfjob_tests py3 version ([#1134](https://github.com/kubeflow/tf-operator/pull/1134), @gabrielwen)
+* add tf-operator test in py3 ([#1133](https://github.com/kubeflow/tf-operator/pull/1133), @gabrielwen)
+* Distroless image for TF operator ([#1124](https://github.com/kubeflow/tf-operator/pull/1124), @krishnadurai)
+* SDK support getting the TFJob training logs ([#1130](https://github.com/kubeflow/tf-operator/pull/1130), @jinchihe)
+* Copy third party vendor source code to Docker image ([#1128](https://github.com/kubeflow/tf-operator/pull/1128), @richardsliu)
+* Add third party licenses ([#1127](https://github.com/kubeflow/tf-operator/pull/1127), @richardsliu)
+* remove tfjob dashboard ([#1119](https://github.com/kubeflow/tf-operator/pull/1119), @ChanYiLin)
+* Update checking status API name ([#1117](https://github.com/kubeflow/tf-operator/pull/1117), @jinchihe)
+* Add more APIs for TFJob done ([#1116](https://github.com/kubeflow/tf-operator/pull/1116), @jinchihe)
+* feat: Add adopters in README ([#1092](https://github.com/kubeflow/tf-operator/pull/1092), @gaocegege)
+* Support for ppc64le ([#1082](https://github.com/kubeflow/tf-operator/pull/1082), @zoyun)
+* use multi-stage build to build tf-operator image ([#1072](https://github.com/kubeflow/tf-operator/pull/1072), @hmtai)
+* add ppc64le support for the example dist-mnist ([#1084](https://github.com/kubeflow/tf-operator/pull/1084), @alongzhi)
+* add the dockerfile for ppc64le ([#1083](https://github.com/kubeflow/tf-operator/pull/1083), @alongzhi)
+* Updating issue bot configs ([#1074](https://github.com/kubeflow/tf-operator/pull/1074), @rbrishabh)
+* Delete v1beta2 api ([#1075](https://github.com/kubeflow/tf-operator/pull/1075), @johnugeorge)
+* add ldflag verion ([#1052](https://github.com/kubeflow/tf-operator/pull/1052), @yeya24)
+* Add verify-codegen in travis CI ([#1070](https://github.com/kubeflow/tf-operator/pull/1070), @ohmystack)
+* Set tfjob defaults in test utils ([#1071](https://github.com/kubeflow/tf-operator/pull/1071), @ohmystack)
+* Update codegen ([#1069](https://github.com/kubeflow/tf-operator/pull/1069), @ohmystack)
+* rewrite dockerfile ([#1062](https://github.com/kubeflow/tf-operator/pull/1062), @hmtai)
+* Renaming labels to common types ([#1064](https://github.com/kubeflow/tf-operator/pull/1064), @johnugeorge)
+* add total suffix in counter metrics ([#1055](https://github.com/kubeflow/tf-operator/pull/1055), @yeya24)
+* Update k8s libraries to 1.12.3 ([#1054](https://github.com/kubeflow/tf-operator/pull/1054), @johnugeorge)
+* add flag kubeconfig ([#1049](https://github.com/kubeflow/tf-operator/pull/1049), @yeya24)
+* Easily detect the GOPATH in current development environment. ([#1047](https://github.com/kubeflow/tf-operator/pull/1047), @xauthulei)
+* Update gang scheduler name ([#1028](https://github.com/kubeflow/tf-operator/pull/1028), @goodluckbot)
+* Set worker 0 completed if pod's phase goto succeeded ([#1042](https://github.com/kubeflow/tf-operator/pull/1042), @ScorpioCPH)
+* Removing unnecessary Rbac authorization ([#1036](https://github.com/kubeflow/tf-operator/pull/1036), @johnugeorge)
+* refactor: add GenPodGroupName method to extract podGroupName in diffe… ([#1034](https://github.com/kubeflow/tf-operator/pull/1034), @zlcnju)
+* update release script ([#1040](https://github.com/kubeflow/tf-operator/pull/1040), @kunmingg)
+* Update image base to UBI8 GA ([#1023](https://github.com/kubeflow/tf-operator/pull/1023), @pdmack)
+
 ## [v1.0.1-rc.2](https://github.com/kubeflow/tf-operator/tree/v1.0.1-rc.2) (2021-01-27)
 
 [Full Changelog](https://github.com/kubeflow/tf-operator/compare/v1.0.1-rc.1...v1.0.1-rc.2)