Merge steps #27

Anya497 · 2024-01-25T14:39:10Z

Add merge_steps function to FullDataset class
New way to distribute maps between processes during validation
Swap errors and steps number in result tuple
New path to server working directory
Refactor data_loader and add parallelism to dataset processing

emnigma · 2024-01-26T08:59:31Z

AIAgent/ml/common_model/dataset.py

@@ -91,7 +92,11 @@ def get_plain_data(self, threshold: int = 100):
 result = []
 for map_result, map_steps in self.maps_data.values():
 if map_result[0] >= threshold:
- for step in map_steps:
+ if len(map_steps) > 2000:


The number of steps should be passed to the function as an input parameter.

emnigma · 2024-01-26T09:00:06Z

AIAgent/ml/common_model/dataset.py

@@ -91,7 +92,11 @@ def get_plain_data(self, threshold: int = 100):
 result = []
 for map_result, map_steps in self.maps_data.values():
 if map_result[0] >= threshold:
- for step in map_steps:
+ if len(map_steps) > 2000:
+ selected_steps = random.sample(map_steps, 2000)


Apparently the same value is used here as well.

emnigma · 2024-01-26T09:03:42Z

AIAgent/ml/common_model/dataset.py

@@ -91,7 +92,11 @@ def get_plain_data(self, threshold: int = 100):
 result = []
 for map_result, map_steps in self.maps_data.values():
 if map_result[0] >= threshold:
- for step in map_steps:
+ if len(map_steps) > 2000:
+ selected_steps = random.sample(map_steps, 2000)


Also, does it matter that this is specifically random.sample? It selects steps in random order. If len(map_steps) > 2000 is false, the steps stay in their original order. Could the different orderings in the two cases affect training?

No, it won't. Everything gets shuffled outside anyway.
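A minimal sketch of the two suggestions in this thread, assuming the step limit becomes a parameter instead of the hard-coded 2000. The helper name `select_steps` and the `max_steps` and `rng` parameters are hypothetical and do not come from the PR.

```python
import random


def select_steps(map_steps, max_steps, rng=None):
    """Return at most max_steps steps; sample randomly only when there are too many."""
    # rng is a hypothetical hook so that callers can pass a seeded random.Random.
    rng = rng if rng is not None else random
    if len(map_steps) > max_steps:
        # random.sample picks max_steps distinct elements in random order.
        return rng.sample(list(map_steps), max_steps)
    # Below the limit the original order is kept, as discussed in the thread.
    return list(map_steps)


many = list(range(5000))
few = [1, 2, 3]
sampled = select_steps(many, max_steps=2000, rng=random.Random(0))
kept = select_steps(few, max_steps=2000)
```

Since the caller shuffles everything afterwards, the difference in ordering between the two branches should not matter, as the reply above says.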

emnigma · 2024-01-26T09:16:05Z

AIAgent/ml/common_model/dataset.py

+ new_steps_num = len(self.maps_data[map_name][1])
+ logging.info(
+ f"Steps on map {map_name} were merged with current steps with result {map_result}. {len(filtered_map_steps)} + {init_steps_num} -> {new_steps_num}. "
+ )


Similar-steps removal is performed unconditionally in all of the possible cases except one. Maybe filter only once? If performance is critical, maybe do it lazily by declaring a lambda function?

emnigma · 2024-01-26T09:17:42Z

AIAgent/ml/common_model/dataset.py

+ if self.maps_data[map_name][0] == map_result and map_result[0] == 100:
+ init_steps_num = len(self.maps_data[map_name][1])
+
+ filtered_map_steps = self.remove_similar_steps(filtered_map_steps)


Let's inline this variable; re-using it serves no purpose. Same on line 147.

I think this contradicts the comment about computing remove_similar_steps only once.

emnigma · 2024-01-26T09:37:43Z

AIAgent/ml/common_model/dataset.py

+ break
+ if should_add:
+ merged_steps.append(new_step)
+ merged_steps.extend(sum(old_steps.values(), []))


Strange extend. What is happening here?

sum concatenates the lists into a single list, and extend adds all of its elements to merged_steps.
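To illustrate the reply, here is a small sketch on toy data (the dictionary contents are made up). It also shows `itertools.chain.from_iterable` as a possible alternative, which produces the same result without building intermediate lists; `sum(..., [])` copies the accumulated list on every addition.

```python
from itertools import chain

# Toy stand-in for old_steps: a mapping whose values are lists of steps.
old_steps = {"a": [1, 2], "b": [3], "c": []}

# The form used in the diff: sum with an empty-list start value concatenates the lists.
merged_via_sum = []
merged_via_sum.extend(sum(old_steps.values(), []))

# An equivalent form that iterates over the lists lazily.
merged_via_chain = []
merged_via_chain.extend(chain.from_iterable(old_steps.values()))
```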

emnigma · 2024-01-26T09:40:26Z

AIAgent/run_common_model_training.py


 all_average_results = []
 for epoch in range(config.epochs):
- data_list = dataset.get_plain_data()
+ data_list = dataset.get_plain_data(80)


A keyword argument is needed here: 80 is not tied to the function name, so it cannot be interpreted without reading get_plain_data.

I am going to rewrite the whole dataset substantially soon. I will take this into account in the new version.

emnigma · 2024-01-26T09:46:45Z

AIAgent/ml/common_model/dataset.py

@@ -91,7 +92,11 @@ def get_plain_data(self, threshold: int = 100):
 result = []


please rename threshold parameter for better clarity. threshold of what?

what does this function do? why the data it gets is plain? why do we need a threshold? maybe function name should reflect that

In the next version of the dataset this function will no longer be needed.
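For reference, a sketch of what the two requests (a keyword at the call site and a more descriptive parameter name) could look like together. This is a standalone function over toy data, not the method from the PR, and the name `min_result` is a hypothetical replacement for `threshold`.

```python
def get_plain_data(maps_data, *, min_result=100):
    """Collect steps from maps whose first result component is at least min_result."""
    result = []
    for map_result, map_steps in maps_data.values():
        if map_result[0] >= min_result:
            result.extend(map_steps)
    return result


# Toy data in the (map_result, map_steps) shape used by the diff above.
maps_data = {
    "map_1": ((100, 5), ["s1", "s2"]),
    "map_2": ((70, 3), ["s3"]),
}

# The bare * makes the parameter keyword-only, so the call has to name it.
data_list = get_plain_data(maps_data, min_result=80)
```

With a keyword-only parameter, a call such as `get_plain_data(maps_data, 80)` raises `TypeError`, so the unexplained literal cannot reappear at a call site.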

emnigma · 2024-01-26T09:52:25Z

AIAgent/run_common_model_training.py

@@ -190,7 +191,7 @@ def train(trial: optuna.trial.Trial, dataset: FullDataset):
 cmwrapper.make_copy(str(epoch + 1))

 with mp.Pool(GeneralConfig.SERVER_COUNT) as p:
- result = list(p.map(play_game_task, tasks))
+ result = list(p.map(play_game_task, tasks, 1))


constants should come with the keyword: chunksize=1

why chunksize=1?
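A sketch of the keyword form. With `chunksize=1` every task is handed out individually, so a worker that finishes early picks up the next pending task instead of being bound to a pre-assigned batch. Whether that is the reason behind the change is for the author to confirm. To keep the example runnable anywhere, it uses `multiprocessing.pool.ThreadPool`, whose `map` takes the same `chunksize` argument as `mp.Pool.map`; the task tuples and sleep durations are made up.

```python
import time
from multiprocessing.pool import ThreadPool


def play_game_task(task):
    """Toy stand-in for the real task: sleep for the given duration."""
    map_name, duration = task
    time.sleep(duration)
    return map_name, duration


tasks = [("map_a", 0.02), ("map_b", 0.01), ("map_c", 0.01), ("map_d", 0.01)]

with ThreadPool(2) as p:
    # Naming the argument makes the meaning of the literal 1 explicit.
    result = list(p.map(play_game_task, tasks, chunksize=1))
```

`map` returns results in input order regardless of the chunk size, so the change affects only how work is distributed between workers.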

emnigma · 2024-01-26T09:55:29Z

AIAgent/run_common_model_training.py

 tasks = [
- (maps[i], FullDataset("", ""), cmwrapper)
- for i in range(GeneralConfig.SERVER_COUNT)
+ ([all_maps[i]], FullDataset("", ""), cmwrapper)


tasks = [ ([concrete_map], FullDataset("", ""), cmwrapper) for concrete_map in all_maps ]

?

emnigma · 2024-01-27T10:46:01Z

AIAgent/common/constants.py

@@ -47,4 +47,6 @@ class ResultsHandlerLinks:
 BASE_NN_OUT_FEATURES_NUM = 8

 # assuming we start from /VSharp/VSharp.ML.AIAgent
-SERVER_WORKING_DIR = "../VSharp.ML.GameServer.Runner/bin/Release/net7.0/"
+SERVER_WORKING_DIR = (
+ "../GameServers/VSharp/VSharp.ML.GameServer.Runner/bin/Release/net7.0/"


pathlib.Path("...")?
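A sketch of the `pathlib` variant suggested here, built from the path in the diff. Whether the code that consumes `SERVER_WORKING_DIR` accepts a `Path` rather than a string is an assumption.

```python
from pathlib import Path

# assuming we start from /VSharp/VSharp.ML.AIAgent
SERVER_WORKING_DIR = (
    Path("..")
    / "GameServers"
    / "VSharp"
    / "VSharp.ML.GameServer.Runner"
    / "bin"
    / "Release"
    / "net7.0"
)

# str() or as_posix() recovers a plain string where one is still required.
server_working_dir_str = SERVER_WORKING_DIR.as_posix()
```

Note that `Path` drops the trailing slash of the original string, which matters only if some code concatenates file names onto it with `+`.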

emnigma · 2024-02-01T18:13:23Z

.gitignore

@@ -162,3 +162,4 @@ cython_debug/

 # MacOS specific
 .DS_Store
+AIAgent/report/


last line should be empty

emnigma · 2024-02-01T18:15:31Z

.pre-commit-config.yaml

@@ -3,4 +3,4 @@ repos:
 rev: 23.12.1
 hooks:
 - id: black
- language_version: python3.11
+ language_version: python3.10


???
from the docs:

It is recommended to specify the latest version of Python supported by your project here

emnigma · 2024-02-01T18:17:27Z

AIAgent/ml/common_model/paths.py

- ROOT, "ml", "pretrained_models", "models_for_parallel_architecture"
-)
+PRETRAINED_MODEL_PATH = os.path.join("ml", "models")
+RAW_FILES_PATH = os.path.join("report", "SerializedEpisodes")


Is the SerializedEpisodes folder always in this location? Should it instead be passed as a command-line argument?
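One possible shape for this, as a sketch: keep the current location as the default and let a command-line option override it. The option name `--raw-files-path` and the helper `build_parser` are hypothetical.

```python
import argparse
from pathlib import Path

# Default taken from the diff above.
DEFAULT_RAW_FILES_PATH = Path("report") / "SerializedEpisodes"


def build_parser():
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--raw-files-path",  # hypothetical option name
        type=Path,
        default=DEFAULT_RAW_FILES_PATH,
        help="directory with serialized episodes",
    )
    return parser


args_default = build_parser().parse_args([])
args_custom = build_parser().parse_args(["--raw-files-path", "/tmp/episodes"])
```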

emnigma · 2024-02-01T18:20:02Z

AIAgent/ml/data_loader_compact.py

+
+@dataclass(slots=True)
+class Step:
+ Graph: TypeAlias = HeteroData


Is that even allowed?) What is the semantic meaning of this construct?

emnigma · 2024-02-01T18:23:01Z

AIAgent/ml/data_loader_compact.py

+ f = open(
+ file_path
+ ) # without resource manager in order to escape file descriptors leaks


super counterintuitive: file resource managers are used to disallow file descriptor leaks. what is the reason to not use it there?
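For comparison, a sketch of the conventional form with a context manager. The file is closed when the `with` block exits, including on exceptions, which is the usual way to avoid descriptor leaks. The helper name and the file contents are made up.

```python
import os
import tempfile


def read_steps_file(file_path):
    """Read a file; the with block closes the descriptor even if read() raises."""
    with open(file_path, encoding="utf-8") as f:
        content = f.read()
    # After the block the file object is closed.
    return content, f.closed


with tempfile.TemporaryDirectory() as tmp_dir:
    example_path = os.path.join(tmp_dir, "steps.txt")
    with open(example_path, "w", encoding="utf-8") as out:
        out.write("step data")
    content, was_closed = read_steps_file(example_path)
```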

emnigma · 2024-02-01T18:39:34Z

AIAgent/run_common_model_training.py

 )
- dataset.save()
+ dataset.load()


why save -> load?

emnigma · 2024-02-01T18:40:56Z

AIAgent/run_common_model_training.py

@@ -271,20 +270,19 @@ def main():
 type=bool,
 help="set this flag if dataset generation is needed",
 action=argparse.BooleanOptionalAction,
- default=False,
+ default=True,


So the default user action should be to generate the dataset? Wouldn't the user update the dataset more frequently in general?
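A sketch of how `argparse.BooleanOptionalAction` behaves with each default, to make the trade-off concrete. The flag name `--generate-dataset` is hypothetical, and `type=bool` from the diff is left out because the action already yields a boolean.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument(
    "--generate-dataset",  # hypothetical flag name
    action=argparse.BooleanOptionalAction,
    default=False,
    help="set this flag if dataset generation is needed",
)

# BooleanOptionalAction registers both --generate-dataset and --no-generate-dataset.
omitted = parser.parse_args([]).generate_dataset
enabled = parser.parse_args(["--generate-dataset"]).generate_dataset
disabled = parser.parse_args(["--no-generate-dataset"]).generate_dataset
```

With `default=True` only the first case changes: omitting the flag then triggers generation, and users have to pass the `--no-` form to skip it.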

emnigma · 2024-02-01T18:41:10Z

AIAgent/run_common_model_training.py

- ref_model_initializer = lambda: RefStateModelEncoderLastLayer(
- hidden_channels=32, out_channels=8
- )
+ print(GeneralConfig.DEVICE)


why print it twice?

emnigma · 2024-02-01T18:42:26Z

AIAgent/run_common_model_training.py

- maps: list[GameMap], ref_model_init: t.Callable[[], torch.nn.Module]
-):
- global DATASET_BASE_PATH
+def generate_dataset():


it is better to parametrise function, global constants are not good in general
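A minimal illustration of the suggestion, with a made-up function body: the base path arrives as an argument, so the function can be called with a temporary directory in tests and does not depend on module-level state. The parameter names and the returned value are hypothetical.

```python
from pathlib import Path


def generate_dataset(dataset_base_path, map_names):
    """Toy stand-in: return the per-map output paths under the given base path."""
    base = Path(dataset_base_path)
    return [base / f"{map_name}.json" for map_name in map_names]


output_paths = generate_dataset("report/dataset", ["map_a", "map_b"])
```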

emnigma · 2024-02-01T18:42:35Z

AIAgent/run_common_model_training.py

 )
- dataset.save()
+ dataset.load()
 return dataset


 def get_dataset():


parametrise?

Anya497 force-pushed the merge_steps branch 2 times, most recently from e22e828 to 9e68fc6 Compare January 25, 2024 14:51

gsvgit requested review from gsvgit and emnigma January 25, 2024 14:57

emnigma requested changes Jan 26, 2024

emnigma reviewed Jan 27, 2024

Anya497 force-pushed the merge_steps branch from 700b215 to 7661343 Compare January 29, 2024 14:24

Anya497 added 7 commits January 29, 2024 17:42

add merge steps function (doesn't work yet)

52918a6

Fix merge steps function. Add dataset directory creation.

c09e173

Swap errors and steps number, refactor in merge_steps function, use new algorithm for parallel validation.

2fe0a3b

Refactor ServerDataloaderHeteroVector class to process dataset for post-training sequentially. Turn off weights loading.

d9609c6

Parallel dataset processing

aabf4eb

Replace strings with tuples in HeteroData keys. Refactor code for pretraining. Add pretraining dataset generation.

b413357

Fix file descriptors leak in dataset processing function. Little refactor.

b81321b

Anya497 force-pushed the merge_steps branch from 183d2f9 to b81321b Compare January 31, 2024 14:45

Fix path to server. Delete progress bar. Make dataset_base_path local variable.

be68ecf

Anya497 force-pushed the merge_steps branch from c5a8f9f to f6370e6 Compare February 2, 2024 09:15

Fix bug in table creating. Add tabulate to requirements.txt

942c783

Anya497 force-pushed the merge_steps branch from f6370e6 to 942c783 Compare February 2, 2024 09:21

Anya497 requested a review from emnigma February 2, 2024 11:07

Set edge_dim=2 and replace -1 with exact values in order to fix weights loading.

3ed6e84

Anya497 force-pushed the merge_steps branch from 37e8235 to 3ed6e84 Compare February 2, 2024 11:14

emnigma requested changes Feb 6, 2024

Anya497 and others added 4 commits February 16, 2024 10:02

Big refactor of dataset and code at all. Add saving each step in a particular file. Split running, training and validation by different files. Create dataloader. Delete support of models learned with genetic algorithms.

9ca1916

Merge pull request #39 from gsvgit/refactor

8d82fdf

Big refactor

Add next_free_port func, remove DEBUG mode

4b97eb2

Add new server launcher

a5edfe5

emnigma force-pushed the merge_steps branch 2 times, most recently from b9776a2 to 5016902 Compare March 24, 2024 15:45

ci: Add run script

aac7669

ancavar and others added 3 commits April 5, 2024 22:17

Add incrementalization for client

29be6e6

Reformat code

8f9c0f5

ci: Upd VSharp version

ccc17a5

emnigma force-pushed the merge_steps branch from 95cdd07 to ccc17a5 Compare April 5, 2024 19:46

emnigma added 10 commits April 9, 2024 13:49

Add ONNX converter

6998626

Remove redundant comments

3d3eb7b

Fix typo

86ab268

fix: Support model init from console args

8017f67

fix: Remove unnecessary imports

f630697

feat: Add pip caching

2dd4ed5

fix: Change steps order

8143c19

feat: Centralize inference & ONNX keys

9c4953d

fix: Make all strings depend on TORCH class

a4b7c4a

fix: Add onyx.py desc of --import-model-fqn flag

73b05c0

Anya497 closed this Apr 12, 2024

Anya497 deleted the merge_steps branch April 12, 2024 12:25
