
Fix TypeError: Object of type int64 is not JSON serializable #24340

Merged
merged 10 commits into huggingface:main on Jun 27, 2023
Conversation

xiaoli
Contributor

@xiaoli commented Jun 18, 2023

What does this PR do?

Fixes "TypeError: Object of type int64 is not JSON serializable".

Who can review?

Anyone in the community is free to review the PR once the tests have passed.

@amyeroberts
Collaborator

Hi @xiaoli, thanks for opening this PR.

Could you provide some more information about when the error occurs? Does this happen when running with the values from the example readme?

@xiaoli
Contributor Author

xiaoli commented Jun 20, 2023

Hi @amyeroberts, it happened when executing ./run_no_trainer.sh; everything works smoothly except the last step, which saves the results to a JSON file.

I got this error: TypeError: Object of type int64 is not JSON serializable, so this commit tries to fix that.

This happened on my Ubuntu 22.04 workstation.

@xiaoli
Contributor Author

xiaoli commented Jun 20, 2023

(transformers) ➜  token-classification git:(main) ./run_no_trainer.sh && echo $(date +%d.%m.%y-%H:%M:%S)
The following values were not passed to `accelerate launch` and had defaults used instead:
	`--num_processes` was set to a value of `0`
	`--num_machines` was set to a value of `1`
	`--mixed_precision` was set to a value of `'no'`
	`--dynamo_backend` was set to a value of `'no'`
To avoid this warning pass in values for each of the problematic parameters or run `accelerate config`.
06/20/2023 10:54:40 - INFO - __main__ - Distributed environment: DistributedType.NO
Num processes: 1
Process index: 0
Local process index: 0
Device: mps

Mixed precision type: no

Downloading builder script: 100%|████████████████████████████████████████████| 9.57k/9.57k [00:00<00:00, 8.80MB/s]
Downloading metadata: 100%|██████████████████████████████████████████████████| 3.73k/3.73k [00:00<00:00, 9.41MB/s]
Downloading readme: 100%|████████████████████████████████████████████████████| 12.3k/12.3k [00:00<00:00, 16.9MB/s]
Downloading and preparing dataset conll2003/conll2003 to /Users/xiaoliwang/.cache/huggingface/datasets/conll2003/conll2003/1.0.0/9a4d16a94f8674ba3466315300359b0acd891b68b6c8743ddf60b9c702adce98...
Downloading data: 100%|████████████████████████████████████████████████████████| 983k/983k [00:00<00:00, 3.57MB/s]
Generating train split:   0%|                                                    | 0/14041 [00:00<?, ? examples/s]06/20/2023 10:54:47 - INFO - datasets_modules.datasets.conll2003.9a4d16a94f8674ba3466315300359b0acd891b68b6c8743ddf60b9c702adce98.conll2003 - ⏳ Generating examples from = /Users/xiaoliwang/.cache/huggingface/datasets/downloads/extracted/31a52031f62b2a9281d3b6c2723006e2fa05b33157a4249729067b79f7aa068a/train.txt
Generating validation split:   0%|                                                | 0/3250 [00:00<?, ? examples/s]06/20/2023 10:54:48 - INFO - datasets_modules.datasets.conll2003.9a4d16a94f8674ba3466315300359b0acd891b68b6c8743ddf60b9c702adce98.conll2003 - ⏳ Generating examples from = /Users/xiaoliwang/.cache/huggingface/datasets/downloads/extracted/31a52031f62b2a9281d3b6c2723006e2fa05b33157a4249729067b79f7aa068a/valid.txt
Generating test split:   0%|                                                      | 0/3453 [00:00<?, ? examples/s]06/20/2023 10:54:48 - INFO - datasets_modules.datasets.conll2003.9a4d16a94f8674ba3466315300359b0acd891b68b6c8743ddf60b9c702adce98.conll2003 - ⏳ Generating examples from = /Users/xiaoliwang/.cache/huggingface/datasets/downloads/extracted/31a52031f62b2a9281d3b6c2723006e2fa05b33157a4249729067b79f7aa068a/test.txt
Dataset conll2003 downloaded and prepared to /Users/xiaoliwang/.cache/huggingface/datasets/conll2003/conll2003/1.0.0/9a4d16a94f8674ba3466315300359b0acd891b68b6c8743ddf60b9c702adce98. Subsequent calls will reuse this data.
100%|█████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 1282.14it/s]
loading configuration file config.json from cache at /Users/xiaoliwang/.cache/huggingface/hub/models--bert-base-uncased/snapshots/a265f773a47193eed794233aa2a0f0bb6d3eaa63/config.json
Model config BertConfig {
  "_name_or_path": "bert-base-uncased",
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "classifier_dropout": null,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "id2label": {
    "0": "LABEL_0",
    "1": "LABEL_1",
    "2": "LABEL_2",
    "3": "LABEL_3",
    "4": "LABEL_4",
    "5": "LABEL_5",
    "6": "LABEL_6",
    "7": "LABEL_7",
    "8": "LABEL_8"
  },
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "label2id": {
    "LABEL_0": 0,
    "LABEL_1": 1,
    "LABEL_2": 2,
    "LABEL_3": 3,
    "LABEL_4": 4,
    "LABEL_5": 5,
    "LABEL_6": 6,
    "LABEL_7": 7,
    "LABEL_8": 8
  },
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "position_embedding_type": "absolute",
  "transformers_version": "4.31.0.dev0",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 30522
}

loading configuration file config.json from cache at /Users/xiaoliwang/.cache/huggingface/hub/models--bert-base-uncased/snapshots/a265f773a47193eed794233aa2a0f0bb6d3eaa63/config.json
Model config BertConfig {
  "_name_or_path": "bert-base-uncased",
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "classifier_dropout": null,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "position_embedding_type": "absolute",
  "transformers_version": "4.31.0.dev0",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 30522
}

loading file vocab.txt from cache at /Users/xiaoliwang/.cache/huggingface/hub/models--bert-base-uncased/snapshots/a265f773a47193eed794233aa2a0f0bb6d3eaa63/vocab.txt
loading file tokenizer.json from cache at /Users/xiaoliwang/.cache/huggingface/hub/models--bert-base-uncased/snapshots/a265f773a47193eed794233aa2a0f0bb6d3eaa63/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /Users/xiaoliwang/.cache/huggingface/hub/models--bert-base-uncased/snapshots/a265f773a47193eed794233aa2a0f0bb6d3eaa63/tokenizer_config.json
loading configuration file config.json from cache at /Users/xiaoliwang/.cache/huggingface/hub/models--bert-base-uncased/snapshots/a265f773a47193eed794233aa2a0f0bb6d3eaa63/config.json
Model config BertConfig {
  "_name_or_path": "bert-base-uncased",
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "classifier_dropout": null,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "position_embedding_type": "absolute",
  "transformers_version": "4.31.0.dev0",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 30522
}

Downloading model.safetensors: 100%|███████████████████████████████████████████| 440M/440M [00:22<00:00, 19.8MB/s]
loading weights file model.safetensors from cache at /Users/xiaoliwang/.cache/huggingface/hub/models--bert-base-uncased/snapshots/a265f773a47193eed794233aa2a0f0bb6d3eaa63/model.safetensors
Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForTokenClassification: ['cls.predictions.transform.LayerNorm.bias', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias', 'cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.LayerNorm.weight']
- This IS expected if you are initializing BertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForTokenClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.weight', 'classifier.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
06/20/2023 10:55:15 - INFO - __main__ - Sample 622 of the training set: {'input_ids': [101, 2522, 6657, 15222, 6962, 1015, 19739, 20486, 2072, 1014, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 'token_type_ids': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 'labels': [-100, 3, -100, -100, -100, 0, 3, -100, -100, 0, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, 
-100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100]}.
06/20/2023 10:55:15 - INFO - __main__ - Sample 12142 of the training set: {'input_ids': [101, 2019, 26354, 4861, 2056, 2008, 9779, 9048, 2015, 1010, 2007, 2095, 1011, 2203, 2727, 7045, 1997, 2149, 1002, 2184, 1012, 1023, 2454, 1998, 10067, 1997, 1002, 2184, 1012, 1019, 2454, 1010, 2052, 2022, 3205, 2006, 1996, 5548, 4518, 3863, 1010, 2021, 2106, 2025, 2360, 2043, 1012, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 'token_type_ids': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 'labels': [-100, 0, 3, 0, 0, 0, 3, -100, -100, 0, 0, 0, -100, -100, 0, 0, 0, 7, -100, 0, -100, -100, 0, 0, 0, 0, 0, 0, -100, -100, 0, 0, 0, 0, 0, 0, 0, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, 
-100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100]}.
06/20/2023 10:55:15 - INFO - __main__ - Sample 4570 of the training set: {'input_ids': [101, 2117, 2679, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 'token_type_ids': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 'attention_mask': [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 'labels': [-100, 0, 0, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, 
-100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100]}.
Downloading builder script: 100%|████████████████████████████████████████████| 6.34k/6.34k [00:00<00:00, 9.02MB/s]
06/20/2023 10:55:18 - INFO - __main__ - ***** Running training *****
06/20/2023 10:55:18 - INFO - __main__ -   Num examples = 14041
06/20/2023 10:55:18 - INFO - __main__ -   Num Epochs = 3
06/20/2023 10:55:18 - INFO - __main__ -   Instantaneous batch size per device = 8
06/20/2023 10:55:18 - INFO - __main__ -   Total train batch size (w. parallel, distributed & accumulation) = 8
06/20/2023 10:55:18 - INFO - __main__ -   Gradient Accumulation steps = 1
06/20/2023 10:55:18 - INFO - __main__ -   Total optimization steps = 5268
 33%|███████████████████████▋                                               | 1756/5268 [24:08<1:29:30,  1.53s/it]epoch 0: {'LOC_precision': 0.9499192245557351, 'LOC_recall': 0.9602612955906369, 'LOC_f1': 0.9550622631293991, 'LOC_number': 1837, 'MISC_precision': 0.8572972972972973, 'MISC_recall': 0.8600867678958786, 'MISC_f1': 0.858689767190038, 'MISC_number': 922, 'ORG_precision': 0.8539482879105521, 'ORG_recall': 0.9112602535421327, 'ORG_f1': 0.8816738816738816, 'ORG_number': 1341, 'PER_precision': 0.9776810016330975, 'PER_recall': 0.9766177270255574, 'PER_f1': 0.9771490750816105, 'PER_number': 1839, 'overall_precision': 0.9214876033057852, 'overall_recall': 0.9387102205758545, 'overall_f1': 0.9300191842522312, 'overall_accuracy': 0.9868336482091035}
 67%|████████████████████████████████████████████████▋                        | 3512/5268 [50:27<18:04,  1.62it/s]epoch 1: {'LOC_precision': 0.9637760702524698, 'LOC_recall': 0.9559063690800218, 'LOC_f1': 0.9598250888220825, 'LOC_number': 1837, 'MISC_precision': 0.8524251805985552, 'MISC_recall': 0.89587852494577, 'MISC_f1': 0.8736118455843469, 'MISC_number': 922, 'ORG_precision': 0.892675852066715, 'ORG_recall': 0.9179716629381058, 'ORG_f1': 0.9051470588235293, 'ORG_number': 1341, 'PER_precision': 0.9721925133689839, 'PER_recall': 0.9885807504078303, 'PER_f1': 0.9803181450525748, 'PER_number': 1839, 'overall_precision': 0.9322847682119205, 'overall_recall': 0.9481394174103385, 'overall_f1': 0.940145254194841, 'overall_accuracy': 0.9880217361665661}
100%|███████████████████████████████████████████████████████████████████████| 5268/5268 [1:15:39<00:00,  1.44it/s]epoch 2: {'LOC_precision': 0.9538378958668814, 'LOC_recall': 0.9673380511703865, 'LOC_f1': 0.9605405405405405, 'LOC_number': 1837, 'MISC_precision': 0.8783351120597652, 'MISC_recall': 0.8926247288503254, 'MISC_f1': 0.8854222700376547, 'MISC_number': 922, 'ORG_precision': 0.9074759437453738, 'ORG_recall': 0.9142431021625652, 'ORG_f1': 0.9108469539375927, 'ORG_number': 1341, 'PER_precision': 0.9751619870410367, 'PER_recall': 0.9820554649265906, 'PER_f1': 0.978596586290978, 'PER_number': 1839, 'overall_precision': 0.9381975678827253, 'overall_recall': 0.94830779592524, 'overall_f1': 0.9432255903533747, 'overall_accuracy': 0.9891513935687436}
Configuration saved in /tmp/test-ner/config.json
Model weights saved in /tmp/test-ner/pytorch_model.bin
tokenizer config file saved in /tmp/test-ner/tokenizer_config.json
Special tokens file saved in /tmp/test-ner/special_tokens_map.json
Traceback (most recent call last):
  File "/Users/xiaoliwang/repo/research/huggingface/transformers/examples/pytorch/token-classification/run_ner_no_trainer.py", line 784, in <module>
    main()
  File "/Users/xiaoliwang/repo/research/huggingface/transformers/examples/pytorch/token-classification/run_ner_no_trainer.py", line 780, in main
    json.dump(all_results, f)
  File "/Users/xiaoliwang/development/miniforge3/envs/transformers/lib/python3.11/json/__init__.py", line 179, in dump
    for chunk in iterable:
  File "/Users/xiaoliwang/development/miniforge3/envs/transformers/lib/python3.11/json/encoder.py", line 432, in _iterencode
    yield from _iterencode_dict(o, _current_indent_level)
  File "/Users/xiaoliwang/development/miniforge3/envs/transformers/lib/python3.11/json/encoder.py", line 406, in _iterencode_dict
    yield from chunks
  File "/Users/xiaoliwang/development/miniforge3/envs/transformers/lib/python3.11/json/encoder.py", line 439, in _iterencode
    o = _default(o)
        ^^^^^^^^^^^
  File "/Users/xiaoliwang/development/miniforge3/envs/transformers/lib/python3.11/json/encoder.py", line 180, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type int64 is not JSON serializable
100%|███████████████████████████████████████████████████████████████████████| 5268/5268 [1:17:11<00:00,  1.14it/s]
Traceback (most recent call last):
  File "/Users/xiaoliwang/development/miniforge3/envs/transformers/bin/accelerate", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/Users/xiaoliwang/development/miniforge3/envs/transformers/lib/python3.11/site-packages/accelerate/commands/accelerate_cli.py", line 45, in main
    args.func(args)
  File "/Users/xiaoliwang/development/miniforge3/envs/transformers/lib/python3.11/site-packages/accelerate/commands/launch.py", line 969, in launch_command
    simple_launcher(args)
  File "/Users/xiaoliwang/development/miniforge3/envs/transformers/lib/python3.11/site-packages/accelerate/commands/launch.py", line 625, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/Users/xiaoliwang/development/miniforge3/envs/transformers/bin/python3.11', 'run_ner_no_trainer.py', '--model_name_or_path', 'bert-base-uncased', '--dataset_name', 'conll2003', '--output_dir', '/tmp/test-ner', '--pad_to_max_length', '--task_name', 'ner', '--return_entity_level_metrics']' returned non-zero exit status 1.

I have reproduced this on my MacBook Air M1 with MPS acceleration enabled. The full error messages, posted above, are the same as on my Ubuntu workstation.
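The failure is easy to reproduce in isolation: the evaluation step returns metrics as numpy scalars, which the stdlib json encoder rejects. A minimal sketch (the dictionary contents are illustrative, mirroring the metrics logged above):

```python
import json

import numpy as np

# seqeval-style metrics come back as numpy scalar types,
# e.g. entity counts as np.int64 -- not native Python ints.
all_results = {"ORG_number": np.int64(1341), "overall_f1": 0.9432}

try:
    json.dumps(all_results)
except TypeError as err:
    print(err)  # Object of type int64 is not JSON serializable
```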

Collaborator

@amyeroberts left a comment


@xiaoli Thanks for explaining and adding this fix!

Could we instead convert the np.int64 values in all_results before saving? That way we don't blindly try to serialize everything to int within the json.dump call.
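A sketch of that suggestion, using a small hypothetical helper (to_python is not part of the PR) that walks the results and casts numpy scalars to native Python types before serialization:

```python
import json

import numpy as np

def to_python(obj):
    # Recursively convert numpy scalar types to native Python types;
    # the container handling covers the dict/list shapes seen in metrics.
    if isinstance(obj, dict):
        return {key: to_python(value) for key, value in obj.items()}
    if isinstance(obj, list):
        return [to_python(value) for value in obj]
    if isinstance(obj, np.integer):
        return int(obj)
    if isinstance(obj, np.floating):
        return float(obj)
    return obj

all_results = {"ORG_number": np.int64(1341), "overall_f1": np.float64(0.94)}
print(json.dumps(to_python(all_results)))  # {"ORG_number": 1341, "overall_f1": 0.94}
```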

@xiaoli
Contributor Author

xiaoli commented Jun 22, 2023

@amyeroberts Thanks for your comments!

I think your idea is good, and I understand that your intention is to avoid blindly converting everything to int.

But according to this page https://docs.python.org/3/library/json.html

> If specified, default should be a function that gets called for objects that can't otherwise be serialized. It should return a JSON encodable version of the object or raise a [TypeError](https://docs.python.org/3/library/exceptions.html#TypeError). If not specified, [TypeError](https://docs.python.org/3/library/exceptions.html#TypeError) is raised.

From my understanding, the default parameter simply supplies a fallback converter function, and in this case that function is just a concise int(). I don't think we need to write a new handler covering all the different object types here, because the only value we cannot serialize is np.int64.
If something more comes up in the future, I could certainly write a new handler to take care of it, but for the time being I think default=int is a good enough solution :)
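For reference, the default=int approach described above, as a minimal sketch (the dictionary is illustrative):

```python
import json

import numpy as np

all_results = {"ORG_number": np.int64(1341)}

# json.dump/json.dumps call `default` only for objects the encoder
# cannot serialize natively; here int() converts the np.int64 values.
print(json.dumps(all_results, default=int))  # {"ORG_number": 1341}
```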

@xiaoli
Contributor Author

xiaoli commented Jun 26, 2023

Hi @amyeroberts, I have changed it a little bit, as you suggested before :)

Collaborator

@amyeroberts left a comment


Thanks for fixing!

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Jun 26, 2023

The documentation is not available anymore as the PR was closed or merged.

@amyeroberts
Collaborator

@xiaoli For the quality CI checks, you'll need to run make style at the top level of the repo and push any changes that are applied. Once this is done, CI should all be green and branch good to merge in 👍

@xiaoli
Contributor Author

xiaoli commented Jun 26, 2023

> @xiaoli For the quality CI checks, you'll need to run make style at the top level of the repo and push any changes that are applied. Once this is done, CI should all be green and branch good to merge in 👍

@amyeroberts Thanks for the instructions, but I'm afraid quite a few files were changed after running make style:

(transformers) ➜  transformers git:(main) ✗ git status
On branch main
Your branch is ahead of 'origin/main' by 8 commits.
  (use "git push" to publish your local commits)

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
	modified:   examples/research_projects/codeparrot/scripts/human_eval.py
	modified:   examples/research_projects/fsner/src/fsner/tokenizer_utils.py
	modified:   examples/research_projects/jax-projects/big_bird/prepare_natural_questions.py
	modified:   examples/research_projects/luke/run_luke_ner_no_trainer.py
	modified:   examples/research_projects/lxmert/modeling_frcnn.py
	modified:   examples/research_projects/visual_bert/modeling_frcnn.py
	modified:   src/transformers/generation/logits_process.py
	modified:   src/transformers/generation/tf_logits_process.py
	modified:   src/transformers/generation/tf_utils.py
	modified:   src/transformers/keras_callbacks.py
	modified:   src/transformers/models/bert/convert_bert_pytorch_checkpoint_to_original_tf.py
	modified:   src/transformers/models/bigbird_pegasus/convert_bigbird_pegasus_tf_to_pytorch.py
	modified:   src/transformers/models/deta/modeling_deta.py
	modified:   src/transformers/models/dpr/tokenization_dpr.py
	modified:   src/transformers/models/dpr/tokenization_dpr_fast.py
	modified:   src/transformers/models/pegasus/convert_pegasus_tf_to_pytorch.py
	modified:   src/transformers/models/sam/processing_sam.py
	modified:   tests/generation/test_framework_agnostic.py
	modified:   tests/models/codegen/test_modeling_codegen.py
	modified:   tests/models/data2vec/test_modeling_data2vec_audio.py
	modified:   tests/models/encodec/test_modeling_encodec.py
	modified:   tests/models/gpt2/test_modeling_gpt2.py
	modified:   tests/models/gptj/test_modeling_gptj.py
	modified:   tests/models/hubert/test_modeling_hubert.py
	modified:   tests/models/mctct/test_modeling_mctct.py
	modified:   tests/models/rwkv/test_modeling_rwkv.py
	modified:   tests/models/sew/test_modeling_sew.py
	modified:   tests/models/sew_d/test_modeling_sew_d.py
	modified:   tests/models/speecht5/test_modeling_speecht5.py
	modified:   tests/models/unispeech/test_modeling_unispeech.py
	modified:   tests/models/unispeech_sat/test_modeling_unispeech_sat.py
	modified:   tests/models/wav2vec2/test_modeling_flax_wav2vec2.py
	modified:   tests/models/wav2vec2/test_modeling_wav2vec2.py
	modified:   tests/models/wav2vec2_conformer/test_modeling_wav2vec2_conformer.py
	modified:   tests/models/wavlm/test_modeling_wavlm.py
	modified:   tests/models/whisper/test_modeling_whisper.py
	modified:   tests/onnx/test_onnx.py
	modified:   tests/test_modeling_tf_common.py
	modified:   tests/test_tokenization_common.py
	modified:   tests/trainer/test_trainer_seq2seq.py
	modified:   utils/check_copies.py
	modified:   utils/create_dummy_models.py
	modified:   utils/tests_fetcher.py

no changes added to commit (use "git add" and/or "git commit -a")

@xiaoli
Contributor Author

xiaoli commented Jun 26, 2023

@amyeroberts make style changes are committed, thank you 😁

@amyeroberts merged commit 239ace1 into huggingface:main on Jun 27, 2023