Eval only for MS MARCO #47

Open
tangzhy opened this issue Dec 14, 2020 · 1 comment
Comments

@tangzhy

tangzhy commented Dec 14, 2020

Hi, I'm using the Colab code exactly as it appears in your demo.

Model config

The configuration is largely the same as in the demo and is shown below, except that OUTPUT_DIR now points to the directory where I decompressed your BERT Base model, and the batch size is set to 8 because I'm running on a V100-16GB. I also set MAX_EVAL_EXAMPLES = 100 because the full evaluation takes too long.

BERT_PRETRAINED_DIR = '/search/odin/Data/pre-trained-models/bert/uncased_L-12_H-768_A-12/'
OUTPUT_DIR = '/search/odin/Data/marco-passage-ranking/models/BERT_Base_trained_on_MSMARCO/'
DATA_DIR = '/search/odin/Data/marco-passage-ranking/tfrecord/'

USE_TPU = False
DO_TRAIN = False  # Whether to run training.
DO_EVAL = True  # Whether to run evaluation.
TRAIN_BATCH_SIZE = 8
EVAL_BATCH_SIZE = 8
LEARNING_RATE = 1e-6
NUM_TRAIN_STEPS = 100
NUM_WARMUP_STEPS = 40000
MAX_SEQ_LENGTH = 512
SAVE_CHECKPOINTS_STEPS = 10
ITERATIONS_PER_LOOP = 100
NUM_TPU_CORES = 8
BERT_CONFIG_FILE = os.path.join(BERT_PRETRAINED_DIR, 'bert_config.json')
INIT_CHECKPOINT = os.path.join(BERT_PRETRAINED_DIR, 'bert_model.ckpt')
MSMARCO_OUTPUT = False  # Write the predictions to a MS-MARCO-formatted file.
MAX_EVAL_EXAMPLES = 100  # Maximum number of examples to be evaluated.
NUM_EVAL_DOCS = 1000  # Number of docs per query in the dev and eval files.
METRICS_MAP = ['MAP', 'RPrec', 'NDCG', 'MRR', 'MRR@10']

Logging

The log and the resulting metrics are listed below. My concerns are:

  1. Is the model properly loaded from your fine-tuned checkpoint? BTW, no log message like *INIT_FROM_CKPT* appears.
  2. Why is the trained model's performance so poor? MRR@10 = 0.01 on the first 100 eval examples. Is that expected, given that I only ran 100 eval examples (100 * 1000 entries are actually predicted)?
  3. If the model is loaded improperly, how should I load it instead? Is there any example code?
WARNING:tensorflow:From /search/odin/Codes/marco-passage-ranking/modeling.py:101: The name tf.gfile.GFile is deprecated. Please use tf.io.gfile.GFile instead.

W1214 14:42:01.364466 140473733826368 module_wrapper.py:139] From /search/odin/Codes/marco-passage-ranking/modeling.py:101: The name tf.gfile.GFile is deprecated. Please use tf.io.gfile.GFile instead.

WARNING:tensorflow:
The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
  * https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.

W1214 14:42:01.366216 140473733826368 lazy_loader.py:50] 
The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
  * https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.

WARNING:tensorflow:Estimator's model_fn (<function model_fn_builder.<locals>.model_fn at 0x7fc2139b9cb0>) includes params argument, but params are not passed to Estimator.
W1214 14:42:01.737018 140473733826368 estimator.py:1994] Estimator's model_fn (<function model_fn_builder.<locals>.model_fn at 0x7fc2139b9cb0>) includes params argument, but params are not passed to Estimator.
INFO:tensorflow:Using config: {'_model_dir': '/search/odin/Data/marco-passage-ranking/models/BERT_Base_trained_on_MSMARCO/', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 10, '_save_checkpoints_secs': None, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': None, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7fc2104ed710>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1, '_tpu_config': TPUConfig(iterations_per_loop=100, num_shards=8, num_cores_per_replica=None, per_host_input_for_training=3, tpu_job_name=None, initial_infeed_sleep_secs=None, input_partition_dims=None, eval_training_input_configuration=2, experimental_host_call_every_n_steps=1), '_cluster': None}
I1214 14:42:01.739134 140473733826368 estimator.py:212] Using config: {'_model_dir': '/search/odin/Data/marco-passage-ranking/models/BERT_Base_trained_on_MSMARCO/', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 10, '_save_checkpoints_secs': None, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': None, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7fc2104ed710>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1, '_tpu_config': TPUConfig(iterations_per_loop=100, num_shards=8, num_cores_per_replica=None, per_host_input_for_training=3, tpu_job_name=None, initial_infeed_sleep_secs=None, input_partition_dims=None, eval_training_input_configuration=2, experimental_host_call_every_n_steps=1), '_cluster': None}
INFO:tensorflow:_TPUContext: eval_on_tpu True
I1214 14:42:01.740160 140473733826368 tpu_context.py:220] _TPUContext: eval_on_tpu True
WARNING:tensorflow:eval_on_tpu ignored because use_tpu is False.
W1214 14:42:01.740942 140473733826368 tpu_context.py:222] eval_on_tpu ignored because use_tpu is False.
INFO:tensorflow:***** Running evaluation *****
I1214 14:42:01.741715 140473733826368 <ipython-input-3-e0f70c5ba30e>:280] ***** Running evaluation *****
INFO:tensorflow:  Batch size = 8
I1214 14:42:01.742430 140473733826368 <ipython-input-3-e0f70c5ba30e>:281]   Batch size = 8
WARNING:tensorflow:From /root/Softwares/anaconda3/envs/tf1.15/lib/python3.7/site-packages/tensorflow_core/python/ops/resource_variable_ops.py:1630: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
W1214 14:42:01.750143 140473733826368 deprecation.py:506] From /root/Softwares/anaconda3/envs/tf1.15/lib/python3.7/site-packages/tensorflow_core/python/ops/resource_variable_ops.py:1630: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
WARNING:tensorflow:From /root/Softwares/anaconda3/envs/tf1.15/lib/python3.7/site-packages/tensorflow_core/python/autograph/converters/directives.py:119: The name tf.FixedLenSequenceFeature is deprecated. Please use tf.io.FixedLenSequenceFeature instead.

W1214 14:42:01.832499 140473733826368 module_wrapper.py:139] From /root/Softwares/anaconda3/envs/tf1.15/lib/python3.7/site-packages/tensorflow_core/python/autograph/converters/directives.py:119: The name tf.FixedLenSequenceFeature is deprecated. Please use tf.io.FixedLenSequenceFeature instead.

WARNING:tensorflow:From /root/Softwares/anaconda3/envs/tf1.15/lib/python3.7/site-packages/tensorflow_core/python/autograph/converters/directives.py:119: The name tf.FixedLenFeature is deprecated. Please use tf.io.FixedLenFeature instead.

W1214 14:42:01.833697 140473733826368 module_wrapper.py:139] From /root/Softwares/anaconda3/envs/tf1.15/lib/python3.7/site-packages/tensorflow_core/python/autograph/converters/directives.py:119: The name tf.FixedLenFeature is deprecated. Please use tf.io.FixedLenFeature instead.

WARNING:tensorflow:From /root/Softwares/anaconda3/envs/tf1.15/lib/python3.7/site-packages/tensorflow_core/python/autograph/converters/directives.py:119: The name tf.parse_single_example is deprecated. Please use tf.io.parse_single_example instead.

W1214 14:42:01.834634 140473733826368 module_wrapper.py:139] From /root/Softwares/anaconda3/envs/tf1.15/lib/python3.7/site-packages/tensorflow_core/python/autograph/converters/directives.py:119: The name tf.parse_single_example is deprecated. Please use tf.io.parse_single_example instead.

WARNING:tensorflow:From /search/odin/Codes/marco-passage-ranking/modeling.py:190: The name tf.variable_scope is deprecated. Please use tf.compat.v1.variable_scope instead.

W1214 14:42:02.021957 140473733826368 module_wrapper.py:139] From /search/odin/Codes/marco-passage-ranking/modeling.py:190: The name tf.variable_scope is deprecated. Please use tf.compat.v1.variable_scope instead.

WARNING:tensorflow:From /search/odin/Codes/marco-passage-ranking/modeling.py:458: The name tf.get_variable is deprecated. Please use tf.compat.v1.get_variable instead.

W1214 14:42:02.024990 140473733826368 module_wrapper.py:139] From /search/odin/Codes/marco-passage-ranking/modeling.py:458: The name tf.get_variable is deprecated. Please use tf.compat.v1.get_variable instead.

WARNING:tensorflow:From /search/odin/Codes/marco-passage-ranking/modeling.py:743: dense (from tensorflow.python.layers.core) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.Dense instead.
W1214 14:42:02.076209 140473733826368 deprecation.py:323] From /search/odin/Codes/marco-passage-ranking/modeling.py:743: dense (from tensorflow.python.layers.core) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.Dense instead.
WARNING:tensorflow:From /root/Softwares/anaconda3/envs/tf1.15/lib/python3.7/site-packages/tensorflow_core/python/layers/core.py:187: Layer.apply (from tensorflow.python.keras.engine.base_layer) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `layer.__call__` method instead.
W1214 14:42:02.077801 140473733826368 deprecation.py:323] From /root/Softwares/anaconda3/envs/tf1.15/lib/python3.7/site-packages/tensorflow_core/python/layers/core.py:187: Layer.apply (from tensorflow.python.keras.engine.base_layer) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `layer.__call__` method instead.
WARNING:tensorflow:From /search/odin/Codes/marco-passage-ranking/modeling.py:314: The name tf.erf is deprecated. Please use tf.math.erf instead.

W1214 14:42:02.173272 140473733826368 module_wrapper.py:139] From /search/odin/Codes/marco-passage-ranking/modeling.py:314: The name tf.erf is deprecated. Please use tf.math.erf instead.

WARNING:tensorflow:From /root/Softwares/anaconda3/envs/tf1.15/lib/python3.7/site-packages/tensorflow_core/python/ops/array_ops.py:1475: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
W1214 14:42:04.507863 140473733826368 deprecation.py:323] From /root/Softwares/anaconda3/envs/tf1.15/lib/python3.7/site-packages/tensorflow_core/python/ops/array_ops.py:1475: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
WARNING:tensorflow:Read 10000 examples in 136 secs. Metrics so far:
W1214 14:44:17.816093 140473733826368 <ipython-input-3-e0f70c5ba30e>:355] Read 10000 examples in 136 secs. Metrics so far:
WARNING:tensorflow:MAP  RPrec  NDCG  MRR  MRR@10
W1214 14:44:17.818014 140473733826368 <ipython-input-3-e0f70c5ba30e>:356] MAP  RPrec  NDCG  MRR  MRR@10
WARNING:tensorflow:[0.00090869 0.         0.07455417 0.00085925 0.        ]
W1214 14:44:17.818776 140473733826368 <ipython-input-3-e0f70c5ba30e>:357] [0.00090869 0.         0.07455417 0.00085925 0.        ]
WARNING:tensorflow:Read 20000 examples in 262 secs. Metrics so far:
W1214 14:46:24.263339 140473733826368 <ipython-input-3-e0f70c5ba30e>:355] Read 20000 examples in 262 secs. Metrics so far:
WARNING:tensorflow:MAP  RPrec  NDCG  MRR  MRR@10
W1214 14:46:24.265303 140473733826368 <ipython-input-3-e0f70c5ba30e>:356] MAP  RPrec  NDCG  MRR  MRR@10
WARNING:tensorflow:[0.00100144 0.         0.08361872 0.00097672 0.        ]
W1214 14:46:24.266042 140473733826368 <ipython-input-3-e0f70c5ba30e>:357] [0.00100144 0.         0.08361872 0.00097672 0.        ]
WARNING:tensorflow:Read 30000 examples in 388 secs. Metrics so far:
W1214 14:48:30.680611 140473733826368 <ipython-input-3-e0f70c5ba30e>:355] Read 30000 examples in 388 secs. Metrics so far:
WARNING:tensorflow:MAP  RPrec  NDCG  MRR  MRR@10
W1214 14:48:30.682710 140473733826368 <ipython-input-3-e0f70c5ba30e>:356] MAP  RPrec  NDCG  MRR  MRR@10
WARNING:tensorflow:[0.00108026 0.         0.0901455  0.00106377 0.        ]
W1214 14:48:30.683455 140473733826368 <ipython-input-3-e0f70c5ba30e>:357] [0.00108026 0.         0.0901455  0.00106377 0.        ]
WARNING:tensorflow:Read 40000 examples in 515 secs. Metrics so far:
W1214 14:50:37.156615 140473733826368 <ipython-input-3-e0f70c5ba30e>:355] Read 40000 examples in 515 secs. Metrics so far:
WARNING:tensorflow:MAP  RPrec  NDCG  MRR  MRR@10
W1214 14:50:37.158547 140473733826368 <ipython-input-3-e0f70c5ba30e>:356] MAP  RPrec  NDCG  MRR  MRR@10
WARNING:tensorflow:[0.00102465 0.         0.08564832 0.00101229 0.        ]
W1214 14:50:37.159287 140473733826368 <ipython-input-3-e0f70c5ba30e>:357] [0.00102465 0.         0.08564832 0.00101229 0.        ]
WARNING:tensorflow:Read 50000 examples in 641 secs. Metrics so far:
W1214 14:52:43.648328 140473733826368 <ipython-input-3-e0f70c5ba30e>:355] Read 50000 examples in 641 secs. Metrics so far:
WARNING:tensorflow:MAP  RPrec  NDCG  MRR  MRR@10
W1214 14:52:43.650337 140473733826368 <ipython-input-3-e0f70c5ba30e>:356] MAP  RPrec  NDCG  MRR  MRR@10
WARNING:tensorflow:[0.00102337 0.         0.08508468 0.00101348 0.        ]
W1214 14:52:43.651077 140473733826368 <ipython-input-3-e0f70c5ba30e>:357] [0.00102337 0.         0.08508468 0.00101348 0.        ]
WARNING:tensorflow:Read 60000 examples in 768 secs. Metrics so far:
W1214 14:54:50.169556 140473733826368 <ipython-input-3-e0f70c5ba30e>:355] Read 60000 examples in 768 secs. Metrics so far:
WARNING:tensorflow:MAP  RPrec  NDCG  MRR  MRR@10
W1214 14:54:50.171495 140473733826368 <ipython-input-3-e0f70c5ba30e>:356] MAP  RPrec  NDCG  MRR  MRR@10
WARNING:tensorflow:[0.00112496 0.         0.08702416 0.00111672 0.        ]
W1214 14:54:50.172239 140473733826368 <ipython-input-3-e0f70c5ba30e>:357] [0.00112496 0.         0.08702416 0.00111672 0.        ]
WARNING:tensorflow:Read 70000 examples in 894 secs. Metrics so far:
W1214 14:56:56.690977 140473733826368 <ipython-input-3-e0f70c5ba30e>:355] Read 70000 examples in 894 secs. Metrics so far:
WARNING:tensorflow:MAP  RPrec  NDCG  MRR  MRR@10
W1214 14:56:56.692907 140473733826368 <ipython-input-3-e0f70c5ba30e>:356] MAP  RPrec  NDCG  MRR  MRR@10
WARNING:tensorflow:[0.00109861 0.         0.0863206  0.00109154 0.        ]
W1214 14:56:56.693676 140473733826368 <ipython-input-3-e0f70c5ba30e>:357] [0.00109861 0.         0.0863206  0.00109154 0.        ]
WARNING:tensorflow:Read 80000 examples in 1021 secs. Metrics so far:
W1214 14:59:03.240334 140473733826368 <ipython-input-3-e0f70c5ba30e>:355] Read 80000 examples in 1021 secs. Metrics so far:
WARNING:tensorflow:MAP  RPrec  NDCG  MRR  MRR@10
W1214 14:59:03.242281 140473733826368 <ipython-input-3-e0f70c5ba30e>:356] MAP  RPrec  NDCG  MRR  MRR@10
WARNING:tensorflow:[0.01356263 0.0125     0.09699456 0.01355645 0.0125    ]
W1214 14:59:03.243049 140473733826368 <ipython-input-3-e0f70c5ba30e>:357] [0.01356263 0.0125     0.09699456 0.01355645 0.0125    ]
WARNING:tensorflow:Read 90000 examples in 1148 secs. Metrics so far:
W1214 15:01:09.778834 140473733826368 <ipython-input-3-e0f70c5ba30e>:355] Read 90000 examples in 1148 secs. Metrics so far:
WARNING:tensorflow:MAP  RPrec  NDCG  MRR  MRR@10
W1214 15:01:09.780769 140473733826368 <ipython-input-3-e0f70c5ba30e>:356] MAP  RPrec  NDCG  MRR  MRR@10
WARNING:tensorflow:[0.01214304 0.01111111 0.09414841 0.01213754 0.01111111]
W1214 15:01:09.781500 140473733826368 <ipython-input-3-e0f70c5ba30e>:357] [0.01214304 0.01111111 0.09414841 0.01213754 0.01111111]
WARNING:tensorflow:Read 100000 examples in 1274 secs. Metrics so far:
W1214 15:03:16.311386 140473733826368 <ipython-input-3-e0f70c5ba30e>:355] Read 100000 examples in 1274 secs. Metrics so far:
WARNING:tensorflow:MAP  RPrec  NDCG  MRR  MRR@10
W1214 15:03:16.313336 140473733826368 <ipython-input-3-e0f70c5ba30e>:356] MAP  RPrec  NDCG  MRR  MRR@10
WARNING:tensorflow:[0.01104599 0.01       0.09408713 0.01104105 0.01      ]
W1214 15:03:16.314079 140473733826368 <ipython-input-3-e0f70c5ba30e>:357] [0.01104599 0.01       0.09408713 0.01104105 0.01      ]
INFO:tensorflow:Eval dev:
I1214 15:03:16.407423 140473733826368 <ipython-input-3-e0f70c5ba30e>:368] Eval dev:
INFO:tensorflow:MAP  RPrec  NDCG  MRR  MRR@10
I1214 15:03:16.408445 140473733826368 <ipython-input-3-e0f70c5ba30e>:369] MAP  RPrec  NDCG  MRR  MRR@10
INFO:tensorflow:[0.01104599 0.01       0.09408713 0.01104105 0.01      ]
I1214 15:03:16.409163 140473733826368 <ipython-input-3-e0f70c5ba30e>:370] [0.01104599 0.01       0.09408713 0.01104105 0.01      ]
An exception has occurred, use %tb to see the full traceback.

SystemExit


/root/Softwares/anaconda3/envs/tf1.15/lib/python3.7/site-packages/IPython/core/interactiveshell.py:3426: UserWarning: To exit: use 'exit', 'quit', or Ctrl-D.
  warn("To exit: use 'exit', 'quit', or Ctrl-D.", stacklevel=1)
@rodrigonogueira4
Collaborator

> Is the model loaded from your fine-tuned checkpoint properly? BTW, no logging info like INIT_FROM_CKPT occurs.

It seems that the checkpoint is not being loaded.

> Why is the trained model performance so poor? MRR@10 = 0.01 for the top 100 eval examples. Is that expected? Since I only run for 100 eval examples (100 * 1000 entries are actually predicted.)

MRR@10 should be at least 0.30.
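
For reference, here is a minimal sketch of how MRR@10 is usually computed. It is not the metric code from this repository; the function name `mrr_at_k` and the toy data are made up for illustration. It shows one pattern that would produce a value of 0.01 over 100 queries: a single query with a relevant passage at rank 1 and no relevant passage in the top 10 for the other 99. That is only an arithmetic illustration, not a diagnosis of the run above.

```python
def mrr_at_k(ranked_labels_per_query, k=10):
    """Mean reciprocal rank, counting only the first relevant hit in the top k.

    ranked_labels_per_query: one list per query, holding 0/1 relevance labels
    ordered by the model's ranking (best-scored passage first).
    """
    if not ranked_labels_per_query:
        return 0.0
    total = 0.0
    for labels in ranked_labels_per_query:
        for rank, label in enumerate(labels[:k], start=1):
            if label:
                total += 1.0 / rank
                break  # only the first relevant passage counts
    return total / len(ranked_labels_per_query)


# Hypothetical data: 100 queries, exactly one with a relevant passage at rank 1.
queries = [[1] + [0] * 9] + [[0] * 10 for _ in range(99)]
print(mrr_at_k(queries))  # 0.01
```

A model that ranks relevant passages well should score far above this on the same queries, which is why a value near 0.01 suggests the fine-tuned weights are not in use.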

> If the model is loaded improperly, how shall I load the model instead? Any example code?

I would first try to use a "dummy" path in which no checkpoint exists. If the log is identical to what you have now, then the problem is in BERT_PRETRAINED_DIR.
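
Before running the dummy-path test, it may help to confirm which directories actually contain checkpoint files. The sketch below uses only the Python standard library and assumes the checkpoints are in TensorFlow's V2 on-disk format, where each checkpoint prefix has a matching `.index` file. The helper names `find_checkpoint_prefixes` and `describe_checkpoint_dir` are hypothetical, and the two paths are the ones from the config in this issue.

```python
import glob
import os


def find_checkpoint_prefixes(directory):
    """Return checkpoint prefixes (path minus '.index') found in a directory."""
    index_files = glob.glob(os.path.join(directory, "*.index"))
    return sorted(path[: -len(".index")] for path in index_files)


def describe_checkpoint_dir(directory):
    """Summarize whether a directory holds any V2-format checkpoint files."""
    if not os.path.isdir(directory):
        return "missing directory"
    prefixes = find_checkpoint_prefixes(directory)
    if not prefixes:
        return "no checkpoint files"
    return "found: " + ", ".join(os.path.basename(p) for p in prefixes)


# Paths copied from the config posted above.
for name, path in [
    ("BERT_PRETRAINED_DIR", "/search/odin/Data/pre-trained-models/bert/uncased_L-12_H-768_A-12/"),
    ("OUTPUT_DIR", "/search/odin/Data/marco-passage-ranking/models/BERT_Base_trained_on_MSMARCO/"),
]:
    print(name, "->", describe_checkpoint_dir(path))
```

If the directory holding the fine-tuned model reports no checkpoint files, or only the original pre-trained `bert_model.ckpt` shows up, that would be consistent with the fine-tuned weights never being restored.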
