fix read_batch behavior for visgeno testing #5

Open
wants to merge 1 commit into
base: master

@qleroy qleroy commented Jan 28, 2018

Testing scripts in exp-visgeno-rel

The DataReader can return the same batch more than once, even at test time. When a batch raises an IOError in run_prefetch (util/visgeno_rel_train/rel_data_reader.py), self.n_batch is not incremented, so the reader eventually wraps around and reads the first batches again without incrementing self.n_epoch.

For example, in exp-visgeno-rel/exp_test_visgeno_attbilstm.py the test loop runs reader.num_batch times (5000 for imdb_tst), but 127 batches of imdb_tst do not contain any relationship, which makes prepare_batch raise an IOError. run_prefetch catches the error and does nothing else. As a result, the first 127 batches of imdb_tst are eventually read and scored a second time. I don't think this is intended: exp_test_visgeno_attbilstm.py (and the other testing scripts) should score each batch that contains relationships exactly once.
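
The wrap-around described above can be reproduced with a toy model. This is only a sketch, not the actual DataReader: `simulate_reads`, its parameters, and the 10-batch dataset are hypothetical names and values chosen for illustration. It assumes the producer cycles over batch indices and the consumer takes exactly num_batch items, and it compares dropping empty batches with enqueuing a placeholder for them.

```python
from collections import Counter


def simulate_reads(num_batch, empty_batches, enqueue_placeholder):
    """Toy model of a prefetching reader (not the real DataReader).

    The producer cycles over batch indices 0..num_batch-1. Empty batches are
    either dropped or replaced by a None placeholder. The consumer takes
    exactly num_batch items, as the test loop does.
    """
    def producer():
        n = 0
        while True:
            idx = n % num_batch
            if idx in empty_batches:
                if enqueue_placeholder:
                    yield None  # placeholder keeps producer and consumer aligned
            else:
                yield idx
            n += 1

    items = producer()
    reads = Counter()
    for _ in range(num_batch):
        batch = next(items)
        if batch is not None:
            reads[batch] += 1
    return reads


# Hypothetical dataset: 10 batches, of which batches 3 and 7 are empty.
dropped = simulate_reads(10, {3, 7}, enqueue_placeholder=False)
fixed = simulate_reads(10, {3, 7}, enqueue_placeholder=True)

duplicates_without_fix = sorted(b for b, c in dropped.items() if c > 1)
duplicates_with_fix = sorted(b for b, c in fixed.items() if c > 1)
print(duplicates_without_fix)  # batches read twice when empty batches are dropped
print(duplicates_with_fix)     # batches read twice when a placeholder is enqueued
```

In this model, dropping two empty batches makes the consumer read two batches from the start of the data a second time, which mirrors the 127 repeated batches reported for imdb_tst.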

To solve the problem, I suggest that run_prefetch put None on the prefetch_queue when a batch is "empty".

# from run_prefetch in util/visgeno_rel_train/rel_data_reader.py
except IOError as err:
    # Print the error, then enqueue None as a placeholder so that the
    # queue still receives one item for this batch
    print('data reader: skipped an image.', err)
    prefetch_queue.put(None, block=True)

In testing scripts such as exp_test_visgeno_attbilstm.py, if read_batch returns None, the loop skips that batch and moves on to the next iteration.

# from exp-visgeno-rel/exp_test_visgeno_attbilstm.py
for n_iter in range(reader.num_batch):
    batch = reader.read_batch()

    # Skip batches that do not contain any relationship. `continue` already
    # advances n_iter, so no manual increment is needed and the first
    # batches are not read a second time.
    if batch is None:
        continue

    # compute score for batch
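
The two changes can be sketched together as a small producer/consumer example. This is a simplified, single-epoch illustration and not the repository's code: `prepare_batch` here is a hypothetical stand-in that treats every third batch as empty, and the queue size and batch count are arbitrary.

```python
import queue
import threading


def prepare_batch(n):
    # Hypothetical stand-in: pretend every third batch has no relationships.
    if n % 3 == 2:
        raise IOError('no relationship in batch %d' % n)
    return {'batch_id': n}


def run_prefetch(prefetch_queue, num_batch):
    # Put exactly one item on the queue per batch index, so the consumer's
    # iteration count stays aligned with the batch index.
    for n in range(num_batch):
        try:
            batch = prepare_batch(n)
        except IOError as err:
            print('data reader: skipped a batch.', err)
            batch = None  # placeholder for an "empty" batch
        prefetch_queue.put(batch, block=True)


num_batch = 6
prefetch_queue = queue.Queue(maxsize=2)
worker = threading.Thread(
    target=run_prefetch, args=(prefetch_queue, num_batch), daemon=True)
worker.start()

scored = []
for n_iter in range(num_batch):
    batch = prefetch_queue.get(block=True)
    if batch is None:
        continue  # empty batch: nothing to score
    scored.append(batch['batch_id'])  # stands in for "compute score for batch"

worker.join()
print(scored)
```

One consequence of this design is that every caller of read_batch, including the training scripts, would need to handle a None return value.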
