Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix bugs in initializing self.cnndm in benchmark.py #16

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

forrestbao
Copy link

@forrestbao forrestbao commented Dec 3, 2023

Critical change:

Below, lines 44 and 54 can cause an error due to CNNDM is undefined.
And there is no need for the global variable CNNDM when self.cnndm presents.

This bug fix uses only self.cnndm which is initialized to None. And when it is none, self.get_cnndm_documents() or self.get_cnndm_references() will populate it with real data.

def get_cnndm_document(self, aid):
global CNNDM
if self.cnndm is None:
if CNNDM is None:
CNNDM = load_dataset("cnn_dailymail", "3.0.0")
self.cnndm = CNNDM
self.cnndm_id2article = {}
for cut in ["test", "validation"]:
self.cnndm_id2article.update({d["id"]: d["article"] for d in self.cnndm[cut]})
return self.cnndm_id2article[aid]
def get_cnndm_reference(self, aid):
global CNNDM
if CNNDM is None:
CNNDM = load_dataset("cnn_dailymail", "3.0.0")
self.cnndm = CNNDM

Non-critical change:
This fix also exposes cache_dir of Huggingface datasets.load_datasets().

@forrestbao forrestbao changed the title Fix bugs in initializing self.cnndm Fix bugs in initializing self.cnndm in benchmark.py Dec 3, 2023
@tusharg7797
Copy link

Hi @forrestbao. Thanks for the fix. @tingofurro I had the same error and this fixes the error.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants