Multi-caption strategy from parquet might not work #1092
Comments
It also crashes later on, at:
# Check for empty strings
if (df[caption_column] == "").sum() > 0 and not fallback_caption_column:
    raise ValueError(
        f"Parquet file {parquet_path} contains empty strings in the '{caption_column}' column."
    )
if (df[filename_column] == "").sum() > 0:
    raise ValueError(
        f"Parquet file {parquet_path} contains empty strings in the '{filename_column}' column."
    )
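One way to make the empty-string check tolerate multi-caption cells is to test each cell with a small helper instead of comparing the whole column to "". The sketch below is only an illustration: `caption_is_empty` is a hypothetical name, not a function from the project, and it assumes a multi-caption cell counts as empty only when every entry in it is empty.

```python
def caption_is_empty(value):
    """Return True if a caption cell holds no usable text (hypothetical helper)."""
    if value is None:
        return True
    # NaN is the only float that is not equal to itself.
    if isinstance(value, float) and value != value:
        return True
    if isinstance(value, bytes):
        value = value.decode("utf-8")
    if isinstance(value, str):
        return value.strip() == ""
    # Array-like cells (for example numpy.ndarray or pandas.Series) expose tolist().
    if hasattr(value, "tolist"):
        value = value.tolist()
    if isinstance(value, (list, tuple)):
        # Assumption: a multi-caption cell is empty only if every entry is empty.
        return all(caption_is_empty(item) for item in value)
    return False


print(caption_is_empty(["", "a dog"]))
print(caption_is_empty(["", None]))
```

With pandas, such a helper could presumably be applied per cell, for example `df[caption_column].map(caption_is_empty).any()`, in place of the `== ""` comparison.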
The list handling is also needed in:
  if type(image_caption) == bytes:
      image_caption = image_caption.decode("utf-8")
  if type(image_caption) == str:
      image_caption = image_caption.strip()
+ if type(image_caption) in (list, tuple, numpy.ndarray, pd.Series):
+     image_caption = [str(item).strip() for item in image_caption if item is not None]
  if prepend_instance_prompt:
      if type(image_caption) == list:
          image_caption = [instance_prompt + " " + x for x in image_caption]
      else:
          image_caption = instance_prompt + " " + image_caption
  return image_caption
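The snippet above can be read as a standalone normalisation step. Here is a minimal sketch of that idea, assuming a hypothetical function name `normalise_caption` and hypothetical parameter defaults. It uses `isinstance` and a duck-typed `tolist()` check rather than importing numpy or pandas, so it is not the project's actual implementation.

```python
def normalise_caption(image_caption, instance_prompt=None, prepend_instance_prompt=False):
    """Strip a caption, or each caption in a multi-caption cell (hypothetical helper)."""

    def _to_text(item):
        # Decode bytes explicitly; str(b"x") would yield "b'x'".
        if isinstance(item, bytes):
            item = item.decode("utf-8")
        return str(item).strip()

    if isinstance(image_caption, (bytes, str)):
        image_caption = _to_text(image_caption)
    else:
        # Array-like cells (numpy.ndarray, pandas.Series) expose tolist().
        if hasattr(image_caption, "tolist"):
            image_caption = image_caption.tolist()
        if isinstance(image_caption, (list, tuple)):
            image_caption = [_to_text(x) for x in image_caption if x is not None]

    if prepend_instance_prompt and instance_prompt:
        if isinstance(image_caption, list):
            image_caption = [instance_prompt + " " + x for x in image_caption]
        elif isinstance(image_caption, str):
            image_caption = instance_prompt + " " + image_caption
    return image_caption


print(normalise_caption([" a cat ", None, b"a dog"], "sks", True))
```

Converting array-like values to a plain list first means the later `isinstance(image_caption, list)` branch covers every multi-caption case, which is the situation the `+` lines above address.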
bghira added a commit that referenced this issue Nov 7, 2024
…aption-parquets Fix multi-caption parquets crashing in multiple locations (Closes #1092)
Caption is checked for existence, but if it's a list it causes a crash.
You should