feat: add support for ARW, DNG, CR2 raw #140

Merged
36 changes: 34 additions & 2 deletions poetry.lock

1 change: 1 addition & 0 deletions pyproject.toml
@@ -32,6 +32,7 @@ torchvision = [
   { version = "==0.17.2+cpu", source = "pytorch-cpu", markers = "sys_platform == 'linux' and platform_machine != 'aarch64'" }
 ]
 tqdm = "^4.65.0"
+rawpy = "^0.23.2"

 [tool.poetry.group.dev.dependencies]
 pycodestyle = ">=2.7,<3.0"
35 changes: 28 additions & 7 deletions rclip/main.py
Owner

@yurijmikhalevich, Oct 6, 2024

This is an amazing start 🙌 Thank you. Can you please add a test or two that includes all of the newly added image formats?

When adding test images, can you please pick or create RAW files that weigh as little as possible to keep the repo size small?
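A minimal sketch of one such test, which needs no RAW fixtures at all: it only checks that the extension pattern from this PR accepts the new formats. The file names and the `is_supported` helper are hypothetical and are not part of rclip.

```python
import re

# Same pattern as the updated IMAGE_REGEX in rclip/main.py.
IMAGE_REGEX = re.compile(r'^.+\.(jpe?g|png|webp|arw|dng|cr2)$', re.I)

# Hypothetical file names; no real files are opened by this check.
SUPPORTED = ['photo.ARW', 'photo.dng', 'photo.Cr2', 'photo.jpeg', 'dir/photo.webp']
UNSUPPORTED = ['photo.nef', 'photo.dng.txt', '.dng']


def is_supported(path: str) -> bool:
  """Return True if the path matches the image extension pattern."""
  return IMAGE_REGEX.match(path) is not None


for path in SUPPORTED:
  assert is_supported(path), path
for path in UNSUPPORTED:
  assert not is_supported(path), path
```

This only covers file discovery. Decoding each format would still need real sample files.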

Contributor Author

I downloaded raw images from the internet, but they have relatively large file sizes (about 10 to 40 MB). I also looked for conversion tools and tried converting them with Python libraries, but that didn't work.

Adding these larger images could increase the size of the tool. What do you think about uploading images of this size to the test folder? @yurijmikhalevich

Owner

@abidkhan484, I think adding three 10 MB images to the repository is OK. We aren't bundling them in the distributions anyway.

But if we can create three small 100 × 100 px RAW images, that would be much better.

Also, when adding images, we should be mindful of their licenses. All of the images in the rclip test dir are ones I took myself 😄

Contributor Author

@yurijmikhalevich, I tried to resize or convert the images, but failed, so it's difficult for me to add different raw photos. I can download some raw images from the internet, but they average about 40 MB. I badly need your suggestion.

Owner

@abidkhan484, let me see what I can do.

Owner

@abidkhan484, I am going to clean up and merge your branch, and then I will add tests in a separate PR.

I can't add it in this PR because GitHub and git-lfs don't let me push large images to your branch (-:

Owner

@abidkhan484, another important thing to do in this PR was to ensure that --preview works for RAW images. Check out this diff for the implementation: 46f2fc8.

Owner

Hm. Also, it doesn't actually support "DNG" files created by Lightroom. This is why tests are important 💭
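As a rough sketch of how a test could surface this kind of gap, assuming loaders follow the convention of this PR's `_read_raw_image_file` and return `None` on failure: the helper below lists every path a loader could not read. `find_unloadable` and `fake_loader` are hypothetical names, and the fake loader only simulates a decoder that rejects certain files.

```python
from typing import Callable, Iterable, List, Optional


def find_unloadable(
  paths: Iterable[str],
  loader: Callable[[str], Optional[object]],
) -> List[str]:
  """Return the paths for which `loader` returned None (failed to load)."""
  return [path for path in paths if loader(path) is None]


def fake_loader(path: str) -> Optional[object]:
  """Hypothetical stand-in that pretends some DNG exports cannot be decoded."""
  return None if 'lightroom' in path else object()


failed = find_unloadable(['camera.dng', 'lightroom_export.dng'], fake_loader)
assert failed == ['lightroom_export.dng']
```

In a real test, the loader would be the actual reader and the paths would point at sample files of each format, so an undecodable variant fails the test instead of being silently skipped during indexing.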

Owner

@abidkhan484, merged! 🙌 Congrats on your first contribution to rclip, and thank you for the help 🔥

@@ -9,6 +9,7 @@
 from tqdm import tqdm
 import PIL
 from PIL import Image, ImageFile
+from rawpy import imread

 from rclip import db, fs, model
 from rclip.utils.preview import preview
@@ -41,7 +42,7 @@ def is_image_meta_equal(image: db.Image, meta: ImageMeta) -> bool:

 class RClip:
   EXCLUDE_DIRS_DEFAULT = ['@eaDir', 'node_modules', '.git']
-  IMAGE_REGEX = re.compile(r'^.+\.(jpe?g|png|webp)$', re.I)
+  IMAGE_REGEX = re.compile(r'^.+\.(jpe?g|png|webp|arw|dng|cr2)$', re.I)
   DB_IMAGES_BEFORE_COMMIT = 50_000

   class SearchResult(NamedTuple):
@@ -62,18 +63,38 @@ def __init__(
     excluded_dirs = '|'.join(re.escape(dir) for dir in exclude_dirs or self.EXCLUDE_DIRS_DEFAULT)
     self._exclude_dir_regex = re.compile(f'^.+\\{os.path.sep}({excluded_dirs})(\\{os.path.sep}.+)?$')

+  def _read_raw_image_file(self, path: str):
+    image = None
+    try:
+      raw = imread(path)
+      rgb = raw.postprocess()
+      image = Image.fromarray(np.array(rgb))
+    except Exception as ex:
+      print(f'not a valid raw file {path}', ex, file=sys.stderr)
+    return image
+
+  def _read_image_file(self, path: str):
+    image = None
+    try:
+      image = Image.open(path)
+    except PIL.UnidentifiedImageError as ex:
+      print(f'unidentified image error {path}:', ex, file=sys.stderr)
+    except Exception as ex:
+      print(f'error loading image {path}:', ex, file=sys.stderr)
+    return image
+
   def _index_files(self, filepaths: List[str], metas: List[ImageMeta]):
     images: List[Image.Image] = []
     filtered_paths: List[str] = []
     for path in filepaths:
-      try:
-        image = Image.open(path)
+      if os.path.splitext(path)[1].lower() in fs.PRIORITIZED_IMAGE_EXTENSIONS:
+        image = self._read_image_file(path)
+      else:
+        image = self._read_raw_image_file(path)
+
+      if image:
         images.append(image)
         filtered_paths.append(path)
-      except PIL.UnidentifiedImageError as ex:
-        pass
-      except Exception as ex:
-        print(f'error loading image {path}:', ex, file=sys.stderr)

     try:
       features = self._model.compute_image_features(images)
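
The extension-based dispatch in `_index_files` can be sketched in isolation as follows. This is an illustration under assumptions, not the project's code: `STANDARD_EXTENSIONS` is a stand-in for `fs.PRIORITIZED_IMAGE_EXTENSIONS`, whose contents are not shown in this diff, and `load_image` with its injected reader callables is hypothetical.

```python
import os
from typing import Callable, Optional, TypeVar

T = TypeVar('T')

# Assumption: stand-in for fs.PRIORITIZED_IMAGE_EXTENSIONS (contents not shown in the diff).
STANDARD_EXTENSIONS = frozenset({'.jpg', '.jpeg', '.png', '.webp'})


def load_image(
  path: str,
  read_standard: Callable[[str], Optional[T]],
  read_raw: Callable[[str], Optional[T]],
) -> Optional[T]:
  """Pick a reader by file extension; anything not in the standard set goes to the RAW reader."""
  extension = os.path.splitext(path)[1].lower()
  reader = read_standard if extension in STANDARD_EXTENSIONS else read_raw
  return reader(path)


# Usage with dummy readers that only report which branch was taken.
assert load_image('a.JPG', lambda p: 'standard', lambda p: 'raw') == 'standard'
assert load_image('b.cr2', lambda p: 'standard', lambda p: 'raw') == 'raw'
```

Passing the readers in as callables keeps the routing testable without Pillow, rawpy, or any image files.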