_clamp_coord in FuyuProcessor was not defined #27168

Closed
4 tasks
nooodles2023 opened this issue Oct 31, 2023 · 5 comments

Comments

@nooodles2023

System Info

On the transformers master branch.

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

def original_to_transformed_h_coords(self, original_coords):
    # apply crop
    cropped_coords = (
        self._clamp_coords(original_coords, min_value=self.crop_top, max_value=self.crop_bottom) - self.crop_top
    )
    # apply scale
    scaled_coords = self._scale_coords(cropped_coords, scale=self.scaled_h / self.original_h)
    # apply pad
    return scaled_coords + self.padding_top


def original_to_transformed_w_coords(self, original_coords):
    # apply crop
    cropped_coords = (
        self._clamp_coords(original_coords, min_value=self.crop_left, max_value=self.crop_right) - self.crop_left
    )
    # apply scale
    scaled_coords = self._scale_coords(cropped_coords, scale=self.scaled_w / self.original_w)
    # apply pad
    return scaled_coords + self.padding_left


def scale_point_to_transformed_image(x: float, y: float) -> List[int]:
    x_scaled = original_to_transformed_w_coords(np.array([x / 2]))[0]
    y_scaled = original_to_transformed_h_coords(np.array([y / 2]))[0]
    return [x_scaled, y_scaled]


def scale_bbox_to_transformed_image(top: float, left: float, bottom: float, right: float) -> List[int]:
    top_scaled = original_to_transformed_w_coords(np.array([top / 2]))[0]
    left_scaled = original_to_transformed_h_coords(np.array([left / 2]))[0]
    bottom_scaled = original_to_transformed_w_coords(np.array([bottom / 2]))[0]
    right_scaled = original_to_transformed_h_coords(np.array([right / 2]))[0]
    return [top_scaled, left_scaled, bottom_scaled, right_scaled]
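
For context, here is a minimal sketch of what the missing helpers might look like. It assumes `_clamp_coords` is meant to clip coordinates to the crop window and `_scale_coords` multiplies by a scale factor and rounds. Both bodies, and the standalone `original_to_transformed` wrapper, are hypothetical and are not taken from the transformers source.

```python
import numpy as np


def clamp_coords(coords: np.ndarray, min_value: float, max_value: float) -> np.ndarray:
    # Hypothetical stand-in for the undefined _clamp_coords:
    # clip every coordinate into [min_value, max_value].
    return np.clip(coords, min_value, max_value)


def scale_coords(coords: np.ndarray, scale: float) -> np.ndarray:
    # Hypothetical stand-in for _scale_coords:
    # multiply by the scale factor and round to integer pixels.
    return np.round(coords * scale).astype(np.int32)


def original_to_transformed(
    coords: np.ndarray, crop_start: float, crop_end: float, scale: float, padding: int
) -> np.ndarray:
    # Same crop -> scale -> pad order as the methods quoted above.
    cropped = clamp_coords(coords, crop_start, crop_end) - crop_start
    return scale_coords(cropped, scale) + padding
```

With a crop window of 10 to 40, a scale of 0.5, and a padding of 2, a coordinate of 50 is first clamped to 40, shifted to 30, scaled to 15, and padded to 17.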

Also, scale_point_to_transformed_image takes 2 parameters (x and y), but the caller passes 3:

    # Remove all spaces from num_ints
    num_ints = [float(num.strip()) for num in num_int_strs]
    # scale to transformed image size
    if len(num_ints) == 2:
        num_ints_translated = scale_point_to_transformed_image(
            x=num_ints[0], y=num_ints[1], transformed_image=transformed_image
        )
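
The signature mismatch can be reproduced in isolation. The sketch below uses a stand-in function with the same two-parameter signature as the one quoted in this issue (its body is a placeholder, not the library's code) and calls it with the extra `transformed_image` keyword; `call_with_extra_keyword` is a hypothetical helper added only to capture the error message.

```python
from typing import List


def scale_point_to_transformed_image(x: float, y: float) -> List[float]:
    # Stand-in with the same two-parameter signature as in the issue;
    # the body is a placeholder and only halves the inputs.
    return [x / 2, y / 2]


def call_with_extra_keyword() -> str:
    # Mirror the caller from the issue, which passes a third keyword argument.
    try:
        scale_point_to_transformed_image(x=1.0, y=2.0, transformed_image=None)
    except TypeError as exc:
        return str(exc)
    return "no error"


print(call_with_extra_keyword())
```

Python rejects the call with a TypeError about an unexpected keyword argument, so this code path fails before any coordinate is scaled.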

Expected behavior

Please check whether something went wrong when this code was uploaded.

@amyeroberts
Collaborator

Hi @nooodles2023, thanks for raising this issue!

There's currently some active development on the Fuyu processing, so it's not yet in a stable state: #27133, #27007, #27083

It's good that this has been flagged as something we need to address. cc @pcuenca @molbap

@pcuenca
Member

pcuenca commented Oct 31, 2023

You are right @nooodles2023! As @amyeroberts mentioned we are currently working on a refactor. I removed the calls to _clamp_coords in amyeroberts#113; we don't need the "transformed image" any more, only the scale factor that was used to prepare it for inference.
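
As a rough illustration of the approach described in this comment, the sketch below scales a point using only a scale factor, with no crop window or padding. The function name `scale_point`, its `scale_factor` parameter, and the rounding step are assumptions made for illustration; this is not the code from the refactor. The division by 2 is carried over from the functions quoted in the issue.

```python
from typing import List

import numpy as np


def scale_point(x: float, y: float, scale_factor: float) -> List[int]:
    # Hypothetical sketch: halve the coordinates as in the issue's code,
    # then apply the single scale factor and round to integer pixels.
    coords = np.array([x / 2, y / 2])
    scaled = np.round(coords * scale_factor).astype(np.int32)
    return [int(scaled[0]), int(scaled[1])]
```

For example, `scale_point(100, 60, 0.5)` halves the inputs to 50 and 30 and then scales them to 25 and 15. Because nothing depends on a transformed image object, the caller only needs to keep the scale factor that was used during preprocessing.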

@nooodles2023
Author

Thank you for your reply.
Fuyu is a really nice VL model, and I want to fine-tune it for object detection in UI screens from text prompts. Do you have any plans to post a fine-tuning script?


This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@pcuenca
Member

pcuenca commented Nov 30, 2023

Fixed in the refactor.

@pcuenca pcuenca closed this as completed Nov 30, 2023

3 participants