what is load_image doing internally and how to apply the same operation to frames from video #370

Open
llealgt opened this issue Nov 13, 2024 · 1 comment

Comments


llealgt commented Nov 13, 2024

While testing, I noticed that inference returns very different results for the same image depending on how the image is loaded:

  • Method 1: the official load_image function from the library (it reads the image from the path passed as an argument).
  • Method 2: reading the image with cv2, converting it to a tensor, and then swapping axes so that depth (the channel axis) comes first.

Both methods produce a tensor that can be passed to the model, but the results are very different (Method 2 is usually bad). I inspected the shape of the image returned in each case and the shapes differ, so there are definitely transformations going on inside load_image. My question is: what is happening inside load_image, so that I can replicate it in other scripts?

My end goal is to run the model on video, that is, on individual frames of a video. I cannot use load_image for this because the frames are not image files on disk; they are read from the video. I therefore need to understand what happens inside load_image so that I can apply the same processing to the video frames.

Thanks

Author

llealgt commented Nov 14, 2024

Ignore my message; for a moment I forgot that the code is available for me to read:

def load_image(image_path: str) -> Tuple[np.array, torch.Tensor]:
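Only the signature of load_image is quoted in this thread, so its body is not reproduced here. As a rough sketch of the kind of preprocessing that often separates a raw cv2 frame from the tensor an image loader returns, the function below reorders BGR to RGB, scales to [0, 1], normalizes per channel, and moves the channel axis first. The name preprocess_frame, the ImageNet mean/std values, and the choice of steps are assumptions for illustration, not taken from the library; they should be checked against the actual load_image source, which may also resize the image.

```python
import numpy as np

# Assumed normalization constants (common ImageNet statistics, RGB order).
# The library's own values may differ; verify against load_image.
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def preprocess_frame(frame_bgr, mean=IMAGENET_MEAN, std=IMAGENET_STD):
    """Hypothetical helper: turn an HxWx3 uint8 BGR frame into a 3xHxW float32 array."""
    # cv2 delivers frames in BGR order; most models expect RGB.
    rgb = frame_bgr[..., ::-1]
    # Scale 0..255 integers to 0..1 floats.
    scaled = rgb.astype(np.float32) / 255.0
    # Per-channel normalization (broadcasts over the last axis).
    normalized = (scaled - mean) / std
    # Move channels first: HWC -> CHW.
    chw = np.transpose(normalized, (2, 0, 1))
    # Return a contiguous copy so it can be handed to a tensor library.
    return np.ascontiguousarray(chw)
```

The returned array can be wrapped with torch.from_numpy(...) before being passed to the model. If the library's loader resizes the image, the same resize would need to be applied to each frame before this step.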
