cchaung/CV_Final_Project


Eliminating an Object and Its Effects

tags: Mask R-CNN RAFT Deepfillv2

Members

  • Data Science Institute, 310554031, 葉詠富
  • Data Science Institute, 310554037, 黃乾哲

Workflow

Mask R-CNN

  • We use Mask R-CNN to select the object.
  • The training data is the "PennFudanPed" dataset, which contains images used for pedestrian detection experiments.

  • Since our background is simple, the human contour is clear.
  • We threshold the predicted mask to sharpen its outline.
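
The thresholding step above can be sketched as follows. This is a minimal illustration, not the project's code: it assumes the model outputs a per-pixel probability in [0, 1], and both the function name `binarize_mask` and the cut-off of 0.5 are hypothetical choices.

```python
import numpy as np

def binarize_mask(soft_mask, threshold=0.5):
    """Turn a soft mask with values in [0, 1] into a hard 0/255 mask.

    The 0.5 cut-off is an assumption; the README does not state the value used.
    """
    soft_mask = np.asarray(soft_mask, dtype=np.float32)
    # Pixels at or above the threshold become foreground (255), the rest 0.
    return np.where(soft_mask >= threshold, 255, 0).astype(np.uint8)

# Toy 2x2 "probability map" standing in for a real Mask R-CNN output.
soft = np.array([[0.1, 0.4], [0.6, 0.9]])
hard = binarize_mask(soft)
```

A higher threshold shrinks the mask toward the confident interior of the object; a lower one grows it toward the uncertain boundary.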

RAFT

Flow to RGB to Grayscale

  • Use the pretrained raft-things model to estimate the optical flow of the people and their shadows
  • Convert the flow to RGB images

  • Convert the RGB images to grayscale images
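
The two conversions above can be sketched like this. It is a simplified stand-in, not RAFT's own visualization code: the flow direction is mapped to hue and the flow magnitude to saturation (so static pixels come out white), and the grayscale step uses the common 0.299/0.587/0.114 luma weights. The function names `flow_to_rgb` and `rgb_to_gray` are hypothetical.

```python
import numpy as np

def flow_to_rgb(flow):
    """Encode an (H, W, 2) flow field as a uint8 RGB image.

    Direction -> hue, magnitude -> saturation, value fixed at 1,
    so pixels with no motion are white. This is an assumed encoding.
    """
    flow = np.asarray(flow, dtype=np.float64)
    dx, dy = flow[..., 0], flow[..., 1]
    magnitude = np.hypot(dx, dy)
    hue = (np.arctan2(dy, dx) + np.pi) / (2 * np.pi)   # in [0, 1]
    sat = magnitude / (magnitude.max() + 1e-8)         # in [0, 1]
    val = np.ones_like(hue)

    def channel(n):
        # Standard closed-form HSV -> RGB conversion.
        k = (n + hue * 6.0) % 6.0
        return val - val * sat * np.clip(np.minimum(k, 4.0 - k), 0.0, 1.0)

    rgb = np.stack([channel(5), channel(3), channel(1)], axis=-1)
    return np.round(rgb * 255).astype(np.uint8)

def rgb_to_gray(rgb):
    """Convert a uint8 RGB image to uint8 grayscale with luma weights."""
    weights = np.array([0.299, 0.587, 0.114])
    return np.round(rgb.astype(np.float64) @ weights).astype(np.uint8)

# Toy flow field: everything static except one moving pixel.
flow = np.zeros((4, 4, 2))
flow[1, 2] = (3.0, 0.0)
gray = rgb_to_gray(flow_to_rgb(flow))
```

With this encoding, moving pixels end up darker than the white static background, which is what makes a "below the threshold is foreground" rule workable on the grayscale image.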

RAFT Mask by Threshold

  • Flatten each grayscale image into a 1D vector.
  • Set a threshold to create the RAFT mask.
  • The threshold is chosen as a percentile of the pixel values.
  • For example, at 10%, the 10% of pixels with values below the threshold are foreground, and the pixels above it are background.

  • Threshold at 2% vs. 10%
    • 2% covers less of the shadow, but picks up less optical flow from the nearby scenery
    • 10% covers more of the shadow, but picks up more optical flow from the nearby scenery
    • We choose 10% in order to cover more of the shadow
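
The percentile rule above can be sketched as follows. This is a minimal illustration that assumes darker grayscale values correspond to stronger motion; the function name `percentile_mask` is hypothetical.

```python
import numpy as np

def percentile_mask(gray, percent=10.0):
    """Mark the darkest `percent` percent of pixels as foreground.

    Assumes lower grayscale values mean stronger motion.
    """
    gray = np.asarray(gray)
    # Flatten to 1D and take the requested percentile as the threshold.
    threshold = np.percentile(gray.ravel(), percent)
    return gray < threshold   # True = foreground, False = background

# Toy image whose 100 pixels hold the values 0..99 exactly once each.
gray = np.arange(100, dtype=np.uint8).reshape(10, 10)
mask_10 = percentile_mask(gray, 10)
mask_2 = percentile_mask(gray, 2)
```

On this toy image the 10% setting selects ten pixels and the 2% setting selects two, mirroring the trade-off described above: a larger percentile admits more shadow but also more unrelated motion.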

RAFT Mask by KMeans

  • Divide the RGB optical flow map into 10 clusters using KMeans
  • Sort the clusters by the number of pixels they contain
  • If one cluster is more than 5 times the size of the next one, that gap is the dividing point between foreground and background.
  • Example: 90505 / 4247 ≈ 21 > 5
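
The dividing-point rule above can be sketched on the cluster sizes alone. Several details here are assumptions rather than facts from the project: clusters are sorted from largest to smallest, the first gap exceeding the ratio is used, and the larger clusters are treated as background. The function name `split_clusters` and every size in the example other than 90505 and 4247 are hypothetical.

```python
def split_clusters(sizes, ratio=5.0):
    """Split cluster indices into (background, foreground) by a size gap.

    `sizes[i]` is the pixel count of cluster i. Clusters are sorted from
    largest to smallest; the first place where a cluster is more than
    `ratio` times larger than the next one is taken as the dividing point.
    """
    order = sorted(range(len(sizes)), key=lambda i: sizes[i], reverse=True)
    for pos in range(len(order) - 1):
        big, small = sizes[order[pos]], sizes[order[pos + 1]]
        if big > ratio * small:
            return order[:pos + 1], order[pos + 1:]
    # No gap large enough: everything is treated as background.
    return order, []

# 90505 and 4247 come from the example above; the other sizes are made up.
sizes = [4247, 90505, 3900, 120000]
background, foreground = split_clusters(sizes)
```

Here 120000 / 90505 is below 5, so the scan continues; 90505 / 4247 is about 21, so clusters 3 and 1 become background and clusters 0 and 2 become foreground.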

Threshold vs. KMeans

  • We find that KMeans connects the people and their shadows better when generating masks.
  • It eliminates the effect of the background.
  • The people in the back of the original frame are kept.

Combine Mask

  • Mask R-CNN is good at detecting the object.
  • RAFT is good at detecting the shadow.
  • We combine the advantages of both to produce the final mask.
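
One simple way to combine the two masks is a pixel-wise union, sketched below. The README does not say how the masks are merged, so the union is an assumption, and the function name `combine_masks` is hypothetical.

```python
import numpy as np

def combine_masks(object_mask, flow_mask):
    """Union of two masks: a pixel is masked if either source marks it."""
    object_mask = np.asarray(object_mask) > 0
    flow_mask = np.asarray(flow_mask) > 0
    return np.logical_or(object_mask, flow_mask)

# Toy masks: the object mask covers the person, the flow mask the shadow.
person = np.array([[255, 255, 0, 0],
                   [255, 255, 0, 0]], dtype=np.uint8)
shadow = np.array([[0, 0, 0, 0],
                   [0, 255, 255, 0]], dtype=np.uint8)
combined = combine_masks(person, shadow)
```

A union keeps everything either detector found, which suits removal: missing part of the shadow leaves a visible artifact, while masking a few extra pixels only gives the inpainting step slightly more to fill.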

Deepfillv2

  • We use a pretrained Deepfillv2 model to inpaint the region removed by the mask

Discussion

  • The mask covers almost the whole object and its effects.
  • RAFT cannot detect the whole shadow, which may be caused by the motion of the object.
  • The inpainting result does not look natural; a model would need to be retrained on this video's background.
