Exploring SAM2 and Scene Graphs
I recently came across the idea of using scene graphs to describe what is happening in a given video frame. Below is a very short demonstration, along with a link to a Colab notebook where I explore using Meta’s Segment Anything 2 (SAM2) model to create scene graphs.
The video we are going to use is shown below. We want SAM2 to track the cup that contains the ball and, at the end, tell us its correct position.
Here’s the video showing a side-by-side comparison of the scene graphs (at least the ones that show a spatial relationship between objects) and the segmented frame.
I was really impressed by how SAM2 kept track of three different object IDs with nearly identical features, i.e., the red cups.
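To make the "spatial relationship" part a bit more concrete, here is a minimal sketch of how per-object masks for a single frame could be turned into scene-graph triples. This is not necessarily what the notebook does: it assumes the tracker gives you one boolean mask per object ID, and the names `mask_centroid`, `left_to_right_order`, `spatial_scene_graph`, the `left_of` relation, and the toy `cup_*` masks are all made up for illustration.

```python
import numpy as np

def mask_centroid(mask):
    """Return the (x, y) centroid of a 2D boolean mask."""
    ys, xs = np.nonzero(mask)
    return float(xs.mean()), float(ys.mean())

def left_to_right_order(masks):
    """Sort object IDs by the x coordinate of their mask centroid.

    masks: dict mapping object ID -> 2D boolean array for one frame.
    Objects with an empty mask (e.g. fully occluded) are skipped.
    """
    centroids = {oid: mask_centroid(m) for oid, m in masks.items() if m.any()}
    return sorted(centroids, key=lambda oid: centroids[oid][0])

def spatial_scene_graph(masks):
    """Build (subject, relation, object) triples between horizontal neighbours."""
    order = left_to_right_order(masks)
    return [(a, "left_of", b) for a, b in zip(order, order[1:])]

# Toy stand-in for one segmented frame: three 10x10 "cups" on a 20x70 canvas.
# In practice these masks would come from the tracker (assumption).
frame_masks = {}
for obj_id, x0 in [("cup_1", 50), ("cup_2", 5), ("cup_3", 25)]:
    mask = np.zeros((20, 70), dtype=bool)
    mask[5:15, x0:x0 + 10] = True
    frame_masks[obj_id] = mask

order = left_to_right_order(frame_masks)
graph = spatial_scene_graph(frame_masks)

# If "cup_1" is the ID of the cup that started with the ball (hypothetical),
# its index in the final frame's ordering is the answer we are after.
position = order.index("cup_1")

print(order)     # ['cup_2', 'cup_3', 'cup_1']
print(graph)     # [('cup_2', 'left_of', 'cup_3'), ('cup_3', 'left_of', 'cup_1')]
print(position)  # 2
```

Running this on the last frame of the video would give the final position of the tracked cup, and running it on every frame gives one small graph per frame to show next to the segmentation. Using centroids keeps things simple, but it ignores overlap and depth, so it only really suits a flat, top-of-table setup like this one.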
Future plans:
- This was admittedly a very basic exploration. I would like to compare scene graphs that relate objects to actions.
- Deep dive into how SAM2 actually performs the tracking.