Jayesh Nair

Exploring SAM2 and Scene Graphs

I recently came across the idea of using scene graphs to describe what is happening in a given video frame. Below is a very short demonstration, along with a link to a Colab notebook where I explore using Meta’s Segment Anything 2 (SAM2) model to create scene graphs.

The video we are going to use is shown below. We want SAM2 to track the cup that contains the ball and, at the end, tell us its correct position.

A video showing a person shifting cups.

Here’s a video showing a side-by-side comparison of the scene graphs (at least the ones that show a spatial relationship between objects) and the segmented frame.

I was really impressed by how SAM2 kept track of three different object IDs with nearly identical features, i.e., the red cups.

A side-by-side video comparison of the spatial graphs and the segmented video.
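To make the idea of a spatial scene graph a bit more concrete, here is a minimal sketch of how left/right relationships could be derived from per-object masks in a single frame. It is not the code from the notebook: it assumes each tracked object ID maps to a binary mask (rows of 0/1 values), and the function names `centroid`, `spatial_scene_graph`, and `position_of` are made up for illustration.

```python
from itertools import combinations


def centroid(mask):
    """Return the (x, y) centroid of a binary mask given as rows of 0/1 values."""
    xs, ys = [], []
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None  # object is not visible in this frame
    return sum(xs) / len(xs), sum(ys) / len(ys)


def spatial_scene_graph(masks):
    """Build (subject, relation, object) triples from {object_id: mask}.

    Assumption: only a horizontal "left_of" relation is modelled here.
    """
    centers = {oid: centroid(m) for oid, m in masks.items()}
    centers = {oid: c for oid, c in centers.items() if c is not None}
    edges = []
    for a, b in combinations(sorted(centers), 2):
        ax, bx = centers[a][0], centers[b][0]
        if ax < bx:
            edges.append((a, "left_of", b))
        elif ax > bx:
            edges.append((b, "left_of", a))
    return edges


def position_of(target_id, masks):
    """Rank of the target among visible objects, left to right (0 = leftmost).

    Raises ValueError if the target is not visible in this frame.
    """
    centers = {oid: centroid(m) for oid, m in masks.items()}
    visible = [oid for oid, c in centers.items() if c is not None]
    ordered = sorted(visible, key=lambda oid: centers[oid][0])
    return ordered.index(target_id)


# Toy frame: three "cups" on a single row; object 1 is the cup with the ball.
frame_masks = {
    1: [[0, 0, 0, 0, 1, 1]],
    2: [[1, 1, 0, 0, 0, 0]],
    3: [[0, 0, 1, 1, 0, 0]],
}
edges = spatial_scene_graph(frame_masks)
ball_cup_rank = position_of(1, frame_masks)
```

Running this on the masks of the last frame would give the final left-to-right rank of the cup that was tagged as containing the ball. Using centroids keeps the sketch simple; bounding boxes or mask overlap would be needed for relations such as "on top of" or "inside".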

Future plans: