I made a supercut from the music videos of the Top 10 singles chart from my birthday. The scenes comprise footage in which instruments are visually "showcased". I assumed that those scenes would indicate important moments of the referenced composition, such as hooks, solos or other instrumental parts.
To achieve this, I trained the YOLOv5 object detection model on a custom dataset of images of guitars, drums, keyboards, microphones and saxophones, built using FiftyOne and Open Images:
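A minimal sketch of how such a dataset could be assembled, not necessarily the exact script used here. It assumes the `fiftyone` package is installed and that the five instrument categories correspond to the Open Images class names listed in `INSTRUMENT_CLASSES`. The function name `export_instrument_split`, the `max_samples` value and the export directory are hypothetical choices for illustration.

```python
# Open Images class names assumed to match the five instrument categories.
INSTRUMENT_CLASSES = ["Guitar", "Drum", "Musical keyboard", "Microphone", "Saxophone"]

# Open Images calls its splits train/validation/test; YOLOv5 expects train/val/test.
YOLO_SPLIT = {"train": "train", "validation": "val", "test": "test"}


def export_instrument_split(export_dir, split="train", max_samples=500):
    """Download one Open Images split and export it in YOLOv5 format."""
    # Imported lazily so the constants above can be used without FiftyOne installed.
    import fiftyone as fo
    import fiftyone.zoo as foz
    from fiftyone import ViewField as F

    # Only samples containing at least one of the requested classes are downloaded.
    dataset = foz.load_zoo_dataset(
        "open-images-v6",
        split=split,
        label_types=["detections"],
        classes=INSTRUMENT_CLASSES,
        max_samples=max_samples,
    )

    # Look up the detections field instead of hard-coding its name,
    # since the name may differ between FiftyOne versions.
    label_field = next(iter(dataset.get_field_schema(embedded_doc_type=fo.Detections)))

    # Drop boxes of other classes that happen to appear in the same images.
    view = dataset.filter_labels(label_field, F("label").is_in(INSTRUMENT_CLASSES))

    # Write images, label files and dataset.yaml in the layout YOLOv5 reads.
    view.export(
        export_dir=export_dir,
        dataset_type=fo.types.YOLOv5Dataset,
        label_field=label_field,
        split=YOLO_SPLIT[split],
        classes=INSTRUMENT_CLASSES,
    )
    return view
```

Calling `export_instrument_split("instruments_yolo", "train")` and then `export_instrument_split("instruments_yolo", "validation")` should leave a directory whose `dataset.yaml` can be passed to YOLOv5's `train.py` via `--data`.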
I cleaned and prepared the resulting
MR's hit single from 1990 in a nutshell: "ich", a decent amount of "nicht" paired with a tiny dash of "dich".