If you look closely, the distant 'pop' sound appears to be repeated 4 times, and one of them (at the 1:04.5 video timestamp) is louder than the others. While the lack of high frequencies could suggest it's distant, it sounds to me like something hitting the microphones, the microphone stand or the plastic of the camera.
Ok, so here's a new tidbit after looking at your spectrogram and listening to the audio track a couple more times:
The 'pop' sound is much louder because the hydrazine/N2O4 tanks hit the ground and combusted almost immediately. That made for a much larger explosion, and it also happened closer to the ground, which matters because the intensity of ground waves follows the inverse square law over distance.
So the 'pop' is the hydrazine explosion via a ground-transmitted seismic wave. Wow ...
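To put rough numbers on the inverse-square falloff mentioned above, here's a minimal sketch. It assumes simple spherical spreading with no absorption, in which case it is the intensity that goes as 1/r² (amplitude goes as 1/r). The function names and the distances are made up for illustration only; nothing here is measured from the video.

```python
import math


def relative_intensity(r_ref, r):
    """Intensity at distance r relative to the intensity at r_ref.

    Assumes ideal inverse-square (spherical) spreading, no absorption.
    """
    if r_ref <= 0 or r <= 0:
        raise ValueError("distances must be positive")
    return (r_ref / r) ** 2


def level_change_db(r_ref, r):
    """Change in level, in decibels, going from distance r_ref to r."""
    return 10.0 * math.log10(relative_intensity(r_ref, r))


# Hypothetical distances in metres, chosen only to show the scaling.
for r in (100.0, 200.0, 400.0):
    ratio = relative_intensity(100.0, r)
    change = level_change_db(100.0, r)
    print(f"{r:6.0f} m: x{ratio:.4f} intensity, {change:+.1f} dB")
```

Under that assumption, every doubling of distance cuts the intensity to a quarter, which is about a 6 dB drop, so a source that is much closer can easily dominate the recording.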