Well, the video isn't suspect, it's the audio. So that narrows it down. Now why wouldn't someone who works in a field where said fakery (adding sound after the vid is shot) is used count?
Generally the purpose of forensic audio is to identify what a sound is, or to do voice recognition, that kind of thing.
In terms of establishing authenticity, that would only be done from an original source, not from digital copies. Original sources have control tracks and digital signatures that identify where they came from, so analysing the actual content is not part of the job.
Likewise, any modification of the original leaves tell-tale "fingerprints" that, again, don't require analysis of the actual audio-visual content.
What we are doing here is determining whether the characteristics of the explosion match the characteristics of the video. My observations of the video are as follows:
1. Only two individuals in the entire video appear to even notice the explosion, and their reactions are not sudden, as would be expected from an explosion. Three of the individuals in the video appear entirely oblivious to the explosion, at least one of them talking over the top of it.
2. The explosion audio does not peak at any point, despite being louder than the speech of the individuals within said video (which does peak quite badly).
3. The video does not have echoes of the explosion, as would be characteristic of an explosion that occurred within an urban environment of tall buildings.
4. The explosion audio has high volume levels at both low and high frequencies, indicating close proximity to the explosion. However, the video shows neither of the alternative characteristics expected at such close proximity:
A) A sudden jolt of the camera due to the shockwave from the explosion
OR
B) A muffled explosion due to building structures shielding the camera, followed by clearer echoes as the sound diffracts around the building.
None of these require any level of special expertise in audio analysis.
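Point 2, for instance, is something anyone can check with a few lines of code. Here is a rough Python sketch of the idea, not an analysis of the actual clip: the function name clipped_fraction, the 0.999 threshold and the two test signals are all made up for illustration, and it assumes samples already normalised to the range -1.0 to 1.0.

```python
import math

def clipped_fraction(samples, full_scale=1.0, tolerance=0.999):
    """Fraction of samples whose magnitude sits at (or within tolerance of) full scale."""
    if not samples:
        return 0.0
    threshold = tolerance * full_scale
    clipped = sum(1 for s in samples if abs(s) >= threshold)
    return clipped / len(samples)

# Hypothetical stand-ins for the two parts of the soundtrack:
# "speech" is driven past full scale and hard-limited, as a cheap mic would do;
# "explosion" is a strong signal that nevertheless stays below full scale.
speech = [max(-1.0, min(1.0, 1.5 * math.sin(2 * math.pi * 5 * n / 1000)))
          for n in range(1000)]
explosion = [0.9 * math.sin(2 * math.pi * 3 * n / 1000) for n in range(1000)]

speech_clip = clipped_fraction(speech)        # a large share of samples pinned at full scale
explosion_clip = clipped_fraction(explosion)  # no samples reach full scale
```

If the quieter speech shows a lot of samples pinned at full scale while the louder explosion shows none, the two were very unlikely to have gone through the same microphone and recording chain at the same time.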
Applying Occam's Razor, the simplest explanation is that the explosion element of the audio has been added over the original audio, and that the original audio and video are genuine. The alternative hypothesis - that the video itself is faked - requires significant assumptions, such as the involvement of a number of people in the fakery (either belonging to FDNY or owning firefighting equipment, for example).
-Gumboot