This paper proposes two distinct methods. The first extracts features independently from the output of each forensic filter and then fuses them. The second mixes the outputs of the different modalities at an early stage and produces combined features from the outset (this is referred to as early fusion).
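To make the distinction between the two methods concrete, the following is a minimal NumPy sketch and not the implementation used in this paper. The two filters (`highpass_residual`, `horizontal_gradient`), the placeholder feature extractor (`extract_features`), the mixing matrix `mix`, and the use of concatenation as the fusion step are all assumptions chosen only for illustration.

```python
import numpy as np

def highpass_residual(img):
    # Hypothetical forensic filter: image minus its 3x3 box blur.
    padded = np.pad(img, 1, mode="edge")
    h, w = img.shape
    blur = sum(padded[i:i + h, j:j + w] for i in range(3) for j in range(3)) / 9.0
    return img - blur

def horizontal_gradient(img):
    # Hypothetical second filter: first-order horizontal difference.
    return np.diff(img, axis=1, append=img[:, -1:])

def extract_features(x):
    # Placeholder extractor: per-channel mean and standard deviation.
    x = np.atleast_3d(x)  # shape (H, W, C)
    return np.concatenate([x.mean(axis=(0, 1)), x.std(axis=(0, 1))])

def fuse_independent(img, filters):
    # Method 1: features are computed per filter, then fused
    # (fusion is assumed here to be plain concatenation).
    return np.concatenate([extract_features(f(img)) for f in filters])

def early_fusion(img, filters, mix):
    # Method 2: filter outputs are mixed first, and features are
    # computed on the mixed representation.
    stacked = np.stack([f(img) for f in filters], axis=-1)  # (H, W, F)
    mixed = stacked @ mix                                   # (H, W, K)
    return extract_features(mixed)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((8, 8))
    filters = [highpass_residual, horizontal_gradient]
    mix = rng.standard_normal((len(filters), 3))  # assumed mixing weights
    print(fuse_independent(image, filters).shape)
    print(early_fusion(image, filters, mix).shape)
```

In this sketch the first function keeps each filter's statistics separate until the final concatenation, whereas the second lets every feature depend on all filter outputs at once; in a learned model the mixing matrix would typically be replaced by trainable layers.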