This paper explores a sequential approach to target speech extraction that combines blind source separation (BSS) with an x-vector-based speaker recognition (SR) module, and extends the training of the multichannel variational autoencoder (MVAE) to evaluate its generalization to unseen speakers.