BUG is a large-scale gender bias dataset of 108K diverse real-world English sentences, sampled semiautomatically from large corpora using lexical syntactic pattern matching