This work explores a neural network architecture trained with the connectionist temporal classification (CTC) loss function for phonemic and tonal transcription in a language documentation setting. It shows the method's promise for improving efficiency, minimizing typographical errors, and keeping the transcription faithful to the acoustic signal.
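
As a rough illustration of how a CTC-trained model's frame-level output becomes a phoneme-and-tone string, the sketch below applies the standard CTC collapsing rule (merge consecutive repeats, then drop blanks) to greedy per-frame predictions. The symbol inventory, the blank marker `_`, the function names, and the toy scores are all assumptions made for illustration; they are not taken from the work summarized above, and greedy decoding is used here only because it is the simplest option.

```python
from itertools import groupby

BLANK = "_"  # hypothetical blank symbol; real systems reserve an index for it


def ctc_collapse(frame_labels, blank=BLANK):
    """Apply the CTC collapsing rule: merge consecutive repeats, then drop blanks."""
    merged = [label for label, _ in groupby(frame_labels)]
    return [label for label in merged if label != blank]


def greedy_decode(frame_scores, symbols, blank=BLANK):
    """Pick the highest-scoring symbol in each frame, then collapse the sequence."""
    best = [
        symbols[max(range(len(row)), key=row.__getitem__)]
        for row in frame_scores
    ]
    return ctc_collapse(best, blank)


if __name__ == "__main__":
    # Toy inventory: blank, one vowel, one consonant (assumed for illustration).
    symbols = [BLANK, "a", "t"]
    scores = [
        [0.1, 0.2, 0.7],    # frame 1: "t"
        [0.1, 0.1, 0.8],    # frame 2: "t" (repeat, merged)
        [0.9, 0.05, 0.05],  # frame 3: blank
        [0.2, 0.7, 0.1],    # frame 4: "a"
    ]
    print(greedy_decode(scores, symbols))
```

Because the blank separates repeated labels, two identical tone or phoneme symbols in a row survive collapsing only when a blank frame falls between them.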