The Django dataset is a code generation dataset comprising 16,000 training, 1,000 development, and 1,805 test annotations. Each data point consists of a line of Python code paired with a manually created natural language description.
Source: Latent Predictor Networks for Code Generation
Image Source: https://github.com/microsoft/vscode-docs/issues/2696
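
To make the shape of the data concrete, here is a minimal Python sketch of one data point and of the three splits described above. The names `Example` and `split_examples` are hypothetical, and splitting by position is an assumption made for illustration; the description does not say how the annotations are ordered or distributed.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

# Split sizes as stated in the dataset description.
TRAIN_SIZE, DEV_SIZE, TEST_SIZE = 16000, 1000, 1805


@dataclass(frozen=True)
class Example:
    """One data point: a line of Python code and its description."""
    code: str          # a single line of Python code
    description: str   # the manually created natural language description


def split_examples(
    examples: Sequence[Example],
) -> Tuple[List[Example], List[Example], List[Example]]:
    """Cut the examples into train/dev/test by position.

    Assumption: the examples arrive in one sequence whose order
    matches the intended split. This is not confirmed by the source.
    """
    expected = TRAIN_SIZE + DEV_SIZE + TEST_SIZE
    if len(examples) != expected:
        raise ValueError(f"expected {expected} examples, got {len(examples)}")
    train = list(examples[:TRAIN_SIZE])
    dev = list(examples[TRAIN_SIZE:TRAIN_SIZE + DEV_SIZE])
    test = list(examples[TRAIN_SIZE + DEV_SIZE:])
    return train, dev, test
```

The length check is there so that a partial or mismatched download fails loudly instead of silently producing splits of the wrong size.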