1. Measuring the Impact of Programming Language Distribution
2. CCTEST: Testing and Repairing Code Completion Systems
3. CoditT5: Pretraining for Source Code and Natural Language Editing
4. Grounded Copilot: How Programmers Interact with Code-Generating Models
5. Code Translation with Compiler Representations
6. XLCoST: A Benchmark Dataset for Cross-lingual Code Intelligence
7. OPT: Open Pre-trained Transformer Language Models
8. Natural Language to Code Translation with Execution
9. InCoder: A Generative Model for Code Infilling and Synthesis
10. PaLM: Scaling Language Modeling with Pathways
11. Training Compute-Optimal Large Language Models
12. MCoNaLa: A Benchmark for Code Generation from Multiple Natural Languages
13. A Systematic Evaluation of Large Language Models of Code
14. Synchromesh: Reliable Code Generation from Pre-trained Language Models
15. NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation
16. P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks
17. Long-Range Modeling of Source Code Files with eWASH: Extended Window Access by Syntax Hierarchy
18. CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation
19. AVATAR: A Parallel Corpus for Java-Python Program Translation
20. Program Synthesis with Large Language Models
21. Measuring Coding Challenge Competence with APPS
22. The Power of Scale for Parameter-Efficient Prompt Tuning
23. Unified Pre-training for Program Understanding and Generation
24. CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation
25. GraphCodeBERT: Pre-training Code Representations with Data Flow
26. DeepSpeed: System Optimizations Enable Training Deep Learning Models with Over 100 Billion Parameters
27. Unsupervised Translation of Programming Languages
28. Language Models are Few-Shot Learners
29. CodeBERT: A Pre-Trained Model for Programming and Natural Languages
30. Data Augmentation Using Back-Translation for Context-Aware Neural Machine Translation
31. Improving Neural Machine Translation Robustness via Data Augmentation: Beyond Back-Translation
32. CodeSearchNet Challenge: Evaluating the State of Semantic Code Search
33. Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
34. SPoC: Search-based Pseudocode to Code
35. A Study of BFLOAT16 for Deep Learning Training
36. The Curious Case of Neural Text Degeneration
37. Introducing MathQA - A Math-Aware Question Answering System
38. Learning to Mine Aligned Code and Natural Language Pairs from Stack Overflow
39. Tree-to-tree Neural Networks for Program Translation
40. Decoupled Weight Decay Regularization
41. Attention Is All You Need
42. DeepFix: Fixing Common C Language Errors by Deep Learning
43. Probabilistic Model for Code with Decision Trees
44. Using Machine Translation for Converting Python 2 to Python 3 Code
45. Phrase-Based Statistical Translation of Programming Languages
46. Lexical Statistical Machine Translation for Language Migration
47. Mining Source Code Repositories at Massive Scale Using Language Modeling
48. WordNet: A Lexical Database for English
49. A Conversational Paradigm for Program Synthesis
50. Improving Automatically Generated Code from Codex via Automated Program Repair
51. Zhu et al. (2022) introduce XLCoST, a new dataset which is parallel across 7 programming languages.
52. In addition, researchers have proposed various ways of improving code generation models. For example, Poesia et al. (2022) propose Target Similarity Tuning for code retrieval augmentation and Constrained Semantic Decoding to constrain generation to valid programs.
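As an illustration, here is a minimal sketch of the retrieval step behind such augmentation, with a generic text-embedding function standing in for the similarity model that Poesia et al. (2022) actually fine-tune; `embed` and the prompt template are assumptions, not the paper's method:

    import numpy as np

    def cosine(a, b):
        # Cosine similarity between two 1-D embedding vectors.
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    def build_prompt(task, pool, embed, k=3):
        # pool: list of (description, code) pairs; embed: text -> 1-D numpy array.
        # Retrieve the k examples most similar to the new task description
        # and prepend them as few-shot examples.
        q = embed(task)
        scored = sorted(pool, key=lambda ex: cosine(embed(ex[0]), q), reverse=True)
        shots = "\n\n".join(f"# {desc}\n{code}" for desc, code in scored[:k])
        return f"{shots}\n\n# {task}\n"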
53. GraphCodeBERT (Guo et al., 2021) improves upon CodeBERT by leveraging AST and data flow information.
54. … Jangda. A scalable and extensible approach to benchmarking NL2Code for 18 programming languages, 2022.
55. Shi et al. (2022) introduce execution-result-based minimum Bayes risk decoding, which selects the sampled program whose execution results agree most closely with those of the other samples.
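A minimal sketch of this selection rule, assuming a hypothetical helper `run` that executes a candidate program on a single test input and returns its output (or None on failure):

    def mbr_exec_select(candidates, test_inputs, run):
        # Execute every candidate on the shared test inputs.
        results = [tuple(run(c, x) for x in test_inputs) for c in candidates]

        def agreement(i):
            # How many other candidates produce exactly the same outputs.
            return sum(results[i] == results[j]
                       for j in range(len(candidates)) if j != i)

        # Keep the candidate whose outputs agree with the most other samples.
        best = max(range(len(candidates)), key=agreement)
        return candidates[best]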
56. Szafraniec et al. (2022) extended the dataset to the Go and Rust languages.
57. … (2022), and CodeGen (Nijkamp et al., 2022).
58. Prefix-Tuning: Optimizing Continuous Prompts for Generation
59. … (2021) presented a method generation dataset in Python based on …
60. Lu et al. (2021) composed token-level and line-level completion tasks.
61. MBPP: Python. Note that we convert the original MBPP dataset (Austin et al., 2021), which has a slightly different format, into the HumanEval format (Chen et al., 2021), with a function signature and docstring.
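As an illustration, here is a minimal sketch of such a conversion, assuming MBPP-style records with "text" (task description) and "code" (reference solution) fields; the signature extraction via `ast` is our own simplification, not the paper's exact procedure:

    import ast

    def to_humaneval_prompt(record):
        # Parse the reference solution and locate its function definition.
        tree = ast.parse(record["code"])
        fn = next(n for n in ast.walk(tree) if isinstance(n, ast.FunctionDef))
        args = ", ".join(a.arg for a in fn.args.args)
        # Emit a HumanEval-style prompt: signature followed by a docstring.
        return f'def {fn.name}({args}):\n    """{record["text"]}"""\n'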
62. In this setup, we use a complete function in Python as the input prompt. The TransCoder model then generates a complete function in Java or C++.
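A minimal sketch of how such a translation prompt could be assembled; the template below is an assumption for illustration, not the paper's verbatim format:

    def translation_prompt(python_fn: str, target_signature: str, target_lang: str) -> str:
        # The complete Python function is given as context, and the model
        # is asked to complete a function body in the target language.
        return (f"# Python source\n{python_fn}\n\n"
                f"// {target_lang} translation\n{target_signature}\n")

    prompt = translation_prompt(
        "def min_cost(cost, m, n):\n    ...",
        "public static int minCost(int[][] cost, int m, int n) {",
        "Java",
    )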
63. Roziere et al. (2020a) collected a corpus of parallel functions in C++, Java, and Python.
64. … (2019) (e.g., paraphrasing "create a function" to "write one function").
65. …Error at test case 2"
    end
    x = min_cost…

66. Error at 3rd assert statement

67. for (let j = 0; j <= n; j++) {
        dp…

68. …12)){} else { throw 'Error at 2nd assert statement. Value = ' + JSON.stringify(x) }

69. …if(compare(x, 8)){} else { throw 'Error at 1st assert statement…

70. var arg01 : Int = 2
    var arg02 : Int = 2
    var x0 : Int = minCost(cost : arg00, m : arg01, n : arg02)
    var v0 : Int = 8
    assert(x0 == v0)
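The excerpts above come from language-specific test harnesses that raise an error naming the failing assert statement. A minimal sketch of how such harnesses could be emitted with ordinal error messages ("1st", "2nd", "3rd", ...); the helper names are hypothetical and the emitted template mimics the JavaScript excerpts:

    def ordinal(n: int) -> str:
        # 1 -> "1st", 2 -> "2nd", 3 -> "3rd", 11 -> "11th", ...
        suffix = ("th" if 10 <= n % 100 <= 20
                  else {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th"))
        return f"{n}{suffix}"

    def js_asserts(cases):
        # cases: list of (call expression, expected value) pairs.
        lines = []
        for i, (call, expected) in enumerate(cases, start=1):
            lines.append(
                f"if(compare({call}, {expected})){{}} else {{ "
                f"throw 'Error at {ordinal(i)} assert statement. "
                f"Value = ' + JSON.stringify({call}) }}"
            )
        return "\n".join(lines)

    print(js_asserts([("minCost(cost, 2, 2)", 8), ("minCost(cost2, 4, 5)", 12)]))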
71. # Write a function to find the minimum cost path to reach (m, n) from (0, 0) for the given cost matrix cost…

72. Exception -- test case 1 did not pass

73. You are an expert Perl programmer, and here is your task: …