Neural Decipherment via Minimum-Cost Flow: From Ugaritic to Linear B (2019-06-16)

TL;DR

A novel neural approach to the automatic decipherment of lost languages, with the first automatic results on deciphering Linear B, a syllabic language related to ancient Greek, where the model correctly translates 67.3% of cognates.

Abstract

In this paper we propose a novel neural approach for the automatic decipherment of lost languages. To compensate for the lack of a strong supervision signal, our model design is informed by patterns in language change documented in historical linguistics. The model uses an expressive sequence-to-sequence model to capture character-level correspondences between cognates. To train the model effectively in an unsupervised manner, we formalize the training procedure as a minimum-cost flow problem. When applied to the decipherment of Ugaritic, we achieve a 5% absolute improvement over state-of-the-art results. We also report the first automatic results in deciphering Linear B, a syllabic language related to ancient Greek, where our model correctly translates 67.3% of cognates.
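To make the minimum-cost flow idea concrete, the following is a minimal sketch of how a word-level matching between a lost vocabulary and a known vocabulary can be posed as a flow problem. It is not the paper's implementation. Several things here are assumptions made for illustration: plain edit distance stands in for the distance the neural sequence-to-sequence model would supply, every edge has unit capacity so the flow reduces to a one-to-one matching, the flow is solved by successive shortest paths, and the function names and the toy strings are hypothetical rather than data from the paper.

```python
from collections import deque


def edit_distance(a, b):
    """Levenshtein distance; a stand-in for a learned character-level cost."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def min_cost_matching(lost, known, cost):
    """Match lost words to known words with unit-capacity min-cost flow.

    Graph: source -> each lost word -> each known word -> sink.
    Solved by successive shortest augmenting paths (queue-based Bellman-Ford).
    """
    n, m = len(lost), len(known)
    source, sink, size = 0, n + m + 1, n + m + 2
    # Each edge is [target, remaining_capacity, cost, index_of_reverse_edge].
    graph = [[] for _ in range(size)]

    def add_edge(u, v, cap, c):
        graph[u].append([v, cap, c, len(graph[v])])
        graph[v].append([u, 0, -c, len(graph[u]) - 1])

    for i in range(n):
        add_edge(source, 1 + i, 1, 0)
        for j in range(m):
            add_edge(1 + i, 1 + n + j, 1, cost(lost[i], known[j]))
    for j in range(m):
        add_edge(1 + n + j, sink, 1, 0)

    total = 0
    for _ in range(min(n, m)):
        # Shortest path in the residual graph (costs may be negative there).
        dist = [float("inf")] * size
        parent = [None] * size
        in_queue = [False] * size
        dist[source] = 0
        queue = deque([source])
        in_queue[source] = True
        while queue:
            u = queue.popleft()
            in_queue[u] = False
            for k, (v, cap, c, _) in enumerate(graph[u]):
                if cap > 0 and dist[u] + c < dist[v]:
                    dist[v] = dist[u] + c
                    parent[v] = (u, k)
                    if not in_queue[v]:
                        queue.append(v)
                        in_queue[v] = True
        if parent[sink] is None:
            break  # no augmenting path left
        # Push one unit of flow along the path found.
        v = sink
        while v != source:
            u, k = parent[v]
            graph[u][k][1] -= 1
            graph[v][graph[u][k][3]][1] += 1
            v = u
        total += dist[sink]

    # A saturated lost -> known edge means that pair was matched.
    pairs = {}
    for i in range(n):
        for v, cap, _, _ in graph[1 + i]:
            if 1 + n <= v <= n + m and cap == 0:
                pairs[lost[i]] = known[v - 1 - n]
    return pairs, total


# Toy strings invented for illustration; not data from the paper.
lost_words = ["malk", "kalb", "bayt"]
known_words = ["keleb", "bayit", "melek"]
pairs, total = min_cost_matching(lost_words, known_words, edit_distance)
print(pairs, total)
```

In a training loop of the kind the abstract describes, one would presumably alternate between solving a flow like this with the current model's costs and updating the model on the resulting matches; the sketch covers only the flow step.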

Authors

R. Barzilay

Yuan Cao

Jiaming Luo
