glicerico/gramind_transformer

gramind-transformers

Unsupervised grammar induction with transformers

The gramind-transformer repository is part of a larger project that aims to build an unsupervised grammar induction pipeline. Currently, this repository takes a grammar and filters out rules that don't make sense according to a language model. The module takes a grammar with spurious rules, in LinkGrammar format, and selects the rules that agree with the grammar implicit in the language model in use, thus obtaining an explicit grammar that describes the language in question. To do that, it leverages the implicit grammar knowledge contained in the language models learned by transformers (e.g. BERT, GPT-3).

The high-level description of the project and some proof-of-concept results have been published here. This repository implements those ideas in a preliminary manner and is intended for experimentation, not as a finished pipeline.

The pipeline is currently implemented using the rangram repository to generate sentences from grammars and the wordcat_transformer repository to evaluate the "quality" of a sentence, which is done with a pre-trained BERT language model. These tasks, however, could be performed with different methods, which could be plugged into this grammar inducer. In fact, other methods are currently being explored, e.g. this repository tries to generate sentences from the grammar learned by a Tree Transformer.
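
To make the generate-then-score idea concrete, here is a minimal sketch of how a rule filter with pluggable components might look. It is not this repository's code or API: `filter_rules`, `toy_generate`, `toy_score`, the rule labels `"D+N"` and `"N+D"`, and the threshold are all hypothetical names and values chosen for illustration. The toy scorer counts known word pairs and only stands in for a language-model score such as the BERT-based one mentioned above, and the rule labels are not LinkGrammar syntax.

```python
from statistics import mean
from typing import Callable, Iterable, List

# Hypothetical interfaces: any sentence generator and any sentence scorer
# with these shapes could be plugged in.
Generator = Callable[[str], Iterable[str]]  # rule -> sentences using that rule
Scorer = Callable[[str], float]             # sentence -> quality score


def filter_rules(rules: List[str], generate: Generator, score: Scorer,
                 threshold: float) -> List[str]:
    """Keep the rules whose generated sentences score well on average."""
    kept = []
    for rule in rules:
        sentences = list(generate(rule))
        if sentences and mean(score(s) for s in sentences) >= threshold:
            kept.append(rule)
    return kept


# Toy stand-ins (assumptions, not real grammar rules or model output).
TOY_SENTENCES = {
    "D+N": ["the cat sleeps", "a dog barks"],  # plausible word order
    "N+D": ["cat the sleeps", "dog a barks"],  # spurious word order
}
KNOWN_BIGRAMS = {("the", "cat"), ("cat", "sleeps"), ("a", "dog"), ("dog", "barks")}


def toy_generate(rule: str) -> List[str]:
    """Return pre-written sentences in place of a real sentence generator."""
    return TOY_SENTENCES[rule]


def toy_score(sentence: str) -> float:
    """Fraction of adjacent word pairs found in KNOWN_BIGRAMS."""
    words = sentence.split()
    pairs = list(zip(words, words[1:]))
    if not pairs:
        return 0.0
    return sum(pair in KNOWN_BIGRAMS for pair in pairs) / len(pairs)


kept = filter_rules(["D+N", "N+D"], toy_generate, toy_score, threshold=0.5)
print(kept)  # ['D+N']
```

Because the generator and the scorer are passed in as plain callables, either one can be swapped without touching the filtering loop, which mirrors the pluggability described above.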
