GitHub - kuhumcst/DanNet: The Danish WordNet as an RDF graph.

DanNet is a WordNet for the Danish language. The goal of this project is to represent DanNet in full using RDF as its native representation at the database level, in the application space, and as its primary serialisation format.

Compatibility

Special care has been taken to maximise the compatibility of this iteration of DanNet. Like the DanNet of yore, the base dataset is published as both RDF (Turtle) and CSV. RDF is the native representation and can be loaded as-is inside a suitable RDF graph database, e.g. Apache Jena. The CSV files are now published along with column metadata as CSVW.

Companion datasets

Apart from the base DanNet dataset, several companion datasets exist expanding the graph with additional data. The companion datasets collectively provide a broader view of the data with both implicit and explicit links to other data:

The COR companion dataset links DanNet resources to IDs from the COR project.
The DDS companion dataset decorates DanNet resources with sentiment data.
The OEWN extension companion dataset provides DanNet-like labels for the Open English WordNet to better facilitate browsing the connections between the two datasets.

The current version of the datasets can be downloaded from wordnet.dk/dannet. All releases from 2023 onwards are also available as releases on this project page.

Inferred data

Additional data is also implicitly inferred from the base dataset, the aforementioned companion datasets, and any associated ontological metadata. These inferred data points can be browsed along with the rest of the data on the official DanNet website.

Inferring data can be both computationally expensive and mentally taxing for the consumer of the data, so we do not always publish the fully inferred graph in a DanNet release; when we do, those releases will be specifically marked as containing this extra data.

Standards-based

The old DanNet was modelled as tables inside a relational database. Two serialised representations also exist: RDF/XML 1.0 and a custom CSV format. The latter served as input for the new data model, remapping the relations described in these files onto a modern WordNet based on the Ontolex-lemon standard combined with various relations defined by the Global Wordnet Association as used in the official GWA RDF standard.

In Ontolex-lemon...

Synsets are analogous to ontolex:LexicalConcept.
Word senses are analogous to ontolex:LexicalSense.
Words are analogous to ontolex:LexicalEntry.
Forms are analogous to ontolex:Form.
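
The correspondence above can be written down as a small lookup table. The following is only an illustrative sketch, not code from this repository: the names `WORDNET_TO_ONTOLEX` and `ontolex_class` are hypothetical, and the `ontolex` prefix is assumed to expand to the W3C Ontolex-lemon namespace.

```python
# Assumption: "ontolex:" expands to the W3C Ontolex-lemon namespace.
ONTOLEX = "http://www.w3.org/ns/lemon/ontolex#"

# WordNet notion -> Ontolex-lemon class, as listed in the README.
# The dictionary and function names are hypothetical helpers.
WORDNET_TO_ONTOLEX = {
    "synset": "LexicalConcept",
    "word sense": "LexicalSense",
    "word": "LexicalEntry",
    "form": "Form",
}


def ontolex_class(wordnet_term: str) -> str:
    """Return the full Ontolex-lemon class IRI for a WordNet term."""
    local_name = WORDNET_TO_ONTOLEX[wordnet_term.strip().lower()]
    return ONTOLEX + local_name
```

For example, `ontolex_class("Synset")` returns the IRI of `ontolex:LexicalConcept`; an unknown term raises a `KeyError`.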

By building DanNet according to these standards we maximise its ability to integrate with other lexical resources, in particular with other WordNets.

Significant changes

New schema, prefixes, URIs

DanNet uses a new schema, available in this repository and also at https://wordnet.dk/dannet/schema.

DanNet uses the following URI prefixes for the dataset instances, concepts (members of a dns:ontologicalType) and the schema itself:

dn -> https://wordnet.dk/dannet/data/
dnc -> https://wordnet.dk/dannet/concepts/
dns -> https://wordnet.dk/dannet/schema/
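
As a rough illustration of how these prefixes relate to full URIs, the sketch below expands a prefixed name and compacts a full URI again. The helper names `expand` and `compact` are hypothetical and not part of DanNet; only the three prefix mappings come from the list above.

```python
# Prefix mappings copied from the README.
PREFIXES = {
    "dn": "https://wordnet.dk/dannet/data/",
    "dnc": "https://wordnet.dk/dannet/concepts/",
    "dns": "https://wordnet.dk/dannet/schema/",
}


def expand(prefixed_name: str) -> str:
    """Turn a prefixed name such as 'dns:ontologicalType' into a full URI."""
    prefix, separator, local_name = prefixed_name.partition(":")
    if not separator or prefix not in PREFIXES:
        raise ValueError(f"unknown or missing prefix in {prefixed_name!r}")
    return PREFIXES[prefix] + local_name


def compact(uri: str) -> str:
    """Turn a full URI back into a prefixed name; return it unchanged if no prefix matches."""
    for prefix, base in PREFIXES.items():
        if uri.startswith(base):
            return f"{prefix}:{uri[len(base):]}"
    return uri
```

With these definitions, `expand("dns:ontologicalType")` yields `https://wordnet.dk/dannet/schema/ontologicalType`, and `compact` reverses the operation.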

NOTE: these new prefixes/URIs take over from the ones used for DanNet 2.2 (the last version before the 2023 re-release):

dn -> http://www.wordnet.dk/owl/instance/2009/03/instances/
dn_schema -> http://www.wordnet.dk/owl/instance/2009/03/schema/

All the new URIs resolve to HTTP resources, which is to say that accessing a resource with a GET request (e.g. through a web browser) returns data for the resource (or schema) in question.
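
A minimal sketch of such a GET request, using only the Python standard library. The function names are hypothetical, and the `Accept` header is an assumption: the README states only that a GET request returns data for the resource, not which media types the server offers.

```python
import urllib.request


def build_request(uri: str, media_type: str = "text/turtle") -> urllib.request.Request:
    """Build a GET request for a DanNet URI.

    Assumption: the server may honour the Accept header; the README does
    not say which representations are available.
    """
    return urllib.request.Request(uri, headers={"Accept": media_type}, method="GET")


def fetch(uri: str, media_type: str = "text/turtle", timeout: float = 10.0) -> str:
    """Send the request and return the response body decoded as UTF-8."""
    with urllib.request.urlopen(build_request(uri, media_type), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `fetch("https://wordnet.dk/dannet/schema")` would then retrieve whatever the server returns for the schema URI mentioned above.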

Finally, the new DanNet schema is written in accordance with the RDF conventions listed by Philippe Martin.
