
Lexical attributes

Panos Louridas edited this page Jul 27, 2018 · 2 revisions

Lexical Attributes

spaCy includes a collection of functions that check whether a token has a specific lexical attribute (for example, whether it looks like a URL).

spaCy built-in functions for lexical attributes

For a complete list of spaCy's built-in functions for lexical attributes, have a look here.

Greek modifications to spaCy built-in functions

Currently, the Greek spaCy models override only one of these functions: like_num, which checks whether a token represents a numerical expression.

The modifications add three checks: whether the token appears in a list of Greek words that represent numerical expressions, whether it represents a fraction, and whether it represents a power of a number.
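The three checks described above can be sketched roughly as follows. This is a minimal, standalone illustration and not the actual code of the Greek models: the word list NUM_WORDS is a small hypothetical subset, the helper _is_pair_of_ints is invented for this example, and the use of "/" for fractions and "^" for powers is an assumption about the notation.

```python
# Hypothetical, abbreviated word list; the list used by the Greek models
# is assumed to be much longer.
NUM_WORDS = {
    "μηδέν", "ένα", "δύο", "τρία", "δέκα",
    "δεκατέσσερις", "εκατό", "χίλια", "εκατομμύριο",
}


def _is_pair_of_ints(text, sep):
    """Return True if text is exactly two digit strings joined by sep."""
    parts = text.split(sep)
    return len(parts) == 2 and all(part.isdigit() for part in parts)


def like_num(text):
    """Sketch of a Greek-aware check for numerical expressions."""
    # Drop thousands and decimal separators so "1.000" is seen as digits.
    text = text.replace(",", "").replace(".", "")
    if text.isdigit():
        return True
    # Fraction, assumed to be written as "3/4".
    if _is_pair_of_ints(text, "/"):
        return True
    # Power of a number, assumed to be written as "2^10".
    if _is_pair_of_ints(text, "^"):
        return True
    # Number word, compared case-insensitively against the word list.
    return text.lower() in NUM_WORDS
```

With this sketch, like_num("3/4") and like_num("2^10") are both true, while an ordinary word such as "σπίτι" is rejected.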

In this way, we can catch expressions like the following:

δεκατέσσερις (fourteen)

εκατομμύριο (million)
