Releases: AylaRT/ACTER

v1.5

08 Apr 13:45

Now includes sequential and tokenised annotations.

Few changes to the annotations themselves, but a major update to how they are presented:

  • Removed a few very long Named Entity annotations (from wind-en and htfl-en; counts updated) where it was doubtful whether they were real NEs.
  • Updated normalisation:
    • Replaced "İ" with "I" in the annotations to avoid problems when lowercasing (mainly affects wind_en_01)
    • Removed rare but problematic characters: ["", "", "", "", "�"] (not handled well by some transformers)
  • Major update of README.md
  • New structure for all data:
    • includes sequential annotations
    • includes tokenised versions of annotations
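
A minimal sketch of why the "İ" replacement above matters, and of how such a clean-up could be written. This is not the script used to build the release: the function names `replace_dotted_capital_i` and `remove_chars` are hypothetical, and the default set of removed characters contains only U+FFFD, the one character that is legible in the list above.

```python
# Hypothetical helpers illustrating the v1.5 normalisation; not the release's own code.

def replace_dotted_capital_i(text: str) -> str:
    """Replace 'İ' (U+0130, capital I with dot above) with a plain ASCII 'I'."""
    return text.replace("\u0130", "I")


def remove_chars(text: str, chars: str = "\ufffd") -> str:
    """Delete every character in `chars` from `text`.

    The default is an assumption: only U+FFFD (the replacement character)
    is used here; pass the full set of unwanted characters explicitly.
    """
    return text.translate({ord(c): None for c in chars})


# Lowercasing U+0130 yields two code points ('i' + combining dot above),
# so the lowercased string no longer has the same length as the original.
dotted = "\u0130"
lowered = dotted.lower()
print(len(dotted), len(lowered))

# After the replacement, lowercasing keeps the length unchanged.
cleaned = replace_dotted_capital_i("\u0130zmir")
print(cleaned.lower())
```

Because lowercasing "İ" changes the string length, character offsets computed on the original text no longer line up with the lowercased text, which is one plausible reason to normalise it before annotation matching.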

v1.4 normalised

15 Jul 14:46

Identical to version 1.3, except for minor normalisation of both the text files and the annotations:

  • unicodedata.normalize("NFC", text)
  • normalising all dashes to "-", all single quotes to "'" and all double quotes to '"'
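
The two normalisation steps above could be combined roughly as follows. This is a sketch under stated assumptions, not the release's actual script: the release notes do not say which dash and quote characters were mapped, so the sets `DASHES`, `SINGLE_QUOTES` and `DOUBLE_QUOTES`, and the function name `normalise`, are illustrative only.

```python
import unicodedata

# Assumed character sets; the exact characters handled in v1.4 are not listed.
DASHES = "\u2010\u2011\u2012\u2013\u2014\u2015\u2212"  # hyphens, en/em dashes, minus sign
SINGLE_QUOTES = "\u2018\u2019\u201a\u201b"             # curly and low single quotes
DOUBLE_QUOTES = "\u201c\u201d\u201e\u201f"             # curly and low double quotes

# One translation table mapping each variant to its ASCII counterpart.
_TABLE = str.maketrans({
    **{c: "-" for c in DASHES},
    **{c: "'" for c in SINGLE_QUOTES},
    **{c: '"' for c in DOUBLE_QUOTES},
})


def normalise(text: str) -> str:
    """Apply NFC composition, then map dash and quote variants to ASCII."""
    text = unicodedata.normalize("NFC", text)
    return text.translate(_TABLE)


# 'e' followed by a combining acute accent is composed into a single 'é',
# and the typographic punctuation is replaced by ASCII equivalents.
print(normalise("cafe\u0301 \u201cwind\u2013energy\u2019s\u201d"))
```

Applying the same function to both the text files and the annotation lists keeps the two in sync, so that an annotation can still be found verbatim in the normalised text.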

v1.3 Github release

15 Jul 14:23

First release on GitHub, after completion of the TermEval shared task.