Skip to content

Simple exact-matching algorithm to standardise drug names appearing on the DrugBank open-access repository

License

Notifications You must be signed in to change notification settings

fgh95/DrugLinker

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

35 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

DrugLinker

This package is a fast and simple string-matching module to obtain all unique drug identifiers within an input string. It does it by matching the input term with DrugBank open-data vocabulary : https://www.drugbank.ca/releases/latest#open-data. It is useful to process candidate drug tokens after drug named-entity recognition (NER) or to directly match sentences after tokenization.

To install the package simply run the following in a bash shell:

pip install druglinker==0.1.1

Example of use in a python environment:

from druglinker import dbsearch
dbsearch.get_ids("amox")
# output: ["DB01060"]
dbsearch.get_ids("amoxicillin")
# output: ["DB01060"]
dbsearch.get_ids("vancomycin")
# output: ["DB00512"]

# Simple sentence tagging:

text = "20mg of midazolam were administered intravenously"
for term in text.split():
    ids = dbsearch.get_ids(term)
    if ids:
        print(term)
        print(ids)

# output:
# midazolam
# ['DB00683']

For instance you can now easily find further information about the drug identifier detected (e.g. DB00683) at https://www.drugbank.ca/drugs/DB00683

Releases

No releases published

Packages

No packages published

Languages