Fastcat is a small Python library for quickly looking up broader/narrower relations between Wikipedia categories locally. It is useful when you need to look up category relations rapidly but don't want to hammer the Wikipedia API. Fastcat relies on Redis and on the SKOS file that DBpedia publishes, which is based on the Wikipedia MySQL dumps.
This software is a fork of the fastcat tool created by Ed Summers. Some changes were made under the Creative Commons Attribution-ShareAlike 3.0 license; they are described in the commit messages. The major changes are the port to Python 3 and support for more than one language.
The first time you import fastcat you'll need to populate your Redis database with the category data from DBpedia. To do that, instantiate a FastCat object and call the load method. After that you can use it to do lookups.
>>> import fastcat
>>> f = fastcat.FastCat()
>>> f.load() # brew a pot of coffee while the data is downloaded and loaded into redis
...
>>> print(f.broader("Computer programming"))
['Software engineering', 'Computing']
>>> print(f.narrower("Computer programming"))
['Programming idioms', 'Programming languages', 'Concurrent computing', 'Source code', 'Refactoring', 'Data structures', 'Programming games', 'Computer programmers', 'Version control', 'Anti-patterns', 'Programming constructs', 'Algorithms', 'Web Services tools', 'Programming paradigms', 'Software optimization', 'Debugging', 'Computer programming tools', 'Computer libraries', 'Programming contests', 'Archive networks', 'Self-hosting software', 'Educational abstract machines', 'Software design patterns', 'Computer arithmetic']
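Because broader() returns a plain list of category names, lookups compose easily. As a minimal sketch, the hypothetical helper below (ancestors is not part of fastcat) walks broader relations breadth-first up to a depth limit. It takes the lookup function as a parameter, so f.broader from the session above can be passed in directly. The only assumption is that the lookup returns a list of strings, as the output above suggests.

```python
from collections import deque


def ancestors(broader, category, max_depth=2):
    """Collect categories reachable via broader() within max_depth steps.

    `broader` is any callable mapping a category name to a list of
    broader category names, e.g. the bound method f.broader.
    This helper is hypothetical and not part of fastcat itself.
    """
    seen = {category}   # guards against cycles in the category graph
    found = []          # ancestors in breadth-first order
    queue = deque([(category, 0)])
    while queue:
        name, depth = queue.popleft()
        if depth == max_depth:
            continue    # do not expand beyond the depth limit
        for parent in broader(name):
            if parent not in seen:
                seen.add(parent)
                found.append(parent)
                queue.append((parent, depth + 1))
    return found
```

With a loaded FastCat object, ancestors(f.broader, "Computer programming") would return the direct broader categories followed by their own broader categories. The depth limit and the seen set keep the walk bounded, since category graphs can be large and may contain cycles.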
To work with the categories of another language, set the language argument of the FastCat() constructor to one of the language codes listed below.
>>> import fastcat
>>> f = fastcat.FastCat(language='de')
>>> f.load() # brew a pot of coffee while the data is downloaded and loaded into redis
...
>>> print(f.broader("Berlin"))
['Europa nach Ort', 'Deutschland nach Gemeinde', 'Deutschland nach Bundesland']
>>> print(f.narrower("Berlin"))
['Umwelt- und Naturschutz (Berlin)', 'Veranstaltung (Berlin)', 'Stadtplanung (Berlin)', 'Verwaltung (Berlin)', 'Urbaner Freiraum in Berlin als Thema']
- English (en)
- Estonian (et)
- German (de)
- Japanese (ja)
- Polish (pl)
- Portuguese (pt)
- Russian (ru)
- Ukrainian (ua)
- Czech (cs)
You first need to set up a Redis server on your machine as follows.
On Mac:
$ brew install redis
On Linux:
$ sudo apt-get install redis-server
On Windows:
Please refer to the instructions on installing Vagrant Redis. You will need an Ubuntu installation on Windows; more information can be found here: Install your Linux Distribution of Choice
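Before loading any data, it can help to confirm that the Redis server installed above is actually running. The sketch below uses only the Python standard library and assumes Redis is listening on its default port 6379 on localhost; this README does not say which host or port fastcat connects to, so adjust the arguments if your setup differs. The function name redis_is_up is hypothetical and not part of fastcat.

```python
import socket


def redis_is_up(host="localhost", port=6379, timeout=1.0):
    """Return True if the server at host:port answers PING with +PONG.

    Assumes an unauthenticated Redis server; one that requires a
    password replies with an error instead, and this returns False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(b"PING\r\n")          # Redis inline command
            return conn.recv(16).startswith(b"+PONG")
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

If redis_is_up() returns False, start the server (for example with redis-server) before calling load().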
Once Redis is set up, installing Fastcat is straightforward:
$ pip install fastcat
Or, if you want the newest development code:
$ pip install git+https://github.com/oskar-j/fastcat.git
That's it!
See CONTRIBUTING.md for more details.