forked from edsu/fastcat

Navigate Wikipedia categories quickly in a local Redis instance [7+ languages supported]

oskar-j/fastcat

fastcat


Fastcat is a small Python library for quickly looking up broader/narrower relations between Wikipedia categories locally. It is useful when you need to look up category relations rapidly but don't want to hammer the Wikipedia API. Fastcat relies on Redis and on the SKOS file that DBpedia publishes, which is based on the Wikipedia MySQL dumps.


Attribution

This software is a fork of the fastcat tool created by Ed Summers. Some changes were made under the Creative Commons Attribution-ShareAlike 3.0 license; they are described in the commit messages. The major changes are porting the code to Python 3 and adding support for more than one language.

Usage

Basic usage

The first time you use fastcat, you need to populate your Redis database with the category data from DBpedia. To do that, instantiate a FastCat object and call its load method. After that, you can use the object to do lookups.

>>> import fastcat
>>> f = fastcat.FastCat()
>>> f.load()  # brew a pot of coffee while the data is downloaded and loaded into redis
...
>>> print(f.broader("Computer programming"))
['Software engineering', 'Computing']
>>> print(f.narrower("Computer programming"))
['Programming idioms', 'Programming languages', 'Concurrent computing', 'Source code', 'Refactoring', 'Data structures', 'Programming games', 'Computer programmers', 'Version control', 'Anti-patterns', 'Programming constructs', 'Algorithms', 'Web Services tools', 'Programming paradigms', 'Software optimization', 'Debugging', 'Computer programming tools', 'Computer libraries', 'Programming contests', 'Archive networks', 'Self-hosting software', 'Educational abstract machines', 'Software design patterns', 'Computer arithmetic']
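
Since broader() returns a list of parent category names, repeated calls can be used to walk up the hierarchy. The following is a minimal sketch and not part of fastcat: `ancestors` is a hypothetical helper, and its `broader` argument is assumed to be any callable that behaves like `f.broader`. A small in-memory dict with illustrative data stands in for Redis so that the sketch runs without a loaded database.

```python
from collections import deque


def ancestors(broader, category, max_depth=3):
    """Collect categories reachable by following broader() up to max_depth levels.

    `broader` is any callable that maps a category name to a list of parent
    names, for example the bound method f.broader of a loaded FastCat object.
    """
    seen = set()
    order = []
    queue = deque([(category, 0)])
    while queue:
        name, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the requested depth
        for parent in broader(name):
            # Category graphs can contain cycles, so skip anything already seen.
            if parent not in seen and parent != category:
                seen.add(parent)
                order.append(parent)
                queue.append((parent, depth + 1))
    return order


# Stand-in data (hypothetical); with fastcat loaded you would pass f.broader.
TOY_PARENTS = {
    "Computer programming": ["Software engineering", "Computing"],
    "Software engineering": ["Engineering disciplines", "Computing"],
    "Computing": ["Technology"],
}


def toy_broader(name):
    return TOY_PARENTS.get(name, [])


result = ancestors(toy_broader, "Computer programming", max_depth=2)
print(result)
```

With a populated Redis instance, the same helper would be called as `ancestors(f.broader, "Computer programming")`. The depth limit keeps the number of lookups bounded, since a category can have many ancestors.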

Non-English categories

Pass one of the language codes listed below as the language argument of the FastCat() constructor.

>>> import fastcat
>>> f = fastcat.FastCat(language='de')
>>> f.load()  # brew a pot of coffee while the data is downloaded and loaded into redis
...
>>> print(f.broader("Berlin"))
['Europa nach Ort', 'Deutschland nach Gemeinde', 'Deutschland nach Bundesland']
>>> print(f.narrower("Berlin"))
['Umwelt- und Naturschutz (Berlin)', 'Veranstaltung (Berlin)', 'Stadtplanung (Berlin)', 'Verwaltung (Berlin)', 'Urbaner Freiraum in Berlin als Thema']

Currently supported languages (and their codes)

  1. English (en)
  2. Estonian (et)
  3. German (de)
  4. Japanese (ja)
  5. Polish (pl)
  6. Portuguese (pt)
  7. Russian (ru)
  8. Ukrainian (ua)
  9. Czech (cs)
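
If the language code comes from user input or configuration, it may help to check it against the list above before constructing a FastCat object. This is only a sketch: `SUPPORTED_LANGUAGES` and `check_language` are hypothetical names that are not provided by fastcat, and the table is copied by hand from the list above, so it has to be kept in sync manually.

```python
# Codes copied from the supported-languages list above (not read from fastcat).
SUPPORTED_LANGUAGES = {
    "en": "English",
    "et": "Estonian",
    "de": "German",
    "ja": "Japanese",
    "pl": "Polish",
    "pt": "Portuguese",
    "ru": "Russian",
    "ua": "Ukrainian",
    "cs": "Czech",
}


def check_language(code):
    """Return the normalized language code, or raise ValueError if it is not listed."""
    normalized = code.strip().lower()
    if normalized not in SUPPORTED_LANGUAGES:
        known = ", ".join(sorted(SUPPORTED_LANGUAGES))
        raise ValueError(f"Unsupported language {code!r}; expected one of: {known}")
    return normalized


print(check_language(" DE "))
```

The returned code could then be passed on, for example as `fastcat.FastCat(language=check_language(user_code))`.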

Install

Redis installation

You first need to set up a Redis server on your machine, as follows.

On Mac:

$ brew install redis

On Linux:

$ sudo apt-get install redis-server

On Windows:

Please refer to the instructions on installing Vagrant Redis. You will need an Ubuntu installation on your Windows machine; more information can be found here: Install your Linux Distribution of Choice
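
After installing Redis, you can check that a server is accepting connections before calling load(). The sketch below uses only the Python standard library and assumes that Redis listens on its default port 6379 on localhost; whether fastcat connects with these defaults is an assumption here. `redis_port_open` is a hypothetical helper, and it only confirms that something accepts TCP connections on that port, not that the service is actually Redis.

```python
import socket


def redis_port_open(host="localhost", port=6379, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    if redis_port_open():
        print("A server is listening on localhost:6379")
    else:
        print("Nothing is listening on localhost:6379; is redis-server running?")
```

For a stricter check, running `redis-cli ping` in a shell should print `PONG` when the server is up.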

Installing the module

Once Redis is installed, installing Fastcat is straightforward:

$ pip install fastcat

Or, if you want the newest development code:

$ pip install git+https://github.com/oskar-j/fastcat.git

That's it!

Contributing to the project

Guidelines

See CONTRIBUTING.md for more details.

Running unit tests