Virtuoso Freebase Setup
sameersingh edited this page Sep 23, 2014 · 5 revisions
Notes on how Virtuoso Freebase was set up, and how to query it using SPARQL.
- Install Virtuoso Open-Source on Ubuntu using
sudo aptitude install virtuoso-opensource
- Ensure that /var/lib/virtuoso-opensource-6.1/db is linked to an HDD with a lot of free space.
- Get the Freebase dump into the db folder using
wget http://download.freebaseapps.com/
- Gunzip it (requires ~330G):
gunzip freebase-rdf-*.gz
- Load the RDF triples into Virtuoso:
isql-vt 1111
- Register the load request:
SQL> ld_dir('.', 'freebase-rdf-*', 'http://freebase.com');
- Check that the request was registered:
SQL> select * from DB.DBA.load_list;
- Start the load:
SQL> rdf_loader_run();
- In another isql-vt window, count the triples loaded per graph so far:
SQL> SPARQL SELECT ?g COUNT(*) { GRAPH ?g {?s ?p ?o.} } GROUP BY ?g ORDER BY DESC 2;
- http://sivareddy.in/load-freebase-dump-into-virtuoso-sparql-sql
- http://www.openlinksw.com/dataspace/doc/dav/wiki/Main/VirtBulkRDFLoader
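The loader statements from the steps above can also be generated by a small script rather than typed at the prompt. This is a minimal sketch, not a tested procedure: `load_statements` is a hypothetical helper name, and it assumes `isql-vt` accepts statements on standard input and that the login used on this page (`isql-vt 1111`) works without extra credentials.

```shell
# Graph IRI and file pattern as used in the steps on this page.
GRAPH='http://freebase.com'
PATTERN='freebase-rdf-*'

load_statements() {
  # $1: directory holding the unzipped dump, as seen by the Virtuoso server
  printf "ld_dir('%s', '%s', '%s');\n" "$1" "$PATTERN" "$GRAPH"
  printf 'select * from DB.DBA.load_list;\n'
  printf 'rdf_loader_run();\n'
}

# Usage (assumption: isql-vt reads statements from stdin):
#   load_statements . | isql-vt 1111
```

Printing the statements first, without the pipe, makes it easy to check them before starting a load that runs for hours.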
Based on https://groups.google.com/forum/#!topic/sindicetech-freebase/93PBGJBnnIU.
Basic steps for each query:
- Point the browser to http://localhost:8890/sparql.
- Ensure the query produces the expected output (use http://freebase.com as the Graph IRI).
- Run the same query using curl with limits off, TSV format, etc.
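The curl step could look roughly like the sketch below. `sparql_tsv` is a hypothetical helper name, and the `default-graph-uri`, `query` and `format` request parameters, along with the `text/tab-separated-values` format value, are assumptions about what the local endpoint accepts. Any server-side cap on result rows is a server configuration matter that this sketch does not change.

```shell
# Endpoint and graph default to the values used on this page.
ENDPOINT="${ENDPOINT:-http://localhost:8890/sparql}"
GRAPH="${GRAPH:-http://freebase.com}"

sparql_tsv() {
  # $1: SPARQL query text. Prints the response body to stdout.
  # -G turns the url-encoded fields into a GET query string.
  # CURL can be overridden (e.g. CURL=echo) to print the request instead.
  ${CURL:-curl} -sS -G "$ENDPOINT" \
    --data-urlencode "default-graph-uri=$GRAPH" \
    --data-urlencode "query=$1" \
    --data-urlencode "format=text/tab-separated-values"
}

# Example:
#   sparql_tsv 'SELECT COUNT(*) { ?s ?p ?o }' > count.tsv
```

Passing the query through `--data-urlencode` avoids hand-escaping characters such as `?`, `{` and `#` in the URL.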
For the complete reference, see Freebase types and relations, and Virtuoso SPARQL service.
Get number of triples in the DB.
SELECT COUNT(*) {
  ?s ?p ?o
}
Get all relations of an entity, identified by its Freebase mid.
PREFIX ns: <http://rdf.freebase.com/ns/>
SELECT * WHERE {
  ns:m.014zcr ?p ?o
}
LIMIT 10
For multi-hop relations, one would do:
PREFIX ns: <http://rdf.freebase.com/ns/>
SELECT * WHERE {
  # hop 1: actor -> performance node
  ns:m.014zcr ns:film.actor.film ?film_performance .
  # hop 2: performance node -> film
  ?film_performance ns:film.performance.film ?film .
  # attributes of the film
  ?film ns:type.object.name ?name .
  ?film ns:film.film.initial_release_date ?initial_release_date .
  FILTER(lang(?name) = 'en')
}
LIMIT 1
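To reuse the multi-hop pattern above for other entities, the query text can be generated from a mid. A minimal sketch: `films_query` is a hypothetical helper name, and the default limit of 10 is an arbitrary choice, not something from the original notes.

```shell
films_query() {
  # $1: Freebase mid in dotted form, e.g. m.014zcr
  # $2: optional row limit (defaults to 10)
  cat <<EOF
PREFIX ns: <http://rdf.freebase.com/ns/>
SELECT ?film ?name ?initial_release_date WHERE {
  ns:$1 ns:film.actor.film ?film_performance .
  ?film_performance ns:film.performance.film ?film .
  ?film ns:type.object.name ?name .
  ?film ns:film.film.initial_release_date ?initial_release_date .
  FILTER(lang(?name) = 'en')
}
LIMIT ${2:-10}
EOF
}

# Example: write the query to a file for later use.
#   films_query m.014zcr 5 > films.rq
```

The saved file can then be sent with curl's `--data-urlencode query@films.rq` form, which reads and encodes the file contents.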