
Scraping wikipedia to build a neo4j graph of garbage tv.


wpower12/garbagenet


Learning Neo4j with Garbage Television

I like social network data, and I want to learn graph databases, specifically Neo4j. I do not like the 'romantic competition' shows: The Bachelor, The Bachelorette, and Bachelor in Paradise. I do like the sheer amount of scrapable data these shows expose on Wikipedia.

These scripts show my attempts at learning to:

  • Scrape semi-structured Wikipedia data
  • Create and update a Neo4j graph database
  • Write a decent document describing the above
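To make the scraping step concrete, here is a minimal sketch of pulling rows out of a Wikipedia-style contestant table using only the standard library. It is not the code in this repo: the `TableRows` class, the `parse_table` helper, and the `SAMPLE` markup are all hypothetical, and a real article page will have messier markup (footnotes, merged cells) than this handles.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def parse_table(html):
    """Return the table's rows as lists of cell strings."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows


# Hypothetical sample markup, not copied from any real article.
SAMPLE = """
<table class="wikitable">
<tr><th>Name</th><th>Age</th><th>Outcome</th></tr>
<tr><td><a href="/wiki/Example">Alex Example</a></td><td>27</td><td>Winner</td></tr>
<tr><td>Sam Sample</td><td>31</td><td>Week 3</td></tr>
</table>
"""

header, *contestants = parse_table(SAMPLE)
records = [dict(zip(header, row)) for row in contestants]
print(records)
```

Each record is a plain dict keyed by the table's header row, which is a convenient shape to hand to whatever writes the graph.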

The current results give you a network of the players and their starred-in and competed-in relationships with the various seasons of the various shows. Red nodes show players, and blue nodes show the individual seasons.

garbagenet v0.1
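The graph described above could be written with Cypher `MERGE` statements along these lines. This is a sketch under assumptions, not the schema the scripts actually use: the `Player` and `Season` labels, the `STARRED_IN` and `COMPETED_IN` relationship types, the property names, and the `appearance_query` helper are all made up for illustration. Cypher does not allow a relationship type to be passed as a query parameter, so the sketch checks it against a whitelist before formatting it into the query text.

```python
# Assumed relationship types; the real scripts may name these differently.
ALLOWED_RELATIONSHIPS = {"STARRED_IN", "COMPETED_IN"}

# MERGE creates the node or relationship only if it does not exist yet,
# so re-running a scrape should not duplicate players or seasons.
MERGE_APPEARANCE = """
MERGE (p:Player {{name: $player}})
MERGE (s:Season {{show: $show, number: $number}})
MERGE (p)-[:{rel_type}]->(s)
"""


def appearance_query(player, show, number, rel_type):
    """Build a (query, parameters) pair linking a player to a season."""
    if rel_type not in ALLOWED_RELATIONSHIPS:
        raise ValueError(f"unknown relationship type: {rel_type!r}")
    query = MERGE_APPEARANCE.format(rel_type=rel_type)
    params = {"player": player, "show": show, "number": number}
    return query, params


# Hypothetical example data.
query, params = appearance_query("Alex Example", "The Bachelor", 1, "STARRED_IN")
print(query)
print(params)
```

The pair returned here is the shape most Neo4j clients accept: the query text plus a dict of parameters, so player names never get spliced into the query string.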

To Use

You'll need to pip install a few things:

You'll also need to install Neo4j and have a local graph database running, with an address and auth info that match what's in the scripts. You can get started here.
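If you would rather not edit the address and credentials inside the scripts, one option is to read them from the environment. The sketch below assumes the official `neo4j` Python driver, which may not be the client these scripts use. The `NEO4J_URI`, `NEO4J_USER`, and `NEO4J_PASSWORD` variable names and both helper functions are hypothetical. The defaults are Neo4j's standard Bolt port and default user name.

```python
import os


def connection_settings(env=os.environ):
    """Read connection details, falling back to common local defaults."""
    return {
        "uri": env.get("NEO4J_URI", "bolt://localhost:7687"),
        "user": env.get("NEO4J_USER", "neo4j"),
        "password": env.get("NEO4J_PASSWORD", ""),
    }


def open_driver(settings):
    """Connect to the database and fail early if it is unreachable."""
    # Imported here so the settings helper works without the driver installed.
    # Assumption: the official driver (pip install neo4j).
    from neo4j import GraphDatabase

    driver = GraphDatabase.driver(
        settings["uri"], auth=(settings["user"], settings["password"])
    )
    driver.verify_connectivity()
    return driver


settings = connection_settings()
print(settings["uri"])
```

Call `open_driver(settings)` once the database is running, and `driver.close()` when you are done with it.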
