I’d name this “Electric Boogaloo”, but the window of opportunity for that is long gone.
These are some of my personal data aggregation scripts.
The project is written using the literate programming paradigm, with Emacs’ Org Mode as the engine and semi-broken English for comments. Not sure if it was worth it at this point, but it seems to work.
The basic dataflow is as follows:
- Data from various sources is saved to the folder called logs-sync in machine-readable formats (mostly CSV)
- The folder gets rsynced to my VPS
- The VPS processes that with Prefect 2 flows and stores the results to a PostgreSQL database
- Metabase queries the database and creates nice dashboards.
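The processing step in the dataflow above can be sketched roughly as follows. This is a minimal illustration and not the actual code from the org files: the function names and the `raw_events` table are hypothetical, and `sqlite3` stands in for PostgreSQL so the sketch runs without a server. In the real project this kind of logic would live inside Prefect 2 flows.

```python
import csv
import json
import sqlite3
from pathlib import Path


def read_csv_rows(folder):
    """Yield (file_name, row_dict) for every CSV file directly inside `folder`."""
    for path in sorted(Path(folder).glob("*.csv")):
        with path.open(newline="", encoding="utf-8") as f:
            for row in csv.DictReader(f):
                yield path.name, row


def store_rows(conn, rows):
    """Insert rows into a hypothetical `raw_events` table; return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_events (source TEXT, payload TEXT)"
    )
    count = 0
    for source, row in rows:
        # Each CSV row is stored as a JSON string, keyed by the file it came from.
        conn.execute(
            "INSERT INTO raw_events (source, payload) VALUES (?, ?)",
            (source, json.dumps(row)),
        )
        count += 1
    conn.commit()
    return count
```

Something like `store_rows(sqlite3.connect("test.db"), read_csv_rows("logs-sync"))` would load every CSV in the folder. Storing rows as JSON is only a shortcut for the sketch; a dashboard tool like Metabase is easier to use with proper typed columns per data source.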
The entire thing is written in Python and more or less follows the path of least resistance.
Yeah, and it most definitely won’t work for anyone except me.
The common functionality resides in core-new.org (once upon a time it had to coexist with core.org).
service.org runs maintenance flows, such as gzipping old files.
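For illustration, a maintenance task like gzipping old files could look roughly like this. It is a sketch under assumptions, not the code from service.org: the function name, the 30-day threshold, and the `*.csv` pattern are all made up for the example.

```python
import gzip
import shutil
import time
from pathlib import Path


def gzip_old_files(folder, max_age_days=30, pattern="*.csv"):
    """Gzip files in `folder` older than `max_age_days`; return the new .gz paths."""
    cutoff = time.time() - max_age_days * 86400
    compressed = []
    for path in sorted(Path(folder).glob(pattern)):
        if path.stat().st_mtime >= cutoff:
            continue  # modified recently enough, leave it alone
        target = path.with_name(path.name + ".gz")
        with path.open("rb") as src, gzip.open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)
        path.unlink()  # remove the original only after the archive is written
        compressed.append(target)
    return compressed
```

The age check here uses the file modification time, which is an assumption; a real flow might instead go by a date in the file name.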
Some files are related to particular datasources:
File | Data source | Automation | Status |
---|---|---|---|
aw.org | ActivityWatch (Desktop & Android) | Complete | OK |
mpd.org | Music Player Daemon | Complete | OK |
locations.org | My CSVs with location history | Complete | OK |
wakatime.org | WakaTime | Partial | OK |
messengers.org | Telegram + aggregation | Manual | OK |
vk.org | VK, GDPR dump | Manual | Left the network, whatever |
sleep.org | Sleep As Android | Manual | Archive |
google-android.org | Google Takeout, Android activity | Manual | Archive |
“Automation” means:
- Complete - no manual actions required
- Partial - some manual action required
- Manual - manually retrieve the required data and feed it to the project
In some files I tried to aggregate data from multiple datasources:
- youtube.org (Archive) - here I tried to join data from MPV, YouTube watch history, ActivityWatch and NewPipe to figure out what I was watching. Didn’t work out that well, maybe I’ll return to it someday.