Skip to content

Latest commit

 

History

History
81 lines (65 loc) · 2.71 KB

locations.org

File metadata and controls

81 lines (65 loc) · 2.71 KB

Locations

Working with my location history. I don’t move around too much, but it’s still interesting to add that datapoint.

Also, ActivityWatch and some other apps store the timestamps in UTC, so this data is required to fix the timezone.

Storing this is the database won’t be convenient, so I just manage a handful of CSV files with Org Mode.

Matching locations

The structure of CSV files is as follows:

  • tz_csv
    • location - location string (e.g. Saint-Petersburg)
    • timezone - timezone (e.g. “+3”)
  • list_csv
    • start_time
    • location - location from tz_csv
  • loc_hostnames
    • hostname
    • location - location from tz_csv

The class is meant to throw errors in case there is a mismatch somewhere.

import pandas as pd

from datetime import timedelta
from sqrt_data_service.api import settings

__all__ = ['LocationMatcher']


class LocationMatcher:
    def __init__(self):
        self._df_tz = pd.read_csv(settings['location']['tz_csv'])
        self._df_list = pd.read_csv(settings['location']['list_csv'])
        self._df_hostnames = pd.read_csv(settings['location']['hostnames_csv'])

        self._df_list['start_time'] = pd.to_datetime(
            self._df_list['start_time']
        )
        self._df_list = self._df_list.sort_values(
            by='start_time', ascending=False
        )

        self._init_timezones()

    def _init_timezones(self):
        self._timezones = {
            d.location: int(d.timezone)
            for d in self._df_tz.itertuples(index=False)
        }
        self._df_list['timezone'] = [
            self._timezones[l] for l in self._df_list['location']
        ]
        self._df_hostnames['timezone'] = [
            self._timezones[l] for l in self._df_hostnames['location']
        ]

    def get_location(self, time, hostname=None):
        if hostname is not None:
            matches = self._df_hostnames[self._df_hostnames.hostname == hostname
                                        ]
            if len(matches) > 0:
                match = matches.iloc[0]
                time += timedelta(seconds=60 * 60 * int(match.timezone))
                return (match.location, time)
        row = self._df_list[self._df_list.start_time < time.to_datetime64()
                           ].iloc[0]
        time += timedelta(seconds=60 * 60 * int(row.timezone))
        return (row.location, time)

if __name__ == '__main__':
    LocationMatcher()