Arlindo Pereira edited this page Sep 1, 2015 · 5 revisions

Terminology

  • User: A dict representing a user of tapiriik.
  • Service: A module, implementing the ServiceBase class, which interacts with a remote site.
  • Service Record/Connection: An object representing a user's connection to a Service.
  • Activity: A fitness activity.
  • Non-GPS Activity: An activity without GPS data.
  • Stationary Activity: An activity with statistical data, but no GPS track or sensor data (i.e. stationary activities are a subset of non-GPS activities).
  • Activity UID: The unique ID of an activity. NOT used for deduplication, but is used to locally record activity presence on remote services.
  • External ID: An ID provided by a remote site.
  • Extended Authorization vs Authorization: Authorization is for storing regular authorization data; Extended Authorization is for storing data the user must opt in to remember (e.g. passwords).
  • Flow Exception: Allows the user to control the direction in which activities flow between connected sites.
  • Exclusion: An activity may be Excluded if it is not suitable for synchronization: occurs too far in the future, corrupt source files, etc. See below for details.
  • Sync Error: An error generated during synchronization. See below for the complete guide.
  • Synchronization Worker: The process that performs the actual synchronization.
  • Synchronization Watchdog: The process that monitors the Synchronization Workers for stalling or crashes.
  • Stats cron: The process that calculates and stores all synchronization statistics.

I will attempt to use "function" when referring to an idempotent operation which returns its results, and "method" otherwise. Parameters not marked as (required) are optional.

Object Hierarchy

Sync

Contains the main synchronization methods (PerformGlobalSync and PerformUserSync), and associated functions for deduplicating & coalescing activities, determining destination services, etc.

User

Contains a list of dicts (user["ConnectedServices"] = [{"Service": "strava", "ID": ObjectId("1234")}, ...]) referencing the user's Service Records, along with other metadata (next sync time, total sync error count, etc.)

ServiceRecord

Representation of a Service Record, includes methods to retrieve and update configuration of that specific Record.

  • ExternalID, Authorization and ExtendedAuthorization are the values returned by the Service's Authorize() function. See above for the difference between Authorization and ExtendedAuthorization.
  • SynchronizedActivities is a list of Activity UIDs which the remote account is known to possess.
  • Config is a dict of configuration variables for that service - please use the GetConfiguration() function instead, as it automatically resolves default configuration variables.
  • Service is a dynamic reference to the Service that the Service Record represents a connection to.
  • SyncErrors and SyncExclusions are lists of errors and exclusions, maintained by the synchronization core. Do not modify these directly; instead, raise appropriate exceptions within the Service. Do not access them during a synchronization (they will be empty).

Activity

Representation of a single activity. Includes all raw activity data, summary statistics, metadata (timezone, type, etc.) and any data attached to it by the originating Service(s).

  • Stationary - marks the activity as Stationary or not - must be set to True or False in either DownloadActivityList or DownloadActivity (see below)
  • GPS - indicates that the activity has GPS data. Similar to Stationary, it must be set to True or False in DownloadActivityList or DownloadActivity
  • ActivityType (Activity.Type) - what sort of activity this is (running, cycling, etc...)
  • Device (Activity.Device) - an object specifying the device where the activity data originated, if known (otherwise None)
    • Serial (Activity.Device.Serial) - the serial number of the device - must be an integer for proper behaviour in FIT and TCX export
    • VersionMajor/VersionMinor (Activity.Device.VersionMajor/Activity.Device.VersionMinor) - the version of the device. These correspond with TCX's VersionMajor and VersionMinor elements, and are represented as Major.Minor in FIT export (following Garmin's practice)
    • DeviceIdentifier (Activity.Device.Identifier) - an object specifying the model of device where the activity data originated, if known. Refer to TCXIO, FITIO, and devices.py for examples of usage.
  • ActivityStatistics (Activity.Stats) - includes members for each group of statistics (e.g. heart rate, power)
    • ActivityStatistic (e.g. Activity.Stats.HR) - includes a standard set of metrics for each statistic group: Value, Average, Max, Min, Gain, Loss.
      • Not all metrics are relevant to each grouping (e.g. HR.Gain or Distance.Average), and not all services populate all relevant metrics.
      • Each ActivityStatistic has an associated unit of measure (ActivityStatisticUnit, in ActivityStatistic.Unit), and can be represented in any desired unit (within reason) using the asUnits(ActivityStatisticUnit.xyz) function.
  • Lap (Activity.Laps) - representation of a single lap.
    • In the case of stationary activities without lap information, the Laps list must still be populated with a single lap. The statistics of said lap must be identical to the activity as a whole.
    • All laps must have a StartTime and EndTime. Ideally this would be "total time" (vs. moving time, in Lap.Stats.MovingTime), but can also represent moving time (in which case, Lap.Stats.MovingTime should still be set appropriately).
    • ActivityStatistics also appear here (in Lap.Stats) - these statistics apply only to this lap (e.g. Distance should be the distance travelled between the StartTime and EndTime of the lap)
    • Waypoint is a single data point in the lap - inserted chronologically into Lap.Waypoints. All Waypoints must contain a Timestamp (datetime) and a Type (WaypointType.xyz). Laps do not need Waypoints - note the section on Stationary activities.
      • They may additionally contain any combination of the following members (if not, the member will be None)
        • Location - an object containing either a Latitude and Longitude (WGS84), an Altitude (m), or both. Not required, but currently if an activity has waypoints, at least one must contain a valid Lat/Lng.
        • HR - BPM
        • Calories - kilocalories burned up until and including that point in the Activity
        • Distance - distance travelled (m) up until and including that point in the Activity
        • Power - Watts
        • Cadence - RPM
        • RunCadence - SPM
        • Speed - m/s
        • Temp - ºC
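As a rough illustration of the Lap/Waypoint layout described above, here is a minimal sketch built on stand-in dataclasses. The class definitions (including the name WaypointLocation and the helper functions) are hypothetical simplifications, not tapiriik's real classes; only the member names (Timestamp, Type, Location, HR, Distance, Waypoints, StartTime, EndTime) are taken from the list above.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional


# Stand-in types for illustration only; tapiriik's real classes differ.
@dataclass
class WaypointLocation:
    Latitude: Optional[float] = None   # WGS84
    Longitude: Optional[float] = None  # WGS84
    Altitude: Optional[float] = None   # metres


@dataclass
class Waypoint:
    Timestamp: datetime
    Type: str = "Regular"              # stands in for WaypointType.xyz
    Location: Optional[WaypointLocation] = None
    HR: Optional[float] = None         # BPM
    Distance: Optional[float] = None   # cumulative metres


@dataclass
class Lap:
    StartTime: datetime
    EndTime: datetime
    Waypoints: List[Waypoint] = field(default_factory=list)

    def add_waypoint(self, wp):
        # Keep Waypoints in chronological order.
        self.Waypoints.append(wp)
        self.Waypoints.sort(key=lambda w: w.Timestamp)


def has_valid_latlng(lap):
    # True if at least one waypoint carries both Latitude and Longitude.
    return any(
        wp.Location is not None
        and wp.Location.Latitude is not None
        and wp.Location.Longitude is not None
        for wp in lap.Waypoints
    )


start = datetime(2015, 9, 1, 12, 0, 0)
lap = Lap(StartTime=start, EndTime=start + timedelta(minutes=10))
# Added out of order on purpose; add_waypoint restores chronological order.
lap.add_waypoint(Waypoint(start + timedelta(seconds=10), HR=120, Distance=30.0))
lap.add_waypoint(Waypoint(start, Location=WaypointLocation(45.0, -75.0), Distance=0.0))
```

Members that a waypoint does not carry simply stay None, which matches the "any combination" wording above.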

FITIO, TCXIO, GPXIO, PWXIO

Classes to support generation of their respective file formats from an Activity (Dump(act)) or creation of an Activity from an existing file (Parse(data)). Function signatures vary slightly.

Stationary activities

Stationary activities do not have sensor data, just summary data. They are handled in the same structures as regular activities. Stationary activities must be marked as such via Activity.Stationary=[True|False]. This will disable checks for minimum waypoint count (etc.) and is used by some services in determining the appropriate upload method. Activities should be marked as soon as possible (i.e. if it can be determined in DownloadActivityList), and must be set before the time the activity is sanity checked (after DownloadActivity). Otherwise, the flag should remain at the default state (Stationary = None) to allow effective coalescing when deduplicating activities.

If a service does not support stationary activities, set ReceivesStationaryActivities = False.

Non-GPS activities

Unlike stationary activities, non-GPS (activity.GPS = False) activities can still have sensor data recorded in Waypoints (and therefore should not be flagged as Stationary). Services should be written to allow for uploading Waypoints without Location, although they may ignore such waypoints if the remote site does not support them (e.g. GPX export).

If the service does not support non-GPS activities, set ReceivesNonGPSActivitiesWithOtherSensorData = False.

Picking the right Exception

Errors

ServiceExceptions generated in Services are eventually turned into Sync Errors, which...

  • May apply to a specific activity, or an entire remote account.
  • May be blocking (that scope will not be processed until the error is cleared).
  • May be displayed to the user.
  • May allow user intervention to clear a blocking error (e.g. user is prompted to re-authenticate after an authentication failure). Currently, all exceptions displayed to the user require user intervention.
    • Clearing happens in groups (e.g. all 400 authentication failures are cleared at once).

All of these attributes are set through the constructor of the ServiceException (or its little-used, entirely-useless subclass APIException):

  • message (required)
  • scope - one of ServiceExceptionScope.Service or ServiceExceptionScope.Account
  • block - boolean for whether the error should block processing of the given scope
  • user_exception - an instance of UserException

Exceptions thrown in DownloadActivityList() will result in the service being omitted ("excluded" is the term used in the handling code) from the remainder of the synchronization.

UserException

ServiceExceptions thrown with a UserException given will be displayed to the user as an error on the dashboard. The constructor takes the following arguments:

  • type (required) - one of UserExceptionType.Authorization, UserExceptionType.AccountFull or UserExceptionType.AccountExpired. Determines which message is shown to the user.
  • intervention_required (required) - currently all of the UserExceptionTypes assume this is set to True and offer the user a way to clear the associated Sync Error by performing the appropriate action (e.g. successfully reauthorizing the Service Record's remote account)
  • clear_group - Clearing one error in a given clear_group clears all other errors in that group. Defaults to type if not specified.

ServiceExceptions bearing UserExceptions are generally thrown during synchronization, but may also be generated in the Authorize() function (present in services that use UsernamePassword authentication). These errors are passed to the front-end JS for further case-specific handling (e.g. the TrainingPeaks non-premium account error).
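To make the constructor arguments above concrete, here is a minimal sketch using stand-in classes. The class bodies, the attribute names, the default values for scope and block, and the check_login helper are all assumptions for illustration; the real definitions live in tapiriik and may differ.

```python
from enum import Enum


# Hypothetical stand-ins mirroring the constructor arguments listed above.
class ServiceExceptionScope(Enum):
    Service = "service"
    Account = "account"


class UserExceptionType(Enum):
    Authorization = "auth"
    AccountFull = "full"
    AccountExpired = "expired"


class UserException:
    def __init__(self, type, intervention_required, clear_group=None):
        self.Type = type
        self.InterventionRequired = intervention_required
        # clear_group defaults to the type when not specified.
        self.ClearGroup = clear_group if clear_group is not None else type


class ServiceException(Exception):
    # The defaults for scope and block are assumptions made for this sketch.
    def __init__(self, message, scope=ServiceExceptionScope.Service,
                 block=False, user_exception=None):
        super().__init__(message)
        self.Message = message
        self.Scope = scope
        self.Block = block
        self.UserException = user_exception


def check_login(status_code):
    # Hypothetical helper: turn a remote 401 into a blocking, user-visible error.
    if status_code == 401:
        raise ServiceException(
            "Remote site rejected the stored credentials",
            block=True,
            user_exception=UserException(
                UserExceptionType.Authorization, intervention_required=True),
        )
```

Because clear_group falls back to the type, every Authorization error raised this way would be cleared together once the user reauthorizes.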

Exclusions

APIExcludeActivity exceptions generated within calls to DownloadActivityList() or DownloadActivity() are turned into Exclusions. An Exclusion applies to a specific activity (identified either by the external ID or an Activity object) in a specific Service Record. Exclusions may be permanent (never cleared) or not permanent (exist only until the beginning of the next synchronization). Excluding an Activity means that the excluding Service will not be called upon to provide further information regarding that Activity (but other Services may be, should they possess the same Activity).

When excluding activities in DownloadActivityList(), do not raise the APIExcludeActivity - instead, append it to a list to be returned as the second member of the tuple.

The APIExcludeActivity constructor takes the following arguments:

  • message (required)
  • activity (required if not being raised from within DownloadActivity AND if activityId is not specified) - an instance of an Activity which is to be excluded
  • activityId (required if not being raised from within DownloadActivity AND if activity is not specified) - a unique identifier of the activity to be excluded
  • permanent (required) - whether the exclusion should apply permanently (e.g. corrupt file) or until the next synchronization (e.g. a live-tracking activity in progress).
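The collect-instead-of-raise pattern for DownloadActivityList() could look roughly like the sketch below. APIExcludeActivity here is a stand-in with the argument names from the list above; its attribute names, the list_activities function, and the shape of the remote items (dicts with "id", "live" and "corrupt" keys) are hypothetical.

```python
# Hypothetical stand-in; argument names follow the list above.
class APIExcludeActivity(Exception):
    def __init__(self, message, activity=None, activityId=None, *, permanent):
        super().__init__(message)
        self.Message = message
        self.Activity = activity
        self.ExternalActivityID = activityId
        self.Permanent = permanent


def list_activities(remote_items):
    # remote_items: hypothetical list of dicts returned by a remote API.
    activities, exclusions = [], []
    for item in remote_items:
        if item.get("live"):
            # In-progress activity: exclude only until the next synchronization.
            exclusions.append(APIExcludeActivity(
                "Activity in progress", activityId=item["id"], permanent=False))
            continue
        if item.get("corrupt"):
            # Corrupt source file: retrying will not help, so exclude permanently.
            exclusions.append(APIExcludeActivity(
                "Corrupt source file", activityId=item["id"], permanent=True))
            continue
        activities.append(item)
    # Exclusions are returned, not raised, as the second member of the tuple.
    return activities, exclusions
```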

I hate time zones

Some sites have great timezone support, some don't. To deal with transferring activities from a TZ-naive source to a TZ-aware destination, and to ease deduplication, there are several built-in methods on the Activity object.

There are two different places timezones can be applied to an activity - Activity.TZ and the individual timestamps within the activity (e.g. StartTime, Waypoint.Timestamp). Activity.TZ should only be set in cases where the time zone of the activity's occurrence is known. Timestamps should be assigned time zones as appropriate for the format (e.g. all timestamps originating from GPXIO will be in UTC, while the same activity from PWXIO would be TZ-naive). TZ-aware and TZ-naive timestamps should not be mixed within an activity (i.e. StartTime being TZ-aware but EndTime being TZ-naive is a bad thing).

To reconcile this potential discrepancy, the sync core automatically calls EnsureTZ immediately before attempting activity uploads.

pytz is used for all timezone operations.

Activity.EnsureTZ()

Long story short, ensures that Activity.TZ is set appropriately. Calls CalculateTZ() in all cases, but CalculateTZ() will simply return the existing Activity.TZ if recalculate=True is not given.

Activity.CalculateTZ()

Attempts to determine a time zone for the activity. The most common case involves looking up the first geographic coordinate in the activity in a database of timezone boundaries.

  • If there are no geographic Waypoints in the activity, or you wish to override the point used for calculation, you may supply loc=Location().
  • If no geographic Waypoints exist and no loc is specified (as is the case with stationary activities), the calculation will fall back to the value of FallbackTZ (populated by the synchronization core, determined by the majority of the user's other activities, if available).
  • If none of the above apply, the calculation will fail.

The calculation sets Activity.TZ but does not change the timezones associated with any of the timestamps within the Activity.

Activity.AdjustTZ() and Activity.DefineTZ()

These methods update all the timestamps contained within the activity to reflect the current value of Activity.TZ. Use DefineTZ() when the timestamps are TZ-naive, and AdjustTZ() when the timestamps are already TZ-aware.
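The difference between the two operations can be sketched on a single timestamp. This is one plausible reading of the behaviour described above, not the actual implementation: define_tz and adjust_tz are hypothetical helper names, and a fixed-offset zone from the standard library stands in for the pytz zones tapiriik uses.

```python
from datetime import datetime, timedelta, timezone

# Fixed-offset zone used as a stand-in; tapiriik itself uses pytz zones.
activity_tz = timezone(timedelta(hours=-5))


def define_tz(naive_ts, tz):
    # DefineTZ-style: the wall-clock reading stays, a zone is attached.
    assert naive_ts.tzinfo is None
    return naive_ts.replace(tzinfo=tz)


def adjust_tz(aware_ts, tz):
    # AdjustTZ-style: the instant stays, the wall-clock reading is converted.
    assert aware_ts.tzinfo is not None
    return aware_ts.astimezone(tz)


naive = datetime(2015, 9, 1, 7, 30)
utc = datetime(2015, 9, 1, 12, 30, tzinfo=timezone.utc)

defined = define_tz(naive, activity_tz)   # 07:30 at UTC-5
adjusted = adjust_tz(utc, activity_tz)    # the same instant, shown as 07:30 at UTC-5
```

With real pytz zones, attaching a zone to a naive datetime should go through tz.localize(naive_ts) rather than replace(tzinfo=tz), since replace() does not select the correct offset for pytz zones.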

Deduplication

The rules for deduplicating activities are described in _accumulateActivities. In short, two activities are considered the same if their activity types match (or are reasonably similar, e.g. MTB vs. Biking) and their start times are identical, or within 3 minutes of each other, or within 30 seconds of each other after allowing for an offset of up to 38 hours and/or 30 minutes in either direction (for TZ mistakes). Otherwise the activity will be re-uploaded.
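The start-time part of that rule can be sketched as follows. This is only one reading of the description above; the authoritative logic is in _accumulateActivities, and the function name start_times_match is hypothetical. Activity type matching is not covered here.

```python
from datetime import datetime, timedelta


def start_times_match(a, b):
    # Both datetimes must be TZ-naive or both TZ-aware for the subtraction to work.
    delta = a - b
    if abs(delta) <= timedelta(minutes=3):
        return True
    # Allow for timezone mistakes: whole-hour offsets of up to 38 hours,
    # optionally shifted by half an hour, with 30 seconds of slack.
    for hours in range(-38, 39):
        for minutes in (-30, 0, 30):
            offset = timedelta(hours=hours, minutes=minutes)
            if abs(delta - offset) <= timedelta(seconds=30):
                return True
    return False
```

For example, two start times that differ by 5 hours and 20 seconds would match under this reading, while two that differ by 10 minutes would not.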

Once Activities are determined to be duplicate, their details are coalesced into a single Activity. In general, the first activity to be listed is given preference, with exceptions for...

  • Start/End time - the timezone from the latter activity will apply to the former if the former is not TZ-aware and the latter is.
  • Activity Type - the most specific activity type is chosen (e.g. Mountain Biking over Cycling)
  • Private - the most restrictive setting is chosen
  • Stationary - False overrides True
  • Stats - they are averaged
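Two of the exceptions above (Private and Stationary) can be sketched with plain dicts standing in for Activity objects. The coalesce_flags function is a hypothetical illustration, not the sync core's code, and it deliberately leaves out the time zone, activity type and stats rules.

```python
def coalesce_flags(first, second):
    # The first activity listed is given preference for everything else.
    merged = dict(first)
    # Private: the most restrictive setting wins.
    merged["Private"] = bool(first.get("Private")) or bool(second.get("Private"))
    # Stationary: False overrides True; None means "not determined yet".
    known = [v for v in (first.get("Stationary"), second.get("Stationary"))
             if v is not None]
    merged["Stationary"] = all(known) if known else None
    return merged
```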

Creating your own Service

Great! It's not all that hard, you could probably figure it out just by looking at the existing Services. Here are some areas to watch out for:

Authorization

Services may authorize remote accounts via OAuth or direct Username/Password entry.

OAuth

Services should specify the appropriate UserAuthorizationURL, and implement GenerateUserAuthorizationURL if the authorization URL must be unique (take a look at Dropbox for an example of this). This is the URL that the user will be redirected to when they click Connect. The appropriate return URL can be generated with WEB_ROOT + reverse("oauth_return", kwargs={"service": "serviceid"})

Once the user returns from a successful OAuth transaction, RetrieveAuthorizationToken() will be called.

Username/Password

Services must still specify a UserAuthorizationURL (which will be local to tapiriik - see Endomondo for an example). When a user attempts to authorize, Authorize() is called.

The Service should specify RequiresExtendedAuthorizationDetails = True if storage of the raw credentials (i.e. use of Extended Authorization) is required. If this is True and no Extended Authorization details are available (i.e. the user did not opt to have them remembered), the Service will be omitted from synchronization.

What to return from RetrieveAuthorizationToken()/Authorize()

You should return a triple containing:

  • External ID of the remote account (required)
  • Authorization dict - this will be stored directly in the Service Record. If no authorization details are appropriate (e.g. they are all in Extended Authorization), an empty dict must be specified.
  • Extended Authorization dict - this will be made available in the Service Record (where it is stored depends on whether the user opted to save their credentials). The CredentialStorage class should be used when appropriate. If no extended authorization details are appropriate, omit this.
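The two return shapes might look like the sketch below. The function bodies, the dict keys ("OAuthToken", "Email", "Password"), the token_response layout and the faked external ID are assumptions for illustration; a real Service would contact the remote site, and would wrap stored credentials with CredentialStorage where appropriate (omitted here).

```python
def retrieve_authorization_token(token_response):
    # token_response: hypothetical dict parsed from the remote OAuth endpoint.
    external_id = token_response["user_id"]
    authorization = {"OAuthToken": token_response["access_token"]}
    # No Extended Authorization is needed for OAuth, so the third member is omitted.
    return (external_id, authorization)


def authorize(email, password):
    # A real Service would verify the credentials remotely and fetch the ID;
    # here the ID is faked for illustration.
    external_id = "remote-user-" + email
    # Everything lives in Extended Authorization, so Authorization is an empty dict.
    return (external_id, {}, {"Email": email, "Password": password})
```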

DownloadActivityList()

tapiriik operates with two different modes of synchronization: partial and exhaustive/full. If the exhaustive kwarg is not set, this function should return only the most recent n activities for the given Service Record (common values for n are 50 and 25). The exact number returned in this case is not important, just that we are not enumerating thousands of activities over tens of pages, since new activities are only likely to appear in the recent past. If exhaustive is set, every activity in the account should be returned.

The Activity.ServiceData member is available to store data like remote activity IDs. If set here, its value is made available (in the same member) in future calls to DownloadActivity().

This method should return a tuple: the first member is a list of Activities, the second a list of APIExcludeActivity exceptions. Excluded activities should be omitted from the first list.
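A minimal sketch of the partial versus exhaustive behaviour, assuming a paged remote API that lists the newest activities first. FakeRemote, its get_page method, the page size of 25 and the dict-shaped activities are all hypothetical; a real Service would build Activity objects and make HTTP requests.

```python
class FakeRemote:
    # Hypothetical paged API: 120 activities, 25 per page, newest first.
    def __init__(self, total=120, page_size=25):
        self.total, self.page_size = total, page_size

    def get_page(self, page):
        first = page * self.page_size
        last = min(first + self.page_size, self.total)
        return [{"id": i} for i in range(first, last)]


def download_activity_list(remote, exhaustive=False):
    activities, exclusions = [], []
    page = 0
    while True:
        items = remote.get_page(page)
        for item in items:
            # ServiceData carries the remote ID forward to DownloadActivity().
            activities.append({"ServiceData": {"ActivityID": item["id"]}})
        # Partial sync: stop after the first (most recent) page.
        # Exhaustive sync: keep going until a short or empty page is seen.
        if not exhaustive or len(items) < remote.page_size:
            break
        page += 1
    return activities, exclusions
```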

WTH is WebInit()?

It's a method called only when the web interface is starting up (i.e. not in the synchronization worker), allowing you to make Django calls like reverse().