automod: HOWTO write rules doc (#503)
bnewbold committed Jan 2, 2024
2 parents e0ade54 + 6623966 commit 8d6bd5d
Showing 3 changed files with 153 additions and 54 deletions.
150 changes: 150 additions & 0 deletions automod/HOWTO_write_rules.md
@@ -0,0 +1,150 @@

HOWTO: Write automod Rules
==========================

The short version is:

- identify a behavior pattern or type of content in the network, and an action that should be taken in response
- write a "rule" function, in golang, which will detect this pattern (usually start by copying an existing rule)
- register the new rule with the rule engine
- test triggering the rule, either in a test network or using "captured" content from a real network
- deploy the rule, first with reduced "effects" (actions) to monitor impact

The `automod/rules` package contains a set of example rules and some shared helper functions, and demonstrates some patterns for how to use counters, sets, filters, and account metadata to compose a rule pattern.

## How Rules Work

Automod rules are golang functions which get called every time a relevant event takes place in the network. Rule functions receive static metadata about the event; can fetch additional state or metadata as needed; and can optionally output "effects". These effects can include state mutations (such as incrementing counters), or taking moderation actions.

There are multiple rule function types (eg, specifically for bsky "posts", or for atproto identity updates), but they all receive a `c` "Context" argument as the primary API for the rules system, including both accessing metadata and recording effects.

Multiple rules for the same event may be run concurrently, or in arbitrary order. Effects *are not* visible between rule executions on the same event, and are only persisted after all rules have finished executing. This means that if one rule increments a counter or adds a label, other rules will not "see" that effect when processing the same event.

Effects are automatically de-duplicated by the rules engine, both between concurrent rules and against the current state of an effect's subject. This means that rules can generally "trigger" continuously (eg, report an account on the basis of multiple posts), and the action will only take place once (not reported multiple times).
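
The de-duplication described above can be sketched as follows. This is only an illustration under stated assumptions: the `effects` type and the `addAccountLabel` and `persist` helpers are hypothetical stand-ins, not the engine's actual internals.

```golang
package main

import "fmt"

// effects is a hypothetical stand-in for the per-event effect buffer.
type effects struct {
	labels []string
}

// addAccountLabel queues a label once, even if several rules request it.
func (e *effects) addAccountLabel(val string) {
	for _, l := range e.labels {
		if l == val {
			return
		}
	}
	e.labels = append(e.labels, val)
}

// persist returns only the queued labels that the subject does not already have.
func persist(e *effects, existing map[string]bool) []string {
	var out []string
	for _, l := range e.labels {
		if !existing[l] {
			out = append(out, l)
		}
	}
	return out
}

func main() {
	e := &effects{}
	e.addAccountLabel("spam") // requested by rule A
	e.addAccountLabel("spam") // requested again by rule B on the same event
	e.addAccountLabel("rude") // already present on the account
	fmt.Println(persist(e, map[string]bool{"rude": true})) // prints [spam]
}
```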

It is expected that some rules will act together, for example paired rules on record creation and record deletion.

The design philosophy of rules is that they mostly contain their own configuration, as code. Rules are not expected to be directly configurable, and changing the "effects" or action of a rule is a change to the rule code itself.


## Rule APIs

There are two general categories of rules and effects: at the account-level, and at the record-level, with the latter being a superset of the former.

Note that none of the Context methods return errors. If errors are encountered (for example, network faults), error state is persisted internally to the Context object, a placeholder value is returned, and no effects will be persisted for the overall event execution. This is to keep rule code simple and readable.
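
A minimal sketch of this error-handling pattern, assuming a hypothetical `ruleContext` type (the real Context types and their internals are not shown here). A failed lookup stores the error, returns a placeholder, and causes all queued effects to be dropped at the end:

```golang
package main

import "fmt"

// ruleContext is a hypothetical stand-in, not the real automod Context.
type ruleContext struct {
	err     error
	pending []string
}

// getCount simulates a lookup that can fail. On failure it records the
// error on the context and returns a placeholder value of 0.
func (c *ruleContext) getCount(name string, fail bool) int {
	if fail {
		if c.err == nil {
			c.err = fmt.Errorf("count lookup failed: %s", name)
		}
		return 0
	}
	return 7
}

// addFlag queues an effect; nothing is persisted yet.
func (c *ruleContext) addFlag(val string) { c.pending = append(c.pending, val) }

// finish returns the effects to persist, or none if any error was recorded.
func (c *ruleContext) finish() ([]string, error) {
	if c.err != nil {
		return nil, c.err
	}
	return c.pending, nil
}

func main() {
	c := &ruleContext{}
	if c.getCount("posts", true) > 5 { // rule code needs no error check here
		c.addFlag("busy")
	}
	c.addFlag("seen")
	effects, err := c.finish()
	fmt.Println(len(effects), err != nil) // prints 0 true
}
```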


### Rule Types

The notable rule function types are:

- `type IdentityRuleFunc = func(c *AccountContext) error`: triggers on events like handle updates or account migrations
- `type RecordRuleFunc = func(c *RecordContext) error`: triggers on every repo operation: create, update, or delete. Triggers for every record type, including posts and profiles
- `type PostRuleFunc = func(c *RecordContext, post *appbsky.FeedPost) error`: triggers on creation or update of any `app.bsky.feed.post` record. The post record is de-serialized for convenience, but otherwise this is basically just `RecordRuleFunc`
- `type ProfileRuleFunc = func(c *RecordContext, profile *appbsky.ActorProfile) error`: same as `PostRuleFunc`, but for profile

The `PostRuleFunc` and `ProfileRuleFunc` are simply affordances so that rules for those common record types don't all need to filter and type-cast. Rules for other record types (such as `app.bsky.graph.follow`) do need to use `RecordRuleFunc` and implement that filtering and type-casting.
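
As an illustration of the filtering and type-casting a `RecordRuleFunc` has to do itself, here is a sketch using local stand-in types. The `recordContext`, `recordOp`, and `graphFollow` types and the "self-follow" flag are assumptions made so the example compiles on its own; a real rule would use `*automod.RecordContext` and the generated record types instead.

```golang
package main

import "fmt"

// graphFollow is a stand-in for the app.bsky.graph.follow record type.
type graphFollow struct{ Subject string }

// recordOp mirrors the RecordOp fields listed in this document.
type recordOp struct {
	Action     string
	DID        string
	Collection string
	Value      any
}

// recordContext is a hypothetical stand-in for automod.RecordContext.
type recordContext struct {
	RecordOp recordOp
	flags    []string
}

func (c *recordContext) AddAccountFlag(val string) { c.flags = append(c.flags, val) }

// selfFollowRule filters by collection and action, then type-casts the value.
func selfFollowRule(c *recordContext) error {
	if c.RecordOp.Collection != "app.bsky.graph.follow" || c.RecordOp.Action != "create" {
		return nil
	}
	follow, ok := c.RecordOp.Value.(*graphFollow)
	if !ok {
		return nil // unexpected value type: skip rather than fail
	}
	if follow.Subject == c.RecordOp.DID {
		c.AddAccountFlag("self-follow")
	}
	return nil
}

func main() {
	c := &recordContext{RecordOp: recordOp{
		Action:     "create",
		DID:        "did:plc:abc",
		Collection: "app.bsky.graph.follow",
		Value:      &graphFollow{Subject: "did:plc:abc"},
	}}
	_ = selfFollowRule(c)
	fmt.Println(c.flags) // prints [self-follow]
}
```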

### Pre-Hydrated Metadata

The `c *automod.AccountContext` parameter provides the following pre-hydrated metadata:

- `c.Account.Identity`: atproto identity for the account, including `DID` and `Handle` fields, and the PDS endpoint URL (if declared)
- `c.Account.Private` (optional): contains things like `.IndexedAt` (account first seen), `.Email` (the current registered account email), and `.EmailConfirmed` (boolean). Only hydrated when the rule engine is configured with admin privileges, and the account is on a PDS those privileges have access to
- `c.Account.Profile` (optional): a cached subset of the account's bsky profile record
- `c.Account.AccountLabels` (array of strings): cached view of any moderation labels applied to the account, by the relevant "local" moderation service
- `c.Account.AccountNegatedLabels` (array of strings)
- `c.Account.Takendown` (bool): if the account is currently taken down or not
- `c.Account.FollowersCount` (int64): cached
- `c.Account.PostsCount` (int64): cached

The `c *automod.RecordContext` parameter is a superset of `AccountContext` and also includes:

- `c.RecordOp.Action`: one of "create", "update", or "delete"
- `c.RecordOp.DID`
- `c.RecordOp.Collection`
- `c.RecordOp.RecordKey`
- `c.RecordOp.CID` (optional): not included for "delete"
- `c.RecordOp.Value` (optional): the record itself, usually as a pointer to an un-marshalled struct

### Counters

All `Context` objects provide access to counters. Rules don't need to pre-configure counter namespaces or values, they can just start using them. The default value for a counter which has never been incremented is `0`.

The datastore providing counters is an internal implementation/configuration detail of the rule engine, but is usually Redis. Reads (`GetCount`) may hit the network but are pretty fast.

Incrementing a counter is an "effect" and is not persisted until the end of all rule execution for an event. That is, if you read, increment, and read again, you will read the same count.
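
The read, increment, read behavior can be sketched with a toy context. The `counterContext` type and its `flush` method are hypothetical, and the single-string key is a simplification of the real namespace/value/period API:

```golang
package main

import "fmt"

// counterContext is a toy stand-in for a rule Context.
type counterContext struct {
	store   map[string]int // persisted counts
	pending []string       // increments queued as effects
}

// GetCount reads only persisted state.
func (c *counterContext) GetCount(key string) int { return c.store[key] }

// Increment queues an effect; it does not touch persisted state.
func (c *counterContext) Increment(key string) { c.pending = append(c.pending, key) }

// flush applies queued increments, as an engine would after all rules ran.
func (c *counterContext) flush() {
	for _, k := range c.pending {
		c.store[k]++
	}
	c.pending = nil
}

func main() {
	c := &counterContext{store: map[string]int{"posts": 3}}
	before := c.GetCount("posts")
	c.Increment("posts")
	during := c.GetCount("posts") // still the old value
	c.flush()
	fmt.Println(before, during, c.GetCount("posts")) // prints 3 3 4
}
```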

The counter API has distinct "namespace" and "value" fields, which are combined to form a key. You generally choose a unique namespace specific to your rule and counter type, and then values are either a fixed string or a normalized field like a DID or hash. The keyspace is global, so rules can access and mutate each other's counters, and need to avoid namespace collisions.

Time periods for counters:

- `automod.PeriodHour`: time bucket of current hour
- `automod.PeriodDay`: time bucket of current day
- `automod.PeriodTotal`: all-time counts

Basic counters:

- `c.GetCount(<namespace-string>, <value-string>, <time-period>)`: reads count for the specific time period
- `c.Increment(<namespace-string>, <value-string>)`: increments all time periods
- `c.IncrementPeriod(<namespace-string>, <value-string>, <time-period>)`: increments only a single time period bucket, as a resource optimization. You should generally use the full `Increment` method.
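
To make the namespace/value/period model concrete, here is a small in-memory sketch. The key layout in `bucketKey`, the `counters` type, and the use of Unix timestamps are all assumptions for illustration; the real key format is an internal detail of the counter store.

```golang
package main

import "fmt"

const (
	periodHour  = "hour"
	periodDay   = "day"
	periodTotal = "total"
)

// bucketKey combines namespace, value, and a time bucket into one key.
// The layout is hypothetical.
func bucketKey(ns, val, period string, unix int64) string {
	switch period {
	case periodHour:
		return fmt.Sprintf("%s/%s/hour/%d", ns, val, unix/3600)
	case periodDay:
		return fmt.Sprintf("%s/%s/day/%d", ns, val, unix/86400)
	default:
		return fmt.Sprintf("%s/%s/total", ns, val)
	}
}

type counters map[string]int

// increment bumps all three time periods, like the full Increment method.
func (c counters) increment(ns, val string, unix int64) {
	for _, p := range []string{periodHour, periodDay, periodTotal} {
		c[bucketKey(ns, val, p, unix)]++
	}
}

func (c counters) getCount(ns, val, period string, unix int64) int {
	return c[bucketKey(ns, val, period, unix)]
}

func main() {
	c := counters{}
	dayStart := int64(1704153600) // a UTC midnight (multiple of 86400)
	t1 := dayStart + 10*3600 + 1800 // 10:30 that day
	t2 := dayStart + 11*3600 + 300  // 11:05 that day
	c.increment("mention", "did:plc:abc", t1)
	c.increment("mention", "did:plc:abc", t2)
	fmt.Println(
		c.getCount("mention", "did:plc:abc", periodHour, t2),
		c.getCount("mention", "did:plc:abc", periodDay, t2),
		c.getCount("mention", "did:plc:abc", periodTotal, t2),
	) // prints 1 2 2
}
```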

"Distinct value" counters use a statistical data structure (hyperloglog) to estimate the number of unique strings incremented for the given bucket. These counters consume more memory (up to a couple KBytes per counter), though they are generally smaller for small-N buckets.

- `c.GetCountDistinct(<namespace>, <bucket>, <time-period>)`
- `c.IncrementDistinct(<namespace>, <bucket>, <value>)`
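
The semantics of distinct counters can be shown with an exact set in place of HyperLogLog. This is a deliberate substitution: the `distinctCounter` type below is hypothetical, stores every value, omits time periods, and returns exact counts, whereas the real store returns an estimate in bounded memory.

```golang
package main

import "fmt"

// distinctCounter tracks unique values per bucket using an exact set.
type distinctCounter map[string]map[string]struct{}

// incrementDistinct records val in the bucket; repeats have no effect.
func (d distinctCounter) incrementDistinct(ns, bucket, val string) {
	key := ns + "/" + bucket
	if d[key] == nil {
		d[key] = map[string]struct{}{}
	}
	d[key][val] = struct{}{}
}

// getCountDistinct returns the number of unique values seen in the bucket.
func (d distinctCounter) getCountDistinct(ns, bucket string) int {
	return len(d[ns+"/"+bucket])
}

func main() {
	d := distinctCounter{}
	// Count how many distinct accounts one account has replied to.
	d.incrementDistinct("reply-to", "did:plc:sender", "did:plc:a")
	d.incrementDistinct("reply-to", "did:plc:sender", "did:plc:b")
	d.incrementDistinct("reply-to", "did:plc:sender", "did:plc:a") // repeat
	fmt.Println(d.getCountDistinct("reply-to", "did:plc:sender")) // prints 2
}
```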

### Sets

Sets are a mechanism to separate configuration from rule implementation. They are simply named arrays of strings. Membership checks are very fast, and won't hit the network more than once per set per rule invocation.

- `c.InSet(<set-name>, <value>)`: checks if a string is in a named set, returning a `bool`
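
A sketch of how a rule might use a set to keep its word list out of the rule code. The `setStore` type, the `bad-hashtags` set name, and the `hasBadHashtag` helper are hypothetical; in a real rule the check would go through `c.InSet`.

```golang
package main

import (
	"fmt"
	"strings"
)

// setStore is a toy stand-in for the engine's named string sets.
type setStore map[string]map[string]bool

// inSet reports whether val is a member of the named set.
// Unknown set names simply report false.
func (s setStore) inSet(name, val string) bool { return s[name][val] }

// hasBadHashtag normalizes each hashtag in the text and checks set membership.
func hasBadHashtag(sets setStore, text string) bool {
	for _, word := range strings.Fields(text) {
		if !strings.HasPrefix(word, "#") {
			continue
		}
		tag := strings.ToLower(strings.TrimPrefix(word, "#"))
		if sets.inSet("bad-hashtags", tag) {
			return true
		}
	}
	return false
}

func main() {
	sets := setStore{"bad-hashtags": {"spamcoin": true}}
	fmt.Println(hasBadHashtag(sets, "buy #SpamCoin now"), hasBadHashtag(sets, "hello #world"))
	// prints true false
}
```

Normalizing values (here, lower-casing) before the membership check keeps the configured set small and avoids trivial evasion by changing case.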

### Moderation Effects (Actions)

"Flags" are a concept invented for automod. They are essentially private labels: string values attached to a subject (account or record) and persisted.

Rules can take account-level actions using the following methods:

- `c.AddAccountFlag(val string)`
- `c.AddAccountLabel(val string)`
- `c.ReportAccount(reason string, comment string)`
- `c.TakedownAccount()`

The `RecordContext` additionally has record-level equivalents for all these methods.

### Other Stuff

- `c.Logger`: a `log/slog` logging interface. Logging currently happens immediately, instead of being accumulated as an "effect"
- `c.Directory()`: returns an `identity.Directory` (interface), which can be used for (cached) identity resolution

## Development Process

When deploying a new rule, it is recommended to start with a minimal action, like setting a flag or just logging. Any "action" (including new flag creation) can result in a Slack notification. You can gain confidence in the rule by running against the full firehose with these limited actions, tweaking the rule until it seems to have acceptable sensitivity (eg, few false positives), and then escalate the actions to reporting (adds to the human review queue), or action-and-report (label or takedown, and concurrently report for humans to review the action).

### Network Data

The `hepa` command provides `process-record` and `process-recent` sub-commands which will pull an existing individual record (by AT-URI) or all recent bsky posts for an account (by handle or DID), which can be helpful for testing.

There is also a `capture-recent` sub-command which will save a snapshot ("capture") of the current account identity and profile, and recent bsky posts, as JSON. This can be combined with testing helpers (which will load the capture and push it through a mock rules engine) to test that new rules actually trigger as expected against real-world data.

Note that, of course, any real-world captures should have identifying or otherwise sensitive information redacted or replaced before committing to git.


## Examples

Here is a trivial post record rule:

```golang
// the GTUBE string is a special value historically used to test email spam filtering behavior
var gtubeString = "XJS*C4JDBQADN1.NSBN3*2IDNEN*GTUBE-STANDARD-ANTI-UBE-TEST-EMAIL*C.34X"

func GtubePostRule(c *automod.RecordContext, post *appbsky.FeedPost) error {
	if strings.Contains(post.Text, gtubeString) {
		c.AddRecordLabel("spam")
	}
	return nil
}
```

Every new (or updated) post is checked for an exact string match, and is labeled "spam" if found.
55 changes: 2 additions & 53 deletions automod/README.md
@@ -13,66 +13,15 @@ API reference documentation can be found on [pkg.go.dev](https://pkg.go.dev/gith

## Architecture

The runtime (`automod.Engine`) manages network requests, caching, and configuration. Outside calling code makes concurrent calls to the `Process*` methods that the runtime provides. The runtime constructs event context structs (eg, `automod.RecordContext`), hydrates relevant metadata from (cached) external services, and then executes a configured set of rules on the event. Rules may request additional context, do arbitrary local compute, and update the context with "effects" (such as moderation actions). After all rules have run, the runtime will inspect the context and persist any side-effects, such as updating counter state and pushing any new moderation actions to external services.
The runtime (`automod.Engine`) manages network requests, caching, and configuration. Outside calling code makes concurrent calls to the `Process*` methods that the runtime provides. The runtime constructs event context structs (eg, `automod.RecordContext`), hydrates relevant metadata from (cached) external services, and then executes a configured set of rules on the event. Rules may request additional information (via the `Context` object), do arbitrary local compute, and then update the context with "effects" (such as moderation actions). After all rules have run, the runtime will inspect the context and persist any side-effects, such as updating counter state and pushing any new moderation actions to external services.

The runtime keeps state in several "stores", each of which has an interface and both in-memory and Redis implementations. It is expected that Redis is used in virtually all deployments. The store types are:
The runtime maintains state in several "stores", each of which has an interface and both in-memory and Redis implementations. The automod stores are semi-ephemeral: they are persisted and are important state for rules to work as expected, but they are not a canonical or long-term store for moderation decisions or actions. It is expected that Redis is used in virtually all deployments. The store types are:

- `automod/cachestore`: generic data caching with expiration (TTL) and explicit purging. Used to cache account-level metadata, including identity lookups and (if available) private account metadata
- `automod/countstore`: keyed integer counters with time bucketing (eg, "hour", "day", "total"). Also includes probabilistic "distinct value" counters (eg, Redis HyperLogLog counters, with roughly 2% precision)
- `automod/setstore`: configurable static string sets. May eventually be runtime configurable
- `automod/flagstore`: mechanism to keep track of automod-generated "flags" (like labels or hashtags) on accounts or records. Mostly used to detect *new* flags. May eventually be moved in to the moderation service itself, similar to labels


## Rule API

Here is a simple example rule, which handles creation of new events:

```golang
var gtubeString = "XJS*C4JDBQADN1.NSBN3*2IDNEN*GTUBE-STANDARD-ANTI-UBE-TEST-EMAIL*C.34X"

func GtubePostRule(c *automod.RecordContext, post *appbsky.FeedPost) error {
	if strings.Contains(post.Text, gtubeString) {
		c.AddRecordLabel("spam")
	}
	return nil
}
```

Every new post record will be inspected to see if it contains a static test string. If it does, the label `spam` will be applied to the record itself.

The `c` parameter provides access to relevant pre-fetched metadata; methods to fetch additional metadata from the network; a `slog` logging interface; and methods to store output decisions. The runtime will catch and recover from unexpected panics, and will log returned errors, but rules are generally expected to run robustly and efficiently, and not have complex control flow needs.

Some of the more commonly used features of `c` (`automod.RecordContext`):

- `c.Logger`: a `log/slog` logging interface
- `c.Account.Identity`: atproto identity for the author account, including DID, handle, and PDS endpoint
- `c.Account.Private`: when not-null (aka, when the runtime has administrator access) will contain things like `.IndexedAt` (account first seen) and `.Email` (the current registered account email)
- `c.Account.Profile`: a cached subset of the account's `app.bsky.actor.profile` record (if non-null)
- `c.GetCount(<namespace>, <value>, <time-period>)` and `c.Increment(<namespace>, <value>)`: to access and update simple counters (by hour, day, or total). Incrementing counters is lazy and happens in batch after all rules have executed: this means that multiple calls are de-duplicated, and that `GetCount` will not reflect any prior `Increment` calls in the same rule (or between rules).
- `c.GetCountDistinct(<namespace>, <bucket>, <time-period>)` and `c.IncrementDistinct(<namespace>, <bucket>, <value>)`: similar to simple counters, but counts "unique distinct values"
- `c.InSet(<set-name>, <value>)`: checks if a string is in a named set

Notice that few (or none) of the context methods return errors. Errors are accumulated internally on the context itself, and error handling takes place before any effects are persisted by the engine.


## Developing New Rules

The current tl;dr process to deploy a new rule:

- copy a similar existing rule from `automod/rules`
- add the new rule to a `RuleSet`, so it will be invoked
- test against content that triggers the new rule
- deploy

You'll usually want to start with both a known pattern you are looking for, and some example real-world content which you want to trigger on.

The `automod/rules` package contains a set of example rules and some shared helper functions, and demonstrates some patterns for how to use counters, sets, filters, and account metadata to compose a rule pattern.

The `hepa` command provides `process-record` and `process-recent` sub-commands which will pull an existing individual record (by AT-URI) or all recent bsky posts for an account (by handle or DID), which can be helpful for testing.

When deploying a new rule, it is recommended to start with a minimal action, like setting a flag or just logging. Any "action" (including new flag creation) can result in a Slack notification. You can gain confidence in the rule by running against the full firehose with these limited actions, tweaking the rule until it seems to have acceptable sensitivity (eg, few false positives), and then escalate the actions to reporting (adds to the human review queue), or action-and-report (label or takedown, and concurrently report for humans to review the action).


## Prior Art

* The [SQRL language](https://sqrl-lang.github.io/sqrl/) and runtime was originally developed by an industry vendor named Smyte, then acquired by Twitter, with some core Javascript components released open source in 2023. The SQRL documentation is extensive and describes many of the design trade-offs and features specific to rules engines. Bluesky considered adopting SQRL but decided to start with a simpler runtime with rules in a known language (golang).
2 changes: 1 addition & 1 deletion automod/doc.go
@@ -1,6 +1,6 @@
// Auto-Moderation rules engine for anti-spam and other moderation tasks.
//
// This package (`github.com/bluesky-social/indigo/automod`) contains a "rules engine" to augment human moderators in the atproto network. Batches of rules are processed for novel "events" such as a new post or update of an account handle. Counters and other statistics are collected, which can drive subsequent rule invocations. The outcome of rules can be moderation events like "report account for human review" or "label post". A lot of what this package does is collect and maintain caches of relevant metadata about accounts and pieces of content, so that rules have efficient access to this information.
// This package (`github.com/bluesky-social/indigo/automod`) contains a "rules engine" to augment human moderators in the atproto network. Batches of rules are processed for novel "events", such as a new post or update of an account handle. Counters and other statistics are collected, which can drive subsequent rule invocations. The outcome of rules can be moderation events like "report account for human review" or "label post". A lot of what this package does is collect and maintain caches of relevant metadata about accounts and pieces of content, so that rules have efficient access to this information.
//
// See `automod/README.md` for more background, and `cmd/hepa` for a daemon built on this package.
package automod
