
Simplified and reconfigured model building and application #3

Merged

Conversation

@shankari commented Jul 27, 2021

  • Use only the first stage
  • up the radius to 500m
  • no filtering, no cutoffs for the similarity code
  • user is valid if they have > 14 trips (no percentage-labeled requirement)
  • user input is valid if any entry has been filled out (previously, all entries had to be filled out); see the sketch below
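
A minimal sketch of the relaxed validity checks described above. The helper names and trip shape are hypothetical; the real checks live in `get_users.py` and `data_preprocessing.py`:

```
# Hypothetical sketch; names are illustrative, not the actual implementation.

def user_is_valid(trips):
    # A user is now valid with > 14 labeled trips; the old minimum
    # percentage-labeled requirement is gone.
    labeled = [t for t in trips if t["data"]["user_input"]]
    return len(labeled) > 14

def user_input_is_valid(user_input):
    # An input is now valid if *any* entry is filled out; previously
    # all entries had to be filled out.
    return any(v is not None for v in user_input.values())
```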

Improvements from these changes, compared side by side with the old model, are at:
e-mission/e-mission-eval-private-data#28 (comment)

  • bonus fix: handle the corner case of no user inputs, which fixes the regression in the tests

We copy the files instead of modifying the existing models so that we can run
the old and new versions side by side for a comparison.

TODO: Unify with parameters (at least for the auxiliary files) instead of
making copies.

The copied code is currently unchanged so that we get a clean diff when we do
apply the changes.
Configuration changes include:
- `data_preprocessing.py`: don't drop a user input when any field is None;
  only filter it out if all fields are undefined
- `evaluation_pipeline.py`: turn off filtering and cutoffs for the similarity
  code (see the sketch after this list)
- `get_users.py`: don't require that 50% of the trips are labeled
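
To make "no filtering, no cutoffs" concrete, here is a toy sketch; the function and parameter names are illustrative assumptions, not the actual similarity API:

```
# Illustrative only: with cutoffs off, the similarity step keeps every
# bin instead of dropping the small ones.
def apply_cutoff(bins, cutoff_enabled=False, min_size=2):
    if not cutoff_enabled:
        return bins  # new behavior: keep all bins, however small
    # old behavior: drop bins below the cutoff size
    return [b for b in bins if len(b) >= min_size]
```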

Then make significant changes to `build_save_model`. Concretely:
- save the models after the first round, which significantly simplifies the
  prediction code
- pull out the location map generation and user input generation code into
  separate functions to simplify the invocation (see the sketch below)
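
A condensed sketch of the restructured flow. The helper bodies and trip fields are assumptions; the `locations_first_round_*` and `user_labels_first_round_*` filenames match the testing logs below:

```
import json

def create_location_map(bins):
    # Pulled-out helper (hypothetical body): one representative pair of
    # start/end locations per bin, used to match new trips later.
    return {i: {"start": b[0]["start_loc"], "end": b[0]["end_loc"]}
            for i, b in enumerate(bins)}

def create_user_input_map(bins):
    # Pulled-out helper (hypothetical body): the user labels seen in each
    # bin, which become the predictions for trips that match the bin.
    return {i: [t["user_input"] for t in b] for i, b in enumerate(bins)}

def build_save_model(user_id, bins):
    # Save after the *first* round of clustering only; prediction can
    # then just load these two maps instead of re-running clustering.
    with open("locations_first_round_%s" % user_id, "w") as fp:
        json.dump(create_location_map(bins), fp)
    with open("user_labels_first_round_%s" % user_id, "w") as fp:
        json.dump(create_user_input_map(bins), fp)
```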

Testing done:
- Ran it against the staging database
- Several users did not have any labeled trips

```
$ grep "enough valid trips" /tmp/staging_run.log
2021-07-27 01:31:02,224:DEBUG:Total: 0, labeled: 0, user 9cfc3133-3505-4923-b6fb-feb0e50a5aba doesn't have enough valid trips for further analysis.
2021-07-27 01:31:02,246:DEBUG:Total: 0, labeled: 0, user cc114670-ef99-4247-a3a2-47529247d8ac doesn't have enough valid trips for further analysis.
2021-07-27 01:31:02,267:DEBUG:Total: 0, labeled: 0, user 2bc8ca71-7d0f-4930-ba2c-cf97f7dceaea doesn't have enough valid trips for further analysis.
2021-07-27 01:31:05,340:DEBUG:Total: 171, labeled: 6, user 6373dfb8-cb9b-47e8-8e8f-76adcfadde20 doesn't have enough valid trips for further analysis.
```

They were filtered out correctly:

```
$ grep "enough valid trips" /tmp/staging_run.log | wc -l
      21
```

Models were created for 20 users:

```
$ ls locations_first_round_* | wc -l
      20
```

No errors.

If the user has never submitted an input, we can't infer anything, so we
return early.
This fixes e-mission#829 (comment)
Concretely:
- pull out the matching code into `find_bin`
- load only locations and user inputs
- predict by going from trip -> bin -> user_labels (see the sketch below)
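
A minimal sketch of that path. The haversine check and JSON file format are assumptions; only the `find_bin` name, the early return, and the trip -> bin -> user_labels flow come from this change:

```
import json, math

def _within(p1, p2, radius_m):
    # Rough haversine distance check between two (lon, lat) points.
    lon1, lat1, lon2, lat2 = map(math.radians, (*p1, *p2))
    a = (math.sin((lat2 - lat1) / 2) ** 2 +
         math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 6371000 * 2 * math.asin(math.sqrt(a)) <= radius_m

def find_bin(trip, locations, radius=500):
    # Pulled-out matching code: return the first bin whose saved start and
    # end locations are both within `radius` meters of the trip's.
    for bin_id, loc in locations.items():
        if (_within(trip["start_loc"], loc["start"], radius) and
                _within(trip["end_loc"], loc["end"], radius)):
            return bin_id
    return -1

def predict_labels(trip, user_id):
    # Load only the two saved maps: locations and user inputs.
    with open("locations_first_round_%s" % user_id) as fp:
        locations = json.load(fp)
    with open("user_labels_first_round_%s" % user_id) as fp:
        user_labels = json.load(fp)
    if not user_labels:
        # The user has never submitted an input; we can't infer anything,
        # so we return early.
        return []
    bin_id = find_bin(trip, locations)
    return user_labels.get(bin_id, []) if bin_id != -1 else []
```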

Change filenames for consistency.

Add log statements for clarity.

Testing done:
- Ran the intake pipeline
- Ran `label_stats` for a single user
- Ran the intake pipeline again
- Confirmed that the confirmed trips had inferences
Previous commit: 71e299e
Fixes: model filenames, the load process, and a variable name

Testing done: verified that it really works, even with a reset.
- Remove it from the list of algorithms
- Switch the ensemble to the one that returns the first value only (see the
  sketch below)
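
A sketch of that ensemble; the name and signature are hypothetical:

```
# Hypothetical sketch: with a single algorithm left, the ensemble simply
# returns the first prediction instead of combining several.
def ensemble_first_prediction(trip, algorithm_results):
    return algorithm_results[0] if algorithm_results else []
```
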
@shankari

Just wanted to call out these highlights:

There are still many users for whom the new model does not predict a lot of trips, but it is much, much better than before.

[two screenshots: prediction-rate comparison between the old and new models]

The only users with a < 20% trip prediction rate have very few (< 50) labeled trips.

@shankari

Running against the full dataset now; will deploy to staging tomorrow morning when I am not so brain dead.
Don't want to make a mistake and end up deleting all the data.

Since this is a normal use case and not an error.
@shankari commented Jul 27, 2021

After running the full intake pipeline against the staging DB from last week for one user, I get:

```
all inferred trips 243
all confirmed trips 243
bin/debug/label_stats.py:40: DeprecationWarning: count is deprecated. Use Collection.count_documents instead.
  print("confirmed trips without inferred labels %s" % (edb.get_analysis_timeseries_db().find({"metadata.key": "analysis/confirmed_trip", "user_id": sel_uuid, "data.inferred_labels": []}).count()))
confirmed trips without inferred labels 205
confirmed trips with expectation 243
```
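
The warning points directly at the fix: pymongo deprecated `Cursor.count()` in favor of `Collection.count_documents()`. A sketch of the updated `label_stats.py` call (`sel_uuid` is already defined in that script):

```
# Replace .find(query).count() with count_documents(query) on the collection.
count = edb.get_analysis_timeseries_db().count_documents({
    "metadata.key": "analysis/confirmed_trip",
    "user_id": sel_uuid,
    "data.inferred_labels": []
})
print("confirmed trips without inferred labels %s" % count)
```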

@shankari

The pipeline ran successfully against the staging database from last week on my laptop.
Now getting ready to run on staging!

@shankari

Found at least a couple of users with:

```
Found confirmed trip with matching inferred trip, without user labels
Found confirmed trip with matching inferred trip, without user labels
Found confirmed trip with matching inferred trip, without user labels
Found confirmed trip with matching inferred trip, without user labels
Found confirmed trip with matching inferred trip, without user labels
Found confirmed trip with matching inferred trip, without user labels
```

@shankari

Resetting the inferences; due to e-mission/e-mission-docs#654,
this took a few minutes to delete all the expected trips, at least for one of the users.

```
# ./e-mission-py.bash bin/debug/reset_partial_label_testing.py -i
Connecting to database URL stage-db
{'n': 5148, 'ok': 1.0}
{'n': 813198, 'ok': 1.0}
{'n': 10297, 'ok': 1.0}
{'n': 5148, 'ok': 1.0}
{'n': 76, 'ok': 1.0}
```

@shankari commented Jul 27, 2021

Resetting the confirmed trips:

```
./e-mission-py.bash bin/debug/reset_partial_label_testing.py -c
Connecting to database URL stage-db
{'n': 0, 'ok': 1.0}
{'n': 5148, 'ok': 1.0}
{'n': 38, 'ok': 1.0}
```

Why do we have inferred trips for some users and not for others? Too late to debug now...

@shankari commented Jul 27, 2021

Re-running the pipeline to generate confirmed trips so we can build the model.

```
# ./e-mission-py.bash bin/intake_multiprocess.py 3
Connecting to database URL stage-db
google maps key not configured, falling back to nominatim
nominatim not configured either, place decoding must happen on the client
overpass not configured, falling back to default overleaf.de
transit stops query not configured, falling back to default
...
2021-07-27T14:42:19.965960+00:00**********UUID ae91f6bc-b25d-4d80-93ba-0159ac220901: storing views to cache**********
```

Next, building the model:

```
# ./e-mission-py.bash emission/analysis/modelling/tour_model_first_only/build_save_model.py > /tmp/build_model_jul_27.log 2>&1
# ls user_labels* | wc -l
19
# grep "doesn't have enough valid trips" /tmp/build_model_jul_27.log  | wc -l
20
```

Huh! On my laptop, I got:

```
$ ls user_labels_first_round_* | wc -l
      20
$ grep "doesn't have enough valid trips" /tmp/staging_run.log | wc -l
      21
```

Need to follow up on the discrepancy.

Running the pipeline once so we can fill in the inferred_trip and expected_trip objects. The confirmed_trips can be filled in during the regular run.

```
# ./e-mission-py.bash bin/intake_multiprocess.py 3
2021-07-27T14:55:19.880112+00:00**********UUID ae91f6bc-b25d-4d80-93ba-0159ac220901: storing views to cache**********
```

Resetting the confirmed trips so they will be regenerated on the next run.

```
# ./e-mission-py.bash bin/debug/reset_partial_label_testing.py -c
Connecting to database URL stage-db
{'n': 10296, 'ok': 1.0}
{'n': 5148, 'ok': 1.0}
{'n': 38, 'ok': 1.0}
```

Spot checking a couple of entries:

```
# ./e-mission-py.bash bin/debug/label_stats.py -u [redacted]
Connecting to database URL stage-db
All inferred trips 176
Inferred trips with inferences 0
All expected trips 0
Expected trips with inferences 0
all inferred trips 176
all confirmed trips 0
confirmed trips with inferred labels 0
confirmed trips without inferred labels 0
confirmed trips with expectation 0
```

Will double check after the next pipeline run to confirm that the confirmed trips are created.

@GabrielKS you can go ahead and test against staging now!

@shankari

For the same user, we now have:

```
(emission) root@a5b7fc6b3e7e:/usr/src/app/e-mission-server# ./e-mission-py.bash bin/debug/label_stats.py -u [redacted]
Connecting to database URL stage-db
All inferred trips 176
Inferred trips with inferences 0
All expected trips 352
Expected trips with inferences 352
all inferred trips 176
all confirmed trips 176
confirmed trips with inferred labels 0
confirmed trips without inferred labels 176
confirmed trips with expectation 176
```

@shankari

Spot checking several users: most of them don't have confirmed trips with labels.

Checking the aggregate:

```
>>> import emission.core.get_database as edb
Connecting to database URL stage-db
>>> edb.get_analysis_timeseries_db().count_documents({"metadata.key": "analysis/confirmed_trip", "data.inferred_labels": {"$ne": []}})
1
```

Whoa!! There's only one confirmed trip with inferred labels.

Maybe we fixed e-mission/e-mission-docs#654, so we shouldn't delete the expected trips any more without resetting the pipeline. Let's regenerate everything without the hack and see if that works.

@shankari commented Jul 28, 2021

@corinne-hcr Can you please merge this to your branch as well? It should fix the failing test....

@corinne-hcr corinne-hcr merged commit 0e1f88f into corinne-hcr:modeling_and_functions Jul 28, 2021