Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement label inference pipeline #825

Merged
merged 8 commits into from
Jul 8, 2021
Merged

Conversation

GabrielKS
Copy link
Contributor

  • Adds a stage to the intake pipeline to perform label inference
  • Includes several placeholder label inference algorithms to test the system
  • Unit tests coming soon!

+ Includes a placeholder prediction algorithm returning dummy data.
+ Adds inferred labels to confirmed trips.

Testing done:
Ensured that placeholder label predictions were being added to confirmed
trip objects. No client-side work or testing done yet.
+ Adds two more placeholder label inference "algorithms" to facilitate
  front-end testing
Copy link
Contributor

@shankari shankari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me overall except the questions about the prediction structure and the matching of the labels and the confirmed trips.

emission/core/wrapper/labelprediction.py Outdated Show resolved Hide resolved
emission/analysis/userinput/matcher.py Outdated Show resolved Hide resolved
+ The label inference pipeline.py now handles multiple algorithms less
  awkwardly and has a place for combination of multiple predictions into
  an ensemble.
+ The pipeline now stores each algorithm's prediction and a final
  (perhaps ensemble) prediction in different places

Tested by confirming that client-side behavior is the same as before.
+ The label inference pipeline now reads confirmed trips instead of
  cleaned trips
+ It then writes inference data directly to confirmed trips instead of
  relying on matcher.py to find the data in the database

Tested by confirming that client-side behavior is the same as before.
@shankari
Copy link
Contributor

shankari commented Jul 2, 2021

The other changes look good.

The inference pipeline stage now takes cleaned trips and outputs
"inferred trips," which the CREATE_CONFIRMED_OBJECTS stage then accepts.

Tested by confirming that the client-side behavior is the same.
@GabrielKS
Copy link
Contributor Author

Et voilà, no ts.update. Everything is very pretty now. :)

@shankari
Copy link
Contributor

shankari commented Jul 7, 2021

This looks fine to me. Are you going to include unit tests now, or in a later PR?

@GabrielKS
Copy link
Contributor Author

I'll write unit tests now.

+ TestLabelInferencePipeline.py now tests the plumbing of labels.pipeline.py.
  It does not test the actual inference algorithms (which have not been implemented for real yet).
+ The pipeline code has been reworked slightly to fix bugs illuminated
  by the unit tests and to make placeholder_predictor_2 deterministic

Tested by running the unit tests and by confirming that client-side behavior is the same
Copy link
Contributor

@shankari shankari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants