BLE Summary with specific vehicles #1073

Open
JGreenlee opened this issue May 10, 2024 · 4 comments

Comments

@JGreenlee

JGreenlee commented May 10, 2024

The current implementation of ble_sensed_summary on e-mission-server mimics the format of cleaned_section_summary and inferred_section_summary; it looks like this:

{
  "count": {
    "CAR": 1
  },
  "distance": {
    "CAR": 20184.92261045545
  },
  "duration": {
    "CAR": 1772.7775580883026
  }
}

We'd like to know specifically which vehicle it was: instead of just "CAR", we want to know that it was "car_jacks_mazda3". So we talked about having 2 versions of the summary, with the second one looking like this:

{
  "count": {
    "car_jacks_mazda3": 1
  },
  "distance": {
    "car_jacks_mazda3": 20184.92261045545
  },
  "duration": {
    "car_jacks_mazda3": 1772.7775580883026
  }
}

But, if for example we wanted to calculate the carbon footprint based on the car's MPG, we'd still have to cross-reference with the dynamic config to find the vehicle that matches car_jacks_mazda3.
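To make that cross-referencing step concrete, here is a minimal sketch. It assumes (hypothetically) that the dynamic config exposes a list of vehicle entries with `value` and `kgCo2PerKm` fields, and that summary distances are in meters. The names `vehicle_modes` and `footprint_kg` are illustrative only, not actual e-mission identifiers.

```python
# Hypothetical excerpt of the dynamic config; the real structure may differ.
vehicle_modes = [
    {"value": "car_jacks_mazda3", "baseMode": "CAR", "kgCo2PerKm": 0.16777},
]

ble_sensed_summary = {
    "count": {"car_jacks_mazda3": 1},
    "distance": {"car_jacks_mazda3": 20184.92261045545},
    "duration": {"car_jacks_mazda3": 1772.7775580883026},
}

def footprint_kg(summary, config_vehicles):
    """Sum kg CO2 over all vehicles, assuming distances are in meters."""
    # Index the config by vehicle key so each lookup is a dict access.
    by_value = {v["value"]: v for v in config_vehicles}
    total = 0.0
    for vehicle_key, distance_m in summary["distance"].items():
        vehicle = by_value.get(vehicle_key)
        if vehicle is None:
            continue  # vehicle missing from the config; skipped in this sketch
        total += (distance_m / 1000) * vehicle["kgCo2PerKm"]
    return total

print(round(footprint_kg(ble_sensed_summary, vehicle_modes), 3))
```

The point of the sketch is that every consumer of the summary would need access to the config and would have to repeat this lookup.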

As an alternative, what if we used a different structure that allows us to have one unified summary (an array of modes, or "mode summary")? Then we could include the vehicle information in the summary itself.

[
  {
    "vehicle": {
      "value": "car_jacks_mazda3",
      "bluetooth_major_minor": ["dfc0:fff0"],
      "text": "Jack's Mazda 3",
      "baseMode":"CAR",
      "met_equivalent":"IN_VEHICLE",
      "kgCo2PerKm": 0.16777,
      "vehicle_info": {
        "type": "car",
        "license": "JHK ****",
        "make": "Mazda",
        "model": "3",
        "year": 2014,
        "color": "red",
        "engine": "ICE",
        "mpg": 33
      }
    },
    "count": 1,
    "distance": 20184.92261045545,
    "duration": 1772.7775580883026
  }
]

@shankari
Contributor

@JGreenlee interesting. The reason that we had this type of structure was the "count every trip" project, which added uncertainty to the metrics. And the reason the "count every trip" project had that structure, IIRC, was so that we could take a feature (like distance) and see its distribution across modes without having to iterate over sections. So if you wanted to get the primary mode, for example, you could do something like (trip['count'].idxmax()).

Having said that, transforming between the structures is not that hard (I think). I would suggest:

  • writing out what the code for that use case would look like (to verify that it is not too bad)
  • explaining how this fits within a trip, since an object ({'vehicle': ...}) cannot be a key
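As a rough illustration of how small the transformation could be, here is a sketch of converting in both directions. The function names are hypothetical, and it assumes that each entry in the array format carries a `vehicle` object whose `value` string can serve as the dictionary key (which sidesteps the problem of an object not being usable as a key).

```python
def modes_to_keyed(modes_summary):
    """Array-of-modes format -> {"count": {...}, "distance": {...}, ...}."""
    keyed = {}
    for entry in modes_summary:
        key = entry["vehicle"]["value"]
        for metric, amount in entry.items():
            if metric == "vehicle":
                continue
            keyed.setdefault(metric, {})[key] = amount
    return keyed

def keyed_to_modes(keyed_summary, vehicles_by_value):
    """Keyed format -> array of modes; needs a lookup to restore vehicle details."""
    keys = {k for per_mode in keyed_summary.values() for k in per_mode}
    modes = []
    for key in sorted(keys):
        # Fall back to a bare {"value": key} when no vehicle details are known.
        entry = {"vehicle": vehicles_by_value.get(key, {"value": key})}
        for metric, per_mode in keyed_summary.items():
            if key in per_mode:
                entry[metric] = per_mode[key]
        modes.append(entry)
    return modes

modes = [
    {"vehicle": {"value": "vehicle1"}, "count": 1, "distance": 800},
    {"vehicle": {"value": "vehicle2"}, "count": 2, "distance": 1300},
]
keyed = modes_to_keyed(modes)
print(keyed)
# {'count': {'vehicle1': 1, 'vehicle2': 2}, 'distance': {'vehicle1': 800, 'vehicle2': 1300}}

# With the keyed form, the primary mode is a max over one mapping.
primary = max(keyed["distance"], key=keyed["distance"].get)
print(primary)  # vehicle2
```

Going from the keyed form back to the array needs the vehicle details from somewhere else, which is the main asymmetry between the two formats.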

@JGreenlee
Author

JGreenlee commented May 14, 2024

If the confirmed trip is a dict, it would have a property ble_modes_summary whose value is an array of objects, each object representing a mode. The object contains 'vehicle' with vehicle info, alongside 'count', 'distance', and 'duration'.

To get the primary mode, we could use the max function on ble_modes_summary with 'distance' (or 'count') as the key.

confirmed_trip = {
  "ble_modes_summary": [
    {
      "vehicle": {
        "value": "vehicle1",
        # ... (other vehicle fields elided)
      },
      "count": 1,
      "distance": 800,
    },
    {
      "vehicle": {
        "value": "vehicle2",
      },
      "count": 2,
      "distance": 1300,
    },
  ]
}

primary_mode = max(confirmed_trip['ble_modes_summary'], key=lambda x: x['distance'])
print('primary vehicle is ' + primary_mode['vehicle']['value'])
primary vehicle is vehicle2

@shankari
Contributor

shankari commented May 14, 2024

ok, I think that there are only a couple more questions before we go ahead with this:

  • we need to have a backwards compatibility plan since we will need to rewrite all existing trips to the new format
    • we will need a script to do the rewrite (which we should test on both the data in emission/tests and on a couple of real dataset snapshots)
    • the script is likely to take a long time to run, at least for those of us who have been using the app for a really long time
    • in the meanwhile, existing code (primarily in the public dashboard) needs to handle both
  • we may need not just the max but also the actual distribution so that we can get the probabilities (e.g. the probability that the trip was CAR or BIKE or WALK, which feeds into the uncertainty which feeds into the error bars).
  • I think another challenge here is that we can get the primary mode for one trip at a time fairly easily, but for any of the dashboards, we will need to work on an aggregate basis and the previous format might work better for that. Let's think through it
import pandas as pd
df = pd.json_normalize(confirmed_trips)
df.columns

That will give us ble_section_summary.E_CAR in the old format and ble_modes_summary.vehicle.baseMode.CAR in the new one, so maybe not a huge deal wrt grouping or other post-processing.
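One caveat worth checking: as far as I know, pandas' JSON flattening expands nested dicts into dotted column names but leaves list-valued fields as a single column, so the array format may need an extra step before grouping. A dependency-free sketch of aggregating over trips while accepting both formats might look like the following. It assumes the field names `ble_sensed_summary` (existing) and `ble_modes_summary` (proposed) from this thread, and that the keys of the existing format correspond to `baseMode` values; the function name is hypothetical.

```python
from collections import defaultdict

def distance_by_base_mode(confirmed_trips):
    """Total distance per base mode, accepting either summary format."""
    totals = defaultdict(float)
    for trip in confirmed_trips:
        if "ble_modes_summary" in trip:  # proposed array-of-modes format
            for entry in trip["ble_modes_summary"]:
                totals[entry["vehicle"]["baseMode"]] += entry["distance"]
        elif "ble_sensed_summary" in trip:  # existing keyed format
            for mode, distance in trip["ble_sensed_summary"]["distance"].items():
                totals[mode] += distance
    return dict(totals)

trips = [
    {"ble_sensed_summary": {"count": {"CAR": 1}, "distance": {"CAR": 1000.0}}},
    {"ble_modes_summary": [
        {"vehicle": {"value": "vehicle1", "baseMode": "CAR"},
         "count": 1, "distance": 800.0},
        {"vehicle": {"value": "vehicle2", "baseMode": "E_CAR"},
         "count": 2, "distance": 1300.0},
    ]},
]
print(distance_by_base_mode(trips))
# {'CAR': 1800.0, 'E_CAR': 1300.0}
```

A reader like this could also serve as the interim compatibility layer while the rewrite script is still running over older trips.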

Is your proposal to only change this for the ble modes, or for the cleaned and inferred modes as well? I would prefer to have the same structure for all the *summary entries, although of course that will make the migration take longer. And it would also take more effort to generate the probability distributions above.
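To gauge the extra effort for the distributions, here is a small sketch. It assumes, purely for illustration, that the "probability" of a mode is its share of the trip's distance; the actual uncertainty computation is not described in this thread and may well differ.

```python
def mode_distribution(amount_by_mode):
    """Normalize a {mode: amount} mapping into shares that sum to 1."""
    total = sum(amount_by_mode.values())
    if total == 0:
        return {mode: 0.0 for mode in amount_by_mode}
    return {mode: amount / total for mode, amount in amount_by_mode.items()}

# Keyed format: the per-metric mapping is already available.
keyed = {"distance": {"CAR": 1500.0, "BIKE": 400.0, "WALK": 100.0}}
print(mode_distribution(keyed["distance"]))
# {'CAR': 0.75, 'BIKE': 0.2, 'WALK': 0.05}

# Array format: one extra step to build the mapping first.
modes = [
    {"vehicle": {"baseMode": "CAR"}, "distance": 1500.0},
    {"vehicle": {"baseMode": "BIKE"}, "distance": 400.0},
    {"vehicle": {"baseMode": "WALK"}, "distance": 100.0},
]
by_mode = {m["vehicle"]["baseMode"]: m["distance"] for m in modes}
print(mode_distribution(by_mode))
```

Under these assumptions the array format costs one extra comprehension per metric, and the dict comprehension would need to sum rather than overwrite if two vehicles shared a base mode.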

@JGreenlee do you have thoughts on what the same structure would look like for the cleaned and inferred section summaries?

@JGreenlee
Author

JGreenlee commented May 28, 2024

{
  "count": {
    "car_jacks_mazda3": 1,
    ...,
  },
  "distance": {
    "car_jacks_mazda3": 20184.92261045545,
    ...,
  },
  "duration": {
    "car_jacks_mazda3": 1772.7775580883026,
    ...,
  },
  "vehicles": {
    "car_jacks_mazda3": {
      "value": "car_jacks_mazda3",
      "bluetooth_major_minor": ["dfc0:fff0"],
      "text": "Jack's Mazda 3",
      "baseMode":"CAR",
      "met_equivalent":"IN_VEHICLE",
      "kgCo2PerKm": 0.16777,
      "vehicle_info": {
        "type": "car",
        "license": "JHK ****",
        "make": "Mazda",
        "model": "3",
        "year": 2014,
        "color": "red",
        "engine": "ICE",
        "mpg": 33
      }
    },
    ...,
  }
}
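
A brief sketch of how this combined structure could be consumed, using a trimmed copy of the example above. It assumes distances are in meters and that `kgCo2PerKm` is the field to use for the footprint; neither is confirmed in this thread.

```python
summary = {
    "count": {"car_jacks_mazda3": 1},
    "distance": {"car_jacks_mazda3": 20184.92261045545},
    "duration": {"car_jacks_mazda3": 1772.7775580883026},
    "vehicles": {
        "car_jacks_mazda3": {
            "value": "car_jacks_mazda3",
            "text": "Jack's Mazda 3",
            "baseMode": "CAR",
            "kgCo2PerKm": 0.16777,
        },
    },
}

# Primary mode: max over one per-metric mapping, as with the existing format.
primary_key = max(summary["distance"], key=summary["distance"].get)
primary_vehicle = summary["vehicles"][primary_key]
print("primary vehicle is " + primary_vehicle["text"])

# Footprint: vehicle details come from the summary itself, not the config.
footprint = sum(
    (meters / 1000) * summary["vehicles"][key]["kgCo2PerKm"]
    for key, meters in summary["distance"].items()
)
print(round(footprint, 3))
```

The per-metric mappings keep their current shape, so existing code that reads `count`, `distance`, or `duration` would only need to tolerate the additional `vehicles` key.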
