The bar graph shows the Q values, which indicate how good each of the two actions (jump and don't jump) is in one particular state/situation. The Q values are computed by a convolutional neural network that takes 4 consecutive screenshots as input. The model was trained by playing 1000 games on its own.
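To make the idea concrete, here is a rough sketch of how 4 consecutive screenshots could be stacked into one input and how an action could be picked from the two Q values. This is only an illustration and not the actual code behind the video: `FrameStack`, `greedy_action` and `fake_q_network` are made-up names, frames are simplified to flat lists of pixel values, and `fake_q_network` is a hand-written stand-in for the trained convolutional network that returns invented numbers.

```python
from collections import deque

ACTIONS = ("jump", "dont_jump")  # the two actions shown in the bar graph
STACK_SIZE = 4                   # 4 consecutive screenshots form one input


class FrameStack:
    """Keeps the most recent STACK_SIZE frames as the network input (hypothetical helper)."""

    def __init__(self, size=STACK_SIZE):
        self.frames = deque(maxlen=size)

    def push(self, frame):
        # On the very first frame, fill the stack with copies of it so the
        # input always contains `size` frames.
        if not self.frames:
            for _ in range(self.frames.maxlen):
                self.frames.append(frame)
        else:
            # deque(maxlen=...) drops the oldest frame automatically.
            self.frames.append(frame)
        return list(self.frames)


def greedy_action(q_values):
    """Return the action whose Q value is largest (ties go to the first action)."""
    best_index = max(range(len(q_values)), key=lambda i: q_values[i])
    return ACTIONS[best_index]


def fake_q_network(stacked_frames):
    # Assumption: stand-in for the trained CNN. It only looks at the newest
    # frame and returns invented Q values, one per action.
    newest = stacked_frames[-1]
    obstacle_visible = sum(newest) > 0
    return [1.0, 0.2] if obstacle_visible else [0.1, 0.9]


stack = FrameStack()
state = stack.push([0, 0, 0])   # empty screen
state = stack.push([0, 1, 0])   # an obstacle pixel appears
q_values = fake_q_network(state)
print(dict(zip(ACTIONS, q_values)), "->", greedy_action(q_values))
```

The two numbers in `q_values` correspond to the two bars in the graph: the taller bar is the action the agent would take in that state.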
https://www.youtube.com/watch?v=WSUFRITj02A&t=98s&ab_channel=geinezhang
https://www.freecodecamp.org/news/an-introduction-to-deep-q-learning-lets-play-doom-54d02d8017d8/
https://www.groundai.com/project/reinforcement-learning-and-video-games/1#Ch3.S4