Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add option to sync frame delta after draw #48555

Merged
merged 1 commit into from
Jul 22, 2021

Conversation

lawnjelly
Copy link
Member

@lawnjelly lawnjelly commented May 8, 2021

Investigations have showed that a lot of the random variation in frame deltas causing glitches may be due to sampling the time at a sub optimal place in the game loop.

Although sampling at the start of Main::Iteration makes logical sense, the most consistent deltas may be better measured after the location likely to block at vsync - either the OpenGL draw commands or the SwapBuffers.

Here we add an experimental setting to allow syncing after the OpenGL draw section of Main::Iteration.

Notes

  • I believe this should be worth trying in at least one of the betas for 3.2.4 as an experiment.
  • It is a very simple change but could potentially help a lot, based on the results of testing in Add frame delta smoothing option #48390
  • The reasoning is that the blocking for vsync is likely to occur during OpenGL calls, either drawing or swapping buffers. The point where these functions return may provide the most consistent synchronisation with the actual vsyncs (albeit delayed). The actual situation may depend on the OS and drivers, so this is based on conjecture and empirical evidence so far, hence the desire to test it further.
  • There is as yet no mechanism for changing the setting at runtime, or overriding with the command line, these could be added if desired.
  • Also valid for 4.x, but in the interests of getting some testing would be more practical in a 3.x beta.
  • There are some trade offs involved here. In return for a more consistent delta, we are potentially susceptible to a large gap between the measured time and the start of the next iteration. However the PR accounts for this and reverts to a new time sample if the gap is longer than 1/10th second.

Investigations have showed that a lot of the random variation in frame deltas causing glitches may be due to sampling the time at the wrong place in the game loop.

Although sampling at the start of Main::Iteration makes logical sense, the most consistent deltas may be better measured after the location likely to block at vsync - either the OpenGL draw commands or the SwapBuffers.

Here we add an experimental setting to allow syncing after the OpenGL draw section of Main::Iteration.
@akien-mga akien-mga merged commit 0b4080b into godotengine:3.x Jul 22, 2021
@akien-mga
Copy link
Member

Thanks!

@Calinou
Copy link
Member

Calinou commented Jul 22, 2021

Should the setting be enabled by default during the betas? This way, we can get some real world testing on it.

@lawnjelly
Copy link
Member Author

Should the setting be enabled by default during the betas? This way, we can get some real world testing on it.

We did set on for the default for delta_smoothing, but for the sync after draw it is very much an experiment at this stage. I'm personally happy for it to default to off, I think we'll get enough feedback from guys who are interested to decide whether it can help or not, and whether we should pursue it longterm.

@lawnjelly lawnjelly deleted the delta_sync branch July 22, 2021 11:51
@oeleo1
Copy link

oeleo1 commented Jul 28, 2021

Catching up on the work done with great interest after the 3.4beta2 release. On this topic and on the delta smoothing thing, I wonder what is a sensible way of evaluating the result. It looks like the appreciation boils down to the visual perception of some project, which is subjective.

Would it be possible to let Godot measure that for us more precisely and report via debug options? The smoothing logic goes to great length at computing things, so some additional debug on std deviation, counting spikes above a threshold and the like be detected and reported for a configurable time window would probably be a good additional debug toolset. And you would be able to appreciate the results by getting those numbers.

Then I read the discussion about (verbose) logging possibly getting in the way. Some guidance on how to test (relevant options on/off) would be nice.

I’d be happy to test this thing but am am really unsure how I would report anything objective on both subjects beyond the obvious “I can see better with glasses than without” 😊 which I can hardly quantify any further. So again, what is a sensible way of testing these?

@lawnjelly
Copy link
Member Author

lawnjelly commented Jul 28, 2021

There are some debug options for the smoothing that require a recompile. I did consider putting in some kind of debugging output, or at least a way of querying whether the smoothing was active (sync after draw is always active if you switch it on).

Two points though:

  • Adding debug logging for this sort of thing is just another option to confuse users. There's a real danger of 'choice fatigue' if we provide too many project settings etc. Although it could be available for e.g. a beta. There's no in built support in 3.x for novice / advanced options, which would make this kind of thing easier.
  • It is fairly simple to make logs in gdscript, see below.
var samples = []
var num_samples = 0
const MAX_SAMPLES = 60

func _ready():
	# blank samples
	for i in range (MAX_SAMPLES):
		samples.push_back(0)

func _process(delta):
	samples[num_samples] = delta
	num_samples += 1
	
	# print the samples every so often
	if (num_samples >= MAX_SAMPLES):
		for i in range (MAX_SAMPLES):
			print(str(samples[i]))
		num_samples = 0

Some typical output on my system is:

0.12377 // first frame, often a bit longer because of loading etc
0.017169
0.007456
0.012077
0.020519
0.015804
0.009845
0.012863
0.01156
0.011319
0.007181
0.017991
0.015617
0.017006
0.016425
0.017378
0.016379
0.016408
0.016849
0.018162
0.022388
0.00928
0.016624
0.016703
0.016624
0.016701
0.016701
0.016672
0.016704
0.017041
0.016515
0.016778
0.016543
0.016697
0.016737
0.01658
0.016931
0.016408

Then once it has settled and estimated the refresh rate, here are the deltas:

0.016666
0.016666
0.016666
0.016666
0.016666
0.016666
0.016666
0.016666
0.016666
0.016666
etc

You can try running this with delta smoothing on / off and sync after draw on / off (and combinations). If sync after draw gives more consistent deltas even with smoothing off, then it is helping on your system.

If you aren't getting a consistent delta after say 30 seconds with delta smoothing, then it hasn't settled and your system doesn't have a consistent enough refresh rate - for example you have variable refresh rate, or you are not keeping up to the vsync consistently.

Bear in mind also that the actual act of printing the logs will affect the deltas at that point in time, so you might want to sample over a larger number of frames, and ignore results which are clearly wildly out.

@belzecue
Copy link

belzecue commented Jul 31, 2021

Inconclusive results from me, testing 3.4B2 with various delta_after_draw/delta_smoothing/jitter_fix combos against a 3D gridmap maze project on my Intel NUC (BOXNUC7PJYH4, Linux Ubuntu/MATE 20.04.2, Mesa Intel UHD Graphics 605 GLK 3), at fullscreen 960x540 with VSync.

The only thing I might infer from this limited testing is that the combo [jitterFix_0.5, deltaSmoothing_OFF, deltaAfterDraw_ON] seems to be worst performing on my NUC rig. And curiously, my best run was with the old settings: [jitterFix_0.5 deltaSmoothing_OFF deltaAfterDraw_OFF].

By the way, I'm not sure how the metrics can show me averaging 60 fps when _process seems to be averaging way above 16 ms in most of these tests. That doesn't make much sense to me. UPDATE: Oh, right, I see it. It's the huge fps lag spike at start of runtime throwing the _process averages off, presumably.

EDIT: chart snapshots removed because they were not reflective of results. See my updated chart link in a later comment.

@Calinou
Copy link
Member

Calinou commented Jul 31, 2021

UPDATE: Oh, right, I see it. It's the huge fps lag spike at start of runtime throwing the _process averages off, presumably.

We should probably make it so the first 2 seconds of runtime are always ignored in the profiler and when using --print-fps. Things generally stabilize starting from the 3rd second.

@lawnjelly
Copy link
Member Author

lawnjelly commented Jul 31, 2021

I just checked and the debug monitor doesn't display the delta that is sent in to the game, it is using a separately measured delta. So any graphs above should not show anything different (or at least nothing relevant) - the profiled delta will be full of the usual random fluctuations.

That actually might be something worth changing. I have to admit I didn't even look at this section (the profiling code is below where it does the actual ticks), or consider whether it would be worth changing at the time - my bad, this was an oversight.

Currently if you want to see the actual delta supplied to the game at the moment, you need to use a script like the one I posted earlier.

Let me know thoughts on this, it would be fairly easy to get the profiler showing the smoothed input delta if that is desired. 👍 🙂

@belzecue
Copy link

belzecue commented Aug 1, 2021

I'm a big fan of both ideas!

A Project parameter to set the e.g. "Monitoring delay start" (default 2s) would let users adjust for each project, where some projects need no delay and others need some delay to account for ignore-able startup activities that would otherwise skew the monitor charting.

Adding smoothed delta to the Monitor charts, if that feature is active in Project settings, is of course desirable because what good is Monitoring if it's not accurate?

Here are my corrected results for same setup described earlier, using the logging script.

https://docs.google.com/spreadsheets/d/e/2PACX-1vSnJ9pPBYSWIm5BMTnHcBBoJoNSEKeQiemqFztjfOlsQv1ZJ37XPpVi-pcbSoUO9ZHwrs_1D6Lutg4t/pubhtml

Conclusions for my testing:

  • Delta After Draw seemed to have slightly worse result whenever switched on
  • Best result was using Jitter Fix 0.5 (to smooth initial startup) and Delta Smoothing

It's pretty clear that the new Delta Smoothing has a PROFOUND effect on getting uniform delta.

@lawnjelly
Copy link
Member Author

lawnjelly commented Aug 1, 2021

Yes the sync_after_draw is very much an experiment (which is why we didn't go with enabling it by default). It did improve things on my machine but I'm conscious the result may vary depending on the OS etc. The best way to measure the sync_after_draw is with the delta_smoothing off, as the smoothing will often just hide any of the effects.

The idea with sync_after_draw is to see if it can be used to reduce (not remove) some of the variance in deltas - to see whether it is a better spot to take the sample.

Yeah when delta_smoothing works, it really works. There are two things to bear in mind:

  • It won't activate for everyone, probably only fixed vsync monitors, with machines that are powerful enough to consistently hit vsync.
  • We are watching for any situations where it activates where it shouldn't, and people experience something like more glitch / stutters than with it off. I went to some lengths to prevent this happening, but it is just possible there is some scenario I missed which could be corrected for.

Your results

They are quite interesting, mostly in line with expectations .. this would be my interpretation:

Delta after draw

I'm not sure there is a significant difference on your machine (I don't know whether those extra peaks are just sampling error, although it could be the other sampling point is slightly better in this case). The most likely explanation is that your GPU / OS does not block after the SwapBuffers call (moving the sync point to here will presumably only help if the blocking occurs there, rather than randomly).

Smoothing

Works as expected.

jitter_chart

@lawnjelly
Copy link
Member Author

lawnjelly commented Aug 1, 2021

See the PR above for a little more discussion. I fixed a small bug in the frame_time, and started converting it to profile the smoothed deltas but it turned out to be a bigger job than I wanted to spend time on today, as it needs to pack / unpack the data to the remote debugger and I'm not familiar with that code.

So for now it remains that the profiler measures wall clock timings, not the actual deltas passed to _process. This is in keeping with what profilers usually do, however in this case I can see how it would be useful for us to profile the smoothed deltas in addition to the the wall clock deltas. Hopefully myself or another will get to this in a later PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants