This repository has been archived by the owner on Mar 30, 2023. It is now read-only.

Pervasive lag issue with label/milestone changes in issues and PRs #78

Closed
jberkus opened this issue Mar 1, 2018 · 59 comments

Comments

@jberkus
Collaborator

jberkus commented Mar 1, 2018

Lukasz,

If you look here:
https://k8s.devstats.cncf.io/d/IIUa5kezk/open-issues-prs-by-milestone?orgId=1&from=1509407831268&to=1511830631269&var-sig_name=All&var-sig=all&var-milestone_name=v1.9&var-milestone=v1_9&var-repo_name=kubernetes%2Fkubernetes&var-repo=kubernetes_kubernetes&var-full_name=Kubernetes

... it says that as of Nov 27, we had 90ish open issues against 1.9. However, if you look at the burndown report, we actually had 28 issues open on that date.

What's the reason for the extremely different counts?

@lukaszgryglicki
Member

Looking into that.
Will let you know what I find.

@lukaszgryglicki
Member

lukaszgryglicki commented Mar 1, 2018

At first glance all seems OK.
There were 96 issues open at that time.
We're selecting SIG=All, so SIG is not taken into account.
Issues open at 2017-11-27 (repo=k/k, milestone=v1.9) were:

./runq x.sql {{to}} 2017-11-27
/---------+------+--------------------+--------------------+---------\
|issue_id |number|opened_at           |closed_at           |milestone|
+---------+------+--------------------+--------------------+---------+
|257517249|52444 |2017-09-13T20:51:41Z|                    |v1.9     |
|276181000|56242 |2017-11-22T19:26:59Z|2017-11-27T15:20:04Z|v1.9     |
|258657089|52683 |2017-09-19T00:06:23Z|                    |v1.9     |
|242791185|48893 |2017-07-13T18:32:20Z|                    |v1.9     |
|240004664|48396 |2017-07-02T11:28:01Z|                    |v1.9     |
|142012943|23233 |2016-03-19T01:27:59Z|                    |v1.9     |
|260790217|53084 |2017-09-26T22:30:35Z|2018-02-21T18:22:57Z|v1.9     |
|249544284|50495 |2017-08-11T05:50:42Z|                    |v1.9     |
|274977067|55967 |2017-11-17T19:48:34Z|2018-02-26T12:05:05Z|v1.9     |
|224607650|44975 |2017-04-26T21:40:07Z|                    |v1.9     |
|233429070|46934 |2017-06-04T11:57:01Z|2018-02-27T21:04:49Z|v1.9     |
|274039875|55768 |2017-11-15T05:33:05Z|                    |v1.9     |
|276180396|56241 |2017-11-22T19:24:28Z|2017-11-27T15:19:42Z|v1.9     |
|143491681|23479 |2016-03-25T12:42:12Z|                    |v1.9     |
|233673871|46983 |2017-06-05T18:41:13Z|                    |v1.9     |
|258951594|52745 |2017-09-19T20:22:04Z|                    |v1.9     |
|263574915|53548 |2017-10-06T21:20:01Z|2017-12-01T00:29:33Z|v1.9     |
|271678686|55194 |2017-11-07T01:23:14Z|2017-12-08T08:02:50Z|v1.9     |
|261503927|53236 |2017-09-29T01:05:05Z|2017-12-15T01:46:35Z|v1.9     |
|142002272|23225 |2016-03-18T23:33:23Z|                    |v1.9     |
|254753537|51825 |2017-09-01T21:16:18Z|                    |v1.9     |
|276173709|56239 |2017-11-22T18:57:52Z|2017-11-27T14:25:25Z|v1.9     |
|216652325|43607 |2017-03-24T05:02:08Z|                    |v1.9     |
|254492560|51746 |2017-08-31T23:04:25Z|                    |v1.9     |
|261236330|53188 |2017-09-28T08:37:00Z|                    |v1.9     |
|249961544|50599 |2017-08-14T08:17:01Z|2018-02-12T13:15:21Z|v1.9     |
|267271142|54318 |2017-10-20T18:55:07Z|                    |v1.9     |
|260853556|53109 |2017-09-27T05:45:02Z|                    |v1.9     |
|255366506|51965 |2017-09-05T18:26:03Z|2018-02-09T05:34:41Z|v1.9     |
|255714945|52039 |2017-09-06T19:22:00Z|                    |v1.9     |
|188847316|36666 |2016-11-11T20:47:13Z|                    |v1.9     |
|263211016|53497 |2017-10-05T17:48:56Z|                    |v1.9     |
|275491269|56091 |2017-11-20T20:38:46Z|2017-11-28T09:51:58Z|v1.9     |
|137393454|22212 |2016-02-29T22:10:23Z|                    |v1.9     |
|135339370|21657 |2016-02-22T07:23:55Z|                    |v1.9     |
|215912236|43486 |2017-03-21T23:45:56Z|                    |v1.9     |
|81132222 |8830  |2015-05-26T20:53:41Z|                    |v1.9     |
|247541182|50046 |2017-08-02T22:27:57Z|                    |v1.9     |
|250551665|50752 |2017-08-16T08:38:21Z|                    |v1.9     |
|243478249|49038 |2017-07-17T18:04:07Z|                    |v1.9     |
|266227745|54088 |2017-10-17T18:17:32Z|2018-02-26T17:43:10Z|v1.9     |
|276182336|56244 |2017-11-22T19:32:06Z|2017-12-14T01:09:00Z|v1.9     |
|275371939|56061 |2017-11-20T14:22:04Z|2017-11-27T08:05:51Z|v1.9     |
|254169942|51665 |2017-08-31T00:01:34Z|                    |v1.9     |
|256581518|52258 |2017-09-11T04:55:02Z|2017-12-14T05:25:53Z|v1.9     |
|236263512|47604 |2017-06-15T17:37:37Z|                    |v1.9     |
|246472365|49820 |2017-07-28T22:20:24Z|2017-12-08T02:27:38Z|v1.9     |
|89361997 |10045 |2015-06-18T18:20:39Z|                    |v1.9     |
|245021139|49480 |2017-07-24T09:28:06Z|2017-12-15T17:22:02Z|v1.9     |
|251517777|50986 |2017-08-20T22:03:06Z|2017-12-16T17:33:42Z|v1.9     |
|256085955|52123 |2017-09-07T22:12:47Z|                    |v1.9     |
|254492984|51747 |2017-08-31T23:07:09Z|                    |v1.9     |
|248283779|50215 |2017-08-07T00:50:53Z|2018-02-17T01:31:06Z|v1.9     |
|270182138|54904 |2017-11-01T03:22:32Z|2017-12-16T17:33:39Z|v1.9     |
|260361499|53006 |2017-09-25T17:51:13Z|                    |v1.9     |
|251925100|51099 |2017-08-22T11:28:55Z|2018-01-07T18:21:50Z|v1.9     |
|38003437 |489   |2014-07-16T17:08:14Z|                    |v1.9     |
|268441314|54574 |2017-10-25T15:24:34Z|2017-12-04T02:50:00Z|v1.9     |
|258641145|52678 |2017-09-18T22:31:42Z|2017-12-15T18:27:47Z|v1.9     |
|246135911|49734 |2017-07-27T18:54:20Z|                    |v1.9     |
|275776261|56155 |2017-11-21T16:28:18Z|2017-12-01T03:25:43Z|v1.9     |
|276707674|56357 |2017-11-24T22:49:37Z|2018-02-13T17:10:46Z|v1.9     |
|120494919|18233 |2015-12-04T21:59:00Z|2017-12-17T14:25:59Z|v1.9     |
|224611269|44976 |2017-04-26T21:55:49Z|2018-01-05T19:07:43Z|v1.9     |
|264036039|53615 |2017-10-09T21:46:11Z|                    |v1.9     |
|262875129|53451 |2017-10-04T17:52:05Z|                    |v1.9     |
|276181978|56243 |2017-11-22T19:30:50Z|2017-11-27T15:20:12Z|v1.9     |
|253886104|51594 |2017-08-30T05:55:42Z|2018-01-09T21:06:53Z|v1.9     |
|230543993|46255 |2017-05-22T23:11:29Z|                    |v1.9     |
|234274279|47131 |2017-06-07T16:49:20Z|2018-02-12T17:08:57Z|v1.9     |
|226416047|45385 |2017-05-04T21:37:36Z|2018-01-08T14:53:46Z|v1.9     |
|268341485|54551 |2017-10-25T10:13:48Z|2017-12-07T12:44:14Z|v1.9     |
|258902409|52735 |2017-09-19T17:31:12Z|                    |v1.9     |
|274574803|55892 |2017-11-16T16:17:09Z|2017-11-29T02:24:49Z|v1.9     |
|276656324|56348 |2017-11-24T16:07:23Z|                    |v1.9     |
|254420606|51726 |2017-08-31T18:04:53Z|                    |v1.9     |
|254487869|51745 |2017-08-31T22:36:42Z|                    |v1.9     |
|262507676|53395 |2017-10-03T17:04:35Z|2018-02-26T06:00:50Z|v1.9     |
|257343867|52412 |2017-09-13T11:16:14Z|2018-02-23T05:40:37Z|v1.9     |
|246459494|49817 |2017-07-28T21:06:56Z|2018-02-22T19:29:08Z|v1.9     |
|260418089|53020 |2017-09-25T21:17:33Z|                    |v1.9     |
|259375860|52827 |2017-09-21T05:02:02Z|                    |v1.9     |
|251782459|51049 |2017-08-21T21:59:02Z|                    |v1.9     |
|37915597 |473   |2014-07-15T19:08:45Z|2018-01-18T15:36:28Z|v1.9     |
|261237806|53189 |2017-09-28T08:42:04Z|                    |v1.9     |
|261504871|53237 |2017-09-29T01:12:42Z|2017-12-22T13:16:25Z|v1.9     |
|275025437|55978 |2017-11-17T23:20:50Z|2017-11-27T23:11:27Z|v1.9     |
|238030823|47943 |2017-06-23T03:25:34Z|2018-01-12T16:02:13Z|v1.9     |
|206339935|41161 |2017-02-08T22:07:11Z|2018-01-29T14:47:09Z|v1.9     |
|276697465|56355 |2017-11-24T20:53:55Z|2017-11-28T00:53:09Z|v1.9     |
|243130299|48968 |2017-07-14T22:49:30Z|                    |v1.9     |
|217737938|43783 |2017-03-29T01:20:33Z|2018-01-26T02:29:31Z|v1.9     |
|276170518|56235 |2017-11-22T18:45:33Z|2017-11-28T00:04:25Z|v1.9     |
|219746769|44118 |2017-04-05T23:42:57Z|                    |v1.9     |
|276243348|56262 |2017-11-23T01:02:31Z|2017-11-28T21:08:11Z|v1.9     |
|261426202|53221 |2017-09-28T19:01:16Z|                    |v1.9     |
\---------+------+--------------------+--------------------+---------/

I need to check them manually to see what is happening.

@lukaszgryglicki
Member

First of them: kubernetes/kubernetes#52444
It had a final milestone set here: kubernetes/kubernetes#52444 (comment)
But I don't see a milestone on this issue now:
[Screenshot, 2018-03-01 07:14:08]

So the question is:
Is this issue in the v1.9 milestone or not?
I can see the milestone being added, but the issue's final state is "no milestone".
This is confusing.
My dashboard checks the final state (open/closed), SIG label and milestone for a given day.
The database says that this issue had milestone v1.9 at that time.

@lukaszgryglicki
Member

This is the final issues list (links).
I'll check them and write up a summary. For now I've triple-checked my SQL and I think all is fine :/
|52444|
|56242|
|52683|
|48893|
|48396|
|23233|
|53084|
|50495|
|55967|
|44975|
|46934|
|55768|
|56241|
|23479|
|46983|
|52745|
|53548|
|55194|
|53236|
|23225|
|51825|
|56239|
|43607|
|51746|
|53188|
|50599|
|54318|
|53109|
|51965|
|52039|
|36666|
|53497|
|56091|
|22212|
|21657|
|43486|
|8830 |
|50046|
|50752|
|49038|
|54088|
|56244|
|56061|
|51665|
|52258|
|47604|
|49820|
|10045|
|49480|
|50986|
|52123|
|51747|
|50215|
|54904|
|53006|
|51099|
|489 |
|54574|
|52678|
|49734|
|56155|
|56357|
|18233|
|44976|
|53615|
|53451|
|56243|
|51594|
|46255|
|47131|
|45385|
|54551|
|52735|
|55892|
|56348|
|51726|
|51745|
|53395|
|52412|
|49817|
|53020|
|52827|
|51049|
|473 |
|53189|
|53237|
|55978|
|47943|
|41161|
|56355|
|48968|
|43783|
|56235|
|44118|
|56262|
|53221|

@lukaszgryglicki
Member

The second one has the v1.9 milestone and is closed now, but it was closed at 2017-11-27T15:20:04Z, which is after the "date to" of 2017-11-27. So it was open at "date to" and had milestone v1.9, which means this issue is counted correctly.
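
The open-at-date check described above can be sketched roughly as follows. This is only an illustration in Python, not the dashboard's actual SQL; the names `parse_ts` and `was_open_at` are hypothetical, and it assumes that a "date to" of 2017-11-27 means midnight UTC on that day.

```python
from datetime import datetime, timezone
from typing import Optional


def parse_ts(value: str) -> datetime:
    """Parse a timestamp in the 'YYYY-MM-DDTHH:MM:SSZ' form used in the table."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def was_open_at(opened_at: str, closed_at: Optional[str], date_to: datetime) -> bool:
    """An issue counts as open at date_to if it was opened on or before that
    moment and was either never closed or closed strictly after it."""
    if parse_ts(opened_at) > date_to:
        return False
    return closed_at is None or parse_ts(closed_at) > date_to


# Assumption: "date to" 2017-11-27 is interpreted as midnight UTC.
date_to = datetime(2017, 11, 27, tzinfo=timezone.utc)

# Issue 56242 from the table: closed later that same day, so still open at date_to.
print(was_open_at("2017-11-22T19:26:59Z", "2017-11-27T15:20:04Z", date_to))  # True
```

Under that assumption, every issue closed at any time on 2017-11-27 still counts as open for that day.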

@lukaszgryglicki
Member

But the third one had the v1.9 milestone, which was later removed by the bot: "k8s-merge-robot removed this from the v1.9 milestone on Oct 9, 2017", which is before the "date to" of 2017-11-27.

This may be the bug.
My code detects the final milestone before "date to", but I see that it does not detect whether that milestone was removed afterwards!

@lukaszgryglicki
Member

This one is very interesting.
It was on v1.9.
Then the bot removed v1.9 (which I'm not detecting).
And after "date to" it received v1.10.
So the final milestone before 2017-11-27 was v1.9, but the bot removed it.
So I need to add detection of removed milestones. Still, the first issue had a final v1.9 applied and the GitHub UI shows no sign that the milestone was removed, yet the issue has no milestone.
I need to analyse all events for this issue (in the GHA database).

@lukaszgryglicki
Member

The other (potential) issue could be:

  • Some final SIG label was applied (I'm taking the last SIG label before "date to")
  • But after that last SIG label was applied, it could have been removed (still before "date to")
  • Not a problem in this bug (we're talking about SIG: All in this case), but it can potentially alter SIG values.

Detecting removed labels is handled here
Seems like I should do something similar here, for SIG and milestone.
This is quite complex and will take some time. I'll post my results here.
In any case, a full data regeneration will be needed.

@lukaszgryglicki
Member

In the first case I don't see any milestone removal in the GitHub UI, but the milestone was indeed removed. The database contains the full history:

gha=# select e.created_at, i.milestone_id from gha_issues i, gha_events e where i.event_id = e.id and i.id = 257517249 order by e.created_at;
     created_at      | milestone_id 
---------------------+--------------
 2017-09-13 20:51:44 |             
 2017-09-13 21:44:09 |             
 2017-09-14 01:27:22 |             
 2017-09-14 12:35:33 |      2545392
 2017-09-18 18:27:15 |      2545392
 2017-09-18 18:30:52 |      2422217
 2017-10-05 22:42:03 |      2422217
 2017-10-07 08:14:47 |      2422217
 2017-10-08 08:25:02 |      2422217
 2017-10-09 18:49:32 |      2422217
 2017-10-11 08:24:05 |      2422217
 2017-10-12 17:29:14 |      2422217
 2017-10-14 08:21:59 |      2422217
 2017-10-15 08:24:21 |      2422217
 2017-10-16 08:25:36 |      2422217
 2017-10-18 00:09:01 |      2422217
 2017-10-18 19:33:45 |      2422217
 2017-10-20 08:26:12 |      2422217
 2017-10-22 08:23:17 |      2422217
 2017-10-23 08:26:04 |      2422217
 2017-10-24 08:27:20 |      2422217
 2017-10-25 08:29:44 |      2422217
 2017-10-27 08:24:29 |      2422217
 2017-10-30 08:31:40 |      2422217
 2017-11-01 08:21:30 |      2422217
 2017-11-02 08:23:19 |      2422217
 2017-11-04 08:19:49 |      2422217
 2017-11-06 08:21:14 |      2422217
 2017-11-07 08:24:28 |      2422217
 2017-11-08 14:23:48 |      2422217
 2017-11-08 14:24:32 |      2422217
 2017-11-08 14:28:46 |      2422217
 2017-11-08 15:12:58 |             
 2017-11-08 15:23:26 |             
 2017-11-09 00:43:22 |             
 2018-02-07 19:35:41 |             
(36 rows) 
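
Reading a history like this "as of" a given date amounts to taking the last recorded row at or before that date. Below is a minimal Python sketch of that lookup, using a subset of the rows above; the function name `milestone_at` is hypothetical and this is not the dashboard's actual query.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple

# (created_at, milestone_id) pairs, a subset of the rows above; None = no milestone.
history: List[Tuple[str, Optional[int]]] = [
    ("2017-09-13 20:51:44", None),
    ("2017-09-14 12:35:33", 2545392),
    ("2017-09-18 18:30:52", 2422217),
    ("2017-11-08 14:28:46", 2422217),
    ("2017-11-08 15:12:58", None),
    ("2018-02-07 19:35:41", None),
]


def milestone_at(rows: List[Tuple[str, Optional[int]]], date_to: str) -> Optional[int]:
    """Return the milestone recorded on the last event at or before date_to.

    Rows must be sorted by created_at; fixed-width timestamps compare
    correctly as plain strings.
    """
    timestamps = [ts for ts, _ in rows]
    idx = bisect_right(timestamps, date_to)
    return rows[idx - 1][1] if idx else None


print(milestone_at(history, "2017-11-27 00:00:00"))  # None
print(milestone_at(history, "2017-10-01 00:00:00"))  # 2422217
```

With this history, the lookup for 2017-11-27 returns no milestone, because the last event before that date (2017-11-08 15:12:58) already has an empty `milestone_id`.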

@lukaszgryglicki
Member

And this is the case where an issue had a SIG label, but it was removed before "date to".
In the end it has no SIG label, so it shouldn't be counted under any SIG (only under SIG: All, which skips SIG label processing).
There are about 14 out of ~2800 such issues now (issues that once had a SIG label but no longer have one).
So I will add detection of removed SIG labels too.

@lukaszgryglicki
Member

lukaszgryglicki commented Mar 1, 2018

Not good. I see that when the k8s robot removes the milestone, the event is recorded with the milestone not yet removed. The next event comes one month later, which is after "date to", and that event is the issue being closed.
Investigating more.
SIG removal is already handled, but I have major problems with milestones...
I'm really worried that no event may be recorded without the milestone at the time of removal, and that only the next event (which can happen even a year later) shows no milestone.

@lukaszgryglicki
Member

I need to go really deep. I'll download and save this event's JSON and see what data I can get from GitHub, because on the GitHub UI I can see "k8s-merge-robot removed milestone v1.9", but the GHA database event is recorded with that milestone still present, and the next event happens one month later (and that one has no milestone).

@lukaszgryglicki
Member

The JSON does have the milestone in the "remove milestone" event.
Dead end.
The only hope seems to be the "milestone/removed" label.
If it is applied at the "date to" time, we should ignore the milestone.
Or possibly if it was applied after the last milestone was set but before "date to".

I'll try this approach now (in addition to the standard milestone detection, which detects a removed milestone only on the NEXT event, not on the removing event itself).
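
The "milestone/removed" rule can be sketched like this. It is a rough Python illustration of the idea only; the function `effective_milestone`, its parameters and the example timestamps are all hypothetical and are not taken from the DevStats code.

```python
from typing import Optional


def effective_milestone(
    milestone: Optional[str],
    milestone_set_at: Optional[str],
    removed_label_at: Optional[str],
    date_to: str,
) -> Optional[str]:
    """Ignore the recorded milestone when a 'milestone/removed' label was
    applied after the milestone was last set and no later than date_to.

    Timestamps are fixed-width 'YYYY-MM-DD HH:MM:SS' strings, so they can be
    compared as plain strings.
    """
    if milestone is None or milestone_set_at is None:
        return None
    if removed_label_at is not None and milestone_set_at <= removed_label_at <= date_to:
        return None
    return milestone


# Made-up timestamps: label applied between "milestone set" and "date to".
print(effective_milestone("v1.9", "2017-09-18 18:30:52", "2017-10-09 00:00:00", "2017-11-27 00:00:00"))  # None
# No 'milestone/removed' label seen: keep the recorded milestone.
print(effective_milestone("v1.9", "2017-09-18 18:30:52", None, "2017-11-27 00:00:00"))  # v1.9
```

A label applied after "date to" does not affect the result for that date, so the milestone is still reported in that case.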

@lukaszgryglicki
Member

The problem is that for every event that modifies the milestone, we only have the current (old) milestone, not the new one.

  • So when somebody changes the milestone from v1.9 to v1.10, we only have the v1.9 milestone info, and the next GitHub event carries v1.10. But the next event can happen at any time, or there may be no next event at all.
  • When somebody removes the milestone, we only learn about this on the next event too.

This is probably why it now shows 36... struggling more.

@lukaszgryglicki
Member

I will try a really crazy approach: finding milestones by always using the next event on the same issue (if present).
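
A minimal Python sketch of this "use the next event" idea, assuming the per-issue events are already sorted by time. The function name `shift_to_next_event` and the sample data are hypothetical; the real implementation works on the GHA database rather than on a list.

```python
from typing import List, Optional


def shift_to_next_event(recorded: List[Optional[str]]) -> List[Optional[str]]:
    """For each event, use the milestone recorded on the next event of the
    same issue; the last event falls back to its own recorded value."""
    return [
        recorded[i + 1] if i + 1 < len(recorded) else recorded[i]
        for i in range(len(recorded))
    ]


# Milestones as recorded on four consecutive events of one (made-up) issue.
# The second event is the bot's "removing" comment, still recorded with v1.9.
recorded = ["v1.9", "v1.9", None, None]
print(shift_to_next_event(recorded))  # ['v1.9', None, None, None]
```

After the shift, the second event already shows the milestone as removed. The shift is a heuristic: it is applied to every event, whether or not that event actually changed the milestone.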

@lukaszgryglicki
Member

Seems like this trick may work.
I will need something similar for PRs, not only issues.

@lukaszgryglicki lukaszgryglicki self-assigned this Mar 1, 2018
@lukaszgryglicki
Member

I think this is OK now, see on the test server
I'll update prod too.

@lukaszgryglicki
Member

Prod is also updated. I think this is very close to what we need, but because of the special trick that tries to get the milestone from the next event (falling back to the current one), it is not ideal.

  • Trick with next event here.
  • Handling of milestone removal (which also uses the milestone/removed label, a lifesaver) here and then here.
  • Handling of SIG label removal here.

Let me know what you think, @jberkus

@jberkus
Collaborator Author

jberkus commented Mar 1, 2018

Damn. Ok, that's pretty problematic. Have you filed a bug with Github?

@jberkus
Collaborator Author

jberkus commented Mar 1, 2018

So, as I understand it, issues which were taken out of the milestone won't be removed from the count until another event happens to that issue? And the same with PRs, correct?

In the future milestone automation will de-facto remove this issue, but it would be nice if github fixed it.

@jberkus
Collaborator Author

jberkus commented Mar 1, 2018

Question: do other, manual changes to labels generate events? Or only comments/open/close?

@lukaszgryglicki
Member

I've used a trick that uses the next event. If there is no next event, I fall back to the current event.
The problem is that when changing/removing milestones we have:

  • old milestone
  • new milestone (possibly null when removed).

The problem is that we only get the "old milestone", while we should have two fields (both nullable):
milestone, new milestone.
Or, if that is not possible, then only the new milestone, not the old one.
Initially I thought that this was a bug in gha2db/devstats, but no: I've examined the exact JSONs and we really have no info about the new milestone; we can only take it from the next event.

Anyway, my tricks make it work quite well at the moment, IMHO.

@jberkus
Collaborator Author

jberkus commented Mar 1, 2018

Right, what I'm saying is that the removal of the milestone doesn't, by itself, generate an event, correct?

So my question is: does the addition or removal of labels generate an event on its own?

BTW, checking issue burndown records, this means that issue counts are about 10-15% higher than history, and trail a day or two behind, which we'll want to note in the eventual documentation.

@lukaszgryglicki
Member

There is no separate event for these:

  • Milestone add/remove/change: we only have the IssueComment event, which contains the old milestone; the next event (of any type) related to this issue will contain the new milestone
  • Adding/removing/changing labels also doesn't generate an event

This makes me wonder what happens if:

  • I only add a label: there will be no GH event, so I will actually "see" that new label only on the next event
  • I comment on an issue and change labels. There will be an IssueComment event, but with the old label set or the new label set? (issue labels are kept in a separate table).

I'll check this tomorrow and report here.

@lukaszgryglicki
Member

I will answer this myself (and will check tomorrow whether it is true).

  • There are no GH events for changing labels and milestones
    All possible event types are:
gha=# select distinct type from gha_events;
             type              
-------------------------------
 PullRequestReviewCommentEvent
 MemberEvent
 PushEvent
 ReleaseEvent
 CreateEvent
 GollumEvent
 TeamAddEvent
 DeleteEvent
 PublicEvent
 ForkEvent
 PullRequestEvent
 IssuesEvent
 WatchEvent
 IssueCommentEvent
 CommitCommentEvent
(15 rows)
  • The problem with the "old" milestone arises because the bot comments first (which creates an IssueCommentEvent with the old milestone) and only then changes the milestone
  • If the bot changed the milestone first and then commented, we would have the correct milestone information in this IssueCommentEvent
  • Same with labels: the final label set we get depends on the order. If we comment first and then modify labels, we get the old label set; if we change labels and then comment, we get the correct label set.
  • If we only change labels/milestone without commenting, we get the new, correct label set/milestone on the next GH event referring to this issue
  • The only events that refer to an issue are: IssueCommentEvent (commenting on the issue) and IssuesEvent (state change: open, close)
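
The ordering effect in the list above can be shown with a tiny simulation. This is a toy Python model, not GitHub's behaviour verbatim: it assumes that only a comment emits an event and that the event captures whatever milestone the issue has at that moment. The function name `replay` and the action names are hypothetical.

```python
from typing import List, Optional, Tuple


def replay(actions: List[Tuple[str, Optional[str]]]) -> List[Optional[str]]:
    """Replay actions on one issue and return the milestone captured by each
    emitted event. Only 'comment' emits an event; 'set_milestone' is silent."""
    milestone: Optional[str] = None
    captured: List[Optional[str]] = []
    for kind, value in actions:
        if kind == "set_milestone":
            milestone = value
        elif kind == "comment":
            captured.append(milestone)
    return captured


# Bot comments first, then removes the milestone: the event still shows v1.9.
print(replay([("set_milestone", "v1.9"), ("comment", None), ("set_milestone", None)]))  # ['v1.9']
# Reverse order: the comment event captures the removal.
print(replay([("set_milestone", "v1.9"), ("set_milestone", None), ("comment", None)]))  # [None]
```

In the first ordering the removal stays invisible until some later event on the same issue, which is the lag discussed in this thread.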

Seems like we have a problem here.
Any ideas?

I'll confirm this 100% tomorrow.
@jberkus @dankohn ?

@jberkus jberkus changed the title from "Count issues with release by milestone" to "Pervasive lag issue with label/milestone changes in issues and PRs" Mar 3, 2018
@lukaszgryglicki
Member

@jberkus what do you think about this:

I think a nice "workaround" would also be:

  • k8s-*-bot creates an additional comment after changing/updating a milestone/label, something like "Note: milestone updated to abc" or "Note: label xyz removed"
  • prow creates a similar comment after changing a milestone/label.
  • Any other automatic tool (if there is any) does the same.

That way we're in sync immediately.

@lukaszgryglicki
Member

Changed from bug to enhancement.
This is not a bug; we just don't have that data in the GitHub archives, as already explained.

@lukaszgryglicki
Member

@jberkus any updates on this on the K8s side?

@dankohn @jberkus what do you think about spending a few days researching a new data source: the GitHub API (in addition to the already existing GHA & git)?

  • I think I can write yet another data source that will periodically query GitHub using the API, just to get the current label state of issues/PRs (that would eliminate the need for another GHA event to happen after somebody added a label from the GitHub UI).
  • This would certainly work for current issues/PRs. I don't know if it is possible to query past state using the GitHub API; I think it is not, but I can double-check.
  • This separate data source would have to run in a separate process, because it can block when we're out of GitHub API points.
  • We're already querying the GitHub API to get new release tags (annotations/releases), but this uses very few GitHub API points. It runs every hour and uses just a few points out of the 5000 available.
  • So the new "labels" state API calls should always happen after the annotations part, because they can potentially run out of API points.
  • If we go this way, we may want to think again about the GitHub OAuth token that is used by DevStats. Currently it uses my private GitHub OAuth token.

@dankohn can I investigate this task?

@dankohn
Contributor

dankohn commented Mar 21, 2018 via email

@lukaszgryglicki
Member

Yes, I've already suggested that.
But I think it won't be that easy to change the k8s process to help Devstats.
Devstats is a tool to help K8s, not the opposite :p
OK, I'll do the research then and see what I can do without touching the current k8s workflow.

@dankohn
Contributor

dankohn commented Mar 21, 2018 via email

@lukaszgryglicki
Member

OK, will check this too.
Actually it needs some discussion, because I think such a change in mungebot would be quite easy to implement, but it needs acceptance from the k8s people first.
Anyway, I'll postpone this a bit, because I've just received an email that I should add another project to DevStats.
So any feedback is welcome here, especially from @jberkus, who originally detected this issue.

@jberkus
Collaborator Author

jberkus commented Mar 24, 2018

@lukaszgryglicki there's two issues with using the API:

  1. Kubernetes is constantly running out of API "tokens", so anything that requires a lot of additional API calls is just out.

  2. I checked API data, and in the API it's also true that issues/PRs that have only had labels or milestones changed do not show up as "updated" in the API either. So we'd be in a position of polling all the issues/PRs in some way, which is a LOT of API calls.

Frankly, I think the best next step is to talk to someone at Github.

@dankohn
Contributor

dankohn commented Mar 24, 2018 via email

@jberkus
Collaborator Author

jberkus commented Mar 24, 2018

@dankohn it's not the technical difficulty, which is negligible.

It's that any method which involves increasing github notification traffic just to support devstats is a total nonstarter.

@dankohn
Contributor

dankohn commented Mar 24, 2018 via email

@lukaszgryglicki
Member

I almost have a working solution.
I'm using the API to get the state of all open issues (changed since the last hour, to ask for the smallest possible set of issues).
It works.
I mean that if I add a label to an issue without commenting and then use the GitHub API to get the label list for this issue, I can see the label just added.

And this is a quite fast and straightforward process. I've added the 'ghapi2db' tool to support it; I only need to comment it.
When I detect that an issue has a different milestone or labels (GHA versus API), I create an artificial event with the new state.
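
The GHA-versus-API comparison can be sketched as below. This is a hedged Python illustration of the idea only, not the ghapi2db code: `IssueState`, `artificial_event`, the `"ArtificialEvent"` type name and the example issue number are all hypothetical, and the API state is passed in directly rather than fetched.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class IssueState:
    milestone: Optional[str]
    labels: FrozenSet[str]


def artificial_event(issue_number: int, gha: IssueState, api: IssueState) -> Optional[dict]:
    """Return a synthetic event carrying the API state when it differs from
    the last state known from GHA; return None when the two agree."""
    if gha == api:
        return None
    return {
        "type": "ArtificialEvent",  # hypothetical type name
        "issue": issue_number,
        "milestone": api.milestone,
        "labels": sorted(api.labels),
    }


# Made-up example: GHA still shows v1.9, the API shows the milestone removed.
gha_state = IssueState("v1.9", frozenset({"kind/bug"}))
api_state = IssueState(None, frozenset({"kind/bug", "milestone/removed"}))
print(artificial_event(12345, gha_state, api_state))
print(artificial_event(12345, gha_state, gha_state))  # None
```

Emitting an event only on a difference keeps the number of synthetic rows small, since most issues do not change between two polls.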

I can give you a working solution tomorrow, without touching mungegithub at all. It will need about 200 API points/hour, which is a lot less than 5000.

mungegithub often adds or removes labels as its last operation, just after creating a comment, so this situation happens often IMHO.

@lukaszgryglicki
Member

Actually I've just connected ghapi2db to our standard workflow (on the test server).

@lukaszgryglicki
Member

Seems like all is working OK, so data quality will keep improving, starting from yesterday.
There is no way to ask the GitHub API about issue state from the past, so the correct values will start from yesterday.
Not closing yet, but this should fix the lag issues.

@jberkus
Collaborator Author

jberkus commented Mar 26, 2018

Wow, great work, @lukaszgryglicki

@lukaszgryglicki
Member

lukaszgryglicki commented Mar 27, 2018

And we don't need to touch mungebot.
BTW @dankohn I missed your question about whether this occurs in the regular case or in a corner case.
It is rather the regular case, because most label work is done by the bot, and the bot usually reacts to devs' comments to add/remove/modify labels/milestones.
So this is the regular case.

@jberkus
Collaborator Author

jberkus commented Mar 27, 2018

Now, I do think it's worth talking about having prow write a log of its actions that devstats can access. That would give us the data WITHOUT adding to the github notification burden.

@lukaszgryglicki
Member

No longer a problem for me. I already have a tool that gets the info it needs.
But if prow creates such a log, I can write another tool to get this data and make the ghapi2db tool unnecessary.
But as I said, I already have the data, and I'm far from the API limits in getting it, so it's no longer a problem for me.

@lukaszgryglicki
Member

lukaszgryglicki commented Mar 29, 2018

Final version is on the test server, here: https://k8s.cncftest.io/d/22/open-issues-prs-by-milestone?orgId=1
It fixed three more things:

The same problem with detecting closed & reopened issues/PRs (and performance issues too) also happens for:

Now I'll work on the remaining dashboards, all on test for now.
I'll update prod when I have the green light for it.
I've also updated InfluxDB to v1.5.1 in the meantime and had a horror day yesterday fixing the issues it caused (18 hours of trial and error).

@jberkus @dankohn

@lukaszgryglicki
Member

All problems described above are now fixed and gone.
The only thing remaining is excluding sandbox repos.
I'm currently doing this on test but not on prod.

@jberkus
Collaborator Author

jberkus commented Apr 2, 2018

This looks good to me, I've thrown it in #devstats to see if I can get more eyeballs on it. OK if you want to wait for a day just so more people can look for obvious glitches.

@lukaszgryglicki
Member

Currently we don't have any data newer than 2018-04-02 14:00 UTC, due to GitHub archives outage: https://github.com/cncf/devstats/issues/91

@lukaszgryglicki
Member

Outage fixed on the GHA side, DevStats has all the data again.

@lukaszgryglicki
Member

No longer blocked, now just need to confirm that it works ok.

@lukaszgryglicki
Member

I'm closing this, please reopen if you find lag/bug.
