---
title: "Exploring World Cup Statistics Geographically By Confederation"
output:
  html_document:
    toc: TRUE
    toc_float: TRUE
---
<style type="text/css">
h1.title {
  text-align: center;
}
</style>
```{r setup, include=FALSE}
# Load packages
library(sf)
library(tidyverse)
library(tmap)
library(tmaptools)
library(viridis)
library(plotly)
# Set default knitr chunk options
knitr::opts_chunk$set(
  echo = TRUE,
  warning = FALSE,
  fig.width = 6,
  fig.asp = .6,
  out.width = "90%"
)
# Set ggplot theme and default viridis colour scales
theme_set(theme_minimal() + theme(legend.position = "bottom"))
options(
  ggplot2.continuous.colour = "viridis",
  ggplot2.continuous.fill = "viridis"
)
scale_colour_discrete <- scale_colour_viridis_d
scale_fill_discrete <- scale_fill_viridis_d
# Set tmap mode to interactive
tmap_mode("view")
```
```{r import df, include=FALSE, warning=FALSE, message=FALSE}
# Read in csv datafile and convert values to numeric
wc_df <- read_csv("./data/worldcup_final.csv") %>%
  janitor::clean_names() %>%
  mutate(
    # Drop the leading "+" and replace the Unicode minus sign with an ASCII
    # hyphen so that goal difference can be parsed as a number
    gd = gsub("+", "", gd, fixed = TRUE),
    gd = gsub("−", "-", gd, fixed = TRUE),
    part = as.numeric(part),
    pld = as.numeric(pld),
    w = as.numeric(w),
    d = as.numeric(d),
    l = as.numeric(l),
    gf = as.numeric(gf),
    ga = as.numeric(ga),
    gd = as.numeric(gd),
    pts = as.numeric(pts),
    rank = as.numeric(rank),
    goals = as.numeric(goals),
    land_area = as.numeric(land_area_km))
```
```{r import shapefile, include=FALSE, warning=FALSE, message=FALSE}
# Read in shapefile and convert values to numeric
wc_countries <- st_read("data/geofiles/worldcup_countries.shp") %>%
  janitor::clean_names() %>%
  mutate(
    # Drop the leading "+" and replace the Unicode minus sign with an ASCII
    # hyphen so that goal difference can be parsed as a number
    gd = gsub("+", "", gd, fixed = TRUE),
    gd = gsub("−", "-", gd, fixed = TRUE),
    part = as.numeric(part),
    pld = as.numeric(pld),
    w = as.numeric(w),
    d = as.numeric(d),
    l = as.numeric(l),
    gf = as.numeric(gf),
    ga = as.numeric(ga),
    gd = as.numeric(gd),
    pts = as.numeric(pts),
    rank = as.numeric(rank),
    goals = as.numeric(goals),
    land_area = as.numeric(land_area))
```
<br>
The International Federation of Association Football (French: Fédération Internationale de Football Association), commonly known as FIFA, is the international governing body of association football. FIFA organizes and governs football's major international tournaments, most notably the FIFA World Cup, held since 1930.
<br>
## FIFA Confederations
In international football, FIFA divides the world into six regions and groups the countries in each region into a Confederation. Each Confederation is responsible for overseeing the game in its part of the world.

The six Confederations recognized by FIFA are: <br>

- AFC - Asian Football Confederation in **Asia and Australia** <br>
- CAF - Confédération Africaine de Football in **Africa** <br>
- CONCACAF - Confederation of North, Central America and Caribbean Association Football in **North America, Central America, and the Caribbean** <br>
- CONMEBOL - Confederación Sudamericana de Fútbol in **South America** <br>
- OFC - Oceania Football Confederation in **Oceania** <br>
- UEFA - Union of European Football Associations in **Europe** <br>
<br>
### National Teams That Have Participated in the World Cup by FIFA Confederation
```{r echo=FALSE, warning=FALSE, message=FALSE}
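# Base map: countries coloured by confederation.
# "confederat" is the field name as stored in the shapefile
# (shapefile field names are limited to 10 characters).
worldmap_conf <- tm_shape(wc_countries) +
  tm_polygons(
    col = "confederat",
    style = "cat",
    palette = "viridis",
    id = "country",
    alpha = .8,
    border.col = "white",
    lwd = .5,
    title = "Confederation")
worldmap_conf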
```
As of the 2022 tournament, `r nrow(wc_df)` national teams from all six confederations have participated in the FIFA World Cup.
<br>
```{r echo=FALSE, warning=FALSE, message=FALSE}
# Count national teams per confederation and plot as a labelled bar chart
wc_df %>%
  count(confederation, name = "n_countries") %>%
  ggplot(aes(x = confederation, y = n_countries, fill = confederation)) +
  geom_col(alpha = .7) +
  labs(
    title = "World Cup Participants by FIFA Confederation",
    x = "Confederation",
    y = "Number of Countries") +
  geom_text(aes(label = n_countries), vjust = -0.3, size = 3.5) +
  theme(legend.position = "none")
```
<br>
Most of the national teams that have participated in the World Cup belong to UEFA (Europe), while only one team from the OFC (Oceania) has participated.
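
The comparison above can also be expressed as percentage shares rather than raw counts. The following is a minimal sketch in base R, not part of the original analysis: `toy_teams` and its values are made up for illustration and do not reflect the real World Cup counts.

```r
# Toy data: confederation labels for a handful of hypothetical teams.
# These values are made up for illustration and are NOT the real counts.
toy_teams <- data.frame(
  country = c("A", "B", "C", "D", "E", "F"),
  confederation = c("UEFA", "UEFA", "UEFA", "CONMEBOL", "CAF", "OFC")
)

# Count teams per confederation, then convert the counts to percentage shares
team_counts <- table(toy_teams$confederation)
team_shares <- round(100 * prop.table(team_counts), 1)

# Sort from the largest to the smallest share
sorted_shares <- sort(team_shares, decreasing = TRUE)
sorted_shares
```

With the real data, the same calls could be applied to `wc_df$confederation` in place of the toy column.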
<br>
### Number of Participations in the World Cup
```{r echo=FALSE, warning=FALSE, message=FALSE}
worldmap_conf +
  tm_bubbles("part",
    col = "part",
    border.col = "white",
    style = "cont",
    breaks = seq(0, 22, by = 2),
    palette = "Greys",
    alpha = 0.7,
    size = "part",
    scale = 1.5,
    title.col = "Participations (part)",
    id = "country")
```
The national teams that have played in the most World Cup tournaments appear to be spatially clustered in the CONMEBOL (South America) and UEFA (Europe) Confederations.
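
The clustering described above is a visual impression from the map. As a rough, non-spatial cross-check, one could compare the average number of participations per confederation. The sketch below uses base R on a small made-up data frame; `toy_part` and its values are hypothetical and are not the real data. It is a simple group summary, not a formal test of spatial clustering.

```r
# Toy data: hypothetical participation counts for six unnamed teams.
# These numbers are made up for illustration and are NOT the real data.
toy_part <- data.frame(
  confederation = c("UEFA", "UEFA", "CONMEBOL", "CONMEBOL", "AFC", "CAF"),
  part = c(20, 10, 22, 14, 6, 4)
)

# Mean number of participations per confederation
part_summary <- aggregate(part ~ confederation, data = toy_part, FUN = mean)

# Order confederations from highest to lowest mean participation
part_ranked <- part_summary[order(-part_summary$part), ]
part_ranked
```

With the real data, the same `aggregate()` call could be pointed at `wc_df`, which carries both the `confederation` and `part` columns.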