<!-- # (PART) Supporting Material {-} -->
<!-- # Learning the Base Plotting {#tidyverse2 .unnumbered} -->
# Learning to use the base plot {#baseplot .unnumbered}
As an alternative to ggplot and tidyverse, there are other basic plotting functions that you might prefer. This module takes you through a few basic plotting functions. It starts in the same way as the tidyverse module, using the same dataset.
## Visualising Module 1 Flow Data {-}
### Setting up {-}
#### Creating an R Studio project {-}
1) Open R Studio
2) Go *File -> New Project -> New Directory -> New Project*
3) *Directory name:* ENVT3362_workshop_2
4) *Create project as a subdirectory of:* Wherever you store your ENVT3362 files!
5) Click *Create project*
6) Download the spreadsheet for this workshop [here](envFlowData.xls)
7) Move this to your *ENVT3362_workshop_2* directory
### Importing and formatting the data {-}
#### Load the necessary packages {-}
- readxl and lubridate need to be loaded separately
```{r echo=TRUE, message=FALSE, warning=FALSE}
library(readxl)
library(lubridate)
```
#### Import the spreadsheet {-}
- The `path` argument is relative to your R Studio project directory
- `sheet` specifies which Excel sheet to read
```{r message=FALSE, warning=FALSE, include=FALSE}
envFlow <- read_xls(path = "data/workshop2/envFlowData.xls", sheet = 1)
```
```{r eval=FALSE, echo=TRUE, message=FALSE, warning=FALSE}
envFlow <- read_xls(path = "envFlowData.xls", sheet = 1)
```
#### Inspect the data {-}
- `head()` prints the first few observations
- What data type is `date`?
```{r echo=TRUE, message=FALSE, warning=FALSE}
head(envFlow)
```
#### Format the date {-}
- Use lubridate's `ymd()` function to overwrite the existing `date` variable and convert the **character** data to **date** data
```{r echo=TRUE, message=FALSE, warning=FALSE}
envFlow$date <- ymd(envFlow$date)
```
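To see what the conversion does, here is a minimal sketch on three made-up date strings (not the workshop data). It uses base R's `as.Date()`, which gives the same result as `ymd()` for text written as `YYYY-MM-DD`, so that it runs without any packages loaded. The names `demo_text` and `demo_date` are only for illustration.

```{r echo=TRUE, message=FALSE, warning=FALSE}
# Hypothetical example strings, not taken from envFlowData.xls
demo_text <- c("1992-01-01", "1992-01-02", "1992-01-03")
class(demo_text)                 # character: R treats these as plain text

# Convert the text to Date data (base R stand-in for lubridate's ymd())
demo_date <- as.Date(demo_text)
class(demo_date)                 # Date

# Dates support arithmetic, which character strings do not
demo_date[3] - demo_date[1]
```

Once a column is stored as Date data, plotting functions can place it correctly on a time axis.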
#### Inspect the data again {-}
- Notice the change in data type of `date`
```{r echo=TRUE, message=FALSE, warning=FALSE}
head(envFlow)
```
## Graphing with base plot {-}
Calling the `plot` function on its own draws a default points plot. It does not look polished, but it is a convenient way to check your data quickly.
### Call `plot()` {-}
```{r echo=TRUE, message=FALSE, warning=FALSE}
plot(envFlow)
```
`plot` defaults to a points plot, but we can quickly change it to a line plot using the argument `type = "l"`.
```{r echo=TRUE, message=FALSE, warning=FALSE}
plot(envFlow, type = "l")
```
Just as with ggplot, we can access the R database of standard colours.
```{r echo=TRUE, message=FALSE, warning=FALSE}
plot(envFlow, type = "l", col = "dodgerblue3")
```
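If you want to browse the available colour names, the short sketch below may help. It only uses the base function `colours()` (also spelled `colors()`); the variable `blue_names` is just an illustrative name.

```{r echo=TRUE, message=FALSE, warning=FALSE}
# colours() returns every colour name that R recognises
length(colours())
head(colours())

# Check that the two colours used in this module are built-in names
c("dodgerblue3", "red4") %in% colours()

# List every built-in colour whose name contains "blue"
blue_names <- grep("blue", colours(), value = TRUE)
head(blue_names)
```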
By default, `plot` takes x from column 1 and y from column 2 of the data. You can also name the x and y variables explicitly.
```{r echo=TRUE, message=FALSE, warning=FALSE}
plot(x=envFlow$date, y=envFlow$totalDischarge ,type="l",col="dodgerblue3")
```
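The column default only applies when the data has exactly two columns. As a rough illustration, the sketch below uses a small made-up data frame called `demo` (a stand-in, not the workshop data) to show both cases.

```{r echo=TRUE, message=FALSE, warning=FALSE}
# Hypothetical data frame standing in for envFlow
demo <- data.frame(day = 1:10, flow = c(5, 7, 6, 9, 12, 11, 8, 6, 5, 4))

# Two columns: column 1 becomes x and column 2 becomes y
plot(demo, type = "l")

# The same plot with x and y stated explicitly
plot(x = demo$day, y = demo$flow, type = "l")

# Three or more columns: plot() draws a grid of pairwise scatter plots instead
demo$doubleflow <- demo$flow * 2
plot(demo)
```

Naming `x` and `y` explicitly keeps the plot the same if columns are added to the data later.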
If you had several columns, you could choose which to plot on the y axis. We can make a new variable, say, double flow. A new line can be appended to the original plot using the `lines` function.
```{r echo=TRUE, message=FALSE, warning=FALSE}
doubleflow = envFlow$totalDischarge * 2
plot(x=envFlow$date, y=envFlow$totalDischarge ,type="l",col="dodgerblue3")
lines(x=envFlow$date, y=doubleflow, col="red4")
```
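Note that `plot` sets the axis limits from the first series only, so a larger series added with `lines` can run off the top of the plot. One possible way around this, sketched here with made-up values (`demo_date`, `demo_flow` and `demo_double` are illustrative names), is to work out the limits from both series first.

```{r echo=TRUE, message=FALSE, warning=FALSE}
# Hypothetical data: 30 days of flow that rises and then falls
demo_date   <- seq(as.Date("2000-01-01"), by = "day", length.out = 30)
demo_flow   <- c(seq(10, 150, by = 10), seq(140, 0, by = -10))
demo_double <- demo_flow * 2

# Limits that cover both series
y_limits <- range(demo_flow, demo_double)

plot(x = demo_date, y = demo_flow, type = "l", col = "dodgerblue3",
     ylim = y_limits)
lines(x = demo_date, y = demo_double, col = "red4")
```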
To plot a subset of the data, use the square brackets to select indices, say, points 100 to 800.
```{r echo=TRUE, message=FALSE, warning=FALSE}
doubleflow = envFlow$totalDischarge * 2
plot(x=envFlow$date[100:800], y=envFlow$totalDischarge[100:800] ,type="l",col="dodgerblue3")
```
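Square brackets also accept a logical condition, which can be easier to read than row numbers when you want a date range. The sketch below assumes made-up data (`demo_date`, `demo_flow`) and selects February 2000.

```{r echo=TRUE, message=FALSE, warning=FALSE}
# Hypothetical data: 100 consecutive days starting 1 January 2000
demo_date <- seq(as.Date("2000-01-01"), by = "day", length.out = 100)
demo_flow <- seq_along(demo_date)

# TRUE for every row that falls inside February 2000
keep <- demo_date >= as.Date("2000-02-01") & demo_date <= as.Date("2000-02-29")
sum(keep)   # number of rows selected

plot(x = demo_date[keep], y = demo_flow[keep], type = "l", col = "dodgerblue3")
```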
Similarly, the limits of the axes can be set with the `xlim` and `ylim` arguments.
```{r echo=TRUE, message=FALSE, warning=FALSE}
plot(x=envFlow$date, y=envFlow$totalDischarge
,type="l",col="dodgerblue3"
,xlim = c(date("2000-01-01"),date("2005-01-01"))
,ylim = c(0,1500)
)
```
## Call `plot()` and then start again {-}
The base plot is convenient, but to make it look good it is sometimes easier to draw a blank plot and then add each element separately. Just as `type = "l"` draws lines, `type = "n"` draws nothing. We can even turn off the axes and titles.
```{r echo=TRUE, message=FALSE, warning=FALSE}
plot(envFlow,type="n"
,axes=F
,xlab="", ylab=""
,xlim = c(date("2000-01-01"),date("2005-01-01"))
)
```
Then we can add all the elements back in. Because the plot is still open, each new element is drawn onto the blank plot.
```{r echo=TRUE, message=FALSE, warning=FALSE}
plot(envFlow,type="n"
,axes=F
,xlab="", ylab=""
,xlim = c(date("2000-01-01"),date("2005-01-01"))
)
lines(envFlow$date,envFlow$totalDischarge,col="dodgerblue3")
lines(envFlow$date,doubleflow,col="red4")
box(bty="o")
```
### Axis and mtext {-}
We can put the axis and axis labels back in using the functions `axis` and `mtext`. They both have the `side` argument:
1 = bottom, 2 = left, 3 = top, 4 = right
The text size defaults to 1; adjust it with `cex` in `mtext` and with `cex.axis` in `axis`. The distance from the axis is set using `line`. The `expression` function allows formatting of text, such as subscripts and superscripts.
```{r echo=TRUE, message=FALSE, warning=FALSE}
plot(envFlow,type="n"
,axes=F
,xlab="", ylab=""
,xlim = c(date("2000-01-01"),date("2005-01-01"))
)
lines(envFlow$date,envFlow$totalDischarge,col="dodgerblue3")
lines(envFlow$date,doubleflow,col="red4")
box(bty="o")
mtext(side = 1,text="Date",line=2)
mtext(side = 2,text=expression("Flow (ML d"^-1*")"),line=2)
axis(2,cex.axis=0.7)
```
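The label syntax inside `expression` can look cryptic, so here is a small sketch of the pieces. In R's plotmath notation, `^` makes a superscript, `[ ]` makes a subscript, `*` joins two pieces with no gap and `~` joins them with a space. The second label (`conc_label`) is purely illustrative and is not part of the flow data.

```{r echo=TRUE, message=FALSE, warning=FALSE}
# Blank plot to write the labels on
plot(1:10, type = "n", xlab = "", ylab = "")

# Superscript: the label used in this module
flow_label <- expression("Flow (ML d"^-1 * ")")

# Subscript plus superscript: a hypothetical second label
conc_label <- expression("CO"[2] ~ "(mg L"^-1 * ")")

mtext(side = 2, text = flow_label, line = 2)
mtext(side = 1, text = conc_label, line = 2, cex = 0.8)
```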
For the date axis, first build a sequence of dates for the tick positions, then pass it to the `axis.Date` function. This step is clunky compared to ggplot.
```{r echo=TRUE, message=FALSE, warning=FALSE}
plot(envFlow,type="n"
,axes=F
,xlab="", ylab=""
,xlim = c(date("2000-01-01"),date("2005-01-01"))
)
lines(envFlow$date,envFlow$totalDischarge,col="dodgerblue3")
lines(envFlow$date,doubleflow,col="red4")
box(bty="o")
mtext(side = 1,text="Date",line=2)
mtext(side = 2,text=expression("Flow (ML d"^-1*")"),line=2)
axis(2,cex.axis=0.7)
x.axis<-as.Date(seq(min(envFlow$date),max(envFlow$date),by=3*365),format="%Y")
axis.Date(at=x.axis,side=1,cex.axis=0.7)
```
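As an alternative sketch, `seq` can step through dates by calendar year with `by = "year"`, which keeps every tick on the same day of the year (a step of 365 days drifts by a day after each leap year). The `format` argument of `axis.Date` then controls how the tick labels are written. The data here are made up, and `year_ticks` is just an illustrative name.

```{r echo=TRUE, message=FALSE, warning=FALSE}
# Hypothetical daily data covering five years
demo_date <- seq(as.Date("2000-01-01"), as.Date("2004-12-31"), by = "day")
demo_flow <- seq_along(demo_date)

plot(demo_date, demo_flow, type = "l", axes = FALSE, xlab = "", ylab = "")

# One tick on 1 January of each year, labelled with the year only
year_ticks <- seq(min(demo_date), max(demo_date), by = "year")
axis.Date(side = 1, at = year_ticks, format = "%Y", cex.axis = 0.7)
axis(2, cex.axis = 0.7)
box(bty = "o")
```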
### Legend {-}
Add a legend using the `legend` function; see `?legend` for its details. In this case, `"topleft"` is the position, inset by 0.1; the `legend` argument is the text, `col` is the sequence of colours, and `pch` is the symbol.
```{r echo=TRUE, message=FALSE, warning=FALSE}
plot(envFlow,type="n"
,axes=F
,xlab="", ylab=""
,xlim = c(date("2000-01-01"),date("2005-01-01"))
)
lines(envFlow$date,envFlow$totalDischarge,col="dodgerblue3")
lines(envFlow$date,doubleflow,col="red4")
box(bty="o")
mtext(side = 1,text="Date",line=2)
mtext(side = 2,text=expression("Flow (ML d"^-1*")"),line=2)
axis(2,cex.axis=0.7)
x.axis<-as.Date(seq(min(envFlow$date),max(envFlow$date),by=3*365),format="%Y")
axis.Date(at=x.axis,side=1,cex.axis=0.7)
legend("topleft", inset=0.1
,legend=c("Flow","Doubleflow")
,col=c("dodgerblue3","red4")
,pch=16
,cex = 0.7)
```
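Because the plot shows lines rather than points, you might prefer line samples in the legend. The sketch below is one way to do that, using `lty` in place of `pch` and `bty = "n"` to drop the legend box. It uses made-up values, and `legend_info` is an illustrative name for the layout information that `legend` returns.

```{r echo=TRUE, message=FALSE, warning=FALSE}
# Two hypothetical series on a blank plot
plot(1:10, (1:10) * 2, type = "n", xlab = "x", ylab = "y")
lines(1:10, 1:10, col = "dodgerblue3")
lines(1:10, (1:10) * 2, col = "red4")

legend_info <- legend("topleft", inset = 0.1
                      , legend = c("Flow", "Doubleflow")
                      , col = c("dodgerblue3", "red4")
                      , lty = 1      # line samples instead of pch symbols
                      , bty = "n"    # no box around the legend
                      , cex = 0.7)
```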
### Add a polygon {-}
We can add a polygon that marks a threshold of, say, 500 ML. The `polygon` function draws a polygon of any shape from a sequence of x and y values for its corners. For example, imagine
- you start at x = 0, y = 0,
- then draw a line along the x axis to x = 10, with y still 0,
- then draw a line up to y = 3, with x still 10,
- then draw a line back along the x direction to x = 0, with y still 3.
The polygon joins all of these points and closes the shape back to the start. Corner points like these can also be expressed as a matrix:
```{r echo=TRUE, message=FALSE, warning=FALSE}
# Corners of the rectangle described above, in drawing order
x=c(0,10,10,0)
y=c(0,0,3,3)
this.matrix<-matrix(ncol = 2
,nrow= 4
,data=c(x,y)
)
colnames(this.matrix)=c("x","y")
```
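To check that a set of corners gives the shape you expect, it can help to draw it on its own. This sketch assumes the 10 by 3 rectangle from the walk-through above; `corners` is an illustrative name. Both `plot` and `polygon` accept a two-column matrix, reading the first column as x and the second as y.

```{r echo=TRUE, message=FALSE, warning=FALSE}
# Corner points in drawing order: one row per corner
corners <- cbind(x = c(0, 10, 10, 0), y = c(0, 0, 3, 3))

# Blank plot sized to fit the corners
plot(corners, type = "n")

# polygon() joins the corners in order and closes the shape
polygon(corners, col = "grey80", border = "red4")
```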
In our flow plot, we use dates as the x values and the range 0 to 500 ML.
The `adjustcolor` function allows us to make a new colour, stored here as a variable called `polygoncolour`. The `alpha.f` argument (which can be abbreviated to `alpha`) sets the opacity, where 0 is completely transparent and 1 is completely opaque.
```{r echo=TRUE, message=FALSE, warning=FALSE}
polygoncolour<-adjustcolor("red",alpha=0.33)
```
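A quick way to see what `adjustcolor` produces is to print the result and overlap two shapes. The following is only a sketch; `solid` and `faint` are illustrative names.

```{r echo=TRUE, message=FALSE, warning=FALSE}
solid <- adjustcolor("red", alpha.f = 1)
faint <- adjustcolor("red", alpha.f = 0.33)

# Both are stored as hex strings; the last two characters hold the opacity
solid
faint

# The same colour broken into red, green, blue and alpha on a 0-255 scale
col2rgb(faint, alpha = TRUE)

# The blue square stays visible through the semi-transparent red one
plot(1:10, type = "n", xlab = "", ylab = "")
rect(2, 2, 7, 7, col = "dodgerblue3", border = NA)
rect(4, 4, 9, 9, col = faint, border = NA)
```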
If we stick it all together:
```{r echo=TRUE, message=FALSE, warning=FALSE}
plot(envFlow,type="n"
,axes=F
,xlab="", ylab=""
)
polygoncolour<-adjustcolor("red",alpha=0.25)
# Corners of the shaded band: dates along x, 0 to 500 ML along y
polygon(x = as.Date(c("1992-01-01","2010-12-31","2010-12-31","1992-01-01"))
,y=c(0,0,500,500)
,col=polygoncolour
,lty = 0)
lines(envFlow$date,envFlow$totalDischarge,col="dodgerblue3")
lines(envFlow$date,doubleflow,col="red4")
box(bty="o")
mtext(side = 1,text="Date",line=2)
mtext(side = 2,text=expression("Flow (ML d"^-1*")"),line=2)
x.axis<-as.Date(seq(min(envFlow$date),max(envFlow$date),by=3*365),format="%Y")
axis.Date(at=x.axis,side=1,cex.axis=0.7)
axis(2,cex.axis=0.7)
legend("topleft", inset=0.1
,legend=c("Flow","Doubleflow")
,col=c("dodgerblue3","red4")
,pch=16 )
```
## Make an annual column chart {-}
We can make an annual summary then plot it as a column chart. The year of each date can be isolated by passing the `date` column to the `format` function with the format string `"%Y"`, which means four-digit year.
The `rowsum` function sums the rows of a vector or matrix within groups (not to be confused with `rowSums` and `colSums`, which sum across each row or column without grouping). The first argument is the data to be summed. The second argument says how to group the rows, i.e. rows with the same group value are summed together. Here we can make a new matrix called `envFlow.by.year`.
```{r echo=TRUE, message=FALSE, warning=FALSE}
envFlow.by.year<-rowsum(x=envFlow$totalDischarge
,group=format(envFlow$date,"%Y") )
```
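To make the grouping concrete, here is a tiny sketch with four made-up observations across two years (`demo_date`, `demo_flow` and `demo_by_year` are illustrative names).

```{r echo=TRUE, message=FALSE, warning=FALSE}
# Hypothetical data: two observations in 2000 and two in 2001
demo_date <- as.Date(c("2000-06-01", "2000-12-31", "2001-01-01", "2001-07-15"))
demo_flow <- c(1, 2, 3, 4)

# The group labels are the years, as text
format(demo_date, "%Y")

# Sum the flow within each year: 1 + 2 for 2000 and 3 + 4 for 2001
demo_by_year <- rowsum(x = demo_flow, group = format(demo_date, "%Y"))
demo_by_year

# The result is a one-column matrix whose row names are the group labels
rownames(demo_by_year)
```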
The `barplot` function creates a column chart. The `height` argument is the matrix to plot. The `beside` argument is set to `TRUE` so that there are separate columns for each year. The `names.arg` argument sets the labels of the x axis. The other arguments are for formatting, which you can change as you like.
```{r echo=TRUE, message=FALSE, warning=FALSE}
envFlow.by.year<-rowsum(x=envFlow$totalDischarge
,group=format(envFlow$date,"%Y") )
barplot(envFlow.by.year,beside = TRUE
,space=0.1
,col="dodgerblue3"
,border = NA
,ylim = c(0,max(envFlow.by.year)*1.1)
,cex.axis = 0.7,cex.names = 0.7
,names.arg = rownames(envFlow.by.year )
,xlab = "Year"
,ylab = "Annual flow (ML)"
)
box(bty="o")
```
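`barplot` also returns the x positions of the bar centres, which is handy if you want to add something above each bar. The sketch below assumes three made-up annual totals (`annual`) and writes each value over its bar; `bar_mid` is an illustrative name.

```{r echo=TRUE, message=FALSE, warning=FALSE}
# Hypothetical annual totals; the names become the bar labels
annual <- c("2000" = 120, "2001" = 80, "2002" = 150)

# Keep the bar midpoints that barplot() returns
bar_mid <- barplot(annual, col = "dodgerblue3", border = NA
                   , ylim = c(0, max(annual) * 1.2)
                   , xlab = "Year", ylab = "Annual flow (ML)")

# pos = 3 places each label just above its y value
text(x = bar_mid, y = annual, labels = annual, pos = 3, cex = 0.7)
box(bty = "o")
```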
## Make a multi-plot figure {-}
You can make one figure that has several plots in it. One way to do this is to set the `mfrow` parameter of the `par` function. This sets up a blank grid where the plots will be drawn. Its values are the number of rows and columns in the grid. To put a plot in a particular spot in this grid, set the `mfg` parameter; its values are the row and column of that spot. You can add white space around each plot using the `mar` parameter. Its values are the number of lines of margin below, left, above and right of the plot.
For example, to make two plots above and below, in two rows, use `mfrow=c(2,1)`. Alternatively, to make two plots left and right, in two columns, use `mfrow=c(1,2)`.
```{r echo=TRUE, message=FALSE, warning=FALSE}
par(mfrow=c(2,1))
par(mfg=c(1,1))
par(mar=c(4,4,0.5,1))
plot(envFlow,type="n"
,axes=F
,xlab="", ylab=""
,xlim = c(date("1992-01-01"),date("2010-12-31"))
)
lines(envFlow$date,envFlow$totalDischarge,col="dodgerblue3")
box(bty="o")
mtext(side = 1,text="Year",line=3)
mtext(side = 2,text=expression("Flow (ML d"^-1*")"),line=3)
axis(2,cex.axis=0.7)
x.axis<-as.Date(seq(min(envFlow$date),max(envFlow$date),by=1*365),format="%Y")
axis.Date(at=x.axis,side=1,cex.axis=0.7)
par(mfg=c(2,1))
par(mar=c(4,4,0.5,1))
barplot(height=envFlow.by.year,beside = T
,space=0.1
,col="dodgerblue3"
,border = NA
,ylim = c(0,max(envFlow.by.year)*1.1)
,cex.axis = 0.7,cex.names = 0.7
,names.arg = rownames(envFlow.by.year )
,xlab = "Year"
,ylab = "Annual flow (ML)"
)
box(bty="o")
```
```{r echo=TRUE, message=FALSE, warning=FALSE}
par(mfrow=c(1,2))
par(mfg=c(1,1))
par(mar=c(4,4,1,1))
plot(envFlow,type="n"
,axes=F
,xlab="", ylab=""
,xlim = c(date("1992-01-01"),date("2010-12-31"))
)
lines(envFlow$date,envFlow$totalDischarge,col="dodgerblue3")
box(bty="o")
mtext(side = 1,text="Year",line=3)
mtext(side = 2,text=expression("Flow (ML d"^-1*")"),line=3)
axis(2,cex.axis=0.7)
x.axis<-as.Date(seq(min(envFlow$date),max(envFlow$date),by=1*365),format="%Y")
axis.Date(at=x.axis,side=1,cex.axis=0.7)
par(mfg=c(1,2))
par(mar=c(4,4,1,1))
barplot(height=envFlow.by.year,beside = T
,space=0.1
,col="dodgerblue3"
,border = NA
,ylim = c(0,max(envFlow.by.year)*1.1)
,cex.axis = 0.7,cex.names = 0.7
,names.arg = rownames(envFlow.by.year )
,xlab = "Year"
,ylab = "Annual flow (ML)"
)
box(bty="o")
```
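Settings made with `par` stay in force for later plots on the same graphics device, so when working in the console it can be worth putting them back afterwards. A minimal sketch, assuming made-up data: `par` returns the previous values of whatever it changes, and passing that list back to `par` restores them. The name `old_par` is illustrative.

```{r echo=TRUE, message=FALSE, warning=FALSE}
# Change the layout and margins, keeping a copy of the old settings
old_par <- par(mfrow = c(1, 2), mar = c(4, 4, 1, 1))

plot(1:10, type = "l", col = "dodgerblue3")             # left panel
barplot(c(3, 7, 5), col = "dodgerblue3", border = NA)   # right panel

# Restore the previous layout and margins
par(old_par)
par("mfrow")
```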