COVID-19 Number of Tests in Italy

Table of Contents

Menu

Introduction

This page presents some data about the number of tests and people tested for COVID-19 over time in Italy and compares them with the number of people found positive.

This page was created on <2020-08-20 Thu> and last updated on <2021-01-03 Sun>.

The source code available on the COVID-19 pages is distributed under the MIT License; the content is distributed under a Creative Commons - Attribution 4.0.

Getting data into R

We first read the data from the Civil Protection repository adding the ratio between positives and tests, computed on the same day and computed with data shifted by two days (on the assumption tests take two days to complete).

In fact data about tests is used with different semantics by different regions. Some regions reports tests with results (and the ratio new positives / tests makes sense). Other reports the number of test performed, in which case the correct ratio is between positives and tests performed some days earlier. We assume two days and report both ratios for all regions. See the following issue on GitHub for an explanation and some more details https://github.com/pcm-dpc/COVID-19/issues/577 (in Italian).

PATH="./data"
DIGITS = 4

national = read.csv(file.path(PATH, "dpc-covid19-ita-andamento-nazionale.csv"))
national$data <- as.Date(national$data)

national$nuovi_casi_testati = c(NA, diff(national$casi_testati, 1))
national$p_over_t <- round(national$nuovi_positivi / national$nuovi_casi_testati, digits = DIGITS) * 100

national$nuovi_tamponi = c(NA, diff(national$tamponi, 1))
national$p_tamponi_over_t <- round(national$nuovi_positivi / national$nuovi_tamponi, digits = DIGITS) * 100

# national$nuovi_casi_testati_2 <- c(NA, NA, head(national$nuovi_casi_testati, -2))
# national$p_over_t_2 = round(national$nuovi_positivi / national$nuovi_casi_testati_2, digits = DIGITS) * 100

# national$nuovi_tamponi_2 <- c(NA, NA, head(national$tamponi_2, -2))
# national$p_tamponi_over_t_2 = round(national$nuovi_positivi / national$nuovi_tamponi_2, digits = DIGITS) * 100

Concerning the regional level, computed columns, such as the number of people tested in a day, have to be computed after filtering, or the diif will work on values from different regions.

# evolution over time, by Region
data = read.csv(file.path(PATH, "dpc-covid19-ita-regioni.csv"))
data$data <- as.Date(data$data)

These are the columns we are interested in and their translation in English:

cols = c(
  "data",
  "nuovi_positivi",
  "nuovi_tamponi",
  "nuovi_casi_testati",
  "p_tamponi_over_t",
  "p_over_t"
)

We now define a function to ouput the last N rows of the input data frame. The real “challenge”, here, is transposing the data, to get a more natural presentation (with time progressing from left to right).

table_data <- function(df, cols, rows = 10) {
  # get the last 10 elements and the interesting columns of the dataframe
  f  <- tail(df, rows)
  rf <- f[, cols]

  # the labels in the transposed matrix are the column names of the original data.frame
  row_labels  <- colnames(rf)
  # the columns in the trasposed matrix are the dates
  col_labels  <- c("Label", format(rf$data, "%a, %b %d"))

  rft <- data.frame(row_labels, t(rf))
  colnames(rft) <- col_labels
  return(rft[-1,])
}

People Tested and Cases in Italy

Data of the last ten days

table_data(national, cols)
Label Sat, May 22 Sun, May 23 Mon, May 24 Tue, May 25 Wed, May 26 Thu, May 27 Fri, May 28 Sat, May 29 Sun, May 30 Mon, May 31
nuovi_positivi 4717 3995 2490 3224 3937 4147 3738 3351 2949 1820
nuovi_tamponi 286603 179391 107481 252646 260962 243967 249911 247330 164495 86977
nuovi_casi_testati 83919 64515 33517 69724 77505 72915 72389 69175 54803 30210
p_tamponi_over_t 1.65 2.23 2.32 1.28 1.51 1.7 1.5 1.35 1.79 2.09
p_over_t 5.62 6.19 7.43 4.62 5.08 5.69 5.16 4.84 5.38 6.02

New Cases

New cases.

## add extra space to right margin of plot within frame
par(mar=c(5, 4, 4, 6) + 0.1)

## Allow a second plot on the same graph
# par(new=TRUE)
new_cases_limits = c( min(national[national$data >= "2020-08-01", c("nuovi_positivi")]), max(national[national$data >= "2020-08-01", c("nuovi_positivi")]) )

p = plot(x = national[national$data >= "2020-08-01", c("data")], 
     y = national[national$data >= "2020-08-01", c("nuovi_positivi")], 
     type="l", lwd=6, pch=21, cex=1.5, col=c("#AA0000"),
     axes=FALSE,
     ylim=new_cases_limits,
     ylab="", xlab="")
text(x = tail(national[national$data >= "2020-08-01", c("data")], 5),
     y = tail(national[national$data >= "2020-08-01", c("nuovi_positivi")], 5),
     labels = tail(national[national$data >= "2020-08-01", c("nuovi_positivi")], 5),
     pos = 1, cex = 1, col="#AA0000")
mtext("New Cases", side=4, line=4, col="#AA0000") 
axis(4, ylim=new_cases_limits, las=1)

grid(p, col = "black", lty = "dotted")

# x-axis
dates = national[national$data >= "2020-08-01", c("data")]
axis.Date(1, at=seq(min(dates), max(dates), by="week"), format="%b %d", las=2)
mtext("Day", side=1, line=2.5)

## Add Legend
legend("topleft", legend = c("Tests", "New Cases"),
       text.col = c("#3B3176", "#AA0000"), pch= c(15, 17), col=c("#3B3176", "#AA0000"))

new_cases_italia.png

New Cases Tested

plot(x = national[national$data >= "2020-08-01", c("data")], 
     y = national[national$data >= "2020-08-01", c("nuovi_casi_testati")], 
     type="l", lwd=6, pch=16, cex=2.5, col=c("#3B3176"))
text(x = tail(national[national$data >= "2020-08-01", c("data")], 1),
     y = tail(national[national$data >= "2020-08-01", c("nuovi_casi_testati")], 1),
     labels = tail(national[national$data >= "2020-08-01", c("nuovi_casi_testati")], 1),
     pos = 4, cex = 1.2, col=c("#3B3176"))
 grid(col="black")

tests_italia.png

Number of Tests and New Cases Tested

Plot new cases and tests together. (Solution taken from How can I plot with 2 different y-axes? on Stack Overflow.)

## add extra space to right margin of plot within frame
par(mar=c(5, 4, 4, 6) + 0.1)

## Plot first set of data and draw its axis
tests_limits = c( min(national[national$data >= "2020-08-01", c("nuovi_casi_testati")]), max(national[national$data >= "2020-08-01", c("nuovi_casi_testati")]) )
plot(x = national[national$data >= "2020-08-01", c("data")], 
     y = national[national$data >= "2020-08-01", c("nuovi_casi_testati")], 
     type="l", lwd=6, pch=11, cex=1.5, col=c("#3B3176"),
     axes=FALSE,
     ylim=tests_limits,
     ylab="", xlab="")
text(x = tail(national[national$data >= "2020-08-01", c("data")], 1),
     y = tail(national[national$data >= "2020-08-01", c("nuovi_casi_testati")], 1),
     labels = tail(national[national$data >= "2020-08-01", c("nuovi_casi_testati")], 1),
     pos = 4, cex = 1, col=c("#3B3176"))
mtext("Number of Tests", side=2, col="#3B3176", line=4) 
axis(2, ylim=tests_limits, col="black", las=1)  
box()

## Allow a second plot on the same graph
par(new=TRUE)
new_cases_limits = c( min(national[national$data >= "2020-08-01", c("nuovi_positivi")]), max(national[national$data >= "2020-08-01", c("nuovi_positivi")]) )

p = plot(x = national[national$data >= "2020-08-01", c("data")], 
     y = national[national$data >= "2020-08-01", c("nuovi_positivi")], 
     type="l", lwd=6, pch=21, cex=1.5, col=c("#AA0000"),
     axes=FALSE,
     ylim=new_cases_limits,
     ylab="", xlab="")
text(x = tail(national[national$data >= "2020-08-01", c("data")], 1),
     y = tail(national[national$data >= "2020-08-01", c("nuovi_positivi")], 1),
     labels = tail(national[national$data >= "2020-08-01", c("nuovi_positivi")], 1),
     pos = 4, cex = 1, col="#AA0000")
mtext("New Cases", side=4, line=4, col="#AA0000") 
axis(4, ylim=new_cases_limits, las=1)

grid(p, col = "black", lty = "dotted")

# x-axis
dates = national[national$data >= "2020-08-01", c("data")]
axis.Date(1, at=seq(min(dates), max(dates), by="week"), format="%b %d", las=2)
mtext("Day", side=1, line=2.5)

## Add Legend
legend("topleft", legend = c("Tests", "New Cases"),
       text.col = c("#3B3176", "#AA0000"), pch= c(15, 17), col=c("#3B3176", "#AA0000"))

tests_and_new_cases_italia.png

Positive/Number of Tests

Here we plot the number of positive people over tests performed. The standard measurement is the ratio between positive and tests performed (shown in blue). The way I understand it is that this number also includes tests performed on people already diagnosed and recovered.

The second graph, in red, shows the ration of positive over new people tested, that is, of all the people not yet diagnosed, how many resulted positive?

plot(national$p_over_t ~ national$data, type="o", lwd=3, pch=21, col="#ff0000", main="Positive over Tests", xlab="Date", ylab="Percentage")
text(y = tail(national, 1)$p_over_t, x = tail(national, 1)$data, lab = paste(tail(national, 1)$p_over_t, "%", sep=""), pos=4, col="#ff0000", cex=1.3)

# Second plot with Positive over tests
p = lines(national$p_tamponi_over_t ~ national$data, type="o", lwd=3, pch=21, col="#000088", xlab="Date", ylab="Percentage")
text(y = tail(national, 1)$p_tamponi_over_t, x = tail(national, 1)$data, lab = paste(tail(national, 1)$p_tamponi_over_t, "%", sep=""), pos=4, col="#000088", cex=1.3)

## Add Legend
grid(col="black")
legend("bottomleft", legend = c("Positive over new People Tested", "Positive over Tests Performed"),
       text.col = c("#ff0000", "#000088"), pch= c(15, 17), col=c("#AA0000", "#000088"))

positive_over_tests_italia.png

People Tested and Cases in Trentino

region <- subset(data, denominazione_regione == "P.A. Trento")

region$nuovi_casi_testati = c(NA, diff(region$casi_testati, 1))

region$p_over_t <- round(region$nuovi_positivi / region$nuovi_casi_testati, digits = DIGITS) * 100
region$nuovi_casi_testati_2 = c(NA, NA, diff(region$casi_testati, 2))
region$p_over_t_2 = round(region$nuovi_positivi / region$nuovi_casi_testati_2, digits = DIGITS) * 100
region$nuovi_casi_testati_2 <- c(NA, NA, head(region$nuovi_casi_testati, -2))
region$p_over_t_2 = round(region$nuovi_positivi / region$nuovi_casi_testati_2, digits = DIGITS) * 100

region$nuovi_tamponi = c(NA, diff(region$tamponi, 1))
region$p_tamponi_over_t <- round(region$nuovi_positivi / region$nuovi_tamponi, digits = DIGITS) * 100
region$nuovi_tamponi_2 <- c(NA, NA, head(region$tamponi_2, -2))
region$p_tamponi_over_t_2 = round(region$nuovi_positivi / region$nuovi_tamponi_2, digits = DIGITS) * 100

table_data(region, cols)
x
org_babel_R_eoe

People Tested and Cases in Liguria

region <- subset(data, denominazione_regione == "Liguria")

region$nuovi_casi_testati = c(NA, diff(region$casi_testati, 1))

region$p_over_t <- round(region$nuovi_positivi / region$nuovi_casi_testati, digits = DIGITS) * 100
region$nuovi_casi_testati_2 = c(NA, NA, diff(region$casi_testati, 2))

region$nuovi_tamponi = c(NA, diff(region$tamponi, 1))
region$p_tamponi_over_t <- round(region$nuovi_positivi / region$nuovi_tamponi, digits = DIGITS) * 100

table_data(region, cols)
Label Sat, May 22 Sun, May 23 Mon, May 24 Tue, May 25 Wed, May 26 Thu, May 27 Fri, May 28 Sat, May 29 Sun, May 30 Mon, May 31
nuovi_positivi 60 58 61 47 74 49 67 82 41 43
nuovi_tamponi 5498 3614 3384 5965 6125 5694 5753 6531 3746 2676
nuovi_casi_testati 1711 1073 1224 1891 1978 1866 1929 2224 1400 1056
p_tamponi_over_t 1.09 1.6 1.8 0.79 1.21 0.86 1.16 1.26 1.09 1.61
p_over_t 3.51 5.41 4.98 2.49 3.74 2.63 3.47 3.69 2.93 4.07

People Tested and Cases in Veneto

region <- subset(data, denominazione_regione == "Veneto")

region$nuovi_casi_testati = c(NA, diff(region$casi_testati, 1))
region$p_over_t <- round(region$nuovi_positivi / region$nuovi_casi_testati, digits = DIGITS) * 100

region$nuovi_tamponi = c(NA, diff(region$tamponi, 1))
region$p_tamponi_over_t <- round(region$nuovi_positivi / region$nuovi_tamponi, digits = DIGITS) * 100

table_data(region, cols)
Label Sat, May 22 Sun, May 23 Mon, May 24 Tue, May 25 Wed, May 26 Thu, May 27 Fri, May 28 Sat, May 29 Sun, May 30 Mon, May 31
nuovi_positivi 234 133 150 159 255 242 189 198 150 60
nuovi_tamponi 35639 17323 10429 34431 34754 29818 27677 31000 15817 6495
nuovi_casi_testati 3927 1817 1281 2594 4213 2700 2340 2751 2103 1249
p_tamponi_over_t 0.66 0.77 1.44 0.46 0.73 0.81 0.68 0.64 0.95 0.92
p_over_t 5.96 7.32 11.71 6.13 6.05 8.96 8.08 7.2 7.13 4.8

People Tested and Cases in Lombardia

region <- subset(data, denominazione_regione == "Lombardia")

region$nuovi_casi_testati = c(NA, diff(region$casi_testati, 1))
region$p_over_t <- round(region$nuovi_positivi / region$nuovi_casi_testati, digits = DIGITS) * 100

region$nuovi_tamponi = c(NA, diff(region$tamponi, 1))
region$p_tamponi_over_t <- round(region$nuovi_positivi / region$nuovi_tamponi, digits = DIGITS) * 100

table_data(region, cols)
Label Sat, May 22 Sun, May 23 Mon, May 24 Tue, May 25 Wed, May 26 Thu, May 27 Fri, May 28 Sat, May 29 Sun, May 30 Mon, May 31
nuovi_positivi 828 711 249 505 666 739 661 620 458 132
nuovi_tamponi 47376 32977 13519 32446 42222 44390 45540 44888 28195 8661
nuovi_casi_testati 15237 12631 6974 11504 13134 14314 14588 14807 12140 5626
p_tamponi_over_t 1.75 2.16 1.84 1.56 1.58 1.66 1.45 1.38 1.62 1.52
p_over_t 5.43 5.63 3.57 4.39 5.07 5.16 4.53 4.19 3.77 2.35