COVID-19 Number of Tests in Italy

Table of Contents

Menu

Introduction

This page presents some data about the number of tests and people tested for COVID-19 over time in Italy and compares them with the number of people found positive.

This page was created on <2020-08-20 Thu> and last updated on <2021-01-03 Sun>.

The source code available on the COVID-19 pages is distributed under the MIT License; the content is distributed under a Creative Commons - Attribution 4.0.

Getting data into R

We first read the data from the Civil Protection repository adding the ratio between positives and tests, computed on the same day and computed with data shifted by two days (on the assumption tests take two days to complete).

In fact data about tests is used with different semantics by different regions. Some regions reports tests with results (and the ratio new positives / tests makes sense). Other reports the number of test performed, in which case the correct ratio is between positives and tests performed some days earlier. We assume two days and report both ratios for all regions. See the following issue on GitHub for an explanation and some more details https://github.com/pcm-dpc/COVID-19/issues/577 (in Italian).

PATH="./data"
DIGITS = 4

national = read.csv(file.path(PATH, "dpc-covid19-ita-andamento-nazionale.csv"))
national$data <- as.Date(national$data)

national$nuovi_casi_testati = c(NA, diff(national$casi_testati, 1))
national$p_over_t <- round(national$nuovi_positivi / national$nuovi_casi_testati, digits = DIGITS) * 100

national$nuovi_tamponi = c(NA, diff(national$tamponi, 1))
national$p_tamponi_over_t <- round(national$nuovi_positivi / national$nuovi_tamponi, digits = DIGITS) * 100

# national$nuovi_casi_testati_2 <- c(NA, NA, head(national$nuovi_casi_testati, -2))
# national$p_over_t_2 = round(national$nuovi_positivi / national$nuovi_casi_testati_2, digits = DIGITS) * 100

# national$nuovi_tamponi_2 <- c(NA, NA, head(national$tamponi_2, -2))
# national$p_tamponi_over_t_2 = round(national$nuovi_positivi / national$nuovi_tamponi_2, digits = DIGITS) * 100

Concerning the regional level, computed columns, such as the number of people tested in a day, have to be computed after filtering, or the diif will work on values from different regions.

# evolution over time, by Region
data = read.csv(file.path(PATH, "dpc-covid19-ita-regioni.csv"))
data$data <- as.Date(data$data)

These are the columns we are interested in and their translation in English:

cols = c(
  "data",
  "nuovi_positivi",
  "nuovi_tamponi",
  "nuovi_casi_testati",
  "p_tamponi_over_t",
  "p_over_t"
)

We now define a function to ouput the last N rows of the input data frame. The real “challenge”, here, is transposing the data, to get a more natural presentation (with time progressing from left to right).

table_data <- function(df, cols, rows = 10) {
  # get the last 10 elements and the interesting columns of the dataframe
  f  <- tail(df, rows)
  rf <- f[, cols]

  # the labels in the transposed matrix are the column names of the original data.frame
  row_labels  <- colnames(rf)
  # the columns in the trasposed matrix are the dates
  col_labels  <- c("Label", format(rf$data, "%a, %b %d"))

  rft <- data.frame(row_labels, t(rf))
  colnames(rft) <- col_labels
  return(rft[-1,])
}

People Tested and Cases in Italy

Data of the last ten days

table_data(national, cols)
Label Mon, Jan 04 Tue, Jan 05 Wed, Jan 06 Thu, Jan 07 Fri, Jan 08 Sat, Jan 09 Sun, Jan 10 Mon, Jan 11 Tue, Jan 12 Wed, Jan 13
nuovi_positivi 10800 15378 20331 18020 17533 19978 18627 12532 14242 15774
nuovi_tamponi 77993 135106 178596 121275 140267 172119 139758 91656 141641 175429
nuovi_casi_testati 35417 54512 75719 53423 56858 66417 63105 42553 53604 69626
p_tamponi_over_t 13.85 11.38 11.38 14.86 12.5 11.61 13.33 13.67 10.05 8.99
p_over_t 30.49 28.21 26.85 33.73 30.84 30.08 29.52 29.45 26.57 22.66

New Cases

New cases.

## add extra space to right margin of plot within frame
par(mar=c(5, 4, 4, 6) + 0.1)

## Allow a second plot on the same graph
# par(new=TRUE)
new_cases_limits = c( min(national[national$data >= "2020-08-01", c("nuovi_positivi")]), max(national[national$data >= "2020-08-01", c("nuovi_positivi")]) )

p = plot(x = national[national$data >= "2020-08-01", c("data")], 
     y = national[national$data >= "2020-08-01", c("nuovi_positivi")], 
     type="l", lwd=6, pch=21, cex=1.5, col=c("#AA0000"),
     axes=FALSE,
     ylim=new_cases_limits,
     ylab="", xlab="")
text(x = tail(national[national$data >= "2020-08-01", c("data")], 5),
     y = tail(national[national$data >= "2020-08-01", c("nuovi_positivi")], 5),
     labels = tail(national[national$data >= "2020-08-01", c("nuovi_positivi")], 5),
     pos = 1, cex = 1, col="#AA0000")
mtext("New Cases", side=4, line=4, col="#AA0000") 
axis(4, ylim=new_cases_limits, las=1)

grid(p, col = "black", lty = "dotted")

# x-axis
dates = national[national$data >= "2020-08-01", c("data")]
axis.Date(1, at=seq(min(dates), max(dates), by="week"), format="%b %d", las=2)
mtext("Day", side=1, line=2.5)

## Add Legend
legend("topleft", legend = c("Tests", "New Cases"),
       text.col = c("#3B3176", "#AA0000"), pch= c(15, 17), col=c("#3B3176", "#AA0000"))

new_cases_italia.png

New Cases Tested

plot(x = national[national$data >= "2020-08-01", c("data")], 
     y = national[national$data >= "2020-08-01", c("nuovi_casi_testati")], 
     type="l", lwd=6, pch=16, cex=2.5, col=c("#3B3176"))
text(x = tail(national[national$data >= "2020-08-01", c("data")], 1),
     y = tail(national[national$data >= "2020-08-01", c("nuovi_casi_testati")], 1),
     labels = tail(national[national$data >= "2020-08-01", c("nuovi_casi_testati")], 1),
     pos = 4, cex = 1.2, col=c("#3B3176"))
 grid(col="black")

tests_italia.png

Number of Tests and New Cases Tested

Plot new cases and tests together. (Solution taken from How can I plot with 2 different y-axes? on Stack Overflow.)

## add extra space to right margin of plot within frame
par(mar=c(5, 4, 4, 6) + 0.1)

## Plot first set of data and draw its axis
tests_limits = c( min(national[national$data >= "2020-08-01", c("nuovi_casi_testati")]), max(national[national$data >= "2020-08-01", c("nuovi_casi_testati")]) )
plot(x = national[national$data >= "2020-08-01", c("data")], 
     y = national[national$data >= "2020-08-01", c("nuovi_casi_testati")], 
     type="l", lwd=6, pch=11, cex=1.5, col=c("#3B3176"),
     axes=FALSE,
     ylim=tests_limits,
     ylab="", xlab="")
text(x = tail(national[national$data >= "2020-08-01", c("data")], 1),
     y = tail(national[national$data >= "2020-08-01", c("nuovi_casi_testati")], 1),
     labels = tail(national[national$data >= "2020-08-01", c("nuovi_casi_testati")], 1),
     pos = 4, cex = 1, col=c("#3B3176"))
mtext("Number of Tests", side=2, col="#3B3176", line=4) 
axis(2, ylim=tests_limits, col="black", las=1)  
box()

## Allow a second plot on the same graph
par(new=TRUE)
new_cases_limits = c( min(national[national$data >= "2020-08-01", c("nuovi_positivi")]), max(national[national$data >= "2020-08-01", c("nuovi_positivi")]) )

p = plot(x = national[national$data >= "2020-08-01", c("data")], 
     y = national[national$data >= "2020-08-01", c("nuovi_positivi")], 
     type="l", lwd=6, pch=21, cex=1.5, col=c("#AA0000"),
     axes=FALSE,
     ylim=new_cases_limits,
     ylab="", xlab="")
text(x = tail(national[national$data >= "2020-08-01", c("data")], 1),
     y = tail(national[national$data >= "2020-08-01", c("nuovi_positivi")], 1),
     labels = tail(national[national$data >= "2020-08-01", c("nuovi_positivi")], 1),
     pos = 4, cex = 1, col="#AA0000")
mtext("New Cases", side=4, line=4, col="#AA0000") 
axis(4, ylim=new_cases_limits, las=1)

grid(p, col = "black", lty = "dotted")

# x-axis
dates = national[national$data >= "2020-08-01", c("data")]
axis.Date(1, at=seq(min(dates), max(dates), by="week"), format="%b %d", las=2)
mtext("Day", side=1, line=2.5)

## Add Legend
legend("topleft", legend = c("Tests", "New Cases"),
       text.col = c("#3B3176", "#AA0000"), pch= c(15, 17), col=c("#3B3176", "#AA0000"))

tests_and_new_cases_italia.png

Positive/Number of Tests

Here we plot the number of positive people over tests performed. The standard measurement is the ratio between positive and tests performed (shown in blue). The way I understand it is that this number also includes tests performed on people already diagnosed and recovered.

The second graph, in red, shows the ration of positive over new people tested, that is, of all the people not yet diagnosed, how many resulted positive?

plot(national$p_over_t ~ national$data, type="o", lwd=3, pch=21, col="#ff0000", main="Positive over Tests", xlab="Date", ylab="Percentage")
text(y = tail(national, 1)$p_over_t, x = tail(national, 1)$data, lab = paste(tail(national, 1)$p_over_t, "%", sep=""), pos=4, col="#ff0000", cex=1.3)

# Second plot with Positive over tests
p = lines(national$p_tamponi_over_t ~ national$data, type="o", lwd=3, pch=21, col="#000088", xlab="Date", ylab="Percentage")
text(y = tail(national, 1)$p_tamponi_over_t, x = tail(national, 1)$data, lab = paste(tail(national, 1)$p_tamponi_over_t, "%", sep=""), pos=4, col="#000088", cex=1.3)

## Add Legend
grid(col="black")
legend("bottomleft", legend = c("Positive over new People Tested", "Positive over Tests Performed"),
       text.col = c("#ff0000", "#000088"), pch= c(15, 17), col=c("#AA0000", "#000088"))

positive_over_tests_italia.png

People Tested and Cases in Trentino

region <- subset(data, denominazione_regione == "P.A. Trento")

region$nuovi_casi_testati = c(NA, diff(region$casi_testati, 1))

region$p_over_t <- round(region$nuovi_positivi / region$nuovi_casi_testati, digits = DIGITS) * 100
region$nuovi_casi_testati_2 = c(NA, NA, diff(region$casi_testati, 2))
region$p_over_t_2 = round(region$nuovi_positivi / region$nuovi_casi_testati_2, digits = DIGITS) * 100
region$nuovi_casi_testati_2 <- c(NA, NA, head(region$nuovi_casi_testati, -2))
region$p_over_t_2 = round(region$nuovi_positivi / region$nuovi_casi_testati_2, digits = DIGITS) * 100

region$nuovi_tamponi = c(NA, diff(region$tamponi, 1))
region$p_tamponi_over_t <- round(region$nuovi_positivi / region$nuovi_tamponi, digits = DIGITS) * 100
region$nuovi_tamponi_2 <- c(NA, NA, head(region$tamponi_2, -2))
region$p_tamponi_over_t_2 = round(region$nuovi_positivi / region$nuovi_tamponi_2, digits = DIGITS) * 100

table_data(region, cols)
x
org_babel_R_eoe

People Tested and Cases in Liguria

region <- subset(data, denominazione_regione == "Liguria")

region$nuovi_casi_testati = c(NA, diff(region$casi_testati, 1))

region$p_over_t <- round(region$nuovi_positivi / region$nuovi_casi_testati, digits = DIGITS) * 100
region$nuovi_casi_testati_2 = c(NA, NA, diff(region$casi_testati, 2))

region$nuovi_tamponi = c(NA, diff(region$tamponi, 1))
region$p_tamponi_over_t <- round(region$nuovi_positivi / region$nuovi_tamponi, digits = DIGITS) * 100

table_data(region, cols)
Label Mon, Jan 04 Tue, Jan 05 Wed, Jan 06 Thu, Jan 07 Fri, Jan 08 Sat, Jan 09 Sun, Jan 10 Mon, Jan 11 Tue, Jan 12 Wed, Jan 13
nuovi_positivi 204 404 365 196 400 526 374 226 276 395
nuovi_tamponi 2007 4586 4586 1642 4563 4519 2877 2085 5060 4442
nuovi_casi_testati 653 1364 1280 529 1432 1403 1072 898 1521 1420
p_tamponi_over_t 10.16 8.81 7.96 11.94 8.77 11.64 13.0 10.84 5.45 8.89
p_over_t 31.24 29.62 28.52 37.05 27.93 37.49 34.89 25.17 18.15 27.82

People Tested and Cases in Veneto

region <- subset(data, denominazione_regione == "Veneto")

region$nuovi_casi_testati = c(NA, diff(region$casi_testati, 1))
region$p_over_t <- round(region$nuovi_positivi / region$nuovi_casi_testati, digits = DIGITS) * 100

region$nuovi_tamponi = c(NA, diff(region$tamponi, 1))
region$p_tamponi_over_t <- round(region$nuovi_positivi / region$nuovi_tamponi, digits = DIGITS) * 100

table_data(region, cols)
Label Mon, Jan 04 Tue, Jan 05 Wed, Jan 06 Thu, Jan 07 Fri, Jan 08 Sat, Jan 09 Sun, Jan 10 Mon, Jan 11 Tue, Jan 12 Wed, Jan 13
nuovi_positivi 1682 3151 3638 3596 3388 3100 2167 1715 2134 1884
nuovi_tamponi 8134 17734 21386 17243 17722 20779 14586 11001 15867 19063
nuovi_casi_testati 2081 4339 5223 5163 4683 5188 3820 2784 4032 3747
p_tamponi_over_t 20.68 17.77 17.01 20.85 19.12 14.92 14.86 15.59 13.45 9.88
p_over_t 80.83 72.62 69.65 69.65 72.35 59.75 56.73 61.6 52.93 50.28

People Tested and Cases in Lombardia

region <- subset(data, denominazione_regione == "Lombardia")

region$nuovi_casi_testati = c(NA, diff(region$casi_testati, 1))
region$p_over_t <- round(region$nuovi_positivi / region$nuovi_casi_testati, digits = DIGITS) * 100

region$nuovi_tamponi = c(NA, diff(region$tamponi, 1))
region$p_tamponi_over_t <- round(region$nuovi_positivi / region$nuovi_tamponi, digits = DIGITS) * 100

table_data(region, cols)
Label Mon, Jan 04 Tue, Jan 05 Wed, Jan 06 Thu, Jan 07 Fri, Jan 08 Sat, Jan 09 Sun, Jan 10 Mon, Jan 11 Tue, Jan 12 Wed, Jan 13
nuovi_positivi 863 1338 2952 2799 1963 2506 3267 1488 1146 2245
nuovi_tamponi 8161 12790 28462 20331 18415 24847 25011 13356 15964 31880
nuovi_casi_testati 3029 4611 8833 7522 6411 8061 9791 5224 5384 9584
p_tamponi_over_t 10.57 10.46 10.37 13.77 10.66 10.09 13.06 11.14 7.18 7.04
p_over_t 28.49 29.02 33.42 37.21 30.62 31.09 33.37 28.48 21.29 23.42

Author: Adolfo Villafiorita

Last modified: 2021-01-03 Sun 11:11 (created on: 2020-08-20 Thu 00:00)

Published: 2021-01-13 Wed 18:50