COVID-19 Number of Tests in Italy

Introduction
Getting data into R
People Tested and Cases in Italy
People Tested and Cases in Trentino
People Tested and Cases in Liguria
People Tested and Cases in Veneto
People Tested and Cases in Lombardia

After two years, I finally decided to stop updating this page, and evaluation of all code blocks has been disabled. The data was last updated on October 28, 2022.

Introduction

This page presents some data about the number of tests and people tested for COVID-19 over time in Italy and compares them with the number of people found positive.

This page was created on <2020-08-20 Thu> and last updated on <2022-10-29 Sat>.

The source code available on the COVID-19 pages is distributed under the MIT License; the content is distributed under a Creative Commons Attribution 4.0 license.

Getting data into R

We first read the data from the Civil Protection repository and add the ratio between positives and tests, computed both on same-day data and on test data shifted by two days (on the assumption that tests take two days to complete).

In fact, different regions report test data with different semantics. Some regions report tests with results, in which case the same-day ratio new positives / tests makes sense. Others report the number of tests performed, in which case the correct ratio is between positives and the tests performed some days earlier. We assume two days and report both ratios for all regions. See the following issue on GitHub for an explanation and some more details (in Italian): https://github.com/pcm-dpc/COVID-19/issues/577

DIGITS = 4

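# PATH is not defined on this page: it must hold the directory containing a local copy of the CSV files
national = read.csv(file.path(PATH, "dpc-covid19-ita-andamento-nazionale.csv"))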
national$data <- as.Date(national$data)

national$nuovi_casi_testati = c(NA, diff(national$casi_testati, 1))
national$p_over_t <- round(national$nuovi_positivi / national$nuovi_casi_testati, digits = DIGITS) * 100

national$nuovi_tamponi = c(NA, diff(national$tamponi, 1))
national$p_tamponi_over_t <- round(national$nuovi_positivi / national$nuovi_tamponi, digits = DIGITS) * 100

# national$nuovi_casi_testati_2 <- c(NA, NA, head(national$nuovi_casi_testati, -2))
# national$p_over_t_2 = round(national$nuovi_positivi / national$nuovi_casi_testati_2, digits = DIGITS) * 100

# national$nuovi_tamponi_2 <- c(NA, NA, head(national$nuovi_tamponi, -2))
# national$p_tamponi_over_t_2 = round(national$nuovi_positivi / national$nuovi_tamponi_2, digits = DIGITS) * 100
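The two ratios described above differ only in which day's tests end up in the denominator. The following is a minimal sketch on toy vectors (the values are made up for illustration and are not taken from the Civil Protection data); it reuses the same diff and head idioms as the code above.

```r
# Toy cumulative counts (hypothetical values, not real data)
tamponi        <- c(100, 250, 450, 700, 1000)   # cumulative tests
nuovi_positivi <- c(NA, 15, 30, 50, 60)         # daily new positives

# Daily increments: the first day has no previous value, hence NA
nuovi_tamponi <- c(NA, diff(tamponi, 1))
# → NA 150 200 250 300

# Shift the daily series forward by two days, so that day i is paired
# with the tests performed on day i - 2
nuovi_tamponi_2 <- c(NA, NA, head(nuovi_tamponi, -2))
# → NA NA NA 150 200

# Same-day ratio and two-day-shifted ratio, both as percentages
same_day <- round(nuovi_positivi / nuovi_tamponi, digits = 4) * 100
shifted  <- round(nuovi_positivi / nuovi_tamponi_2, digits = 4) * 100
```

Note that the shift costs two more days of data at the start of the series: the first three entries of the shifted ratio are NA.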

At the regional level, derived columns such as the number of people tested in a day have to be computed after filtering by region; otherwise diff will work across values belonging to different regions.

# evolution over time, by Region
data = read.csv(file.path(PATH, "dpc-covid19-ita-regioni.csv"))
data$data <- as.Date(data$data)
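To see why the order of filtering and differencing matters, here is a small sketch on a toy data frame with two made-up regions, "A" and "B", interleaved by date as in the regional file. The values are hypothetical; the last step, based on ave, is an alternative that this page does not use and is shown only for comparison.

```r
# Toy long-format data: two regions interleaved by date (hypothetical values)
toy <- data.frame(
  data = rep(as.Date("2022-10-26") + 0:2, each = 2),
  denominazione_regione = rep(c("A", "B"), times = 3),
  tamponi = c(100, 1000, 130, 1200, 170, 1500)
)

# Wrong: diff over the unfiltered column mixes values of different regions
wrong <- c(NA, diff(toy$tamponi, 1))
# → NA 900 -870 1070 -1030 1330

# Right: filter first, then difference within the region
region_a <- subset(toy, denominazione_regione == "A")
region_a$nuovi_tamponi <- c(NA, diff(region_a$tamponi, 1))
# → NA 30 40

# Alternative: difference within each region without filtering
toy$nuovi_tamponi <- ave(toy$tamponi, toy$denominazione_regione,
                         FUN = function(x) c(NA, diff(x, 1)))
# → NA NA 30 200 40 300
```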

These are the columns we are interested in and their translation in English:

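cols = c(
  "data",                # date
  "nuovi_positivi",      # new positives
  "nuovi_tamponi",       # new tests (swabs)
  "nuovi_casi_testati",  # new people tested
  "p_tamponi_over_t",    # positives over tests performed (%)
  "p_over_t"             # positives over new people tested (%)
)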

We now define a function to output the last N rows of the input data frame. The real “challenge” here is transposing the data to get a more natural presentation (with time progressing from left to right).

table_data <- function(df, cols, rows = 10) {
  # get the last `rows` rows (10 by default) and the interesting columns of the dataframe
  f  <- tail(df, rows)
  rf <- f[, cols]

  # the labels in the transposed matrix are the column names of the original data.frame
  row_labels  <- colnames(rf)
  # the columns in the transposed matrix are the dates
  col_labels  <- c("Label", format(rf$data, "%a, %b %d"))

  rft <- data.frame(row_labels, t(rf))
  colnames(rft) <- col_labels
  # drop the first row: it holds the dates, already used as column labels
  return(rft[-1,])
}
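As a quick illustration of the transposition, the sketch below applies table_data to a toy data frame. The function body is copied from above so that the block runs on its own; the input values are made up.

```r
# Copy of table_data from above, so that the sketch runs on its own
table_data <- function(df, cols, rows = 10) {
  f  <- tail(df, rows)
  rf <- f[, cols]
  row_labels <- colnames(rf)
  col_labels <- c("Label", format(rf$data, "%a, %b %d"))
  rft <- data.frame(row_labels, t(rf))
  colnames(rft) <- col_labels
  return(rft[-1,])
}

# Toy input (hypothetical values): five days, two measures
toy <- data.frame(
  data = as.Date("2022-10-24") + 0:4,
  nuovi_positivi = c(10, 20, 30, 40, 50),
  nuovi_tamponi  = c(100, 200, 300, 400, 500)
)

# Keep the last three days only
out <- table_data(toy, c("data", "nuovi_positivi", "nuovi_tamponi"), rows = 3)
dim(out)    # 2 rows (one per measure), 4 columns (Label + 3 dates)
out$Label   # "nuovi_positivi" "nuovi_tamponi"
```

Because t() turns a data frame with mixed column types into a character matrix, the values in the result are strings: fine for display, but not suitable for further computation.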

People Tested and Cases in Italy

Data of the last ten days

table_data(national, cols)

Label	Wed, Oct 19	Thu, Oct 20	Fri, Oct 21	Sat, Oct 22	Sun, Oct 23	Mon, Oct 24	Tue, Oct 25	Wed, Oct 26	Thu, Oct 27	Fri, Oct 28
nuovi_positivi	41712	40563	36116	31775	25554	11606	48714	35043	31760	29040
nuovi_tamponi	233084	229140	213088	195575	161787	80319	297268	216735	205738	182614
nuovi_casi_testati	40696	40632	35965	33864	28465	15254	48906	38124	35350	33215
p_tamponi_over_t	17.9	17.7	16.95	16.25	15.79	14.45	16.39	16.17	15.44	15.9
p_over_t	102.5	99.83	100.42	93.83	89.77	76.08	99.61	91.92	89.84	87.43

New Cases

Daily new cases since August 1, 2020. The last five values are labelled on the curve.

## add extra space to right margin of plot within frame
par(mar=c(5, 4, 4, 6) + 0.1)

## Allow a second plot on the same graph
# par(new=TRUE)
new_cases_limits = c( min(national[national$data >= "2020-08-01", c("nuovi_positivi")]), max(national[national$data >= "2020-08-01", c("nuovi_positivi")]) )

p = plot(x = national[national$data >= "2020-08-01", c("data")], 
     y = national[national$data >= "2020-08-01", c("nuovi_positivi")], 
     type="l", lwd=6, pch=21, cex=1.5, col=c("#AA0000"),
     axes=FALSE,
     ylim=new_cases_limits,
     ylab="", xlab="")
text(x = tail(national[national$data >= "2020-08-01", c("data")], 5),
     y = tail(national[national$data >= "2020-08-01", c("nuovi_positivi")], 5),
     labels = tail(national[national$data >= "2020-08-01", c("nuovi_positivi")], 5),
     pos = 1, cex = 1, col="#AA0000")
mtext("New Cases", side=4, line=4, col="#AA0000") 
axis(4, ylim=new_cases_limits, las=1)

grid(p, col = "black", lty = "dotted")

# x-axis
dates = national[national$data >= "2020-08-01", c("data")]
axis.Date(1, at=seq(min(dates), max(dates), by="week"), format="%b %d", las=2)
mtext("Day", side=1, line=2.5)

## Add Legend (this plot shows new cases only)
legend("topleft", legend = c("New Cases"),
       text.col = c("#AA0000"), pch = c(17), col = c("#AA0000"))

New Cases Tested

plot(x = national[national$data >= "2020-08-01", c("data")], 
     y = national[national$data >= "2020-08-01", c("nuovi_casi_testati")], 
     type="l", lwd=6, pch=16, cex=2.5, col=c("#3B3176"))
text(x = tail(national[national$data >= "2020-08-01", c("data")], 1),
     y = tail(national[national$data >= "2020-08-01", c("nuovi_casi_testati")], 1),
     labels = tail(national[national$data >= "2020-08-01", c("nuovi_casi_testati")], 1),
     pos = 4, cex = 1.2, col=c("#3B3176"))
grid(col="black")

Number of Tests and New Cases Tested

Plot new cases and tests together. (Solution taken from "How can I plot with 2 different y-axes?" on Stack Overflow.)

## add extra space to right margin of plot within frame
par(mar=c(5, 4, 4, 6) + 0.1)

## Plot first set of data and draw its axis
tests_limits = c( min(national[national$data >= "2020-08-01", c("nuovi_casi_testati")]), max(national[national$data >= "2020-08-01", c("nuovi_casi_testati")]) )
plot(x = national[national$data >= "2020-08-01", c("data")], 
     y = national[national$data >= "2020-08-01", c("nuovi_casi_testati")], 
     type="l", lwd=6, pch=11, cex=1.5, col=c("#3B3176"),
     axes=FALSE,
     ylim=tests_limits,
     ylab="", xlab="")
text(x = tail(national[national$data >= "2020-08-01", c("data")], 1),
     y = tail(national[national$data >= "2020-08-01", c("nuovi_casi_testati")], 1),
     labels = tail(national[national$data >= "2020-08-01", c("nuovi_casi_testati")], 1),
     pos = 4, cex = 1, col=c("#3B3176"))
mtext("Number of Tests", side=2, col="#3B3176", line=4) 
axis(2, ylim=tests_limits, col="black", las=1)  
box()

## Allow a second plot on the same graph
par(new=TRUE)
new_cases_limits = c( min(national[national$data >= "2020-08-01", c("nuovi_positivi")]), max(national[national$data >= "2020-08-01", c("nuovi_positivi")]) )

p = plot(x = national[national$data >= "2020-08-01", c("data")], 
     y = national[national$data >= "2020-08-01", c("nuovi_positivi")], 
     type="l", lwd=6, pch=21, cex=1.5, col=c("#AA0000"),
     axes=FALSE,
     ylim=new_cases_limits,
     ylab="", xlab="")
text(x = tail(national[national$data >= "2020-08-01", c("data")], 1),
     y = tail(national[national$data >= "2020-08-01", c("nuovi_positivi")], 1),
     labels = tail(national[national$data >= "2020-08-01", c("nuovi_positivi")], 1),
     pos = 4, cex = 1, col="#AA0000")
mtext("New Cases", side=4, line=4, col="#AA0000") 
axis(4, ylim=new_cases_limits, las=1)

grid(p, col = "black", lty = "dotted")

# x-axis
dates = national[national$data >= "2020-08-01", c("data")]
axis.Date(1, at=seq(min(dates), max(dates), by="week"), format="%b %d", las=2)
mtext("Day", side=1, line=2.5)

## Add Legend
legend("topleft", legend = c("Tests", "New Cases"),
       text.col = c("#3B3176", "#AA0000"), pch= c(15, 17), col=c("#3B3176", "#AA0000"))
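The plotting code above repeats the same date filter in every argument. One possible simplification, sketched here on a toy stand-in for the national data frame (hypothetical values), is to subset once and let range compute the axis limits. The name recent is an assumption introduced for this example.

```r
# Toy stand-in for the `national` data frame (hypothetical values)
national <- data.frame(
  data = as.Date("2020-07-30") + 0:5,
  nuovi_positivi = c(5, 8, 13, 21, 34, 55),
  nuovi_casi_testati = c(50, 60, 70, 80, 90, 100)
)

# Subset once instead of repeating the filter in every argument
recent <- national[national$data >= as.Date("2020-08-01"), ]

# range() returns c(min, max), the form expected by ylim
tests_limits     <- range(recent$nuovi_casi_testati, na.rm = TRUE)
new_cases_limits <- range(recent$nuovi_positivi, na.rm = TRUE)
# → 70 100 and 13 55
```

With this in place, the plot calls could take recent$data and recent$nuovi_casi_testati directly, together with ylim = tests_limits.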

Positive/Number of Tests

Here we plot the percentage of positive people over tests performed. The standard measurement is the ratio between positives and tests performed (shown in blue). As I understand it, this number also includes tests performed on people already diagnosed and recovered.

The second graph, in red, shows the ratio of positives over new people tested; that is, of all the people not yet diagnosed, how many tested positive?

plot(national$p_over_t ~ national$data, type="o", lwd=3, pch=21, col="#ff0000", main="Positive over Tests", xlab="Date", ylab="Percentage")
text(y = tail(national, 1)$p_over_t, x = tail(national, 1)$data, lab = paste(tail(national, 1)$p_over_t, "%", sep=""), pos=4, col="#ff0000", cex=1.3)

# Second plot with Positive over tests
lines(national$p_tamponi_over_t ~ national$data, type="o", lwd=3, pch=21, col="#000088")
text(y = tail(national, 1)$p_tamponi_over_t, x = tail(national, 1)$data, lab = paste(tail(national, 1)$p_tamponi_over_t, "%", sep=""), pos=4, col="#000088", cex=1.3)

## Add Legend
grid(col="black")
legend("bottomleft", legend = c("Positive over new People Tested", "Positive over Tests Performed"),
       text.col = c("#ff0000", "#000088"), pch= c(15, 17), col=c("#ff0000", "#000088"))
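As a sanity check on the two percentages, the sketch below recomputes them for Fri, Oct 28 from the values shown in the national table above, using the same rounding as the code on this page.

```r
DIGITS <- 4

# Values for Fri, Oct 28, taken from the national table above
nuovi_positivi     <- 29040
nuovi_tamponi      <- 182614
nuovi_casi_testati <- 33215

# Positives over tests performed (blue curve)
p_tamponi_over_t <- round(nuovi_positivi / nuovi_tamponi, digits = DIGITS) * 100
# Positives over new people tested (red curve)
p_over_t <- round(nuovi_positivi / nuovi_casi_testati, digits = DIGITS) * 100
# → 15.9 and 87.43
```

The second ratio is not bounded by 100%: on Wed, Oct 19, the table reports 41712 new positives against 40696 new people tested, which gives 102.5%.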

People Tested and Cases in Trentino

region <- subset(data, denominazione_regione == "P.A. Trento")

region$nuovi_casi_testati = c(NA, diff(region$casi_testati, 1))

region$p_over_t <- round(region$nuovi_positivi / region$nuovi_casi_testati, digits = DIGITS) * 100
# people tested, shifted by two days
region$nuovi_casi_testati_2 <- c(NA, NA, head(region$nuovi_casi_testati, -2))
region$p_over_t_2 = round(region$nuovi_positivi / region$nuovi_casi_testati_2, digits = DIGITS) * 100

region$nuovi_tamponi = c(NA, diff(region$tamponi, 1))
region$p_tamponi_over_t <- round(region$nuovi_positivi / region$nuovi_tamponi, digits = DIGITS) * 100
# tests performed, shifted by two days
region$nuovi_tamponi_2 <- c(NA, NA, head(region$nuovi_tamponi, -2))
region$p_tamponi_over_t_2 = round(region$nuovi_positivi / region$nuovi_tamponi_2, digits = DIGITS) * 100

table_data(region, cols)

Label	Wed, Oct 19	Thu, Oct 20	Fri, Oct 21	Sat, Oct 22	Sun, Oct 23	Mon, Oct 24	Tue, Oct 25	Wed, Oct 26	Thu, Oct 27	Fri, Oct 28
nuovi_positivi	578	545	477	400	314	119	485	386	301	269
nuovi_tamponi	2772	2798	2404	2197	1952	735	3162	2185	2065	1804
nuovi_casi_testati	279	257	223	204	134	51	214	203	167	156
p_tamponi_over_t	20.85	19.48	19.84	18.21	16.09	16.19	15.34	17.67	14.58	14.91
p_over_t	207.17	212.06	213.9	196.08	234.33	233.33	226.64	190.15	180.24	172.44
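The same derived columns are computed for every region on this page. A possible way to avoid the repetition is a small helper, sketched below; add_ratios is a hypothetical name that does not appear in the original code, and the toy input values are made up.

```r
DIGITS <- 4

# Hypothetical helper: adds the daily increments and the two percentages
# to a data frame that holds a single region, ordered by date
add_ratios <- function(region, digits = DIGITS) {
  region$nuovi_casi_testati <- c(NA, diff(region$casi_testati, 1))
  region$nuovi_tamponi      <- c(NA, diff(region$tamponi, 1))
  region$p_over_t <- round(region$nuovi_positivi / region$nuovi_casi_testati,
                           digits = digits) * 100
  region$p_tamponi_over_t <- round(region$nuovi_positivi / region$nuovi_tamponi,
                                   digits = digits) * 100
  region
}

# Toy single-region input with cumulative counts (hypothetical values)
toy <- data.frame(
  data = as.Date("2022-10-26") + 0:2,
  nuovi_positivi = c(10, 20, 30),
  casi_testati = c(1000, 1040, 1100),
  tamponi = c(5000, 5200, 5500)
)
toy <- add_ratios(toy)
# toy$p_over_t         → NA 50 50
# toy$p_tamponi_over_t → NA 10 10
```

On the real data, this would be called on the result of subset(data, denominazione_regione == ...), so that diff never crosses region boundaries.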

People Tested and Cases in Liguria

region <- subset(data, denominazione_regione == "Liguria")

region$nuovi_casi_testati = c(NA, diff(region$casi_testati, 1))

region$p_over_t <- round(region$nuovi_positivi / region$nuovi_casi_testati, digits = DIGITS) * 100
# people tested, shifted by two days
region$nuovi_casi_testati_2 <- c(NA, NA, head(region$nuovi_casi_testati, -2))

region$nuovi_tamponi = c(NA, diff(region$tamponi, 1))
region$p_tamponi_over_t <- round(region$nuovi_positivi / region$nuovi_tamponi, digits = DIGITS) * 100

table_data(region, cols)

Label	Wed, Oct 19	Thu, Oct 20	Fri, Oct 21	Sat, Oct 22	Sun, Oct 23	Mon, Oct 24	Tue, Oct 25	Wed, Oct 26	Thu, Oct 27	Fri, Oct 28
nuovi_positivi	1113	1091	896	857	657	295	1352	936	817	748
nuovi_tamponi	6312	6161	5682	5151	4444	1984	8533	5762	5631	4918
nuovi_casi_testati	816	755	647	581	520	279	1031	677	645	588
p_tamponi_over_t	17.63	17.71	15.77	16.64	14.78	14.87	15.84	16.24	14.51	15.21
p_over_t	136.4	144.5	138.49	147.5	126.35	105.73	131.13	138.26	126.67	127.21

People Tested and Cases in Veneto

region <- subset(data, denominazione_regione == "Veneto")

region$nuovi_casi_testati = c(NA, diff(region$casi_testati, 1))
region$p_over_t <- round(region$nuovi_positivi / region$nuovi_casi_testati, digits = DIGITS) * 100

region$nuovi_tamponi = c(NA, diff(region$tamponi, 1))
region$p_tamponi_over_t <- round(region$nuovi_positivi / region$nuovi_tamponi, digits = DIGITS) * 100

table_data(region, cols)

Label	Wed, Oct 19	Thu, Oct 20	Fri, Oct 21	Sat, Oct 22	Sun, Oct 23	Mon, Oct 24	Tue, Oct 25	Wed, Oct 26	Thu, Oct 27	Fri, Oct 28
nuovi_positivi	5709	5167	4677	4486	3238	1040	6363	4772	4310	3891
nuovi_tamponi	39929	34525	32292	30152	21572	7816	48143	37239	33782	30803
nuovi_casi_testati	2216	2131	1706	1719	1159	431	2696	1913	1792	1580
p_tamponi_over_t	14.3	14.97	14.48	14.88	15.01	13.31	13.22	12.81	12.76	12.63
p_over_t	257.63	242.47	274.15	260.97	279.38	241.3	236.02	249.45	240.51	246.27

People Tested and Cases in Lombardia

region <- subset(data, denominazione_regione == "Lombardia")

region$nuovi_casi_testati = c(NA, diff(region$casi_testati, 1))
region$p_over_t <- round(region$nuovi_positivi / region$nuovi_casi_testati, digits = DIGITS) * 100

region$nuovi_tamponi = c(NA, diff(region$tamponi, 1))
region$p_tamponi_over_t <- round(region$nuovi_positivi / region$nuovi_tamponi, digits = DIGITS) * 100

table_data(region, cols)

Label	Wed, Oct 19	Thu, Oct 20	Fri, Oct 21	Sat, Oct 22	Sun, Oct 23	Mon, Oct 24	Tue, Oct 25	Wed, Oct 26	Thu, Oct 27	Fri, Oct 28
nuovi_positivi	8230	7983	6803	6161	4646	1640	9979	6216	6173	5504
nuovi_tamponi	43994	42561	36207	35098	30472	12264	57217	38317	36623	32744
nuovi_casi_testati	5011	4646	4050	3792	2894	1373	5840	4235	4033	3551
p_tamponi_over_t	18.71	18.76	18.79	17.55	15.25	13.37	17.44	16.22	16.86	16.81
p_over_t	164.24	171.83	167.98	162.47	160.54	119.45	170.87	146.78	153.06	155.0