Investigating COVID-19 from the perspective of IB Math

2021-02-21 Gabe Math International Classroom (嘉博國際數學課堂)

As COVID-19 has become a worldwide pandemic, more and more people and organizations are using mathematical modelling to predict its trend. News reports such as the one below appear from time to time:

Figure 1: Media reported that math models could be used to predict the outbreak and cope with the spread of COVID-19.

Math models can therefore help governments predict outbreaks and cope with the spread of COVID-19. Mathematical modelling is covered in IB Math textbooks, and some IB students at Gabe Math have already started researching topics for their IB Math IA (note: the IB Math internal assessment accounts for 20% of the final IB Math score). They are wondering whether it is possible to write an IA about COVID-19 using math modelling.

 

To help these students, Gabe Math decided to write this article.

Figure 2: The contents of an IB Math textbook covering math models.

The IB textbook covers math models such as the linear, polynomial, exponential, logarithmic and trigonometric models. Of these, the one most commonly used to model the total number of COVID-19 cases over time is the exponential model.

Why the exponential model? To answer this question, we need to introduce the basic reproduction number, R0. R0 is essentially the average number of successful offspring that a parasite is intrinsically capable of producing; more precisely, it is the average number of secondary infections produced when one infected individual is introduced into a host population where everyone is susceptible <1>. In brief, R0 is the average number of people that one infected person goes on to infect. A disease can only become an epidemic if its R0 is greater than 1; otherwise each case produces fewer than one new case on average, and the disease dies out before an actual epidemic develops.

Figure 3: The news about the exponential growth of COVID-19

According to published research, the R0 of COVID-19 is around 2.2 (95% confidence interval: 1.4 to 3.9) <2>.

For example, suppose 10 people are infected with COVID-19 at the start, nothing is done to stop transmission, and the total number of cases grows by 31.4% per day (r = 0.314). At the end of the first day there would be 3.14 new cases and 13.14 total cases; at the end of the second day there would be 4.13 new cases and 17.27 total cases. If this continues, there would be 67.6 total cases by the end of the first week (7 days).

The general exponential model relating the time in days (d) to the total number of cases (N) can hence be deduced as $N = 10 \times (1 + 0.314)^d$.

Some students might wonder why, if the example above yields only about 70 cases after a whole week, so many countries have now recorded more than 10,000 COVID-19 cases. The reason is that when the time increases to one month, substituting d = 30 into the model gives $N = 10 \times (1 + 0.314)^{30} \approx 36129$. The total at the end of the first month is therefore more than 500 times the total at the end of the first week. This result shows how astonishing and frightening exponential growth can be when it describes the spread of a virus.
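The arithmetic in this example can be reproduced with a few lines of Python. This is only a sketch of the compound-growth formula with the article's values N0 = 10 and r = 0.314; the function name `total_cases` is an illustrative choice, not something from the IB syllabus.

```python
def total_cases(initial_cases: float, daily_rate: float, day: int) -> float:
    """Total cases after `day` days of compound growth: N = N0 * (1 + r)^d."""
    return initial_cases * (1 + daily_rate) ** day


# Values taken from the worked example: N0 = 10, r = 0.314.
for day in (1, 2, 7, 30):
    print(f"day {day:2d}: {total_cases(10, 0.314, day):10.2f} total cases")

# How many times larger the one-month total is than the one-week total.
growth_factor = total_cases(10, 0.314, 30) / total_cases(10, 0.314, 7)
print(f"month / week = {growth_factor:.0f}")
```

Because the ratio of the two totals is (1 + r)^23, it does not depend on the starting number of cases at all.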

To illustrate the exponential model more concretely, we build a model from the total cases in the U.S. recorded from February 25th to March 31st (the data are given at the end of this article; February 25th is taken as the first day, so d = 1 on that day). The relation between time in days (d) and total cases (N) is assumed to follow the exponential model $N = N_0 \times (1 + r)^d$. Using IB Math knowledge, taking logarithms of both sides converts this into a linear equation, $\ln N = \ln N_0 + d \ln(1 + r)$, whose parameters can then be calculated by linear regression. The detailed steps are taught in Gabe Math IA lessons.
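One possible way to carry out this conversion is sketched below: take the natural logarithm of the case counts, fit a straight line by least squares, and transform the slope and intercept back into r and N0. The data here are synthetic and noise-free (generated from N0 = 10, r = 0.314), not the U.S. figures, and the function name `fit_exponential` is an assumption made for illustration.

```python
import math


def fit_exponential(days, cases):
    """Least-squares fit of ln N = ln N0 + d * ln(1 + r); returns (N0, r)."""
    logs = [math.log(n) for n in cases]
    mean_d = sum(days) / len(days)
    mean_y = sum(logs) / len(logs)
    slope = (sum((d - mean_d) * (y - mean_y) for d, y in zip(days, logs))
             / sum((d - mean_d) ** 2 for d in days))
    intercept = mean_y - slope * mean_d
    # slope = ln(1 + r) and intercept = ln N0, so undo the logarithms.
    return math.exp(intercept), math.exp(slope) - 1


# Synthetic data for illustration only (NOT the U.S. data in the appendix).
days = list(range(1, 11))
cases = [10 * 1.314 ** d for d in days]

n0, r = fit_exponential(days, cases)
print(f"N0 = {n0:.3f}, r = {r:.3f}")
```

With real, noisy data the fitted values would only approximate the underlying growth rate, which is why the goodness of fit has to be checked afterwards.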

Figure 4: The scatter plot (blue) and exponential regression model (purple) of total cases in the U.S. over time from February 25th to March 31st.

This exponential model can be verified. After converting the exponential equation into a linear equation, the Pearson correlation coefficient is calculated as R = 0.997, giving a coefficient of determination $R^2 = 0.994$. Since $R^2$ is very close to 1, the goodness of fit is high, which indicates that the exponential model suits the data. A bivariate hypothesis test provides further evidence for the validity of the exponential model.
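As a rough sketch of how these two statistics could be computed, the snippet below calculates the Pearson correlation coefficient between the day number and the logarithm of the case count, then squares it. The six case counts are hypothetical values invented for illustration, not the U.S. data, and `pearson_r` is an illustrative helper name.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical counts for illustration only (NOT the U.S. data).
days = [1, 2, 3, 4, 5, 6]
cases = [12, 17, 21, 30, 41, 52]

log_cases = [math.log(n) for n in cases]  # linearise: ln N against d
r_value = pearson_r(days, log_cases)
r_squared = r_value ** 2
print(f"R = {r_value:.3f}, R^2 = {r_squared:.3f}")
```

For a straight-line fit with one explanatory variable, the coefficient of determination is simply the square of the Pearson coefficient, which is consistent with the article's values (0.997 squared is about 0.994).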

The detailed steps for calculating the Pearson correlation coefficient and the coefficient of determination, and for conducting the bivariate hypothesis test, will be taught in Gabe Math IA lessons.

According to the model above, total cases grew from only 10 to more than 100,000 in about one month. The rate of increase rose rapidly, so how can it be reduced? To answer this question, consider the exponential model $N = N_0 \times (1 + r)^d$ with $r = k \times \beta$. The critical step in slowing the growth of total cases is to reduce the value of r. If r in the U.S. model could be reduced slightly from 0.34 to 0.31 with N0 unchanged, the model would give 64,932 total cases in the U.S. by March 28th, approximately half of the actual total on that date (121,105).
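The sensitivity to r can be checked without knowing N0, because N0 cancels when the two model predictions are divided. The sketch below assumes the dates refer to 2020 (a leap year), so that March 28th is day 33 when February 25th is day 1; the helper names are illustrative.

```python
from datetime import date


def day_index(first_day: date, target_day: date) -> int:
    """Day number d, counting the first day as d = 1."""
    return (target_day - first_day).days + 1


def case_ratio(reduced_rate: float, original_rate: float, day: int) -> float:
    """Ratio of model totals N0*(1+r_reduced)^d / (N0*(1+r_original)^d)."""
    return ((1 + reduced_rate) / (1 + original_rate)) ** day


# Assumption: the article's dates are in 2020.
d = day_index(date(2020, 2, 25), date(2020, 3, 28))
ratio = case_ratio(0.31, 0.34, d)
print(f"d = {d}, ratio = {ratio:.3f}")
```

Note that this ratio compares the two model predictions with each other, whereas the article compares the reduced-rate prediction with the actual reported total; both comparisons come out at roughly one half.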

In $r = k \times \beta$, k is the average number of people an infected person is exposed to, and β is the probability that each exposure results in an infection. If either k or β is reduced, r decreases and total cases grow more slowly. Methods of reducing k include quarantine and social distancing, of which we've heard a lot recently. Quarantine reduces the chance that infected people come into contact with the susceptible population, and social distancing urges people to avoid crowded places. Both help to reduce the risk of community-acquired infection.
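To make the effect of k concrete, here is a small sketch that computes r from k and β and then the doubling time of the exponential model. The values k = 3.4 and β = 0.10 are hypothetical, chosen only so that r equals the 0.34 quoted for the U.S. model; they are not estimates from any data.

```python
import math


def daily_growth_rate(contacts_per_day: float, infection_probability: float) -> float:
    """r = k * beta."""
    return contacts_per_day * infection_probability


def doubling_time(rate: float) -> float:
    """Days for total cases to double under N = N0 * (1 + r)^d."""
    return math.log(2) / math.log(1 + rate)


# Hypothetical values for illustration only: k = 3.4, beta = 0.10 gives r = 0.34.
baseline = daily_growth_rate(3.4, 0.10)
# Halving the number of contacts (for example through social distancing).
distanced = daily_growth_rate(1.7, 0.10)

print(f"baseline:  r = {baseline:.2f}, doubling every {doubling_time(baseline):.1f} days")
print(f"distanced: r = {distanced:.2f}, doubling every {doubling_time(distanced):.1f} days")
```

Under these assumed values, halving k halves r and nearly doubles the time the model needs for total cases to double.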

Figure 5: The scatter plot (blue) and exponential regression model (orange) of total cases in Chinese Mainland from January 16th to January 29th.

For comparison, data on total cases in Chinese Mainland were collected (the data are given at the end of this article; January 16th is taken as the first day, so d = 1 on that day). An exponential function is used to model the total cases in Chinese Mainland from January 16th to January 29th. The function can be found either with technology or by hand. Once the modelling function is obtained, the coefficient of determination can be calculated as well: $R^2 = 0.994$, which is very close to 1, so the exponential model is a good fit for these data.

 

According to this exponential model, the total cases in Chinese Mainland would have exceeded 1,000,000 by February 10th. However, quarantine, social distancing and other measures were applied very efficiently and strictly in Chinese Mainland, so COVID-19 was well controlled. The total number of cases in Chinese Mainland was 81,154 by March 31st, far lower than the exponential model predicts. The coefficient of determination tells the same story: for an exponential model fitted to the whole period shown in Figure 6, $R^2$ is only 0.591. Both the actual outcome and the $R^2$ value indicate that the exponential model is not a good fit for the data in Figure 6.

Figure 6: The scatter plot (blue), exponential regression model (red) and logistic regression model (green) of the total cases in Chinese Mainland from January 16th to March 31st. (Logistic growth is clearly a better model for this scatter plot than exponential growth.)

Hence, to describe the relation between time (day) and total cases in Chinese Mainland from January 16th to March 31st, a different math model must be used. The scatter plot in Figure 6 closely resembles a logistic curve, and a point of inflection can be seen on it. The process of deducing the logistic model and verifying its validity will be taught in Gabe Math IA lessons.
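The shape of a logistic curve and its point of inflection can be sketched as follows, assuming the common form N(d) = K / (1 + A e^(-b d)). The parameters K = 1000, A = 200 and b = 0.25 are hypothetical and are not fitted to the Chinese Mainland data; they only illustrate that the curve is steepest where N reaches half of its ceiling K.

```python
import math


def logistic(day: float, capacity: float, a: float, b: float) -> float:
    """Logistic curve N(d) = K / (1 + A * exp(-b * d))."""
    return capacity / (1 + a * math.exp(-b * day))


def inflection_day(a: float, b: float) -> float:
    """Day on which growth is fastest: d* = ln(A) / b, where N = K / 2."""
    return math.log(a) / b


# Hypothetical parameters for illustration only (not fitted to any data).
CAPACITY, A, B = 1000.0, 200.0, 0.25

d_star = inflection_day(A, B)
print(f"inflection at d = {d_star:.2f}, N = {logistic(d_star, CAPACITY, A, B):.1f}")

# Daily new cases rise before the inflection point and fall after it.
daily_increase = [logistic(d + 1, CAPACITY, A, B) - logistic(d, CAPACITY, A, B)
                  for d in range(60)]
steepest_day = daily_increase.index(max(daily_increase))
print(f"largest daily increase starts on day {steepest_day}")
```

Unlike the exponential model, the daily increase here eventually shrinks towards zero, which matches the flattening seen in the later part of the scatter plot.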

Both the exponential and logistic regression models shown above can be found in IB Math. However, they are basic math models. When math models are applied to research on the spread of a disease, many more variables need to be considered. Here is a brief introduction to some of the math models most commonly used in such research.

 

The SI model is one of the simplest compartmental models, and many other models are derived from this basic form. In the SI model, S stands for the number of susceptible individuals and I for the number of infected individuals. The SI model is applied to research on refractory (hard-to-cure) infections such as HIV.
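A minimal sketch of an SI simulation is given below, using simple Euler time steps. It assumes the usual formulation in which new infections per unit time are proportional to S × I / N; the parameter name `transmission_rate` and all numerical values are hypothetical and chosen only for illustration.

```python
def simulate_si(population, initial_infected, transmission_rate, days, dt=0.1):
    """Euler simulation of the SI model; returns infected counts at each whole day."""
    s = population - initial_infected
    i = float(initial_infected)
    infected_by_day = [i]
    steps_per_day = round(1 / dt)
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = transmission_rate * s * i / population * dt
            s -= new_infections
            i += new_infections
        infected_by_day.append(i)
    return infected_by_day


# Hypothetical values: 1000 people, 1 initial case, transmission rate 0.5 per day.
infected = simulate_si(1000, 1, 0.5, days=60)
print(f"day 10: {infected[10]:.1f}, day 20: {infected[20]:.1f}, day 60: {infected[60]:.1f}")
```

Because nobody leaves the infected compartment, the infected count can only rise, and it levels off at the whole population in an S-shaped curve.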

 

The SIR model consists of three compartments: S for the number of susceptible individuals, I for the number of infected individuals, and R for the number of recovered or deceased (or immune) individuals. This model is used to study infections, such as measles, after which recovered people are immune to reinfection.
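The following is a rough Euler-step sketch of the SIR model in its usual form, where infected individuals move to R at a constant recovery rate. The rates 0.22 and 0.1 per day are hypothetical; they were picked only so that their ratio equals the R0 of 2.2 quoted earlier, and the parameter names are illustrative.

```python
def simulate_sir(population, initial_infected, transmission_rate,
                 recovery_rate, days, dt=0.1):
    """Euler simulation of the SIR model; returns (S, I, R) at each whole day."""
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    history = [(s, i, r)]
    steps_per_day = round(1 / dt)
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = transmission_rate * s * i / population * dt
            new_recoveries = recovery_rate * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        history.append((s, i, r))
    return history


# Hypothetical rates chosen so that transmission_rate / recovery_rate = 2.2.
history = simulate_sir(1000, 1, 0.22, 0.1, days=300)
infected = [state[1] for state in history]
peak_day = infected.index(max(infected))
print(f"peak of {max(infected):.0f} infected on day {peak_day}")
print(f"never infected: {history[-1][0]:.0f} of 1000")
```

In contrast to the SI model, the number of infected individuals rises to a peak and then falls, and part of the population is never infected.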

 

The SIS model consists of two compartments: S for the number of susceptible individuals and I for the number of infected individuals. The second S in the name indicates that people who recover do not become immune but return to the susceptible compartment. This model is used to study infections, such as malaria, after which recovered people are not immune to reinfection.
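A short sketch of the SIS dynamics is shown below. It assumes the standard formulation, in which recovered individuals flow straight back into S, so only the infected count needs to be tracked. The rates are the same hypothetical values as in a typical textbook illustration (0.22 and 0.1 per day) and are not fitted to any disease.

```python
def simulate_sis(population, initial_infected, transmission_rate,
                 recovery_rate, days, dt=0.1):
    """Euler simulation of the SIS model; returns infected counts at each whole day."""
    i = float(initial_infected)
    infected_by_day = [i]
    steps_per_day = round(1 / dt)
    for _ in range(days):
        for _ in range(steps_per_day):
            s = population - i  # everyone who is not infected is susceptible
            i += (transmission_rate * s * i / population - recovery_rate * i) * dt
        infected_by_day.append(i)
    return infected_by_day


# Hypothetical rates for illustration only.
infected = simulate_sis(1000, 1, 0.22, 0.1, days=300)
endemic_level = 1000 * (1 - 0.1 / 0.22)  # equilibrium I* = N * (1 - recovery/transmission)
print(f"day 300: {infected[-1]:.1f} infected, predicted equilibrium {endemic_level:.1f}")
```

Instead of dying out, the infection settles at a steady endemic level, which is the characteristic behaviour of a disease that confers no immunity.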

 

The SEIR model consists of four compartments: S for the number of susceptible individuals, E for the number of exposed individuals, I for the number of infectious individuals, and R for the number of recovered or deceased (or immune) individuals. This model is used to study infections that have an incubation period, such as H1N1.
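The effect of the incubation period can be explored with the Euler-step sketch below, which assumes the usual SEIR formulation: newly infected people first enter E and become infectious at a constant `incubation_rate` (the reciprocal of the average incubation period). All rates and names are hypothetical and serve only as an illustration.

```python
def simulate_seir(population, initial_infected, transmission_rate,
                  incubation_rate, recovery_rate, days, dt=0.1):
    """Euler simulation of the SEIR model; returns infectious counts at each whole day."""
    s = float(population - initial_infected)
    e = 0.0
    i = float(initial_infected)
    r = 0.0
    infectious_by_day = [i]
    steps_per_day = round(1 / dt)
    for _ in range(days):
        for _ in range(steps_per_day):
            new_exposed = transmission_rate * s * i / population * dt
            new_infectious = incubation_rate * e * dt
            new_recovered = recovery_rate * i * dt
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_recovered
            r += new_recovered
        infectious_by_day.append(i)
    return infectious_by_day


def peak_day(series):
    """Index of the largest value in the series."""
    return max(range(len(series)), key=series.__getitem__)


# Hypothetical rates: average incubation of 1 day versus 5 days.
short_incubation = simulate_seir(1000, 1, 0.22, 1.0, 0.1, days=400)
long_incubation = simulate_seir(1000, 1, 0.22, 0.2, 0.1, days=400)
print(f"peak day with 1-day incubation: {peak_day(short_incubation)}")
print(f"peak day with 5-day incubation: {peak_day(long_incubation)}")
```

With the other rates held fixed, a longer incubation period delays the epidemic peak, which is why the E compartment matters for diseases with a noticeable incubation time.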

 

All four models above are compartmental models, which are widely used in epidemiology. They can be built with software such as SPSS and MATLAB. How to use this software for math models and the IA will be introduced in Gabe Math IA lessons.

 

In 2018, a Gabe Math student wrote an IB Math IA on the spread of Ebola using the SEIR model and received a very high score. Students who choose math modelling as their IA topic can feel confident, because Gabe Math is very experienced in guiding such topics.

In the sections above, we explained how an IB Math IA can be built on math modelling, using COVID-19 as the example. The content involved is not simple. Besides the software mentioned above, such as SPSS and MATLAB, the math itself is sophisticated. Working in MATLAB requires both ordinary and partial differential equations; the statistics involved include not only the bivariate hypothesis test but also the calculation of confidence intervals, which is needed to estimate R0; and deriving the differential equations also involves the Jacobian matrix.

All the technology and math described above will be taught and worked through in Gabe Math IA lessons. IB students who are interested in math modelling can scan the QR code below or search 17091912892 to add Gabe Math on WeChat.

Appendix

Figure 7: Daily total cases in Chinese Mainland from January 16th to March 31st

Figure 8: Daily total cases from February 25th to March 31st in the U.S.

References

 

1. Anderson RM, May RM. Infectious Diseases of Humans: Dynamics and Control. Oxford University Press, Oxford, 1992.

2. Li Q, Guan X, Wu P, Wang X, Zhou L, Tong Y, et al. "Early Transmission Dynamics in Wuhan, China, of Novel Coronavirus-Infected Pneumonia". The New England Journal of Medicine, January 2020.

This article is an original work by Gabe Math. Reposting is welcome, but the source must be credited.

Copyright © 2020 Gabe Math.  All Rights Reserved. 

Gabe Math (嘉博數學) WeChat: 17091912892








