New update of the recent second wave (or third?) of COVID-19 in Indonesia

It has been a while since I last wrote about COVID-19. Today I’d like to check out Indonesia’s COVID-19 statistics, especially since things seem to be getting worse these days, unfortunately, and so many people have talked about the possibility that the government is intentionally undertesting to push down new case numbers at the cost of human lives.

I rely heavily on Our World in Data [1], which gives free access to COVID-19 data.

Grab the data and show the top 6 rows.

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
url='https://covid.ourworldindata.org/data/owid-covid-data.csv' # store the URL
df=pd.read_csv(url, parse_dates=['date']) # download from the URL; parse_dates turns the date column into a datetime type
df.head(6) # show the top 6 rows

iso_code continent location date total_cases new_cases new_cases_smoothed total_deaths new_deaths new_deaths_smoothed ... extreme_poverty cardiovasc_death_rate diabetes_prevalence female_smokers male_smokers handwashing_facilities hospital_beds_per_thousand life_expectancy human_development_index excess_mortality
0 AFG Asia Afghanistan 2020-02-24 1.0 1.0 NaN NaN NaN NaN ... NaN 597.029 9.59 NaN NaN 37.746 0.5 64.83 0.511 NaN
1 AFG Asia Afghanistan 2020-02-25 1.0 0.0 NaN NaN NaN NaN ... NaN 597.029 9.59 NaN NaN 37.746 0.5 64.83 0.511 NaN
2 AFG Asia Afghanistan 2020-02-26 1.0 0.0 NaN NaN NaN NaN ... NaN 597.029 9.59 NaN NaN 37.746 0.5 64.83 0.511 NaN
3 AFG Asia Afghanistan 2020-02-27 1.0 0.0 NaN NaN NaN NaN ... NaN 597.029 9.59 NaN NaN 37.746 0.5 64.83 0.511 NaN
4 AFG Asia Afghanistan 2020-02-28 1.0 0.0 NaN NaN NaN NaN ... NaN 597.029 9.59 NaN NaN 37.746 0.5 64.83 0.511 NaN
5 AFG Asia Afghanistan 2020-02-29 1.0 0.0 0.143 NaN NaN 0.0 ... NaN 597.029 9.59 NaN NaN 37.746 0.5 64.83 0.511 NaN

6 rows × 60 columns

I am not super familiar with its variables, so let’s check them out with df.columns.

df.columns # list the variable names
Index(['iso_code', 'continent', 'location', 'date', 'total_cases', 'new_cases',
       'new_cases_smoothed', 'total_deaths', 'new_deaths',
       'new_deaths_smoothed', 'total_cases_per_million',
       'new_cases_per_million', 'new_cases_smoothed_per_million',
       'total_deaths_per_million', 'new_deaths_per_million',
       'new_deaths_smoothed_per_million', 'reproduction_rate', 'icu_patients',
       'icu_patients_per_million', 'hosp_patients',
       'hosp_patients_per_million', 'weekly_icu_admissions',
       'weekly_icu_admissions_per_million', 'weekly_hosp_admissions',
       'weekly_hosp_admissions_per_million', 'new_tests', 'total_tests',
       'total_tests_per_thousand', 'new_tests_per_thousand',
       'new_tests_smoothed', 'new_tests_smoothed_per_thousand',
       'positive_rate', 'tests_per_case', 'tests_units', 'total_vaccinations',
       'people_vaccinated', 'people_fully_vaccinated', 'new_vaccinations',
       'new_vaccinations_smoothed', 'total_vaccinations_per_hundred',
       'people_vaccinated_per_hundred', 'people_fully_vaccinated_per_hundred',
       'new_vaccinations_smoothed_per_million', 'stringency_index',
       'population', 'population_density', 'median_age', 'aged_65_older',
       'aged_70_older', 'gdp_per_capita', 'extreme_poverty',
       'cardiovasc_death_rate', 'diabetes_prevalence', 'female_smokers',
       'male_smokers', 'handwashing_facilities', 'hospital_beds_per_thousand',
       'life_expectancy', 'human_development_index', 'excess_mortality'],
      dtype='object')

That’s a huge chunk of variable names! It must have been super hard work collecting all this data. Shout-out to Hannah Ritchie et al.
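
With this many columns, a small filter makes it easier to find the variables relevant to one question. A minimal sketch, assuming a hand-copied subset of the names printed above (in the notebook, list(df.columns) would be passed in instead); the helper name columns_containing is made up for illustration.

```python
# A subset of the column names printed above, copied by hand for illustration.
# In the notebook you would pass list(df.columns) instead.
columns = [
    'iso_code', 'date', 'new_cases', 'new_cases_smoothed',
    'new_deaths', 'new_deaths_smoothed', 'new_tests', 'total_tests',
    'new_tests_smoothed', 'positive_rate', 'tests_per_case',
    'weekly_icu_admissions', 'weekly_hosp_admissions',
]


def columns_containing(names, keyword):
    """Return the names that contain `keyword` (hypothetical helper)."""
    return [name for name in names if keyword in name]


test_columns = columns_containing(columns, 'tests')
smoothed_columns = columns_containing(columns, 'smoothed')
print(test_columns)
print(smoothed_columns)
```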

Aight, now let’s check new cases! New cases tend to be volatile, especially if there’s seasonality in the data itself. It is quite common to see seasonality in daily data just because of weekends. Thankfully, there’s new_cases_smoothed, which I imagine takes this into account by using a 7-day rolling average. I only take Indonesian data for this post.

indo=df[["iso_code","date","new_cases","new_cases_smoothed"]].query('iso_code == "IDN"').copy() # copy so a new column can be added later without a SettingWithCopyWarning

Plot time!

sns.lineplot(data=indo,x='date',y='new_cases')
sns.lineplot(data=indo,x='date',y='new_cases_smoothed')
plt.xticks(rotation=45)

Figure: new_cases and new_cases_smoothed for Indonesia, plotted by date.

I tried to make my own 7-day rolling average by copying code from here

indo['cases_7day_ave'] = indo.new_cases.rolling(7).mean().shift(-3)
indo.head(10)

iso_code date new_cases new_cases_smoothed cases_7day_ave
44074 IDN 2020-03-02 2.0 NaN NaN
44075 IDN 2020-03-03 0.0 NaN NaN
44076 IDN 2020-03-04 0.0 NaN NaN
44077 IDN 2020-03-05 0.0 NaN 0.857143
44078 IDN 2020-03-06 2.0 NaN 2.428571
44079 IDN 2020-03-07 0.0 0.571 3.571429
44080 IDN 2020-03-08 2.0 0.857 4.571429
44081 IDN 2020-03-09 13.0 2.429 4.571429
44082 IDN 2020-03-10 8.0 3.571 9.285714
44083 IDN 2020-03-11 7.0 4.571 13.142857

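This confirms that new_cases_smoothed is indeed a 7-day rolling average: its numbers match mine, just three rows later (0.857 on 2020-03-08 against 0.857143 on 2020-03-05). The offset comes from .shift(-3), which centers my window, while new_cases_smoothed appears to average the current day and the six days before it.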
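
The effect of .shift(-3) can be checked on the first ten new_cases values from the table above. A minimal sketch, assuming only pandas; the names trailing, centered and shifted are mine. A trailing window averages a day with the six days before it, and shifting that back three rows should give the same result as a centered window.

```python
import pandas as pd

# First ten daily new_cases values for Indonesia, copied from the table above.
new_cases = pd.Series([2.0, 0.0, 0.0, 0.0, 2.0, 0.0, 2.0, 13.0, 8.0, 7.0])

# Trailing window: each value averages the current day and the six days before it.
trailing = new_cases.rolling(7).mean()

# Centered window: each value averages three days on either side of the current day.
centered = new_cases.rolling(7, center=True).mean()

# Shifting the trailing average back by three rows lines it up with the centered one.
shifted = trailing.shift(-3)

print(pd.DataFrame({'trailing': trailing, 'shifted': shifted, 'centered': centered}))
```

Which of the two to plot is a matter of taste: the trailing version never uses future days, while the centered version lines peaks up with the dates on which they happened.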

sns.lineplot(data=indo,x='date',y='new_cases')
sns.lineplot(data=indo,x='date',y='new_cases_smoothed')
sns.lineplot(data=indo,x='date',y='cases_7day_ave')
plt.xticks(rotation=45)
plt.legend(['new cases','new cases smoothed','homemade 7-day average'])
plt.ylabel('cases')
plt.xlabel('date')

Figure: new_cases, new_cases_smoothed and cases_7day_ave for Indonesia (y-axis: cases, x-axis: date).

A year and a half is a bit too long (dear god it’s already a year and a half??), so let’s cut it to just 2021.

indo2=indo.query('date > "2021-01-01"') # keep only dates after 1 January 2021
# then plot exactly as above
sns.lineplot(data=indo2,x='date',y='new_cases')
sns.lineplot(data=indo2,x='date',y='new_cases_smoothed')
plt.xticks(rotation=45)
plt.legend(['new cases','new cases smoothed']) # two lines plotted, so two labels
plt.ylabel('cases')
plt.xlabel('date')

Figure: new_cases and new_cases_smoothed for Indonesia in 2021 (y-axis: cases, x-axis: date).

Cases do indeed seem to be going down, even in the smoothed series. But is this because of undertesting? We can check that from the same dataset by looking at the number of tests. We add the positive rate to really make sure.
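
Why the positive rate helps: it is the share of tests that come back positive, so it can separate "fewer cases because fewer tests" from "fewer cases because fewer infections". The numbers below are entirely made up for illustration, and the sketch assumes the people tested are picked the same way in every scenario, which real testing may not do.

```python
def positive_rate(new_cases, new_tests):
    """Share of tests that come back positive (between 0 and 1)."""
    return new_cases / new_tests


# Hypothetical baseline week: 100,000 tests find 20,000 cases.
baseline = positive_rate(20_000, 100_000)

# Scenario A (undertesting): testing is halved but the outbreak is unchanged.
# Cases fall to 10,000, yet the positive rate stays the same.
undertesting = positive_rate(10_000, 50_000)

# Scenario B (real decline): testing is unchanged but fewer people are infected.
# Cases fall to 10,000 and the positive rate falls with them.
real_decline = positive_rate(10_000, 100_000)

print(f"baseline:     {baseline:.2f}")
print(f"undertesting: {undertesting:.2f}")
print(f"real decline: {real_decline:.2f}")
```

If the choice of who gets tested changes, for example testing only people with symptoms, the positive rate can move even when the outbreak does not.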

indo=df[["iso_code","date","new_tests","new_tests_smoothed",
        "new_cases","new_cases_smoothed","positive_rate"]].query('iso_code == "IDN"')
indo2=indo.query('date > "2021-01-01"') # keep only dates after 1 January 2021
fig, axes = plt.subplots(1, 2, figsize=(18, 10))
fig.suptitle('New tests and positive rate in Indonesia')
# left panel: daily and smoothed new tests
sns.lineplot(ax=axes[0],data=indo2,x='date',y='new_tests')
sns.lineplot(ax=axes[0],data=indo2,x='date',y='new_tests_smoothed')
axes[0].tick_params(axis='x',labelrotation=45)
axes[0].legend(['new tests','new tests smoothed'])
axes[0].set_ylabel('new tests')
axes[0].set_xlabel('date')
axes[0].set_title('new tests')
# right panel: share of tests that come back positive
sns.lineplot(ax=axes[1],data=indo2,x='date',y='positive_rate')
axes[1].tick_params(axis='x',labelrotation=45)
axes[1].set_ylabel('positive rate (0-1)')
axes[1].set_xlabel('date')
axes[1].set_title('positive rate')

Figure: two panels for Indonesia in 2021. Left: new_tests and new_tests_smoothed by date. Right: positive_rate by date.

And yes, testing has indeed gone down. At the same time, the positive rate seems to be trending down as well. How to read this depends on how testing is conducted, that is, on who gets selected for testing and who does not. We could be more certain if we checked hospitalisations and deaths. Unfortunately, Indonesian hospitalisation numbers are non-existent in this dataset.

df.query('iso_code=="IDN"')[['weekly_icu_admissions','weekly_hosp_admissions']]

weekly_icu_admissions weekly_hosp_admissions
44074 NaN NaN
44075 NaN NaN
44076 NaN NaN
44077 NaN NaN
44078 NaN NaN
... ... ...
44582 NaN NaN
44583 NaN NaN
44584 NaN NaN
44585 NaN NaN
44586 NaN NaN

513 rows × 2 columns
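
The printout only shows the first and last few rows, so a more direct check is to count non-missing values per column. A minimal sketch on a tiny made-up frame that mimics the shape of the OWID data (the values are invented, not real); in the notebook the same calls would be run on df.

```python
import pandas as pd

# Tiny stand-in for the OWID frame; values are invented for illustration.
sample = pd.DataFrame({
    'iso_code': ['IDN', 'IDN', 'IDN', 'AFG'],
    'new_deaths': [5.0, None, 7.0, 1.0],
    'weekly_icu_admissions': [None, None, None, 2.0],
    'weekly_hosp_admissions': [None, None, None, 9.0],
})

idn = sample.query('iso_code == "IDN"')

# count() returns the number of non-missing values in each column.
non_missing = idn[['new_deaths', 'weekly_icu_admissions', 'weekly_hosp_admissions']].count()
print(non_missing)

# Columns with zero non-missing values are entirely empty for Indonesia.
empty_columns = non_missing[non_missing == 0].index.tolist()
print(empty_columns)
```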

On deaths (dear God, bless all the lost souls and those they left behind), the situation is rather gloomy.

indo=df[["iso_code","date","new_deaths","new_deaths_smoothed"]].query('iso_code == "IDN"')
indo2=indo.query('date > "2021-01-01"') # keep only dates after 1 January 2021
sns.lineplot(data=indo2,x='date',y='new_deaths')
sns.lineplot(data=indo2,x='date',y='new_deaths_smoothed')
plt.xticks(rotation=45)
plt.legend(['new deaths','new deaths, 7-day moving average'])
plt.ylabel('deaths')
plt.xlabel('date')

Figure: new_deaths and new_deaths_smoothed for Indonesia in 2021 (y-axis: deaths, x-axis: date).

Judging from the death data, the pandemic is still far from over. Note that deaths may follow new cases, and hence lag behind when cases trend down. However, if we cannot trust the test data, the death data is also hard to trust. I think that with unreliable data it is hard to react to any news, really, whether cases go up or down. It is hard to make a good case for the government, because people are like: low cases? Bad data! Bad testing! High cases? The government is stupid!
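
The lag between cases and deaths can be made concrete with synthetic data. Everything in the sketch below is invented: the shape of the wave, the 14-day delay and the 2% ratio are assumptions chosen for illustration, not estimates for Indonesia. It shifts the case series by each candidate lag and keeps the lag at which the shifted cases correlate best with deaths.

```python
import math
import pandas as pd

# Synthetic wave of daily cases peaking on day 60 (made-up numbers).
days = range(120)
cases = pd.Series([1000 * math.exp(-((d - 60) / 15) ** 2) for d in days])

# Assume deaths are 2% of the cases reported 14 days earlier (an assumption).
assumed_lag = 14
deaths = 0.02 * cases.shift(assumed_lag)

# Correlate deaths with cases shifted by each candidate lag from 0 to 28 days.
correlations = {lag: cases.shift(lag).corr(deaths) for lag in range(29)}
best_lag = max(correlations, key=correlations.get)

print(best_lag)
print(cases.idxmax(), deaths.idxmax())
```

The sketch recovers the 14-day lag only because it was built that way. With real data the lag has to be estimated, and reporting delays can blur it.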

So yeah. I guess it helps if we don’t overreact to the new case numbers, because they might not reveal the true state of the COVID-19 pandemic in Indonesia.

What about vaccination? Judging from all of the graphs above, new cases and the positive rate shot up around June. What happened during that month? The arrival of Delta? What kind of crowded events happened around that time? What high-mobility events took place? The government might have allowed high-mobility events to take place given that the vaccination programme had started. So let me end this post by showing vaccination speed across countries, including Indonesia.


  1. Hannah Ritchie, Esteban Ortiz-Ospina, Diana Beltekian, Edouard Mathieu, Joe Hasell, Bobbie Macdonald, Charlie Giattino, Cameron Appel, Lucas Rodés-Guirao and Max Roser (2020) - “Coronavirus Pandemic (COVID-19)”. Published online at OurWorldInData.org. Retrieved from: https://ourworldindata.org/coronavirus [Online Resource]

Krisna Gupta
Lecturer

Research mainly on international trade and investment policy and its impact on firms. Indonesia in particular is my main geographical focus.
