Who needs a data breach when you can wilfully give the data away? The Kerala Election Commission (EC) is doxxing, i.e. exposing the personal information of, millions of voters on publicly-accessible websites with no security whatsoever.
I received a message in a WhatsApp group a few days ago: “പുതിയ വോട്ടർ ലിസ്റ്റ് വന്നിട്ടുണ്ട്. അവരവരുടെ പേരുകൾ ഉണ്ടോ എന്ന് ചെക്ക് ചെയ്യാം..” (Translation: “The new voter list is ready. We can check if our names are present.”)
Kerala is a state in India. Elections for the state’s Legislative Assembly are to be held on April 6th.
The linked site allows anybody, without logging in, to check the voter rolls of every polling location of the election. I picked the first item in the drop-down for all the fields with English as the language and clicked “Search”. Shawn Mathew pointed out to me that the CAPTCHA did not work, i.e. the site accepted random inputs that did not match the CAPTCHA text. Note that the selections narrow things down to a very specific geographic area.
The result was the entire list of voters who would access that particular polling station to vote. I have not exposed the actual data myself here, but you can easily look it up using the link.
The following fields are exposed for each voter:
- Guardian’s name (typically the father or husband)
- House number + house name
- Gender and age
- Voter ID card number
The EC has given away essential private details that are a gold mine for identity thieves, for malicious parties who are seriously considering voter fraud, and for the nosy person who wants to know how old their neighbour is and exactly how the people living in a house are related to each other.
A bit of searching resulted in another site with related voter information: http://ceo.kerala.gov.in/electoralrolls.html
This was another insecure site with downloadable PDF files filled with voter rolls.
In addition to voter information similar to the previous site, the PDFs provided some statistics and even the exact area served by the polling station drawn out on Google Maps.
Providing voters with a means to check their eligibility and voting location online seriously simplifies things, but that cannot be at a total loss of privacy to all voters. A few simple authentication measures and exposing only the individual’s own data to them would have gone a long way to build a private system. Kerala has had internet access for two decades at least, but the EC acts as though they were exposed to the internet merely a year ago and are just now discovering its magic. The lack of sophistication shown here is shocking.
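To make the "only the individual's own data" idea concrete, here is a minimal sketch of what such a lookup could look like. Everything in it is an assumption for illustration: the `VoterRecord` fields, the in-memory `VOTERS` store, the made-up sample record and the choice of identifying fields (voter ID, the first few letters of the name, and date of birth) are hypothetical and are not taken from the EC's systems.

```python
import hmac
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class VoterRecord:
    voter_id: str
    name: str
    date_of_birth: date
    polling_station: str


# Hypothetical in-memory store with made-up data; a real system would query a database.
VOTERS = {
    "ABC1234567": VoterRecord("ABC1234567", "Example Voter", date(1980, 1, 15), "Station 12"),
}


def _same(a: str, b: str) -> bool:
    """Compare two strings without leaking how many characters matched."""
    return hmac.compare_digest(a.encode("utf-8"), b.encode("utf-8"))


def lookup_own_record(voter_id: str, name_prefix: str, dob: date) -> Optional[dict]:
    """Return a voter's own details only if every identifying field matches."""
    record = VOTERS.get(voter_id.strip().upper())
    if record is None:
        return None
    prefix = name_prefix.strip().lower()
    expected_prefix = record.name.lower()[:len(prefix)]
    # Require at least three letters so a single character cannot pass the name check.
    name_ok = len(prefix) >= 3 and _same(prefix, expected_prefix)
    dob_ok = _same(dob.isoformat(), record.date_of_birth.isoformat())
    if not (name_ok and dob_ok):
        return None
    # Expose only what the voter needs: no guardian name, house details or neighbours.
    return {"name": record.name, "polling_station": record.polling_station}
```

A failed lookup returns the same empty result whether the voter ID, the name or the date of birth was wrong, so the response does not help someone guess one field at a time. A real deployment would also need rate limiting and a CAPTCHA that is actually verified on the server.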
Recently, opposition politicians have searched through the lists and come up with allegations of voter fraud. This is a curious development that may, at first sight, seem to be a good use of publicly-visible voter rolls. However this is not a use that improves trust in the electoral process or infrastructure; it does the opposite. Allegations by politically-motivated actors may gain them political points without allowing election authorities adequate time to accurately and effectively respond to the allegations. It results in confusion and mistrust, which is not conducive to the effective functioning of a democracy. We saw this happening in the United States recently. Toward the end of this article, I provide some suggestions on how to do this right.
How easy would it be for a malicious party to crawl through the information and steal the data of a particular individual? Crawling through the data and downloading all of it is easy for someone with basic web scripting skills. Identifying an individual proved a bit more challenging. Unfortunately this turned out to be a matter of poor data quality rather than security; the site does not even use encryption. Many of the names (if not all) have been originally transcribed in the Malayalam language and then converted to English, Tamil or Kannada for results displayed in those languages. I found the English versions of my parents’ names to be completely mangled and barely recognisable. A few other names that I checked had the wrong ages associated with them. These are separate data integrity issues that should cause some wonder as to how the rolls could be trusted in the first place – how can a misspelt name on a voter roll not be a major concern with respect to the integrity of one’s elections?
The people that I talked to did not think that this was a big deal. The strongest reaction that I received was, “Oh.. So no privacy.. That’s sad.” My father reported the matter to the Additional Chief Election Officer. I contacted the Kerala Government’s PR email address as well as the Kerala Police Cyberdome, asking them to take proactive steps to prevent data exposure (it was already too late, but better late than never?) rather than reactive ones after they actually learned that the data had been misused. After an entire work week, neither even deigned to send me an acknowledgement. I was not hopeful for the government, but I was disappointed by the lack of response from the Cyberdome. Cybersecurity and responses to data breaches would fall squarely under the Cyberdome’s purview. Prevention is better than cure.
As of the time of posting, the site remains up. I do not have the confidence that it will be taken down.
Government accountability with regard to data privacy is a matter of concern. It may be more concerning than private sector accountability, as it is politically expedient to take the private sector to task. A year ago, opposition politicians in Kerala were up in arms over data privacy when the Kerala government wanted to work with a private company to process data to combat the pandemic. Now that arguably even more sensitive data has been published on the internet through the massive incompetence of a government body that may not be linked to any single party or political alliance, the politicians are quiet, as are the people.
An example from outside of Kerala: Back in 2012, when Singapore brought out its Asia-leading Personal Data Protection Act (PDPA), there was an exception that excluded data held by government agencies from the accountability-related penalties of this otherwise serious privacy bill. According to this PwC reading of the act’s 2021 update, the government intends to align public sector agencies’ accountability for data privacy with that of private companies. This would be done not via the PDPA, but through other ordinances that apply to those agencies. One can hope.
What should be done
Moving past the blunders, what could the EC do to actually enable trust in Kerala’s electoral process while permitting voters their privacy? Based on my years of experience in the fields of information security consulting as well as auditing, I came up with the below list of activities. The EC should seriously consider* all of the below steps which, when put together, rectify a number of the serious errors that they have made (although the private information is already out there).
- Remove the personal data that is currently exposed on the websites.
- Snail-mail or otherwise send registered voters their information so that they can validate their voting registration and voting location.
- Securely share only the relevant files directly to election officials to carry out their duties during the election.
- Secure the websites. The EC clearly does not have the technical expertise to do this so they need to hire external assistance to fix their IT infrastructure. They can start with installing a certificate and enforcing HTTPS to ensure that all communications involving the websites are encrypted.
- Clean up the voter data so that it actually becomes reliable. Identity information exists in multiple locations as voter IDs, Aadhaar cards, etc. It would make sense to normalise IDs in one format and one language. I would suggest English, as both voter IDs and Aadhaar cards appear to carry the person’s name in English along with a native Indian language. The data cleanup is likely to be a huge project that may take years, so the matter can perhaps be fixed by the next election.
- Set up a simple mechanism whereby the individual can use bits of identifying information (e.g. their voter ID plus a part of their name and their date of birth) to authenticate themselves and check their details.
- Set up a mechanism whereby the election officials log in to the website using traditional mechanisms (e.g. username and password) along with a physical or software token for 2-factor authentication in order to access voter lists for the polling station that they man. Any individual who has access to details of more than one station should be cleared by senior officials at the EC. Note: the exact threshold for requiring the approval of senior officials is adjustable, based on the level of risk identified by those officials. “More than one station” is just an example. I am not an elections expert, but I know information security.
- Third party audits of i) the security of the websites + ii) the completeness, accuracy and integrity of voter data, with the results published on the EC website long in advance of the next election. For obvious reasons, these third parties have to be different from any consultants used by the EC in building, securing or otherwise setting up the systems or processing the data.
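The access rule for election officials in the list above can be sketched roughly as follows. This is an illustration under stated assumptions, not a production design: the `OFFICIALS` table, the `officer1` account, the station names and the `may_view_station` helper are all hypothetical, the password step is left out for brevity, and the demo secret is the published RFC 6238 test key, which must never be used for real accounts. The `totp` function follows the standard time-based one-time password algorithm (RFC 6238 on top of RFC 4226) using only the Python standard library.

```python
import base64
import hashlib
import hmac
import struct
import time
from typing import Optional


def totp(secret_b32: str, at: Optional[float] = None, step: int = 30, digits: int = 6) -> str:
    """Compute a time-based one-time password (RFC 6238, HMAC-SHA1)."""
    key = base64.b32decode(secret_b32, casefold=True)
    counter = int((time.time() if at is None else at) // step)
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation, as defined in RFC 4226
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


# Hypothetical official registry. The secret is the public RFC 6238 test key (demo only).
DEMO_SECRET = base64.b32encode(b"12345678901234567890").decode("ascii")
OFFICIALS = {
    "officer1": {"totp_secret": DEMO_SECRET, "stations": {"Station 12"}},
}


def may_view_station(username: str, code: str, station: str, at: Optional[float] = None) -> bool:
    """Allow access only with a valid second factor AND an assignment to that station."""
    official = OFFICIALS.get(username)
    if official is None:
        return False
    expected = totp(official["totp_secret"], at=at)
    if not hmac.compare_digest(code.encode("utf-8"), expected.encode("utf-8")):
        return False
    # Password verification would happen before this point in a real login flow.
    return station in official["stations"]
```

The point of the sketch is the final line: even a correctly authenticated official sees only the stations they have been assigned, so widening that set becomes an explicit decision that senior officials can review.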
Some of the longer-term projects described here (the data cleanup and the third-party audits, in particular) entail considerable effort, technological expertise and financial expense. This is a matter of enabling trust in Kerala’s democratic institutions and infrastructure and securing the privacy of its voters. It might be worth it.
Let me know if I can help.
*I am not a resident of Kerala, so I am not fully apprised of all the measures that are currently in place with regard to the Kerala elections. E.g. I was told that someone did receive a snail-mailed letter telling them their voting location, but I do not know if this is universal.
Data exposing site 1
Data exposing site 2
Leader of opposition alleges voter fraud
Kerala government criticised over data processing by private firm
Singapore PDPA (2012)
Debate on Singapore government agencies’ PDPA non-accountability
PwC summary of PDPA 2021 update
Edit 5 April 2021 – note added about the CAPTCHA