May 15, 2016

BBM vs Leni: Ñ issue reveals Smartmatic uses obsolete security technology

The 2016 presidential race is practically over after Binay, Roxas, and Poe conceded to Duterte days ago. The battle for the vice-presidency, however, is not.

As of 2:45 PM of 14 May 2016, the Comelec-GMA server has canvassed 96.06% of Election Returns. LP VP bet Leni Robredo (Leni) garnered 14,015,098 votes, putting her on the top spot. Meanwhile, Independent VP candidate Bongbong Marcos (BBM) got 13,799,034, putting him at a close second. The gap between the two candidates’ respective votes is a mere 216,064, which is practically as thin as Grace Poe’s resume.

To get a better grasp of the VP situation, let’s do a little math.
  • 54,363,844 voters are registered for the May 2016 elections [Rappler].
  • 81.62% of these registered voters voted on May 9 [PhilStar].
  • The Comelec-GMA server has processed 96.06% of election returns (ER) [GMA].
With this information, and assuming that every ER covers roughly the same number of voters, let’s estimate the number of votes left for the Comelec-GMA server to canvass using the following formula:

V = R × T × (1 - C)

V = remaining number of Votes to be canvassed
R = total number of Registered voters
T = voter Turnout, or the proportion of R that actually voted
C = proportion of votes that were already Canvassed
Plugging in the values...

V = 54,363,844 × 0.8162 × (1 - 0.9606)

Simplifying, we get...

V ≈ 1,748,000 votes

In short, we can modestly assume that over 1 million votes are still to be counted. That is, as of the evening of 14 May 2016, the VP race is still far from over.
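
For readers who want to check the arithmetic, here is a minimal Python sketch of the estimate. It assumes the formula is V = R × T × (1 - C), which is how the definitions above read, and that every ER covers about the same number of voters. The function and variable names are ours, made up for illustration.

```python
# Figures quoted in the article (Rappler, PhilStar, GMA).
REGISTERED_VOTERS = 54_363_844   # R
TURNOUT = 0.8162                 # T
CANVASSED = 0.9606               # C

LENI_VOTES = 14_015_098
BBM_VOTES = 13_799_034


def estimate_remaining_votes(registered: int, turnout: float, canvassed: float) -> float:
    """Estimate V = R * T * (1 - C).

    Assumption: every ER covers about the same number of voters, so the
    share of ERs processed equals the share of votes canvassed.
    """
    return registered * turnout * (1 - canvassed)


remaining = estimate_remaining_votes(REGISTERED_VOTERS, TURNOUT, CANVASSED)
gap = LENI_VOTES - BBM_VOTES

print(f"Estimated votes still to be canvassed: {remaining:,.0f}")
print(f"Current Leni-BBM gap: {gap:,}")
print(f"Remaining votes exceed the gap: {remaining > gap}")
```

The point of the comparison at the end is simple: as long as the estimated remaining votes are larger than the gap, the ranking can still change.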

The Ñ controversy erupts

Let's backtrack a little bit.

In the first several hours of the Quick Count, BBM gained a wide 1-million-or-so lead over second-placer Leni. Leni slowly inched closer as the hours passed, eventually overtaking BBM before daybreak. Leni’s camp said the bailiwick votes from ARMM, Bicol and Iloilo came in, hence the rank switch. Meanwhile, BBM’s camp suspects electoral fraud. Then the hash code controversy erupted [TP: Hash Code].

Just a few hours after commencing the Quick Count and right when BBM was leading by a mile, Comelec said Smartmatic implemented an innocuous cosmetic change on the server’s source code, which allegedly corrected a display error involving the character “ñ”, which was initially displayed as a “?” [Inquirer].

As a result, an IT expert discovered a mismatch in the hash codes (digital fingerprints) of some files in the Comelec server. He did not cry fraud, but he did provide reason for suspicion. After all, Comelec issued an explanation only three days after the hash mismatch incident [TP:Smartmatic].
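
To see why even a cosmetic edit produces a hash mismatch, here is a small Python sketch. The two strings are hypothetical stand-ins for a line of text before and after the “?” is replaced with “ñ”. They are not actual Comelec data, and the helper name is ours.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as a 32-character hex string."""
    return hashlib.md5(data).hexdigest()


# Hypothetical sample text, NOT taken from any Comelec file.
before_fix = "Santo Ni?o".encode("utf-8")
after_fix = "Santo Niño".encode("utf-8")

print("before:", md5_hex(before_fix))
print("after: ", md5_hex(after_fix))
print("hashes match:", md5_hex(before_fix) == md5_hex(after_fix))
```

Changing a single character changes the fingerprint, which is why a mismatch only tells you that a file changed. It does not tell you what changed or why.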

TP thinks Leni is at least partially right about the bailiwick votes because a previous TP article showed two things [TP:ARMM Spike]:
  • ARMM experienced a questionable massive voter registration spike
  • Most ARMM politicians are aligned with the Liberal Party
On the other hand, BBM's claim, up to this day, is still insufficiently substantiated.

Anyway, to cut the long story short:
  • BBM's camp claims this code modification opens an avenue for electoral fraud. 
  • Comelec-Smartmatic and PPCRV insist this is just a minor issue that won't matter in the long run.
The issue is yet to be settled, and ThinkingPinoy believes that it's time to let those parties fight their wars by themselves.

But there's a problem.

This entire Ñ war rests on the assumption that matching hash codes imply system integrity. However, after a bit of research, ThinkingPinoy has reason to believe that MD5, the hash code generator algorithm that Comelec-Smartmatic uses, is actually obsolete.

Based on the leaked screenshot showing a hash code mismatch, Comelec-Smartmatic uses the MD5 algorithm in generating hash codes.

Now, let's set aside the Leni-BBM squabble and focus on the question of MD5's reliability.

MD5 Timeline

In 1992, the MD5 (Message Digest 5) algorithm was published to address the shortcomings of its predecessor, MD4 [Rivest 1992]. MD5 is openly specified and free to use, and it has been widely utilized to verify file integrity.

In 1996, severe weaknesses of MD5 were discovered, courtesy of German cryptographer Hans Dobbertin [Dobbertin 1996]. He was able to point out MD5's flaws, but he did not show practical ways to exploit them.

In 2004, Chinese computer scientists Wang, Feng, Lai, and Yu filled the gap in Dobbertin's paper when they demonstrated practical methods to exploit MD5 [Wang et al. 2004]. One of the authors is from the Chinese Academy of Sciences - Institute of Software.

In 2005, another paper cited more ways to exploit MD5 [Lenstra 2005]. One of the authors (the Dutch Lenstra) works for Bell Laboratories and the Federal Polytechnic School of Lausanne, Switzerland.

In 2007, Lenstra and another computer scientist, Stevens, announced "two different... files with different functionality but identical MD5 hash values. This shows that trust in MD5... for verifying software integrity, and as a hash function used in code signing, has become questionable."

In a 2008 statement, Carnegie Mellon University's Software Engineering Institute finally rejected MD5, warning that "attackers can generate cryptographic tokens or other data that illegitimately appear to be authentic." [CMU-SEI 2008]

According to the same CMU-SEI statement:
"Do not use the MD5 algorithm... Certification Authorities... should avoid using the MD5 algorithm in any capacity... It should be considered cryptographically broken and unsuitable for further use."

The Problem: Smartmatic uses MD5 today.

To state it more simply, it is practical and possible for someone today to create two different files that generate the same MD5 hash, which contradicts MD5's original purpose. MD5's definitive rejection was in 2008, over seven years ago. Moreover, that rejection came more than four years before 2013, which was itself before Comelec hired Smartmatic for the 2016 Elections.

MD5 Alternatives

MD5's glaring vulnerabilities have led most software development organizations to use more reliable hash techniques. In particular, SHA-1 (Secure Hash Algorithm 1) is a more resilient alternative and, best of all, it is almost as old as MD5: its predecessor SHA-0 was published in 1993, about a year after MD5, and SHA-1 itself followed in 1995 [Gupta 2014].

Both MD5 and SHA-1 are openly specified. That is, they are free to use, so Smartmatic can't argue higher costs. While SHA-1 requires slightly more computing power to implement, remember that both were invented in the 1990s, and the processing power of computers on the market has increased exponentially since then. In fact, SHA-1 is the current industry standard for e-commerce [ArsTechnica]. Ars Technica, however, argues that SHA-1 may have already become obsolete, so they suggest a migration to SHA-2, a stronger successor to SHA-1.
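
To show how little effort it takes to switch algorithms, here is a minimal Python sketch that fingerprints the same data with MD5, SHA-1, and SHA-256 (a member of the SHA-2 family) using the standard hashlib module. The sample bytes and the helper name are hypothetical, and we are not claiming this is how Smartmatic's system computes its hashes.

```python
import hashlib

# Algorithms discussed in the article; sha256 stands in for the SHA-2 family.
ALGORITHMS = ("md5", "sha1", "sha256")


def fingerprints(data: bytes) -> dict:
    """Return a hex digest of data for each algorithm in ALGORITHMS."""
    return {name: hashlib.new(name, data).hexdigest() for name in ALGORITHMS}


# Hypothetical sample content, NOT an actual election file.
sample = b"precinct,candidate,votes\n0001,A,120\n0001,B,115\n"

for name, digest in fingerprints(sample).items():
    bits = len(digest) * 4  # each hex character encodes 4 bits
    print(f"{name:>6} ({bits} bits): {digest}")
```

The only thing that changes between algorithms is the name passed to hashlib, so moving away from MD5 is mostly a matter of choosing to do it.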

To cut the long story short, respected computer scientists have been advising against the use of MD5 since as early as 1996, and at least one computer science authority (CMU-SEI) definitively rejected the MD5 algorithm in 2008, yet Comelec still hired Smartmatic, which uses an obsolete technology that was already replaced by another obsolete technology.

Let me give you an analogy. It's like this. Most of the world already uses Windows 10. Most Filipinos still use Windows 7 or 8. Meanwhile, Comelec-Smartmatic uses Windows Vista. That's how rotten it is.

We Filipinos deserve better than this. 

Effects on the BBM vs Leni VP race

To state it even more simply, it is possible for someone with access to Comelec's database to create two files -- one where BBM wins and another where Leni wins -- that result in the same MD5 hash.

This has been demonstrated by Simon Umacob in his Facebook note "I made fake Comelec 'result files' with the same MD5 hashes," where Umacob created two images -- one where BBM wins and another where Leni wins -- that generate the same MD5 hash code [Umacob 2016].

Some readers may argue that Umacob created image files, which are not the same kind of files that Comelec uses (Comelec uses TXT files, per the leaked image). However, the fact remains that TXT files are simpler than image files, making "hash collisions" probably easier to implement. Moreover, Umacob did this all by himself in his spare time. Compare that with teams upon teams of IT experts who could spend months and months generating fake files that mimic authentic ones.

At this point, ThinkingPinoy believes that Comelec-Smartmatic's security issue has gone beyond the simple Ñ controversy. It has metamorphosed into something bigger, deadlier, and more alarming.

This has ceased to be an issue on whether LP cheated BBM or not. Instead, it has become an issue of whether Comelec was stupid enough to hire a company that uses obsolete technology.

The problem is no longer BBM, Leni, and Ñ. The problem is bigger. The technology that Smartmatic is selling us is rotten and sputtering. Despite this, Comelec still hired them. PERIOD.
